Software
Open source software development is central to reproducible open science. As a Data Science Fellow at the UW eScience Institute, I contribute to many open-source software projects on GitHub. Here are a few recent ones:
Xarray
Xarray is an open source project and Python package that introduces labels in the form of dimensions, coordinates, and attributes on top of raw NumPy-like arrays, which makes for a more intuitive, more concise, and less error-prone user experience. This is a fundamental piece of software for working with multi-dimensional gridded geoscience data!
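To make the idea of labels concrete, here is a minimal sketch comparing a raw NumPy reduction with the same operation on a labeled array. The grid, the coordinate values, and the variable names are all invented for illustration and do not come from a real dataset.

```python
import numpy as np
import xarray as xr

# Hypothetical 2 x 3 grid of temperatures (values invented for illustration).
values = np.array([[10.0, 12.0, 14.0],
                   [20.0, 22.0, 24.0]])

# Wrap the raw array with dimension names, coordinates, and attributes.
temperature = xr.DataArray(
    values,
    dims=("lat", "lon"),
    coords={"lat": [45.0, 50.0], "lon": [-125.0, -120.0, -115.0]},
    attrs={"units": "degC"},
)

# Raw NumPy: the reader has to remember that axis 1 is longitude.
numpy_mean = values.mean(axis=1)

# Xarray: the same reduction, but the dimension is named explicitly.
lon_mean = temperature.mean(dim="lon")

# Select by coordinate value instead of by integer position.
point = temperature.sel(lat=50.0, lon=-120.0)

print(lon_mean.values)
print(point.item())
```

Naming the dimension in `mean(dim="lon")` and selecting with `sel(lat=..., lon=...)` removes the need to track axis order by hand, which is where many indexing mistakes in gridded data come from.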
I'm a member of the Xarray Core team, where I help bring in grants to fund software developers, run workshops, and develop the Xarray Tutorial website.
Coincident
A Python library for efficient, on-demand search, subsetting, and access of datasets relevant to NASA's Surface Topography and Vegetation (STV) program.
SlideRule
SlideRule is a cloud-processing framework for interactive analysis of NASA's ICESat-2 dataset. It makes working with a very large and complicated geospatial vector dataset intuitive and interactive.
Pangeo Docker Images
Reproducible scientific computations require sophisticated dependency management. This project maintains default environments for geospatial workflows with Python, including machine-learning libraries like TensorFlow and PyTorch: https://github.com/pangeo-data/pangeo-docker-images