Software

Open-source software development is central to reproducible open science. As a Data Science Fellow at the UW eScience Institute, I contribute to many open-source software projects on GitHub. Here are a few recent ones:

Xarray

Xarray is an open-source project and Python package that adds labels, in the form of dimensions, coordinates, and attributes, on top of raw NumPy-like arrays. This makes for a more intuitive, more concise, and less error-prone user experience. It is a fundamental piece of software for working with multi-dimensional gridded geoscience data!
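
As a minimal sketch of what those labels look like in practice, the snippet below wraps a small NumPy array in an `xarray.DataArray`. The grid values, the `lat`/`lon` dimension names, and the `degC` units are made up for illustration and are not taken from any real dataset.

```python
import numpy as np
import xarray as xr

# Hypothetical 2 x 3 grid of temperatures (made-up values for illustration)
data = np.array([[10.0, 11.0, 12.0],
                 [20.0, 21.0, 22.0]])

# Attach dimension names, coordinate labels, and metadata to the raw array
temperature = xr.DataArray(
    data,
    dims=("lat", "lon"),
    coords={"lat": [45.0, 46.0], "lon": [-122.0, -121.0, -120.0]},
    attrs={"units": "degC"},
)

# Select by coordinate label rather than by integer position
point = temperature.sel(lat=46.0, lon=-121.0)

# Reduce over a named dimension instead of remembering an axis number
lat_mean = temperature.mean(dim="lat")

print(point.item())
print(lat_mean.values)
```

With plain NumPy the same operations would be `data[1, 1]` and `data.mean(axis=0)`, which only work if you remember which axis is which; the named version keeps that information with the data.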

I'm a member of the Xarray Core team, where I help bring in grants to fund software developers, run workshops, and develop the Xarray Tutorial website.

SlideRule

SlideRule is a cloud-processing framework for interactive analysis of NASA's ICESat-2 dataset: https://slideruleearth.io/web/. It makes working with a very large and complicated geospatial vector dataset intuitive and interactive.

Pangeo-docker-images

Reproducible scientific computations require sophisticated dependency management. This project maintains default environments for geospatial workflows with Python, including machine-learning libraries like TensorFlow and PyTorch: https://github.com/pangeo-data/pangeo-docker-images

InSAR Processing

My first foray into cloud computing was setting up software on AWS Batch to generate thousands of InSAR interferograms, a very computationally intensive task: https://github.com/scottyhq/dinosar
