U.S. IOOS participated for the second time as a mentoring organization for Google Summer of Code (GSoC). The 2022 edition featured four new projects, ranging from data visualization and data accessibility to data pre-processing for cloud optimization and code refactoring.

Project Details

The four projects were:

  • echoshader: Interactive visualization of ocean sonar data.
  • kerchunk: Interfacing WRF forecast grib2 data to zarray.
  • pyobis: Making ocean biodiversity data easily accessible with python.
  • erddapy: Refactoring into separate core and object layers.

Echoshader

Echoshader is an open source project that aims to make it easier to interactively visualize large amounts of cloud-based data and so accelerate the data exploration and discovery process. The ocean sonar data it visualizes is generated by echopype, which handles the normalization, preprocessing, and organization of echo data, and echoshader will be developed in parallel with the ongoing development of echopype. As GSoC participants, the team aimed to develop the main APIs of echoshader on top of the HoloViz suite of tools, test configurations for using echoshader widgets in Panel dashboards, and create Jupyter notebooks that demonstrate the tools in combination. The main technologies used are hvPlot, Panel, Folium, and PyVista.

Contributor: Dingrui Lei
Mentors: Valentina Staneva, Wu-Jung Lee, Brandon Reyes, Don Setiawan, and Emilio Mayorga

Kerchunk

The Kerchunk package provides a means to access legacy geospatial datasets, such as GRIB2 files, in a cloud-performant manner. It creates a reference file that maps to the variables within the datasets, which allows direct access to many data chunks across various files in one go, reducing overall latency and allowing easy parallel access. At present, the scan_grib module within Kerchunk does not work with operational GFS data, because GRIB data from different institutions can differ. The team proposed adapting the existing module to successfully access operational GFS forecast data, and then demonstrating the utility of Kerchunk by creating a dictionary of lazy dimensional arrays containing all GFS variables.

Contributor: Peter Marsh
Mentors: Rich Signell and Martin Durant

Pyobis

The goal of the project is to modify the pyobis Python package to support the new OBIS API v3. The project also aims to improve test coverage, diversify example usage, analyze and visualize data fetched with the updated package, and push the new version to PyPI.
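
To give a sense of what a request against OBIS API v3 might look like, the minimal sketch below builds an occurrence search URL using only the Python standard library. It is not pyobis code: the helper name build_occurrence_url is hypothetical, and the base URL https://api.obis.org/v3 and the scientificname and size query parameters are assumptions that should be checked against the OBIS API documentation.

```python
from urllib.parse import urlencode

# Assumed base URL of OBIS API v3; verify against the OBIS documentation.
OBIS_V3_BASE = "https://api.obis.org/v3"


def build_occurrence_url(scientificname, size=10, **extra_params):
    """Build an occurrence search URL (hypothetical helper, not part of pyobis)."""
    params = {"scientificname": scientificname, "size": size}
    # Skip optional filters that were left unset so they stay out of the URL.
    params.update({k: v for k, v in extra_params.items() if v is not None})
    return f"{OBIS_V3_BASE}/occurrence?{urlencode(params)}"


url = build_occurrence_url("Mola mola", size=5)
print(url)
```

Fetching such a URL should return a JSON document. A package like pyobis wraps requests of this kind so that users do not have to assemble them by hand.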

The pyobis package can fetch large volumes of ocean biodiversity data through the OBIS API. OBIS holds species records dating back to around 1078 AD, which makes it all the more important that pyobis stays maintained. As an extended objective, the team also aimed to create a high-level module inside the pyobis package that allows users to visualize data directly.

Contributor: Ayush Anand
Mentors: Tylar Murray, Mathew Biddle, and Filipe Fernandes

Erddapy

The erddapy package provides a Python interface to the ERDDAP data server API. Currently, most of erddapy's functionality is concentrated in a single class: the URL-building features are implemented in that class alongside the data transformation methods that process server responses into Python objects, such as pandas DataFrames.

This project proposes to separate erddapy into core and object (or opinionated) layers. The core layer will hold the URL-building and data transformation functionality, which will be reused by the rest of the library. The object layer will provide high-level objects that support a functional API, one that does not depend on the state of the underlying object (the current version of erddapy does depend on that state). This functional API will provide cleaner iterative usage when querying multiple servers and datasets, and new classes implemented in the object layer will support serialization so that they can be pickled and passed on to other processes or machines.

To execute this project, the team delineated two separate aims. The first is to refactor the URL-building and data transformation functionality into a new module containing minimal, standalone functions, and to reuse those functions in the existing primary class. The second is to implement an additional layer containing the high-level objects that will provide the basis for the functional API.
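
The proposed split between a core layer and an object layer can be illustrated with a minimal sketch that uses only the standard library. None of these names are erddapy's actual API: build_tabledap_url stands in for a core-layer function, TableQuery for an object-layer class, and the server addresses and dataset ID are placeholders. The URL layout assumes the usual ERDDAP tabledap pattern of server/tabledap/dataset_id.response?variables&constraints.

```python
import pickle
from dataclasses import dataclass


# Core layer: a standalone function whose result depends only on its arguments.
def build_tabledap_url(server, dataset_id, variables=(), constraints=None,
                       response="csv"):
    """Return a tabledap URL (hypothetical helper, not erddapy's API)."""
    url = f"{server.rstrip('/')}/tabledap/{dataset_id}.{response}"
    query = ",".join(variables)
    for key, value in (constraints or {}).items():
        # Keys carry the comparison operator, for example "time>=".
        query += f"&{key}{value}"
    return f"{url}?{query}" if query else url


# Object layer: an immutable description of one request (hypothetical class).
@dataclass(frozen=True)
class TableQuery:
    server: str
    dataset_id: str
    variables: tuple = ()
    constraints: tuple = ()  # (key, value) pairs, kept as a tuple of plain data

    def url(self, response="csv"):
        # Delegate to the core function instead of mutating internal state.
        return build_tabledap_url(self.server, self.dataset_id, self.variables,
                                  dict(self.constraints), response)


# The same query shape applied across several (placeholder) servers.
servers = ["https://example.org/erddap", "https://example.com/erddap"]
queries = [
    TableQuery(server, "dataset_a", ("time", "temperature"),
               (("time>=", "2022-01-01T00:00:00Z"),))
    for server in servers
]
urls = [query.url() for query in queries]

# The objects hold only plain data, so they survive a pickle round trip.
restored = pickle.loads(pickle.dumps(queries[0]))
print(urls[0])
print(restored == queries[0])
```

Because the query objects are frozen and the URL is computed on demand, looping over servers or datasets never leaves a shared object in a half-updated state, and a query can be handed to another process as is.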

Overall, the refactoring will greatly improve the flexibility and scalability of the package and will help support its wide spectrum of users.

Contributor: Vini Salazar
Mentors: Filipe Fernandes, Alex Kerney, Mathew Biddle, and Callum Rollo

More information about each project is also available on the IOOS organization page on the Google Summer of Code website.