Reproducibility starts at the source: R, Python, and Julia Packages for retrieving USGS hydrologic data
December 9, 2023
Much of modern science takes place in a computational environment, and, increasingly, that environment is programmed using R, Python, or Julia. Furthermore, most scientific data now live in the cloud, so the first step in many workflows is to query a cloud database and load the response into a computational environment for further analysis. Thus, tools that facilitate programmatic data retrieval are a critical component of reproducible scientific workflows. Earth science is no different in this regard. To fulfill that basic need, we developed R, Python, and Julia packages providing programmatic access to the U.S. Geological Survey’s National Water Information System database and the multi-agency Water Quality Portal. Together, these packages create a common interface for retrieving hydrologic data in the Jupyter ecosystem, which is widely used in water research, operations, and teaching. Source code, documentation, and tutorials for the packages are available on GitHub. Users can go there to learn, raise issues, or contribute improvements within a single platform, which helps foster better engagement and collaboration between data providers and their users.
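As a rough illustration of the "query a cloud database" step described above, the sketch below builds a request URL for daily streamflow values using only the Python standard library. It is not the packages' implementation: the helper name `build_daily_values_url` is hypothetical, and the base URL, the query parameter names, and the site number are assumptions that should be checked against current USGS web service documentation before use. The only detail taken as given is that `00060` is the USGS parameter code for discharge.

```python
from urllib.parse import urlencode

# Assumed base URL of the NWIS daily-values web service; verify before use.
NWIS_DV_URL = "https://waterservices.usgs.gov/nwis/dv/"


def build_daily_values_url(site, parameter_cd, start, end):
    """Return a query URL for daily values at one site (hypothetical helper)."""
    params = {
        "format": "json",             # ask for a JSON response
        "sites": site,                # USGS site number, kept as a string
        "parameterCd": parameter_cd,  # e.g. "00060" for discharge
        "startDT": start,             # ISO dates, YYYY-MM-DD
        "endDT": end,
    }
    return NWIS_DV_URL + "?" + urlencode(params)


# Example site number chosen for illustration only.
url = build_daily_values_url("03339000", "00060", "2023-01-01", "2023-01-31")
print(url)
```

Recording the full query (site, parameter, and date range) in a script, rather than downloading a file by hand, is what makes the retrieval step repeatable. The packages described here wrap this kind of request and return the response in a form ready for analysis.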
Citation Information
Field | Value |
---|---|
Publication Year | 2023 |
Title | Reproducibility starts at the source: R, Python, and Julia Packages for retrieving USGS hydrologic data |
DOI | 10.3390/w15244236 |
Authors | Timothy O. Hodson, Laura A. DeCicco, Jayaram Athreya Hariharan, Lee Stanish, Scott Black, J. S. Horsburgh |
Publication Type | Article |
Publication Subtype | Journal Article |
Series Title | Water |
Index ID | 70250791 |
Record Source | USGS Publications Warehouse |
USGS Organization | Central Midwest Water Science Center; WMA - Integrated Information Dissemination Division |