FAQs About Data Release
Answers to FAQs about the ScienceBase data release process.
Frequently Asked Questions
- Where can I find information about how to create and/or review a metadata record?
- How can I grant read/write permissions to USGS and non-USGS users while a data release is still in progress?
- What if I need to revise my data after they have been released?
- Will ScienceBase send the XML metadata record(s) from my data release to the USGS Science Data Catalog?
- Why is CSV format recommended instead of Excel?
- What is the file size limit for uploading and downloading files?
- Can I release legacy data in ScienceBase?
- A). My data release is associated with a publication. How will the two reference each other?
B). I don’t have the publication’s citation yet, but I would like to release the data now. Can I add the citation at some point in the future?
- Which repository should I use to release code?
- What repository services does ScienceBase provide for USGS data release products?
- How can I see other data releases from my Science Center or from a particular period of time?
- How do I update my name on a previously published data release?
► Where can I find information about how to create and/or review a metadata record?
- The USGS data management website: https://www.usgs.gov/products/data-and-tools/data-management/describe-metadatadocumentation.
- The USGS tool for metadata creation is the Metadata Wizard. Users fill out a form by answering questions about their data. They can then generate and output XML metadata records in the correct format. The Metadata Wizard has the ability to parse information from certain geospatial and tabular file types, as well as automate the process of describing column (and value) definitions.
- The USGS Metadata Parser tool (https://mrdata.usgs.gov/validation/) allows users to validate an XML metadata file against the FGDC CSDGM standard and view it in an easy-to-read format.
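Before submitting a record to the Metadata Parser, a quick local pre-check can catch gross problems. The following is a minimal sketch, not a substitute for full validation: it only confirms that the file is well-formed XML with a `metadata` root element and that the `idinfo` and `metainfo` sections of an FGDC CSDGM record are present. The function name `precheck_fgdc` and the choice of which sections to check are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: only these two top-level CSDGM sections are checked here.
# A real validator checks far more than this.
REQUIRED_SECTIONS = ("idinfo", "metainfo")


def precheck_fgdc(xml_text):
    """Return a list of problems found; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != "metadata":
        problems.append(f"root element is <{root.tag}>, expected <metadata>")
    for section in REQUIRED_SECTIONS:
        # find() looks only at direct children of the root element
        if root.find(section) is None:
            problems.append(f"missing <{section}> section")
    return problems


# Hypothetical record that has an identification section but no metainfo
sample = "<metadata><idinfo><descript><abstract>Example</abstract></descript></idinfo></metadata>"
print(precheck_fgdc(sample))  # ['missing <metainfo> section']
```

A record that passes this pre-check should still be run through the Metadata Parser, which validates against the full standard.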
► How can I grant read/write permissions to USGS and non-USGS users while a data release is still in progress?
- To give permissions to USGS employees and other users with ScienceBase accounts, select the "Manage Item" dropdown menu, then "Manage Item Permissions". Select "Custom Permissions". Enter a user’s name or email address into the "User" text box. Wait for the autocomplete to find the user's ScienceBase account, then select it and click "Add".
ScienceBase accounts are automatically created for users the first time they log in with their Active Directory credentials. If someone hasn't logged in to ScienceBase before, they won’t yet have an account. Users without Active Directory credentials can request a ScienceBase account if they are collaborating with USGS partners.
If you would like to create a user group in ScienceBase for managing permissions, please contact sciencebase@usgs.gov.
- To share a private data release with someone outside the USGS (e.g., for a journal review), click "Manage Anonymous Access Links" in the "Item Actions" section at the bottom of the page. Select "Create New Anonymous Entry Link". This will create a temporary URL you can share with reviewers, allowing them to view the data release without having to sign up for a ScienceBase account. The data release will be locked for editing while the link is active. To unlock, select "Manage Anonymous Access Links" again and remove the link.
► What if I need to revise my data after they have been released?
The USGS Fundamental Science Practices (FSP) website describes procedures for documenting revisions to data releases. Please follow this guidance if you need to correct or add to published data. Contact the ScienceBase team at sciencebase_datarelease@usgs.gov when you are ready to update your data release.
Here are examples of revised data releases in ScienceBase:
- Pinzari, C.A. and Bonaccorso, F.J., 2018, Hawaiian Islands Hawaiian Hoary Bat Genetic Sexing 2009-2018 (ver. 3.0, November 2019): U.S. Geological Survey data release, https://doi.org/10.5066/P9R7L1NS.
- Engott, J.A., 2018, Mean annual water-budget components for the Island of Oahu, Hawaii, for current conditions, 2001-10 rainfall and 2001-10 land cover (ver. 2.0, February 2018): U.S. Geological Survey data release, https://doi.org/10.5066/F72F7KH4.
► Will ScienceBase send the XML metadata record(s) from my data release to the USGS Science Data Catalog?
Yes, by default ScienceBase will automatically perform this function for authors. Metadata records on the landing page and all child items will be sent to the USGS Science Data Catalog (SDC) after the data release is made public.
Some science centers and programs have alternate methods of submitting metadata records to the SDC and may not wish for their records to be sent from ScienceBase. This option is also supported; ScienceBase keeps a list of these centers, and XML records associated with their data release products will not be sent from ScienceBase. If you would like to add your center to this list, please contact sciencebase_datarelease@usgs.gov.
► Why is CSV format recommended instead of Excel?
Comma-separated values format (.csv) is preferable to Microsoft Excel format (.xlsx) because .csv is often more machine-readable and can be more easily incorporated into other workflows. While both .csv and .xlsx are considered open formats (that is, you don't need proprietary software to view them), .xlsx supports features that can make it less machine-readable. For example, if an Excel workbook contains multiple worksheets, or if some of the information is conveyed through formatting, the data are more difficult to use in other applications (e.g., Python, R).
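As a small illustration of the machine-readability point, the sketch below reads a CSV table using only the Python standard library. The column names and values are made up for the example; no third-party package or worksheet selection is needed because a CSV file holds a single plain-text table.

```python
import csv
import io

# Hypothetical table contents; in practice this text would come from a .csv file
csv_text = "site_id,date,temp_c\nA1,2020-01-01,4.5\nA2,2020-01-01,5.1\n"

# DictReader uses the header row as keys, so each row becomes a dictionary
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Values are read as strings and converted explicitly where numbers are needed
mean_temp = sum(float(row["temp_c"]) for row in rows) / len(rows)

print(rows[0]["site_id"], round(mean_temp, 2))  # A1 4.8
```

Reading the same table from an .xlsx workbook would generally require an additional library and a decision about which worksheet to load.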
► What is the file size limit for uploading and downloading files?
Files larger than 1 GB should be uploaded using the ScienceBase Cloud Uploader tool, available in the "Item Actions" section at the bottom of a ScienceBase page. Files up to ~30 GB in size can be uploaded, although performance may still depend on users' local internet connections.
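The size guidance above can be expressed as a simple check to run before starting an upload. This is a sketch only: the function name, the returned labels, and the use of decimal gigabytes (10^9 bytes) are assumptions, and the ~30 GB figure is approximate rather than a hard limit.

```python
GB = 10**9  # assumption: decimal gigabytes; the guidance does not specify

CLOUD_UPLOADER_THRESHOLD = 1 * GB   # larger files should use the Cloud Uploader
APPROX_MAX_UPLOAD = 30 * GB         # approximate upper bound (~30 GB)


def suggest_upload_method(size_bytes):
    """Map a file size in bytes to a hypothetical label for the upload route."""
    if size_bytes > APPROX_MAX_UPLOAD:
        return "exceeds approximate upload size"
    if size_bytes > CLOUD_UPLOADER_THRESHOLD:
        return "cloud uploader"
    return "standard upload"


print(suggest_upload_method(250 * 10**6))  # standard upload
print(suggest_upload_method(12 * GB))      # cloud uploader
```

For a file on disk, `os.path.getsize(path)` returns the size in bytes that this function expects.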
► Can I release legacy data in ScienceBase?
Yes, but ScienceBase has a formal process for publicly releasing data, which enables the ScienceBase team to catalog, track, and update these resources in a uniform way. If you would like to release your legacy data in ScienceBase, you will need to go through FSP review and work with the ScienceBase team.
► A). My data release is associated with a publication. How will the two reference each other?
A). The citation will be added to the landing page in the "Related External Resources" section (see example). In associated publications, data release citations should be included in the reference section. USGS publications have links to their associated data releases at the top of their landing pages in the USGS Publications Warehouse.
B). I don’t have the publication’s citation yet, but I would like to release the data now. Can I add the citation at some point in the future?
B). Yes, a publication’s citation can be added to a data release at any time, even after it has been made public and the edit permissions have been restricted. If you would like to add a citation to a public data release, please send the citation to sciencebase_datarelease@usgs.gov (or to someone on the ScienceBase team) and we’ll add it to the landing page. If you’ve updated the metadata to include the publication’s citation, please also send the most recent version of the metadata and we’ll replace the metadata in the data release.
► Which repository should I use to release code?
The repository for software is USGS GitLab (https://code.usgs.gov), a Git-based platform for software development (additional information). Users can mint a DOI using the USGS Asset Identifier Service to point to the software in GitLab.
If a data release has associated code (e.g., a Python script used to process the data), it can be included as part of the data release in ScienceBase. All code uploaded to ScienceBase must be well-documented.
► What repository services does ScienceBase provide for USGS data release products?
ScienceBase supports the following services:
- Providing reliable access to public data release items
- Curating landing page content
- Creating multiple backups of data and metadata
- Calculating checksums to ensure file integrity
- Directing inquiries about the data to the point of contact listed for the data release
Science centers / data authors are responsible for the following:
- Answering questions about the data
- Correcting any errors discovered in the data
- Records management and data archival responsibilities for internal Bureau purposes (e.g., Scientific Case Files) according to the USGS Records Program. These responsibilities extend beyond public data access requirements for open data. Contact your local Records Management Contact or the USGS Records Management Program at recman@usgs.gov for additional information.
- Performing file format migrations or data transcriptions, if necessary
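Checksums, listed above among the ScienceBase services, can also be computed locally, for example to confirm that a downloaded file matches the copy that was uploaded. The sketch below assumes SHA-256 purely for illustration; the algorithm ScienceBase uses is not stated here, and `file_checksum` is a hypothetical helper name.

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration with a temporary file standing in for a data file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"site_id,temp_c\nA1,4.5\n")
    demo_path = tmp.name

before = file_checksum(demo_path)
after = file_checksum(demo_path)
os.remove(demo_path)

print(before == after)  # True
```

Two copies of a file with identical contents produce the same digest, so a mismatch indicates that the file changed or was corrupted in transfer.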
► How can I see other data releases from my Science Center or from a particular period of time?
Check out the ScienceBase Data Release Summary Dashboard to see a breakdown of data releases by Mission Area, Region, and Science Center and to filter by time ranges. This dashboard uses ScienceBase's advanced querying capabilities to generate this information. Learn how to create these queries yourself here.
► How do I update my name on a previously published data release?
Once your name change has been updated in Active Directory, email sciencebase_datarelease@usgs.gov with a list of data release DOIs that need to be updated.