INHABIT species potential distribution across the contiguous United States (ver. 4.0, June 2024)
This dataset contains the potential distribution of 259 invasive terrestrial plant species. We developed habitat suitability models for invasive plant species selected by Department of the Interior land management agencies and other managers. We applied the modeling workflow developed in Young et al. (2020, https://doi.org/10.1371/journal.pone.0229253) and adapted by Jarnevich et al. (2023, https://doi.org/10.1016/j.ecoinf.2023.101997). We developed a national library of environmental variables known to physiologically limit plant distributions (Engelstad et al. 2022 Table S1: https://doi.org/10.1371/journal.pone.0263056) and relied on human input based on natural history knowledge to narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling (SAHM 2.2.2, Morisette et al., 2013).
For each species, we generated up to three groups of models, where enough data were available, reflecting different levels of suitability: suitability for occurrence, suitability for abundance (greater than 5% cover), and suitability for high abundance (greater than 25% cover). For occurrence models, we accounted for uncertainty related to sampling bias by using two alternative sources of background samples. For all three groups of models, we constructed weighted ensembles using up to 20 models (occurrence) or 10 models (abundance) for each species. We also combined the three ensembles using three different thresholds that convert the continuous values to suitable/unsuitable, ranging from inclusive to restrictive.
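The ensemble and threshold steps described above can be illustrated with a minimal NumPy sketch. It is not the SAHM implementation. The function names, the per-model weights, and the reading of a percentile threshold as a percentile of ensemble values at known occurrence locations are all assumptions made for illustration; the release does not specify them here.

```python
import numpy as np

def weighted_ensemble(predictions, weights):
    """Weighted mean of per-model suitability surfaces, skipping NaN cells.

    predictions: array-like of shape (n_models, rows, cols)
    weights: one hypothetical weight per model (e.g. a performance score)
    """
    preds = np.asarray(predictions, dtype=float)
    w = np.asarray(weights, dtype=float).reshape(-1, 1, 1)
    valid = ~np.isnan(preds)
    # Sum weighted predictions and weights only where a model has a value.
    weighted_sum = np.where(valid, preds * w, 0.0).sum(axis=0)
    weight_total = (valid * w).sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(weight_total > 0, weighted_sum / weight_total, np.nan)

def percentile_threshold(ensemble, presence_values, percentile):
    """Binary suitable/unsuitable map (assumed interpretation).

    The cutoff is the given percentile (0-100) of ensemble values sampled
    at occurrence locations; a lower percentile is more inclusive.
    """
    cutoff = np.percentile(presence_values, percentile)
    return (ensemble >= cutoff).astype(np.uint8), cutoff
```

Under these assumptions, calling `percentile_threshold` with percentiles of 1, 5, and 10 would yield three binary maps running from most inclusive to most restrictive.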
This data bundle contains a single file of tabular summaries by management unit (including each species/ensemble type/abundance level combination), a file describing the changes from version 3, and a species metadata file.
There is also a subfolder for each species that contains the merged data sets used to create models, up to 9 raster files associated with the species, and tabular outputs including response curve data, variable importance information, and model assessment metrics.
The up to nine rasters included in each species subfolder represent the following:
1) Occurrence suitability - Continuous value ensemble
2) Abundance suitability - Continuous value ensemble
3) High abundance suitability - Continuous value ensemble
4) Restricted occurrence suitability - Continuous value ensemble with restricted environmental conditions*
5) Restricted abundance suitability - Continuous value ensemble with restricted environmental conditions*
6) Restricted high abundance suitability - Continuous value ensemble with restricted environmental conditions*
7) 0.01 – first percentile threshold applied to model group ensemble
8) 0.05 – fifth percentile threshold applied to model group ensemble
9) 0.1 – tenth percentile threshold applied to model group ensemble
*Restricted environmental conditions = only display areas where environmental characteristics are inside the range of the values used to develop the model. For example, a location with a minimum winter temperature of 12 C would be outside the range of -10 to 10 C used in model development.
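The restriction to sampled environmental conditions can be sketched as a simple range mask. This is a minimal illustration, not the code used to produce the rasters. The function name, the predictor name `min_winter_temp_c`, and the use of NaN for masked cells are assumptions.

```python
import numpy as np

def restrict_to_training_range(suitability, predictors, training_ranges):
    """Keep suitability only where every predictor is inside its training range.

    suitability: 2-D array of continuous ensemble values
    predictors: dict of predictor name -> 2-D array on the same grid
    training_ranges: dict of predictor name -> (low, high) used in model development
    """
    inside = np.ones(suitability.shape, dtype=bool)
    for name, (low, high) in training_ranges.items():
        values = predictors[name]
        inside &= (values >= low) & (values <= high)
    # Cells outside any training range are masked (NaN is an assumed encoding).
    return np.where(inside, suitability, np.nan)

# Example from the text: 12 C lies outside the -10 to 10 C training range.
suitability = np.array([[0.7, 0.9]])
predictors = {"min_winter_temp_c": np.array([[5.0, 12.0]])}
restricted = restrict_to_training_range(
    suitability, predictors, {"min_winter_temp_c": (-10.0, 10.0)}
)
```

Here the first cell keeps its suitability value and the second is masked, because its minimum winter temperature falls outside the range used to develop the model.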
The bundle documentation files are:
1) 'project_metadata.xml' (this file), which contains the project-level metadata.
2) 'managementSummaries.csv' contains the tabular summaries by management unit.
3) 'INHABIT_VersionHistory.txt' contains information on the methodological changes made between the previous data release and this one.
4) 'species_metadata.csv' contains information on species-specific model changes made when tuning algorithm parameters to ensure model quality.
5) 'mergedDataset.csv' contains the merged data set used to create the models, including location and associated environmental data, for each species.
6) 'XX.tif', where XX is the raster type explained above.
7) 'responseCurves.csv' contains the tabular information needed to produce response curves for each predictor retained in each of the 10 models produced for each species.
8) 'variableImportance.csv' contains the tabular summaries indicating predictor importance for each of the models produced for each species.
9) 'assessmentMetrics.csv' contains the tabular summaries of assessment metrics for each model or ensemble for each species.
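Because not every species has all nine rasters, it can help to check what a species subfolder actually contains before using it. The sketch below is a minimal example. It assumes the tables and rasters sit directly inside one folder per species; the function name is hypothetical, and the CSV file names are taken from the list above.

```python
from pathlib import Path

# Per-species tables named in the documentation list above.
EXPECTED_TABLES = (
    "mergedDataset.csv",
    "responseCurves.csv",
    "variableImportance.csv",
    "assessmentMetrics.csv",
)

def inventory_species_folder(folder):
    """List the rasters present and any expected tables that are missing."""
    folder = Path(folder)
    rasters = sorted(p.name for p in folder.glob("*.tif"))
    missing = [name for name in EXPECTED_TABLES if not (folder / name).is_file()]
    return {"rasters": rasters, "missing_tables": missing}
```

A species with only occurrence data would then report fewer rasters without that being treated as an error.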
These data will be integrated into the fourth version of the Invasive Species Habitat Tool (INHABIT), a web application displaying visual and statistical summaries of nationwide habitat suitability models for manager-identified invasive plant species.
Citation Information
Field | Value |
---|---|
Publication Year | 2024 |
Title | INHABIT species potential distribution across the contiguous United States (ver. 4.0, June 2024) |
DOI | 10.5066/P14HNEJF |
Authors | Catherine S Jarnevich, Peder Engelstad, Demetra Williams, Keana Shadwell, Cameron Reimer, Grace Henderson, Janet S Prevey, Ian S Pearse |
Product Type | Data Release |
Record Source | USGS Asset Identifier Service (AIS) |
USGS Organization | Fort Collins Science Center |
Rights | This work is marked with CC0 1.0 Universal |