Model archive summary and suspended-sediment concentrations from a surrogate ordinary least square regression analysis for station 05517500, Kankakee River at Dunns Bridge, Indiana, April 2016 through July 2020
March 31, 2025
Sediment accumulation and transport negatively affect flood control, water supply, aquatic life, reclamation, and recreation (Angino and O’Brien, 1968) and are concerns of resource managers in the Kankakee River Basin of northern Indiana and in many other regions of the United States. Linear regression was used to develop models for estimating suspended-sediment concentrations by relating continuously monitored water-quality data to discrete samples collected from April 2016 through July 2020. The resulting regression models indicated a strong correlation between continuous turbidity and suspended-sediment concentration (adjusted coefficient of determination equals 0.765, predicted residual error sum of squares equals 0.122). Daily loads of suspended sediment were computed from regression-model concentrations and instantaneous streamflow. Monthly loads were then calculated to provide a clearer representation of seasonality. For April 2016 through July 2020, the estimated mean monthly suspended-sediment load was 4726.5 tons per month and the estimated median was 4447.2 tons per month; monthly loads ranged from 741.2 to 9992.8 tons per month.
The development of regression models for suspended sediment, total nitrogen, and total phosphorus relied on the collection of representative discrete water-quality samples and the operation of continuously deployed monitors throughout the range of hydrologic and seasonal conditions at the site. Regression models were developed following USGS protocols and methods (Helsel and others, 2020; Rasmussen and others, 2009). Each regression model relates laboratory-analyzed discrete water-quality sample data to measurements from continuously deployed water-quality monitors.
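The load computation described above (concentration multiplied by streamflow, then accumulated over time) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes concentration in milligrams per liter, streamflow in cubic feet per second, loads in short tons, and evenly spaced 15-minute readings. The conversion factor 0.0027 is the standard one for those units; the function names and the time step are hypothetical.

```python
# Conversion factor: (mg/L) * (ft^3/s) -> short tons per day.
# Assumption: the release does not state streamflow units; cfs is assumed here.
TONS_PER_DAY_FACTOR = 0.0027


def instantaneous_load_tons_per_day(ssc_mg_per_l, streamflow_cfs):
    """Load rate, in tons per day, implied by one concentration/streamflow pair."""
    return ssc_mg_per_l * streamflow_cfs * TONS_PER_DAY_FACTOR


def daily_load_tons(ssc_series, flow_series, interval_minutes=15):
    """Accumulate instantaneous load rates over one day of evenly spaced readings.

    interval_minutes is a hypothetical sampling interval; each reading is
    weighted by the fraction of a day that it represents.
    """
    if len(ssc_series) != len(flow_series):
        raise ValueError("concentration and streamflow series must be the same length")
    day_fraction = interval_minutes / 1440.0  # 1440 minutes per day
    return sum(
        instantaneous_load_tons_per_day(c, q) * day_fraction
        for c, q in zip(ssc_series, flow_series)
    )


def monthly_load_tons(daily_loads):
    """Monthly load as the sum of the daily loads within that month."""
    return sum(daily_loads)
```

Monthly loads would then follow from grouping the daily values by calendar month before calling `monthly_load_tons`.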
Ordinary least squares regression analysis was performed in the R statistical programming language (R Core Team, 2021) to evaluate the relationship between discrete suspended-sediment concentrations and a set of explanatory variables: continuously measured parameters (water temperature, specific conductance, pH, dissolved oxygen, turbidity, and streamflow) as well as seasonality and time over the study period. To improve potential models, explanatory and response variables were evaluated for transformations (log, square root, or square) that linearize the relation or change the distributional characteristics of the data so that model residuals are more symmetric, linear, and homoscedastic. Statistical models for all possible combinations of explanatory and response variables were evaluated using stepwise regression. To further evaluate potential models, diagnostic plots were created to assess how each model’s residuals varied as a function of (1) predicted values, (2) normal quantiles, (3) date, and (4) streamflow. Additional plots highlighted differences between predicted and observed values, residuals by season, and residuals by year. A variety of model statistics and diagnostics were used to determine the best predictors of each modeled constituent, including tests of significance, standard error, adjusted coefficient of determination (R²), and the predicted residual error sum of squares (PRESS) statistic. The PRESS statistic is a leave-one-out form of cross-validation that measures how well the model fits sample observations that were not used to develop it. In general, the smaller the PRESS statistic, the better the model’s predictive ability (Helsel and Hirsch, 2002). The optimal models commonly used a mathematically transformed response variable. In those instances, a bias correction factor (BCF) was used to correct for the bias introduced when model results are back-transformed into base-10 units (Helsel and Hirsch, 2002).
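The PRESS statistic and the bias correction factor described above can be illustrated with a short sketch for the simplest case, a single explanatory variable and a base-10 log-transformed response. The analysis itself was done in R; this Python version is only an illustration with hypothetical function names. It also assumes a Duan-type smearing estimator for the BCF, which is one common choice; the text does not say which BCF formula was used.

```python
import math


def fit_simple_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def press_statistic(x, y):
    """Leave-one-out PRESS computed from residuals and leverages.

    For OLS, the leave-one-out prediction error for observation i equals
    residual_i / (1 - leverage_i), so no refitting is needed.
    """
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    intercept, slope = fit_simple_ols(x, y)
    press = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (intercept + slope * xi)
        leverage = 1.0 / n + (xi - x_mean) ** 2 / sxx
        press += (residual / (1.0 - leverage)) ** 2
    return press


def duan_bcf_log10(x, log10_y):
    """Duan smearing estimator (assumed BCF form): mean of 10**residual."""
    intercept, slope = fit_simple_ols(x, log10_y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, log10_y)]
    return sum(10.0 ** r for r in residuals) / len(residuals)


def predict_concentration(turbidity, intercept, slope, bcf):
    """Back-transform a log10-log10 model prediction and apply the BCF."""
    return bcf * 10.0 ** (intercept + slope * math.log10(turbidity))
```

A model would be fit on `log10` values of turbidity and concentration, with `press_statistic` compared across candidate models and `duan_bcf_log10` applied only when the response was log-transformed.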
Prediction intervals were computed for each model following methods from Helsel and Hirsch (2002) to define the range of values within which there is 90-percent certainty that the true value lies.
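For a model with one explanatory variable, a prediction interval of this kind can be sketched with the textbook formula for simple linear regression. This is an illustrative sketch, not the procedure used in the release: the function names are hypothetical, and the critical value of Student's t (for a 90-percent interval, the 0.95 quantile with n - 2 degrees of freedom) is supplied by the caller rather than computed, so that the example needs only the standard library.

```python
import math


def prediction_interval(x, y, x0, t_crit):
    """Prediction interval for a new observation at x0 from a simple OLS fit.

    t_crit is the caller-supplied Student's t critical value (assumption:
    looked up elsewhere, for example 1.812 for 90 percent and 10 degrees
    of freedom). Returns (lower, predicted, upper) in the model's own units.
    """
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are needed")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    y_hat = intercept + slope * x0
    half_width = t_crit * s * math.sqrt(1.0 + 1.0 / n + (x0 - x_mean) ** 2 / sxx)
    return y_hat - half_width, y_hat, y_hat + half_width


def back_transform_log10(bounds):
    """Convert interval values from log10 space to original units."""
    return tuple(10.0 ** b for b in bounds)
```

When the response is log-transformed, the interval is computed in log space and its endpoints are then back-transformed, which makes the interval asymmetric about the predicted value in original units.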
Citation Information
Publication Year | 2025 |
---|---|
Title | Model archive summary and suspended-sediment concentrations from a surrogate ordinary least square regression analysis for station 05517500, Kankakee River at Dunns Bridge, Indiana, April 2016 through July 2020 |
DOI | 10.5066/P90R7T0E |
Authors | Cole R. Downhour, Dawn R. Piotrowski |
Product Type | Data Release |
Record Source | USGS Asset Identifier Service (AIS) |
Rights | This work is marked with CC0 1.0 Universal |