Implementing Cross Validation Approaches for Model Selection and Evaluating Goodness of Fit in Complex Hierarchical Models
It is (relatively) easy to construct complex hierarchical models for analysis of the North American Breeding Bird Survey (BBS), but deciding which model best describes population change is difficult. We are developing methods for model selection for BBS and other important survey data sets, and using them to refine our estimates of population change from this important survey.
The Challenge: Many critical wildlife surveys, such as the North American Breeding Bird Survey (BBS), are analyzed using complex hierarchical models. These models are generally multi-scale and contain random effects, so the standard approaches to model selection and assessment of model fit are often inappropriate, and no simple way exists to compare alternative models. However, a clear need exists for these assessments. Many alternative models now exist for analysis of BBS data, and simply presenting multiple results without clear guidance on which model is most appropriate will lead to confusion among users of BBS data and limit use of this important survey.
The Science: We are evaluating the use of Bayesian cross-validation as a general tool for comparison of complex hierarchical models. In cross-validation, a model is fit to a data set from which an observation is omitted, and the resulting model and estimated parameters are then used to predict the missing observation. The process is repeated many times; comparing the true value with the model's prediction made in the absence of that observation provides a measure of how well the model fits the data. Cross-validation is time-consuming. One important aspect of this study is therefore developing efficient ways to implement the procedure, or to predict overall fit by modeling cross-validation results as a function of the more easily computed Watanabe–Akaike information criterion (WAIC).
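The leave-one-out procedure described above can be illustrated with a minimal Python sketch. It is not the model used for BBS analyses: it assumes a simple conjugate Bayesian linear regression with a known noise standard deviation (sigma) and a Gaussian prior on the coefficients (standard deviation tau), so that each held-out prediction is available in closed form. The function name loo_elpd, the synthetic data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def loo_elpd(X, y, sigma=1.0, tau=10.0):
    """Sum of leave-one-out log predictive densities for a conjugate
    Bayesian linear regression (assumed: known noise sd `sigma`,
    independent N(0, tau^2) priors on the coefficients)."""
    n, p = X.shape
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i              # omit observation i
        X_tr, y_tr = X[keep], y[keep]
        # Posterior of the coefficients given the remaining data
        prec = X_tr.T @ X_tr / sigma**2 + np.eye(p) / tau**2
        cov = np.linalg.inv(prec)
        mean = cov @ X_tr.T @ y_tr / sigma**2
        # Posterior predictive distribution for the omitted observation
        pred_mean = X[i] @ mean
        pred_var = sigma**2 + X[i] @ cov @ X[i]
        # Log density of the true value under that prediction
        total += -0.5 * (np.log(2 * np.pi * pred_var)
                         + (y[i] - pred_mean) ** 2 / pred_var)
    return total

# Synthetic "counts over time" with a real trend (illustrative only)
rng = np.random.default_rng(0)
t = np.linspace(-1, 1, 40)
y = 2.0 + 1.5 * t + rng.normal(0, 0.5, size=t.size)

X_const = np.ones((t.size, 1))                   # model A: no trend
X_trend = np.column_stack([np.ones_like(t), t])  # model B: linear trend

elpd_const = loo_elpd(X_const, y, sigma=0.5)
elpd_trend = loo_elpd(X_trend, y, sigma=0.5)
print("no-trend model:", elpd_const)
print("trend model:   ", elpd_trend)
```

The model with the higher summed log predictive density predicts the omitted observations better. Each pass through the loop requires a full refit, which is cheap here but expensive for hierarchical models fit by simulation; that cost is what motivates the search for more efficient implementations and WAIC-based approximations.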
The Future: Developing appropriate model sets and implementing model selection procedures will always be critical components of BBS analyses. We have developed a model set and compared model fit for ~548 BBS species. Our goals for 2020 are to institutionalize our model selection results in BBS analyses, expand the model set, and explore the use of cross-validation in model selection for other surveys.