Sampling issues affecting accuracy of likelihood-based classification using genetical data

January 1, 2004

We demonstrate the effectiveness of a genetic algorithm for discovering multi-locus combinations that provide accurate individual assignment decisions and estimates of mixture composition based on likelihood classification. Using simulated data representing different levels of inter-population differentiation (Fst ~ 0.01 and 0.10), genetic diversities (four or eight alleles per locus), and population sizes (20, 40, or 100 individuals in baseline populations), we show that subsets of loci can be identified that provide levels of classification accuracy comparable to those of the entire multi-locus data sets, where 5, 10, or 20 loci were considered. Microsatellite data sets from hatchery strains of lake trout, Salvelinus namaycush, representing a comparable range of inter-population differentiation in allele frequencies, confirmed the simulation results. For both simulated and empirical data sets, comparable assignment accuracy was achieved using fewer loci (e.g., three or four loci out of eight for the empirical lake trout studies). Simulation results were also used to investigate properties of the 'leave-one-out' (L1O) method for estimating assignment error rates. Estimates of population assignment accuracy based on L1O methods should be viewed with caution under certain conditions, particularly when baseline population sample sizes are low (<50).
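
As a rough illustration of the likelihood classification and leave-one-out evaluation described above, the following Python sketch assigns diploid individuals to the baseline population with the highest multi-locus genotype likelihood. It is not the authors' implementation: it assumes Hardy-Weinberg proportions within populations and independence among loci, and it adds a small pseudocount to allele counts so that unobserved alleles do not produce zero likelihoods. All function names, the pseudocount value, and the data layout (each individual as a list of allele-index pairs, one pair per locus) are hypothetical. The genetic-algorithm search over locus subsets is not shown.

```python
import math


def allele_counts(genotypes, n_loci, n_alleles):
    """Count alleles per locus in one baseline sample."""
    counts = [[0] * n_alleles for _ in range(n_loci)]
    for ind in genotypes:
        for locus, (a, b) in enumerate(ind):
            counts[locus][a] += 1
            counts[locus][b] += 1
    return counts


def log_likelihood(ind, counts, pseudo=0.5):
    """Log-likelihood of a multi-locus genotype given one population's counts.

    Assumes Hardy-Weinberg proportions and independent loci. The pseudocount
    is an illustrative choice, not a value taken from the paper.
    """
    ll = 0.0
    for locus, (a, b) in enumerate(ind):
        total = sum(counts[locus]) + pseudo * len(counts[locus])
        pa = (counts[locus][a] + pseudo) / total
        pb = (counts[locus][b] + pseudo) / total
        # Homozygote: p^2; heterozygote: 2pq.
        ll += math.log(pa * pb * (1 if a == b else 2))
    return ll


def assign(ind, counts_by_pop):
    """Return the index of the population with the highest likelihood."""
    return max(range(len(counts_by_pop)),
               key=lambda k: log_likelihood(ind, counts_by_pop[k]))


def _adjust(counts, ind, delta):
    """Add (delta=+1) or remove (delta=-1) one individual's alleles."""
    for locus, (a, b) in enumerate(ind):
        counts[locus][a] += delta
        counts[locus][b] += delta


def leave_one_out_accuracy(baselines, n_loci, n_alleles):
    """Fraction of baseline individuals assigned back to their own population
    when each is removed from its baseline before classification."""
    counts_by_pop = [allele_counts(g, n_loci, n_alleles) for g in baselines]
    correct = total = 0
    for k, pop in enumerate(baselines):
        for ind in pop:
            _adjust(counts_by_pop[k], ind, -1)   # leave this individual out
            correct += assign(ind, counts_by_pop) == k
            total += 1
            _adjust(counts_by_pop[k], ind, +1)   # restore the baseline
    return correct / total
```

With small baselines, removing a single individual changes the estimated allele frequencies noticeably, which is one intuition for why leave-one-out estimates deserve caution at low sample sizes.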

Publication Year 2004
Title Sampling issues affecting accuracy of likelihood-based classification using genetical data
DOI 10.1023/B:EBFI.0000022869.72448.cd
Authors B. Guinand, K.T. Scribner, A. Topchy, K.S. Page, W. Punch, M. K. Burnham-Curtis
Publication Type Article
Publication Subtype Journal Article
Series Title Environmental Biology of Fishes
Index ID 70006448
Record Source USGS Publications Warehouse
USGS Organization Great Lakes Science Center