Maximizing genetic differentiation in core collections by PCA-based clustering of molecular marker data

van Heerwaarden, J. and Odong, T.L. and van Eeuwijk, F.A. (2012) Maximizing genetic differentiation in core collections by PCA-based clustering of molecular marker data. Theoretical and Applied Genetics. pp. 1-10.

PDF - Published Version
Restricted to ICRISAT researchers only


Developing genetically diverse core sets is key to the effective management and use of crop genetic resources. Core selection increasingly uses molecular marker-based dissimilarity and clustering methods, under the implicit assumption that markers and genes of interest are genetically correlated. In practice, low marker densities mean that genome-wide correlations are mainly caused by genetic differentiation, rather than by physical linkage. Although of central concern, genetic differentiation per se is not specifically targeted by most commonly employed dissimilarity and clustering methods. Principal component analysis (PCA) on genotypic data is known to effectively describe the inter-locus correlations caused by differentiation, but to date there has been no evaluation of its application to core selection. Here, we explore PCA-based clustering of marker data as a basis for core selection, with the aim of demonstrating its use in capturing genetic differentiation in the data. Using simulated datasets, we show that replacing full-rank genotypic data by the subset of genetically significant PCs leads to better description of differentiation and improves assignment of genotypes to their population of origin. We test the effectiveness of differentiation as a criterion for the formation of core sets by applying a simple new PCA-based core selection method to simulated and actual data and comparing its performance to one of the best existing selection algorithms. We find that although gains in genetic diversity are generally modest, PCA-based core selection is equally effective at maximizing diversity at non-marker loci, while providing better representation of genetically differentiated groups.
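The general idea described in the abstract (replace full-rank marker data by a few leading PCs, cluster in that reduced space, then pick representatives) can be illustrated with a minimal sketch. This is not the authors' published method or software. Every name here (`pca_scores`, `kmeans`, `select_core`) is hypothetical, the number of PCs is fixed by hand rather than chosen by a significance test, k-means with a deterministic farthest-point start stands in for whatever clustering the paper uses, and the genotypes are simulated 0/1/2 allele counts from three strongly differentiated populations.

```python
import numpy as np


def pca_scores(genotypes, n_pcs):
    """Project accessions onto the leading principal components."""
    x = np.asarray(genotypes, dtype=float)
    x = x - x.mean(axis=0)  # centre each marker column
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]  # PC scores, one row per accession


def assign(points, centers):
    """Label each point with the index of its nearest centre."""
    dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def kmeans(points, k, n_iter=100):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(points, centers), centers


def select_core(genotypes, n_pcs, core_size):
    """Pick, per cluster, the accession closest to the cluster centre."""
    scores = pca_scores(genotypes, n_pcs)
    labels, centers = kmeans(scores, core_size)
    core = []
    for j in range(core_size):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue  # empty cluster: contributes no accession
        d = ((scores[members] - centers[j]) ** 2).sum(axis=1)
        core.append(int(members[d.argmin()]))
    return sorted(core)


# Simulated example (assumption): 3 populations x 20 accessions, 200 markers,
# with allele frequencies drawn independently for each population.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.05, 0.95, size=(3, 200))
genotypes = np.vstack([rng.binomial(2, f, size=(20, 200)) for f in freqs])

core = select_core(genotypes, n_pcs=2, core_size=3)
print(core)  # row indices of the selected accessions
```

Rows 0-19, 20-39 and 40-59 belong to the three simulated populations, so integer-dividing each selected index by 20 gives its population of origin. Because the groups are strongly differentiated, a core of size three is expected to draw one accession from each.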

Item Type: Article
Additional Information: The authors wish to thank Carmen de Vicente, former leader of subprogram 5 of the Generation Challenge Program (GCP), for providing financial support (GCP 4008.23) and guidance. We thank Diego Ortega Del Vecchyo for contributing software and three anonymous reviewers for comments on earlier versions of the manuscript.
Author Affiliation: Biometris, Wageningen UR, Wageningen, The Netherlands
Subjects: Crop Improvement
Divisions: General
Depositing User: Mr Arbind Seth
Date Deposited: 04 Dec 2012 05:54
Last Modified: 04 Dec 2012 05:54