ISCB-Asia/SCCG 2012, session on Computational Statistics in Modern Biology


Min Zhang
Department of Statistics, Purdue University

A fast and efficient approach for genomic selection with high density markers

Abstract

Recent advances in high-throughput genotyping have motivated genomic selection using high-density markers. However, the increasingly large number of available markers raises both statistical and computational issues and makes it difficult to estimate breeding values. We propose applying the penalized orthogonal-components regression (POCRE) method to estimate breeding values. As a supervised dimension reduction method, POCRE sequentially constructs linear combinations of markers, i.e., orthogonal components, such that these components are most closely correlated with the phenotype. Such dimension reduction groups highly correlated predictors and allows for collinear or nearly collinear markers. Unlike BayesB, which predetermines hyperparameters, POCRE uses an empirical Bayes thresholding method to obtain data-driven optimal hyperparameters and to effectively select important markers when constructing each component. Simulation studies show that POCRE greatly reduces computing time compared with BayesB. Moreover, unlike fBayesB, which slightly sacrifices prediction accuracy for fast computation, POCRE predicts breeding values with accuracy similar to or even better than that of BayesB in both simulation studies and real data analyses.
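
The sequential construction of sparse, mutually orthogonal components described above can be illustrated with a short Python sketch. This is not the POCRE algorithm itself: a fixed soft threshold (the hypothetical parameter `frac`) stands in for the empirical Bayes thresholding, and the function names are illustrative assumptions rather than part of any published implementation.

```python
import numpy as np

def soft_threshold(v, lam):
    # Shrink each entry toward zero by lam; entries smaller than lam become 0.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_orthogonal_components(X, y, n_components=2, frac=0.5):
    """Simplified stand-in for supervised component construction.

    X: (n_samples, n_markers) marker matrix; y: (n_samples,) phenotype.
    frac: assumed fixed threshold, as a fraction of the largest
          marker-phenotype covariance (POCRE chooses this from the data).
    """
    Xk = X - X.mean(axis=0)          # centered markers, deflated in the loop
    yc = y - y.mean()                # centered phenotype
    components, weights = [], []
    for _ in range(n_components):
        c = Xk.T @ yc                # covariance of each marker with phenotype
        w = soft_threshold(c, frac * np.abs(c).max())  # keep strong markers only
        norm = np.linalg.norm(w)
        if norm == 0.0:              # nothing left that covaries with y
            break
        w = w / norm
        t = Xk @ w                   # new component: linear combination of markers
        # Deflate: remove t from every marker so later components are orthogonal to it.
        Xk = Xk - np.outer(t, t @ Xk) / (t @ t)
        components.append(t)
        weights.append(w)
    return np.column_stack(components), np.column_stack(weights)

def fit_predict(T, y):
    # Because the components are orthogonal, each coefficient is a simple ratio.
    yc = y - y.mean()
    coef = (T.T @ yc) / np.sum(T * T, axis=0)
    return y.mean() + T @ coef
```

Because each component is removed from the marker matrix before the next one is built, the components are orthogonal by construction, and the regression of the phenotype on them reduces to one univariate fit per component.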

Biography

Min Zhang received an M.D. from Hebei Medical University and a Ph.D. from Beijing Medical University before moving to the USA, where he obtained an M.S. in biometry and a Ph.D. in biological statistics & computational biology from Cornell University. He is currently an Associate Professor in the Department of Statistics at Purdue University.

Min Zhang's research interests include Bayesian methods, bioinformatics and biologically related disciplines (genomics, nutrition, proteomics, statistical genetics), inference from high-dimensional data, Markov chain Monte Carlo, massive data, missing data, modeling and model selection, and physician profiling in managed healthcare. Min Zhang regularly publishes research articles applying statistical methods to problems in biology and medicine.

Min Zhang has received numerous awards, including a Duke University Fellowship, an Excellent Dissertation Award, an Outstanding Teaching Assistant Award, a Best Student Paper Award, a Graduate School Fellowship, a Thesis Research Award, a Liu Memorial Award, a Seed for Success Award at Purdue University, and a College of Science Interdisciplinary Award.