TY - JOUR
T1 - Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies
AU - Zhang, Xiaoshuai
AU - Xue, Fuzhong
AU - Liu, Hong
AU - Zhu, Dianwen
AU - Peng, Bin
AU - Wiemels, Joseph L.
AU - Yang, Xiaowei
PY - 2014
Y1 - 2014
N2 - Background: Genome-wide Association Studies (GWAS) are typically designed to identify phenotype-associated single nucleotide polymorphisms (SNPs) individually using univariate analysis methods. Though providing valuable insights into genetic risks of common diseases, the genetic variants identified by GWAS generally account for only a small proportion of the total heritability for complex diseases. To solve this "missing heritability" problem, we implemented a strategy called integrative Bayesian Variable Selection (iBVS), which is based on a hierarchical model that incorporates an informative prior by considering the gene interrelationship as a network. It was applied here to both simulated and real data sets. Results: Simulation studies indicated that the iBVS method was advantageous in its performance with highest AUC in both variable selection and outcome prediction, when compared to Stepwise and LASSO based strategies. In an analysis of a leprosy case-control study, iBVS selected 94 SNPs as predictors, while LASSO selected 100 SNPs. The Stepwise regression yielded a more parsimonious model with only 3 SNPs. The prediction results demonstrated that the iBVS method had comparable performance with that of LASSO, but better than Stepwise strategies. Conclusions: The proposed iBVS strategy is a novel and valid method for Genome-wide Association Studies, with the additional advantage in that it produces more interpretable posterior probabilities for each variable unlike LASSO and other penalized regression methods.
KW - Bayesian hierarchical modeling
KW - Bayesian variable selection
KW - Biomarker discovery
KW - Gene-based biomarkers
KW - Integrative biomarker identification
UR - http://www.scopus.com/inward/record.url?scp=84964312479&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84964312479&partnerID=8YFLogxK
U2 - 10.1186/s12863-014-0130-7
DO - 10.1186/s12863-014-0130-7
M3 - Article
C2 - 25491445
AN - SCOPUS:84964312479
VL - 15
JO - BMC Genetics
JF - BMC Genetics
SN - 1471-2156
IS - 1
M1 - 130
ER -