Grant Details
Grant Number: 1R01CA158113-01
Primary Investigator: Johnson, Valen
Organization: University Of Tx Md Anderson Can Ctr
Project Title: Consistent Model Selection in the P>>n Setting
Fiscal Year: 2011
Abstract
DESCRIPTION (provided by applicant): Among the most fundamental and commonly encountered statistical problems in medical research is the problem of model selection. Model selection is the process by which researchers identify the relationships between measured quantities; thus it plays a central role in the analysis of essentially all high-throughput screening data. Model selection procedures represent the primary analytical mechanism through which the associations between diseases and large numbers of biochemical, genetic and pharmacological variables are discovered. The fundamental hypothesis tested in this application is that a new class of model selection procedures can be used to effectively identify associations between biological variables and disease outcomes, even in settings where there are many more potential biological correlates than there are observations on each variable. The goals of this project are to develop these variable selection procedures so that they can be applied to high-throughput screening data, and to apply the resulting methodology in three important application areas. To achieve these goals, the following specific aims will be addressed. Known theoretical properties of the proposed model selection procedures will be extended to cases in which there are many more biological measurements available than there are observations on each measurement (i.e., the p >> n setting). Constraints on the number of variables that can be included in final models for outcome variables will be determined, and efficient numerical algorithms will be developed so that these methods can be applied to actual high-throughput screening data. The new model selection procedures will be used to define binary classification algorithms that can predict clinical outcomes from high-dimensional gene expression data sets.
The new model selection procedures will be used to identify and analyze interactions between genes that are associated with cancer and other diseases in genome-wide association studies using single-nucleotide polymorphism data. The new model selection procedures will be used to analyze biological pathways as informed by high-throughput molecular interrogation data. The algorithms developed during this project constitute a major innovation in the field of model selection and will provide medical researchers with a new and unique set of tools for effectively identifying biological associations among biomarkers, disease attributes, and patient outcomes from high-throughput screening data.
PUBLIC HEALTH RELEVANCE: Model selection procedures are statistical techniques that allow researchers to discover the associations between disease and the large number of variables that are measured in emerging high-throughput screening technologies. For example, model selection techniques are used to discover which genes are associated with particular forms of cancer. This project proposes a new class of model selection procedures that will make it easier for researchers to discover such associations.
Publications
Bayes factor functions for reporting outcomes of hypothesis tests.
Authors: Johnson V.E., Pramanik S., Shudde R.
Source: Proceedings Of The National Academy Of Sciences Of The United States Of America, 2023-02-21; 120(8), p. e2217331120.
EPub date: 2023-02-13.
PMID: 36780516
Efficient alternatives for Bayesian hypothesis tests in psychology.
Authors: Pramanik S., Johnson V.E.
Source: Psychological Methods, 2022-04-14.
EPub date: 2022-04-14.
PMID: 35420854
Bayesian Edge Regression in Undirected Graphical Models to Characterize Interpatient Heterogeneity in Cancer.
Authors: Wang Z., Kaseb A.O., Amin H.M., Hassan M.M., Wang W., Morris J.S.
Source: Journal Of The American Statistical Association, 2022; 117(538), p. 533-546.
EPub date: 2022-01-05.
PMID: 36090952
A Hyperparameter-Free, Fast and Efficient Framework to Detect Clusters From Limited Samples Based on Ultra High-Dimensional Features.
Authors: Rahman S., Johnson V.E., Rao S.S.
Source: IEEE Access: Practical Innovations, Open Solutions, 2022; 10, p. 116844-116857.
EPub date: 2022-11-01.
PMID: 37275750
Single-cell ATAC and RNA sequencing reveal pre-existing and persistent cells associated with prostate cancer relapse.
Authors: Taavitsainen S., Engedal N., Cao S., Handle F., Erickson A., Prekovic S., Wetterskog D., Tolonen T., Vuorinen E.M., Kiviaho A., et al.
Source: Nature Communications, 2021-09-06; 12(1), p. 5307.
EPub date: 2021-09-06.
PMID: 34489465
A Modified Sequential Probability Ratio Test.
Authors: Pramanik S., Johnson V.E., Bhattacharya A.
Source: Journal Of Mathematical Psychology, 2021 Apr; 101.
EPub date: 2021-03-04.
PMID: 35496657
On the Existence of Uniformly Most Powerful Bayesian Tests With Application to Non-Central Chi-Squared Tests.
Authors: Nikooienejad A., Johnson V.E.
Source: Bayesian Analysis, 2021 Mar; 16(1), p. 93-109.
EPub date: 2020-01-07.
PMID: 34113418
A pedigree-based prediction model identifies carriers of deleterious de novo mutations in families with Li-Fraumeni syndrome.
Authors: Gao F., Pan X., Dodd-Eaton E.B., Recio C.V., Montierth M.D., Bojadzieva J., Mai P.L., Zelley K., Johnson V.E., Braun D., et al.
Source: Genome Research, 2020 Aug; 30(8), p. 1170-1180.
EPub date: 2020-08-18.
PMID: 32817165
BAYESIAN VARIABLE SELECTION FOR SURVIVAL DATA USING INVERSE MOMENT PRIORS.
Authors: Nikooienejad A., Wang W., Johnson V.E.
Source: The Annals Of Applied Statistics, 2020 Jun; 14(2), p. 809-828.
EPub date: 2020-06-29.
PMID: 33456641
Penetrance Estimates Over Time to First and Second Primary Cancer Diagnosis in Families with Li-Fraumeni Syndrome: A Single Institution Perspective.
Authors: Shin S.J., Dodd-Eaton E.B., Gao F., Bojadzieva J., Chen J., Kong X., Amos C.I., Ning J., Strong L.C., Wang W.
Source: Cancer Research, 2020-01-15; 80(2), p. 347-353.
EPub date: 2019-11-12.
PMID: 31719099
Penetrance of Different Cancer Types in Families with Li-Fraumeni Syndrome: A Validation Study Using Multicenter Cohorts.
Authors: Shin S.J., Dodd-Eaton E.B., Peng G., Bojadzieva J., Chen J., Amos C.I., Frone M.N., Khincha P.P., Mai P.L., Savage S.A., et al.
Source: Cancer Research, 2020-01-15; 80(2), p. 354-360.
EPub date: 2019-11-12.
PMID: 31719101
Functional Horseshoe Priors for Subspace Shrinkage.
Authors: Shin M., Bhattacharya A., Johnson V.E.
Source: Journal Of The American Statistical Association, 2020; 115(532), p. 1784-1797.
EPub date: 2019-09-17.
PMID: 33716358
Transformed low-rank ANOVA models for high-dimensional variable selection.
Authors: Jung Y., Zhang H., Hu J.
Source: Statistical Methods In Medical Research, 2019 Apr; 28(4), p. 1230-1246.
EPub date: 2018-01-31.
PMID: 29384042
GWASinlps: non-local prior based iterative SNP selection tool for genome-wide association studies.
Authors: Sanyal N., Lo M.T., Kauppi K., Djurovic S., Andreassen O.A., Johnson V.E., Chen C.H.
Source: Bioinformatics (Oxford, England), 2019-01-01; 35(1), p. 1-11.
PMID: 29931045
statistics.
Authors: Johnson V.E.
Source: The American Statistician, 2019; 73(Suppl 1), p. 129-134.
EPub date: 2019-03-20.
PMID: 31123367
Transcriptome Deconvolution of Heterogeneous Tumor Samples with Immune Infiltration.
Authors: Wang Z., Cao S., Morris J.S., Ahn J., Liu R., Tyekucheva S., Gao F., Li B., Lu W., Tang X., et al.
Source: iScience, 2018-11-30; 9, p. 451-460.
EPub date: 2018-11-02.
PMID: 30469014
Scalable Bayesian Variable Selection Using Nonlocal Prior Densities in Ultrahigh-dimensional Settings.
Authors: Shin M., Bhattacharya A., Johnson V.E.
Source: Statistica Sinica, 2018 Apr; 28(2), p. 1053-1078.
PMID: 29643721
Tractable Bayesian variable selection: beyond normality.
Authors: Rossell D., Rubio F.J.
Source: Journal Of The American Statistical Association, 2018; 113(524), p. 1742-1758.
EPub date: 2018-06-28.
PMID: 30906086
Bayesian block-diagonal variable selection and model averaging.
Authors: Papaspiliopoulos O., Rossell D.
Source: Biometrika, 2017 Jun; 104(2), p. 343-359.
EPub date: 2017-04-24.
PMID: 29861501
On the Reproducibility of Psychological Science.
Authors: Johnson V.E., Payne R.D., Wang T., Asher A., Mandal S.
Source: Journal Of The American Statistical Association, 2017; 112(517), p. 1-10.
EPub date: 2016-10-07.
PMID: 29861517
NON-LOCAL PRIORS FOR HIGH-DIMENSIONAL ESTIMATION.
Authors: Rossell D., Telesca D.
Source: Journal Of The American Statistical Association, 2017; 112(517), p. 254-265.
EPub date: 2017-05-03.
PMID: 29881129
Bayesian variable selection for binary outcomes in high-dimensional genomic studies using non-local priors.
Authors: Nikooienejad A., Wang W., Johnson V.E.
Source: Bioinformatics (Oxford, England), 2016-05-01; 32(9), p. 1338-45.
EPub date: 2016-05-01.
PMID: 26740524
A robust Bayesian dose-finding design for phase I/II clinical trials.
Authors: Liu S., Johnson V.E.
Source: Biostatistics (Oxford, England), 2016 Apr; 17(2), p. 249-63.
PMID: 26486139
Designing alternative splicing RNA-seq studies. Beyond generic guidelines.
Authors: Stephan-Otto Attolini C., Peña V., Rossell D.
Source: Bioinformatics (Oxford, England), 2015-11-15; 31(22), p. 3631-7.
EPub date: 2015-11-15.
PMID: 26220961
Predictive classification of correlated targets with application to detection of metastatic cancer using functional CT imaging.
Authors: Wang Y., Hobbs B.P., Hu J., Ng C.S., Do K.A.
Source: Biometrics, 2015 Sep; 71(3), p. 792-802.
PMID: 25851056
A Unified Family of Covariate-Adjusted Response-Adaptive Designs Based on Efficiency and Ethics.
Authors: Hu J., Zhu H., Hu F.
Source: Journal Of The American Statistical Association, 2015-04-22; 110(509), p. 357-367.
PMID: 26120220
Detecting differential patterns of interaction in molecular pathways.
Authors: Yajima M., Telesca D., Ji Y., Müller P.
Source: Biostatistics (Oxford, England), 2015 Apr; 16(2), p. 240-51.
PMID: 25519431
Estimating and Identifying Unspecified Correlation Structure for Longitudinal Data.
Authors: Hu J., Wang P., Qu A.
Source: Journal Of Computational And Graphical Statistics: A Joint Publication Of American Statistical Association, Institute Of Mathematical Statistics, Interface Foundation Of North America, 2015-04-01; 24(2), p. 455-476.
PMID: 26361433
A K-fold Averaging Cross-validation Procedure.
Authors: Jung Y., Hu J.
Source: Journal Of Nonparametric Statistics, 2015; 27(2), p. 167-179.
PMID: 27630515
BIG DATA AND STATISTICS: A STATISTICIAN'S PERSPECTIVE.
Authors: Rossell D.
Source: Metode Science Studies Journal: Annual Review, 2015; 5, p. 143-149.
PMID: 27722040
Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA.
Authors: Jung Y., Huang J.Z., Hu J.
Source: Journal Of The American Statistical Association, 2014-12-01; 109(508), p. 1355-1367.
PMID: 25642005
Evaluation of image registration spatial accuracy using a Bayesian hierarchical model.
Authors: Liu S., Yuan Y., Castillo R., Guerrero T., Johnson V.E.
Source: Biometrics, 2014 Jun; 70(2), p. 366-77.
PMID: 24575781
QUANTIFYING ALTERNATIVE SPLICING FROM PAIRED-END RNA-SEQUENCING DATA.
Authors: Rossell D., Stephan-Otto Attolini C., Kroiss M., Stöcker A.
Source: The Annals Of Applied Statistics, 2014 Mar; 8(1), p. 309-330.
PMID: 24795787
On Numerical Aspects of Bayesian Model Selection in High and Ultrahigh-dimensional Settings.
Authors: Johnson V.E.
Source: Bayesian Analysis, 2013-12-01; 8(4), p. 741-758.
PMID: 24683431
Revised standards for statistical evidence.
Authors: Johnson V.E.
Source: Proceedings Of The National Academy Of Sciences Of The United States Of America, 2013-11-26; 110(48), p. 19313-7.
EPub date: 2013-11-26.
PMID: 24218581
Bayesian adaptive phase II screening design for combination trials.
Authors: Cai C., Yuan Y., Johnson V.E.
Source: Clinical Trials (London, England), 2013; 10(3), p. 353-62.
PMID: 23359875
UNIFORMLY MOST POWERFUL BAYESIAN TESTS.
Authors: Johnson V.E.
Source: Annals Of Statistics, 2013; 41(4), p. 1716-1741.
PMID: 24659829
Reno: regularized non-parametric analysis of protein lysate array data.
Authors: Li B., Liang F., Hu J., He A.X.
Source: Bioinformatics (Oxford, England), 2012-05-01; 28(9), p. 1223-9.
EPub date: 2012-05-01.
PMID: 22467912
Goodness-of-fit diagnostics for Bayesian hierarchical models.
Authors: Yuan Y., Johnson V.E.
Source: Biometrics, 2012 Mar; 68(1), p. 156-64.
PMID: 22050079
Bayesian Model Selection in High-Dimensional Settings.
Authors: Johnson V.E., Rossell D.
Source: Journal Of The American Statistical Association, 2012; 107(498).
PMID: 24363474