Grant Details
Grant Number: 1R21CA191383-01
Primary Investigator: Ma, Shuangge
Organization: Yale University
Project Title: Penalization Methods for Identifying Gene Environment Interactions and Applications to Melanoma and Other Cancer Types
Fiscal Year: 2015
Abstract
DESCRIPTION (provided by applicant): Considerable effort has been devoted to developing statistical methods for identifying G*E interactions in cancer GWAS studies. The existing methods suffer serious limitations. First, most of them take a model-based approach. The model assumptions are difficult to verify in data analysis, and there is a high risk of model misspecification, which leads to false marker identification. The existing robust methods have limited applicability. Second, the existing methods adopt ineffective statistical techniques. Recently, we and others introduced effective penalization techniques for identifying important G*E interactions and showed that they significantly outperform the existing techniques. However, the existing penalization methods also have limitations. They adopt an estimation-based marker identification strategy, which is sensitive to tuning parameter selection, lacks stability, and does not have a direct false discovery rate control. In addition, they incur prohibitively high computational cost. The aforementioned limitations can mask the identification of important effects, lead to inconsistent findings across studies, and result in suboptimal predictive models. In this study, we will develop novel methods for detecting G*E interactions in the analysis of cancer etiology, prognosis, and biomarker data. The proposed methods will have the robustness property not shared by the model-based approach. They will adopt novel penalization techniques and advance from the existing penalization methods by adopting and directly comparing multiple marker identification strategies. They will be able to conduct both marginal and joint analyses and both individual marker- and pathway-level analyses. By adopting a progressive approach, they will be computationally affordable with whole-genome data.
Specifically, we will (Aim 1) Develop robust penalization methods for identifying important environmental, genetic, and G*E risk factors associated with cancer risk, survival, and biomarkers. We will develop effective computational algorithms and rigorously prove the robustness and consistency properties. Extensive simulations and comparisons will be conducted. (Aim 2) Develop user-friendly software and a project website. We will make the software and other research results easily accessible. (Aim 3) Analyze data on melanoma and other cancer types and identify important G*E interactions. We will comprehensively evaluate the identified markers and compare them with the results obtained using existing methods. This study will deliver a set of novel methods which will have superior statistical and numerical properties and identify important markers missed by existing methods. They will be broadly applicable to a large number of cancer types and to multiple types of genetic, genomic, and epigenetic measurements. In data analysis, the identified markers will provide important insights into the biological mechanisms underlying melanoma and other cancers and serve as a basis for future validation studies and clinical practice.
Publications
Semiparametric Bayesian variable selection for gene-environment interactions.
Authors: Ren J., Zhou F., Li X., Chen Q., Zhang H., Ma S., Jiang Y., Wu C.
Source: Statistics In Medicine, 2019-12-21.
EPub date: 2019-12-21.
PMID: 31863500
Identifying gene-environment interactions incorporating prior information.
Authors: Wang X., Xu Y., Ma S.
Source: Statistics In Medicine, 2019-04-30; 38(9), p. 1620-1633.
EPub date: 2019-01-13.
PMID: 30637789
Robust network-based regularization and variable selection for high-dimensional genomic data in cancer prognosis.
Authors: Ren J., Du Y., Li S., Ma S., Jiang Y., Wu C.
Source: Genetic Epidemiology, 2019 Apr; 43(3), p. 276-291.
EPub date: 2019-02-11.
PMID: 30746793
Inference for low-dimensional covariates in a high-dimensional accelerated failure time model.
Authors: Chai H., Zhang Q., Huang J., Ma S.
Source: Statistica Sinica, 2019 Apr; 29(2), p. 877-894.
PMID: 31073263
Integrative Interaction Analysis using Threshold Gradient Directed Regularization.
Authors: Li Y., Li R., Qin Y., Wu M., Ma S.
Source: Applied Stochastic Models In Business And Industry, 2019 Mar-Apr; 35(2), p. 354-375.
EPub date: 2018-05-29.
PMID: 33071651
A Selective Review of Multi-Level Omics Data Integration Using Variable Selection.
Authors: Wu C., Zhou F., Ren J., Li X., Jiang Y., Ma S.
Source: High-throughput, 2019-01-18; 8(1).
EPub date: 2019-01-18.
PMID: 30669303
Robust identification of gene-environment interactions for prognosis using a quantile partial correlation approach.
Authors: Xu Y., Wu M., Zhang Q., Ma S.
Source: Genomics, 2018-07-16.
EPub date: 2018-07-16.
PMID: 30009922
Dissecting gene-environment interactions: A penalized robust approach accounting for hierarchical structures.
Authors: Wu C., Jiang Y., Ren J., Cui Y., Ma S.
Source: Statistics In Medicine, 2018-02-10; 37(3), p. 437-456.
EPub date: 2017-10-16.
PMID: 29034484
Analysis of cancer gene expression data with an assisted robust marker identification approach.
Authors: Chai H., Shi X., Zhang Q., Zhao Q., Huang Y., Ma S.
Source: Genetic Epidemiology, 2017 Dec; 41(8), p. 779-789.
EPub date: 2017-09-14.
PMID: 28913902
Integrative sparse principal component analysis of gene expression data.
Authors: Liu M., Fan X., Fang K., Zhang Q., Ma S.
Source: Genetic Epidemiology, 2017 Dec; 41(8), p. 844-865.
EPub date: 2017-11-08.
PMID: 29114920
Sparse boosting for high-dimensional survival data with varying coefficients.
Authors: Yue M., Li J., Ma S.
Source: Statistics In Medicine, 2017-11-19.
EPub date: 2017-11-19.
PMID: 29152776
Identifying gene-gene interactions using penalized tensor regression.
Authors: Wu M., Huang J., Ma S.
Source: Statistics In Medicine, 2017-10-16.
EPub date: 2017-10-16.
PMID: 29034516
Identifying gene-environment interactions for prognosis using a robust approach.
Authors: Chai H., Zhang Q., Jiang Y., Wang G., Zhang S., Ahmed S.E., Ma S.
Source: Econometrics And Statistics, 2017 Oct; 4, p. 105-120.
EPub date: 2016-11-16.
PMID: 31157309
Assisted clustering of gene expression data using ANCut.
Authors: Teran Hidalgo S.J., Wu M., Ma S.
Source: BMC Genomics, 2017-08-16; 18(1), p. 623.
EPub date: 2017-08-16.
PMID: 28814280
Inferring gene regulatory relationships with a high-dimensional robust approach.
Authors: Zang Y., Zhao Q., Zhang Q., Li Y., Zhang S., Ma S.
Source: Genetic Epidemiology, 2017 Jul; 41(5), p. 437-454.
EPub date: 2017-05-02.
PMID: 28464328
Accommodating missingness in environmental measurements in gene-environment interaction analysis.
Authors: Wu M., Zang Y., Zhang S., Huang J., Ma S.
Source: Genetic Epidemiology, 2017-06-28.
EPub date: 2017-06-28.
PMID: 28657194
Focused Information Criterion and Model Averaging with Generalized Rank Regression.
Authors: Zhang Q., Duan X., Ma S.
Source: Statistics & Probability Letters, 2017 Mar; 122, p. 11-19.
EPub date: 2016-10-31.
PMID: 28566799
Greedy Outcome Weighted Tree Learning Of Optimal Personalized Treatment Rules.
Authors: Zhu R., Zhao Y.Q., Chen G., Ma S., Zhao H.
Source: Biometrics, 2016-10-04.
PMID: 27704531
A penalized robust semiparametric approach for gene-environment interactions.
Authors: Wu C., Shi X., Cui Y., Ma S.
Source: Statistics In Medicine, 2015-12-30; 34(30), p. 4016-30.
EPub date: 2015-12-30.
PMID: 26239060