Grant Details

Grant Number: 5R01CA204120-08
Primary Investigator: Ma, Shuangge
Organization: Yale University
Project Title: Novel Methods for Identifying Genetic Interactions for Cancer Prognosis
Fiscal Year: 2024


Abstract

Project Summary

For the prognosis of melanoma, lung cancer, and many other cancers, G-E (gene-environment) interactions have important implications. Through a series of studies, our group has taken a unique robustness perspective and a leading role in developing the foundation of G-E interaction analysis using cutting-edge high-dimensional and regularized statistics. Recently, our group pioneered I-E (histopathological imaging-environment) interaction analysis and significantly expanded the scope of cancer analytics. We have made important discoveries for NHL, melanoma, and lung cancer, impactfully advancing their translational research and clinical practice. Our overarching goal is to construct more powerful prognosis models and more accurately identify G-E/I-E interactions so as to truthfully describe cancer biology and informatively guide clinical decision-making. In this project, we will be the first to develop paradigm-shifting SDL (statistically principled deep learning) techniques tailored to G-E/I-E interaction analysis for cancer prognosis. The proposed methods will inherit strengths from the existing deep learning and regression techniques and be superior to both. We will continue analyzing data on melanoma and lung cancer, further enhancing the high translational and clinical impact of our study. We will:

(Aim 1) Develop foundational SDL techniques tailored to G-E/I-E interaction analysis. We will first develop “benchmark” nonrobust losses and then innovatively advance to losses that are robust to model mis-specification and long-tailed distribution/contamination. A novel penalization technique will be applied for architecture construction, which will accommodate the unique characteristics of the main G/I effects, main E effects, and their interactions in a customized manner, screen out noises, and respect the “main effects, interactions” hierarchy.

(Aim 2) Boost performance by incorporating additional information. We will cost-effectively improve SDL performance by incorporating additional information on (a) the interconnections between prognosis and G-E/I-E interactions as well as main G/I effects, and (b) the interconnections among G/I variables.

(Aim 3) Expand analysis scope and integrate multiple types of G/I measurements. Motivated by their overlapping but also independent information for prognosis, we will develop novel SDL methods and be the first to integrate multiple types of molecular and imaging measurements in interaction analysis.

(Aim 4) Analyze the Yale SPORE and TCGA data on melanoma and lung cancer. Analysis will be conducted on multiple prognosis outcomes. Demographic/clinical/environmental risk factors, multiple types of molecular measurements (protein, gene expression, mutation, methylation, and microRNA), and histopathological imaging features will be analyzed. The analysis results will be thoroughly and rigorously evaluated, extensively compared to those using alternatives, and validated in multiple ways.



Publications

Bayesian Modeling of Cancer Outcomes Using Genetic Variables Assisted by Pathological Imaging Data.
Authors: Im Y., Li R., Ma S.
Source: Statistics In Medicine, 2025-02-10; 44(3-4), p. e10350.
PMID: 39840672

The spike-and-slab quantile LASSO for robust variable selection in cancer genomics studies.
Authors: Liu Y., Ren J., Ma S., Wu C.
Source: Statistics In Medicine, 2024-09-11.
EPub date: 2024-09-11.
PMID: 39260448

Estimation of multiple networks with common structures in heterogeneous subgroups.
Authors: Qin X., Hu J., Ma S., Wu M.
Source: Journal Of Multivariate Analysis, 2024 Jul; 202.
EPub date: 2024-02-13.
PMID: 38433779

Hierarchical False Discovery Rate Control for High-dimensional Survival Analysis with Interactions.
Authors: Liang W., Zhang Q., Ma S.
Source: Computational Statistics & Data Analysis, 2024 Apr; 192.
EPub date: 2023-12-05.
PMID: 38098875

Information-incorporated sparse hierarchical cancer heterogeneity analysis.
Authors: Han W., Zhang S., Ma S., Ren M.
Source: Statistics In Medicine, 2024-03-30.
EPub date: 2024-03-30.
PMID: 38553996

Prediction Consistency Regularization for Learning with Noise Labels Based on Contrastive Clustering.
Authors: Sun X., Zhang S., Ma S.
Source: Entropy (Basel, Switzerland), 2024-03-30; 26(4).
EPub date: 2024-03-30.
PMID: 38667864

FunctanSNP: an R package for functional analysis of dense SNP data (with interactions).
Authors: Ren R., Fang K., Zhang Q., Ma S.
Source: Bioinformatics (Oxford, England), 2023-12-01; 39(12).
PMID: 38060266

The Bayesian Regularized Quantile Varying Coefficient Model.
Authors: Zhou F., Ren J., Ma S., Wu C.
Source: Computational Statistics & Data Analysis, 2023 Nov; 187.
EPub date: 2023-06-23.
PMID: 38746689

Locally sparse quantile estimation for a partially functional interaction model.
Authors: Liang W., Zhang Q., Ma S.
Source: Computational Statistics & Data Analysis, 2023 Oct; 186.
EPub date: 2023-05-25.
PMID: 39555004

Aligned deep neural network for integrative analysis with high-dimensional input.
Authors: Zhang S., Zhang S., Yi H., Ma S.
Source: Journal Of Biomedical Informatics, 2023 Aug; 144, p. 104434.
EPub date: 2023-06-28.
PMID: 37391115

Pathological imaging-assisted cancer gene-environment interaction analysis.
Authors: Fang K., Li J., Zhang Q., Xu Y., Ma S.
Source: Biometrics, 2023-05-03.
EPub date: 2023-05-03.
PMID: 37132273

Bi-level structured functional analysis for genome-wide association studies.
Authors: Wu M., Wang F., Ge Y., Ma S., Li Y.
Source: Biometrics, 2023-04-26.
EPub date: 2023-04-26.
PMID: 37098961

Bayesian finite mixture of regression analysis for cancer based on histopathological imaging-environment interactions.
Authors: Im Y., Huang Y., Tan A., Ma S.
Source: Biostatistics (Oxford, England), 2023-04-14; 24(2), p. 425-442.
PMID: 37057611

Gene-environment interaction analysis via deep learning.
Authors: Wu S., Xu Y., Zhang Q., Ma S.
Source: Genetic Epidemiology, 2023 Apr; 47(3), p. 261-286.
EPub date: 2023-02-19.
PMID: 36807383

Unified model-free interaction screening via CV-entropy filter.
Authors: Xiong W., Chen Y., Ma S.
Source: Computational Statistics & Data Analysis, 2023 Apr; 180.
EPub date: 2022-12-28.
PMID: 36910335

Heterogeneity analysis via integrating multi-sources high-dimensional data with applications to cancer studies.
Authors: Zhong T., Zhang Q., Huang J., Wu M., Ma S.
Source: Statistica Sinica, 2023 Apr; 33(2), p. 729-758.
PMID: 38037567

Spatio-temporally smoothed deep survival neural network.
Authors: Li Y., Liang D., Ma S., Ma C.
Source: Journal Of Biomedical Informatics, 2023 Jan; 137, p. 104255.
EPub date: 2022-12-01.
PMID: 36462600

A General Framework for Identifying Hierarchical Interactions and Its Application to Genomics Data.
Authors: Xiao Z., Xingjie S., Yiming L., Xu L., Ma S.
Source: Journal Of Computational And Graphical Statistics : A Joint Publication Of American Statistical Association, Institute Of Mathematical Statistics, Interface Foundation Of North America, 2023; 32(3), p. 873-883.
EPub date: 2023-02-06.
PMID: 38009111

Rank-Based Greedy Model Averaging for High-Dimensional Survival Data.
Authors: He B., Ma S., Zhang X., Zhu L.X.
Source: Journal Of The American Statistical Association, 2023; 118(544), p. 2658-2670.
EPub date: 2022-07-07.
PMID: 39552724

Two-level Bayesian interaction analysis for survival data incorporating pathway information.
Authors: Qin X., Ma S., Wu M.
Source: Biometrics, 2022-12-16.
EPub date: 2022-12-16.
PMID: 36524727

A tree-based gene-environment interaction analysis with rare features.
Authors: Liu M., Zhang Q., Ma S.
Source: Statistical Analysis And Data Mining, 2022 Oct; 15(5), p. 648-674.
EPub date: 2022-03-01.
PMID: 38046814

Sparse group variable selection for gene-environment interactions in the longitudinal study.
Authors: Zhou F., Lu X., Ren J., Fan K., Ma S., Wu C.
Source: Genetic Epidemiology, 2022-06-29.
EPub date: 2022-06-29.
PMID: 35766061

Network-based cancer heterogeneity analysis incorporating multi-view of prior information.
Authors: Li Y., Xu S., Ma S., Wu M.
Source: Bioinformatics (Oxford, England), 2022-05-13; 38(10), p. 2855-2862.
PMID: 35561185

Biclustering analysis of functionals via penalized fusion.
Authors: Fang K., Chen Y., Ma S., Zhang Q.
Source: Journal Of Multivariate Analysis, 2022 May; 189.
EPub date: 2021-10-29.
PMID: 36817965

GEInfo: an R package for gene-environment interaction analysis incorporating prior information.
Authors: Wang X., Liu H., Ma S.
Source: Bioinformatics (Oxford, England), 2022-04-29.
EPub date: 2022-04-29.
PMID: 35485739

iSFun: an R package for integrative dimension reduction analysis.
Authors: Fang K., Ren R., Zhang Q., Ma S.
Source: Bioinformatics (Oxford, England), 2022-04-20.
EPub date: 2022-04-20.
PMID: 35441661

Integrative functional linear model for genome-wide association studies with multiple traits.
Authors: Li Y., Wang F., Wu M., Ma S.
Source: Biostatistics (Oxford, England), 2022-04-13; 23(2), p. 574-590.
PMID: 33040145

Robust Bayesian variable selection for gene-environment interactions.
Authors: Ren J., Zhou F., Li X., Ma S., Jiang Y., Wu C.
Source: Biometrics, 2022-04-08.
EPub date: 2022-04-08.
PMID: 35394058

Bayesian hierarchical finite mixture of regression for histopathological imaging-based cancer data analysis.
Authors: Im Y., Huang Y., Huang J., Ma S.
Source: Statistics In Medicine, 2022-01-13.
EPub date: 2022-01-13.
PMID: 35028949

Gene-environment interaction identification via penalized robust divergence.
Authors: Ren M., Zhang S., Ma S., Zhang Q.
Source: Biometrical Journal. Biometrische Zeitschrift, 2021-11-01.
EPub date: 2021-11-01.
PMID: 34725857

Gene-gene interaction analysis incorporating network information via a structured Bayesian approach.
Authors: Qin X., Ma S., Wu M.
Source: Statistics In Medicine, 2021-09-20.
EPub date: 2021-09-20.
PMID: 34542187

Hierarchical cancer heterogeneity analysis based on histopathological imaging features.
Authors: Ren M., Zhang Q., Zhang S., Zhong T., Huang J., Ma S.
Source: Biometrics, 2021-08-14.
EPub date: 2021-08-14.
PMID: 34390584

Marginal false discovery rate for a penalized transformation survival model.
Authors: Liang W., Ma S., Lin C.
Source: Computational Statistics & Data Analysis, 2021 Aug; 160.
EPub date: 2021-04-02.
PMID: 34393307

Multidimensional molecular measurements-environment interaction analysis for disease outcomes.
Authors: Xu Y., Wu M., Ma S.
Source: Biometrics, 2021-07-02.
EPub date: 2021-07-02.
PMID: 34213006

GEInter: an R package for robust gene-environment interaction analysis.
Authors: Wu M., Qin X., Ma S.
Source: Bioinformatics (Oxford, England), 2021-05-07.
EPub date: 2021-05-07.
PMID: 33961050

Information-incorporated Gaussian graphical model for gene expression data.
Authors: Yi H., Zhang Q., Lin C., Ma S.
Source: Biometrics, 2021-02-02.
EPub date: 2021-02-02.
PMID: 33527365

Histopathological imaging features- versus molecular measurements-based cancer prognosis modeling.
Authors: Zhang S., Fan Y., Zhong T., Ma S.
Source: Scientific Reports, 2020-09-14; 10(1), p. 15030.
EPub date: 2020-09-14.
PMID: 32929170

Histopathological imaging-based cancer heterogeneity analysis via penalized fusion with model averaging.
Authors: He B., Zhong T., Huang J., Liu Y., Zhang Q., Ma S.
Source: Biometrics, 2020-08-21.
EPub date: 2020-08-21.
PMID: 32822084

Tests for regression coefficients in high dimensional partially linear models.
Authors: Liu Y., Zhang S., Ma S., Zhang Q.
Source: Statistics & Probability Letters, 2020 Aug; 163.
EPub date: 2020-04-09.
PMID: 32431467

An integrative sparse boosting analysis of cancer genomic commonality and difference.
Authors: Sun Y., Sun Z., Jiang Y., Li Y., Ma S.
Source: Statistical Methods In Medical Research, 2020 May; 29(5), p. 1325-1337.
EPub date: 2019-07-07.
PMID: 31282286

Genetic susceptibility may modify the association between cell phone use and thyroid cancer: A population-based case-control study in Connecticut.
Authors: Luo J., Li H., Deziel N.C., Huang H., Zhao N., Ma S., Ni X., Udelsman R., Zhang Y.
Source: Environmental Research, 2020 Mar; 182, p. 109013.
EPub date: 2019-12-06.
PMID: 31918310

Semiparametric Bayesian variable selection for gene-environment interactions.
Authors: Ren J., Zhou F., Li X., Chen Q., Zhang H., Ma S., Jiang Y., Wu C.
Source: Statistics In Medicine, 2019-12-21.
EPub date: 2019-12-21.
PMID: 31863500

NCutYX: a package for clustering analysis of multilayer omics data.
Authors: Teran Hidalgo S.J., Wu M., Ma S.
Source: Bioinformatics (Oxford, England), 2019-11-15.
EPub date: 2019-11-15.
PMID: 31730176

Identification of gene-environment interactions with marginal penalization.
Authors: Zhang S., Xue Y., Zhang Q., Ma C., Wu M., Ma S.
Source: Genetic Epidemiology, 2019-11-14.
EPub date: 2019-11-14.
PMID: 31724772

Horizontal and vertical integrative analysis methods for mental disorders omics data.
Authors: Wang S., Shi X., Wu M., Ma S.
Source: Scientific Reports, 2019-09-17; 9(1), p. 13430.
EPub date: 2019-09-17.
PMID: 31530853

Structured gene-environment interaction analysis.
Authors: Wu M., Zhang Q., Ma S.
Source: Biometrics, 2019-08-19.
EPub date: 2019-08-19.
PMID: 31424088

Integrative Analysis of Cancer Omics Data for Prognosis Modeling.
Authors: Wang S., Wu M., Ma S.
Source: Genes, 2019-08-09; 10(8).
EPub date: 2019-08-09.
PMID: 31405076


