Grant Details

Grant Number: 5R01CA204120-08
Primary Investigator: Ma, Shuangge
Organization: Yale University
Project Title: Novel Methods for Identifying Genetic Interactions for Cancer Prognosis
Fiscal Year: 2024


Abstract

Project Summary

For the prognosis of melanoma, lung cancer, and many other cancers, G-E (gene-environment) interactions have important implications. Through a series of studies, our group has taken a unique robustness perspective and a leading role in developing the foundation of G-E interaction analysis using cutting-edge high-dimensional and regularized statistics. Recently, our group pioneered I-E (histopathological imaging-environment) interaction analysis and significantly expanded the scope of cancer analytics. We have made important discoveries for NHL, melanoma, and lung cancer, impactfully advancing their translational research and clinical practice. Our overarching goal is to construct more powerful prognosis models and more accurately identify G-E/I-E interactions so as to truthfully describe cancer biology and informatively guide clinical decision-making. In this project, we will be the first to develop paradigm-shifting SDL (statistically principled deep learning) techniques tailored to G-E/I-E interaction analysis for cancer prognosis. The proposed methods will inherit strengths from the existing deep learning and regression techniques and be superior to both. We will continue analyzing data on melanoma and lung cancer, further enhancing the high translational and clinical impact of our study. We will:

(Aim 1) Develop foundational SDL techniques tailored to G-E/I-E interaction analysis. We will first develop “benchmark” nonrobust losses and then innovatively advance to losses that are robust to model mis-specification and long-tailed distribution/contamination. A novel penalization technique will be applied for architecture construction, which will accommodate the unique characteristics of the main G/I effects, main E effects, and their interactions in a customized manner, screen out noises, and respect the “main effects, interactions” hierarchy.

(Aim 2) Boost performance by incorporating additional information. We will cost-effectively improve SDL performance by incorporating additional information on (a) the interconnections between prognosis and G-E/I-E interactions as well as main G/I effects, and (b) the interconnections among G/I variables.

(Aim 3) Expand analysis scope and integrate multiple types of G/I measurements. Motivated by their overlapping but also independent information for prognosis, we will develop novel SDL methods and be the first to integrate multiple types of molecular and imaging measurements in interaction analysis.

(Aim 4) Analyze the Yale SPORE and TCGA data on melanoma and lung cancer. Analysis will be conducted on multiple prognosis outcomes. Demographic/clinical/environmental risk factors, multiple types of molecular measurements (protein, gene expression, mutation, methylation, and microRNA), and histopathological imaging features will be analyzed. The analysis results will be thoroughly and rigorously evaluated, extensively compared to those using alternatives, and validated in multiple ways.



Publications

Robust Heterogeneity Adjustment for Gaussian Graphical Model With Latent Variables.
Authors: Li L. , Li R. , Ma S. , Zhang Q. .
Source: Statistics In Medicine, 2026 May; 45(10-12), p. e70571.
PMID: 42053355

Integrating Omics and Pathological Imaging Data for Cancer Prognosis via a Deep Neural Network-Based Cox Model.
Authors: Li J. , Ma S. .
Source: Statistics In Medicine, 2026 Feb; 45(3-5), p. e70435.
PMID: 41641685

DNN-based semiparametric AFT model for integrating genomic and pathological imaging data in cancer prognosis.
Authors: Li J. , Zhang Q. , Ma S. .
Source: Biometrics, 2026-01-06; 82(1).
PMID: 41837305

Hierarchical structure-guided high-dimensional multi-view clustering.
Authors: Jiang J. , Fang K. , Ma S. , Zhang Q. .
Source: Journal Of Multivariate Analysis, 2026 Jan; 211.
EPub date: 2025-09-25.
PMID: 41551979

Robust sparse Bayesian regression for longitudinal gene-environment interactions.
Authors: Fan K. , Jiang Y. , Ma S. , Wang W. , Wu C. .
Source: Journal Of The Royal Statistical Society. Series C, Applied Statistics, 2025 Dec; 74(5), p. 1372-1394.
EPub date: 2025-04-08.
PMID: 41245172

Joint identification of spatially variable genes via a network-assisted Bayesian regularization approach.
Authors: Wu M. , Li Y. , Ma S. , Wu M. .
Source: The Annals Of Applied Statistics, 2025 Dec; 19(4), p. 2705-2723.
EPub date: 2025-12-05.
PMID: 42136619

Ordinal Sparse Neural Networks for Modeling Gene- and Imaging-Environment Interactions.
Authors: Xue J. , Xu Y. , Li J. , Ma S. , Fang K. .
Source: Statistics In Medicine, 2025 Oct; 44(23-24), p. e70302.
PMID: 41105049

Subgroup Analysis of Differential Networks with Latent Variables.
Authors: Li L. , Ma S. , Zhang Q. .
Source: Statistics And Computing, 2025 Oct; 35(5).
EPub date: 2025-07-02.
PMID: 42130826

Network-based modeling of emotional expressions for multiple cancers via a linguistic analysis of an online health community.
Authors: Fan X. , Liu M. , Ma S. .
Source: The Annals Of Applied Statistics, 2025 Sep; 19(3), p. 2218-2236.
EPub date: 2025-08-28.
PMID: 41104371

Local Clustering for Functional Data.
Authors: Chen Y. , Zhang Q. , Ma S. .
Source: Journal Of Computational And Graphical Statistics : A Joint Publication Of American Statistical Association, Institute Of Mathematical Statistics, Interface Foundation Of North America, 2025 Sep; 34(3), p. 1075-1090.
EPub date: 2025-02-10.
PMID: 41200425

Joint modeling of mixed outcomes using a rank-based sparse neural network.
Authors: Xue J. , Xu Y. , Li J. , Ma S. , Fang K. .
Source: Journal Of Biomedical Informatics, 2025-07-05; 169, p. 104870.
EPub date: 2025-07-05.
PMID: 40623577

Robust Transfer Learning for High-Dimensional GLM Using γ-Divergence With Applications to Cancer Genomics.
Authors: Xu F. , Ma S. , Zhang Q. , Xu Y. .
Source: Statistics In Medicine, 2025 Jul; 44(15-17), p. e70170.
PMID: 40662636

Subgroup Testing in the Change-Plane Cox Model.
Authors: Zhang X. , Ren P. , Shi X. , Ma S. , Liu X. .
Source: Statistics In Medicine, 2025 Jul; 44(15-17), p. e70179.
PMID: 40662752

High-Dimensional Gene-Environment Interaction Analysis.
Authors: Wu M. , Li Y. , Ma S. .
Source: Annual Review Of Statistics And Its Application, 2025 Mar; 12.
EPub date: 2024-09-11.
PMID: 40881670

Bayesian Modeling of Cancer Outcomes Using Genetic Variables Assisted by Pathological Imaging Data.
Authors: Im Y. , Li R. , Ma S. .
Source: Statistics In Medicine, 2025-02-10; 44(3-4), p. e10350.
PMID: 39840672

Hierarchical Multi-Label Classification With Gene-Environment Interactions in Disease Modeling.
Authors: Li J. , Zhang Q. , Ma S. , Fang K. , Xu Y. .
Source: Statistics In Medicine, 2025-02-10; 44(3-4), p. e10330.
PMID: 39865593

Integrative rank-based regression for multi-source high-dimensional data with multi-type responses.
Authors: Xu F. , Ma S. , Zhang Q. .
Source: Journal Of Applied Statistics, 2025; 52(11), p. 2011-2030.
EPub date: 2025-01-16.
PMID: 40904949

Incorporating prior information in gene expression network-based cancer heterogeneity analysis.
Authors: Li R. , Xu S. , Li Y. , Tang Z. , Feng D. , Cai J. , Ma S. .
Source: Biostatistics (Oxford, England), 2024-12-31; 26(1).
PMID: 39074174

The spike-and-slab quantile LASSO for robust variable selection in cancer genomics studies.
Authors: Liu Y. , Ren J. , Ma S. , Wu C. .
Source: Statistics In Medicine, 2024-09-11.
EPub date: 2024-09-11.
PMID: 39260448

Estimation of multiple networks with common structures in heterogeneous subgroups.
Authors: Qin X. , Hu J. , Ma S. , Wu M. .
Source: Journal Of Multivariate Analysis, 2024 Jul; 202.
EPub date: 2024-02-13.
PMID: 38433779

Hierarchical False Discovery Rate Control for High-dimensional Survival Analysis with Interactions.
Authors: Liang W. , Zhang Q. , Ma S. .
Source: Computational Statistics & Data Analysis, 2024 Apr; 192.
EPub date: 2023-12-05.
PMID: 38098875

Information-incorporated sparse hierarchical cancer heterogeneity analysis.
Authors: Han W. , Zhang S. , Ma S. , Ren M. .
Source: Statistics In Medicine, 2024-03-30.
EPub date: 2024-03-30.
PMID: 38553996

Prediction Consistency Regularization for Learning with Noise Labels Based on Contrastive Clustering.
Authors: Sun X. , Zhang S. , Ma S. .
Source: Entropy (Basel, Switzerland), 2024-03-30; 26(4).
EPub date: 2024-03-30.
PMID: 38667864

FunctanSNP: an R package for functional analysis of dense SNP data (with interactions).
Authors: Ren R. , Fang K. , Zhang Q. , Ma S. .
Source: Bioinformatics (Oxford, England), 2023-12-01; 39(12).
PMID: 38060266

Gene-environment interaction analysis under the Cox model.
Authors: Fang K. , Li J. , Xu Y. , Ma S. , Zhang Q. .
Source: Annals Of The Institute Of Statistical Mathematics, 2023 Dec; 75(6), p. 931-948.
EPub date: 2023-04-10.
PMID: 39990259

The Bayesian Regularized Quantile Varying Coefficient Model.
Authors: Zhou F. , Ren J. , Ma S. , Wu C. .
Source: Computational Statistics & Data Analysis, 2023 Nov; 187.
EPub date: 2023-06-23.
PMID: 38746689

Locally sparse quantile estimation for a partially functional interaction model.
Authors: Liang W. , Zhang Q. , Ma S. .
Source: Computational Statistics & Data Analysis, 2023 Oct; 186.
EPub date: 2023-05-25.
PMID: 39555004

Aligned deep neural network for integrative analysis with high-dimensional input.
Authors: Zhang S. , Zhang S. , Yi H. , Ma S. .
Source: Journal Of Biomedical Informatics, 2023 Aug; 144, p. 104434.
EPub date: 2023-06-28.
PMID: 37391115

Pathological imaging-assisted cancer gene-environment interaction analysis.
Authors: Fang K. , Li J. , Zhang Q. , Xu Y. , Ma S. .
Source: Biometrics, 2023-05-03.
EPub date: 2023-05-03.
PMID: 37132273

Bi-level structured functional analysis for genome-wide association studies.
Authors: Wu M. , Wang F. , Ge Y. , Ma S. , Li Y. .
Source: Biometrics, 2023-04-26.
EPub date: 2023-04-26.
PMID: 37098961

Bayesian finite mixture of regression analysis for cancer based on histopathological imaging-environment interactions.
Authors: Im Y. , Huang Y. , Tan A. , Ma S. .
Source: Biostatistics (Oxford, England), 2023-04-14; 24(2), p. 425-442.
PMID: 37057611

Gene-environment interaction analysis via deep learning.
Authors: Wu S. , Xu Y. , Zhang Q. , Ma S. .
Source: Genetic Epidemiology, 2023 Apr; 47(3), p. 261-286.
EPub date: 2023-02-19.
PMID: 36807383

Unified model-free interaction screening via CV-entropy filter.
Authors: Xiong W. , Chen Y. , Ma S. .
Source: Computational Statistics & Data Analysis, 2023 Apr; 180.
EPub date: 2022-12-28.
PMID: 36910335

Heterogeneity analysis via integrating multi-sources high-dimensional data with applications to cancer studies.
Authors: Zhong T. , Zhang Q. , Huang J. , Wu M. , Ma S. .
Source: Statistica Sinica, 2023 Apr; 33(2), p. 729-758.
PMID: 38037567

Spatio-temporally smoothed deep survival neural network.
Authors: Li Y. , Liang D. , Ma S. , Ma C. .
Source: Journal Of Biomedical Informatics, 2023 Jan; 137, p. 104255.
EPub date: 2022-12-01.
PMID: 36462600

A General Framework for Identifying Hierarchical Interactions and Its Application to Genomics Data.
Authors: Xiao Z. , Xingjie S. , Yiming L. , Xu L. , Ma S. .
Source: Journal Of Computational And Graphical Statistics : A Joint Publication Of American Statistical Association, Institute Of Mathematical Statistics, Interface Foundation Of North America, 2023; 32(3), p. 873-883.
EPub date: 2023-02-06.
PMID: 38009111

Rank-Based Greedy Model Averaging for High-Dimensional Survival Data.
Authors: He B. , Ma S. , Zhang X. , Zhu L.X. .
Source: Journal Of The American Statistical Association, 2023; 118(544), p. 2658-2670.
EPub date: 2022-07-07.
PMID: 39552724

Two-level Bayesian interaction analysis for survival data incorporating pathway information.
Authors: Qin X. , Ma S. , Wu M. .
Source: Biometrics, 2022-12-16.
EPub date: 2022-12-16.
PMID: 36524727

A tree-based gene-environment interaction analysis with rare features.
Authors: Liu M. , Zhang Q. , Ma S. .
Source: Statistical Analysis And Data Mining, 2022 Oct; 15(5), p. 648-674.
EPub date: 2022-03-01.
PMID: 38046814

Sparse group variable selection for gene-environment interactions in the longitudinal study.
Authors: Zhou F. , Lu X. , Ren J. , Fan K. , Ma S. , Wu C. .
Source: Genetic Epidemiology, 2022-06-29.
EPub date: 2022-06-29.
PMID: 35766061

Network-based cancer heterogeneity analysis incorporating multi-view of prior information.
Authors: Li Y. , Xu S. , Ma S. , Wu M. .
Source: Bioinformatics (Oxford, England), 2022-05-13; 38(10), p. 2855-2862.
PMID: 35561185

Biclustering analysis of functionals via penalized fusion.
Authors: Fang K. , Chen Y. , Ma S. , Zhang Q. .
Source: Journal Of Multivariate Analysis, 2022 May; 189.
EPub date: 2021-10-29.
PMID: 36817965

GEInfo: an R package for gene-environment interaction analysis incorporating prior information.
Authors: Wang X. , Liu H. , Ma S. .
Source: Bioinformatics (Oxford, England), 2022-04-29.
EPub date: 2022-04-29.
PMID: 35485739

iSFun: an R package for integrative dimension reduction analysis.
Authors: Fang K. , Ren R. , Zhang Q. , Ma S. .
Source: Bioinformatics (Oxford, England), 2022-04-20.
EPub date: 2022-04-20.
PMID: 35441661

Integrative functional linear model for genome-wide association studies with multiple traits.
Authors: Li Y. , Wang F. , Wu M. , Ma S. .
Source: Biostatistics (Oxford, England), 2022-04-13; 23(2), p. 574-590.
PMID: 33040145

Robust Bayesian variable selection for gene-environment interactions.
Authors: Ren J. , Zhou F. , Li X. , Ma S. , Jiang Y. , Wu C. .
Source: Biometrics, 2022-04-08.
EPub date: 2022-04-08.
PMID: 35394058

Bayesian hierarchical finite mixture of regression for histopathological imaging-based cancer data analysis.
Authors: Im Y. , Huang Y. , Huang J. , Ma S. .
Source: Statistics In Medicine, 2022-01-13.
EPub date: 2022-01-13.
PMID: 35028949


