Grant Details
Grant Number: 5R01CA204120-08
Primary Investigator: Ma, Shuangge
Organization: Yale University
Project Title: Novel Methods for Identifying Genetic Interactions for Cancer Prognosis
Fiscal Year: 2024
Abstract
Project Summary
For the prognosis of melanoma, lung cancer, and many other cancers, G-E (gene-environment) interactions
have important implications. Through a series of studies, our group has taken a unique robustness perspective
and a leading role in developing the foundation of G-E interaction analysis using cutting-edge high-dimensional
and regularized statistics. Recently, our group pioneered I-E (histopathological imaging-environment) interaction
analysis and significantly expanded the scope of cancer analytics. We have made important discoveries for NHL,
melanoma, and lung cancer, impactfully advancing their translational research and clinical practice.
Our overarching goal is to construct more powerful prognosis models and more accurately identify G-E/I-E
interactions so as to truthfully describe cancer biology and informatively guide clinical decision-making. In this
project, we will be the first to develop paradigm-shifting SDL (statistically principled deep learning) techniques
tailored to G-E/I-E interaction analysis for cancer prognosis. The proposed methods will inherit strengths from
the existing deep learning and regression techniques and be superior to both. We will continue analyzing data
on melanoma and lung cancer, further enhancing the high translational and clinical impact of our study.
We will: (Aim 1) Develop foundational SDL techniques tailored to G-E/I-E interaction analysis. We will
first develop “benchmark” nonrobust losses and then innovatively advance to losses that are robust to model
mis-specification and long-tailed distribution/contamination. A novel penalization technique will be applied for
architecture construction, which will accommodate the unique characteristics of the main G/I effects, main E
effects, and their interactions in a customized manner, screen out noise, and respect the “main effects,
interactions” hierarchy. (Aim 2) Boost performance by incorporating additional information. We will
cost-effectively improve SDL performance by incorporating additional information on (a) the interconnections between
prognosis and G-E/I-E interactions as well as main G/I effects, and (b) the interconnections among G/I variables.
(Aim 3) Expand analysis scope and integrate multiple types of G/I measurements. Motivated by their overlapping
but also independent information for prognosis, we will develop novel SDL methods and be the first to integrate
multiple types of molecular and imaging measurements in interaction analysis. (Aim 4) Analyze the Yale SPORE
and TCGA data on melanoma and lung cancer. Analysis will be conducted on multiple prognosis outcomes.
Demographic/clinical/environmental risk factors, multiple types of molecular measurements (protein, gene
expression, mutation, methylation, and microRNA), and histopathological imaging features will be analyzed. The
analysis results will be thoroughly and rigorously evaluated, extensively compared to those using alternatives,
and validated in multiple ways.
Publications
Robust Heterogeneity Adjustment for Gaussian Graphical Model With Latent Variables.
Authors: Li L., Li R., Ma S., Zhang Q.
Source: Statistics In Medicine, 2026 May; 45(10-12), p. e70571.
PMID: 42053355
Integrating Omics and Pathological Imaging Data for Cancer Prognosis via a Deep Neural Network-Based Cox Model.
Authors: Li J., Ma S.
Source: Statistics In Medicine, 2026 Feb; 45(3-5), p. e70435.
PMID: 41641685
DNN-based semiparametric AFT model for integrating genomic and pathological imaging data in cancer prognosis.
Authors: Li J., Zhang Q., Ma S.
Source: Biometrics, 2026-01-06; 82(1).
PMID: 41837305
Hierarchical structure-guided high-dimensional multi-view clustering.
Authors: Jiang J., Fang K., Ma S., Zhang Q.
Source: Journal Of Multivariate Analysis, 2026 Jan; 211.
EPub date: 2025-09-25.
PMID: 41551979
Robust sparse Bayesian regression for longitudinal gene-environment interactions.
Authors: Fan K., Jiang Y., Ma S., Wang W., Wu C.
Source: Journal Of The Royal Statistical Society. Series C, Applied Statistics, 2025 Dec; 74(5), p. 1372-1394.
EPub date: 2025-04-08.
PMID: 41245172
Joint Identification of Spatially Variable Genes via a Network-Assisted Bayesian Regularization Approach.
Authors: Wu M., Li Y., Ma S., Wu M.
Source: The Annals Of Applied Statistics, 2025 Dec; 19(4), p. 2705-2723.
EPub date: 2025-12-05.
PMID: 42136619
Ordinal Sparse Neural Networks for Modeling Gene- and Imaging-Environment Interactions.
Authors: Xue J., Xu Y., Li J., Ma S., Fang K.
Source: Statistics In Medicine, 2025 Oct; 44(23-24), p. e70302.
PMID: 41105049
Subgroup Analysis of Differential Networks with Latent Variables.
Authors: Li L., Ma S., Zhang Q.
Source: Statistics And Computing, 2025 Oct; 35(5).
EPub date: 2025-07-02.
PMID: 42130826
Network-Based Modeling of Emotional Expressions for Multiple Cancers via a Linguistic Analysis of an Online Health Community.
Authors: Fan X., Liu M., Ma S.
Source: The Annals Of Applied Statistics, 2025 Sep; 19(3), p. 2218-2236.
EPub date: 2025-08-28.
PMID: 41104371
Local Clustering for Functional Data.
Authors: Chen Y., Zhang Q., Ma S.
Source: Journal Of Computational And Graphical Statistics : A Joint Publication Of American Statistical Association, Institute Of Mathematical Statistics, Interface Foundation Of North America, 2025 Sep; 34(3), p. 1075-1090.
EPub date: 2025-02-10.
PMID: 41200425
Joint modeling of mixed outcomes using a rank-based sparse neural network.
Authors: Xue J., Xu Y., Li J., Ma S., Fang K.
Source: Journal Of Biomedical Informatics, 2025-07-05; 169, p. 104870.
EPub date: 2025-07-05.
PMID: 40623577
Robust Transfer Learning for High-Dimensional GLM Using γ-Divergence With Applications to Cancer Genomics.
Authors: Xu F., Ma S., Zhang Q., Xu Y.
Source: Statistics In Medicine, 2025 Jul; 44(15-17), p. e70170.
PMID: 40662636
Subgroup Testing in the Change-Plane Cox Model.
Authors: Zhang X., Ren P., Shi X., Ma S., Liu X.
Source: Statistics In Medicine, 2025 Jul; 44(15-17), p. e70179.
PMID: 40662752
High-Dimensional Gene-Environment Interaction Analysis.
Authors: Wu M., Li Y., Ma S.
Source: Annual Review Of Statistics And Its Application, 2025 Mar; 12.
EPub date: 2024-09-11.
PMID: 40881670
Bayesian Modeling of Cancer Outcomes Using Genetic Variables Assisted by Pathological Imaging Data.
Authors: Im Y., Li R., Ma S.
Source: Statistics In Medicine, 2025-02-10; 44(3-4), p. e10350.
PMID: 39840672
Hierarchical Multi-Label Classification With Gene-Environment Interactions in Disease Modeling.
Authors: Li J., Zhang Q., Ma S., Fang K., Xu Y.
Source: Statistics In Medicine, 2025-02-10; 44(3-4), p. e10330.
PMID: 39865593
Integrative rank-based regression for multi-source high-dimensional data with multi-type responses.
Authors: Xu F., Ma S., Zhang Q.
Source: Journal Of Applied Statistics, 2025; 52(11), p. 2011-2030.
EPub date: 2025-01-16.
PMID: 40904949
Incorporating prior information in gene expression network-based cancer heterogeneity analysis.
Authors: Li R., Xu S., Li Y., Tang Z., Feng D., Cai J., Ma S.
Source: Biostatistics (Oxford, England), 2024-12-31; 26(1).
PMID: 39074174
The spike-and-slab quantile LASSO for robust variable selection in cancer genomics studies.
Authors: Liu Y., Ren J., Ma S., Wu C.
Source: Statistics In Medicine, 2024-09-11.
EPub date: 2024-09-11.
PMID: 39260448
Estimation of multiple networks with common structures in heterogeneous subgroups.
Authors: Qin X., Hu J., Ma S., Wu M.
Source: Journal Of Multivariate Analysis, 2024 Jul; 202.
EPub date: 2024-02-13.
PMID: 38433779
Hierarchical False Discovery Rate Control for High-dimensional Survival Analysis with Interactions.
Authors: Liang W., Zhang Q., Ma S.
Source: Computational Statistics & Data Analysis, 2024 Apr; 192.
EPub date: 2023-12-05.
PMID: 38098875
Information-incorporated sparse hierarchical cancer heterogeneity analysis.
Authors: Han W., Zhang S., Ma S., Ren M.
Source: Statistics In Medicine, 2024-03-30.
EPub date: 2024-03-30.
PMID: 38553996
Prediction Consistency Regularization for Learning with Noise Labels Based on Contrastive Clustering.
Authors: Sun X., Zhang S., Ma S.
Source: Entropy (Basel, Switzerland), 2024-03-30; 26(4).
EPub date: 2024-03-30.
PMID: 38667864
FunctanSNP: an R package for functional analysis of dense SNP data (with interactions).
Authors: Ren R., Fang K., Zhang Q., Ma S.
Source: Bioinformatics (Oxford, England), 2023-12-01; 39(12).
PMID: 38060266
Gene-environment interaction analysis under the Cox model.
Authors: Fang K., Li J., Xu Y., Ma S., Zhang Q.
Source: Annals Of The Institute Of Statistical Mathematics, 2023 Dec; 75(6), p. 931-948.
EPub date: 2023-04-10.
PMID: 39990259
The Bayesian Regularized Quantile Varying Coefficient Model.
Authors: Zhou F., Ren J., Ma S., Wu C.
Source: Computational Statistics & Data Analysis, 2023 Nov; 187.
EPub date: 2023-06-23.
PMID: 38746689
Locally sparse quantile estimation for a partially functional interaction model.
Authors: Liang W., Zhang Q., Ma S.
Source: Computational Statistics & Data Analysis, 2023 Oct; 186.
EPub date: 2023-05-25.
PMID: 39555004
Aligned deep neural network for integrative analysis with high-dimensional input.
Authors: Zhang S., Zhang S., Yi H., Ma S.
Source: Journal Of Biomedical Informatics, 2023 Aug; 144, p. 104434.
EPub date: 2023-06-28.
PMID: 37391115
Pathological imaging-assisted cancer gene-environment interaction analysis.
Authors: Fang K., Li J., Zhang Q., Xu Y., Ma S.
Source: Biometrics, 2023-05-03.
EPub date: 2023-05-03.
PMID: 37132273
Bi-level structured functional analysis for genome-wide association studies.
Authors: Wu M., Wang F., Ge Y., Ma S., Li Y.
Source: Biometrics, 2023-04-26.
EPub date: 2023-04-26.
PMID: 37098961
Bayesian finite mixture of regression analysis for cancer based on histopathological imaging-environment interactions.
Authors: Im Y., Huang Y., Tan A., Ma S.
Source: Biostatistics (Oxford, England), 2023-04-14; 24(2), p. 425-442.
PMID: 37057611
Gene-environment interaction analysis via deep learning.
Authors: Wu S., Xu Y., Zhang Q., Ma S.
Source: Genetic Epidemiology, 2023 Apr; 47(3), p. 261-286.
EPub date: 2023-02-19.
PMID: 36807383
Unified model-free interaction screening via CV-entropy filter.
Authors: Xiong W., Chen Y., Ma S.
Source: Computational Statistics & Data Analysis, 2023 Apr; 180.
EPub date: 2022-12-28.
PMID: 36910335
Heterogeneity Analysis via Integrating Multi-Sources High-Dimensional Data With Applications to Cancer Studies.
Authors: Zhong T., Zhang Q., Huang J., Wu M., Ma S.
Source: Statistica Sinica, 2023 Apr; 33(2), p. 729-758.
PMID: 38037567
Spatio-temporally smoothed deep survival neural network.
Authors: Li Y., Liang D., Ma S., Ma C.
Source: Journal Of Biomedical Informatics, 2023 Jan; 137, p. 104255.
EPub date: 2022-12-01.
PMID: 36462600
A General Framework for Identifying Hierarchical Interactions and Its Application to Genomics Data.
Authors: Xiao Z., Xingjie S., Yiming L., Xu L., Ma S.
Source: Journal Of Computational And Graphical Statistics : A Joint Publication Of American Statistical Association, Institute Of Mathematical Statistics, Interface Foundation Of North America, 2023; 32(3), p. 873-883.
EPub date: 2023-02-06.
PMID: 38009111
Rank-Based Greedy Model Averaging for High-Dimensional Survival Data.
Authors: He B., Ma S., Zhang X., Zhu L.X.
Source: Journal Of The American Statistical Association, 2023; 118(544), p. 2658-2670.
EPub date: 2022-07-07.
PMID: 39552724
Two-level Bayesian interaction analysis for survival data incorporating pathway information.
Authors: Qin X., Ma S., Wu M.
Source: Biometrics, 2022-12-16.
EPub date: 2022-12-16.
PMID: 36524727
A tree-based gene-environment interaction analysis with rare features.
Authors: Liu M., Zhang Q., Ma S.
Source: Statistical Analysis And Data Mining, 2022 Oct; 15(5), p. 648-674.
EPub date: 2022-03-01.
PMID: 38046814
Sparse group variable selection for gene-environment interactions in the longitudinal study.
Authors: Zhou F., Lu X., Ren J., Fan K., Ma S., Wu C.
Source: Genetic Epidemiology, 2022-06-29.
EPub date: 2022-06-29.
PMID: 35766061
Network-based cancer heterogeneity analysis incorporating multi-view of prior information.
Authors: Li Y., Xu S., Ma S., Wu M.
Source: Bioinformatics (Oxford, England), 2022-05-13; 38(10), p. 2855-2862.
PMID: 35561185
Biclustering analysis of functionals via penalized fusion.
Authors: Fang K., Chen Y., Ma S., Zhang Q.
Source: Journal Of Multivariate Analysis, 2022 May; 189.
EPub date: 2021-10-29.
PMID: 36817965
GEInfo: an R package for gene-environment interaction analysis incorporating prior information.
Authors: Wang X., Liu H., Ma S.
Source: Bioinformatics (Oxford, England), 2022-04-29.
EPub date: 2022-04-29.
PMID: 35485739
iSFun: an R package for integrative dimension reduction analysis.
Authors: Fang K., Ren R., Zhang Q., Ma S.
Source: Bioinformatics (Oxford, England), 2022-04-20.
EPub date: 2022-04-20.
PMID: 35441661
Integrative functional linear model for genome-wide association studies with multiple traits.
Authors: Li Y., Wang F., Wu M., Ma S.
Source: Biostatistics (Oxford, England), 2022-04-13; 23(2), p. 574-590.
PMID: 33040145
Robust Bayesian variable selection for gene-environment interactions.
Authors: Ren J., Zhou F., Li X., Ma S., Jiang Y., Wu C.
Source: Biometrics, 2022-04-08.
EPub date: 2022-04-08.
PMID: 35394058
Bayesian hierarchical finite mixture of regression for histopathological imaging-based cancer data analysis.
Authors: Im Y., Huang Y., Huang J., Ma S.
Source: Statistics In Medicine, 2022-01-13.
EPub date: 2022-01-13.
PMID: 35028949