Grant Details

Grant Number: 5R01CA204120-07
Primary Investigator: Ma, Shuangge
Organization: Yale University
Project Title: Novel Methods for Identifying Genetic Interactions for Cancer Prognosis
Fiscal Year: 2023


Project Summary:

For the prognosis of melanoma, lung cancer, and many other cancers, G-E (gene-environment) interactions have important implications. Through a series of studies, our group has taken a unique robustness perspective and a leading role in developing the foundation of G-E interaction analysis using cutting-edge high-dimensional and regularized statistics. Recently, our group pioneered I-E (histopathological imaging-environment) interaction analysis and significantly expanded the scope of cancer analytics. We have made important discoveries for NHL, melanoma, and lung cancer, impactfully advancing their translational research and clinical practice. Our overarching goal is to construct more powerful prognosis models and more accurately identify G-E/I-E interactions so as to truthfully describe cancer biology and informatively guide clinical decision-making.

In this project, we will be the first to develop paradigm-shifting SDL (statistically principled deep learning) techniques tailored to G-E/I-E interaction analysis for cancer prognosis. The proposed methods will inherit strengths from the existing deep learning and regression techniques and be superior to both. We will continue analyzing data on melanoma and lung cancer, further enhancing the high translational and clinical impact of our study. We will:

(Aim 1) Develop foundational SDL techniques tailored to G-E/I-E interaction analysis. We will first develop “benchmark” nonrobust losses and then innovatively advance to losses that are robust to model mis-specification and long-tailed distribution/contamination. A novel penalization technique will be applied for architecture construction, which will accommodate the unique characteristics of the main G/I effects, main E effects, and their interactions in a customized manner, screen out noises, and respect the “main effects, interactions” hierarchy.

(Aim 2) Boost performance by incorporating additional information. We will cost-effectively improve SDL performance by incorporating additional information on (a) the interconnections between prognosis and G-E/I-E interactions as well as main G/I effects, and (b) the interconnections among G/I variables.

(Aim 3) Expand analysis scope and integrate multiple types of G/I measurements. Motivated by their overlapping but also independent information for prognosis, we will develop novel SDL methods and be the first to integrate multiple types of molecular and imaging measurements in interaction analysis.

(Aim 4) Analyze the Yale SPORE and TCGA data on melanoma and lung cancer. Analysis will be conducted on multiple prognosis outcomes. Demographic/clinical/environmental risk factors, multiple types of molecular measurements (protein, gene expression, mutation, methylation, and microRNA), and histopathological imaging features will be analyzed. The analysis results will be thoroughly and rigorously evaluated, extensively compared to those using alternatives, and validated in multiple ways.