Grant Details

Grant Number: 1R21CA270428-01A1
Primary Investigator: Shen, Ronglai
Organization: Sloan-Kettering Inst Can Research
Project Title: Leveraging the Hidden Genome to Recover the Missing Heritability of Cancer
Fiscal Year: 2023


PROJECT SUMMARY
Decades of research into the genetic epidemiology of cancer have led to much knowledge about the heritability of the disease. However, findings have largely focused on rare variants in a relatively small number of known cancer risk genes, with modest amounts of additional heritability captured by common variants from genome-wide association studies. Evidence suggests that there is much missing heritability that remains to be discovered. In recent years, advances in next-generation sequencing have widened the range of detectable variants, and large-scale whole-exome and whole-genome sequencing efforts have opened opportunities to investigate a vast “hidden genome” comprising regions of unknown relevance to cancer. The hidden genome is dominated by rare variants, and evaluating their individual impact on cancer risk presents a major challenge in terms of statistical power. To address this challenge, our team has developed methodology to systematically aggregate variants based on their context. We hypothesize that the approach can be applied to existing whole-exome and whole-genome sequencing datasets to uncover missing heritability. The aims of the proposed study are 1) to estimate the additional heritability that can be explained by the hidden genome compared to known risk variants and 2) to estimate shared heritability between different cancer types. Using whole-exome and whole-genome sequencing data from multiple sources, including the UK Biobank, The Cancer Genome Atlas, the Pan-Cancer Analysis of Whole Genomes Consortium, and the NIH All of Us research program, we will develop site-specific cancer risk models that summarize information across the hidden genome, as well as benchmark models based on known risk variants.
We will assess the discriminatory accuracy of the models via the area under the receiver operating characteristic curve and translate the areas under the curve into estimates of heritability using a previously established formula. To quantify the extent of shared genetic susceptibility between different cancer types, we will calculate correlations between sets of predictions from the corresponding hidden genome models. We will compare our heritability and correlation estimates with previous findings from twin and family studies.
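The two evaluation steps described above (discriminatory accuracy via the area under the ROC curve, and correlation between sets of predictions from different cancer-type models) can be illustrated with a minimal sketch. This is not the investigators' code: the function names `auc` and `pearson` and the toy scores are hypothetical, and the "previously established formula" for converting an AUC into a heritability estimate is not given in the summary, so that step is deliberately left out.

```python
from math import sqrt


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen case (label 1) scores higher than a randomly chosen
    control (label 0), counting ties as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def pearson(x, y):
    """Pearson correlation between two equal-length prediction vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Toy data (invented for illustration): predicted risks from two
# hypothetical site-specific models scored on the same four individuals.
risk_model_a = [0.9, 0.8, 0.3, 0.1]
risk_model_b = [0.7, 0.9, 0.2, 0.4]
has_cancer_a = [1, 1, 0, 0]

print("AUC, model A:", auc(risk_model_a, has_cancer_a))
print("Correlation A vs B:", pearson(risk_model_a, risk_model_b))
```

In this sketch the correlation is computed across individuals scored by both models, which is one plausible reading of "correlations between sets of predictions"; the study itself may define the comparison differently.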


Identifying somatic fingerprints of cancers defined by germline and environmental risk factors.
Authors: Chakraborty S., Guan Z., Kostrzewa C.E., Shen R., Begg C.B.
Source: Genetic Epidemiology, 2024-04-30.
EPub date: 2024-04-30.
PMID: 38686586

Topical hidden genome: discovering latent cancer mutational topics using a Bayesian multilevel context-learning approach.
Authors: Chakraborty S., Guan Z., Begg C.B., Shen R.
Source: Biometrics, 2024-03-27; 80(2).
PMID: 38682463
