Grant Details
Grant Number: 1R21CA270428-01A1
Primary Investigator: Shen, Ronglai
Organization: Sloan-Kettering Inst Can Research
Project Title: Leveraging the Hidden Genome to Recover the Missing Heritability of Cancer
Fiscal Year: 2023
Abstract
PROJECT SUMMARY
Decades of research into the genetic epidemiology of cancer have led to much knowledge about the heritability
of the disease. However, findings have largely focused on rare variants in a relatively small number of known
cancer risk genes, with modest amounts of additional heritability captured by common variants from genome-
wide association studies. Evidence suggests that there is much missing heritability that remains to be
discovered. In recent years, advances in next-generation sequencing have widened the range of detectable
variants and large-scale whole-exome and whole-genome sequencing efforts have opened opportunities to
investigate a vast “hidden genome” comprising regions of unknown relevance to cancer. The hidden genome is
dominated by rare variants, and evaluating their individual impact on cancer risk presents a major challenge in
terms of statistical power. To address this challenge, our team has developed methodology to systematically
aggregate variants based on their context. We hypothesize that the approach can be applied to existing whole-
exome and whole-genome sequencing datasets to uncover missing heritability.
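As an illustration of the aggregation idea, the sketch below collapses individual variant calls into per-sample counts for each context bin, so that rare variants contribute to a shared feature rather than being tested one at a time. The abstract does not define "context"; the (gene, variant class) key, the input tuple layout, and the function names are all hypothetical choices made for this example, not the team's method.

```python
from collections import Counter, defaultdict


def aggregate_by_context(variants):
    """Count variants per sample within each context bin.

    `variants` is an iterable of (sample_id, gene, variant_class) tuples.
    The (gene, variant_class) context key is an assumption for illustration.
    """
    counts = defaultdict(Counter)
    for sample_id, gene, variant_class in variants:
        counts[sample_id][(gene, variant_class)] += 1
    return counts


def to_feature_matrix(counts):
    """Turn per-sample context counts into a dense samples-by-contexts matrix."""
    contexts = sorted({ctx for per_sample in counts.values() for ctx in per_sample})
    samples = sorted(counts)
    # Counter returns 0 for contexts a sample never carried.
    matrix = [[counts[s][ctx] for ctx in contexts] for s in samples]
    return samples, contexts, matrix


if __name__ == "__main__":
    # Made-up calls: two samples, placeholder gene names.
    calls = [
        ("s1", "GENE_A", "missense"),
        ("s1", "GENE_A", "missense"),
        ("s2", "GENE_B", "nonsense"),
    ]
    samples, contexts, matrix = to_feature_matrix(aggregate_by_context(calls))
    for sample, row in zip(samples, matrix):
        print(sample, dict(zip(contexts, row)))
```

The resulting rows could serve as inputs to a risk model; how the bins are weighted or selected is left open here because the abstract does not say.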
The aims of the proposed study are 1) to estimate the additional heritability that can be explained by the
hidden genome compared to known risk variants and 2) to estimate shared heritability between different
cancer types. Using whole-exome and whole-genome sequencing data from multiple sources, including the UK
Biobank, The Cancer Genome Atlas, the Pan-Cancer Analysis of Whole Genomes Consortium, and the NIH All of
Us research program, we will develop site-specific cancer risk models that summarize
information across the hidden genome, as well as benchmark models based on known risk variants. We will
assess the discriminatory accuracy of the models via the area under the receiver operating characteristic curve
and translate the areas under the curve into estimates of heritability using a previously established formula. To
quantify the extent of shared genetic susceptibility between different cancer types, we will calculate
correlations between sets of predictions from the corresponding hidden genome models. We will compare our
heritability and correlation estimates with previous findings from twin and family studies.
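The two evaluation quantities named above can be sketched as follows: the area under the ROC curve, computed as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, and the Pearson correlation between two models' predictions for the same individuals. The function names and the choice of Pearson correlation are assumptions for illustration; the abstract does not state which correlation measure is used, and the formula that converts an AUC into a heritability estimate is not specified there, so it is left out.

```python
import math


def auc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparison.

    `labels` uses 1 for cases and 0 for controls; ties count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def pearson(x, y):
    """Pearson correlation between two equal-length prediction vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up predictions from two hypothetical site-specific models.
    model_a = [0.9, 0.8, 0.3, 0.1]
    model_b = [0.7, 0.9, 0.2, 0.4]
    status = [1, 1, 0, 0]
    print("AUC (model A):", auc(model_a, status))
    print("Correlation A vs B:", pearson(model_a, model_b))
```

The pairwise AUC is quadratic in sample size; for biobank-scale data a rank-based computation would be the practical choice.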
Publications
None