||5R01CA197139-03
||University of Washington
||Integrative Interpretation of the Organismal Consequences of Non-Coding Variation
DESCRIPTION (provided by applicant): Our capacity to sequence human genomes has exceeded our ability to interpret genetic variation, particularly in non-coding regions. To address this challenge, we recently developed a novel framework, Combined Annotation Dependent Depletion (CADD), for estimating the deleteriousness of any genetic variant. CADD defines an objective, data-rich, and quantitative integration of many genomic annotations into a single measure of variant effect at the organismal level. The goals of this R01 proposal are to further develop the CADD framework, to apply it in the context of ongoing genetic studies of both rare and common human diseases, and to experimentally evaluate its predictions. In Specific Aim 1, we will substantially modify CADD in both straightforward and creative ways, with the goal of dramatically improving CADD's ability to annotate non-coding variants, not only to estimate their organismal effects but also to provide insights into molecular mechanisms. In Specific Aim 2, we will apply CADD to a variety of ongoing whole genome sequencing studies of human disease, especially those in which non-coding variants are either known or suspected to be causal. As part of this effort, we will develop new statistical frameworks that directly incorporate CADD into traditional genome-wide discovery approaches. In Specific Aim 3, we will perform a combination of high-throughput (massively parallel reporter assays), medium-throughput (CRISPR/Cas9), and low-throughput (in vivo mouse transgenics) experimental assays for systematic and targeted assessment of CADD predictions. This proposal includes both computational and experimental innovations, and builds on established collaborative relationships between investigators with complementary strengths. The completion of our aims will yield novel methods, data, and resources with which to annotate whole genome sequences, broadly enabling the field to more effectively identify and mechanistically understand non-coding genetic variants that are causally relevant to human disease.