Grant Details
Grant Number: 5P01CA196569-09
Primary Investigator: Gauderman, William
Organization: University of Southern California
Project Title: Statistical Methods for Integrative Genomics in Cancer
Fiscal Year: 2024
Abstract
OVERALL ABSTRACT
The overall goal of this Program Project is to develop novel statistical methods for integrating
multi-omic data to address etiology, prognosis, and treatment of cancer through a collaboration
of four closely related projects and four shared cores (see inset). The four projects can be
broadly described as spanning the spectrum of analysis challenges including feature selection,
mediation, interaction, and characterization. The first of these, “High-Dimensional Regression
for Data Integration,” develops new strategies for the analysis of longitudinal -omic data
incorporating external functional information, maintaining a rigorous inferential foundation.
The second project, “Integration of Omic Data to Estimate Mediation or Latent Structures,”
develops novel latent factor
and mediation models using high-dimensional omic data or GWAS summary statistics to identify
and distinguish genotype, exposure and omic effects. The third project, “Integration of Omic
Data in the Analysis of Gene x Environment Interaction,” incorporates gene expression and
other -omics data into powerful multi-step approaches to scan for interactions leveraging
exposure or disease marginal associations. Project 3 will also add novel approaches to identify
transcriptional interactions, hierarchical GxE models with heredity constraints (i.e., requiring
interactions to include the corresponding main effects), and extensions to longitudinal, survival,
and quantitative traits. The fourth project, “Statistical Methods for Genome Characterization,”
automates annotation of gene function using phylogenetic inference to identify new cancer-
specific regions of conserved DNA methylation. Project 4 also proposes a novel approach for
agnostic pathway gene set enrichment analysis. These projects will be supported by four cores:
Administrative Core (A), Functional Annotation Core (B), Computation and Software
Development Core (C), and Data Analysis and Research Translation Core (D). Core B will
maintain up-to-date copies of key bioinformatics resources and will develop a software
application that will provide a single unified portal for creating annotation files that integrates
data from multiple resources. Core C will assist with high-volume computing needs and will
develop user-friendly software packages that implement novel methods. Core D will focus on
translation of new methods, both by supporting applications to real cancer datasets and by
developing materials for training outside investigators in the use of our methods and software.
Our proposed work will have both methodological and substantive importance. Methodologically,
we will develop novel statistical methods that will be applicable to a wide range of cancer
epidemiology studies and clinical trials. These methods will, for example, allow more powerful
discovery of genetic associations and interactions by leveraging biological information from
other sources. Substantively, they will have translational significance in the areas of risk
prediction and targeted interventions. Our program is designed to be highly integrative, with the various
projects and cores being inter-related, so that together they will be more informative than any of
them could be on their own. Program members have access to extraordinary data resources at
USC and elsewhere, assuring that the methods we develop will be motivated by, and applicable
to, important questions arising in current cancer research.
Publications