Grant Details

Grant Number: 5R01CA189532-06
Primary Investigator: Hsu, Li
Organization: Fred Hutchinson Cancer Research Center
Project Title: Statistical Methods for Analysis of Tumor Heterogeneity in Genetic Epidemiology
Fiscal Year: 2020


Abstract

Cancer is a major morbidity and mortality burden throughout the world. While much progress has been made, the elimination of cancer has not yet been achieved. In the currently funded grant, we have developed statistical methods for genome-wide association analysis of cancer and studied cancer by the site of origin. However, even within a site, cancer can have distinct mutational profiles across patients. Pooling all cancer cases occurring at one site as one disease may miss important clinical and etiological insights. Recent technological advances have made it possible to characterize somatic mutations in great detail in large numbers of tumors, providing a unique opportunity to study tumor heterogeneity. The objective of this competitive renewal is to continue our statistical methods development for association analyses of tumor heterogeneity with clinical outcomes, and for studying the underlying genetic and environmental etiology. There are challenges in analyzing somatic mutation data. First, a somatic mutation may exist in only a subset of a patient's tumor cells, so-called intra-tumor heterogeneity. While our application is focused on tumor heterogeneity across patients, intra-tumor heterogeneity can also impact clinical outcomes, so important insight could be missed if it were not accounted for. The goal of Aim 1 is to develop statistical methods to account for intra-tumor heterogeneity when assessing the association of somatic mutations with clinical outcomes. Second, it is of great interest to discover germline-somatic mutation links; however, although tumor studies are considerably larger than before due to technological advances, the power for discovering such links remains limited because of moderate genetic effects and the burden of accounting for multiple comparisons when testing millions of variants. The goal of Aim 2 is to develop novel screening strategies for prioritizing genetic variants in testing genome-wide association with tumor heterogeneity. We will achieve optimal power by using the weighted hypothesis testing framework, allowing for correlated genetic variants and continuous screening statistics. Third, tumor blocks can usually be retrieved from only a subset of cases, and tumor sequencing data are thus available only for this subset. Meanwhile, extensive risk factor information has already been collected for the larger study. The goal of Aim 3 is to develop a robust and efficient approach to incorporate the summary statistics information from the larger study for characterizing the effects of genetic and environmental risk factors on the risk of developing cancer with specific tumor features. The methods will be applied to the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO, PI: Ulrike Peters; Lead Biostatistician: Li Hsu), which includes over 125,000 colorectal cancer cases and controls, all with GWAS data, and additionally sequencing data for 7,000 tumors. As our methods are also applicable to other cancer studies, we will implement them in computationally efficient and user-friendly software packages and disseminate them to the community through R/CRAN, R/Bioconductor, or GitHub.



Publications

Unveiling challenges in Mendelian randomization for gene-environment interaction.
Authors: Gorfine M., Qu C., Peters U., Hsu L.
Source: Genetic Epidemiology, 2024 Jun; 48(4), p. 164-189.
EPub date: 2024-02-29.
PMID: 38420714

Risk projection for time-to-event outcome from population-based case-control studies leveraging summary statistics from the target population.
Authors: Zheng J., Hsu L.
Source: Lifetime Data Analysis, 2024-05-28.
EPub date: 2024-05-28.
PMID: 38805095

Structured testing of genetic association with mixed clinical outcomes.
Authors: Liu M., Su Y.R., Liu Y., Hsu L., He Q.
Source: Genetic Epidemiology, 2024-04-12.
EPub date: 2024-04-12.
PMID: 38606632

Robust best linear weighted estimator with missing covariates in survival analysis.
Authors: Wang C.Y., Hsu L., Harrison T.
Source: Statistics In Medicine, 2024-02-25.
EPub date: 2024-02-25.
PMID: 38402690

A Flexible Method for Diagnostic Accuracy with Biomarker Measurement Error.
Authors: Wang C.Y., Feng Z.
Source: Mathematics (Basel, Switzerland), 2023-02-01; 11(3).
EPub date: 2023-01-19.
PMID: 37251695

Validation of a genetic-enhanced risk prediction model for colorectal cancer in a large community-based cohort.
Authors: Su Y.R., Sakoda L.C., Jeon J., Thomas M., Lin Y., Schneider J.L., Udaltsova N., Lee J.K., Lansdorp-Vogelaar I., Peterse E.F.P., et al.
Source: Cancer Epidemiology, Biomarkers & Prevention: A Publication Of The American Association For Cancer Research, Cosponsored By The American Society Of Preventive Oncology, 2023-01-09.
EPub date: 2023-01-09.
PMID: 36622766

A Generalized Integration Approach to Association Analysis with Multi-category Outcome: An Application to a Tumor Sequencing Study of Colorectal Cancer and Smoking.
Authors: Zheng J., Dong X., Newton C.C., Hsu L.
Source: Journal Of The American Statistical Association, 2023; 118(541), p. 29-42.
EPub date: 2022-09-20.
PMID: 37193510

T cell-inflamed gene expression profile is associated with favorable disease-specific survival in non-hypermutated microsatellite-stable colorectal cancer patients.
Authors: Yin H., Harrison T.A., Thomas S.S., Sather C.L., Koehne A.L., Malen R.C., Reedy A.M., Wurscher M.A., Hsu L., Phipps A.I., et al.
Source: Cancer Medicine, 2022-11-07.
EPub date: 2022-11-07.
PMID: 36341526

Associating somatic mutation with clinical outcomes through kernel regression and optimal transport.
Authors: Little P., Hsu L., Sun W.
Source: Biometrics, 2022-10-11.
EPub date: 2022-10-11.
PMID: 36217816

An empirical Bayes approach to improving population-specific genetic association estimation by leveraging cross-population data.
Authors: Hsu L., Kooperberg A., Reiner A.P., Kooperberg C.
Source: Genetic Epidemiology, 2022-09-18.
EPub date: 2022-09-18.
PMID: 36116031

Association between germline variants and somatic mutations in colorectal cancer.
Authors: Barfield R., Qu C., Steinfelder R.S., Zeng C., Harrison T.A., Brezina S., Buchanan D.D., Campbell P.T., Casey G., Gallinger S., et al.
Source: Scientific Reports, 2022-06-17; 12(1), p. 10207.
EPub date: 2022-06-17.
PMID: 35715570

Robust functional principal component analysis via a functional pairwise spatial sign operator.
Authors: Wang G., Liu S., Han F., Di C.Z.
Source: Biometrics, 2022-05-18.
EPub date: 2022-05-18.
PMID: 35583919

Statistical Inference for High-Dimensional Pathway Analysis with Multiple Responses.
Authors: Liu Y., Sun W., Hsu L., He Q.
Source: Computational Statistics & Data Analysis, 2022 May; 169.
EPub date: 2022-01-13.
PMID: 35125572

Genetic regulation of DNA methylation yields novel discoveries in GWAS of colorectal cancer.
Authors: Barfield R., Huyghe J.R., Lemire M., Dong X., Su Y.R., Brezina S., Buchanan D.D., Figueiredo J.C., Gallinger S., Giannakis M., et al.
Source: Cancer Epidemiology, Biomarkers & Prevention: A Publication Of The American Association For Cancer Research, Cosponsored By The American Society Of Preventive Oncology, 2022-03-03.
EPub date: 2022-03-03.
PMID: 35247911

Risk Projection for Time-to-event Outcome Leveraging Summary Statistics With Source Individual-level Data.
Authors: Zheng J., Zheng Y., Hsu L.
Source: Journal Of The American Statistical Association, 2022; 117(540), p. 2043-2055.
EPub date: 2021-04-22.
PMID: 36687294

Random effect based tests for multinomial logistic regression in genetic association studies.
Authors: He Q., Liu Y., Liu M., Wu M.C., Hsu L.
Source: Genetic Epidemiology, 2021-08-17.
EPub date: 2021-08-17.
PMID: 34403161

Re-calibrating pure risk integrating individual data from two-phase studies with external summary statistics.
Authors: Zheng J., Zheng Y., Hsu L.
Source: Biometrics, 2021-08-13.
EPub date: 2021-08-13.
PMID: 34390251

A covariance-enhanced approach to multi-tissue joint eQTL mapping with application to transcriptome-wide association studies.
Authors: Molstad A.J., Sun W., Hsu L.
Source: The Annals Of Applied Statistics, 2021 Jun; 15(2), p. 998-1016.
EPub date: 2021-07-12.
PMID: 34413922

A Method for Subtype Analysis with Somatic Mutations.
Authors: Liu M., Liu Y., Wu M.C., Hsu L., He Q.
Source: Bioinformatics (Oxford, England), 2021-01-08.
EPub date: 2021-01-08.
PMID: 33416828

A general framework for functionally informed set-based analysis: Application to a large-scale colorectal cancer study.
Authors: Dong X., Su Y.R., Barfield R., Bien S.A., He Q., Harrison T.A., Huyghe J.R., Keku T.O., Lindor N.M., Schafmayer C., et al.
Source: PLoS Genetics, 2020 Aug; 16(8), p. e1008947.
EPub date: 2020-08-24.
PMID: 32833970

Multinomial logistic regression with missing outcome data: An application to cancer subtypes.
Authors: Wang C.Y., Hsu L.
Source: Statistics In Medicine, 2020-07-06.
EPub date: 2020-07-06.
PMID: 32628308

Practical implementation of frailty models in Mendelian risk prediction.
Authors: Huang T., Gorfine M., Hsu L., Parmigiani G., Braun D.
Source: Genetic Epidemiology, 2020-06-07.
EPub date: 2020-06-07.
PMID: 32506746

Adjusted time-varying population attributable hazard in case-control studies.
Authors: Zhao W., Zheng J., Chen Y.Q., Hsu L.
Source: Statistical Methods In Medical Research, 2020 Jan; 29(1), p. 243-257.
EPub date: 2019-02-25.
PMID: 30799773

Mapping Tumor-Specific Expression QTLs in Impure Tumor Samples.
Authors: Wilson D.R., Ibrahim J.G., Sun W.
Source: Journal Of The American Statistical Association, 2020; 115(529), p. 79-89.
EPub date: 2019-06-04.
PMID: 32773912

ICeD-T Provides Accurate Estimates of Immune Cell Abundance in Tumor Samples by Allowing for Aberrant Gene Expression Patterns.
Authors: Wilson D.R., Jin C., Ibrahim J.G., Sun W.
Source: Journal Of The American Statistical Association, 2020; 115(531), p. 1055-1065.
EPub date: 2019-09-16.
PMID: 33012900

Space-log: a novel approach to inferring gene-gene networks using SPACE model with log penalty.
Authors: Wu Q.V., Sun W., Hsu L.
Source: F1000Research, 2020; 9, p. 1159.
EPub date: 2020-09-21.
PMID: 35083040

Learning-based biomarker-assisted rules for optimized clinical benefit under a risk constraint.
Authors: Wang Y., Zhao Y.Q., Zheng Y.
Source: Biometrics, 2019-12-13.
EPub date: 2019-12-13.
PMID: 31833561

Joint analysis of single-cell and bulk tissue sequencing data to infer intratumor heterogeneity.
Authors: Sun W., Jin C., Gelfond J.A., Chen M.H., Ibrahim J.G.
Source: Biometrics, 2019-12-07.
EPub date: 2019-12-07.
PMID: 31813161

Gaussian process regression for survival time prediction with genome-wide gene expression.
Authors: Molstad A.J., Hsu L., Sun W.
Source: Biostatistics (Oxford, England), 2019-07-11.
EPub date: 2019-07-11.
PMID: 31292609

Genetic variant predictors of gene expression provide new insight into risk of colorectal cancer.
Authors: Bien S.A., Su Y.R., Conti D.V., Harrison T.A., Qu C., Guo X., Lu Y., Albanes D., Auer P.L., Banbury B.L., et al.
Source: Human Genetics, 2019 Apr; 138(4), p. 307-326.
EPub date: 2019-02-28.
PMID: 30820706

Diagnostics of Pleiotropy in Mendelian Randomization Studies: Global and Individual Tests for Direct Effects.
Authors: Dai J.Y., Peters U., Wang X., Kocarnik J., Chang-Claude J., Slattery M.L., Chan A., Lemire M., Berndt S.I., Casey G., et al.
Source: American Journal Of Epidemiology, 2018-09-05.
EPub date: 2018-09-05.
PMID: 30188971

Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population.
Authors: Liu J., Sun W., Liu Y.
Source: Biometrics, 2018-08-06.
EPub date: 2018-08-06.
PMID: 30081434

Characterizing functional consequences of DNA copy number alterations in breast and ovarian tumors by spaceMap.
Authors: Conley C.J., Ozbek U., Wang P., Peng J.
Source: Journal Of Genetics And Genomics = Yi Chuan Xue Bao, 2018-07-20; 45(7), p. 361-371.
EPub date: 2018-07-26.
PMID: 30057342

A new method for constructing tumor specific gene co-expression networks based on samples with tumor purity heterogeneity.
Authors: Petralia F., Wang L., Peng J., Yan A., Zhu J., Wang P.
Source: Bioinformatics (Oxford, England), 2018-07-01; 34(13), p. i528-i536.
PMID: 29949994

Mendelian randomisation study of age at menarche and age at menopause and the risk of colorectal cancer.
Authors: Neumeyer S., Banbury B.L., Arndt V., Berndt S.I., Bezieau S., Bien S.A., Buchanan D.D., Butterbach K., Caan B.J., Campbell P.T., et al.
Source: British Journal Of Cancer, 2018 Jun; 118(12), p. 1639-1647.
EPub date: 2018-05-24.
PMID: 29795306

A Mixed-Effects Model for Powerful Association Tests in Integrative Functional Genomics.
Authors: Su Y.R., Di C., Bien S., Huang L., Dong X., Abecasis G., Berndt S., Bezieau S., Brenner H., Caan B., et al.
Source: American Journal Of Human Genetics, 2018-05-03; 102(5), p. 904-919.
PMID: 29727690

The association between copy number aberration, DNA methylation and gene expression in tumor samples.
Authors: Sun W., Bunn P., Jin C., Little P., Zhabotynsky V., Perou C.M., Hayes D.N., Chen M., Lin D.Y.
Source: Nucleic Acids Research, 2018-04-06; 46(6), p. 3009-3018.
PMID: 29529299

Multivariate association analysis with somatic mutation data.
Authors: He Q., Liu Y., Peters U., Hsu L.
Source: Biometrics, 2018 Mar; 74(1), p. 176-184.
EPub date: 2017-07-19.
PMID: 28722765

Joint Analysis of Strain and Parent-of-Origin Effects for Recombinant Inbred Intercrosses Generated from Multiparent Populations with the Collaborative Cross as an Example.
Authors: Liu Y., Xiong S., Sun W., Zou F.
Source: G3 (Bethesda, Md.), 2018-02-02; 8(2), p. 599-605.
EPub date: 2018-02-02.
PMID: 29255115

On Estimation of the Hazard Function from Population-based Case-Control Studies.
Authors: Hsu L., Gorfine M., Zucker D.M.
Source: Journal Of The American Statistical Association, 2018; 113(522), p. 560-570.
EPub date: 2018-06-12.
PMID: 30906082

Incorporation of Biological Knowledge Into the Study of Gene-Environment Interactions.
Authors: Ritchie M.D., Davis J.R., Aschard H., Battle A., Conti D., Du M., Eskin E., Fallin M.D., Hsu L., Kraft P., et al.
Source: American Journal Of Epidemiology, 2017-10-01; 186(7), p. 771-777.
PMID: 28978191

Update on the State of the Science for Analytical Methods for Gene-Environment Interactions.
Authors: Gauderman W.J., Mukherjee B., Aschard H., Hsu L., Lewinger J.P., Patel C.J., Witte J.S., Amos C., Tai C.G., Conti D., et al.
Source: American Journal Of Epidemiology, 2017-10-01; 186(7), p. 762-770.
PMID: 28978192

Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases.
Authors: McAllister K., Mechanic L.E., Amos C., Aschard H., Blair I.A., Chatterjee N., Conti D., Gauderman W.J., Hsu L., Hutter C.M., et al.
Source: American Journal Of Epidemiology, 2017-10-01; 186(7), p. 753-761.
PMID: 28978193

A new method to study the change of miRNA-mRNA interactions due to environmental exposures.
Authors: Petralia F., Aushev V.N., Gopalakrishnan K., Kappil M., W Khin N., Chen J., Teitelbaum S.L., Wang P.
Source: Bioinformatics (Oxford, England), 2017-07-15; 33(14), p. i199-i207.
PMID: 28881990

Quantifying the genetic correlation between multiple cancer types.
Authors: Lindström S., Finucane H., Bulik-Sullivan B., Schumacher F.R., Amos C.I., Hung R.J., Rand K., Gruber S.B., Conti D., Permuth J.B., et al.
Source: Cancer Epidemiology, Biomarkers & Prevention: A Publication Of The American Association For Cancer Research, Cosponsored By The American Society Of Preventive Oncology, 2017-06-21.
EPub date: 2017-06-21.
PMID: 28637796

Hypothesis testing in functional linear models.
Authors: Su Y.R., Di C.Z., Hsu L.
Source: Biometrics, 2017-03-10.
EPub date: 2017-03-10.
PMID: 28295175

On estimation of time-dependent attributable fraction from population-based case-control studies.
Authors: Zhao W., Chen Y.Q., Hsu L.
Source: Biometrics, 2017-01-18.
PMID: 28099992

Heritability Estimation using a Regularized Regression Approach (HERRA): Applicable to continuous, dichotomous or age-at-onset outcome.
Authors: Gorfine M., Berndt S.I., Chang-Claude J., Hoffmeister M., Le Marchand L., Potter J., Slattery M.L., Keret N., Peters U., Hsu L.
Source: PLoS One, 2017; 12(8), p. e0181269.
EPub date: 2017-08-16.
PMID: 28813438

Enrichment of colorectal cancer associations in functional regions: Insight for using epigenomics data in the analysis of whole genome sequence-imputed GWAS data.
Authors: Bien S.A., Auer P.L., Harrison T.A., Qu C., Connolly C.M., Greenside P.G., Chen S., Berndt S.I., Bézieau S., Kang H.M., et al.
Source: PLoS One, 2017; 12(11), p. e0186518.
EPub date: 2017-11-21.
PMID: 29161273

A unified powerful set-based test for sequencing data analysis of GxE interactions.
Authors: Su Y.R., Di C.Z., Hsu L., Genetics and Epidemiology of Colorectal Cancer Consortium.
Source: Biostatistics (Oxford, England), 2016-07-28.
PMID: 27474101


