Grant Details

Grant Number: 5P01CA134294-05
Primary Investigator: Lin, Xihong
Organization: Harvard University (Sch Of Public Hlth)
Project Title: Statistical Informatics for Cancer Research
Fiscal Year: 2012


Abstract

DESCRIPTION (provided by applicant): We propose a Program Project, Statistical Informatics in Cancer Research, to tackle a series of problems motivated by the analysis of high-dimensional data arising in population-based studies of cancer. This Program Project comprises three research projects and two cores. Project 1 focuses on spatio-temporal modeling of disease count data collected for administrative areas. The specific aims are motivated by problems encountered in epidemiological studies designed to monitor and assess health disparities. Our proposed methods address issues associated with administrative boundaries changing over time, sparse disease counts, spatial confounding, and heavy computational burdens for large data sets. Methods will be applied to data on U.S. breast cancer incidence from three state cancer registries, Boston-area premature mortality, and NCI SEER data. Project 2 is also motivated by spatially indexed data related to cancer incidence and mortality, but the emphasis is on population surveillance and spatial cluster detection. Three of the specific aims of Project 2 are motivated by the analysis of NCI SEER data and one by a case-control study designed to assess spatial clustering in childhood leukemia. The case-control dataset also includes individual-level data on several genetic biomarkers of susceptibility. One sub-aim of this project assesses gene-space interaction by studying whether disease clustering patterns differ according to genetic polymorphisms. Project 3 focuses on methods for the analysis of very high-dimensional genomic and proteomic biomarkers. Extensions to spatially indexed genomic data are also considered in Project 3. All of the aims of the three projects are closely integrated with the motivating real-world cancer studies in which the investigators are involved.
The three projects link thematically through a focus on population-based, observational studies in cancer, as well as technically through the consideration of high-dimensional correlated data (arising from different sources) that require advanced statistical and computing methods. Several specific techniques (e.g., spatio-temporal modeling, penalized likelihoods, false discovery rates, hidden Markov models) are shared between two and in some cases all three projects. The two cores consist of an Administrative Core and a Statistical Computing Core. The Administrative Core will coordinate the overall scientific direction and programmatic activities of the Program, which will include short courses, a visitor program, dissemination of research results, and an external advisory committee. The Statistical Computing Core will ensure the development and dissemination of open-access, good-quality, user-friendly software designed to implement the statistical methods developed in the Research Projects; developing this software is the final Specific Aim of each of the three projects. The Program Director and Co-Director, Professors Louise Ryan and Xihong Lin, respectively, are internationally known biostatisticians with strong track records of academic administration.



Publications

Joint analysis of SNP and gene expression data in genetic association studies of complex diseases.
Authors: Huang YT, Vanderweele TJ, Lin X
Source: Ann Appl Stat, 2014 Mar 1;8(1), p. 352-376.
PMID: 24729824


Uncertainty in Propensity Score Estimation: Bayesian Methods for Variable Selection and Model Averaged Causal Effects.
Authors: Zigler CM, Dominici F
Source: J Am Stat Assoc, 2014 Jan 1;109(505), p. 95-107.
PMID: 24696528


National trends in pancreatic cancer outcomes and pattern of care among Medicare beneficiaries, 2000 through 2010.
Authors: Wang Y, Schrag D, Brooks GA, Dominici F
Source: Cancer, 2014 Apr 1;120(7), p. 1050-8.
EPub date: 2013 Dec 30.
PMID: 24382787


GEE-based SNP set association test for continuous and discrete traits in family-based association studies.
Authors: Wang X, Lee S, Zhu X, Redline S, Lin X
Source: Genet Epidemiol, 2013 Dec;37(8), p. 778-86.
EPub date: 2013 Oct 25.
PMID: 24166731


Gene set analysis using variance component tests.
Authors: Huang YT, Lin X
Source: BMC Bioinformatics, 2013 Jun 28;14, p. 210.
EPub date: 2013 Jun 28.
PMID: 23806107


Consistent Group Identification and Variable Selection in Regression with Correlated Predictors.
Authors: Sharma DB, Bondell HD, Zhang HH
Source: J Comput Graph Stat, 2013 Apr 1;22(2), p. 319-340.
PMID: 23772171


General framework for meta-analysis of rare variants in sequencing association studies.
Authors: Lee S, Teslovich TM, Boehnke M, Lin X
Source: Am J Hum Genet, 2013 Jul 11;93(1), p. 42-53.
EPub date: 2013 Jun 13.
PMID: 23768515


Cross-ratio estimation for bivariate failure times with left truncation.
Authors: Hu T, Lin X, Nan B
Source: Lifetime Data Anal, 2014 Jan;20(1), p. 23-37.
EPub date: 2013 May 23.
PMID: 23700275


Sequence kernel association tests for the combined effect of rare and common variants.
Authors: Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X
Source: Am J Hum Genet, 2013 Jun 6;92(6), p. 841-53.
EPub date: 2013 May 16.
PMID: 23684009


Genome-wide association analysis for multiple continuous secondary phenotypes.
Authors: Schifano ED, Li L, Christiani DC, Lin X
Source: Am J Hum Genet, 2013 May 2;92(5), p. 744-59.
PMID: 23643383


Parallelism, uniqueness, and large-sample asymptotics for the Dantzig selector.
Authors: Dicker L, Lin X
Source: Can J Stat, 2013 Mar 1;41(1), p. 23-35.
PMID: 23589664


Exposure to airborne particulate matter is associated with methylation pattern in the asthma pathway.
Authors: Sofer T, Baccarelli A, Cantone L, Coull B, Maity A, Lin X, Schwartz J
Source: Epigenomics, 2013 Apr;5(2), p. 147-54.
PMID: 23566092


Comparative effectiveness of three platinum-doublet chemotherapy regimens in elderly patients with advanced non-small cell lung cancer.
Authors: Zhu J, Sharma DB, Chen AB, Johnson BE, Weeks JC, Schrag D
Source: Cancer, 2013 Jun 1;119(11), p. 2048-60.
EPub date: 2013 Apr 5.
PMID: 23564469


Variable selection and estimation in generalized linear models with the seamless L0 penalty.
Authors: Li Z, Wang S, Lin X
Source: Can J Stat, 2012 Dec;40(4), p. 745-769.
PMID: 23519603


Test for interactions between a genetic marker set and environment in generalized linear models.
Authors: Lin X, Lee S, Christiani DC, Lin X
Source: Biostatistics, 2013 Sep;14(4), p. 667-81.
EPub date: 2013 Mar 5.
PMID: 23462021


Family-based association tests for sequence data, and comparisons with population-based association tests.
Authors: Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X
Source: Eur J Hum Genet, 2013 Oct;21(10), p. 1158-62.
EPub date: 2013 Feb 6.
PMID: 23386037


Model feedback in Bayesian propensity score estimation.
Authors: Zigler CM, Watts K, Yeh RW, Wang Y, Coull BA, Dominici F
Source: Biometrics, 2013 Mar;69(1), p. 263-73.
EPub date: 2013 Feb 4.
PMID: 23379793


Multivariate Gene Selection and Testing in Studying the Exposure Effects on a Gene Set.
Authors: Sofer T, Maity A, Coull B, Baccarelli A, Schwartz J, Lin X
Source: Stat Biosci, 2012 Nov 1;4(2), p. 319-338.
PMID: 23264831


Design and analysis issues in gene and environment studies.
Authors: Liu CY, Maity A, Lin X, Wright RO, Christiani DC
Source: Environ Health, 2012 Dec 19;11, p. 93.
EPub date: 2012 Dec 19.
PMID: 23253229


Detecting rare variant effects using extreme phenotype sampling in sequencing association studies.
Authors: Barnett IJ, Lee S, Lin X
Source: Genet Epidemiol, 2013 Feb;37(2), p. 142-51.
EPub date: 2012 Nov 26.
PMID: 23184518


SNP Set Association Analysis for Familial Data.
Authors: Schifano ED, Epstein MP, Bielak LF, Jhun MA, Kardia SL, Peyser PA, Lin X
Source: Genet Epidemiol, 2012 Sep 11.
EPub date: 2012 Sep 11.
PMID: 22968922


Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies.
Authors: Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, NHLBI GO Exome Sequencing Project-ESP Lung Project Team, Christiani DC, Wurfel MM, Lin X
Source: Am J Hum Genet, 2012 Aug 10;91(2), p. 224-37.
EPub date: 2012 Aug 2.
PMID: 22863193


Paradoxical results of adaptive false discovery rate procedures in neuroimaging studies.
Authors: Reiss PT, Schwartzman A, Lu F, Huang L, Proal E
Source: Neuroimage, 2012 Dec;63(4), p. 1833-40.
EPub date: 2012 Jul 27.
PMID: 22842214


Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test.
Authors: Cai T, Lin X, Carroll RJ
Source: Biostatistics, 2012 Sep;13(4), p. 776-90.
EPub date: 2012 Jun 25.
PMID: 22734045


Optimal tests for rare variant effects in sequencing association studies.
Authors: Lee S, Wu MC, Lin X
Source: Biostatistics, 2012 Sep;13(4), p. 762-75.
EPub date: 2012 Jun 14.
PMID: 22699862


Carboplatin and paclitaxel with vs without bevacizumab in older patients with advanced non-small cell lung cancer.
Authors: Zhu J, Sharma DB, Gray SW, Chen AB, Weeks JC, Schrag D
Source: JAMA, 2012 Apr 18;307(15), p. 1593-601.
PMID: 22511687