|Grant Number:||5P01CA134294-05|
|Primary Investigator:||Lin, Xihong|
|Organization:||Harvard University (Sch Of Public Hlth)|
|Project Title:||Statistical Informatics for Cancer Research|
DESCRIPTION (provided by applicant): We propose a Program Project, Statistical Informatics in Cancer Research, to tackle a series of problems motivated by the analysis of high dimensional data arising in population-based studies of cancer. This Program Project comprises three research projects and two cores. Project 1 focuses on spatio-temporal modeling of disease count data collected for administrative areas. The specific aims are motivated by problems encountered in epidemiological studies designed to monitor and assess health disparities. Our proposed methods address issues associated with administrative boundaries changing over time, sparse disease counts, spatial confounding, and heavy computational burdens for large data sets. Methods will be applied to data on U.S. breast cancer incidence from three state cancer registries, Boston-area premature mortality, and NCI SEER data. Project 2 is also motivated by spatially-indexed data related to cancer incidence and mortality, but the emphasis is on population surveillance and spatial cluster detection. Three of the specific aims of Project 2 are motivated by the analysis of NCI SEER data and one by a case/control study designed to assess spatial clustering in childhood leukemia. The case/control dataset also includes individual-level data on several genetic biomarkers of susceptibility. One sub-aim of this project assesses gene-space interaction by studying whether disease clustering patterns differ according to genetic polymorphisms. Project 3 focuses on methods for the analysis of very high dimensional genomic and proteomic biomarkers. Extensions to spatially indexed genomic data are also considered in Project 3. All of the aims of the three projects are closely integrated with the motivating real-world cancer studies in which the investigators are involved.
The three projects link thematically through a focus on population-based, observational studies in cancer, as well as technically through the consideration of high-dimensional correlated data (arising from different sources) that require advanced statistical and computing methods. Several specific techniques (e.g., spatio-temporal modeling, penalized likelihoods, False Discovery Rates, hidden Markov models) are shared between two and in some cases all three projects. The two cores consist of an Administrative Core and a Statistical Computing Core. The Administrative Core will coordinate the overall scientific direction and programmatic activities of the Program, which will include short courses, a visitor program, dissemination of research results, and an external advisory committee. A Statistical Computing Core will ensure the development and dissemination of open-access, good-quality, user-friendly software designed to implement the statistical methods developed in the Research Projects, which is the final Specific Aim of each of the three projects. The Program Director and Co-Director, Professors Louise Ryan and Xihong Lin, respectively, are internationally known biostatisticians with strong track records of academic administration.
Joint analysis of SNP and gene expression data in genetic association studies of complex diseases.
Authors: Huang YT, Vanderweele TJ, Lin X
Source: Ann Appl Stat, 2014 Mar 1;8(1), p. 352-376.
Uncertainty in Propensity Score Estimation: Bayesian Methods for Variable Selection and Model Averaged Causal Effects.
Authors: Zigler CM, Dominici F
Source: J Am Stat Assoc, 2014 Jan 1;109(505), p. 95-107.
National trends in pancreatic cancer outcomes and pattern of care among Medicare beneficiaries, 2000 through 2010.
Authors: Wang Y, Schrag D, Brooks GA, Dominici F
Source: Cancer, 2014 Apr 1;120(7), p. 1050-8.
EPub date: 2013 Dec 30.
GEE-based SNP set association test for continuous and discrete traits in family-based association studies.
Authors: Wang X, Lee S, Zhu X, Redline S, Lin X
Source: Genet Epidemiol, 2013 Dec;37(8), p. 778-86.
EPub date: 2013 Oct 25.
Gene set analysis using variance component tests.
Authors: Huang YT, Lin X
Source: BMC Bioinformatics, 2013 Jun 28;14, p. 210.
EPub date: 2013 Jun 28.
Consistent Group Identification and Variable Selection in Regression with Correlated Predictors.
Authors: Sharma DB, Bondell HD, Zhang HH
Source: J Comput Graph Stat, 2013 Apr 1;22(2), p. 319-340.
General framework for meta-analysis of rare variants in sequencing association studies.
Authors: Lee S, Teslovich TM, Boehnke M, Lin X
Source: Am J Hum Genet, 2013 Jul 11;93(1), p. 42-53.
EPub date: 2013 Jun 13.
Cross-ratio estimation for bivariate failure times with left truncation.
Authors: Hu T, Lin X, Nan B
Source: Lifetime Data Anal, 2014 Jan;20(1), p. 23-37.
EPub date: 2013 May 23.
Sequence kernel association tests for the combined effect of rare and common variants.
Authors: Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X
Source: Am J Hum Genet, 2013 Jun 6;92(6), p. 841-53.
EPub date: 2013 May 16.
Genome-wide association analysis for multiple continuous secondary phenotypes.
Authors: Schifano ED, Li L, Christiani DC, Lin X
Source: Am J Hum Genet, 2013 May 2;92(5), p. 744-59.
Parallelism, uniqueness, and large-sample asymptotics for the Dantzig selector.
Authors: Dicker L, Lin X
Source: Can J Stat, 2013 Mar 1;41(1), p. 23-35.
Exposure to airborne particulate matter is associated with methylation pattern in the asthma pathway.
Authors: Sofer T, Baccarelli A, Cantone L, Coull B, Maity A, Lin X, Schwartz J
Source: Epigenomics, 2013 Apr;5(2), p. 147-54.
Comparative effectiveness of three platinum-doublet chemotherapy regimens in elderly patients with advanced non-small cell lung cancer.
Authors: Zhu J, Sharma DB, Chen AB, Johnson BE, Weeks JC, Schrag D
Source: Cancer, 2013 Jun 1;119(11), p. 2048-60.
EPub date: 2013 Apr 5.
Variable selection and estimation in generalized linear models with the seamless L0 penalty.
Authors: Li Z, Wang S, Lin X
Source: Can J Stat, 2012 Dec;40(4), p. 745-769.
Test for interactions between a genetic marker set and environment in generalized linear models.
Authors: Lin X, Lee S, Christiani DC, Lin X
Source: Biostatistics, 2013 Sep;14(4), p. 667-81.
EPub date: 2013 Mar 5.
Family-based association tests for sequence data, and comparisons with population-based association tests.
Authors: Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X
Source: Eur J Hum Genet, 2013 Oct;21(10), p. 1158-62.
EPub date: 2013 Feb 6.
Model feedback in Bayesian propensity score estimation.
Authors: Zigler CM, Watts K, Yeh RW, Wang Y, Coull BA, Dominici F
Source: Biometrics, 2013 Mar;69(1), p. 263-73.
EPub date: 2013 Feb 4.
Multivariate Gene Selection and Testing in Studying the Exposure Effects on a Gene Set.
Authors: Sofer T, Maity A, Coull B, Baccarelli A, Schwartz J, Lin X
Source: Stat Biosci, 2012 Nov 1;4(2), p. 319-338.
Design and analysis issues in gene and environment studies.
Authors: Liu CY, Maity A, Lin X, Wright RO, Christiani DC
Source: Environ Health, 2012 Dec 19;11, p. 93.
EPub date: 2012 Dec 19.
Detecting rare variant effects using extreme phenotype sampling in sequencing association studies.
Authors: Barnett IJ, Lee S, Lin X
Source: Genet Epidemiol, 2013 Feb;37(2), p. 142-51.
EPub date: 2012 Nov 26.
SNP Set Association Analysis for Familial Data.
Authors: Schifano ED, Epstein MP, Bielak LF, Jhun MA, Kardia SL, Peyser PA, Lin X
Source: Genet Epidemiol, 2012 Sep 11.
EPub date: 2012 Sep 11.
Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies.
Authors: Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, NHLBI GO Exome Sequencing Project-ESP Lung Project Team, Christiani DC, Wurfel MM, Lin X
Source: Am J Hum Genet, 2012 Aug 10;91(2), p. 224-37.
EPub date: 2012 Aug 2.
Paradoxical results of adaptive false discovery rate procedures in neuroimaging studies.
Authors: Reiss PT, Schwartzman A, Lu F, Huang L, Proal E
Source: Neuroimage, 2012 Dec;63(4), p. 1833-40.
EPub date: 2012 Jul 27.
Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test.
Authors: Cai T, Lin X, Carroll RJ
Source: Biostatistics, 2012 Sep;13(4), p. 776-90.
EPub date: 2012 Jun 25.
Optimal tests for rare variant effects in sequencing association studies.
Authors: Lee S, Wu MC, Lin X
Source: Biostatistics, 2012 Sep;13(4), p. 762-75.
EPub date: 2012 Jun 14.
Carboplatin and paclitaxel with vs without bevacizumab in older patients with advanced non-small cell lung cancer.
Authors: Zhu J, Sharma DB, Gray SW, Chen AB, Weeks JC, Schrag D
Source: JAMA, 2012 Apr 18;307(15), p. 1593-601.