|Grant Number:||5P01CA134294-05|
|Primary Investigator:||Lin, Xihong|
|Organization:||Harvard School Of Public Health|
|Project Title:||Statistical Informatics for Cancer Research|
DESCRIPTION (provided by applicant): We propose a Program Project, Statistical Informatics in Cancer Research, to tackle a series of problems motivated by the analysis of high dimensional data arising in population-based studies of cancer. This Program Project comprises three research projects and two cores. Project 1 focuses on spatio-temporal modeling of disease count data collected for administrative areas. The specific aims are motivated by problems encountered in epidemiological studies designed to monitor and assess health disparities. Our proposed methods address issues associated with administrative boundaries changing over time, sparse disease counts, spatial confounding, and heavy computational burdens for large data sets. Methods will be applied to data on U.S. breast cancer incidence from three state cancer registries, Boston-area premature mortality, and NCI SEER data. Project 2 is also motivated by spatially-indexed data related to cancer incidence and mortality, but the emphasis is on population surveillance and spatial cluster detection. Three of the specific aims of Project 2 are motivated by the analysis of NCI SEER data and one by a case/control study designed to assess spatial clustering in childhood leukemia. This dataset also includes individual level data on several genetic biomarkers of susceptibility. One sub-aim of this project assesses gene-space interaction by studying whether disease clustering patterns differ according to genetic polymorphisms. Project 3 focuses on methods for the analysis of very high dimensional genomic and proteomic biomarkers. Extensions to spatially indexed genomic data are also considered in Project 3. All of the aims of the three projects are closely integrated with the motivating real world cancer studies in which the investigators are involved.
The three projects link thematically through a focus on population-based, observational studies in cancer, as well as technically through the consideration of high-dimensional correlated data (arising from different sources) that require advanced statistical and computing methods. Several specific techniques (e.g. spatio-temporal modeling, penalized likelihoods, False Discovery Rates, hidden Markov models) are shared between two and in some cases all three projects. The two cores consist of an Administrative Core and a Statistical Computing Core. The Administrative Core will coordinate the overall scientific direction and programmatic activities of the Program, which will include short courses, a visitor program, dissemination of research results, and an external advisory committee. A Statistical Computing Core will ensure the development and dissemination of open access, good quality, user friendly software designed to implement the statistical methods developed in the Research Projects, which is the final Specific Aim of each of the three projects. The Program Director and Co-Director, Professors Louise Ryan and Xihong Lin, respectively, are internationally known biostatisticians with strong track records of academic administration.
Effect of flexible sigmoidoscopy screening on colorectal cancer incidence and mortality: a randomized clinical trial.
Authors: Holme Ø, Løberg M, Kalager M, Bretthauer M, Hernán MA, Aas E, Eide TJ, Skovlund E, Schneede J, Tveit KM, Hoff G
Source: JAMA, 2014 Aug 13;312(6), p. 606-15.
Rare-variant association analysis: study designs and statistical tests.
Authors: Lee S, Abecasis GR, Boehnke M, Lin X
Source: Am J Hum Genet, 2014 Jul 3;95(1), p. 5-23.
Does exposure prediction bias health-effect estimation?: The relationship between confounding adjustment and exposure prediction.
Authors: Cefalu M, Dominici F
Source: Epidemiology, 2014 Jul;25(4), p. 583-90.
Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies.
Authors: Aschard H, Vilhjálmsson BJ, Greliche N, Morange PE, Trégouët DA, Kraft P
Source: Am J Hum Genet, 2014 May 1;94(5), p. 662-76.
EPub date: 2014 Apr 17.
Joint analysis of SNP and gene expression data in genetic association studies of complex diseases.
Authors: Huang YT, VanderWeele TJ, Lin X
Source: Ann Appl Stat, 2014 Mar 1;8(1), p. 352-376.
Uncertainty in Propensity Score Estimation: Bayesian Methods for Variable Selection and Model Averaged Causal Effects.
Authors: Zigler CM, Dominici F
Source: J Am Stat Assoc, 2014 Jan 1;109(505), p. 95-107.
Methodological challenges in mendelian randomization.
Authors: VanderWeele TJ, Tchetgen Tchetgen EJ, Cornelis M, Kraft P
Source: Epidemiology, 2014 May;25(3), p. 427-35.
50-year trends in US socioeconomic inequalities in health: US-born Black and White Americans, 1959-2008.
Authors: Krieger N, Kosheleva A, Waterman PD, Chen JT, Beckfield J, Kiang MV
Source: Int J Epidemiol, 2014 Aug;43(4), p. 1294-313.
EPub date: 2014 Mar 16.
Ancestry estimation and control of population stratification for sequence-based association studies.
Authors: Wang C, Zhan X, Bragg-Gresham J, Kang HM, Stambolian D, Chew EY, Branham KE, Heckenlively J, FUSION Study, Fulton R, Wilson RK, Mardis ER, Lin X, Swaroop A, Zöllner S, Abecasis GR
Source: Nat Genet, 2014 Apr;46(4), p. 409-15.
EPub date: 2014 Mar 16.
National trends in pancreatic cancer outcomes and pattern of care among Medicare beneficiaries, 2000 through 2010.
Authors: Wang Y, Schrag D, Brooks GA, Dominici F
Source: Cancer, 2014 Apr 1;120(7), p. 1050-8.
EPub date: 2013 Dec 30.
Omnibus risk assessment via accelerated failure time kernel machine modeling.
Authors: Sinnott JA, Cai T
Source: Biometrics, 2013 Dec;69(4), p. 861-73.
EPub date: 2013 Nov 6.
GEE-based SNP set association test for continuous and discrete traits in family-based association studies.
Authors: Wang X, Lee S, Zhu X, Redline S, Lin X
Source: Genet Epidemiol, 2013 Dec;37(8), p. 778-86.
EPub date: 2013 Oct 25.
Gene set analysis using variance component tests.
Authors: Huang YT, Lin X
Source: BMC Bioinformatics, 2013 Jun 28;14, p. 210.
EPub date: 2013 Jun 28.
Consistent Group Identification and Variable Selection in Regression with Correlated Predictors.
Authors: Sharma DB, Bondell HD, Zhang HH
Source: J Comput Graph Stat, 2013 Apr 1;22(2), p. 319-340.
General framework for meta-analysis of rare variants in sequencing association studies.
Authors: Lee S, Teslovich TM, Boehnke M, Lin X
Source: Am J Hum Genet, 2013 Jul 11;93(1), p. 42-53.
EPub date: 2013 Jun 13.
Cross-ratio estimation for bivariate failure times with left truncation.
Authors: Hu T, Lin X, Nan B
Source: Lifetime Data Anal, 2014 Jan;20(1), p. 23-37.
EPub date: 2013 May 23.
Sequence kernel association tests for the combined effect of rare and common variants.
Authors: Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X
Source: Am J Hum Genet, 2013 Jun 6;92(6), p. 841-53.
EPub date: 2013 May 16.
Genome-wide association analysis for multiple continuous secondary phenotypes.
Authors: Schifano ED, Li L, Christiani DC, Lin X
Source: Am J Hum Genet, 2013 May 2;92(5), p. 744-59.
Parallelism, uniqueness, and large-sample asymptotics for the Dantzig selector.
Authors: Dicker L, Lin X
Source: Can J Stat, 2013 Mar 1;41(1), p. 23-35.
Exposure to airborne particulate matter is associated with methylation pattern in the asthma pathway.
Authors: Sofer T, Baccarelli A, Cantone L, Coull B, Maity A, Lin X, Schwartz J
Source: Epigenomics, 2013 Apr;5(2), p. 147-54.
Comparative effectiveness of three platinum-doublet chemotherapy regimens in elderly patients with advanced non-small cell lung cancer.
Authors: Zhu J, Sharma DB, Chen AB, Johnson BE, Weeks JC, Schrag D
Source: Cancer, 2013 Jun 1;119(11), p. 2048-60.
EPub date: 2013 Apr 5.
State Medicaid eligibility and care delayed because of cost.
Authors: Clark CR, Ommerborn MJ, Coull BA, Pham DQ, Haas J
Source: N Engl J Med, 2013 Mar 28;368(13), p. 1263-5.
Variable selection and estimation in generalized linear models with the seamless L0 penalty.
Authors: Li Z, Wang S, Lin X
Source: Can J Stat, 2012 Dec;40(4), p. 745-769.
Test for interactions between a genetic marker set and environment in generalized linear models.
Authors: Lin X, Lee S, Christiani DC, Lin X
Source: Biostatistics, 2013 Sep;14(4), p. 667-81.
EPub date: 2013 Mar 5.
Family-based association tests for sequence data, and comparisons with population-based association tests.
Authors: Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X
Source: Eur J Hum Genet, 2013 Oct;21(10), p. 1158-62.
EPub date: 2013 Feb 6.
Model feedback in Bayesian propensity score estimation.
Authors: Zigler CM, Watts K, Yeh RW, Wang Y, Coull BA, Dominici F
Source: Biometrics, 2013 Mar;69(1), p. 263-73.
EPub date: 2013 Feb 4.
Unmeasured confounding and hazard scales: sensitivity analysis for total, direct, and indirect effects.
Authors: VanderWeele TJ
Source: Eur J Epidemiol, 2013 Feb;28(2), p. 113-7.
EPub date: 2013 Feb 1.
Multivariate Gene Selection and Testing in Studying the Exposure Effects on a Gene Set.
Authors: Sofer T, Maity A, Coull B, Baccarelli A, Schwartz J, Lin X
Source: Stat Biosci, 2012 Nov 1;4(2), p. 319-338.