National Institutes of Health: National Cancer Institute: Division of Cancer Control and Population Sciences
Grant Details

Grant Number: 5R37CA076404-17
Primary Investigator: Lin, Xihong
Organization: Harvard School Of Public Health
Project Title: Statistical Methods for Correlated and High-Dimensional Biomedical Data
Fiscal Year: 2013


Correlated and high-dimensional data arise frequently in health sciences research, especially in cancer research. Correlated data arise in longitudinal studies and familial studies, while high-dimensional data have emerged in recent years as a consequence of the rapid advance of genomic and proteomic research. We propose in this application to develop nonparametric and semiparametric regression methods for clustered/longitudinal data and high-dimensional genomic and proteomic data. Specifically, we propose to develop (1) the kernel (spline) profile EM method for generalized semiparametric mixed models for clustered/longitudinal data; (2) nonparametric and semiparametric regression models for longitudinal data with dropouts; (3) the mixed model kernel machine method for generalized semiparametric regression models and semiparametric Cox models for the analysis of gene expression pathways and tag single nucleotide polymorphisms (SNPs) within a candidate gene, and the sparse kernel machine (SKM) method for selecting genes and tag SNPs from a large pool of genes or tag SNPs; (4) the joint modeling method using functional wavelet models and generalized semiparametric models for mass spectrometry proteomic data and disease outcomes. Asymptotic properties of the proposed methods will be investigated and simulation studies will be conducted to evaluate their finite sample performance. Efficient numerical algorithms and user-friendly statistical software will be developed, with the goal of disseminating these models and methods to health sciences researchers. In collaboration with biomedical investigators, we will apply the proposed models and methods to several motivating data sets on cancer research and other fields of research.



Semiparametric Regression For Periodic Longitudinal Hormone Data From Multiple Menstrual Cycles
Authors: Zhang D., Lin X., Sowers M.
Source: Biometrics, 2000 Mar; 56(1), p. 31-9.
PMID: 10783774

A Scaled Linear Mixed Model For Multiple Outcomes
Authors: Lin X., Ryan L., Sammel M., Zhang D., Padungtod C., Xu X.
Source: Biometrics, 2000 Jun; 56(2), p. 593-601.
PMID: 10877322

Latent Variable Models For Longitudinal Data With Multiple Continuous Outcomes
Authors: Roy J., Lin X.
Source: Biometrics, 2000 Dec; 56(4), p. 1047-54.
PMID: 11129460

A Tobit Variance-component Method For Linkage Analysis Of Censored Trait Data
Authors: Epstein M.P., Lin X., Boehnke M.
Source: American Journal Of Human Genetics, 2003 Mar; 72(3), p. 611-20.
PMID: 12587095

A Population Pharmacokinetic Model With Time-dependent Covariates Measured With Errors
Authors: Li L., Lin X., Brown M.B., Gupta S., Lee K.H.
Source: Biometrics, 2004 Jun; 60(2), p. 451-60.
PMID: 15180671

Mixtures Of Varying Coefficient Models For Longitudinal Data With Discrete Or Continuous Nonignorable Dropout
Authors: Hogan J.W., Lin X., Herman B.
Source: Biometrics, 2004 Dec; 60(4), p. 854-64.
PMID: 15606405

A Varying-coefficient Cox Model For The Effect Of Age At A Marker Event On Age At Menopause
Authors: Nan B., Lin X., Lisabeth L.D., Harlow S.D.
Source: Biometrics, 2005 Jun; 61(2), p. 576-83.
PMID: 16011707

Missing Covariates In Longitudinal Data With Informative Dropouts: Bias Analysis And Inference
Authors: Roy J., Lin X.
Source: Biometrics, 2005 Sep; 61(3), p. 837-46.
PMID: 16135036

Quantitative Quality-assessment Techniques To Compare Fractionation And Depletion Methods In SELDI-TOF Mass Spectrometry Experiments
Authors: Harezlak J., Wang M., Christiani D., Lin X.
Source: Bioinformatics (Oxford, England), 2007 Sep 15; 23(18), p. 2441-8.
PMID: 17626063

Two-stage Functional Mixed Models For Evaluating The Effect Of Longitudinal Covariate Profiles On A Scalar Outcome
Authors: Zhang D., Lin X., Sowers M.
Source: Biometrics, 2007 Jun; 63(2), p. 351-62.
PMID: 17688488

Estimation Using Penalized Quasilikelihood And Quasi-pseudo-likelihood In Poisson Mixed Models
Authors: Lin X.
Source: Lifetime Data Analysis, 2007 Dec; 13(4), p. 533-44.
PMID: 18080833

A Powerful And Flexible Multilocus Association Test For Quantitative Traits
Authors: Kwee L.C., Liu D., Lin X., Ghosh D., Epstein M.P.
Source: American Journal Of Human Genetics, 2008 Feb; 82(2), p. 386-97.
PMID: 18252219

Semiparametric Modeling Of Longitudinal Measurements And Time-to-event Data--a Two-stage Regression Calibration Approach
Authors: Ye W., Lin X., Taylor J.M.
Source: Biometrics, 2008 Dec; 64(4), p. 1238-46.
PMID: 18261160

Analysis Of Case-control Age-at-onset Data Using A Modified Case-cohort Method
Authors: Nan B., Lin X.
Source: Biometrical Journal. Biometrische Zeitschrift, 2008 Apr; 50(2), p. 311-20.
PMID: 18318038

Bayesian Inference In Semiparametric Mixed Models For Longitudinal Data
Authors: Li Y., Lin X., Müller P.
Source: Biometrics, 2010 Mar; 66(1), p. 70-8.
PMID: 19432777

Inverse Probability Of Censoring Weighted Estimates Of Kendall's τ For Gap Time Analyses
Authors: Lakhal-Chaieb L., Cook R.J., Lin X.
Source: Biometrics, 2010 Dec; 66(4), p. 1145-52.
PMID: 20337629

Powerful SNP-set Analysis For Case-control Genome-wide Association Studies
Authors: Wu M.C., Kraft P., Epstein M.P., Taylor D.M., Chanock S.J., Hunter D.J., Lin X.
Source: American Journal Of Human Genetics, 2010 Jun 11; 86(6), p. 929-42.
PMID: 20560208

Increased Power For The Analysis Of Label-free LC-MS/MS Proteomics Data By Combining Spectral Counts And Peptide Peak Attributes
Authors: Dicker L., Lin X., Ivanov A.R.
Source: Molecular & Cellular Proteomics : MCP, 2010 Dec; 9(12), p. 2704-18.
PMID: 20823122

Semiparametric Frailty Models For Clustered Failure Time Data
Authors: Yu Z., Lin X., Tu W.
Source: Biometrics, 2012 Jun; 68(2), p. 429-36.
PMID: 22070739