|Grant Number:||5R37CA076404-16|
|Primary Investigator:||Lin, Xihong|
|Organization:||Harvard School of Public Health|
|Project Title:||Statistical Methods for Correlated and High-Dimensional Biomedical Data|
Correlated and high-dimensional data arise frequently in health sciences research, especially in cancer research. Correlated data arise in longitudinal and familial studies, while high-dimensional data have emerged in recent years as a consequence of rapid advances in genomic and proteomic research. We propose in this application to develop nonparametric and semiparametric regression methods for clustered/longitudinal data and high-dimensional genomic and proteomic data. Specifically, we propose to develop (1) the kernel (spline) profile EM method for generalized semiparametric mixed models for clustered/longitudinal data; (2) nonparametric and semiparametric regression models for longitudinal data with dropouts; (3) the mixed model kernel machine method for generalized semiparametric regression models and semiparametric Cox models for the analysis of gene expression pathways and tag single nucleotide polymorphisms (SNPs) within a candidate gene, and the sparse kernel machine (SKM) method for selecting genes and tag SNPs from a large pool of genes or tag SNPs; and (4) the joint modeling method using functional wavelet models and generalized semiparametric models for mass spectrometry proteomic data and disease outcomes. Asymptotic properties of the proposed methods will be investigated, and simulation studies will be conducted to evaluate their finite-sample performance. Efficient numerical algorithms and user-friendly statistical software will be developed, with the goal of disseminating these models and methods to health sciences researchers. In collaboration with biomedical investigators, we will apply the proposed models and methods to several motivating data sets from cancer research and other fields.
Semiparametric frailty models for clustered failure time data.
Authors: Yu Z, Lin X, Tu W
Source: Biometrics, 2012 Jun;68(2), p. 429-36.
EPub date: 2011 Nov 9.
Increased power for the analysis of label-free LC-MS/MS proteomics data by combining spectral counts and peptide peak attributes.
Authors: Dicker L, Lin X, Ivanov AR
Source: Mol Cell Proteomics, 2010 Dec;9(12), p. 2704-18.
EPub date: 2010 Sep 7.
Powerful SNP-set analysis for case-control genome-wide association studies.
Authors: Wu MC, Kraft P, Epstein MP, Taylor DM, Chanock SJ, Hunter DJ, Lin X
Source: Am J Hum Genet, 2010 Jun 11;86(6), p. 929-42.
Inverse probability of censoring weighted estimates of Kendall's τ for gap time analyses.
Authors: Lakhal-Chaieb L, Cook RJ, Lin X
Source: Biometrics, 2010 Dec;66(4), p. 1145-52.
Bayesian inference in semiparametric mixed models for longitudinal data.
Authors: Li Y, Lin X, Müller P
Source: Biometrics, 2010 Mar;66(1), p. 70-8.
EPub date: 2009 May 7.
Analysis of case-control age-at-onset data using a modified case-cohort method.
Authors: Nan B, Lin X
Source: Biom J, 2008 Apr;50(2), p. 311-20.
Semiparametric modeling of longitudinal measurements and time-to-event data: a two-stage regression calibration approach.
Authors: Ye W, Lin X, Taylor JM
Source: Biometrics, 2008 Dec;64(4), p. 1238-46.
EPub date: 2008 Feb 7.
A powerful and flexible multilocus association test for quantitative traits.
Authors: Kwee LC, Liu D, Lin X, Ghosh D, Epstein MP
Source: Am J Hum Genet, 2008 Feb;82(2), p. 386-97.
Estimation using penalized quasilikelihood and quasi-pseudo-likelihood in Poisson mixed models.
Authors: Lin X
Source: Lifetime Data Anal, 2007 Dec;13(4), p. 533-44.
EPub date: 2007 Dec 16.
Two-stage functional mixed models for evaluating the effect of longitudinal covariate profiles on a scalar outcome.
Authors: Zhang D, Lin X, Sowers M
Source: Biometrics, 2007 Jun;63(2), p. 351-62.
Quantitative quality-assessment techniques to compare fractionation and depletion methods in SELDI-TOF mass spectrometry experiments.
Authors: Harezlak J, Wang M, Christiani D, Lin X
Source: Bioinformatics, 2007 Sep 15;23(18), p. 2441-8.
EPub date: 2007 Jul 11.
Missing covariates in longitudinal data with informative dropouts: bias analysis and inference.
Authors: Roy J, Lin X
Source: Biometrics, 2005 Sep;61(3), p. 837-46.
A varying-coefficient Cox model for the effect of age at a marker event on age at menopause.
Authors: Nan B, Lin X, Lisabeth LD, Harlow SD
Source: Biometrics, 2005 Jun;61(2), p. 576-83.
Mixtures of varying coefficient models for longitudinal data with discrete or continuous nonignorable dropout.
Authors: Hogan JW, Lin X, Herman B
Source: Biometrics, 2004 Dec;60(4), p. 854-64.
A population pharmacokinetic model with time-dependent covariates measured with errors.
Authors: Li L, Lin X, Brown MB, Gupta S, Lee KH
Source: Biometrics, 2004 Jun;60(2), p. 451-60.
A tobit variance-component method for linkage analysis of censored trait data.
Authors: Epstein MP, Lin X, Boehnke M
Source: Am J Hum Genet, 2003 Mar;72(3), p. 611-20.
EPub date: 2003 Feb 13.
Latent variable models for longitudinal data with multiple continuous outcomes.
Authors: Roy J, Lin X
Source: Biometrics, 2000 Dec;56(4), p. 1047-54.
A scaled linear mixed model for multiple outcomes.
Authors: Lin X, Ryan L, Sammel M, Zhang D, Padungtod C, Xu X
Source: Biometrics, 2000 Jun;56(2), p. 593-601.
Semiparametric regression for periodic longitudinal hormone data from multiple menstrual cycles.
Authors: Zhang D, Lin X, Sowers M
Source: Biometrics, 2000 Mar;56(1), p. 31-9.