Grant Details

Grant Number: 5R01CA079949-09
Primary Investigator: Zhou, Haibo
Organization: Univ Of North Carolina Chapel Hill
Project Title: Statistical Methods for Outcome-Dependent Sampling
Fiscal Year: 2010


DESCRIPTION (provided by applicant): We will develop and evaluate improved statistical methods for the design and analysis of biomedical studies conducted with general biased sampling schemes: the univariate and multivariate outcome-auxiliary-dependent sampling (OADS) designs and the two-stage OADS designs. The advantage of such designs is that they allow both prospective and retrospective samples at the same time: the prospective sample provides the benefits of a cohort study, and the retrospective sample enables investigators to concentrate resources where there is the greatest amount of information, i.e., on judiciously chosen subsets based on the outcome and auxiliary covariate information. New statistical methods are needed to achieve the potential statistical efficiency. An extension of the simple ODS design that allows the sampling probability to depend on a continuous outcome and a continuous auxiliary covariate will be developed. We will also develop optimal two-stage OADS designs under the budget and precision/power constraints commonly encountered in practice. Tools and benchmarks for distinguishing among available sampling options in the planning stage of a study will be developed. These are the relative-budget-index (RBI) for the fixed precision/power case and the relative-gain-index (RGI) for the fixed budget case. The proposed methods are particularly useful in cancer and environmental research, where auxiliary exposure information and expensive exposure assessment are frequent challenges. The proposal consists of six projects. The first project deals with semiparametric efficient inference for the two-stage OADS design, where the first-stage data can come either from a simple random sample or from an ODS sample. The second project concerns the optimal two-stage OADS design for a fixed budget and the development of a formal evaluation criterion (RGI) that measures the closeness of an alternative design to the optimal one. The third project concerns the optimal two-stage OADS design for a given precision/power and the development of a formal evaluation criterion (RBI) that measures the closeness of an alternative design to the optimal one (the one with the minimal budget). The fourth project considers multivariate OADS and multivariate two-stage OADS designs and develops semiparametric inference for correlated responses under the multivariate OADS. The fifth project concerns a partial linear model for nonlinear exposure effects in both fixed and random effects regression analysis under OADS and two-stage OADS designs. The sixth project investigates variable selection and hypothesis testing techniques for data from the two-stage OADS design. The strengths and weaknesses of the proposed methods will be critically examined via theoretical investigations and simulations. Cost-effective sampling strategies in a given setting will be investigated. Comparisons with existing methods will be conducted. Related software will be developed. Data sets from epidemiologic and environmental studies on the effects of environmental exposures, and on cancer and other diseases, will be analyzed. These include the Cancer Risk in Uranium Miners Study, the Magnetic Fields and Breast Cancer Risk Study, the Collaborative Perinatal Project, the Family Heart Study, and the DDE-antiandrogen Study.
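
As a rough illustration of the sampling idea described above (not the investigators' software), the following Python sketch draws a simple ODS sample with a continuous outcome: a simple random sample from the whole cohort, supplemented by extra draws from the lower and upper tails of the outcome distribution. The function name `draw_ods_sample`, the cutpoints, the sample sizes, and the simulated standard-normal outcomes are all assumptions made for this example.

```python
import random


def draw_ods_sample(outcomes, n_srs, n_tail, low_cut, high_cut, seed=0):
    """Return index lists (srs, low, high) for a hypothetical ODS draw.

    srs  : simple random sample from the full cohort
    low  : supplemental draws from subjects with outcome < low_cut
    high : supplemental draws from subjects with outcome > high_cut
    """
    rng = random.Random(seed)
    ids = list(range(len(outcomes)))

    # Prospective component: a simple random sample of the cohort.
    srs = rng.sample(ids, n_srs)
    already_chosen = set(srs)

    # Retrospective component: supplemental samples from the two tails,
    # excluding subjects already selected in the simple random sample.
    low_pool = [i for i in ids
                if i not in already_chosen and outcomes[i] < low_cut]
    high_pool = [i for i in ids
                 if i not in already_chosen and outcomes[i] > high_cut]
    low = rng.sample(low_pool, min(n_tail, len(low_pool)))
    high = rng.sample(high_pool, min(n_tail, len(high_pool)))
    return srs, low, high


# Simulated cohort of 1000 continuous outcomes (assumed standard normal).
cohort_rng = random.Random(42)
population = [cohort_rng.gauss(0.0, 1.0) for _ in range(1000)]

srs, low, high = draw_ods_sample(
    population, n_srs=100, n_tail=50, low_cut=-1.0, high_cut=1.0, seed=7
)
print(len(srs), len(low), len(high))
```

Because the supplemental subjects are selected on the outcome, an analysis of such a sample has to account for the sampling scheme; a naive regression that treats all sampled subjects as a random sample would generally be biased. The sketch covers only the sampling step, not the inference methods proposed in the grant.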



Mixed Effect Regression Analysis For A Cluster-based Two-stage Outcome-auxiliary-dependent Sampling Design With A Continuous Outcome
Authors: Xu W., Zhou H.
Source: Biostatistics (Oxford, England), 2012 Sep; 13(4), p. 650-64.
PMID: 22723503

Partial Linear Inference For A 2-stage Outcome-dependent Sampling Design With A Continuous Outcome
Authors: Qin G., Zhou H.
Source: Biostatistics (Oxford, England), 2011 Jul; 12(3), p. 506-20.
PMID: 21156990

Semiparametric Inference For A 2-stage Outcome-auxiliary-dependent Sampling Design With Continuous Outcome
Authors: Zhou H., Wu Y., Liu Y., Cai J.
Source: Biostatistics (Oxford, England), 2011 Jul; 12(3), p. 521-34.
PMID: 21252082

Gaussian Process Based Bayesian Semiparametric Quantitative Trait Loci Interval Mapping
Authors: Huang H., Zhou H., Cheng F., Hoeschele I., Zou F.
Source: Biometrics, 2010 Mar; 66(1), p. 222-32.
PMID: 19459837

Estimated Pseudopartial-likelihood Method For Correlated Failure Time Data With Auxiliary Covariates
Authors: Liu Y., Zhou H., Cai J.
Source: Biometrics, 2009 Dec; 65(4), p. 1184-93.
PMID: 19432779

Outcome-dependent Sampling: An Efficient Sampling And Inference Procedure For Studies With A Continuous Outcome
Authors: Zhou H., Chen J., Rissanen T.H., Korrick S.A., Hu H., Salonen J.T., Longnecker M.P.
Source: Epidemiology (Cambridge, Mass.), 2007 Jul; 18(4), p. 461-8.
PMID: 17568219
