5R03CA235363-02
Measuring Explained Variation in Survival Analysis
Although survival analysis is widely used in cancer research to construct prognostic factors for cancers and to
identify risk factors for cancer recurrence or survival after treatment, there is no consensus on how to measure
the variation in event times explained by available factors. Many analogues of the coefficient of
determination, also known as R-squared, have been proposed for proportional hazards models. However, some
measures are bounded above by a value much smaller than one, even when the event time is fully determined by
available factors, and others are overly sensitive to spuriously correlated factors. On the other hand, research is limited on such
measures for accelerated failure time models, and to the best of our knowledge, the only measure proposed
recently is based on parametrically partitioning the total variation into explained and unexplained parts, assuming
that the true model is known. To address this issue, the objective of this project is to develop proper statistics to
measure the variation of event times, under popular right censoring mechanisms, explained by available factors.
The premise of this proposal is that a variance function can be employed to describe the dependence of variation
on the pertinent mean, and quantifying the variation change along the variance function can measure explained
variation of heteroscedastic event times. In recent work on generalized linear models, it was demonstrated that
a variance-function-based R-squared appropriately measures the explained variation of non-Gaussian
responses. Building on this successful extension, the two-year research study proposed here focuses on the
following two specific aims: Aim 1. To measure explained variation for accelerated failure time models. While
each accelerated failure-time model presents a quadratic variance function, the team will construct the
variance-function-based R-squared for such survival models by addressing censoring issues via proper integration or
adjustment. Treating accelerated failure-time models as censored linear regression models, these studies will
also extend the classical R-squared with proper management of censoring issues. Aim 2. To measure explained
variation for proportional hazards models. Treating the partial likelihood function as the likelihood function of a
conditional logistic model, the investigators will construct the variance-function-based R-squared for the pertinent
conditional logistic model in order to measure the explained variation in the underlying proportional hazards
model. In addition, the team will construct a variance-function-based R-squared by measuring variation of an
underlying survival process, which presents a binary random variable at each specific time. A rigorous
experiment with both simulations and real cancer studies will be designed to validate the proposed measures
across different models in cancer research. The proposed measures will be implemented in a publicly available
R package, rsq, providing cancer researchers with a useful tool to conduct the necessary survival analysis. The
success of this project will ultimately help quantify and understand the heritability of different cancers.
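
The premise stated in the abstract, that explained variation can be quantified along a variance function, can be illustrated with a small sketch. It assumes that the "variation change" between two values is the squared arc length of the curve (t, V(t)) between them. This is our reading of the earlier generalized-linear-model work; the abstract does not spell it out. The function names, the toy data, and the choice V(mu) = mu^2 as a stand-in for a quadratic variance function are hypothetical, and censoring is ignored entirely.

```python
import math

def arc_length_sq(a, b, dV, steps=200):
    """Squared arc length of the curve (t, V(t)) between t=a and t=b.

    dV is the derivative V'(t); the integral of sqrt(1 + V'(t)^2) is
    approximated with Simpson's rule (steps must be even).
    """
    if a == b:
        return 0.0
    h = (b - a) / steps
    f = lambda t: math.sqrt(1.0 + dV(t) ** 2)
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return (total * h / 3.0) ** 2

def r_squared_v(y, mu_hat, dV):
    """Assumed form: 1 - (variation left around fitted means) / (variation around the overall mean)."""
    mu0 = sum(y) / len(y)
    unexplained = sum(arc_length_sq(yi, mi, dV) for yi, mi in zip(y, mu_hat))
    total = sum(arc_length_sq(yi, mu0, dV) for yi in y)
    return 1.0 - unexplained / total

def r_squared_classical(y, mu_hat):
    """Classical R-squared: 1 - SSE / SST."""
    mu0 = sum(y) / len(y)
    sse = sum((yi - mi) ** 2 for yi, mi in zip(y, mu_hat))
    sst = sum((yi - mu0) ** 2 for yi in y)
    return 1.0 - sse / sst

# Toy, fully observed responses and hypothetical fitted means.
y = [1.0, 2.0, 4.0, 7.0, 11.0]
mu_hat = [1.2, 2.1, 3.8, 6.5, 11.4]

constant = lambda t: 0.0       # V(mu) = sigma^2, so V'(mu) = 0
quadratic = lambda t: 2.0 * t  # V(mu) = mu^2, so V'(mu) = 2*mu (illustrative only)

r2_const = r_squared_v(y, mu_hat, constant)
r2_classic = r_squared_classical(y, mu_hat)
r2_quad = r_squared_v(y, mu_hat, quadratic)
print(r2_const, r2_classic, r2_quad)
```

Under a constant variance function the arc length reduces to the ordinary distance, so the sketch reproduces the classical R-squared. Under the quadratic variance function, deviations at large means are weighted more heavily.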
Genetic architecture of root and shoot ionomes in rice (Oryza sativa L.).
[...], Chen C., Shi Y., Maron L.G., Liu D., Rutzke M., Greenberg A., Craft E., Shaff J., Paul E., et al.
TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik, 2021 Aug; 134(8), p. 2613-2637.