Molecular technology for studying the genome of human cells leads to large
structured sets of categorical data. These data are used by cancer
researchers to understand the complex and variable sequence of genetic
changes that occur within cells of evolving tumors. The primary goal of
the proposed research is to develop a statistical methodology that will
assist oncologists in the analysis and interpretation of such data. In
particular, statistical methods are proposed for the localization of genes
associated with the cancer phenotype.
A very common experiment, used in the study of diverse cancers, involves
a panel of molecular markers either scattered throughout the genome or
from a single chromosomal region. By comparing signals from normal and
tumor cells, the oncologist can score each tumor-marker combination for
loss of heterozygosity. Putative tumor suppressor genes may exist in
regions commonly inactivated, and thus identifying such regions is of
critical importance. Inference from marker data must account for various
complexities: within-tumor variation, dependence of response between
nearby markers, the problem of multiple comparisons, known structural
features of chromosomes such as the locations of fragile sites, the
dependence of data from related cells, consequences of genetic instability
such as aneuploidy and background loss, and covariate information such as
levels of oncoproteins. Forgoing statistical analysis, or relying on naive
methods, makes inefficient use of valuable data and may even lead to
erroneous conclusions.
The evolutionary nature of tumor growth suggests a natural form for a
stochastic model of the changing genome--one based on genetic instability
and selection. Such a model creates a framework for parametrizing the
distribution of loss-of-heterozygosity data. Questions about the location
and action of putative suppressor genes can be formulated as questions
about components of the stochastic model, and thus classical inference
procedures can be applied.
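
To make the idea of parametrizing loss-of-heterozygosity data concrete, the following is a minimal sketch only, not the model proposed here. It assumes a toy form in which each marker is lost with a background probability (standing in for genetic instability) plus an excess (standing in for selection) that decays with distance from a hypothetical suppressor gene. The function names, the exponential decay, the parameter values, and the independence between markers are all illustrative assumptions.

```python
import math
import random

def loss_probability(position, gene_location, background=0.2, selection=0.6, scale=5.0):
    """Toy LOH probability at a marker (assumed form, for illustration only):
    background loss plus a selection term that decays with distance
    from a hypothetical suppressor gene."""
    excess = selection * math.exp(-abs(position - gene_location) / scale)
    return background + (1.0 - background) * excess

def simulate_loh(positions, gene_location, n_tumors, rng):
    """Return an n_tumors x n_markers table of 0/1 LOH calls.
    Each tumor-marker pair is drawn independently, which ignores the
    dependence between nearby markers that a realistic model must handle."""
    probs = [loss_probability(x, gene_location) for x in positions]
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(n_tumors)]

rng = random.Random(0)
positions = [0, 10, 20, 30, 40, 50]   # hypothetical marker map positions
data = simulate_loh(positions, gene_location=30, n_tumors=200, rng=rng)

# Observed loss frequency per marker; the marker nearest the gene should stand out.
loss_rates = [sum(column) / len(data) for column in zip(*data)]
print(loss_rates)
```

In this toy setting, a question about gene location becomes a question about the single parameter `gene_location`, which is the sense in which such questions can be posed as inference about model components.
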
Numerous technical questions arise about how and what to compute.
Bayesian and profile likelihood strategies are proposed to estimate gene
location given the model. Markov chain Monte Carlo methods are necessary
to implement the Bayesian strategy, and predictive distributions will be
studied to assess goodness of fit. Alternatively, bootstrap methods enable
frequency calibration of profile likelihood as well as methods for model
testing. Asymptotic analysis will give insight into the form of the
nonstandard likelihood surface. Computer simulation of the model will be
useful both to study bias and variance properties of the proposed methods
and as the basis for power calculations to design marker studies.
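
The profile likelihood and bootstrap steps described above can be illustrated with a small, hedged sketch. It assumes the same kind of toy model as a stand-in (background loss plus a distance-decaying selection effect, with independent markers), maximizes over the nuisance parameters on a coarse grid, and resamples tumors with replacement. None of the names, grids, or parameter values come from the proposal; they are hypothetical choices for illustration.

```python
import math
import random

def loss_prob(x, theta, background, selection, scale=5.0):
    # Assumed toy form: background loss plus selection decaying with distance.
    return background + (1.0 - background) * selection * math.exp(-abs(x - theta) / scale)

def log_lik(counts, n, positions, theta, background, selection):
    # Binomial log-likelihood per marker, treating markers as independent.
    total = 0.0
    for x, k in zip(positions, counts):
        p = loss_prob(x, theta, background, selection)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

# Coarse grid over the nuisance parameters (background, selection).
NUISANCE = [(b / 10, s / 10) for b in range(1, 10, 2) for s in range(1, 10, 2)]

def profile(counts, n, positions, theta_grid):
    # Profile log-likelihood: maximize over nuisance parameters at each theta.
    return [max(log_lik(counts, n, positions, t, b, s) for b, s in NUISANCE)
            for t in theta_grid]

def estimate(counts, n, positions, theta_grid):
    prof = profile(counts, n, positions, theta_grid)
    return theta_grid[prof.index(max(prof))]

def bootstrap_estimates(data, positions, theta_grid, n_boot, rng):
    # Nonparametric bootstrap: resample whole tumors with replacement.
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        counts = [sum(column) for column in zip(*sample)]
        estimates.append(estimate(counts, n, positions, theta_grid))
    return estimates

rng = random.Random(1)
positions = [0, 10, 20, 30, 40, 50]          # hypothetical marker positions
true_probs = [loss_prob(x, 30, 0.3, 0.5) for x in positions]
data = [[1 if rng.random() < p else 0 for p in true_probs] for _ in range(200)]

theta_grid = list(range(0, 51, 5))
counts = [sum(column) for column in zip(*data)]
theta_hat = estimate(counts, len(data), positions, theta_grid)

boot = sorted(bootstrap_estimates(data, positions, theta_grid, 200, rng))
interval = (boot[5], boot[194])              # rough 95% percentile interval
print(theta_hat, interval)
```

Repeating the simulation step over many data sets, with different numbers of tumors or marker spacings, is one simple way such a sketch could be extended toward the bias, variance, and power studies mentioned above.
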