Grant Details
Grant Number: 5R44CA064112-03
Primary Investigator: Mehta, Cyrus
Organization: Cytel, Inc
Project Title: Smart Monte Carlo Methods for Analyzing Categorical Data
Fiscal Year: 1998
Abstract
Binary logistic regression and its extensions to unordered polytomous
response, ordered polytomous response, and Poisson response are among the
most popular mathematical models for the analysis of categorical data
with widespread applicability in the biomedical sciences. The usual
method of inference for such models is unconditional maximum likelihood.
For large, well-balanced data sets, or for models with only a few parameters,
this approach is satisfactory. However, unconditional maximum likelihood
estimation can produce inconsistent point estimates, inaccurate p-values
and inaccurate confidence intervals for small or imbalanced data sets,
and for sets with a large number of parameters relative to the number of
observations. Sometimes the method fails entirely as no estimates can
be found which maximize the unconditional likelihood function. A
methodologically sound alternative approach that has none of the above
drawbacks is the exact conditional approach. Here one estimates the
parameters of interest by computing the exact permutation distributions
of their sufficient statistics, conditional on the observed values of the
sufficient statistics for the remaining "nuisance" parameters. The major
stumbling block to exact permutational inference has always been the
heavy computational burden it imposes. Despite the availability of fast
numerical algorithms for the exact computations, there are numerous instances
where a data set is too large to be analyzed by the exact methods, yet
too sparse or imbalanced for the maximum likelihood approach to be
reliable. What is needed is a reliable Monte Carlo alternative to the
exact conditional approach which can bridge the gap between the exact and
asymptotic methods of inference. The problem is technically hard because
conventional Monte Carlo methods lead to massive rejection of samples
that do not satisfy the constraints of the conditional distribution. We
propose a network sampling approach to the Monte Carlo problem that we
believe is a major breakthrough for this difficult but important
problem.
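
As a rough illustration of the exact conditional approach and of the rejection problem described above, the following Python sketch enumerates every binary response vector for a tiny data set. It is not the network sampling method proposed in the grant, which the abstract does not detail. The function names, the toy data, and the choice of a one-sided test of a zero coefficient are all assumptions made for this example. The second value returned is the fraction of all response vectors that satisfy the nuisance constraints, which is the acceptance rate a naive sampler drawing responses uniformly at random would achieve.

```python
from itertools import product


def stat(col, y):
    """Sufficient statistic for one coefficient: sum of covariate * response."""
    return sum(c * v for c, v in zip(col, y))


def exact_conditional(x, nuisance_cols, y_obs):
    """Brute-force exact conditional test of a zero coefficient on x.

    Hypothetical helper for illustration only. Returns
    (one-sided p-value, fraction of all 2**n response vectors that match
    the observed nuisance sufficient statistics).
    """
    n = len(y_obs)
    target = [stat(c, y_obs) for c in nuisance_cols]
    t_obs = stat(x, y_obs)
    matching = 0  # vectors in the conditional reference set
    extreme = 0   # reference-set vectors at least as extreme as observed
    for y in product((0, 1), repeat=n):
        if [stat(c, y) for c in nuisance_cols] == target:
            matching += 1
            if stat(x, y) >= t_obs:
                extreme += 1
    return extreme / matching, matching / 2 ** n


# Toy data (assumed): 8 observations, one binary covariate of interest.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y_obs = [0, 0, 0, 1, 0, 1, 1, 1]
intercept = [1] * 8               # nuisance parameter 1
z = [1, 2, 3, 4, 1, 2, 3, 4]      # nuisance covariate 2 (assumed)

p1, acc1 = exact_conditional(x, [intercept], y_obs)
p2, acc2 = exact_conditional(x, [intercept, z], y_obs)
print(f"intercept only:   p = {p1:.4f}, acceptance = {acc1:.4f}")
print(f"intercept and z:  p = {p2:.4f}, acceptance = {acc2:.4f}")
```

Each extra nuisance constraint can only shrink the reference set, so the acceptance fraction falls as nuisance parameters are added, and full enumeration grows as 2**n. Both effects are why this brute-force version is usable only on very small examples.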
Publications
None