Grant Details
Grant Number: 1R43CA093112-01
Primary Investigator: Mehta, Cyrus
Organization: Cytel, Inc
Project Title: Markov Chain Monte Carlo and Exact Logistic Regression
Fiscal Year: 2001
Abstract
DESCRIPTION (provided by applicant): Logistic regression is a very popular
model for the analysis of binary data with widespread applicability in the
physical, behavioral and biomedical sciences. Parameter inference for this
model is usually based on maximizing the unconditional likelihood function.
However, unconditional maximum likelihood inference can produce inconsistent
point estimates, inaccurate p-values and inaccurate confidence intervals for
small or unbalanced data sets and for data sets with a large number of
parameters relative to the number of observations. Sometimes the method fails
entirely as no estimates can be found that maximize the unconditional
likelihood function. A methodologically sound alternative approach that has
none of the aforementioned drawbacks is the exact conditional approach in which
one generates the permutation distributions of the sufficient statistics for
the parameters of interest conditional on fixing the sufficient statistics of
the remaining nuisance parameters at their observed values. The major stumbling
block to this approach is the heavy computational burden it imposes. Monte
Carlo methods attempt to overcome this problem by sampling from the reference
set of possible permutations instead of enumerating them all. Two competing
Monte Carlo methods are network-based sampling and Markov Chain Monte Carlo
(MCMC) sampling. Network sampling suffers from memory limitations, while MCMC
sampling can produce incorrect results if the Markov chain is not ergodic or if
the process has not reached its steady state. We propose a novel approach that
combines network and MCMC sampling, drawing upon the strengths of each and
overcoming their individual limitations. We propose to implement this
hybrid network-MCMC method in our LogXact software and as an external procedure
in the SAS system.
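The exact conditional idea described above can be illustrated on a toy problem. The sketch below is not the proposed hybrid network-MCMC method and is not LogXact's algorithm; it only shows, under simplifying assumptions, what "enumerating the reference set" and "sampling from it with a Markov chain" mean. The assumptions are: a single covariate of interest x, the intercept as the only nuisance parameter (so its sufficient statistic is the number of successes), and a test of the null hypothesis that the slope is zero, under which every response vector with the observed number of successes is equally likely. The function names and the data are hypothetical and chosen for illustration only.

```python
import itertools
import random
from collections import Counter
from fractions import Fraction


def exact_conditional_dist(x, s):
    """Null distribution of T = sum(x_i * y_i) given sum(y_i) = s.

    Enumerates every way of placing s successes among len(x) observations.
    Under a zero slope each placement is equally likely.
    """
    counts = Counter(
        sum(x[i] for i in ones)
        for ones in itertools.combinations(range(len(x)), s)
    )
    total = sum(counts.values())
    return {t: Fraction(c, total) for t, c in sorted(counts.items())}


def exact_upper_pvalue(x, y):
    """One-sided exact p-value P(T >= t_obs | sum(y) = s) by full enumeration."""
    t_obs = sum(xi * yi for xi, yi in zip(x, y))
    dist = exact_conditional_dist(x, sum(y))
    return sum(p for t, p in dist.items() if t >= t_obs)


def mcmc_upper_pvalue(x, y, n_steps=20000, burn_in=1000, seed=0):
    """Estimate the same p-value with a simple swap-based Markov chain.

    Each step moves one success to a randomly chosen failure position, which
    keeps sum(y) fixed. Every state has the same number of neighbours, so the
    chain is symmetric and its stationary distribution is uniform on the
    reference set. Assumes 0 < sum(y) < len(y).
    """
    rng = random.Random(seed)
    state = list(y)
    t_obs = sum(xi * yi for xi, yi in zip(x, y))
    t = t_obs
    hits = 0
    for step in range(burn_in + n_steps):
        ones = [i for i, v in enumerate(state) if v == 1]
        zeros = [i for i, v in enumerate(state) if v == 0]
        i, j = rng.choice(ones), rng.choice(zeros)
        state[i], state[j] = 0, 1      # move a success from position i to j
        t += x[j] - x[i]               # update the statistic incrementally
        if step >= burn_in and t >= t_obs:
            hits += 1
    return hits / n_steps


# Illustrative data (invented): six observations, three successes.
x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 0, 1, 1, 1]
print(exact_upper_pvalue(x, y))   # 1/20
print(mcmc_upper_pvalue(x, y))    # Monte Carlo estimate of the same quantity
```

In this toy setting the reference set has only 20 members, so enumeration is trivial and the swap chain can reach every member. With several nuisance covariates the reference set is constrained by more sufficient statistics, enumeration grows quickly, and simple swap moves may fail to connect all admissible response vectors, which is the ergodicity concern raised in the abstract.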
PROPOSED COMMERCIAL APPLICATION:
There is great demand for logistic regression software that can handle small, sparse or
unbalanced data sets by exact methods. Our LogXact package is the only software that
can provide exact inference for data sets that are not "toy problems". Yet even
LogXact quickly breaks down on moderate-sized problems. The new generation of hybrid
network-MCMC algorithms will handle substantially larger problems that nevertheless need
exact inference. The commercial potential is considerable since such data sets are common
in scientific studies.
Publications
None