Grant Details

Grant Number: 5R44CA064112-03
Primary Investigator: Mehta, Cyrus
Organization: Cytel, Inc
Project Title: Smart Monte Carlo Methods for Analyzing Categorical Data
Fiscal Year: 1998


Binary logistic regression and its extensions to unordered polytomous response, ordered polytomous response, and Poisson response are among the most popular mathematical models for the analysis of categorical data, with widespread applicability in the biomedical sciences. The usual method of inference for such models is unconditional maximum likelihood. For large, well-balanced data sets, or for models with only a few parameters, this approach is satisfactory. However, unconditional maximum likelihood estimation can produce inconsistent point estimates, inaccurate p-values, and inaccurate confidence intervals for small or imbalanced data sets, and for data sets with a large number of parameters relative to the number of observations. Sometimes the method fails entirely because no estimates can be found that maximize the unconditional likelihood function. A methodologically sound alternative that has none of the above drawbacks is the exact conditional approach. Here one estimates the parameters of interest by computing the exact permutation distributions of their sufficient statistics, conditional on the observed values of the sufficient statistics for the remaining "nuisance" parameters. The major stumbling block to exact permutational inference has always been the heavy computational burden it imposes. Despite the availability of fast numerical algorithms for the exact computations, there are numerous instances where a data set is too large to be analyzed by the exact methods, yet too sparse or imbalanced for the maximum likelihood approach to be reliable. What is needed is a reliable Monte Carlo alternative to the exact conditional approach that can bridge the gap between the exact and asymptotic methods of inference. The problem is technically hard because conventional Monte Carlo methods lead to massive rejection of samples that do not satisfy the constraints of the conditional distribution.
We propose a network sampling approach to Monte Carlo inference that we believe is a major breakthrough for this difficult but important problem.
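
To make the rejection problem concrete, the following is a minimal sketch, not the network sampling method proposed in this grant. It assumes the simplest case: a logistic model with an intercept (the nuisance parameter) and one binary covariate, tested under the null hypothesis that the slope is zero. The covariate vector `x`, the observed response count `m`, and both function names are hypothetical and chosen only for illustration. The sketch enumerates the exact permutation distribution of the sufficient statistic t = sum(x_i * y_i) conditional on sum(y_i) = m, then shows how a naive sampler that ignores the constraint discards most of its draws.

```python
import itertools
import random
from collections import Counter
from math import comb


def exact_conditional_distribution(x, m):
    """Exact null distribution of t = sum(x_i * y_i) given sum(y_i) = m.

    Enumerates every binary response vector with exactly m ones. Under the
    null hypothesis (slope = 0) each such vector is equally likely.
    """
    n = len(x)
    counts = Counter()
    for ones in itertools.combinations(range(n), m):
        counts[sum(x[i] for i in ones)] += 1
    total = comb(n, m)
    return {t: c / total for t, c in sorted(counts.items())}


def naive_rejection_sample(x, m, draws, rng):
    """Draw y_i ~ Bernoulli(1/2) independently and keep only vectors
    that satisfy the conditioning constraint sum(y_i) = m."""
    kept = []
    for _ in range(draws):
        y = [rng.randint(0, 1) for _ in x]
        if sum(y) == m:
            kept.append(sum(xi * yi for xi, yi in zip(x, y)))
    return kept


# Hypothetical data: 10 subjects, 6 of them exposed (x = 1).
x = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
m = 3  # observed number of responses (sufficient statistic for the intercept)

exact = exact_conditional_distribution(x, m)

draws = 20000
rng = random.Random(0)
kept = naive_rejection_sample(x, m, draws, rng)
acceptance_rate = len(kept) / draws

print("exact conditional distribution of t:", exact)
print("fraction of naive draws accepted:", acceptance_rate)
```

With a single constraint on ten observations, the acceptance probability of the naive sampler is C(10,3)/2^10 = 120/1024, roughly 12 percent. Each additional nuisance parameter adds another constraint that a draw must satisfy, so the accepted fraction shrinks rapidly as the model grows, which is the difficulty the abstract describes.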


