The places people reside throughout their lives play an important role in their health and in their
propensity to develop diseases such as cancer. However, the longitudinal spatiotemporal contexts of where
people live are not commonly incorporated into cancer studies. Recent advances in information technology and
“big data” and associated analytic approaches have made it possible for cancer registries and researchers to
capture residential histories at the population level. We propose to develop a large multi-dimensional
database for cancer patients using multiple data sources to reconstruct their longitudinal residential and
exposure histories, and to identify potential patient exposure profiles using data mining techniques
guided by scientific evidence from the cancer epidemiology and environmental health literature. We will
demonstrate the feasibility and identify advantages and challenges of such an approach by using
mesothelioma as an example.
We hypothesize that there are distinct spatiotemporal environmental exposure trajectories and
exposure profiles among mesothelioma patients that can be identified using residential histories. Our specific
aims are: Aim 1: Develop an optimal algorithm to streamline the process of compiling, cleaning, verifying, and
constructing the residential histories of mesothelioma patients diagnosed between 2011 and 2015 in New York,
as reported to the New York State Cancer Registry (NYSCR), utilizing multiple commercial and governmental
data sources; Aim 2: Develop an optimal algorithm to streamline the process of compiling, cleaning, verifying,
and constructing the exposure history associated with each mesothelioma patient's residential history by
leveraging exposure proxies at the individual residence level and area-level information associated with
the patient's residential addresses, utilizing multiple commercial and governmental data sources; and Aim 3:
Visualize the spatiotemporal dynamics of patients' residential and exposure histories, and identify predictors of
their exposure profiles, using advanced data mining techniques such as cluster analysis, latent class analysis,
and network analysis. The proposal is innovative in both the methods for constructing the database and the
analytical methods for uncovering important exposure profiles, such as critical exposure windows,
environmental clusters/hotspots, and the relative contributions of exposures across space and time. To our
knowledge, no similar database exists at present. The residential data compiled in this project will be
permanently stored within the NYSCR to allow future use, the first such example by any cancer registry. The
identified exposure phenotypes will contribute to a better understanding of the role environmental exposure plays
in mesothelioma disease development. The methods developed can be tested, scaled up, replicated by other
states, and adapted to other cancers and non-cancer-related conditions. This life-course perspective approach
holds great potential for advancing cancer research as well as for routine cancer registry surveillance.