Grant Details
Grant Number: 5U01CA214846-03
Primary Investigator: Carey, Vincent
Organization: Brigham and Women's Hospital
Project Title: Accelerating Cancer Genomics with Cloud-Scale Bioconductor
Fiscal Year: 2019
Abstract
PROJECT SUMMARY
The Bioconductor project is rooted in recognition that efficient, rigorous, and reproducible analysis of
high-dimensional data can be achieved when statisticians, biologists, and computer scientists federate efforts in a
transparent and carefully engineered way. The project Accelerating Cancer Genomics with Cloud-scale
Bioconductor devises new approaches to carrying out genome-scale analysis of cancer data using cloud computing
environments. The proposal is based on strategies that have proven highly effective in fifteen years of supporting
collaborative and carefully engineered software for genome-scale analysis in computational biology in the
Bioconductor project, based on the highly portable and widely adopted R language and environment for data analysis.
In Aim 1 we develop architecture and infrastructure for scalably harvesting cloud-based representations of
large-scale cancer genome studies such as The Cancer Genome Atlas, creating formal high-performance workflows
for processing and interpreting cancer genome analyses, and providing packaging and data distribution schemes
for moving data to the cloud for scalable analysis there. In Aim 2 we create and support independent creation of
intuitive and cancer-relevant interface components supporting reproducible interactive exploration and analysis
using the facilities of RStudio. In Aim 3 we update and generalize the Bioconductor MLInterfaces metapackage to
support advanced machine learning using the cancer-oriented strategies and facilities devised in Aims 1 and 2.
Our proposal will benefit the large numbers of cancer researchers who will be taking advantage of cloud resources,
probably with R close to hand, by marrying the strengths of cloud-centric strategies for data archiving and query
resolution to the strengths of Bioconductor development and analysis capabilities. We have letters of support
from the leadership of the three NCI Cancer Cloud Pilot projects for this project.
Publications
None