Grant Details

Grant Number: 1U01CA214846-01
Primary Investigator: Carey, Vincent
Organization: Brigham and Women's Hospital
Project Title: Accelerating Cancer Genomics with Cloud-Scale Bioconductor
Fiscal Year: 2017


Abstract

The Bioconductor project is rooted in recognition that efficient, rigorous, and reproducible analysis of high-dimensional data can be achieved when statisticians, biologists, and computer scientists federate efforts in a transparent and carefully engineered way. The project Accelerating Cancer Genomics with Cloud-Scale Bioconductor devises new approaches to carrying out genome-scale analysis of cancer data using cloud computing environments. The proposal is based on strategies that have proven highly effective in fifteen years of supporting collaborative and carefully engineered software for genome-scale analysis in computational biology in the Bioconductor project, based on the highly portable and widely adopted R language and environment for data analysis. In Aim 1 we develop architecture and infrastructure for scalably harvesting cloud-based representations of large-scale cancer genome studies such as The Cancer Genome Atlas, creating formal high-performance workflows for processing and interpreting cancer genome analyses, and providing packaging and data distribution schemes for moving data to the cloud for scalable analysis there. In Aim 2 we create, and support independent creation of, intuitive and cancer-relevant interface components supporting reproducible interactive exploration and analysis using the facilities of RStudio. In Aim 3 we update and generalize the Bioconductor MLInterfaces metapackage to support advanced machine learning using the cancer-oriented strategies and facilities devised in Aims 1 and 2. Our proposal will benefit large numbers of cancer researchers who will be taking advantage of cloud resources, probably with R close to hand, by marrying the strengths of cloud-centric strategies for data archiving and query resolution to the strengths of Bioconductor development and analysis capabilities. We have letters of support for this project from the leadership of the three NCI Cancer Cloud Pilot projects.



Publications

GenomicSuperSignature facilitates interpretation of RNA-seq experiments through robust, efficient comparison to public databases.
Authors: Oh S., Geistlinger L., Ramos M., Blankenberg D., van den Beek M., Taroni J.N., Carey V.J., Greene C.S., Waldron L., Davis S.
Source: Nature Communications, 2022-06-27; 13(1), p. 3695.
EPub date: 2022-06-27.
PMID: 35760813

Multiomic Integration of Public Oncology Databases in Bioconductor.
Authors: Ramos M., Geistlinger L., Oh S., Schiffer L., Azhar R., Kodali H., de Bruijn I., Gao J., Carey V.J., Morgan M., et al.
Source: JCO Clinical Cancer Informatics, 2020 Oct; 4, p. 958-971.
PMID: 33119407

Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for Relevance to Radiography and Histopathology Imaging Biomarkers of Clinical Outcomes.
Authors: Gevaert O., Nabian M., Bakr S., Everaert C., Shinde J., Manukyan A., Liefeld T., Tabor T., Xu J., Lupberger J., et al.
Source: JCO Clinical Cancer Informatics, 2020 May; 4, p. 421-435.
PMID: 32383980

Global Alliance for Genomics and Health Meets Bioconductor: Toward Reproducible and Agile Cancer Genomics at Cloud Scale.
Authors: Carey V.J., Ramos M., Stubbs B.J., Gopaulakrishnan S., Oh S., Turaga N., Waldron L., Morgan M.
Source: JCO Clinical Cancer Informatics, 2020 May; 4, p. 472-479.
PMID: 32453635

Impact of Data Preprocessing on Integrative Matrix Factorization of Single Cell Data.
Authors: Hsu L.L., Culhane A.C.
Source: Frontiers in Oncology, 2020; 10, p. 973.
EPub date: 2020-06-23.
PMID: 32656082

Orchestrating single-cell analysis with Bioconductor.
Authors: Amezquita R.A., Lun A.T.L., Becht E., Carey V.J., Carpp L.N., Geistlinger L., Martini F., Rue-Albrecht K., Risso D., Soneson C., et al.
Source: Nature Methods, 2019-12-02.
EPub date: 2019-12-02.
PMID: 31792435

Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods.
Authors: Haas B.J., Dobin A., Li B., Stransky N., Pochet N., Regev A.
Source: Genome Biology, 2019-10-21; 20(1), p. 213.
EPub date: 2019-10-21.
PMID: 31639029

Combined Analysis of Metabolomes, Proteomes, and Transcriptomes of Hepatitis C Virus-Infected Cells and Liver to Identify Pathways Associated With Disease Development.
Authors: Lupberger J., Croonenborghs T., Roca Suarez A.A., Van Renne N., Jühling F., Oudot M.A., Virzì A., Bandiera S., Jamey C., Meszaros G., et al.
Source: Gastroenterology, 2019 Aug; 157(2), p. 537-551.e9.
EPub date: 2019-04-09.
PMID: 30978357

restfulSE: A semantically rich interface for cloud-scale genomics with Bioconductor.
Authors: Gopaulakrishnan S., Pollack S., Stubbs B.J., Pagès H., Readey J., Davis S., Waldron L., Morgan M., Carey V.
Source: F1000Research, 2019; 8, p. 21.
EPub date: 2019-01-07.
PMID: 30828438

BiocPkgTools: Toolkit for mining the Bioconductor package ecosystem.
Authors: Su S., Carey V.J., Shepherd L., Ritchie M., Morgan M.T., Davis S.
Source: F1000Research, 2019; 8, p. 752.
EPub date: 2019-05-29.
PMID: 31249680

TFutils: Data structures for transcription factor bioinformatics.
Authors: Stubbs B.J., Gopaulakrishnan S., Glass K., Pochet N., Everaert C., Raby B., Carey V.
Source: F1000Research, 2019; 8, p. 152.
EPub date: 2019-02-05.
PMID: 31297189



