Grant Details

Grant Number: 5U24CA180996-08
Primary Investigator: Morgan, Martin
Organization: Roswell Park Cancer Institute Corp
Project Title: Cancer Genomics: Integrative and Scalable Solutions in R/Bioconductor
Fiscal Year: 2020


Abstract

Bioconductor is an ecosystem of more than 1,500 open-source software and data packages for the statistical analysis and comprehension of high-throughput genomic data. It is widely used by the cancer genomics research community for statistical analysis and visualization. This software ecosystem is supported by core data classes and methods, reused by both users and developers, that provide convenient representations and efficient operations for many kinds of high-throughput molecular data. Falling sequencing costs and single-cell assays enable increasingly resolved study of the molecular biology of cancer through combined assaying of DNA sequence, epigenetics, gene expression, protein, and other aspects of a single specimen, even at the single-cell level. These developments present new challenges in the complexity, size, and interpretability of the data. The overarching goal of this project is to create and adapt core Bioconductor software infrastructure to meet these challenges, through the following aims. First, we develop infrastructure for the analysis of single-cell multi-omic experiments. Second, we implement FAIR principles for improved somatic variant prioritization by defining a performant data architecture that harmonizes and integrates the large amount of experimental and annotation data available through Bioconductor. Users of our system will be able to create provenance-rich, interoperable reports on the structural and functional contexts of somatic variants for use in prioritization. Third, we develop scalable infrastructure for the curation, distribution, maintenance, discoverability, and usability of cancer data resources both within and outside Bioconductor. Finally, we develop a program of user training and new outreach approaches to support adoption of advanced Bioconductor infrastructure by developers of new cancer-related packages and of existing packages critical to the cancer research community.



Publications

saseR: Juggling offsets unlocks RNA-seq tools for fast and Scalable differential usage, Aberrant Splicing and Expression Retrieval.
Authors: Segers A., Gilis J., Van Heetvelde M., Risso D., De Baere E., Clement L.
Source: bioRxiv: The Preprint Server for Biology, 2024-10-16.
EPub date: 2024-10-16.
PMID: 39464066

Exploring public cancer gene expression signatures across bulk, single-cell and spatial transcriptomics data with signifinder Bioconductor package.
Authors: Pirrotta S., Masatti L., Bortolato A., Corrà A., Pedrini F., Aere M., Esposito G., Martini P., Risso D., Romualdi C., et al.
Source: NAR Genomics and Bioinformatics, 2024 Sep; 6(4), p. lqae138.
EPub date: 2024-10-03.
PMID: 39363890

The tidyomics ecosystem: Enhancing omic data analyses.
Authors: Hutchison W.J., Keyes T.J., tidyomics Consortium, Crowell H.L., Serizay J., Soneson C., Davis E.S., Sato N., Moses L., Tarlinton B., et al.
Source: bioRxiv: The Preprint Server for Biology, 2024-05-22.
EPub date: 2024-05-22.
PMID: 38826347

A multi-organ map of the human immune system across age, sex and ethnicity.
Authors: Mangiola S., Milton M., Ranathunga N., Li-Wai-Suen C., Odainic A., Yang E., Hutchison W., Garnham A., Iskander J., Pal B., et al.
Source: bioRxiv: The Preprint Server for Biology, 2024-04-29.
EPub date: 2024-04-29.
PMID: 38746418

Defining and benchmarking open problems in single-cell analysis.
Authors: Luecken M.D., Gigante S., Burkhardt D.B., Cannoodt R., Strobl D.C., Markov N.S., Zappia L., Palla G., Lewis W., Dimitrov D., et al.
Source: Research Square, 2024-04-04.
EPub date: 2024-04-04.
PMID: 38645152

Differential detection workflows for multi-sample single-cell RNA-seq data.
Authors: Gilis J., Perin L., Malfait M., Van den Berge K., Takele Assefa A., Verbist B., Risso D., Clement L.
Source: bioRxiv: The Preprint Server for Biology, 2023-12-19.
EPub date: 2023-12-19.
PMID: 38187695

BugSigDB captures patterns of differential abundance across a broad range of host-associated microbial signatures.
Authors: Geistlinger L., Mirzayi C., Zohra F., Azhar R., Elsafoury S., Grieve C., Wokaty J., Gamboa-Tuz S.D., Sengupta P., Hecht I., et al.
Source: Nature Biotechnology, 2023-09-11.
EPub date: 2023-09-11.
PMID: 37697152

Curated single cell multimodal landmark datasets for R/Bioconductor.
Authors: Eckenrode K.B., Righelli D., Ramos M., Argelaguet R., Vanderaa C., Geistlinger L., Culhane A.C., Gatto L., Carey V., Morgan M., et al.
Source: PLoS Computational Biology, 2023 Aug; 19(8), p. e1011324.
EPub date: 2023-08-25.
PMID: 37624866

CO-CLUSTERING OF SPATIALLY RESOLVED TRANSCRIPTOMIC DATA.
Authors: Sottosanti A., Risso D.
Source: The Annals of Applied Statistics, 2023 Jun; 17(2), p. 1444-1468.
EPub date: 2023-05-01.
PMID: 37811520

RaggedExperiment: the missing link between genomic ranges and matrices in Bioconductor.
Authors: Ramos M., Morgan M., Geistlinger L., Carey V.J., Waldron L.
Source: Bioinformatics (Oxford, England), 2023-05-19.
EPub date: 2023-05-19.
PMID: 37208161

signifinder enables the identification of tumor cell states and cancer expression signatures in bulk, single-cell and spatial transcriptomic data.
Authors: Pirrotta S., Masatti L., Corrà A., Pedrini F., Esposito G., Martini P., Risso D., Romualdi C., Calura E.
Source: bioRxiv: The Preprint Server for Biology, 2023-03-10.
EPub date: 2023-03-10.
PMID: 36945491

Designing spatial transcriptomic experiments.
Authors: Righelli D., Sottosanti A., Risso D.
Source: Nature Methods, 2023-03-02.
EPub date: 2023-03-02.
PMID: 36864198

A Bartlett-type correction for likelihood ratio tests with application to testing equality of Gaussian graphical models.
Authors: Banzato E., Chiogna M., Djordjilović V., Risso D.
Source: Statistics & Probability Letters, 2023 Feb; 193.
EPub date: 2022-11-09.
PMID: 38584807

benchdamic: benchmarking of differential abundance methods for microbiome data.
Authors: Calgaro M., Romualdi C., Risso D., Vitulo N.
Source: Bioinformatics (Oxford, England), 2023-01-01; 39(1).
PMID: 36477500

GenomicSuperSignature facilitates interpretation of RNA-seq experiments through robust, efficient comparison to public databases.
Authors: Oh S., Geistlinger L., Ramos M., Blankenberg D., van den Beek M., Taroni J.N., Carey V.J., Greene C.S., Waldron L., Davis S.
Source: Nature Communications, 2022-06-27; 13(1), p. 3695.
EPub date: 2022-06-27.
PMID: 35760813

SpatialExperiment: infrastructure for spatially resolved transcriptomics data in R using Bioconductor.
Authors: Righelli D., Weber L.M., Crowell H.L., Pardo B., Collado-Torres L., Ghazanfar S., Lun A.T.L., Hicks S.C., Risso D.
Source: Bioinformatics (Oxford, England), 2022-04-28.
EPub date: 2022-04-28.
PMID: 35482478

NewWave: a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA-seq data.
Authors: Agostinis F., Romualdi C., Sales G., Risso D.
Source: Bioinformatics (Oxford, England), 2022-03-10.
EPub date: 2022-03-10.
PMID: 35266509

Open-source Software Sustainability Models: Initial White Paper From the Informatics Technology for Cancer Research Sustainability and Industry Partnership Working Group.
Authors: Ye Y., Barapatre S., Davis M.K., Elliston K.O., Davatzikos C., Fedorov A., Fillion-Robin J.C., Foster I., Gilbertson J.R., Lasso A., et al.
Source: Journal of Medical Internet Research, 2021-12-02; 23(12), p. e20028.
EPub date: 2021-12-02.
PMID: 34860667

A mouse-specific retrotransposon drives a conserved Cdk2ap1 isoform essential for development.
Authors: Modzelewski A.J., Shao W., Chen J., Lee A., Qi X., Noon M., Tjokro K., Sales G., Biton A., Anand A., et al.
Source: Cell, 2021-10-28; 184(22), p. 5541-5558.e22.
EPub date: 2021-10-12.
PMID: 34644528

PsiNorm: a scalable normalization for single-cell RNA-seq data.
Authors: Borella M., Martello G., Risso D., Romualdi C.
Source: Bioinformatics (Oxford, England), 2021-09-09.
EPub date: 2021-09-09.
PMID: 34499096

Per-sample standardization and asymmetric winsorization lead to accurate clustering of RNA-seq expression profiles.
Authors: Risso D., Pagnotta S.M.
Source: Bioinformatics (Oxford, England), 2021-02-09.
EPub date: 2021-02-09.
PMID: 33560368

A spatially resolved brain region- and cell type-specific isoform atlas of the postnatal mouse brain.
Authors: Joglekar A., Prjibelski A., Mahfouz A., Collier P., Lin S., Schlusche A.K., Marrocco J., Williams S.R., Haase B., Hayes A., et al.
Source: Nature Communications, 2021-01-19; 12(1), p. 463.
EPub date: 2021-01-19.
PMID: 33469025

Toward a gold standard for benchmarking gene set enrichment analysis.
Authors: Geistlinger L., Csaba G., Santarelli M., Ramos M., Schiffer L., Turaga N., Law C., Davis S., Carey V., Morgan M., et al.
Source: Briefings in Bioinformatics, 2021-01-18; 22(1), p. 545-556.
PMID: 32026945

SIMON: Open-Source Knowledge Discovery Platform.
Authors: Tomic A., Tomic I., Waldron L., Geistlinger L., Kuhn M., Spreng R.L., Dahora L.C., Seaton K.E., Tomaras G., Hill J., et al.
Source: Patterns (New York, N.Y.), 2021-01-08; 2(1), p. 100178.
EPub date: 2021-01-08.
PMID: 33511368

Transparency and reproducibility in artificial intelligence.
Authors: Haibe-Kains B., Adam G.A., Hosny A., Khodakarami F., Massive Analysis Quality Control (MAQC) Society Board of Directors, Waldron L., Wang B., McIntosh C., Goldenberg A., Kundaje A., et al.
Source: Nature, 2020 Oct; 586(7829), p. E14-E16.
EPub date: 2020-10-14.
PMID: 33057217

Multiomic Integration of Public Oncology Databases in Bioconductor.
Authors: Ramos M., Geistlinger L., Oh S., Schiffer L., Azhar R., Kodali H., de Bruijn I., Gao J., Carey V.J., Morgan M., et al.
Source: JCO Clinical Cancer Informatics, 2020 Oct; 4, p. 958-971.
PMID: 33119407

Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data.
Authors: Calgaro M., Romualdi C., Waldron L., Risso D., Vitulo N.
Source: Genome Biology, 2020-08-03; 21(1), p. 191.
EPub date: 2020-08-03.
PMID: 32746888

Multiomic Analysis of Subtype Evolution and Heterogeneity in High-Grade Serous Ovarian Carcinoma.
Authors: Geistlinger L., Oh S., Ramos M., Schiffer L., LaRue R.S., Henzler C.M., Munro S.A., Daughters C., Nelson A.C., Winterhoff B.J., et al.
Source: Cancer Research, 2020-08-03.
EPub date: 2020-08-03.
PMID: 32747365

Global Alliance for Genomics and Health Meets Bioconductor: Toward Reproducible and Agile Cancer Genomics at Cloud Scale.
Authors: Carey V.J., Ramos M., Stubbs B.J., Gopaulakrishnan S., Oh S., Turaga N., Waldron L., Morgan M.
Source: JCO Clinical Cancer Informatics, 2020 May; 4, p. 472-479.
PMID: 32453635

Reliable Analysis of Clinical Tumor-Only Whole-Exome Sequencing Data.
Authors: Oh S., Geistlinger L., Ramos M., Morgan M., Waldron L., Riester M.
Source: JCO Clinical Cancer Informatics, 2020 Apr; 4, p. 321-335.
PMID: 32282230

The Impact of Stroma Admixture on Molecular Subtypes and Prognostic Gene Signatures in Serous Ovarian Cancer.
Authors: Schwede M., Waldron L., Mok S.C., Wei W., Basunia A., Merritt M.A., Mitsiades C.S., Parmigiani G., Harrington D.P., Quackenbush J., et al.
Source: Cancer Epidemiology, Biomarkers & Prevention: A Publication of the American Association for Cancer Research, Cosponsored by the American Society of Preventive Oncology, 2020 Feb; 29(2), p. 509-519.
EPub date: 2019-12-23.
PMID: 31871106

Orchestrating single-cell analysis with Bioconductor.
Authors: Amezquita R.A., Lun A.T.L., Becht E., Carey V.J., Carpp L.N., Geistlinger L., Martini F., Rue-Albrecht K., Risso D., Soneson C., et al.
Source: Nature Methods, 2019-12-02.
EPub date: 2019-12-02.
PMID: 31792435

MetaGxData: Clinically Annotated Breast, Ovarian and Pancreatic Cancer Datasets and their Use in Generating a Multi-Cancer Gene Signature.
Authors: Gendoo D.M.A., Zon M., Sandhu V., Manem V.S.K., Ratanasirigulchai N., Chen G.M., Waldron L., Haibe-Kains B.
Source: Scientific Reports, 2019-06-19; 9(1), p. 8770.
EPub date: 2019-06-19.
PMID: 31217513

HMP16SData: Efficient Access to the Human Microbiome Project Through Bioconductor.
Authors: Schiffer L., Azhar R., Shepherd L., Ramos M., Geistlinger L., Huttenhower C., Dowd J.B., Segata N., Waldron L.
Source: American Journal of Epidemiology, 2019-06-01; 188(6), p. 1023-1026.
PMID: 30649166

Waldron et al. Reply to "Commentary on the HMP16SData Bioconductor Package".
Authors: Waldron L., Schiffer L., Azhar R., Ramos M., Geistlinger L., Segata N.
Source: American Journal of Epidemiology, 2019-06-01; 188(6), p. 1031-1032.
PMID: 30689687

Tobacco exposure associated with oral microbiota oxygen utilization in the New York City Health and Nutrition Examination Study.
Authors: Beghini F., Renson A., Zolnik C.P., Geistlinger L., Usyk M., Moody T.U., Thorpe L., Dowd J.B., Burk R., Segata N., et al.
Source: Annals of Epidemiology, 2019 Jun; 34, p. 18-25.e3.
EPub date: 2019-03-28.
PMID: 31076212

Sociodemographic variation in the oral microbiome.
Authors: Renson A., Jones H.E., Beghini F., Segata N., Zolnik C.P., Usyk M., Moody T.U., Thorpe L., Burk R., Waldron L., et al.
Source: Annals of Epidemiology, 2019-05-08.
EPub date: 2019-05-08.
PMID: 31151886

Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation.
Authors: Thomas A.M., Manghi P., Asnicar F., Pasolli E., Armanini F., Zolfo M., Beghini F., Manara S., Karcher N., Pozzi C., et al.
Source: Nature Medicine, 2019 Apr; 25(4), p. 667-678.
EPub date: 2019-04-01.
PMID: 30936548

Linear models enable powerful differential activity analysis in massively parallel reporter assays.
Authors: Myint L., Avramopoulos D.G., Goff L.A., Hansen K.D.
Source: BMC Genomics, 2019-03-12; 20(1), p. 209.
EPub date: 2019-03-12.
PMID: 30866806

Neuronal brain-region-specific DNA methylation and chromatin accessibility are associated with neuropsychiatric trait heritability.
Authors: Rizzardi L.F., Hickey P.F., Rodriguez DiBlasi V., Tryggvadóttir R., Callahan C.M., Idrizi A., Hansen K.D., Feinberg A.P.
Source: Nature Neuroscience, 2019 Feb; 22(2), p. 307-316.
EPub date: 2019-01-14.
PMID: 30643296

restfulSE: A semantically rich interface for cloud-scale genomics with Bioconductor.
Authors: Gopaulakrishnan S., Pollack S., Stubbs B.J., Pagès H., Readey J., Davis S., Waldron L., Morgan M., Carey V.
Source: F1000Research, 2019; 8, p. 21.
EPub date: 2019-01-07.
PMID: 30828438

BiocPkgTools: Toolkit for mining the Bioconductor package ecosystem.
Authors: Su S., Carey V.J., Shepherd L., Ritchie M., Morgan M.T., Davis S.
Source: F1000Research, 2019; 8, p. 752.
EPub date: 2019-05-29.
PMID: 31249680

TFutils: Data structures for transcription factor bioinformatics.
Authors: Stubbs B.J., Gopaulakrishnan S., Glass K., Pochet N., Everaert C., Raby B., Carey V.
Source: F1000Research, 2019; 8, p. 152.
EPub date: 2019-02-05.
PMID: 31297189

Consensus on Molecular Subtypes of High-Grade Serous Ovarian Carcinoma.
Authors: Chen G.M., Kannan L., Geistlinger L., Kofia V., Safikhani Z., Gendoo D.M.A., Parmigiani G., Birrer M., Haibe-Kains B., Waldron L.
Source: Clinical Cancer Research: An Official Journal of the American Association for Cancer Research, 2018-10-15; 24(20), p. 5037-5047.
EPub date: 2018-07-03.
PMID: 30084834

Continuity of transcriptomes among colorectal cancer subtypes based on meta-analysis.
Authors: Ma S., Ogino S., Parsana P., Nishihara R., Qian Z., Shen J., Mima K., Masugi Y., Cao Y., Nowak J.A., et al.
Source: Genome Biology, 2018-09-25; 19(1), p. 142.
EPub date: 2018-09-25.
PMID: 30253799

The impact of different sources of heterogeneity on loss of accuracy from genomic prediction models.
Authors: Zhang Y., Bernau C., Parmigiani G., Waldron L.
Source: Biostatistics (Oxford, England), 2018-09-06.
EPub date: 2018-09-06.
PMID: 30202918

Data and Statistical Methods To Analyze the Human Microbiome.
Authors: Waldron L.
Source: mSystems, 2018 Mar-Apr; 3(2).
EPub date: 2018-03-13.
PMID: 29556541

Orchestrating a community-developed computational workshop and accompanying training materials.
Authors: Davis S., Ramos M., Shepherd L., Turaga N., Geistlinger L., Morgan M.T., Haibe-Kains B., Waldron L.
Source: F1000Research, 2018; 7, p. 1656.
EPub date: 2018-10-17.
PMID: 30473781

Software for the Integration of Multiomics Experiments in Bioconductor.
Authors: Ramos M., Schiffer L., Re A., Azhar R., Basunia A., Rodriguez C., Chan T., Chapman P., Davis S.R., Gomez-Cabrero D., et al.
Source: Cancer Research, 2017-11-01; 77(21), p. e39-e42.
PMID: 29092936


