Data Access

Data Sharing

The Rare Genomes Project is based at the Broad Institute of MIT and Harvard, a non-profit research institution in Cambridge, Massachusetts. The aim of the project is to use genomic sequencing of affected individuals and their family members to identify the genetic causes of rare disease. Our team, led by principal investigator Heidi Rehm, is committed to open science and rapid data sharing, with the hope of improving the rate of rare disease diagnosis through collaborations with other researchers worldwide. De-identified data generated by the project will be shared on platforms including:

Matchmaker Exchange

Matchmaker Exchange (MME) allows researchers, laboratories, and clinicians to submit anonymized genotype and phenotype information to a federated platform of international stakeholders in the hopes of “matching” with others who submitted a case with overlapping clinical symptoms or genetic variants. The Rare Genomes Project enters variants of interest into MME on an ongoing basis. To access MME’s resources, individuals must submit their own cases or datasets to begin matching. To learn more about how to get started with MME’s services, visit their website.

ClinVar

ClinVar is a publicly accessible NCBI database that collects and displays information about genetic variants and their relationship to human disease. Clinical laboratories and researchers submit variant information along with the interpretation of its pathogenicity. Reportable variants identified through the project are most often submitted to ClinVar by our clinical partner, the Laboratory for Molecular Medicine. Direct submissions by the Rare Genomes Project are listed under the entry for the Broad Institute Rare Disease Group.

DUOS

The Broad Institute developed the Data Use Oversight System (DUOS) to streamline the sharing of de-identified clinical and genomic sequencing data with the research community. To access data generated by the Rare Genomes Project, researchers may visit the DUOS website, apply for access, and describe how they will use the data for their research study. The DUOS data access committee must review and approve each request before access to the data is granted. Approved researchers gain access to genomic sequencing data and clinical information from RGP families, with all personally identifiable information removed. The Rare Genomes Project submits data to DUOS on an ongoing basis under dataset name ‘RGP_Rehm_RareDisease_WGS’ and ID number DUOS_000008. The Rare Genomes Project does not process applications to DUOS. For more information about DUOS or to address questions about your application, please visit DUOS’s support page.

AnVIL

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-Space (AnVIL) is a federally funded resource for the genomics community that uses cloud-based infrastructure for genomic data access, sharing, and computing. AnVIL provides a collaborative environment with user interfaces for datasets and analysis workflows.

To access data generated by the Rare Genomes Project, researchers must apply for access. Approved researchers gain access to genomic sequencing data and clinical information from RGP families, with all personally identifiable information removed. The Rare Genomes Project submits data to AnVIL on an ongoing basis under Broad Institute Center for Mendelian Genomics in the subset workspace AnVIL_CMG_Broad_Rare_RGP_WGS, with ID number phs001272.v2.p1. The Rare Genomes Project does not process applications to AnVIL. For more information about AnVIL or to address questions about your application, please visit https://anvilproject.org/.

If you have further questions, please contact us by email at raregenomes@broadinstitute.org, or call us at 617-714-7395 (Toll-free: 855-534-4300).
