Our Largest Genomic Dataset: Curated Data Repository version 9

We’re excited to announce a landmark moment for the All of Us Research Program. The Curated Data Repository version 9 (CDRv9) is now available exclusively in the updated Researcher Workbench, with data from more than 747,000 participants!

CDRv9 delivers unprecedented scale, diversity, and depth as the world’s largest integrated dataset that combines genomic data with real-world clinical and wearable data. More than 645,000 participants in the dataset come from backgrounds that are underrepresented in biomedical research. CDRv9 is built to support rare variant discovery, biomarker identification, AI/ML innovation, drug target discovery, and improved disease risk prediction.

The CDRv9 for both tiers (Controlled Tier C2025Q4R6 and Registered Tier R2025Q4R6) includes participant data with a cutoff date of January 1, 2025.

New data in CDRv9 release

  • New omics data with nearly 9,000 RNA-seq and nearly 10,000 proteomics samples available in the Researcher Workbench.
  • The first release of clinical notes data for more than 99,000 participants from selected sites in the Controlled Tier dataset, including an initial extraction of concept codes using natural language processing (NLP).
  • Expanded sleep data in the newly released Fitbit tables sleep_daily_summary_30dayavg, sleep_daily_summary_counts, sleep_daily_summary_ext, and sleep_level_short.

Overview of Updates

  • Survey data from more than 747,000 participants. 
  • Physical measurement data from more than 600,000 participants including data self-reported by participants remotely.
  • Electronic health record data from more than 481,000 participants.
  • Fitbit data from more than 68,000 participants, including data from the Wearables Enhancing All of Us Research (WEAR) Study.
  • Whole genome sequencing data from more than 535,000 participants.
  • More than 14,000 long-read genomes.
  • Exploring the Minds data from more than 97,000 participants.

Learn more about the All of Us CDRv9 in the CDRv9 Data Characterization Report.
