All of Us Genomics & Multi-omics Quality Report

On June 26, 2026, the All of Us Research Program released genomic and multi-omic data in the Researcher Workbench (RW) for researchers registered for Controlled Tier access. The release covers 553,949 array samples, 535,662 srWGS samples with SNP and Indel calls, 96,405 srWGS samples with SV calls, 8,980 RNA Seq samples, 9,969 proteomics samples, and 14,521 lrWGS samples. As described previously [2], these high-quality genetic data, together with comprehensive health data, will enable health research and help catalog the genetic variation that contributes to human health and disease. For a snapshot of the data, see Table 1.

Dataset	Number of participants	Highlights
Array	553,949	More than 1.9 million variants. We added more than 100,000 new participants in CDRv9.
Short-read WGS SNP and Indel	535,662	More than 1.3 billion variants. We added more than 120,000 new participants with srWGS data in CDRv9. There are more than 125 million new variants as compared to the previous All of Us dataset. The All of Us srWGS dataset is now one of the largest srWGS datasets.
Short-read WGS structural variants (SVs)	96,405	Nearly 1.5 million variants
Long-read WGS	14,521	We added more than 11,000 new participants with long-read WGS data in CDRv9 (more than 2.5 million variants)
RNA Seq	8,980	NEW data type in the CDRv9 release. Gene counts and per-ancestry quantitative trait loci (QTL) analyses.
Proteomic samples	9,969	NEW data type in the CDRv9 release. Expression counts for more than 5,000 proteins.

Table 1 - Snapshot of All of Us CDRv9 genomics and multi-omics data

In addition to variant calls, raw data (IDAT files for array data, CRAM files for srWGS data, and BAM files for lrWGS data and RNA data) and auxiliary files (including variant annotations, pharmacogenomics, genetic ancestry categories, relatedness/kinship scores, HLA variant calls, and challenging medically relevant gene calls) are available in the RW through Controlled Tier access. Quality control processes, performed both on each sample independently and across samples, indicate that these data are ready for general analysis. We suggest that researchers, at a minimum, read the Known Issues and FAQ sections below before using the data.

CDRv9 Genomics & Multi-omics QC report.pdf (10 MB)
