Click here for a downloadable .pdf version.
This document details the quality control (QC) steps that the All of Us Genome Centers (GC) and Data and Research Center (DRC) apply to genomic data in the research pipeline. The pipeline removes or flags samples and variants that fail quality thresholds, and these QC steps are applied before the genomic data are released for research use. We, the All of Us DRC, describe only the QC processes that are performed analytically (i.e., after the sample has been genotyped and sequenced). These pipelines are automated unless otherwise noted.
All descriptions and results are limited to the v7 data release, made available in the Researcher Workbench on April 20, 2023. This release contains 312,945 genotyping array (“array”) samples, 245,394 short read whole genome sequencing (srWGS) samples with single nucleotide polymorphism, insertion, and deletion variant calls (SNPs and Indels), 97,940 srWGS samples with structural variant (SV) calls, and 1,027 long read whole genome sequencing (lrWGS) samples with SNP, Indel, and SV calls. The samples in the genomic data correspond to the All of Us Curated Data Repository (CDR) release C2022Q4R9 (“v7”), with one exception described in Known Issue #1: 20 array samples (less than 0.01%) and six srWGS samples (less than 0.01%) are missing their corresponding CDR data.
This document covers all genomic data types made available to researchers at this time, including small variants (SNPs and Indels), structural variants, raw data, and auxiliary data. Small variants are available for array samples, srWGS samples, and lrWGS samples.
Structural variants are available for srWGS samples and lrWGS samples. Please note: as of June 17, 2024, we have released an additional set of v7 srWGS SVs, increasing the share of srWGS samples that have SV data to approximately 40%. To learn more about the QC process for these 97,940 srWGS samples with SV calls, please see the All of Us Short-Read Structural Variant Quality Report.
The QC information for the original 11,390 v7 srWGS samples with SV calls is available in the archived QC report here.
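As a quick sanity check on the proportions quoted above, the following is a minimal Python sketch that recomputes them from the sample counts stated in this document. The constant names and the `percent` helper are illustrative assumptions, not part of any All of Us tooling.

```python
# Sample counts quoted in this document for the v7 release.
# All names below are hypothetical labels chosen for this sketch.
ARRAY_SAMPLES = 312_945        # genotyping array samples
SRWGS_SAMPLES = 245_394        # srWGS samples with SNP/Indel calls
SRWGS_SV_SAMPLES = 97_940      # srWGS samples with SV calls (after June 17, 2024)
SRWGS_SV_ORIGINAL = 11_390     # srWGS samples with SV calls in the original v7 release
ARRAY_MISSING_CDR = 20         # array samples missing CDR data (Known Issue #1)
SRWGS_MISSING_CDR = 6          # srWGS samples missing CDR data (Known Issue #1)


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


summary = {
    "array samples missing CDR data": percent(ARRAY_MISSING_CDR, ARRAY_SAMPLES),
    "srWGS samples missing CDR data": percent(SRWGS_MISSING_CDR, SRWGS_SAMPLES),
    "srWGS samples with SV calls (current)": percent(SRWGS_SV_SAMPLES, SRWGS_SAMPLES),
    "srWGS samples with SV calls (original)": percent(SRWGS_SV_ORIGINAL, SRWGS_SAMPLES),
}

for label, value in summary.items():
    print(f"{label}: {value:.4f}%")
```

Under these counts, both missing-CDR fractions fall below 0.01%, and the share of srWGS samples with SV calls rises from roughly 4.6% to roughly 39.9%, consistent with the 40% figure given above.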