All of Us glossary


AoU

All of Us Research Program


ATC

Anatomical Therapeutic Chemical; a classification scheme for drug data

Available Fields

Lists all fields present within the program data model with relevant metadata, including a description, the data provenance, and whether the field was impacted by privacy methods


CDM

Common Data Model

CDR

Curated Data Repository

Cleaning & Conformance

Details the cleaning and conformance rules run between CDR_base and CDR to shape the data so that it adheres to expected norms and conventions.


Cohort

A selected group of participants you are interested in researching; created with the Cohort Builder tool


Concept

Information in a patient’s medical record, including diagnosed conditions (“conditions”), prescribed medications (“drugs”), or recorded physical measurements (“measurements”)

Concept Generalizations

Details all concepts (rows) which are generalized in the data model with relevant metadata, including the Concept ID, the data provenance, a description of the generalization applied, and the expected generalization output

Concept Set 

A saved collection of concepts from a particular domain (conditions, drugs, or measurements) to use for analysis

Concept Suppressions

Details all concepts (rows) which are suppressed (removed) from the data model with relevant metadata, including the Concept ID and the data provenance

Core Participant

A participant who has consented to providing EHR data and has completed the 3 main surveys

CPT Codes

Current Procedural Terminology; a list of descriptive terms and identifying numeric codes used by physicians and healthcare professionals for the billing of medical services and procedures


DRC

Data and Research Center


DV

Direct Volunteer; a participant who did not register through an HPO location


EHR

Electronic health record


FHIR

Fast Healthcare Interoperability Resources

Field Generalizations

Details all fields (columns) which are generalized in the data model with relevant metadata, including a field description, the data provenance, a description of the generalization applied, and the expected generalization output

Field Suppressions

Details all fields (columns) which are suppressed (set to null) in the data model with relevant metadata, including a description and the data provenance
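The Field Generalizations and Field Suppressions entries can be illustrated with a minimal Python sketch. The field names, the example row, and the generalization rule below are hypothetical and chosen only for illustration; they are not the program's actual privacy rules.

```python
def suppress_fields(row, suppressed):
    """Return a copy of the row with every suppressed field set to None (null)."""
    return {name: (None if name in suppressed else value)
            for name, value in row.items()}


def generalize_field(row, field, rule):
    """Return a copy of the row with `rule` applied to one non-null field."""
    out = dict(row)
    if out.get(field) is not None:
        out[field] = rule(out[field])
    return out


# Hypothetical row and rules, for illustration only.
row = {"person_id": 1, "zip_code": "02139", "provider_name": "Dr. Example"}

# Suppression: the column stays in the table, but its value becomes null.
row = suppress_fields(row, {"provider_name"})

# Generalization: the value is replaced by a coarser one (here, a ZIP prefix).
row = generalize_field(row, "zip_code", lambda z: z[:3] + "**")
```

In this sketch a suppressed field keeps its column but loses its value, while a generalized field keeps a less specific value.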


HPO

Health provider organization

ICD Codes

International Classification of Diseases; used in the United States to classify diseases, illnesses, or injuries

Jupyter Notebook

An open-source web application that supports interactive data science and computation; a notebook contains computer code (Python or R) and rich text elements


Kernel

A “computational engine” operating within notebooks; executes the code within a notebook


LOINC

Logical Observation Identifiers Names and Codes; used by health care provider organizations to code laboratory test orders and results. For example, 2345-7 is the code used for the amount of glucose measured in blood during a blood test

Medical Concepts

Medical concepts describe information in a patient’s medical record, such as a condition they have, a doctor’s diagnosis, a prescription they are taking, or a procedure or measurement the doctor performed


NIH

National Institutes of Health


OHDSI

Observational Health Data Sciences and Informatics


OMOP

Observational Medical Outcomes Partnership


Participant

An individual who has registered for the program, but has not necessarily provided consent or completed surveys

PDR

Program Data Repository

PPI

Participant Provided Information


PTSC

Participant Technology Systems Center; provides mobile applications and websites for participants to enroll in All of Us, provide data, and receive updates


RBR

Represented in biomedical research


RDR

Raw Data Repository


RxNorm

A naming system for all medications available in the U.S. market; the RxNorm name of each drug is a compilation of its active ingredients, strength, and form


SNOMED

Systematized Nomenclature of Medicine; connects the various terminology, medical codes, synonyms, and definitions used among different electronic health records (EHRs) so that they can be matched up for later reference

Source Vocabulary

The original classification system used in a participant’s EHR to categorize conditions, diagnoses, and procedures (e.g., ICD-9 and ICD-10-CM codes) when the data first enters our system. Source vocabularies are retained after being “mapped” to standard vocabularies so that data can still be searched using the original terminology or codes

Standard Vocabulary

A classification system that incorporates different source vocabularies into one system by “mapping” multiple terms or codes to a common vocabulary to facilitate and optimize data analysis
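The relationship between source and standard vocabularies can be sketched in Python as follows. The mapping table is a small hand-written illustration, not an extract from the program's vocabulary tables, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative, hand-written mapping (an assumption for this sketch):
# two source codes for type 2 diabetes mapped to one standard SNOMED code.
SOURCE_TO_STANDARD = {
    ("ICD9CM", "250.00"): ("SNOMED", "44054006"),
    ("ICD10CM", "E11.9"): ("SNOMED", "44054006"),
}


def to_standard(vocabulary, code):
    """Return the standard (vocabulary, code) pair, or None if unmapped."""
    return SOURCE_TO_STANDARD.get((vocabulary, code))


def group_by_standard(records):
    """Group source records under their standard concept, keeping source codes."""
    grouped = defaultdict(list)
    for vocabulary, code in records:
        standard = to_standard(vocabulary, code)
        if standard is not None:
            grouped[standard].append((vocabulary, code))
    return dict(grouped)


records = [("ICD9CM", "250.00"), ("ICD10CM", "E11.9")]
grouped = group_by_standard(records)
```

Because the source codes are kept alongside the standard concept, the same records can be found by the common code or by either original code.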

Table Suppressions

Details all tables which are suppressed (not available) in the data model


UBR

Under-represented in biomedical research

