Common data models, like the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), exist to standardize data across multiple sources to support clinical research.
The All of Us Research Program uses OMOP to store and standardize the data collected from participants and health care organizations via surveys, physical measurements, and electronic health records (EHRs).
All data collected are expressed as “concepts” in OMOP. Before exploring concepts and concept relationships, you will want to understand the basics of OMOP and how the data are stored and standardized.
OMOP stores data in a relational database.
Maintained by an international collaborative called the Observational Health Data Sciences and Informatics (OHDSI) program, OMOP contains 39 unique tables within six data categories that relate to one another.
Think of the 39 unique tables as spreadsheets that hold patient demographics (person table), conditions (condition_occurrence table), medications (drug_exposure table), procedures (procedure_occurrence table), and more.
Each table contains a variety of fields, and some of these fields are unique to the table. If a line connects two tables in the infographic, those tables include a field in common.
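To make the idea of a shared field concrete, here is a minimal sketch using Python's built-in sqlite3 module. It builds two tiny tables modeled on the person and condition_occurrence tables, trimmed to a few fields, and joins them on person_id, the field they have in common. The rows, dates, and concept ID values are invented for illustration and are not All of Us data.

```python
import sqlite3

# In-memory database; every row below is invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        person_id     INTEGER PRIMARY KEY,
        year_of_birth INTEGER
    );
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id               INTEGER REFERENCES person(person_id),
        condition_concept_id    INTEGER,
        condition_start_date    TEXT
    );
    INSERT INTO person VALUES (1, 1970), (2, 1985);
    INSERT INTO condition_occurrence VALUES
        (10, 1, 201826, '2019-03-14'),
        (11, 2, 201826, '2021-07-02'),
        (12, 1, 320128, '2020-01-20');
""")

# person_id appears in both tables, so it serves as the join key.
rows = conn.execute("""
    SELECT p.person_id, p.year_of_birth,
           c.condition_concept_id, c.condition_start_date
    FROM person AS p
    JOIN condition_occurrence AS c ON c.person_id = p.person_id
    ORDER BY p.person_id, c.condition_start_date
""").fetchall()

for row in rows:
    print(row)
```

Each output row combines demographic fields from one table with condition fields from the other, which is how the connected tables in the model are meant to be used together.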
Because the All of Us Research Program uses OMOP to store health and survey data, it’s important to understand OMOP structure. All 39 tables and their corresponding fields are available online. View the comprehensive list of OMOP tables and fields.
OMOP standardizes medical terms and concepts.
The All of Us Research Program collects survey data (participant-provided information, or PPI), physical measurements, and EHR data from participants and health care organizations across the United States and its territories. These data are collected in their original source vocabulary and then standardized to the OMOP standard vocabulary.
The curation team standardizes the data into six major OMOP domains:
- Conditions
- Drugs
- Measurements
- Procedures
- Program physical measurements
- Survey questions and answers
All source vocabularies from the surveys, physical measurements, and EHRs are mapped to the OMOP standard vocabulary.
| Domain | Source Vocabulary | OMOP Standard Vocabulary |
| --- | --- | --- |
| Conditions | ICD-9, ICD-10 | SNOMED |
| Drugs | NDC | RxNorm |
| Measurements | LOINC or institution-specific codes | LOINC |
| Procedures | ICD-9, ICD-10, CPT | SNOMED |
| Program physical measurements | PPI | SNOMED, LOINC, PPI |
| Survey questions and answers | PPI | SNOMED, LOINC, PPI |
ICD = International Classification of Diseases
SNOMED = Systematized Nomenclature of Medicine
LOINC = Logical Observation Identifiers Names and Codes
NDC = National Drug Code
CPT = Current Procedural Terminology
You can browse and explore the standardized vocabularies used in OMOP using Athena, a publicly available website maintained by the OMOP community. You can search by typing into the search bar or exploring topics by the OMOP domain.
To better understand how source vocabulary is mapped to the OMOP standard vocabulary, let’s look at an example. The condition Type 2 diabetes may be recorded as SNOMED CT code 44054006 at Health Care Organization A and as ICD-10 code E11 at Health Care Organization B. During standardization, each source code is mapped to a concept in the standard vocabulary that OMOP defines for its domain. For the conditions domain, that standard vocabulary is SNOMED, so both records end up represented by the same SNOMED concept.
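The Type 2 diabetes example can be sketched in code. The snippet below is a simplified illustration, not the curation team's actual pipeline. It assumes two small tables modeled on the OMOP vocabulary tables concept and concept_relationship, with a "Maps to" relationship linking each source concept to its standard concept. The concept_id values (1001, 2001), the "ICD10" vocabulary label, and the helper function to_standard are placeholders chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concept (
        concept_id       INTEGER PRIMARY KEY,
        concept_name     TEXT,
        vocabulary_id    TEXT,
        concept_code     TEXT,
        standard_concept TEXT  -- 'S' marks a standard concept
    );
    CREATE TABLE concept_relationship (
        concept_id_1    INTEGER,
        concept_id_2    INTEGER,
        relationship_id TEXT
    );
    -- concept_id values are placeholders, not real OMOP identifiers.
    INSERT INTO concept VALUES
        (1001, 'Type 2 diabetes mellitus', 'SNOMED', '44054006', 'S'),
        (2001, 'Type 2 diabetes mellitus', 'ICD10',  'E11',      NULL);
    INSERT INTO concept_relationship VALUES
        (1001, 1001, 'Maps to'),  -- a standard concept maps to itself
        (2001, 1001, 'Maps to');  -- the ICD-10 code maps to the SNOMED concept
""")

def to_standard(vocabulary_id, concept_code):
    """Return (vocabulary_id, concept_code) of the mapped standard concept, or None."""
    return conn.execute("""
        SELECT std.vocabulary_id, std.concept_code
        FROM concept AS src
        JOIN concept_relationship AS cr
          ON cr.concept_id_1 = src.concept_id
         AND cr.relationship_id = 'Maps to'
        JOIN concept AS std
          ON std.concept_id = cr.concept_id_2
         AND std.standard_concept = 'S'
        WHERE src.vocabulary_id = ? AND src.concept_code = ?
    """, (vocabulary_id, concept_code)).fetchone()

org_a = to_standard("SNOMED", "44054006")  # as recorded at Organization A
org_b = to_standard("ICD10", "E11")        # as recorded at Organization B
print(org_a, org_b)
```

Both lookups return the same SNOMED code, so records from either organization can be analyzed together once they are standardized.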
For additional information about OMOP, watch “Understanding SQL, OMOP, & BigQuery.”
For an overview of using OMOP with EHR data, watch “Intro to All of Us EHR Data.”
Next articles
Exploring Concepts with OMOP and SQL
Take a deeper dive into the OMOP CDM and how some data are organized in the Researcher Workbench.
Data Dictionaries
Explore all the metadata and data tables used to populate the datasets in the Researcher Workbench.