Introduction to All of Us Survey Collection and Data Transformation Methods



Within the All of Us Research Program, participant-provided information (PPI) is collected via surveys to augment data collected from other sources, such as the electronic health record (EHR). Surveys are developed and deployed following a methodical process that includes prioritization of scientific domains, content creation based on items sourced from well-established studies and the literature, pilot evaluation and refinement, and scheduled deployment. Surveys are deployed on the secure online All of Us Participant Portal. Baseline surveys include questions on sociodemographics, health, and lifestyle. Additional surveys on other domains of interest are regularly developed and deployed. You can view survey questions and learn more about their sources by browsing the All of Us Research Hub Survey Explorer.



Beginning at program establishment in May 2017, all participants, whether enrolled through their participating health care provider organizations (HPOs) or not (direct volunteers), responded to surveys via the All of Us Participant Portal supported by the program’s Participant Technology Support Center (Vibrent Health). Beginning in December 2019, a new version of the Participant Portal, supported by the program’s Participant Center (Scripps University with CareEvolution), was introduced as a pilot approach to enrolling a limited number of direct volunteer participants.

Survey questions and answers are transformed into structural survey metadata and stored and transmitted from the online Participant Portals to the All of Us Data and Research Center’s Raw Data Repository (RDR) via an artifact referred to as a “survey codebook.” Survey codebooks are created using the REDCap data dictionary format and contain not only survey content, but also assigned variables (e.g., codes), branching logic, and field validation rules. Besides serving as the codebook for survey data collection and transmission, the resulting REDCap data dictionary artifacts are also made available through the REDCap Consortium’s Shared Library for reuse in other studies.5 Codebook variables (e.g., source codes) are deposited into the OMOP Common Data Model’s observation table and stored as an All of Us-specific “PPI” vocabulary. This vocabulary is both browsable and downloadable via Athena, the online OMOP repository.

Survey response data are extracted from the RDR and transmitted to the Curated Data Repository (CDR) for the purpose of research-grade dataset creation. Extracted response data are mapped according to the codebook source codes, as well as OMOP concept codes and IDs. All survey-specific OMOP concepts are assessed for Registered and Controlled Tier privacy methodology application, and the resulting decisions are logged and implemented within the corresponding CDRs.
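To make the codebook idea more concrete, the following is a minimal sketch of reading answer codes out of a REDCap-style data dictionary row. It assumes the column headers and the "code, label | code, label" choice syntax that REDCap data dictionaries commonly use. The question code, answer codes, and labels are hypothetical placeholders, not actual All of Us codebook entries, and the helper names are illustrative only.

```python
from typing import Dict, List

# Column headers assumed to follow the usual REDCap data dictionary export.
FIELD_NAME = "Variable / Field Name"
CHOICES = "Choices, Calculations, OR Slider Labels"

# Hypothetical codebook row; real All of Us codebooks define their own codes.
CODEBOOK_ROWS: List[Dict[str, str]] = [
    {
        FIELD_NAME: "hyp_question_code",
        "Field Type": "radio",
        "Field Label": "Example question text?",
        CHOICES: "HYP_ANSWER_A, Answer A | HYP_ANSWER_B, Answer B",
    },
]


def parse_choices(choices: str) -> Dict[str, str]:
    """Split a 'code, label | code, label' string into {code: label}."""
    parsed: Dict[str, str] = {}
    for item in choices.split("|"):
        item = item.strip()
        if not item:
            continue
        # Only the first comma separates the code from the label.
        code, _, label = item.partition(",")
        parsed[code.strip()] = label.strip()
    return parsed


def answer_codes_by_question(rows: List[Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Build {question code: {answer code: answer label}} from codebook rows."""
    return {row[FIELD_NAME]: parse_choices(row.get(CHOICES, "")) for row in rows}


lookup = answer_codes_by_question(CODEBOOK_ROWS)
print(lookup["hyp_question_code"]["HYP_ANSWER_A"])  # prints "Answer A"
```

A lookup of this shape is one way to relate the source codes found in response data back to the question and answer text defined in a codebook.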

Definitions of Survey-Relevant Variable Names in OMOP CDM v5.2.1

| Variable Name | Definition |
| --- | --- |
| observation_id | A unique identifier for each observation. |
| person_id | A foreign key identifier to the Person about whom the observation was recorded. The demographic details of that Person are stored in the PERSON table. |
| observation_concept_id | A foreign key to the standard observation concept identifier in the Standardized Vocabularies. |
| observation_date | The date of the observation. |
| observation_datetime | The date and time of the observation. |
| observation_type_concept_id | A foreign key to the predefined concept identifier in the Standardized Vocabularies reflecting the type of the observation. |
| value_as_number | Survey numeric answers. |
| value_as_string | Survey free-text answers; suppressed by privacy rules. |
| value_as_concept_id | OMOP standard concept ID for the survey answer. Note that the code “PMI_SKIP” (903096) is applied when a participant was presented with an item and chose not to specify an answer. |
| qualifier_concept_id | A foreign key to a Standard Concept ID for a qualifier (e.g., severity of drug-drug interaction alert). |
| unit_concept_id | A foreign key to a Standard Concept ID of measurement units in the Standardized Vocabularies. |
| provider_id | A foreign key to the provider in the PROVIDER table who was responsible for making the observation. |
| visit_occurrence_id | A foreign key to the visit in the VISIT_OCCURRENCE table during which the observation was recorded. |
| observation_source_value | All of Us question code. |
| observation_source_concept_id | OMOP concept ID for the survey question. |
| unit_source_value | The source code for the unit as it appears in the source data. This code is mapped to a standard unit concept in the Standardized Vocabularies and the original code is stored here for reference. |
| qualifier_source_value | The source value associated with a qualifier to characterize the observation. |
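As an illustration of how the PMI_SKIP code in value_as_concept_id might be handled during analysis, the sketch below separates skipped items from answered ones. The rows are made-up dictionaries that reuse the observation field names; every identifier and question code in them is hypothetical except 903096, which is the PMI_SKIP concept ID given in the table above.

```python
from typing import Dict, List, Tuple

PMI_SKIP_CONCEPT_ID = 903096  # PMI_SKIP, as documented for value_as_concept_id

# Hypothetical observation rows; IDs and question codes are placeholders.
rows: List[Dict[str, object]] = [
    {"observation_id": 1, "person_id": 101,
     "observation_source_value": "hyp_question_a", "value_as_concept_id": 11111},
    {"observation_id": 2, "person_id": 101,
     "observation_source_value": "hyp_question_b", "value_as_concept_id": 903096},
    {"observation_id": 3, "person_id": 102,
     "observation_source_value": "hyp_question_a", "value_as_number": 3.0},
]


def split_skipped(
    observations: List[Dict[str, object]],
) -> Tuple[List[Dict[str, object]], List[Dict[str, object]]]:
    """Return (answered, skipped) rows based on the PMI_SKIP concept ID."""
    answered, skipped = [], []
    for row in observations:
        if row.get("value_as_concept_id") == PMI_SKIP_CONCEPT_ID:
            skipped.append(row)
        else:
            answered.append(row)
    return answered, skipped


answered, skipped = split_skipped(rows)
print(len(answered), len(skipped))  # prints "2 1"
```

Keeping the skipped rows in their own list, rather than dropping them, preserves the distinction between an item that was presented and skipped and an item with no row at all.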


Beginning with the v7 Registered and Controlled Tier CDRs (R2022Q4R9 and C2022Q4R9, respectively), select metadata about survey response submissions are made available to researchers via an OMOP Survey Conduct Table. For additional details about this table, see the CDR Data Dictionaries.

Survey-Relevant Metadata Elements Available in OMOP Survey Conduct Table

| Survey Conduct Table Field Name | Definition |
| --- | --- |
| survey_conduct_id | Unique identifier for each individual survey response |
| person_id | Unique identifier for the participant |
| survey_concept_id | Concept ID provided for each survey (e.g., Basics, COPE, etc.) |
| survey_source_value | Set as the survey name |
| survey_source_concept_id | Unique identifier stored for the survey name |
| survey_source_identifier | Unique identifier stored for the survey name |
| assisted_concept_id | Concept ID of “42530794” if the participant completed the survey via the program’s “Computer Assisted Telephone Interviews (CATI)” protocol*, “43530058” if the participant completed the program consent with assistance**, and “0” if the participant completed the activity with no assistance. |
| assisted_source_value | Value that corresponds to the concept ID above |
| collection_method_concept_id | Concept ID of “42530794” if the participant completed the survey via the program’s “Computer Assisted Telephone Interviews (CATI)” protocol and “42531021” if not.* |
| collection_method_source_value | Value that corresponds to the concept ID above |
| survey_end_date | Authored date on file |
| survey_end_datetime | Authored date on file |
| language | Indicates the language in which the survey was completed. Values: ‘EN’ = English, ‘ES’ = Spanish. |

*The Computer Assisted Telephone Interviews (CATI) protocol launched in 2020 to expand the support available to participants completing All of Us surveys. CATI is a telephone surveying method where a trained interviewer follows a script to collect survey answers from participants.
**The All of Us Research Program protocol allows participants to receive in-person assistance when completing the program consent. In-person assistance, however, is not permitted for survey activity completion.
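The collection_method_concept_id values described above can be used to tally survey responses by collection mode. The following is a rough sketch over made-up rows that reuse the Survey Conduct field names. The two concept IDs come from the table, while the row contents, the mode labels, and the function name are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List

CATI_CONCEPT_ID = 42530794      # survey completed via the CATI protocol
NON_CATI_CONCEPT_ID = 42531021  # survey not completed via CATI

# Hypothetical survey conduct rows; IDs and language values are placeholders.
survey_conduct_rows: List[Dict[str, object]] = [
    {"survey_conduct_id": 1, "person_id": 101,
     "collection_method_concept_id": 42530794, "language": "EN"},
    {"survey_conduct_id": 2, "person_id": 102,
     "collection_method_concept_id": 42531021, "language": "ES"},
    {"survey_conduct_id": 3, "person_id": 103,
     "collection_method_concept_id": 42531021, "language": "EN"},
]


def collection_mode(row: Dict[str, object]) -> str:
    """Label a survey conduct row as CATI, non-CATI, or unknown."""
    concept_id = row.get("collection_method_concept_id")
    if concept_id == CATI_CONCEPT_ID:
        return "CATI"
    if concept_id == NON_CATI_CONCEPT_ID:
        return "non-CATI"
    return "unknown"  # any value not described in the table


mode_counts = Counter(collection_mode(row) for row in survey_conduct_rows)
print(mode_counts["CATI"], mode_counts["non-CATI"])  # prints "1 2"
```

The explicit "unknown" branch is a defensive choice so that unexpected or missing values are counted separately instead of being silently folded into the non-CATI group.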


Inclusion Criteria

With the exception of defined CDR “cut-off” dates and CDR privacy methodology, there are no exclusion factors for the survey response data. In terms of inclusion criteria, participants who met eligibility criteria for the surveys are included in the dataset. 

Survey Names, Eligibility Criteria, and Release Date Ranges

| Survey | Sample | Release Date Range |
| --- | --- | --- |
| The Basics | Baseline survey, available to all after primary consent completed. | Fielding in process |
| Lifestyle | Baseline survey, available to all after primary consent and the Basics completed. | Fielding in process |
| Overall Health | Baseline survey, available to all after primary consent and the Basics completed. | Fielding in process |
| Personal Medical History (PMH) | Follow-up survey, available to all after baseline surveys completed. | Fielding complete. Replaced by combined Personal and Family Health History survey. |
| Family Health History (FHH) | Follow-up survey, available to all after baseline surveys completed. | Fielding complete. Replaced by combined Personal and Family Health History survey. |
| Personal and Family Health History (PFHH) | Follow-up survey, available to all after baseline surveys completed. Note: If participants completed both the previously fielded PMH and FHH surveys, they are not asked to complete PFHH. If participants completed one, but not the other, they are invited to complete PFHH. | Fielding in process |
| Health Care Access and Utilization (HCAU) | Follow-up survey, available to all after baseline surveys completed. | Fielding in process |
| COVID-19 Participant Experience (COPE) - May 2020 version | Available to all after primary consent and the Basics completed for specified timeframe only. | Fielding complete, released May 7, 2020 through May 30, 2020. |
| COVID-19 Participant Experience (COPE) - June 2020 version | Available to all after primary consent and the Basics completed for specified timeframe only. | Fielding complete, released June 2, 2020 through June 26, 2020. |
| COVID-19 Participant Experience (COPE) - July 2020 version | Available to all after primary consent and the Basics completed for specified timeframe only. | Fielding complete, released July 7, 2020 through September 25, 2020. |
| COVID-19 Participant Experience (COPE) - November 2020 version | Available to all after primary consent and the Basics completed for specified timeframe only. | Fielding complete, released October 27, 2020 through December 3, 2020. |
| COVID-19 Participant Experience (COPE) - December 2020 version | Available to all after primary consent and the Basics completed for specified timeframe only. | Fielding complete, released December 8, 2020 through January 4, 2021. |
| COVID-19 Participant Experience (COPE) - February 2021 version | Available to all after primary consent and the Basics completed for specified timeframe only. | Fielding complete, released February 8, 2021 through March 5, 2021. |
| Summer 2021 Minute Survey on COVID-19 Vaccines | Available to all after primary consent and the Basics completed for specified timeframe only. Also dependent on February 2021 COPE and Summer 2021 Minute survey response - see COVID-19-related Survey Series Details and Resources for more details. | Released June 10, 2021 through August 19, 2021. |
| Fall 2021 Minute Survey on COVID-19 Vaccines | Available to all after primary consent and the Basics completed for specified timeframe only. Also dependent on February 2021 COPE and Summer 2021 Minute survey response - see COVID-19-related Survey Series Details and Resources for more details. | Released August 19, 2021 through October 28, 2021. |
| Winter 2021 Minute Survey on COVID-19 Vaccines | Available to all after primary consent and the Basics completed for specified timeframe only. Also dependent on February 2021 COPE and Summer 2021 Minute survey response - see COVID-19-related Survey Series Details and Resources for more details. | Released October 28, 2021 through January 20, 2022. |
| New Year 2022 Minute Survey on COVID-19 Vaccines | Available to all after primary consent and the Basics completed for specified timeframe only. Also dependent on February 2021 COPE, Summer 2021 * Winter 2021 Minute survey response - see COVID-19-related Survey Series Details and Resources for more details. | Released January 20, 2022 through March 8, 2022. |
| Social Determinants of Health | Follow-up survey, available to all after baseline surveys completed. | Fielding in process |
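Two of the rules in the table above lend themselves to a short sketch: looking up which COPE version was in the field on a given date, and deciding whether a participant would be invited to PFHH. The release windows are copied from the table and are assumed here to be inclusive of both end dates. The PFHH helper encodes only the PMH/FHH note, assumes that a participant who completed neither survey is also invited, and does not model the baseline-survey prerequisite. The function names are illustrative.

```python
from datetime import date
from typing import Optional

# COPE release windows from the table; both end dates assumed inclusive.
COPE_WINDOWS = [
    ("May 2020", date(2020, 5, 7), date(2020, 5, 30)),
    ("June 2020", date(2020, 6, 2), date(2020, 6, 26)),
    ("July 2020", date(2020, 7, 7), date(2020, 9, 25)),
    ("November 2020", date(2020, 10, 27), date(2020, 12, 3)),
    ("December 2020", date(2020, 12, 8), date(2021, 1, 4)),
    ("February 2021", date(2021, 2, 8), date(2021, 3, 5)),
]


def cope_version_for(day: date) -> Optional[str]:
    """Return the COPE version fielded on `day`, or None if outside all windows."""
    for version, start, end in COPE_WINDOWS:
        if start <= day <= end:
            return version
    return None


def invited_to_pfhh(completed_pmh: bool, completed_fhh: bool) -> bool:
    """Per the PFHH note: not asked only if both PMH and FHH were completed."""
    return not (completed_pmh and completed_fhh)


print(cope_version_for(date(2020, 8, 1)))  # prints "July 2020"
print(invited_to_pfhh(True, False))        # prints "True"
```

Because the COPE windows do not overlap, the first matching window is the only match, so a simple linear scan is sufficient.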


