Curated Data Repository and Development Release Notes (R2020Q4R2 CDR)


All of Us - Curated Data Repository Release Notes
R2020Q4R2 Release Documentation

Brief details of new features and changes from previous datasets are noted below.

Generation Documentation: Confluence Log

Vocabulary: vocabulary20200825

Software Version: v0-3-rc11

Data flow: This version introduced Wearables as a data input.

Common Data Model: No change from previous version.

Curation Process: New CDR Data Dictionary. Privacy Methodology has minor changes in race/ethnicity generalization.

Data Cutoff Date: 8/1/20 except for Fitbit®, which was 11/26/19

Version Date: 11/10/20 - no data manipulations made after this date


CDR Version/Version Date

Public Release Note



Wearable Device

Added wearables data (e.g., Fitbit®) in non-OMOP supplemental tables; all new tables and fields are recorded on the Available Fields tab of the CDR Data Dictionary. These data, released as an initial pilot, contain both detailed and summary data for heart rate and physical activity level. The CDR contains at least some Fitbit® data elements for 8,435 participants, spanning December 10, 2008 through November 26, 2019. This total represents all individuals who chose to link their Fitbit devices with the All of Us Research Program prior to November 26, 2019. All dates are shifted according to standard Registered Tier privacy rules.


Added data from the COVID-19 Participant Experience (COPE) survey. All participants who completed the Basics survey were eligible to complete a COPE survey in May, June, and July 2020. The CDR contains response data from all three survey versions and will be updated in the future to include data from additional versions. Response data specific to COVID-19 diagnoses and treatment have been suppressed from the Registered Tier dataset to protect participant privacy. Suppressed COPE survey concepts can be found in the Registered Tier CDR data dictionary. Unlike other data elements, the original completion date/time for COPE surveys has been preserved (i.e., these have not been date shifted). Custom concept IDs are noted below for each survey version:

  • COPE Survey May 2020 (2100000002)
      • Open 5/7/20-5/29/20
  • COPE Survey June 2020 (2100000003)
      • Open 6/2/20-6/26/20
  • COPE Survey July 2020 (2100000004)
      • Open 7/7/20-9/25/20
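
For analysis code, the three custom concept IDs above could be kept in a small lookup table. The following is a minimal sketch: the IDs and open dates are copied from the list above, while the names `COPE_SURVEY_VERSIONS` and `cope_version_name` are hypothetical and not part of any All of Us tooling.

```python
from datetime import date

# Concept IDs and open windows are copied from the release note above.
# The dict name and helper function are illustrative assumptions.
COPE_SURVEY_VERSIONS = {
    2100000002: ("COPE Survey May 2020", date(2020, 5, 7), date(2020, 5, 29)),
    2100000003: ("COPE Survey June 2020", date(2020, 6, 2), date(2020, 6, 26)),
    2100000004: ("COPE Survey July 2020", date(2020, 7, 7), date(2020, 9, 25)),
}


def cope_version_name(concept_id):
    """Return the survey version label for a custom concept ID, or None."""
    entry = COPE_SURVEY_VERSIONS.get(concept_id)
    return entry[0] if entry else None
```

Because COPE completion dates are not date shifted, the open windows in this table could be compared directly against completion dates.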

Added the custom concept “None Indicated” (2100000001) to the person table for participants who indicated an ethnicity but not a race in the Basics survey

Added custom concepts to generalize household size, collected in the Basics survey, as part of the privacy methodology:

  • For households >10 members: 2000000013
  • For households with >5 household members under 18: 2000000012
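
The two household-size thresholds above can be read as simple rules. This is a sketch of that reading only, not the curation pipeline's actual code; the function names are hypothetical, and returning `None` for "no generalization needed" is an assumption made for illustration.

```python
# Concept IDs are copied from the release note above.
HOUSEHOLD_OVER_10_CONCEPT = 2000000013
UNDER_18_OVER_5_CONCEPT = 2000000012


def household_size_concept(total_members):
    """Return the generalization concept ID for households with >10 members,
    or None when the reported value needs no generalization."""
    return HOUSEHOLD_OVER_10_CONCEPT if total_members > 10 else None


def under_18_count_concept(members_under_18):
    """Return the generalization concept ID for households with >5 members
    under 18, or None when the reported value needs no generalization."""
    return UNDER_18_OVER_5_CONCEPT if members_under_18 > 5 else None
```

Both comparisons are strict, so a household of exactly 10 members, or one with exactly 5 members under 18, is left as reported.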

Added new gender identity response option “Two-Spirit” to the Basics survey beginning in Fall 2019; in the CDR, it is generalized to “Not man only, not woman only, prefer not to answer, or skipped” (2000000002)

Fixed improper branching logic where it previously existed in surveys: suppressed answers to child (i.e., follow-up) questions that should not have been displayed to participants


Dropped PIDs that have no EHR or PPI data present


The list of suppressed State_of_residence data has changed because some states and territories reached 200 or more participants:

  • states with fewer than 200 participants are generalized to a custom concept_id to protect privacy
  • state of residence is also generalized when it doesn’t match the state of the HPO
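
The two state-of-residence rules above might be combined as in the sketch below. It rests on several assumptions: the notes do not give the custom concept_id, so a placeholder string stands in for it, and the function name, parameter names, and the per-state participant counts passed in are all hypothetical.

```python
MIN_PARTICIPANTS = 200  # threshold stated in the release note above

# Placeholder only: the real custom concept_id is not given in these notes.
GENERALIZED_STATE = "GENERALIZED_STATE"


def generalize_state(state, hpo_state, participants_by_state):
    """Apply the two generalization rules to one participant's state."""
    # Rule 1: states with fewer than 200 participants are generalized.
    if participants_by_state.get(state, 0) < MIN_PARTICIPANTS:
        return GENERALIZED_STATE
    # Rule 2: a state that differs from the HPO's state is generalized.
    if state != hpo_state:
        return GENERALIZED_STATE
    return state
```

A state is kept as reported only when it passes both checks.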


State_of_residence fields added to person table


Response to the sexual orientation question was changed from single-select to select-all-that-apply

  • A response of ONLY “Straight, that is, not gay or lesbian” is generalized to the “Straight” orientation value
  • If someone selects “Straight, that is, not gay or lesbian” and a second selection, the response is generalized to “Non-straight orientation, prefer not to answer, or skipped” (2000000003)
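
The select-all-that-apply rule above can be illustrated with a short sketch. The function name is hypothetical, and it returns generalized labels rather than concept IDs because the notes give an ID only for the non-straight category (2000000003). Treating an empty selection as "skipped" is an assumption based on that category's wording.

```python
STRAIGHT_ANSWER = "Straight, that is, not gay or lesbian"
STRAIGHT_LABEL = "Straight"
# Generalized category; its concept ID per the release note is 2000000003.
NON_STRAIGHT_LABEL = "Non-straight orientation, prefer not to answer, or skipped"


def generalize_orientation(selected_answers):
    """Generalize a list of selected responses to one of two categories."""
    # Only a response consisting solely of the straight answer maps to "Straight".
    if set(selected_answers) == {STRAIGHT_ANSWER}:
        return STRAIGHT_LABEL
    # Any other combination, including no selection, is generalized.
    return NON_STRAIGHT_LABEL
```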


COVID-19-related EHR data are suppressed


New questions on disability added to The Basics module; no new privacy methodology applied


Stabilized Research ID (RID) across CDR versions (starting with R2019Q4R3), allowing for longitudinal research


Fixed issue with some survey answers not mapping correctly, causing these rows to be missing value_concept_id


PIIBirthInformation_BirthDate removed


*Note: R2020Q4R2 has all of the same additions/fixes as R2020Q4R2_base. However, additions/fixes applied to R2020Q4R2 are NOT applied retroactively to R2020Q4R2_base.


CDR Version/Version Date

Public Release Note




Dropped rows with 0 source and standard concept_ids 


Fixed heart beat unit standardization error that converted both units and values to “beats per minute” when only the unit needed to be changed to “minutes”


Removed Physical Measurements height and weight from the height and weight cleaning algorithm
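
The row-dropping note above is read here as "a row is dropped when both its source and standard concept IDs are 0". The sketch below illustrates that reading only; the dictionary keys `concept_id` and `source_concept_id` are hypothetical stand-ins, since the actual column names differ by table.

```python
def keep_row(row):
    """Return False only when BOTH the standard and source concept IDs are 0."""
    return not (row["concept_id"] == 0 and row["source_concept_id"] == 0)


def drop_unmapped_rows(rows):
    """Filter out rows that carry neither a standard nor a source concept."""
    return [row for row in rows if keep_row(row)]
```

Under this reading, a row with a source concept but no standard concept (or the reverse) is retained.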


Known Issues in R2020Q4R2



Public Release Note

Known Issue

EHR data from EHR site 925 will be out of temporal alignment with other data types due to a source anomaly. Source data was date shifted by the site prior to and in addition to curation-imposed date shift, so EHR data will not match other data types for affected participants.

Known Issue

value_as_number translation in COPE: values came in as strings when they need to be numbers

Known Issue

Approximately 4,000 participants do not have an ExtraConsent_TodaysDate value due to an issue with data collection; these participants DID sign consent, and their data are missing only this date value
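
For the value_as_number known issue above, researchers who encounter string-typed values may want to coerce them defensively. The helper below is a minimal sketch under that assumption; the name `to_number` is hypothetical, and it is not an official workaround from the program.

```python
def to_number(raw):
    """Convert a string-typed value to float, or return None if not numeric."""
    if raw is None:
        return None
    try:
        # str() lets the helper accept values that are already numeric.
        return float(str(raw).strip())
    except ValueError:
        return None
```

Returning `None` instead of raising keeps non-numeric responses visible as missing values rather than stopping an analysis.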



All of Us - Development Release Notes

Highlights from Researcher Workbench releases occurring between 9/29/20 and 12/7/20


New Data

  • New CDR now available - users will be notified of the option to upgrade and, when ready, will be guided through the upgrade process
  • The data apps now support the building of cohorts, concept sets, and datasets using data from COPE survey participant responses
  • Fitbit data now available - users can create datasets from four different types of Fitbit data to export to notebooks for analysis
    • Users will create their cohort then choose from one or more of the pre-packaged Fitbit concept sets to create a dataset
      • Heart Rate Summary
      • Heart Rate Level
      • Intraday Steps
      • Activity Summary


Cohort Builder 

  • As in the Concept Set Selector, search now supports the additional characters +, -, *, ( ), and "
  • Users can choose data from one or more versions of the COPE survey using the attributes slide-out
  • Users can create a cohort of participants with ANY Fitbit data



Concept Set Selector

  • Upon clicking to create a new concept set, users will see tiles representing each domain or survey eligible for selection.
    • Search results will now appear in the tile format, as in the Data Browser. Each tile will update according to the search term
    • Alternatively, users can click a tile for the domain of interest and perform a more specific search
  • Instead of checkboxes, concepts will be added by clicking the plus sign for each concept.
  • As concepts are selected, they appear in the shopping cart
  • Drugs will now be added at the ingredient level (like Cohort Builder)
    • Ex. Search for Tylenol - add acetaminophen
  • Search results counts (Roll-up and Item) now match Cohort Builder when using the same search string
  • For the domains of Observations, Drug Exposures, and Labs & Measurements: search results will now only include standard concepts - no source concepts will appear

Bug Fixes / Minor issues

  • Corrected an issue where exporting a dataset from the dataset builder to notebooks stalled while loading, with the loading spinner persisting until the page eventually reloaded
  • Improved the language and function of the warning message shown when downloading a notebook, so that users must select OK or Cancel


