Data Explorer in Researcher Workbench

In the Researcher Workbench, you can create cohorts and datasets through the Data Explorer, powered by Verily Pre. Data Explorer lets you visually explore data, design custom cohorts, and export datasets directly to your workspaces.

Note: Data Explorer is not available for use with CDR v7 in the Researcher Workbench. To continue your analysis, please use the most recent CDR v9 or CDR v8 in the Researcher Workbench.

Accessing Data Explorer in your workspace

After adding an All of Us Data Collection, you can start working with Data Explorer by creating a cohort in the Resources tab of your workspace.

In your workspace, navigate to the Resources tab.
Select the ‘New Resource’ button, and then ‘New Cohort’.
Choose the All of Us data collection that you selected when adding a data collection to your workspace. For example, for All of Us Registered Tier data, you will select R2025Q4R6, which is CDR v9 Registered Tier. A full list of CDR versions is noted in the Data Dictionary here.
Review the data collection policies, complete the “Researcher Use Statement Questions,” and select “I'm sure. I understand that all policies and terms above will be permanently applied to this workspace.”
Enter the details for your cohort, including a cohort name and a description, and designate the workspace bucket folder to add the cohort to. Then select ‘Add to your workspace.’
You will automatically be redirected to the Data Explorer interface to begin building your cohort and dataset.

For a full step-by-step guide to creating a cohort and dataset in Data Explorer, see the article Get started with Data Explorer or this interactive tutorial here.

Overview of Data Explorer Features

Creating a cohort

The point‑and‑click interface in Data Explorer lets you apply criteria to select the participants you want to include in your research project. Below are a few key features of the updated interface.

Create Custom Cohorts - The ‘Cohort filter’ allows you to filter your cohort by inclusion or exclusion criteria. You can create one or more groups based on certain criteria. Click ‘Add some criteria’ to display the options for your filter criteria, such as EHR domains, program data, or source code fields. To apply new filter criteria, you must select “Apply” at the bottom to refresh the cohort.
Cohort Visualization - Cohort visualizations are a series of bar graphs that show demographic breakdowns of your cohort, including age, sex assigned at birth, and top conditions. As you select your cohort criteria, the cohort visualizations update to reflect the results. You can hover over each bar to see a more detailed breakdown.
Review individuals - Similar to the ‘Cohort Review’ in the Researcher Workbench 1.0, selecting ‘Review Individuals’ allows you to examine the individual participants included in your cohort. You can generate a review set that provides a snapshot of participant data, helping you confirm that your chosen criteria produced the expected results. This feature also enables you to add annotations to document details about your cohort.
Save data snapshot - After you’ve created a cohort with the appropriate inclusion and exclusion criteria, select ‘Save data snapshot’ to save your cohort and proceed to building your dataset.

Build a data snapshot to export

After creating your cohort, you can generate a data snapshot, which lets you apply additional concept criteria to assemble a dataset for export. You can create data snapshots and notebooks to export directly to your workspace. You'll also be able to view the SQL queries needed to generate the data snapshot. These queries are available in Python and R, and can be run in Jupyter notebooks in your workspace.

Data snapshot steps - The main steps in generating a data snapshot are:
1. Add any additional concept sets for your cohort.
2. Select the file format you want to export. Tables and SQL queries can be exported in the following formats:
  1. Zipped or unzipped .csv files
  2. Queries for the cohort (IPYNB) with R Notebook
  3. Queries for the cohort (IPYNB) with Python Notebook
3. Select a destination in your workspace bucket.
Add concept sets - You can select concepts from various data domains that you want to examine in your cohort under ‘Browse more data.’ Some prepackaged concepts are available to you, such as common demographics from the person table. Once you add the concept set, select the ‘Apply’ button to refresh the preview.
Manage Columns and table views - After selecting concept sets, you can manage the columns available in your tables for export. By default, all table columns will be selected, but you can deselect any you want to exclude by either using the checkbox feature next to a column name or by selecting Manage columns and using the toggle function. We recommend only including the columns you need for your analysis. Additionally, there are various views available to review or copy prior to your export.
1. Tables - this view shows you the CDR tables used in the query (e.g., person, condition_occurrence, measurement) in a structured format.
2. Queries for each table - this view displays the SQL query to access each CDR table.
3. Queries for cohort - this view displays the SQL query generated for your cohort based on your inclusion and exclusion criteria.
4. Summary - this view provides a high level summary of the tables, fields, and criteria information about your data snapshot prior to export.

Work with your data

Once you successfully export your data snapshot, it will be available in the workspace bucket destination you specified. To interact with a data snapshot created in the Data Explorer, you’ll need to create a cloud application such as the JupyterLab environment.

From your workspace's Apps tab, select ‘New app instance’ > ‘JupyterLab’
Customize your cloud environment as needed.
Once the cloud app environment has been created, open JupyterLab by clicking on the name of your app. An app has been created successfully when its status shows ‘Running’ in green.
Search for the data snapshot under the workspace bucket file destination. Select the file format of your choice to run.
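
As an illustration of the last step, the sketch below shows one way to read an exported table in a Python notebook using only the standard library. It assumes the export is either a plain .csv file or a .zip archive containing .csv files; the function name load_snapshot_rows and any file paths you pass to it are hypothetical, and the actual file names and folder layout of your export may differ.

```python
import csv
import io
import zipfile
from pathlib import Path


def load_snapshot_rows(path):
    """Read an exported table into a list of dicts, one per row.

    Accepts a plain .csv file or a .zip archive holding .csv files.
    For an archive, only the first .csv member found is read.
    """
    path = Path(path)
    if path.suffix.lower() == ".zip":
        with zipfile.ZipFile(path) as archive:
            csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
            if not csv_names:
                raise ValueError(f"no .csv file found in {path}")
            with archive.open(csv_names[0]) as raw:
                # Wrap the binary stream so csv can read it as text.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                return list(csv.DictReader(text))
    with path.open(encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


# Hypothetical usage, once the file has been copied into the notebook environment:
# rows = load_snapshot_rows("my_snapshot/person.csv")
# print(len(rows), "rows; columns:", list(rows[0]) if rows else [])
```

All values come back as strings, so convert IDs and dates to the types you need before analysis.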

Tips for using All of Us data collections with Data Explorer

Data Explorer is a new tool powered by Verily Pre. While it shares some features with the Cohort Builder and Dataset Builder, its underlying methods don’t always match those used in the original Researcher Workbench. The guidance below offers recommendations and tips to help you use the Data Explorer effectively with All of Us data collections.

Cohorts autosave, including when you rename your cohort. However, to apply new filter criteria, you must select “Apply” at the bottom to refresh the cohort.
Selecting ‘Meets any criteria’ applies an “OR” operator, while ‘Meets all criteria’ applies an “AND” operator. For example, choosing ‘Meets any criteria’ will include participants who satisfy either criteria A or criteria B, while choosing ‘Meets all criteria’ will include participants who satisfy both criteria A and criteria B.
We recommend using “Meets all criteria” (the “AND” operator) when working with multiple criteria groups. For example, use Group 1 for demographic criteria and Group 2 for EHR domain criteria.
Temporal options are available, but only work with EHR domain criteria. To enable the temporal feature, you’ll need to select one of the four temporal feature criteria.
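
The difference between ‘Meets any criteria’ and ‘Meets all criteria’ can be pictured as a union versus an intersection of participant sets. The sketch below is only a conceptual illustration in Python: the function names and participant IDs are made up, and it does not represent how Data Explorer evaluates criteria internally.

```python
def meets_any(*criteria_sets):
    """'Meets any criteria' (OR): participants matching at least one criterion."""
    return set().union(*criteria_sets)


def meets_all(*criteria_sets):
    """'Meets all criteria' (AND): participants matching every criterion."""
    if not criteria_sets:
        return set()
    return set.intersection(*(set(s) for s in criteria_sets))


# Made-up participant IDs, for illustration only.
criteria_a = {101, 102, 103}
criteria_b = {102, 103, 104}

print(sorted(meets_any(criteria_a, criteria_b)))  # [101, 102, 103, 104]
print(sorted(meets_all(criteria_a, criteria_b)))  # [102, 103]
```

Switching from ‘Meets any criteria’ to ‘Meets all criteria’ can therefore only keep the cohort the same size or shrink it.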

Only ‘Current Age’ is available for age demographic criteria within the Data Explorer user interface (UI). To use ‘Age at CDR’ or ‘Age at event’, we recommend calculating this age manually in your notebook.
To filter your inclusion criteria, first select the criterion you want to add. Then hover over the criterion to display the available filter options. For example, choose the +Ethnicity criterion to add it to your group, and then select the ‘Ethnicity’ label to view its filter options.
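
For the manual age calculation mentioned above, a minimal sketch in Python follows. It assumes you already have a participant's date of birth and a reference date (an event date for ‘Age at event’, or the CDR cutoff date for ‘Age at CDR’). The function name age_on and all dates shown are placeholders, not values taken from any CDR.

```python
from datetime import date


def age_on(birth_date, reference_date):
    """Return age in whole years on reference_date."""
    years = reference_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Placeholder dates for illustration only.
birth = date(1980, 6, 15)
event = date(2020, 6, 14)      # e.g., the date of a condition occurrence
cdr_cutoff = date(2023, 1, 1)  # substitute the actual cutoff date for your CDR version

print(age_on(birth, event))       # 39 (the day before the 40th birthday)
print(age_on(birth, cdr_cutoff))  # 42
```

Comparing month and day directly avoids the drift you get from dividing a day count by 365.25.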

To edit, apply modifiers or delete inclusion or exclusion criteria, use the following icons:
- Pencil icon = edit criteria option
- Modifiers = apply specified modifiers to criteria
- Filter slash = disable criteria for export without deleting them
- Trash bin = delete criteria

Data Explorer Release Notes

For information about the most recent changes or known issues in the Researcher Workbench Data Explorer, please see Data Explorer Product Release Notes.
