The Dataset Builder is a point-and-click tool available to you in the All of Us Researcher Workbench for creating your dataset and selecting the data you want to analyze.
There are three steps for building your dataset: selecting your cohort, selecting your concept set(s), and selecting your values. Before you build your dataset, you will want to create your concept set(s).
A concept set is a group of concepts from a single data domain that you want to examine in your cohort. Read “Exploring Concepts with OMOP and SQL” for information about concepts and concept relationships.
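To make the "single data domain" rule concrete, here is a minimal Python sketch of a concept set as a data structure. It is only an illustration: the class names, fields, and concept IDs are hypothetical and are not part of the Researcher Workbench or its API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Concept:
    concept_id: int  # made-up identifier, for illustration only
    name: str
    domain: str      # for example "Condition" or "Measurement"


@dataclass
class ConceptSet:
    name: str
    description: str
    domain: str
    concepts: List[Concept] = field(default_factory=list)

    def add(self, concept: Concept) -> None:
        # A concept set groups concepts from a single data domain,
        # so reject anything that belongs to a different domain.
        if concept.domain != self.domain:
            raise ValueError(
                f"{concept.name!r} is in domain {concept.domain!r}, "
                f"but this set only holds {self.domain!r} concepts"
            )
        self.concepts.append(concept)


# Hypothetical example data.
example_set = ConceptSet("Example conditions", "Illustrative set", domain="Condition")
example_set.add(Concept(1, "Example condition A", "Condition"))
example_set.add(Concept(2, "Example condition B", "Condition"))

try:
    example_set.add(Concept(3, "Example lab test", "Measurement"))
except ValueError as err:
    print(err)
```

In this sketch, examining concepts from two domains means creating two concept sets, one per domain.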
To create your concept set(s)
- Click the icon to the right of “Datasets.”
- Click the icon to the right of “Select Concept Sets (Rows).”
Note: Some prepackaged concept sets are available to you under the Select Concept Sets section.
- Type in the search bar or explore the domains, survey questions, and more to find your concept(s) of interest.
- Click on the concept(s) of interest to add them to your concept set.
- Click “Finish & Review.”
- Click “Save Concept Set.”
- Select whether you want to add the concept(s) to an existing concept set or create a new concept set.
- Name your concept set and add a description.
- Click “Save.”
- Now, you can choose to create another concept set or create a dataset.
To build your dataset
- Select your cohort under the “Select Cohorts (Participants)” column on the left by clicking the checkbox.
Note: Under “Select Cohorts (Participants),” you can see a prepackaged cohort of “all participants.” This will pull ALL the participant data in the Researcher Workbench. We do not recommend pulling the entire database into your workspace.
- Select your concept set(s) under “Workspace Concept Sets” in the “Select Concept Sets (Rows)” column by clicking the checkbox.
- Confirm your values under the “Select Values (Columns)” column on the right by clicking the checkbox.
- All values are selected by default. You can deselect any values that you do not want to bring into your dataset.
- Click “View Preview Table” to display a preview of the resulting table based on your Dataset Builder selections.
- Click “Create Dataset.”
- Name your dataset and add a description.
- Click “Save.”
- Click “Analyze.”
- Select your programming language (R, Python, or SAS) and complete the fields as necessary for your analysis tool of choice (Jupyter Notebook, RStudio, or SAS Studio).
If you select R or Python, you can export the code to an existing or new Jupyter Notebook. If you want to use SAS Studio or RStudio, you must copy the code from the “Export Dataset” screen, launch SAS Studio or RStudio, and run your analysis there.
For detailed steps on how to export and analyze your dataset by analysis tool, read “Exporting and analyzing your dataset.”
Note: If you are familiar with the SQL programming language, you can skip the Cohort Builder and the Dataset Builder and go straight to Jupyter Notebook to query the All of Us data. We do not recommend this for researchers with limited SQL knowledge.
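For readers weighing the SQL route, the following toy sketch shows how the three Dataset Builder selections could map onto a single query: the cohort restricts participants, the concept set restricts rows, and the values choose columns. It runs against an in-memory SQLite table, not the Researcher Workbench. The table and column names follow the OMOP convention (`condition_occurrence`, `person_id`, `condition_concept_id`, `condition_start_date`), but the data and concept IDs are invented, and the code the Workbench actually exports may look quite different.

```python
import sqlite3

# Toy in-memory database standing in for an OMOP-style table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
INSERT INTO condition_occurrence VALUES
    (1, 1001, '2019-03-01'),
    (2, 1002, '2020-07-15'),
    (2, 9999, '2021-01-10'),
    (3, 1001, '2018-11-30');
""")

# The three selections (all values here are made up):
cohort_person_ids = [1, 2]      # cohort: which participants
concept_set_ids = [1001, 1002]  # concept set: which rows
value_columns = [               # values: which columns
    "person_id",
    "condition_concept_id",
    "condition_start_date",
]


def placeholders(items):
    """Return one '?' per item for a parameterized IN (...) clause."""
    return ", ".join("?" for _ in items)


# Column names are fixed constants above, so joining them into the SQL is safe here.
columns_sql = ", ".join(value_columns)
query = f"""
SELECT {columns_sql}
FROM condition_occurrence
WHERE person_id IN ({placeholders(cohort_person_ids)})
  AND condition_concept_id IN ({placeholders(concept_set_ids)})
ORDER BY person_id, condition_start_date
"""

rows = conn.execute(query, cohort_person_ids + concept_set_ids).fetchall()
for row in rows:
    print(row)
```

Participant 3 is dropped because they are not in the cohort, and the row with concept 9999 is dropped because that concept is not in the concept set.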
Looking for more information on the Dataset Builder? Check out our YouTube Channel for a video about using the Dataset Builder.
After building your dataset, you can begin your analysis in the Researcher Workbench.