Exporting and analyzing your data in Workbench applications

All analyses using the All of Us dataset occur in the Researcher Workbench within one or more supported applications. After you create a dataset using the Cohort and Dataset Builders, you can export it for analysis. This article describes the export process for each of the applications listed below.

Exporting Data in Jupyter Notebooks

For additional help with analyzing your data within Jupyter Notebooks, see How to Get Started with Registered Tier Data.

Saving a Dataset and Exporting to a Notebook

  1. Name your dataset.
  2. "Export to notebook" is selected by default. Exporting to a notebook will load this dataset into a notebook within your workspace.
  3. Select which notebook to use. You can insert the dataset into an existing notebook by selecting it from the drop-down or you can create a new one.
  4. Select which programming language you would like to use. You can click “See Code Preview” to examine the differences between R and Python.
  5. Click on “Save and Analyze.”

 

If you choose to "Export to Notebook," the screen below will load. Note: It may take a few minutes depending on the size of your dataset.

 

This is your notebook. The code used to create your dataset is pre-loaded. Select "EDIT" to begin analyzing in the new code boxes below. Note: If you export multiple concept sets in a dataset to a notebook, you will still need to join the resulting data frames.
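
Joining the exported data frames can be sketched as follows in Python with pandas. This is a minimal illustration, not the code the Dataset Builder generates: the data frame names (`conditions_df`, `measurements_df`) and the toy values are hypothetical, and the sketch assumes that each exported data frame carries a `person_id` column identifying the participant.

```python
import pandas as pd

# Hypothetical stand-ins for two data frames produced by an exported dataset,
# one per concept set. Real exports contain many more columns.
conditions_df = pd.DataFrame({
    "person_id": [1, 2, 3],
    "condition_name": ["hypertension", "asthma", "hypertension"],
})
measurements_df = pd.DataFrame({
    "person_id": [1, 1, 3, 4],
    "measurement_value": [120.0, 118.0, 135.0, 110.0],
})

# Inner join: keep only participants that appear in both data frames.
# Assumption: person_id is the shared participant key.
joined_df = conditions_df.merge(measurements_df, on="person_id", how="inner")

print(joined_df)
```

An inner join drops participants who lack rows in either data frame. Passing `how="left"` instead would keep every row of the first data frame and leave missing values where no match exists.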

Opening a Notebook from the Analysis Tab

After exporting your dataset to a notebook, you will need to go to the "ANALYSIS" tab to find and open it.

Alternatively, you can create a new notebook by clicking the icon under "Create a New Notebook."

Rename, Duplicate, Delete, or Copy a Notebook

Click on the snowman icon (the three vertical dots) to see options for your notebook. You can rename, duplicate, or delete the notebook. You can also copy it to another workspace.

View, Edit, or Run a Notebook

When you first open a notebook, you will see it in read-only mode. To make changes to the notebook, select the "EDIT" option. To run the notebook, but not make any alterations, choose “playground mode.”

Note that opening a notebook can take up to 10 minutes. The Researcher Workbench platform works behind the scenes to spin up a new instance of Jupyter Notebooks.

Notebook Edit Mode

This is your Jupyter Notebook in EDIT mode.

Cells form the body of the notebook.

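Code cell
• Contains code to be executed in the kernel and displays its output below.

Code cells have labels on the left
• In [ ]: the cell has not been run yet
• In [*]: the cell is currently running
• In [1]: the cell has finished running; the number shows the order of execution

Markdown cell
• Contains text formatted using Markdown and displays its output in-place when it is run.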

 

For additional information about Jupyter Notebooks, see our article Jupyter Notebooks and programming. To learn how to back up your notebooks and access earlier saved versions, see our Featured Workspace How to Backup Notebooks and Intermediate Results. You can also find more information in the following video:

 

Exporting Datasets into RStudio

You can use the Cohort Builder and Dataset Builder tools to create a data frame in RStudio. In this process, you build a dataset with these tools, copy the generated R code, and execute it in the RStudio application.

1. Build a cohort and dataset using the Data tab within your workspace.

Not sure how to build a cohort or dataset? Read Selecting participants: using the Cohort Builder tool and Using the Concept Set Selector and Dataset Builder tools to build your dataset.

2. Click Analyze when you finish building your cohort and dataset.

3. The Export Dataset popup will appear with options for exporting the dataset.

4. Select R as the programming language.

5. Use the Copy Code button to copy the dataset builder code to your clipboard. 

 

A screenshot showing the Export Dataset popup

6. Paste the code into an R Script (.R) file or into your RStudio Console.

A screenshot demonstrating the copy and pasted dataset into the RStudio Console

7. Execute the code in your RStudio app.

In the Console, you will see the first 5 rows of your data frame as output. Additionally, the dataset will be saved as an object in your environment pane.

A screenshot of the RStudio Environment pane with a dataset created from the Cohort and Dataset Builders

To save the dataset to your persistent disk, run the following code in your RStudio Console. Replace <dataset_name> with the name of the dataset object, which appears in your environment pane and at the bottom of the code you copied from the Dataset Builder.

save(<dataset_name>, file = "data.Rdata")

To copy the dataset from your persistent disk to your workspace bucket, run the following code in your RStudio Terminal:

bucket="$WORKSPACE_BUCKET"
gsutil cp /home/rstudio/data.Rdata "$bucket"

Read RStudio on the Researcher Workbench for information about RStudio and cloud storage.

Exporting Datasets into SAS Studio

You can use the Cohort Builder and Dataset Builder tools to create a table in SAS Studio. In this process, you build a dataset with these tools, copy the generated SAS code, and execute it in the SAS application.

1) To start, ensure you have an active SAS app running. For instructions, see Starting SAS.

2) Build a cohort and dataset within the Data tab of your workspace. 

Not sure how to build a cohort or dataset? Read Selecting participants: using the Cohort Builder tool and Using the Concept Set Selector and Dataset Builder tools to build your dataset.

3) Click Analyze when you finish building your cohort and dataset.

4) The Export Dataset popup will appear with options for exporting the dataset.

5) Select SAS as the programming language. 

6) Use the Copy Code button to copy the dataset builder code to your clipboard. 

7) Paste the code into a SAS program (.sas).

a. You can create a new SAS program from the SAS Studio UI by selecting New in the menu bar and selecting SAS Program or selecting New SAS Program on the start page. 

8) Execute the code in your SAS app by clicking Run.

In the Libraries tab, you will see a new table under your Work library. The table will be named based on the dataset name defined by the Dataset Builder (e.g. “measurement_02903259”).

If you want the new SAS table to be saved in a library other than Work, follow the instructions below.

1) Follow steps 1-7 above.

2) Create a new library via one of the following options: 

a. Navigate to the Library tab and select the New Library button. Name your library and select OK.

b. Add libname <new_library_name> '/data/'; to the top of the Dataset Builder code snippet.

c. Add libname <new_library_name> '/data/'; to a different SAS program and execute the code.

3) In your SAS program with the Dataset Builder code, change create table <table_name> to create table <new_library_name>.<table_name>.

4) Execute the Dataset Builder code in your SAS app. Your table will now be saved in your new library <new_library_name>.

To learn more about SAS Studio, see these support articles: Exploring All of Us data using SAS and How to run SAS in the Researcher Workbench.
