Exploring All of Us data using SAS Studio

  • Updated

SAS, which stands for "Statistical Analysis Software," is a statistical software tool used for data analysis and statistical modeling. It has the ability to handle large datasets efficiently with an extensive set of statistical techniques and a user-friendly interface (UI). 

SAS Studio is available in the Researcher Workbench to support your analysis needs. This article will provide a general overview of how to export the All of Us dataset into your SAS app, how to manipulate tables and visualize data, and how to use SAS code snippets in an active SAS app. To view how to launch a SAS application in the Researcher Workbench, see How to run SAS in the Researcher Workbench

Explore SAS Studio in this Featured Workspace - You can explore more about SAS Studio in this tutorial workspace (Researcher Workbench login required). To access the contents of the SAS Studio tutorial, please duplicate the workspace, then start a SAS Studio app in your duplicated workspace to launch. It will not render a full preview in the Featured Workspace 'Analysis' tab. Please note all mark down is indicated in green within the tutorial workspace. 

Table of Contents

Import data into SAS Studio from Dataset Builder 

You can use the Cohort Builder and Dataset Builder tools to create a data frame in SAS Studio. In this process, you will use the Cohort Builder and Dataset Builder to make a dataset. You will then get the dataset code in SAS and execute the code in the SAS application

  1. To start, ensure you have an active SAS app running. See instructions in the section Starting SAS
  2. Build a cohort and dataset within the Data tab of your workspace. 

    Not sure how to build a cohort or dataset? Read Selecting participants: using the Cohort Builder tool and Using the Concept Set Selector and Dataset Builder tools to build your dataset

  3. Click Analyze when you finish building your cohort and dataset.
  4. The Export Dataset popup will appear with options for exporting the dataset.
  5. Select SAS as the programming language.
  6. Use the Copy Code button to copy the dataset builder code to your clipboard.
  7. Paste the code into a SAS program (.sas). You can create a new SAS program from the SAS Studio UI by selecting New in the menu bar and selecting SAS Program or selecting New SAS Program on the start page.

sas.image1.png

sas.image2.png

8. Execute the code in your SAS app by clicking Run.
sas.image3.png

In the Libraries tab, you will see a new table under your Work library. The table will be named based on the dataset name defined by the Dataset Builder (e.g. “measurement_02903259”).

sas.image4.png

If you want the new SAS table to save in a different library other than Work, follow the instructions below. 

  1. Follow steps 1-7 above.
  2. Create a new library via one of the following options
    • Navigating to the Library tab and selecting the New Library button. Name your library and select OK.  
    • Add libname <new_library_name> '/data/'; to the top of the Dataset Builder code snippet.
    • Add libname <new_library_name> '/data/'; to a different SAS program and execute the code.
  3. In your SAS program with the Dataset builder code, change create table <table_name>  to create table <new_library_name>.<table_name> Dataset. 
  4. Execute the Dataset Builder code in your SAS app. Your table will now be saved in your new library <new_library_name>.

Creating custom queries using SAS

If you would like to create a custom query in SAS without using the Cohort Builder and Dataset Builder tools, you can use the code snippet below and insert your custom query after:

%let workspacecdr = %sysget(WORKSPACE_CDR);
%put The CDR for this workspace is: &workspacecdr;
%let googleproject = %sysget(GOOGLE_PROJECT);
%put The Google Project for this workspace is: &googleproject;

/* Define the BigQuery SQL query */
proc sql;
   connect to bigquery (PROJECT="&googleproject." schema="&workspacecdr." mode='Performance');

When creating custom queries, we recommend you review the All of Us dataset structure and specific tables and fields as noted in the Data Dictionary.

We recommend you limit your query to only columns and rows from each table to fit your project needs. We do not recommend you query an entire All of Us or OMOP related table, as an entire table or too many rows may crash the SAS server, interrupting your SAS session.

To learn more about best practices when querying All of Us data, see the How to Query All of Us data tutorial notebook (Researcher Workbench login required).

Merging tables in SAS using SQL

The instructions below will assist you in merging two tables in SAS using SQL. To learn more about merging tables, see the Merging SAS Tables in a Data Step tutorial video

  1. Within your SAS environment, select New from the top menu → SAS Program to open a new SAS program.
    sas.image5.png
  2. You are able to join two tables by a table key. In the screenshot below, we’re joining the ‘measurement’ and ‘condition’ tables created using the Dataset Builder by ‘person.id’ using this syntax: 
    create table work.measurement_condition_join as
    select *
    from work.measurement_02903259 m inner join work.condition_97877949
    on m.person_id = c.person_id;
    • The syntax ‘work.person’ corresponds to ‘[library].[table name]’. This command tells SAS that the ‘measurement’ table is located under the library ‘work’. If you’re using BigQuery instead, [library] should be a BigQuery dataset.
    • create table will create a table called 'measurement_condition_join' and put it in the WORK Library.
    • select * is the command to select all columns.
    • from the 'measurement' and 'condition' tables. ‘m’ and ‘c’ are aliases for the ‘measurement’ and ‘condition’ tables.
    • on m.person_id = c.person_id is joining these two tables by the 'person_id' columns.
      sas.image6.png
  3. Navigate to the Work library and double click on the new table (in this example the new table is named measurement_condition_join) to check that step 2 above worked as expected. 
    sas.image7.png

Explore data with SAS Library Viewer

SAS allows users to explore data directly from the Libraries window without needing to write any code. The following sections will walk you through some of the features available through the Library Viewer. To learn more about SAS Libraries, see the Accessing Data in SAS Libraries tutorial video

Explore column types

To view a table’s dimension, navigate to the Library, click the drop down of the table you are interested in, and view the column names with their corresponding data type (Numeric, Character, Date). Note: you can also right click on a variable to see variables properties. 

sas.image9.png

View table properties

To view a table’s properties, right click on the table and select Properties. This will open up a new window with two tabs. The General tab will show you the table dimensions and the Column Properties tab will show you each column’s Label, Type, Length, and Format.

sas.image8.png

Table Viewer

To visually explore the data in a table, double click on the table from the Library Viewer. SAS will open a Table View that allows you to do a number of functions. You can right click on a column name to filter and sort the table. You can also click on the ellipses on the far right (highlighted in orange) to expand a list of other table options made available in the UI. 

sas.image10.png

Data query UI

SAS also provides you with a data query UI. To open the query interface, right click on the table from the library viewer and select Create Query. This will open a query builder UI that allows you to select columns, calculate summary statistics, filter and sort data, or even join multiple tables.

Note: as the query builder is filled out, SAS will generate SQL code that can be saved for future execution.

sas.images11.png

sas.image12.png

SAS Tasks

SAS Tasks give you a wide variety of interface options to prepare data, create visualizations, and deploy statistical algorithms without having to write any SAS code.

To access Tasks, simply click on the task icon highlighted below which is located in the ribbon on the left side of your screen. Once you select Tasks, you will see several folders located in the Tasks window. Click the drop-down arrow to view all the tasks within each folder.

Note: Your environment has access to the three folders highlighted below. Any folder starting with “SAS Viya” is not supported in your current environment.  To learn more about accessing Tasks, see the Accessing SAS Viya Tasks in SAS Studio tutorial video

sas.image13.png

Manipulate data

SAS offers a wide variety of SAS tasks to edit data. All SAS Tasks data manipulation options are found under the Prepare Data folder. We will demonstrate a couple common examples below, but you can view more options to data manipulation via SAS support videos or within SAS Communities. Once you have finished building out a task template, you can save the task for future execution and all of your selections will be saved for you. 

Filter data

To filter data using SAS Tasks:

  1. Under SAS Tasks, select on Prepare DataTransform Data, then double click Filter Data.
    sasimage14.png
  2. Once you double click, a filter data template will appear. Note: all required options will be in red text and will stay red until you have provided input for all required fields.
    sasimage15.png
  3. Add data by clicking on the Data icon. In the new pop-up window select Library of choice→ Table of choice, then click OK.
    sasimage15.png
    sasimage17.png
  4. Fill out the logic for your filtering criteria. In this example, we have filtered on the variable gender containing Male or Female.
    • To do so, first select the variable you want to add by clicking the plus sign or in the gray Add a variable box.
    • In the pop-up window, select the desired variable and then click OK.
    • Next, select the Comparison criteria you would like to use. Here we’ve used Equal to.
    • Then select the Value Type.
      • If you choose Enter a Value, then you must type the value exactly as it appears in the data (including case and punctuation). Here we’ve selected Select distinct Value which provides a drop down list.
      • If you would like to add additional logic you need to select the Logic drop down. Notice as you fill out the filtering task you can see the SQL code being generated in the code window on the right.
    • Once you have completed your filtering criteria, click the Run icon at the top left of the Task template window.
      sasimage16.png
  5. SAS provides a default table name and will output your new data set to the Work library. You have the option to change the library and table name in the Output Data Set section at the bottom of the Task template.
    • To do so, type the name of the library you would like to output to, type a period, then type the name of the data set that you would like to create. Note: the table name has to start with a character and cannot include spaces.
      • If you have created the table previously you must choose the Overwrite data box highlighted below. Once done, click the Run icon again.
        sasimage18.png

Join tables

  1. Under SAS Tasks, select on Prepare DataTransform Data, then double click Combine Tables.
    sasimage19.png
  2. From the Combine Table template, select the tables you want to join.
    • Then select the join method you would like to use. SAS allows you to concatenate your data (stack one on top of the other) or perform data joins (full, inner, left and right).
    • Next, select the type code that you want SAS to create.
    • Finally, select the column(s) you want to join by. Note: any columns you want to join on must have matching names in each table. As with the filter template, SAS will create SQL or SAS Data Step code based on your selections.
      sasimage20.png
  3. Next, click on the Output tab near the top of the template. Here you can provide the library and table name for the new dataset you are creating and select the variables that you want to keep from each table.

    Note: if you do not select any columns, SAS will default to select everything. If you have columns with matching names, SAS will default to select the data from the first table. If you want to include both columns then you must rename one of the columns before creating your combined dataset.

    Once you are ready, click the Run icon at the top left of the template. 
    sasimage21.png

Create visualizations

SAS provides various ways to create visualizations within the SAS app. To create visualizations for your data, follow these general steps.

  1. Under SAS Tasks, select on Visualize Data. Select the drop down for the Graphs folder, then double click on the icon of the graph you would like to create.
    sasimage22.png
  2. Once you select the graph that you want to create, a new window will open containing several options. All required fields must be completed for the graph to create, as indicated in the image below. 
    sasimage23.png
  3. To add data, click the folder icon → select the library → select the dataset, then click OK. 
    sasimage24.png
  4. Next, select your Categories by clicking on the plus sign and then select the category and subcategory that you would like to use. If you select the wrong variable, you can update it by clicking the trash icon next to the plus sign and select a new variable.
    sasimage25.png
  5. Complete all remaining options and then select Run at the top left of the template window.
    sasimage26.png
  6. Once your graph is created, you can save your template and results.

Saving a SAS visualization

When you have created a visualization in SAS, click on the ‘Results’ window. Right-click on the image, and click save. This will download the image to your local PC. Prior to downloading any objects from SAS, please ensure you are in compliance with our Data and Statistics Dissemination Policy and have reviewed our Egress Policy.

imag24.png

Policies for downloading data from SAS

While it is possible to download data from SAS, please remember that all downloads and dissemination of data must comply with the Data User Code of Conduct and the  Data and Statistics Dissemination Policy.

As a reminder, per the Data User Code of Conduct, registered users of the Researcher Workbench are not permitted to download or remove participant-level data from the platform. Refer to our Egress Alert Policy article for more details: Egress Alert Policy – User Support

An egress alert will be issued when a registered researcher violates certain security policies put in place by the All of Us Research Program’s Data and Research Center (DRC).

SAS Snippets

Snippets offer beginner coders the framework for fully functioning SAS code. SAS Snippets are great for users who want more flexibility than what SAS Tasks provide while still requiring minimal coding from the end user. To learn more about SAS Snippets, see the Using Snippets in SAS Studio tutorial video

  1. To access snippets, click on the Snippet icon on the left hand ribbon.
  2. Click on the type of snippet you would like to use. Highlighted below is that snippet for importing a CSV. In this example, you provide the file path to the CSV on line 4 and the code will create a SAS table MyCSV to the work library. 
    sasimage27.png

Additional resources

For additional information, support, and tutorials, check out the following resources:

 

Was this article helpful?

0 out of 0 found this helpful

Have more questions? Submit a request

Comments

0 comments

Article is closed for comments.