Featured Workspaces - Table of Contents

Scope:

The purpose of the All of Us Featured Workspaces Table of Contents is to assist users in navigating the Featured Workspaces tab within the Researcher Workbench.  

 

Background:

All researchers with a Researcher Workbench account have access to “Featured Workspaces,” which are workspaces designed to provide examples of cohorts, concept sets, and data analyses that can be used to inform or enhance your work. These workspaces can be accessed from the left-hand navigation bar in the Researcher Workbench by clicking the section labeled “Featured Workspaces.” To edit one of these workspaces, clone the workspace of interest by clicking the “snowman” icon (three vertical dots) on the workspace card and then selecting “Duplicate.” This will open the cloned workspace's “About” section, where you can change the name or any other part of the description and then click “Duplicate Workspace.” If you do not clone a workspace, you will not be able to modify it or copy its code for your own analyses. These workspaces are currently divided into three categories:

 

Tutorial Workspaces:

If you are new to the Researcher Workbench, the Tutorial Workspaces tab within Featured Workspaces is a great starting point for learning how to analyze data within the All of Us dataset. Workspaces in this section will walk you through basic data manipulation, analysis techniques specific to the All of Us data, and backing up your research within the Researcher Workbench.

 

Skills Assessment Training Notebooks For Users

This workspace contains multiple notebooks that assess users' understanding of the workbench and OMOP. These notebooks are meant to help users check their knowledge not only on Python, R, and SQL, but also on the general data structure and data model used by the All of Us program.

00. Overview of Data Analysis Steps in the AoU Research Program

If you are new to the All of Us Researcher Workbench (this platform), you might feel lost at first and not know where to even start with your analysis. The notebook suite in this workspace will teach you the basics of what you need to know to start and complete your analysis. This introductory notebook walks you through the general steps that you need to follow to perform your analysis.

01a. Learning OMOP Common Data Model (CDM)

The All of Us (AoU) Researcher Workbench uses the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which incorporates the Observational Health Data Sciences and Informatics (OHDSI) standardized vocabularies. OHDSI is an international collaborative that maintains and evolves the OMOP CDM specifications, which are designed to support efficient clinical health care analytics based on standardized observational outcomes data. The All of Us Data and Research Center leverages the OMOP v5.3.1 CDM to empower researchers with existing, standardized vocabularies and a harmonized data representation. These factors enable connection to other ontologies, datasets, and tools that use the same codes or data model.

01b. How to explore OMOP Using Athena

As a user who wants to use the Researcher Workbench for educational or research needs, here’s what you should know about Athena. Athena is an essential tool for searching concepts in the OMOP vocabulary. In this notebook, you will learn how to navigate Athena's website to find information about OMOP concepts and vocabulary. The OMOP vocabulary is constructed around concept IDs: any medical concept (a condition, drug, procedure, etc.) is represented in OMOP by a concept ID. These IDs are unique identifiers, i.e., you should never find two different medical concepts with the same concept ID. Concept IDs make it easier to find a particular medical concept in the OMOP tables. For example, if you are looking for patient data on diabetes, you need to know the concept ID related to diabetes and use it to search the OMOP tables. Note that the All of Us CDR dataset does not implement every single concept ID found on Athena's website. In this notebook, you will learn how to find a concept ID using Athena.
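
To make this concrete, here is a minimal sketch of how a concept ID found in Athena might be used to query the CDR from a Workbench notebook. The concept ID below is commonly listed as the standard OMOP concept for type 2 diabetes mellitus, but verify the ID for your own concept in Athena before using it.

```python
# Minimal sketch: use an OMOP concept ID (found via Athena) to query the CDR.
# Assumes a Workbench notebook where the WORKSPACE_CDR environment variable
# names the current CDR dataset in BigQuery.
import os
import pandas as pd

cdr = os.environ["WORKSPACE_CDR"]

# 201826 is commonly cited as the standard OMOP concept ID for
# "Type 2 diabetes mellitus"; confirm in Athena before relying on it.
query = f"""
SELECT COUNT(DISTINCT person_id) AS n_participants
FROM `{cdr}.condition_occurrence`
WHERE condition_concept_id = 201826
"""

print(pd.read_gbq(query, dialect="standard"))
```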

02a. Learning How to Use All of Us Tools

The All of Us data is stored in tables, and analyses must be conducted on the Researcher Workbench (within Jupyter Notebooks). To help facilitate the collection and understanding of the data, tools such as the Cohort Builder (CB) and Dataset Builder (DB) are available to users within the Researcher Workbench before they begin their analyses.

02b. Learning How to Query Data using SQL

03a. Learning How to Manipulate Data (for Python Users)

In the Researcher Workbench, the dataset created using the Dataset Builder will be loaded into DataFrames whether the user selects Python or R as the programming language. This notebook describes the most common functions that the user might need to manipulate DataFrames in Python. For the entire list of DataFrame functions, you can visit this page.
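
As a flavor of what the notebook covers, here is a minimal pandas sketch; the column names are illustrative rather than the exact Dataset Builder schema.

```python
# Minimal pandas sketch of common DataFrame manipulations.
import pandas as pd

df = pd.DataFrame({
    "person_id": [1, 2, 3, 4],
    "age": [34, 58, 41, 67],
    "gender": ["Female", "Male", "Female", "Male"],
})

over_40 = df[df["age"] > 40]                        # filter rows
df["age_decade"] = df["age"] // 10 * 10             # derive a new column
counts = df.groupby("gender")["person_id"].count()  # summarize by group
print(over_40, counts, sep="\n")
```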

03b. Learning How to Manipulate Data (for R Users)

In the Researcher Workbench, the dataset created using the Dataset Builder will be loaded into DataFrames whether the user selects Python or R as the programming language. This notebook describes the most common functions that the user might need to manipulate DataFrames in R.

How to get started with Registered Tier Data (tier 6)

This notebook will give you an overview of what data is available in the current Curated Data Repository (CDR). It will also teach you how to retrieve information about Electronic Health Records (EHRs), Physical Measurements (PM), and Survey data.

Data 101 - Data Fundamentals [Python] or [R]:

Depending on your preferred programming language, there are notebooks in both Python and R. The notebooks walk through the supplied code to explain what is being performed. The code blocks can be copied into a new notebook, set in either edit or playground mode, and run. The notebook walks through the following procedure:

  1. Setup: How to set up this notebook, install and import software packages, and select the correct version of the CDR.
  2. Data Availability Part 1: How to summarize the number of unique participants with data present across the major data types: Physical Measurements, Surveys, and EHR.
  3. Data Availability Part 2: How to delve a little deeper into data availability within each major data type.
  4. Data Organization: An explanation of how data are organized according to our Common Data Model (CDM).
  5. Example Queries: How to directly query the CDR, using examples of SQL queries to extract demographic data (a minimal sketch follows this list).
  6. Expert Tip: How to access the base version of the CDR, for users who want to do their own cleaning.
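
As a flavor of the example queries, here is a minimal sketch of a direct demographic query against the CDR. It assumes a Workbench notebook, where the WORKSPACE_CDR environment variable names the current CDR dataset in BigQuery, and uses standard OMOP person table columns.

```python
# Minimal sketch: query demographic fields from the OMOP person table.
# Assumes a Workbench notebook where WORKSPACE_CDR names the CDR dataset.
import os
import pandas as pd

cdr = os.environ["WORKSPACE_CDR"]

query = f"""
SELECT
    p.person_id,
    p.year_of_birth,
    g.concept_name AS gender
FROM `{cdr}.person` p
LEFT JOIN `{cdr}.concept` g ON p.gender_concept_id = g.concept_id
LIMIT 10
"""

demographics = pd.read_gbq(query, dialect="standard")
print(demographics)
```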

 

How to get started with Controlled Tier Data:

Data 101: Fundamentals Python (Controlled Tier Data)

Introduces the data types available in the Controlled Tier, as well as a tutorial on how to retrieve and summarize them in a Jupyter notebook using the Python programming language.

Data 101: Fundamentals R (Controlled Tier Data)

Introduces the data types available in the Controlled Tier, as well as a tutorial on how to retrieve and summarize them in a Jupyter notebook using the R programming language.

 

Data Wrangling in the All of Us Program 

A featured workspace aimed at new users that covers basic data wrangling in the Workbench, related to the popular Office Hours session on this topic. It expands significantly on the Office Hours demo, providing a step-by-step walkthrough of building cohorts, pulling specific types of data associated with the cohort, merging this data into one final data frame, and then visualizing and analyzing it with some common statistical tests.

 

Data Wrangling in R

Data Wrangling in Python

 

How to Work with All of Us Survey Data (tier 6): 

By running the notebooks in this workspace, you will become familiar with how to query PPI questions and surveys, and with the frequencies of answers for each question in each PPI module.

0 - How to Guide [Python] or [R]:

This notebook will walk you through examples of various ways in which to extract and visualize Participant Provided Information (PPI) survey data from the All of Us CDR. Ultimately, you should be able to use example code from this notebook and (with a few minor changes) see similar results in your own research.
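
For orientation, here is a minimal sketch of one way PPI survey answers can be pulled from the CDR's OMOP observation table. The question concept ID is a hypothetical placeholder; look up real IDs in Athena or the Data Browser, and note that the notebook's own query patterns may differ.

```python
# Minimal sketch: pull answers for one PPI survey question from the OMOP
# observation table. The question concept ID is a hypothetical placeholder.
import os
import pandas as pd

cdr = os.environ["WORKSPACE_CDR"]
question_concept_id = 1585940  # placeholder; replace with a real question ID

query = f"""
SELECT
    o.person_id,
    a.concept_name AS answer
FROM `{cdr}.observation` o
JOIN `{cdr}.concept` a ON o.value_as_concept_id = a.concept_id
WHERE o.observation_concept_id = {question_concept_id}
"""

answers = pd.read_gbq(query, dialect="standard")
print(answers["answer"].value_counts())
```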

1 - The Basics [Python] or [R]:

Use this notebook if you would like to know how to extract survey data on All of Us participants, or to get the response frequencies for all responses to The Basics survey.

2 - Overall Health [Python] or [R]:

Use this notebook if you would like to get the response frequencies for all survey responses to the Overall Health survey.

3 - Lifestyle [Python] or [R]:

Use this notebook if you would like to get the response frequencies for all survey responses to the Lifestyle survey.

4 - Personal Medical History [Python] or [R]:

Use this notebook if you would like to get the response frequencies for all survey responses to the Personal Medical History survey.

5 - Healthcare Access and Utilization [Python] or [R]:

Use this notebook if you would like to get the response frequencies for all survey responses to the Healthcare Access & Utilization survey.

6 - Family Health History [Python] or [R]:

Use this notebook if you would like to get the response frequencies for all survey responses to the Family Health History survey.

7 - Social Determinants of Health [Python] or [R]:

Use this notebook if you would like to get the response frequencies for all survey responses to the Social Determinants of Health survey.

8 - COPE Survey [Python] or [R]:

This notebook will give an overview of COPE survey data available in the current Curated Data Repository (CDR) and how to retrieve them. This tutorial is divided into the following sections:

  1. Setup: How to set up this notebook and install and import all necessary software packages.
  2. Data Availability: What types of data can be found in the COPE survey.
  3. Example Queries: How to extract data by survey version, topic, and question.

9 - COVID-19 Vaccine Survey [Python] or [R]:

Use this notebook if you would like to get an overview of working with the COVID-19 minute survey data. 

The tutorial is divided into three sections:

  1. Setup: How to set up this notebook and install and import all necessary software packages.
  2. Data Availability: What types of data can be found in the COVID-19 vaccine survey.
  3. Example Queries: How to extract data by survey version, topic, and question.

10 - HowToCalculateSurveyScore [Python]:

This notebook will demonstrate how to calculate scores from applicable sets of survey response data contained in the Curated Data Repository (CDR). More specifically, it will describe how to apply source scoring instructions and calculate resulting scores for measures included in the All of Us COVID-19 Participant Experience (COPE), Overall Health, and Lifestyle surveys. This tutorial is divided into the following sections:

  1. Setup: How to set up this notebook and import all necessary software packages.
  2. Scores: What measures can be scored and how to calculate them according to each applicable survey (COPE, Overall Health, Lifestyle).

 

How to work with All of Us COPE Survey Data (CT): 

COPE Survey [Python] or [R]:

This notebook will give an overview of COPE survey data available in the current Controlled Tier Curated Data Repository (CDR) and how to retrieve them.

This tutorial is divided into the following sections:

  1. Setup: How to set up this notebook and install and import all necessary software packages.
  2. Data Availability: What types of data can be found in the COPE survey.
  3. Example Queries: How to extract data by survey version, topic, and question.

 

How to work with All of Us Physical Measurement Data:

By running the notebooks in this workspace, you will become familiar with how to navigate the physical measurements data.

Program Physical Measurements [Python] or [R]:

Use this notebook if you would like to know how to extract Physical Measurement (PM) data on All of Us participants, get an idea of what is in the PM data, or determine whether PM data is derived from Participant Provided Information (PPI) or Electronic Health Records (EHR). Measurements collected include height, weight, BMI, waist circumference, hip circumference, pregnancy status, blood pressure, heart rate, and wheelchair use.
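
As a minimal sketch, PM records can be pulled from the standard OMOP measurement table. The concept ID below is commonly cited as the standard OMOP concept for body height; verify it in Athena first.

```python
# Minimal sketch: pull height measurements from the OMOP measurement table.
# Assumes a Workbench notebook with WORKSPACE_CDR set.
import os
import pandas as pd

cdr = os.environ["WORKSPACE_CDR"]

query = f"""
SELECT
    m.person_id,
    m.measurement_datetime,
    m.value_as_number,
    u.concept_name AS unit
FROM `{cdr}.measurement` m
LEFT JOIN `{cdr}.concept` u ON m.unit_concept_id = u.concept_id
WHERE m.measurement_concept_id = 3036277  -- Body height (verify in Athena)
"""

height = pd.read_gbq(query, dialect="standard")
print(height["value_as_number"].describe())
```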

 

How to work with Wearable Device Data (tier 6): 

This workspace gives an overview of the Fitbit data elements available in the current Curated Data Repository (CDR) and provides best practices and tips for retrieving them.

Fitbit 101 [Python] or [R]:

This notebook will give you an overview of what Fitbit data elements are available in the current Curated Data Repository (CDR) and how to extract them for your research. This tutorial is divided into the following sections:

  1. Setup: How to set up the notebook and install and import all necessary software packages.
  2. Data Availability: What types of data elements are contained within each Fitbit data table.
  3. Example Queries: How to extract data from each table (a minimal sketch follows this list).
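
As a sketch, daily step counts might be pulled as follows; the table and column names follow the Fitbit tables as commonly documented for the CDR, so verify them against the current data dictionary before use.

```python
# Minimal sketch: daily step counts from the Fitbit activity summary table.
# Table and column names should be verified against the data dictionary.
import os
import pandas as pd

cdr = os.environ["WORKSPACE_CDR"]

query = f"""
SELECT person_id, date, steps
FROM `{cdr}.activity_summary`
"""

daily_steps = pd.read_gbq(query, dialect="standard")
print(daily_steps.groupby("person_id")["steps"].mean().head())
```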

 

How to Backup Notebooks and Intermediate Results:

These notebooks will give an overview of how to create snapshots of notebooks and backups of intermediate results stored in other files such as plot images and derived data.

Create and view HTML snapshots of notebooks [Python]: 

This notebook displays a widget which can save snapshots of a notebook for later review, allowing users to track changes to results in notebooks over time. To do this, it converts the selected notebook to an HTML file (without re-running the notebook), then copies that HTML file to a subfolder within the same workspace bucket where the notebook file is stored.

Note: the code within this notebook was also made into a code snippet which can be accessed from any notebook through the Snippet tab. 
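
Under the hood, the approach amounts to something like the following sketch: render the notebook to HTML with nbconvert (without re-executing it) and copy the result to the workspace bucket. The notebook name and subfolder are illustrative.

```python
# Minimal sketch of the snapshot idea: convert a notebook to HTML without
# re-running it, then copy the HTML to the workspace bucket.
import os
import subprocess

notebook = "my_analysis.ipynb"  # hypothetical notebook name
bucket = os.environ["WORKSPACE_BUCKET"]

# --to html renders the notebook's saved outputs; it does not re-execute it.
subprocess.run(["jupyter", "nbconvert", "--to", "html", notebook], check=True)

html_file = notebook.replace(".ipynb", ".html")
subprocess.run(
    ["gsutil", "cp", html_file, f"{bucket}/html_snapshots/{html_file}"],
    check=True,
)
```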

Version Files [Python] or [R]:

If your notebooks create files, such as image files of plots or CSVs of intermediate results, that you would like to retain or share with your collaborators without requiring them to re-run your notebook to recreate them, then this notebook shows how to store those files in a workspace bucket.
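
A minimal sketch of the idea, assuming the standard WORKSPACE_BUCKET environment variable and an illustrative file name:

```python
# Minimal sketch: save an intermediate result locally, then copy it to the
# workspace bucket with a timestamp so earlier versions are retained.
import os
import subprocess
from datetime import datetime

import pandas as pd

df = pd.DataFrame({"person_id": [1, 2], "value": [0.5, 0.7]})  # example data

stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"intermediate_results_{stamp}.csv"  # illustrative name
df.to_csv(filename, index=False)

bucket = os.environ["WORKSPACE_BUCKET"]
subprocess.run(["gsutil", "cp", filename, f"{bucket}/data/{filename}"], check=True)
```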

 

How to Run Python Notebooks in the Background:

If you wish to capture all notebook cell outputs, use this notebook to run your analysis notebook (or any other long-running notebook) in the background.

Also note that the cluster will autopause after 24 hours. To prevent your cluster from shutting down if your background job takes longer than 24 hours, be sure to log in and start a notebook, any notebook, to reset the autopause timer.
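
One common approach, not necessarily this workspace's exact mechanism, is to execute the target notebook headlessly with nbconvert, launched so it survives the browser session disconnecting; the notebook name here is hypothetical.

```python
# Sketch: execute a notebook in the background with nbconvert via nohup,
# capturing all cell outputs into a new notebook file.
import subprocess

subprocess.Popen(
    "nohup jupyter nbconvert --to notebook --execute "
    "--ExecutePreprocessor.timeout=-1 "
    "long_running_analysis.ipynb --output long_running_analysis_out.ipynb "
    "> background_run.log 2>&1 &",
    shell=True,
)
```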

 

How to Work with All of Us Genomics Data (Hail and Plink) - Controlled Tier: 

Details and instructions on how to prepare a phenotype, import and save VCF files, run a GWAS, and access and use the Hail MatrixTable; how to use PLINK in a Jupyter notebook; and how to run notebooks in the background.

1 - Get Started with Microarray Genotyping Data:

We will learn how to load the All of Us array data, which includes:

  • Hail matrix table
  • PLINK BED files
  • IDAT files
  • auxiliary files

1 - Get Started with WGS Data:

We will learn how to load the All of Us WGS data, which includes:

  • Hail matrix table
  • CRAM files
  • variant call format (VCF) files
  • auxiliary data, such as
    • VCF shard intervals
    • variant annotations
    • genetic predicted ancestry
    • joint callset sample QC
    • participant-level QC metrics
    • relatedness

2 - Hail Part One Prepare Phenotype:

This notebook shows how to extract the EHR and demographic data that will later be used in this notebook series for our GWAS analysis.

2 - Hail Part Two GWAS:

This notebook performs a GWAS and visualizes the association results. You will learn:

  • How to load the previously saved phenotype
  • How the phenotype and genotype data are merged into one matrix table for the GWAS modeling
  • How to use Hail to run your GWAS model
  • How to create Manhattan and Q-Q plots to check your results (a minimal sketch follows this list)
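
For flavor, here is a minimal Hail sketch of the core association step. The MatrixTable path and the phenotype field are illustrative assumptions; the notebook's actual code differs in detail.

```python
# Minimal Hail sketch of a linear-regression GWAS over a MatrixTable whose
# columns carry a phenotype annotation. Paths and field names are illustrative.
import hail as hl

hl.init()

mt = hl.read_matrix_table("gs://my-bucket/aou_genotypes.mt")  # hypothetical path

# Assume phenotype values were annotated onto the columns as mt.pheno.ldl.
gwas = hl.linear_regression_rows(
    y=mt.pheno.ldl,
    x=mt.GT.n_alt_alleles(),
    covariates=[1.0],  # intercept only here; real analyses add age, sex, PCs
)

# Manhattan plot of the association p-values (a bokeh figure).
fig = hl.plot.manhattan(gwas.p_value)
```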

3 - Manipulate Hail MatrixTable:

In this tutorial notebook, you will learn how to use Hail functions to manipulate a Hail MatrixTable, including how to:

  • Get basic information about a MatrixTable
  • Filter variants
  • Filter samples
  • Convert a Hail MatrixTable into other file formats, such as PLINK BED files and VCF files (a minimal sketch follows this list)
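
A minimal Hail sketch of these operations, with an illustrative path and hypothetical sample IDs:

```python
# Minimal Hail sketch of the listed MatrixTable operations.
import hail as hl

hl.init()
mt = hl.read_matrix_table("gs://my-bucket/aou_genotypes.mt")  # hypothetical path

mt.describe()        # basic information: row/column fields and their types
print(mt.count())    # (number of variants, number of samples)

mt = mt.filter_rows(mt.locus.contig == "chr1")    # keep only chr1 variants
keep = hl.literal({"sample_1", "sample_2"})       # hypothetical sample IDs
mt = mt.filter_cols(keep.contains(mt.s))          # keep only selected samples

hl.export_plink(mt, "gs://my-bucket/chr1_subset")       # PLINK BED/BIM/FAM
hl.export_vcf(mt, "gs://my-bucket/chr1_subset.vcf.bgz") # compressed VCF
```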

4 - PLINK and GWAS Part One Model Building [Python]:

This notebook shows:

  • How to pre-process the phenotype data into a PLINK-readable format
  • How to use PLINK to run your GWAS model (a minimal sketch follows this list)
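
A minimal command-line sketch of the PLINK step, run from a notebook; the file names are illustrative, and the flags are standard PLINK 1.9 options.

```python
# Minimal sketch: run a PLINK GWAS from a notebook via the shell.
# File names are hypothetical; flags are standard PLINK 1.9 options.
import subprocess

subprocess.run(
    [
        "plink",
        "--bfile", "genotypes",      # genotypes.bed/.bim/.fam (hypothetical)
        "--pheno", "phenotype.txt",  # FID, IID, phenotype columns
        "--linear",                  # linear-regression association test
        "--out", "gwas_results",
    ],
    check=True,
)
```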

4 - PLINK and GWAS Part Two Plot [R]:

This notebook shows how to load and visualize GWAS results using R.

5 - How to Run Notebooks in the Background:

On the All of Us Researcher Workbench, users are logged out after 30 minutes of inactivity. For long-running Hail jobs, this means that users may not have all notebook cells populated even though the notebook continues to run to completion.

Also note that the cluster will autopause after 24 hours. To prevent your cluster from shutting down if your background Hail job takes longer than 24 hours, be sure to log in and start a notebook, any notebook, to reset the autopause timer.

5 - Monitor Cloud Analysis Environment:

This notebook shows how to view CPU utilization and memory utilization when your Cloud analysis environment is running. For example, if you are using a Dataproc cluster and you see that the CPU utilization of the worker machines is close to 100%, you could increase the number of workers to speed up the "wall clock time" it takes to complete the analysis.

 

CRAM Processing

The notebooks in this workspace demonstrate how to copy or localize All of Us CRAM files to your workspace bucket and active cloud environment in order to look at their contents with the Integrative Genomics Viewer (IGV).

This workspace contains two notebooks:

1 - CRAM localization 

This notebook shows where the CRAM files and manifest are, how to localize the manifest or a known CRAM to your workspace bucket and active environment, and how to use the manifest to localize CRAMs by the included paths.
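
A minimal sketch of the localization step; the CRAM path is a placeholder, and the `-u` flag charges the copy to your workspace project on the assumption that the source bucket is requester-pays (verify this for your data release).

```python
# Minimal sketch: copy a CRAM (and its index) to the local environment.
# The source paths are placeholders -- use the paths from the manifest.
import os
import subprocess

project = os.environ["GOOGLE_PROJECT"]  # the workspace's billing project
cram = "gs://example-bucket/crams/sample_0001.cram"  # hypothetical path

subprocess.run(
    ["gsutil", "-u", project, "cp", cram, cram + ".crai", "."],
    check=True,
)
```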

2 - IGV Analysis 

In this notebook, you will use the Integrative Genomics Viewer (IGV) to look at one of the CRAM files you localized to your active environment in notebook 1.

 

How to Run WDLs using Cromwell in the Researcher Workbench:

Validate VCFs with Cromwell:

In this tutorial, you will learn how to set up your Cloud Environment and use Cromwell to execute an example script, validate_vcf.wdl. This workflow uses the GATK ValidateVariants tool to validate VCF files. Alpha3 VCF files, corresponding index files, and the human reference genome assembly are provided to the workflow as inputs. The duration of this tutorial should be around 20 minutes.

After completing the tutorial, you should understand how to do the following:

  1. Set up your Cloud Environment
  2. Load WDL, JSON, and other files required for your workflow
  3. Execute a WDL using Cromwell
  4. Find output and log files

 

How to Use Nextflow in the Researcher Workbench:

Validate VCFs with Nextflow:

This notebook is designed to be an example of how to use a Nextflow script within the Researcher Workbench with the Alpha3 Array single sample variant call format (VCF) files.

How to Use dsub in the Researcher Workbench:

1 - dsub Setup:

In this notebook, we set up dsub for use on the All of Us Researcher Workbench.

What you will learn:

  1. What is dsub?
  2. When would I want to use dsub instead of a notebook?
  3. How to install dsub.
  4. How to create a bash function with default argument values for dsub.

2 - Run a Basic dsub Job:

In this notebook, we will run the simplest dsub 'hello world' job to demonstrate the basics.

What you will learn:

  1. How to run a simple dsub job (a minimal sketch follows this list).
  2. How to check job status.
  3. How to debug a failed job by examining the log files.
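
For flavor, here is a minimal dsub invocation of the hello-world kind. The provider, region, and image values are placeholder assumptions; the setup notebook wraps dsub in a bash function with appropriate defaults for the Workbench.

```python
# Minimal sketch of a dsub "hello world" job. Values are placeholders;
# the setup notebook supplies appropriate defaults for the Workbench.
import os
import subprocess

bucket = os.environ["WORKSPACE_BUCKET"]
project = os.environ["GOOGLE_PROJECT"]

subprocess.run(
    [
        "dsub",
        "--provider", "google-cls-v2",     # one of dsub's Google providers
        "--project", project,
        "--regions", "us-central1",
        "--logging", f"{bucket}/dsub/logs",
        "--image", "ubuntu:20.04",
        "--command", "echo hello world",
        "--wait",                          # block until the job finishes
    ],
    check=True,
)
```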

3 - Run a Parallel dsub Job:

In this notebook, we run a simple dsub job that uses wc to count the number of lines in some PLINK format files. The task itself is not particularly useful, but it demonstrates how to run a command in a parallel manner.

What you will learn:

  1. How to construct a task file for dsub
  2. How to access All of Us PLINK format files
  3. How to run parallel dsub jobs
  4. What it looks like when tasks succeed or fail

 

How to Reproduce All of Us SARS-CoV-2 Antibody Study:

SARS-CoV-2 Antibody Study:

This notebook will give an overview of the study "Antibodies to SARS-CoV-2 in All of Us Research Program Participants, January 2-March 18, 2020" in the current Curated Data Repository (CDR) and how to reproduce it.

This tutorial is divided into the following sections:

  1. Setup: How to set up this notebook and install and import all necessary software packages.
  2. Materials and Methods: What are the characteristics of the study population and what methods are used in the study.
  3. Data Extraction: How to query the relevant CDR tables to extract the study data.
  4. Results: How to reproduce the results of the study.

 

Demonstration Projects:

Workspaces in the Demonstration Projects section of Featured Workspaces will show you end-to-end analyses performed using All of Us data. These projects demonstrate the quality, utility, and diversity of the All of Us data by replicating findings in previously published studies.

 

Wearables and the Human Phenome

This study and its corresponding notebook examine the associations between physical activity over time (measured using participant wearables) and incident chronic diseases as determined by EHR data. It is an example of how the Workbench can be leveraged to work with Fitbit data using R.

 

Demo-PheWAS Smoking

As a demonstration project, this study will present the results of Phenome-Wide Association Studies (PheWAS) to show how the various sources of data contained within the All of Us research dataset can be used to inform scientific discovery.

Smoking PheWAS Part I. [Python]:

The specific goals of this notebook are to:

  1. Demonstrate how to implement a Phenome-wide Association Study within the All of Us Researcher Workbench.
  2. Demonstrate the use of heterogeneous data sources within the All of Us dataset.

Smoking PheWAS Part II. [Python]:

In this notebook, we develop plots that compare the results of the EHR smoking, PPI ever-smoking, and PPI smoking-every-day PheWAS routines.

 

Demo - Cardiovascular Risk Scoring: 

In this project, we use the AHA algorithm/equation to calculate cardiovascular risk scores. Further, we demonstrate the use of the smoking and race data collected by the program, data that researchers usually must extract with natural language processing, to facilitate the calculation of the cardiovascular risk score.

Framingham_ASCVDModels [Python]:

The specific goals of this notebook are to:

  1. Identify the variables required to calculate the cardiovascular risk score from participant provided information (PPI) and EHR data.
  2. Calculate the cardiovascular risk scores and analyze them by race.

 

Demo - Medication Sequencing: 

In this project, we plan to use the medication sequencing methods developed at Columbia University and the OHDSI network as a means to characterize treatment pathways at scale. Further, we want to demonstrate the implementation of these medication sequencing algorithms in the All of Us research dataset to show how the various sources of data contained within the program can be used for this purpose.

Medication Sequences Code [Python]:

The specific goals of this notebook are to:

  1. Demonstrate how to implement a medication sequence for diseases including type 2 diabetes and depression within the All of Us Researcher Workbench.
  2. Demonstrate the use of heterogeneous data sources within the All of Us research dataset.

 

Demo - All of Us Descriptive Statistics 

In this study, we will apply data visualization libraries to aggregate information about the cohort. We will measure age as the age at the time the CDR was generated.

Descriptive plots to characterize the cohort [R]:

The specific goals of this notebook are to:

  1. Give an overview of the data types included in the beta release of the Curated Data Repository (CDR), including the historic availability of electronic health records (EHR).
  2. Describe participants by age, race, and ethnicity using all the available data types.
  3. Describe the population underrepresented in biomedical research (UBR) within the All of Us Research Program participants.

 

Demo - Family History in EHR and PPI: 

FHH Analysis:

To evaluate data in the All of Us program, we use descriptive statistics to perform an exploratory analysis of family history data availability in EHR and survey data.

 

Demo - Hypertension Prevalence: 

HTN:

All of Us demonstration project teams were charged with replicating known associations from published literature to demonstrate the utility of All of Us data and to test the Researcher Workbench interface prior to release. Our aim was to use published methods to replicate known differences in hypertension prevalence in UBR groups and to illustrate variation in hypertension prevalence across geographic regions of the U.S. We compared our results to the 2015-2016 National Health and Nutrition Examination Survey (NHANES) hypertension prevalence results (https://www.cdc.gov/nchs/products/databriefs/db289.htm).

 

Demo - Systemic Disease and Glaucoma: 

The aims of our study were to: (1) externally validate our single-center model’s performance with All of Us data, (2) develop models trained by the All of Us data and compare their performance to our single-center model, and (3) share insights from our experience using All of Us data and the Researcher Workbench with other ophthalmology researchers who may be interested in using this novel data source.

1 - Data Extraction and Cleaning:

2 - Validation of Single-center Model:

3 - Data Modeling using AoU Data:

4 - Best Performing Models in Manuscript:

5 - Modeling AoU Data with Single-center Model Predictors:

 

Demo - Siloed Analysis of All of Us and UK Biobank Genomic Data: 

This workspace contains all the notebooks needed to perform a regenie genome-wide association study of lipids over the exonic variants within the All of Us (AoU) alpha3 release of genomic data, in a siloed fashion.

0 - readme:

1 - aou_lipids_phenotype:

1 - aou_lipids_phenotype_20220314_204423:

2 - aou_write_filtered_bgen:

2 - aou_write_filtered_bgen_20220315_183726:

2 - aou_write_filtered_bgen_previewable_20220315_183726:

3 - aou_variant_qc:

3 - aou_variant_qc_20220316_192346:

4 - aou_plink_ld_and_pca:

4 - aou_plink_ld_and_pca_20220316_194724:

5 - aou_phenotype_for_gwas:

5 - aou_phenotype_for_gwas_20220316_203933:

6 - aou_regenie_gwas:

6 - aou_regenie_gwas_20220316_204249:

7 - aou_plot_results:

7 - aou_plot_results_20220316_223323:

8 - aggregate_gwas_results:

8 - aggregate_gwas_results_20220317_183208:

html_snapshots:

Monitor_cloud_analysis_environment:

Run_notebook_in_the_background:

 

Data Quality Reports - 2022Q2R2 v6 CDR: 

This workspace provides detailed demographic information about the participants in the 2022Q2R2 version 6 curated data repository (CDR), as well as a summary of participants by data type. Tables and useful graphs are included. Notebooks include: 

1. Summary of Participants By Data Type

2. Demographic Characteristics of Participants By Data Types

3. UBR Breakdown

 

Replication of Dissecting Racial Bias Paper:

In this notebook, we will train a machine learning model that predicts the health status of a participant in the year following the participant's enrollment. The model uses the following features (a minimal modeling sketch follows the list):

  • Demographics: age at enrollment, gender, race and ethnicity, insurance, education
  • Indicators for active chronic conditions at the enrollment year
  • Biomarkers related to chronic diseases
  • Medication: number of unique medications
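
As a hedged sketch of the modeling step, assuming scikit-learn and hypothetical feature and label columns standing in for the features listed above:

```python
# Minimal scikit-learn sketch of the prediction task. Column names are
# hypothetical stand-ins for the demographic, condition, biomarker, and
# medication features described above.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

features = pd.read_csv("model_features.csv")   # hypothetical prepared data
X = features.drop(columns=["person_id", "poor_health_next_year"])
y = features["poor_health_next_year"]          # hypothetical binary label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = GradientBoostingClassifier().fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```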

 

Phenotype Library:

Workspaces in the Phenotype Library section of Featured Workspaces demonstrate how computable electronic phenotypes can be implemented within the All of Us dataset using examples of previously published phenotype algorithms. 

 

Phenotype - Breast Cancer:

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithm: Ning Shang, George Hripcsak, Chunhua Weng, Wendy K. Chung, & Katherine Crew. Breast Cancer. Retrieved from https://phekb.org/phenotype/breast-cancer.

Breast Cancer PPI Data Analysis [R]:

In this Notebook, we provide a supplement to the validated Breast Cancer Phenotype by demonstrating a method to identify additional participants according to their answers to breast cancer-related questions on participant provided information (PPI) surveys.

Breast Cancer Phenotype Analysis [Python]:

This notebook describes the query performed to capture a cohort according to a predefined algorithm for breast cancer.

 

Phenotype - Breast Cancer (Controlled Tier): 

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithm: Ning Shang, George Hripcsak, Chunhua Weng, Wendy K. Chung, & Katherine Crew. Breast Cancer. Retrieved from https://phekb.org/phenotype/breast-cancer.

Breast Cancer Phenotype Analysis [Python]:

This notebook describes the query performed to capture a cohort according to a predefined algorithm for breast cancer.

 

Phenotype - Dementia: 

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithms: Ritchie, M., Denny, J., Crawford, D., Ramirez, A., Weiner, J., … Roden, D. (2010). Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. American Journal of Human Genetics. 87(2):310 doi: 10.1016/j.ajhg.2010.03.003

Dementia Phenotype Analysis [Python]:

This notebook describes the query performed to capture a cohort according to a predefined phenotype algorithm for dementia.

Dementia Phenotype Analysis from Cohort Builder [Python]:

This notebook describes the query performed to capture a cohort according to a predefined phenotype algorithm for dementia. This notebook also provides visualization.

 

Phenotype - Depression: 

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithm from the eMERGE network: TBA. KPWA/UW. Depression. PheKB; 2018 Available from: https://phekb.org/phenotype/1095

Depression Phenotype Analysis [Python]:

This notebook describes the query performed to capture 3 cohorts according to a predefined phenotype algorithm for depression.

 

Phenotype - Ischemic Heart Disease: 

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithm: Christianne L. Roumie; Jana Shirey-Rice, Sunil Kripalani. Vanderbilt University. MidSouth CDRN - Coronary Heart Disease Algorithm. PheKB; 2014. Available from https://phekb.org/phenotype/234

Ischemic Heart Disease Analysis [Python]:

This notebook describes the query performed to capture a cohort according to a predefined phenotype algorithm for ischemic heart disease, also known as coronary artery disease.

 

Phenotype - Ischemic Heart Disease (Controlled Tier): 

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithm: Christianne L. Roumie; Jana Shirey-Rice, Sunil Kripalani. Vanderbilt University. MidSouth CDRN - Coronary Heart Disease Algorithm. PheKB; 2014. Available from https://phekb.org/phenotype/234

Ischemic Heart Disease Analysis [Python]:

This notebook describes the query performed to capture a cohort according to a predefined phenotype algorithm for ischemic heart disease, also known as coronary artery disease.

 

Phenotype - Diabetes: 

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithm: Jennifer Pacheco and Will Thompson. Northwestern University. Type 2 Diabetes Mellitus. PheKB; 2012. Available from: https://phekb.org/phenotype/18

Type 2 Diabetes Analysis [Python]:

This notebook describes the query performed to capture a cohort according to a predefined phenotype algorithm for Type 2 Diabetes according to four different cases.

 

Phenotype - Diabetes (Controlled Tier): 

By reading and running the notebooks in this Phenotype Library workspace, researchers can implement the following published phenotype algorithm: Jennifer Pacheco and Will Thompson. Northwestern University. Type 2 Diabetes Mellitus. PheKB; 2012. Available from: https://phekb.org/phenotype/18

Type 2 Diabetes Analysis [Python]:

This notebook describes the query performed to capture a cohort according to a predefined phenotype algorithm for Type 2 Diabetes according to four different cases.
