This support article highlights use cases of the “temporal” feature within the Cohort Builder. This feature can be very useful when creating a cohort to explore participant availability based on inclusion and exclusion criteria for a specific study design. To learn more about the Temporal Feature, please see this Office Hours presentation: Dec. 8th, 2023: Cohort Builder Temporal Feature.
Disclaimer: Information and examples shown in this article are sourced from a DRC operational environment using synthetic data. This data is solely for education and training purposes, and does not represent actual participant data. Counts and any demographic summaries may vary. Therefore, researchers should NOT expect to replicate these counts using the current CDR on the Researcher Workbench.
Maria Kilrain, Hiral Master, Lina Sulieman, Sam Stewart
On behalf of the Data & Research Center and the National Institutes of Health
The All of Us Research Program collects longitudinal data from participants and provides it securely to researchers on the Researcher Workbench. Researchers often need to check whether the sample size is large enough to conduct a longitudinal study on the Workbench. The Researcher Workbench provides point-and-click tools, such as the Cohort Builder and Dataset Builder, to ease this process (more details on the Cohort Builder and Dataset Builder tools here).
The Cohort Builder tool has a “temporal” feature that allows researchers to create a cohort with temporal inclusion or exclusion criteria. These criteria consider clinical events that happened during the same encounter as, before, after, or within a certain time range of another clinical event, such as a clinical encounter.
In this article, we present examples that show how to create temporal inclusion criteria, using the prescription of the drug tamoxifen as our main clinical event. The same method can be used for exclusion criteria. The examples describe how to build cohorts that require temporal features and demonstrate how the cohort size changes depending on the temporal option applied.
The following options are available to be used within the temporal feature:
- before a specific event – ‘X or more days before’.
- after a specific event – ‘X or more days after’.
- within a period after a specific event – ‘on or within X days of’.
- during the same encounter with a healthcare provider – ‘During same encounter as’.
It is important to consider that all aforementioned alternatives will be applied in combination with the following options for the first criterion:
- ‘Any mention’ - represents any time the specific code appears in the EHR.
- ‘First mention’ - represents the first time the specific code appears in the EHR.
- ‘Last mention’ - represents the last time the specific code appears in the EHR.
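The options above can be read as simple date comparisons. The following is a minimal Python sketch of that logic, not the Cohort Builder's actual query. The function names and option labels are hypothetical, and we assume that ‘on or within X days of’ means on or up to X days after the second event, as described in this article. ‘During same encounter as’ compares encounter identifiers rather than dates, so it is left out of this sketch.

```python
from datetime import date


def pick_mentions(dates, mention):
    """Return the record dates of the first criterion that should be tested."""
    ordered = sorted(dates)
    if mention == "any":
        return ordered
    if mention == "first":
        return ordered[:1]
    if mention == "last":
        return ordered[-1:]
    raise ValueError(f"unknown mention option: {mention}")


def satisfies(first_date, second_date, option, x_days=0):
    """Check one pair of dates against a temporal option.

    A positive delta means the first criterion was recorded after the second.
    """
    delta = (first_date - second_date).days
    if option == "x_or_more_days_before":
        return delta <= -x_days
    if option == "x_or_more_days_after":
        return delta >= x_days
    if option == "on_or_within_x_days_of":
        # Assumption: the window runs forward from the second event.
        return 0 <= delta <= x_days
    raise ValueError(f"unknown temporal option: {option}")


# Synthetic example: two condition records and one drug record.
condition_dates = [date(2021, 1, 1), date(2019, 1, 1)]
drug_date = date(2020, 1, 1)

first_only = pick_mentions(condition_dates, "first")
print(first_only)  # [datetime.date(2019, 1, 1)]
print(satisfies(first_only[0], drug_date, "x_or_more_days_after", 365))  # False
```

With ‘Any mention’ instead of ‘First mention’, the 2021 record would also be tested and would satisfy ‘365 or more days after’, which is why the mention option can change the cohort size.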
How to read the Temporal Feature
The temporal feature should be read from top to bottom, as indicated in the example below. It is important to keep the structure of the feature in mind before entering any criteria. Also note that, although the temporal selection is added above the temporal key, a continuous line marks the division between the temporal criteria and any prior or additional selections.
Example: “Selecting participants whose first mention of malignant tumor of breast in EHR records was recorded 30 or more days prior to records of the drug tamoxifen in EHR data”.
Table 1. The table below shows the temporal options along with their respective use cases.
|Temporal option|Use case|
|No temporal feature|Selecting cohorts without temporal feature: tamoxifen as a drug|
|During the same encounter with a health provider: ‘During same encounter as’|Option 1: ‘During same encounter as’|
|After a specific event: ‘X or more days after’|Option 2: ‘X or more days after’|
|Before a specific event: ‘X or more days before’|Option 3: ‘X or more days before’|
|Within a period since a specific event: ‘on or within X days of’|Option 4: ‘on or within X days of’|
Use Case: Participants that take or have taken tamoxifen
Tamoxifen can reduce the risk of breast cancer in high-risk women. However, this medication can have side effects. The most common ones are hot flashes, vaginal discharge, nausea, mood swings, fatigue, depression, hair thinning, constipation, loss of libido, and dry skin. Tamoxifen can also increase the risk of having a stroke, blood clots, or endometrial cancer.
In the following sections, we show practical examples in which using the temporal feature changes the size of our cohort. The temporal feature requires two criteria, and we will explore how to approach each of its available options.
Selecting cohorts without temporal feature: tamoxifen as a drug
In your workspace, under the "Data" tab you will have access to both the Cohort Builder and Dataset Builder. The Cohort Builder is the tool that will allow you to use the temporal feature.
In this first example, only one criterion will be selected: ‘tamoxifen’ as a ‘drug’, because we are interested in all participants that have been treated with that specific drug.
In the search box, we enter the key concept, tamoxifen, and then click on drug as the domain. A list of options will be displayed. We are interested only in tamoxifen as a generic, so we click on the plus sign next to tamoxifen to select this concept. After clicking on ‘finish and review’ we can save our first criterion.
After saving the criterion, the Cohort Builder will show the cohort size and a high-level overview of the cohort demographics. The cohort count will reflect participants that have been treated with tamoxifen.
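Conceptually, the count shown at this step is the number of distinct participants with at least one tamoxifen record, not the number of records. A minimal sketch with made-up synthetic records (the record layout and variable names are hypothetical, not the Workbench schema):

```python
# Synthetic, hypothetical drug records: (person_id, drug_name).
drug_records = [
    (1, "tamoxifen"),
    (1, "tamoxifen"),   # a second record for the same participant
    (2, "omeprazole"),
    (3, "tamoxifen"),
]

# A participant is counted once, however many matching records they have.
tamoxifen_cohort = {person_id for person_id, drug in drug_records if drug == "tamoxifen"}
cohort_size = len(tamoxifen_cohort)

print(sorted(tamoxifen_cohort))  # [1, 3]
print(cohort_size)               # 2
```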
In the following section we will explore different options to filter the study group using the temporal feature. The size of the cohort will change depending on the temporal condition selected.
Selecting temporal options from the temporal feature
To explore the temporal feature, activate it by clicking the toggle next to the word “Temporal”. As mentioned at the beginning of this article, four options are available. We present an example of each of these options in the subsequent sections.
The different temporal options available in the Cohort Builder are displayed when the "temporal" toggle is on.
These temporal options are used in combination with ‘Any mention of’, ‘First mention of’ and ‘Last mention of’, presented below, according to our search criteria.
Example for Option 1: ‘During same encounter as’.
In this example we explore the counts of participants whose EHR data shows tamoxifen and omeprazole recorded during the same encounter with a healthcare provider. Stomach-related side effects of tamoxifen include nausea, loss of appetite, gastrointestinal symptoms, and pain in the upper stomach. Healthcare providers may prescribe medication to help patients cope with these side effects. Here, we search for participants who were prescribed omeprazole, a medication used to treat indigestion, heartburn, and acid reflux, and whose omeprazole and tamoxifen records fall within the same encounter with a healthcare provider. In the search box we enter the keyword ‘omeprazole’ and select ‘drug’ as the domain.
By selecting ‘Omeprazole’ from the list of possible drugs, clicking on the ‘finish and review’ button and saving the criterion, our first criterion will be chosen.
Once the temporal toggle is activated, we can choose the option ‘during same encounter as’ and add the second criterion. By selecting the ‘Drugs’ domain, we can search for tamoxifen as our second temporal concept.
As the picture below shows, we select ‘tamoxifen’ from the list of possible medications, and then we can save our selection by clicking ‘Save criteria’.
The resulting cohort includes participants that meet the criteria: tamoxifen and omeprazole recorded during the same encounter with a healthcare provider.
Now our cohort can be saved by clicking the ‘create cohort’ button, and we can name it (for example) ‘tamoxifen_omeprazole_cohort’. The newly created cohort is available to be used in the Dataset Builder.
Please note that this example looks at two medications that have been recorded during the same encounter. Therefore, the final counts will not change even if you reverse the medication order.
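The order-independence noted above can be illustrated with a small sketch. This is an illustration on made-up synthetic records, not the Cohort Builder's actual query; the record layout, the encounter identifiers and the function name are hypothetical.

```python
from collections import defaultdict

# Synthetic, hypothetical records: (person_id, encounter_id, drug_name).
records = [
    (1, "v10", "tamoxifen"),
    (1, "v10", "omeprazole"),   # same encounter as the tamoxifen record
    (2, "v20", "tamoxifen"),
    (2, "v21", "omeprazole"),   # different encounter
    (3, "v30", "omeprazole"),   # no tamoxifen record at all
]


def same_encounter_cohort(records, first_drug, second_drug):
    """Return participants with both drugs recorded in one encounter."""
    drugs_by_encounter = defaultdict(set)
    for person_id, encounter_id, drug in records:
        drugs_by_encounter[(person_id, encounter_id)].add(drug)
    return {
        person_id
        for (person_id, _), drugs in drugs_by_encounter.items()
        if first_drug in drugs and second_drug in drugs
    }


forward = same_encounter_cohort(records, "omeprazole", "tamoxifen")
reverse = same_encounter_cohort(records, "tamoxifen", "omeprazole")
print(sorted(forward))     # [1]
print(forward == reverse)  # True
```

Because the check only asks whether both drugs share an encounter, swapping the first and second criterion returns the same participants.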
Example for Option 2: ‘X or more days after’.
In this second example, we will explore how our cohort size changes when using a different temporal option: ‘X or more days after’. This cohort will include participants that developed diabetes after one year of treatment with tamoxifen (365 days; all time intervals must be provided in days).
According to recent medical studies, breast cancer may be linked to diabetes, and there is evidence of an association between cancer treatments such as tamoxifen and the risk of developing type 2 diabetes.
As our first criterion, we search for participants with ‘diabetes mellitus’ as a ‘condition’.
We will be prompted to select one of the different concepts from a list of conditions related to diabetes mellitus. ‘Type 2 diabetes mellitus’ is the one we are interested in.
After clicking on ‘finish and review’ and saving our criteria, we will have selected our first criterion, ‘Type 2 diabetes mellitus’.
Once we have saved our first concept, a second concept needs to be chosen in order to use the temporal feature within the Cohort Builder. Since our study group consists of participants who developed type 2 diabetes mellitus after one year or more of treatment with tamoxifen, the temporal option needed is ‘X or more days after’. We also select ‘first mention of’ to make sure the condition started after the use of the drug.
To use the temporal feature, note that we have selected the ‘first mention of’ the condition ‘type 2 diabetes mellitus’, 365 days or more after ‘tamoxifen’ was recorded in participants’ EHR records.
The next step is clicking on the drug domain and entering tamoxifen as the second concept.
We can now click on ‘finish and review’ to make sure our second concept is selected and save the selection.
Our total count of participants reflects both criteria: participants who were prescribed tamoxifen and developed diabetes one year or more after the beginning of treatment with tamoxifen.
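To see why ‘first mention of’ matters here, consider the sketch below. It is an illustration on made-up synthetic dates, not the Cohort Builder's actual query. The function name is hypothetical, and we assume a participant qualifies when the selected condition record falls X or more days after any of their tamoxifen records.

```python
from datetime import date

# Synthetic, hypothetical record dates per participant.
t2d_dates = {
    1: [date(2019, 6, 1)],
    2: [date(2017, 3, 1), date(2020, 1, 1)],  # diabetes first recorded before tamoxifen
    3: [date(2019, 6, 1)],
}
tamoxifen_dates = {
    1: [date(2018, 1, 10)],
    2: [date(2018, 5, 1)],
    3: [date(2019, 1, 1)],
}


def days_after_cohort(first_dates, second_dates, x_days, mention="first"):
    """Participants whose selected first-criterion record is x_days or more after a second-criterion record."""
    cohort = set()
    for person_id, dates in first_dates.items():
        if not dates:
            continue
        # 'first' tests only the earliest record; 'any' tests every record.
        candidates = [min(dates)] if mention == "first" else dates
        for second in second_dates.get(person_id, []):
            if any((d - second).days >= x_days for d in candidates):
                cohort.add(person_id)
    return cohort


first_mention = days_after_cohort(t2d_dates, tamoxifen_dates, 365, mention="first")
any_mention = days_after_cohort(t2d_dates, tamoxifen_dates, 365, mention="any")
print(sorted(first_mention))  # [1]
print(sorted(any_mention))    # [1, 2]
```

Participant 2 already had a diabetes record before tamoxifen, so ‘first mention of’ excludes them, while ‘any mention of’ would let a later record qualify. Participant 3 is excluded in both cases because the condition was recorded less than 365 days after the drug.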
By creating a cohort, this group will be available in the Dataset Builder. This new cohort can be named, for example, ‘tamoxifen_diabetes_cohort’ and then saved.
Example for Option 3: ‘X or more days before’.
Tamoxifen may be recommended by a healthcare provider to treat premenopausal and postmenopausal women with breast cancer. Moreover, tamoxifen can be given after surgery to reduce the risk of recurrent breast cancer or avoid metastasis to other parts of the body. Tamoxifen is often used as the first treatment for breast cancer or when surgery needs to be postponed.
For women who have a high risk of developing cancer or have been diagnosed with ductal carcinoma in situ, taking tamoxifen for 5 years lowers the risk of ductal carcinoma in situ recurring in the same breast or the other breast. Tamoxifen also decreases the risk of developing an invasive breast cancer. In this example, we look at participants that received treatment with tamoxifen after a diagnosis of breast cancer.
As the first criterion, we are selecting ‘Malignant tumor of breast’ as a ‘condition’.
We can select the condition ‘malignant tumor of breast’ from the list of different conditions.
By clicking the ‘finish and review’ button and saving this concept as the first requirement for our study group, we can proceed with selecting the additional criteria.
As a second criterion, we will include all participants that started ‘tamoxifen’ 30 or more days after being diagnosed with breast cancer.
For this example, we select ‘tamoxifen’ as a generic drug. We are not focusing on any specific brand or format, so any participants that have been exposed to tamoxifen will be captured regardless of the dose, format or product brand.
As we can observe, our cohort counts reflect participants that were diagnosed with the condition ‘malignant tumor of breast’ 30 or more days prior to starting tamoxifen treatment.
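The value entered for X directly affects the cohort size. The sketch below, on made-up synthetic dates, shows how the same participants move in and out of the cohort as X changes. The function name and data layout are hypothetical, and the comparison is our reading of ‘X or more days before’ rather than the tool's actual query.

```python
from datetime import date

# Synthetic, hypothetical dates: first breast cancer diagnosis and tamoxifen records.
first_breast_cancer_dx = {
    1: date(2020, 1, 1),
    2: date(2020, 1, 1),
    3: date(2021, 6, 1),   # diagnosed after tamoxifen was recorded
}
tamoxifen_records = {
    1: [date(2020, 3, 1)],   # 60 days after diagnosis
    2: [date(2020, 1, 15)],  # 14 days after diagnosis
    3: [date(2021, 1, 1)],
}


def days_before_cohort(first_event, second_events, x_days):
    """Participants whose first event is x_days or more before a second event."""
    cohort = set()
    for person_id, event_date in first_event.items():
        for second_date in second_events.get(person_id, []):
            if (second_date - event_date).days >= x_days:
                cohort.add(person_id)
                break
    return cohort


for x_days in (10, 30, 90):
    cohort = days_before_cohort(first_breast_cancer_dx, tamoxifen_records, x_days)
    print(x_days, sorted(cohort))
# 10 [1, 2]
# 30 [1]
# 90 []
```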
After selecting 'create cohort' we will have the option to save it as ‘tamoxifen_breast_cancer_cohort’ (for example) in order to use it in the Dataset Builder.
Example for Option 4: ‘on or within X days of’.
Tamoxifen is a powerful drug used to treat or prevent breast cancer, and it is currently one of the most frequently prescribed drugs for breast cancer globally. A major side effect of tamoxifen is an increased risk of uterine or endometrial cancer after long-term treatment.
In this example, we explore the group of participants who developed this condition within ten years of starting treatment with tamoxifen. The option we select in this case is ‘on or within X days of’ and the value is 3650 days (all values have to be entered in days). By selecting ‘first mention of’ for the uterine cancer diagnosis, we capture participants whose earliest record of the condition falls within that window.
So we are searching for participants whose first records of uterine cancer fall within 10 years of the beginning of treatment with tamoxifen.
As the first criterion for the group, participants need to have been diagnosed with uterine cancer, that is, ‘Malignant neoplasm of uterus’ or any sub-category of this condition.
By clicking on ‘Malignant neoplasm of uterus’, any child conditions will also be selected.
For our second criterion, we need to add the temporal feature. Since we need to make sure the uterine cancer starts at most 10 years after the beginning of treatment with tamoxifen, in the top box we select ‘first mention of’ in combination with the temporal option ‘on or within X days of’. As time needs to be provided in days, we enter 3650 days.
After selecting the domain ‘drug’, we can search for ‘Tamoxifen’ in the list of medications displayed.
The resulting cohort will reflect participants that developed uterine cancer within the first 10 years after the start of treatment with tamoxifen.
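Because the window is entered in days, 3650 days is slightly shorter than ten calendar years once leap days are counted. The sketch below works through this arithmetic on made-up synthetic dates. The function name is hypothetical, and we assume the window runs forward from a single tamoxifen start date per participant, following the description in this article; the Cohort Builder's actual query may differ.

```python
from datetime import date, timedelta

WINDOW_DAYS = 3650

# Synthetic, hypothetical dates per participant.
tamoxifen_start = {1: date(2010, 1, 1), 2: date(2010, 1, 1), 3: date(2010, 1, 1)}
first_uterine_cancer = {
    1: date(2015, 6, 1),     # well inside the window
    2: date(2019, 12, 31),   # inside ten calendar years, but day 3651
    3: date(2009, 5, 1),     # recorded before tamoxifen started
}


def within_days_cohort(first_event, second_event, x_days):
    """Participants whose first event falls on or up to x_days after the second event."""
    cohort = set()
    for person_id, event_date in first_event.items():
        anchor = second_event.get(person_id)
        if anchor is None:
            continue
        delta = (event_date - anchor).days
        # Assumption: the window only runs forward from the anchor date.
        if 0 <= delta <= x_days:
            cohort.add(person_id)
    return cohort


cohort = within_days_cohort(first_uterine_cancer, tamoxifen_start, WINDOW_DAYS)
window_end = date(2010, 1, 1) + timedelta(days=WINDOW_DAYS)
calendar_decade_days = (date(2020, 1, 1) - date(2010, 1, 1)).days

print(sorted(cohort))        # [1]
print(window_end)            # 2019-12-30
print(calendar_decade_days)  # 3652
```

If records close to the ten-year boundary matter for a study, a value such as 3652 or 3653 days may be closer to ten calendar years, depending on how many leap days the period contains.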
As we did with the prior examples, the last step is creating a cohort by clicking on the ‘create cohort’ button and assigning a name such as ‘uterine_cancer_tamoxifen_cohort’.
Selecting a cohort in the Dataset Builder and exporting to Jupyter Notebook for analysis
After saving our cohorts, they will be available to be used in the Dataset Builder. You can select the cohort you are interested in and add concept sets in order to generate your dataset.
As you can see, all the cohorts generated in the previous steps are available in the Dataset Builder.
Finally, you can export the dataset into a Jupyter notebook using the programming language of your preference. For more details on how to save and import the dataset into a Jupyter notebook, refer to Analyzing your data in Jupyter Notebooks or watch the tutorial video on Dataset Builder & Concept Sets.
The Temporal Feature within the Cohort Builder sets the query parameters to identify applicable person ids based on the inclusion and exclusion criteria you select within the tool. Additional data wrangling and cleaning of the final dataset may be needed.
American Cancer Society. Tamoxifen and Raloxifene for Lowering Breast Cancer (https://www.cancer.org/cancer/types/breast-cancer/risk-and-prevention/tamoxifen-and-raloxifene-for-breast-cancer-prevention.html). Accessed 08/10/2023.
American Cancer Society Journals. Association between tamoxifen treatment and diabetes: a population-based study (https://acsjournals.onlinelibrary.wiley.com/doi/full/10.1002/cncr.26559). Accessed 08/09/2023.
Cleveland Clinic. Tamoxifen (https://my.clevelandclinic.org/health/drugs/9785-tamoxifen). Accessed 08/12/2023.