Overview of Exploring the Mind Data in the Researcher Workbench

Overview

All of Us participants who provide primary consent to be part of the All of Us Research Program and complete the All of Us core surveys (the Basics, Overall Health, and Lifestyle surveys) are invited to complete Exploring the Mind (EtM) tasks.

The four tasks, which were developed through a partnership with the Many Brains Project and their TestMyBrain platform, are computer-based tasks that measure various aspects of cognition through remote, unsupervised administration. As of February 2024, the data collected from these tasks are accessible in the All of Us Researcher Workbench.

The EtM dataset includes four tasks:

  • Gradual-onset continuous performance task (gradCPT): City or Mountain
  • Delay discounting task: Now or Later
  • Flanker task: Left or Right
  • Multiracial facial emotion recognition task: Guess the Emotion

Data format and details

Methodology

Since September 2023, the All of Us Research Program has invited eligible participants to take part in the Exploring the Mind (EtM) activity series, which focuses on measuring cognitive control, sustained attention, social facial recognition, and reward valuation.

The tasks included in the series were selected in collaboration with the National Institute of Mental Health (NIMH) using the Research Domain Criteria (RDoC) framework (Cuthbert, 2022; Cuthbert & Kozak, 2013; National Institute of Mental Health, 2023).

After All of Us participants complete the Basics survey, the Lifestyle survey, and the Overall Health survey, participants are invited to complete four game-like cognitive tests, referred to as the Exploring the Mind tasks, in the All of Us participant portal.

Participants can complete any of the four tasks in any order they choose (see Figure 1), on a device of their choosing (desktop, laptop, smartphone, or tablet). Basic metadata about the device type and response method used (e.g., mouse click, touchscreen) are recorded and available in the Researcher Workbench. For additional details on the structure of the cognitive tests, read Introduction to Cognitive Testing Data in the All of Us Research Program.

Figure 1: Exploring the Mind task gallery within the All of Us participant portal.

NIMH partnered with the All of Us Research Program to develop four EtM tasks: City or Mountain, Now or Later, Left or Right, and Guess the Emotion.

City or Mountain

The City or Mountain task, a gradual onset continuous performance task (gradCPT), tests how fast participants respond—or how well participants resist responding—to a changing scene. Participants are asked to press a response key only when they see an image of a city while the image fades between city and mountain scenes. This type of task is used to understand cognitive systems such as attention, cognitive control, and the ability to respond to only one type of information while ignoring others. For more information, read Overview of the Gradual Onset Continuous Performance Task (City or Mountain).

Now or Later

The Now or Later task, a delay discounting task, measures a participant’s individual level of temporal discounting or how long they are willing to wait for a certain reward. Participants are presented with different pretend scenarios where they are offered a certain amount of money after specific waiting times. The amount of money and waiting time change in each scenario. Participants decide whether they would like a smaller monetary reward in a short time frame or a larger monetary reward in a longer time frame. This type of task is used to understand how people assess the value of rewards. For more information, read Overview of the Delay Discounting Task (Now or Later).

Left or Right

The Left or Right task, a Flanker test, measures how well participants can focus in distracting environments. Participants are presented with a screen showing five arrows: one center arrow between two sets of side arrows. During the task, the five arrows change direction, and participants must indicate the direction of only the middle arrow while ignoring the direction of the side arrows. This type of task is used to measure attention and response inhibition, or the ability to stop a response that results from inappropriate information (in this case, the side arrows). For more information, read Overview of the Flanker Attention Task (Left or Right).

Guess the Emotion

The Guess the Emotion task, a version of a Facial Emotional Recognition task, asks participants to look at pictures of people and name the emotions based on their facial expressions. This type of task is used to learn about social processes, social communication, and how people receive information communicated through facial expressions. For more information, read Overview of the Emotional Recognition Task (Guess the Emotion).

Read additional information about the four tasks in the NIMH’s news release.

Categories of cognitive test data available in the Researcher Workbench

The cognitive test data are available in three categories for each task: trial-level data, summary scores (outcomes), and metadata.

Each of these data categories may contain multiple data types (e.g., numeric values, text strings, boolean values). Data from each task are available at both outcome and trial levels. Metadata for each task are also available and provide insight into the platform (e.g., device type) that participants used to complete the task.

To learn more about the EtM data structure and data elements available, see the Data Dictionaries. For considerations on how and when to use different data categories, read Introduction to Cognitive Testing Data in the All of Us Research Program.

Trial-level data

Trial-level data describe the characteristics of both (1) the stimulus the participant observed on each trial and (2) the participant’s response on each trial.

Summary scores

At the end of a cognitive test, data obtained from the participant’s responses to each individual trial (trial-level data) are aggregated to produce summary scores (sometimes also referred to as outcomes).

In contrast to individual trial data, each summary score reflects performance across a combination of trials, often the entire test. Example summary scores may include (1) the proportion of trials where the participant made the correct response and (2) the participant’s average reaction time across all trials.

Summary scores can be computed from the trial-level data, but are nevertheless provided separately to allow you immediate access to scores that reflect overall performance on the test. Performance on practice trials is never included in the calculation of summary scores.
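As a minimal sketch of how summary scores relate to trial-level data, the function below aggregates a list of trial records into a proportion correct and a mean reaction time, skipping practice trials. The field names `is_practice`, `correct`, and `rt_ms` are hypothetical placeholders rather than actual EtM field names (see the Data Dictionaries for the real ones), and this is not the program's own scoring code.

```python
from statistics import mean

def summarize_trials(trials):
    """Aggregate trial-level records into two example summary scores.

    Each trial is a dict with hypothetical keys:
    'is_practice' (bool), 'correct' (bool), 'rt_ms' (float or None).
    """
    # Practice trials never contribute to summary scores.
    scored = [t for t in trials if not t["is_practice"]]
    if not scored:
        return {"n_trials": 0, "prop_correct": None, "mean_rt_ms": None}
    # Trials without a response (rt_ms is None) are left out of the RT average.
    rts = [t["rt_ms"] for t in scored if t["rt_ms"] is not None]
    return {
        "n_trials": len(scored),
        "prop_correct": sum(t["correct"] for t in scored) / len(scored),
        "mean_rt_ms": mean(rts) if rts else None,
    }

# Made-up trials for illustration only.
example_trials = [
    {"is_practice": True, "correct": False, "rt_ms": 900.0},
    {"is_practice": False, "correct": True, "rt_ms": 500.0},
    {"is_practice": False, "correct": True, "rt_ms": 700.0},
    {"is_practice": False, "correct": False, "rt_ms": None},
    {"is_practice": False, "correct": True, "rt_ms": 600.0},
]
scores = summarize_trials(example_trials)
```

Recomputing a score this way can be a useful check when you apply your own trial exclusions, since the provided summary scores reflect the full set of non-practice trials.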

Metadata

Metadata refers to data that contextualizes the data generated from a participant’s responses during a cognitive test but is not directly generated from those responses.

Metadata may include information about the test session (e.g., start time and language of test administration) and characteristics of the participant’s testing device (e.g., screen width, operating system).

Data quality

Data quality assessments on reliability and quality control flags are available in the Introduction to Cognitive Testing Data in the All of Us Research Program.

While there is no definitive benchmark for what constitutes acceptable reliability, the documentation for each EtM task suggests ways to compute reliability from trial-level data. To further assist researchers, quality control flags are included in both trial-level data and summary score data. The quality control flags are unique to each task’s design, and they are described in detail in each of the individual cognitive task supplemental documents.
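One common way to estimate reliability from trial-level data is an odd/even split-half correlation with the Spearman-Brown correction. The sketch below illustrates that general technique only; it is an assumption that it suits a given EtM task, and the task-specific documents may recommend a different approach. The input layout (a dict of per-trial values keyed by `sitting_id`) is hypothetical.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def split_half_reliability(values_by_sitting):
    """Odd/even split-half reliability with Spearman-Brown correction.

    values_by_sitting maps a sitting_id to that sitting's ordered list of
    per-trial values (for example reaction times); each list needs at
    least two trials, and at least two sittings must differ.
    """
    odd_means, even_means = [], []
    for values in values_by_sitting.values():
        odd_means.append(mean(values[0::2]))   # 1st, 3rd, 5th, ... trials
        even_means.append(mean(values[1::2]))  # 2nd, 4th, 6th, ... trials
    r = pearson_r(odd_means, even_means)
    # Spearman-Brown: project the half-test correlation to full test length.
    return 2 * r / (1 + r)

# Made-up reaction times for three sittings.
example = {
    "s1": [500, 520, 510, 530],
    "s2": [600, 640, 620, 660],
    "s3": [700, 690, 720, 710],
}
reliability = split_half_reliability(example)
```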

The quality control criteria provided for each test are intended to capture extreme deviations from what is typically seen in participants performing the tasks in a valid manner. You must use your own judgment when determining whether flagged participants (or trials) should be excluded from analyses. You may also consider implementing your own quality control criteria separately from these recommendations.
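As a rough sketch of applying such exclusions, the function below separates rows with any raised flag from rows with none, so that analyses can be run with and without the flagged data. The flag names `flag_too_fast` and `flag_low_accuracy` are invented for illustration; the actual flags differ by task and are listed in the task-specific documents.

```python
def split_by_flags(rows, flag_fields):
    """Separate rows with any raised quality control flag from clean rows.

    rows is a list of dicts; flag_fields names the boolean flag columns
    to check. A missing or None flag is treated as not raised.
    """
    kept, flagged = [], []
    for row in rows:
        if any(row.get(field) for field in flag_fields):
            flagged.append(row)
        else:
            kept.append(row)
    return kept, flagged

# Made-up sittings with hypothetical flag columns.
example_rows = [
    {"sitting_id": "A", "flag_too_fast": False, "flag_low_accuracy": False},
    {"sitting_id": "B", "flag_too_fast": True, "flag_low_accuracy": False},
    {"sitting_id": "C", "flag_too_fast": False, "flag_low_accuracy": None},
]
kept, flagged = split_by_flags(example_rows, ["flag_too_fast", "flag_low_accuracy"])
```

Keeping the flagged rows in a separate list, rather than discarding them, makes it easier to report how many sittings or trials were excluded and to check whether the exclusion changes the results.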

Querying Exploring the Mind (EtM) data

The following queries are examples of how to use the Exploring the Mind (EtM) dataset within the All of Us Researcher Workbench. Currently, EtM data are available in the Registered and Controlled Tiers but are not compatible with the Cohort Builder or Dataset Builder.

Therefore, the EtM data must be queried directly from a Jupyter Notebook using SQL and, depending on the data types your research requires, merged with the current Curated Data Repository (CDR).

To make the data easier to use in the Researcher Workbench, the sample queries below show how to access the EtM data and merge them with CDRv8.

Setup

This block shows the information required to set up a Jupyter Notebook and connect to the Exploring the Mind (EtM) dataset.

# MODULES and CLIENT

from google.cloud import bigquery
import os
import numpy as np
import pandas as pd
pd.set_option('display.max_colwidth', 400)  # widen displayed columns to 400 characters
import datetime

# TARGET DATA
etm_dataset = 'fc-aou-cdr-prod-ct.C_V8_R2_offcycle_etm'
v8_dataset = os.getenv('WORKSPACE_CDR')

# if Registered Tier, this should be:
# etm_dataset = 'fc-aou-cdr-prod.R_V8_R2_offcycle_etm'
# v8_dataset = os.getenv('WORKSPACE_CDR')
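As a small convenience sketch, the helper below builds a fully qualified table name from a tier and a task table. The two dataset names and the four table names are the ones used in this article; the function name `etm_table` and the tier labels `"controlled"` and `"registered"` are hypothetical choices made for this example.

```python
# Dataset names as given in the setup block above, keyed by access tier.
ETM_DATASETS = {
    "controlled": "fc-aou-cdr-prod-ct.C_V8_R2_offcycle_etm",
    "registered": "fc-aou-cdr-prod.R_V8_R2_offcycle_etm",
}
# One table per EtM task.
ETM_TABLES = ("delaydiscounting", "emorecog", "flanker", "gradcpt")

def etm_table(tier, task_table):
    """Return '<dataset>.<table>' for the given tier and EtM task table."""
    if tier not in ETM_DATASETS:
        raise ValueError(f"Unknown tier: {tier!r}")
    if task_table not in ETM_TABLES:
        raise ValueError(f"Unknown EtM table: {task_table!r}")
    return f"{ETM_DATASETS[tier]}.{task_table}"

flanker_table = etm_table("controlled", "flanker")
```

Validating the names up front turns a typo into an immediate Python error instead of a failed query.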

Example queries

This query shows the number of participants with Exploring the Mind (EtM) data in CDRv8. There are 36,058 distinct person_ids.

# pids that have ETM data

query=f"""
        SELECT DISTINCT person_id
        FROM (
            SELECT DISTINCT person_id
            FROM `{etm_dataset}.delaydiscounting`
            UNION DISTINCT
            SELECT DISTINCT person_id
            FROM `{etm_dataset}.emorecog`
            UNION DISTINCT
            SELECT DISTINCT person_id
            FROM `{etm_dataset}.flanker`
            UNION DISTINCT
            SELECT DISTINCT person_id
            FROM `{etm_dataset}.gradcpt`) pids
        """
etm_pids = pd.read_gbq(query, dialect = "standard")
print(f"N participants in ETM data: {len(etm_pids)}")
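A variation on the query above is to count participants separately for each task. The sketch below only builds the SQL string, so it runs without a BigQuery connection; the function name `build_task_count_query` is hypothetical, and the resulting query has not been run against the EtM dataset here.

```python
ETM_TABLES = ("delaydiscounting", "emorecog", "flanker", "gradcpt")

def build_task_count_query(etm_dataset, tables=ETM_TABLES):
    """Build a query returning one row per task with its participant count."""
    selects = [
        f"SELECT '{table}' AS task, COUNT(DISTINCT person_id) AS n_participants\n"
        f"FROM `{etm_dataset}.{table}`"
        for table in tables
    ]
    # UNION ALL stacks the per-task rows; each task appears exactly once.
    return "\nUNION ALL\n".join(selects)

query = build_task_count_query("fc-aou-cdr-prod-ct.C_V8_R2_offcycle_etm")
# In the Workbench, pass the string to the same call used above:
# task_counts = pd.read_gbq(query, dialect="standard")
```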

Example query 1: Retrieving outcomes data from an EtM table

This is an example query of how to retrieve outcomes data from Delay Discounting (one of the EtM tasks).

# retrieving outcomes data from an ETM table
# there are 4 tables in the ETM data: 1 for each of the 4 tasks
# each table contains five fields: sitting_id, person_id, src_id, outcomes (task-level), and trial_data (trial-level)
# this first example retrieves task-level outcomes for the Delay Discounting task

query=f"""
        SELECT sitting_id
            , person_id
            , src_id
            , outcomes.*
        FROM `{etm_dataset}.delaydiscounting`
        LIMIT 10
        """
dd_outcomes = pd.read_gbq(query, dialect = "standard")
dd_outcomes.head()

# the `outcomes` column is a Record type field that contains many sub-fields inside it
# to access those sub-fields, we use the `outcomes` column in our Select clause, then use dot notation to access the sub-fields inside it

Example query 2: Accessing sub-fields of outcomes data

This is an example query of how to retrieve specific sub-fields of the task-level outcomes data for Delay Discounting (one of the EtM tasks).

# the previous example shows how to access all of the sub-fields: `outcomes.*`
# the following example shows how to access specific sub-fields

query=f"""
        SELECT sitting_id
            , person_id
            , src_id
            , outcomes.mean_rt AS mean_reaction_time
            , outcomes.median_rt AS median_reaction_time
            , outcomes.sd_rt AS standard_dev_reaction_time
        FROM `{etm_dataset}.delaydiscounting`
        LIMIT 10
        """
dd_outcomes_sub = pd.read_gbq(query, dialect = "standard")
dd_outcomes_sub.head()

Example query 3: Retrieving trial-level data from an EtM table

This is an example query of how to access trial-level data from Delay Discounting (one of the EtM tasks).

# retrieving trial-level data from an ETM table
# the previous examples showed how to access task-level data from `delaydiscounting`
# this example shows how to access trial-level data from `delaydiscounting`

query=f"""
        SELECT sitting_id
            , person_id
            , src_id
            , trial_data.*
        FROM `{etm_dataset}.delaydiscounting`,
            UNNEST(trial_data) AS trial_data
        LIMIT 10
        """
dd_trials = pd.read_gbq(query, dialect = "standard")
dd_trials.head()

# the `trial_data` column is a Record type field that contains many sub-fields inside it
# it differs slightly in structure from the `outcomes` column in that it requires "unnesting"
# to unnest the trial_data field, we add `, UNNEST(trial_data) AS trial_data` to our From clause, making the `trial_data` field available for our Select clause
# to access the sub-fields within `trial_data`, we use dot notation in the Select clause
# as in the `outcomes` examples, we can use `trial_data.*` to retrieve all sub-fields or specify specific sub-fields (e.g., `trial_data.trial_type`)
# note that the example uses a Limit clause: the unnested data for this view contains over 700,000 rows, so the data should be filtered or aggregated
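As one way to aggregate instead of limiting, the sketch below builds a query that collapses the unnested trials back to one row per sitting, with a trial count and the date of the first trial. It reuses the `UNNEST` pattern and the `trial_timestamp` sub-field that appear in this article's Delay Discounting examples; whether the other task tables expose the same sub-field is an assumption, and the function name is hypothetical.

```python
def build_trial_count_query(etm_dataset, table="delaydiscounting"):
    """Build a query that counts trials per sitting instead of listing them."""
    return f"""
        SELECT sitting_id
            , person_id
            , COUNT(*) AS n_trials
            , MIN(DATE(trial_data.trial_timestamp)) AS first_trial_date
        FROM `{etm_dataset}.{table}`,
            UNNEST(trial_data) AS trial_data
        GROUP BY sitting_id, person_id
        """

query = build_trial_count_query("fc-aou-cdr-prod-ct.C_V8_R2_offcycle_etm")
# In the Workbench: dd_trial_counts = pd.read_gbq(query, dialect="standard")
```

Because the aggregation happens in BigQuery, only one row per sitting is transferred to the notebook rather than every trial.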

Example query 4: Combining task and trial data for an EtM task

This is an example query of how to combine task-level and trial-level data for the Delay Discounting task (one of the EtM tasks).

# retrieving task and trial data for an ETM task
# this example shows how to combine task-level and trial-level data for the Delay Discounting task

query=f"""
        SELECT DISTINCT sitting_id
            , person_id
            , src_id
            , outcomes.score
            , outcomes.any_timeouts
            , trial_data.trial_id
            , trial_data.response
        FROM `{etm_dataset}.delaydiscounting`, UNNEST(trial_data) AS trial_data
        WHERE DATE(trial_data.trial_timestamp) = '2023-09-01'
        """
dd_trials_outcomes = pd.read_gbq(query, dialect = "standard")
print(f"Number of rows in result: {len(dd_trials_outcomes)}")
dd_trials_outcomes.head()

# note that even when filtering to 1 day of results, there are still over 17,000 rows of data
# the outcomes will be consistent for every record with the same `sitting_id` and the trial_data will change for each row
# a person_id can have multiple `sitting_id` if they participated multiple times, so that field joins the outcomes and trial data
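Because a participant can have more than one sitting, some analyses may want only one sitting per person. The sketch below keeps the rows belonging to each participant's earliest sitting. It assumes each row carries a `trial_date` string in YYYY-MM-DD form, which the query above does not select (it could be added as `DATE(trial_data.trial_timestamp) AS trial_date`), and that a `sitting_id` belongs to a single participant. Choosing the earliest sitting is only one possible rule.

```python
def first_sitting_per_person(rows):
    """Keep, for each person_id, only the rows of their earliest sitting."""
    earliest = {}  # person_id -> (earliest trial_date, sitting_id)
    for row in rows:
        pid = row["person_id"]
        # ISO date strings sort chronologically; sitting_id breaks ties.
        candidate = (row["trial_date"], row["sitting_id"])
        if pid not in earliest or candidate < earliest[pid]:
            earliest[pid] = candidate
    return [r for r in rows if earliest[r["person_id"]][1] == r["sitting_id"]]

# Made-up rows: person 1 has two sittings, person 2 has one.
example_rows = [
    {"person_id": 1, "sitting_id": "A", "trial_date": "2023-09-01", "trial_id": 1},
    {"person_id": 1, "sitting_id": "A", "trial_date": "2023-09-01", "trial_id": 2},
    {"person_id": 1, "sitting_id": "B", "trial_date": "2023-10-15", "trial_id": 1},
    {"person_id": 2, "sitting_id": "C", "trial_date": "2023-09-20", "trial_id": 1},
]
first_rows = first_sitting_per_person(example_rows)
```

A query result held in a DataFrame can be converted to this list-of-dicts form with `to_dict("records")`.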

Example query 5: EtM task data with All of Us ‘gender identity’ and survey response

This is an example query of how to combine EtM data (here, Delay Discounting task and trial data) with demographic information (gender identity) and a survey response (Including yourself, who in your family has had anxiety reaction/panic disorder) from CDRv8.

# combining task data from an ETM task with other v8 data
# this example shows how to combine data for the Delay Discounting task with other CDR v8 data types
# the query retrieves: 1. the score and median reaction time for the task from the task-level outcomes,
# 2. the date each trial was taken from the trial-level data,
# 3. the participant's gender identity from the person table,
# 4. the participant's survey response to whether they or a family member have an anxiety/panic disorder

query=f"""
        SELECT DISTINCT dd.sitting_id
            , dd.person_id
            , outcomes.score
            , outcomes.median_rt AS median_reaction_time
            , DATE(trial_data.trial_timestamp) AS trial_date
            , p.gender_source_value AS gender_identity
            , s.answer AS anxiety_answer
            , s.answer_concept_id AS anxiety_answer_concept_id
        FROM `{etm_dataset}.delaydiscounting` dd,
            UNNEST(trial_data) AS trial_data
        JOIN `{v8_dataset}.person` p ON dd.person_id = p.person_id
        JOIN `{v8_dataset}.ds_survey` s ON dd.person_id = s.person_id
        WHERE DATE(trial_data.trial_timestamp) > '2023-09-01'
            AND s.answer LIKE 'Including yourself, who in your family has had anxiety reaction/panic disorder? - %'
            -- filters to participants who responded to a health history survey that they or a family member have an anxiety reaction/panic disorder
        """
dd_v8 = pd.read_gbq(query, dialect = "standard")
dd_v8.head()
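The `LIKE` filter above implies that each survey answer is stored as the question text followed by ` - ` and the selected response. As a rough sketch of a follow-up step, the code below extracts the response part and summarizes median reaction time by response. The response labels "Self" and "Mother" and the numbers are made up for illustration, and the sketch assumes the rows have already been reduced to one per sitting and answer.

```python
from collections import defaultdict
from statistics import median

def answer_label(answer):
    """Return the text after the last ' - ' separator in a survey answer."""
    return answer.rpartition(" - ")[2].strip()

def median_rt_by_answer(rows):
    """Median of per-sitting median reaction times, grouped by answer label."""
    groups = defaultdict(list)
    for row in rows:
        label = answer_label(row["anxiety_answer"])
        groups[label].append(row["median_reaction_time"])
    return {label: median(values) for label, values in groups.items()}

QUESTION = ("Including yourself, who in your family has had "
            "anxiety reaction/panic disorder?")
# Made-up rows shaped like the query result above.
example_rows = [
    {"anxiety_answer": f"{QUESTION} - Self", "median_reaction_time": 1000},
    {"anxiety_answer": f"{QUESTION} - Self", "median_reaction_time": 2000},
    {"anxiety_answer": f"{QUESTION} - Mother", "median_reaction_time": 1200},
]
summary = median_rt_by_answer(example_rows)
```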

Additional resources

The following resources provide additional information about the cognitive testing data with key details and methodology for each of the EtM tasks.

National Institute of Mental Health. (2024, January 9). Using Games to Explore the Mind [Press release]. https://www.nimh.nih.gov/news/science-news/2024/using-games-to-explore-the-mind

National Institute of Mental Health. (2024). Research Domain Criteria (RDoC). https://www.nimh.nih.gov/research/research-funded-by-nimh/rdoc (accessed Sep 10, 2024).
