
HSLS MolBio Workshops

Information & resources for hands-on bioinformatics classes.

About the Program

All of Us Research Program

The All of Us Research Program (AoURP), led by the National Institutes of Health, is a longitudinal cohort study that aims to advance precision medicine and improve human health by partnering with one million or more diverse participants across the United States. The program emphasizes reaching populations that have been historically underrepresented in biomedical research. The AoURP datasets include:

  • Electronic Health Records (EHRs): Participants have the option to share EHR data, which is standardized using the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM).
  • Biosamples and Bioassays: DNA is extracted from blood, urine, or saliva samples for subsequent genomic analysis.
  • Mobile Health Data: Biometric data such as heart rate and blood pressure are tracked using wearable devices.
  • Physical Measurements: At the participant's first appointment, trained All of Us staff members measure and record information such as height, weight, BMI, waist and hip circumferences, blood pressure, heart rate, pregnancy status, and wheelchair use.
  • Surveys: The database features a total of eight surveys, with The Basics, Lifestyle, and Overall Health being the three primary ones. For more information about the additional surveys and to explore their content, please visit the Survey Explorer.
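
To make the OMOP Common Data Model mentioned in the EHR item more concrete, here is a minimal sketch using Python's built-in sqlite3 module and two toy tables. The table and column names (person, measurement, person_id, year_of_birth, measurement_concept_id, value_as_number) follow the OMOP CDM. The rows and the concept ID 9999 are invented for illustration and are not All of Us data, and on the Researcher Workbench the data are queried through the platform's own tools rather than SQLite.

```python
import sqlite3

# Hypothetical placeholder concept ID for illustration, not a real OMOP concept.
TOY_CONCEPT_ID = 9999


def build_toy_cdm():
    """Create an in-memory database with two OMOP-style tables and invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            year_of_birth INTEGER
        );
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            person_id INTEGER REFERENCES person(person_id),
            measurement_concept_id INTEGER,
            value_as_number REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, 1960), (2, 1985), (3, 1990)],
    )
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?, ?)",
        [
            (1, 1, TOY_CONCEPT_ID, 70.0),
            (2, 1, TOY_CONCEPT_ID, 80.0),
            (3, 2, TOY_CONCEPT_ID, 64.0),
            (4, 3, 1234, 5.5),  # a different (also invented) concept
        ],
    )
    return conn


def mean_value_per_person(conn, concept_id):
    """Join person to measurement and average one concept's values per person."""
    return conn.execute(
        """
        SELECT p.person_id, p.year_of_birth, AVG(m.value_as_number)
        FROM person AS p
        JOIN measurement AS m ON m.person_id = p.person_id
        WHERE m.measurement_concept_id = ?
        GROUP BY p.person_id, p.year_of_birth
        ORDER BY p.person_id
        """,
        (concept_id,),
    ).fetchall()


conn = build_toy_cdm()
result = mean_value_per_person(conn, TOY_CONCEPT_ID)
print(result)
```

The point of the sketch is the shape of the model: clinical facts live in standardized tables keyed by person_id and identified by concept IDs, so the same join pattern applies across data sources.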

All of Us Research Hub

The All of Us Research Hub stores health data from a diverse group of participants from across the United States.

  • Data Access Tiers
    • Public Tier: The dataset contains only aggregate data with identifiers removed. These data are available to everyone through Data Snapshots and the Data Browser, an interactive tool on the Research Hub.
    • Registered Tier: The curated dataset contains deidentified individual-level data, available only to approved researchers on the Researcher Workbench. The Registered Tier currently includes data from electronic health records (EHRs), wearables, and surveys, as well as physical measurements taken during participant enrollment.
    • Controlled Tier: The dataset contains genomic data in the form of whole genome sequencing (WGS) and genotyping arrays, previously suppressed demographic data fields from EHRs and surveys, and unshifted dates of events.
  • Data Methods
    • To ensure the Research Hub collects the highest quality data possible, the AoURP employs a comprehensive data methodology to curate data for registered researchers.

Introduction to the Workbench

Researcher Workbench

The Researcher Workbench is a cloud-based platform where researchers can access AoURP-generated data. Its tools support data analysis and collaboration. Researchers create workspaces to access, store, and analyze data for specific research projects. Researchers with R or Python experience can run queries and analyses on the AoU datasets in an integrated, cloud-based Jupyter Notebook environment.

Video: Introduction to the Researcher Workbench (2:35 min)

Video: All of Us Researcher Workbench webinar (43:22 min)

Want to keep up with the latest All of Us news? Sign up for the All of Us Newsletter.

How to Access the Data

The University of Pittsburgh has signed a Data Use and Registration Agreement with the All of Us Research Program, allowing Pitt researchers to apply for Registered and Controlled Tier access. 

Register for Data Access

  1. Sign up to use the Workbench (make an @researchallofus.org account)
  2. Complete two-step verification with your @researchallofus.org account 
  3. Verify your identity with login.gov. Problems can occur at this step even when the information is entered correctly; if they do, contact support@researchallofus.org. Alternative identity verification methods are available if you have a state ID, phone number, SSN, US passport, or e-passport. To use an alternative method, also contact support@researchallofus.org.
  4. Log in to the Researcher Workbench.
  5. Complete the ethics training modules. Pitt users must complete the relevant training in their Workbench profile to access Registered and Controlled Tier data.
  6. To check your access level, click the three-bar menu in the upper left corner, click your name, and then click “Data Access Requirements.”

Video: All of Us Researcher Workbench Onboarding (7:16 min)

Unsure where to start after gaining access to the data?

Follow this Roadmap for Getting Started with All of Us Data.

AoURP Google Cloud Set Up & Costs

Google Cloud Platform (GCP)

Access to the Researcher Workbench and its data is free. Computation and storage accrue usage costs through the Google Cloud Platform (GCP). The All of Us Research Program provides $300 in free credits to each registered Researcher Workbench user to help researchers get started.

Setting up Billing Account for GCP

When a user's credits run low, they should receive a message suggesting that they set up a long-term billing solution. The Credits and Billing Page provides additional information about initial credits and how to set up a billing account.

Billing is controlled at the workspace level, so users must link each workspace to a billing account, either with GCP or with a Google billing partner. Researchers funded by the National Institutes of Health (NIH) are eligible for the STRIDES GCP pricing initiative.

Project computation costs vary, but estimates are available.
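
As a rough illustration of how the $300 in initial credits might be budgeted, the sketch below subtracts compute and storage charges from a starting balance. Only the $300 figure comes from this guide. The hourly and per-GB rates and the usage amounts are hypothetical placeholders, not actual GCP or All of Us prices, so substitute the program's published estimates before relying on the numbers.

```python
INITIAL_CREDITS_USD = 300.0  # initial credits stated in this guide


def remaining_credits(initial_usd, compute_hours, usd_per_compute_hour,
                      storage_gb_months, usd_per_gb_month):
    """Return the credits left after simple compute and storage charges."""
    compute_cost = compute_hours * usd_per_compute_hour
    storage_cost = storage_gb_months * usd_per_gb_month
    return round(initial_usd - compute_cost - storage_cost, 2)


# Hypothetical rates and usage, chosen only to show the arithmetic.
left = remaining_credits(
    INITIAL_CREDITS_USD,
    compute_hours=100,          # assumed notebook runtime
    usd_per_compute_hour=0.20,  # assumed rate, not a real price
    storage_gb_months=500,      # assumed storage usage
    usd_per_gb_month=0.02,      # assumed rate, not a real price
)
print(f"Credits remaining: ${left:.2f}")
```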

Video: How to add a GCP billing account to your workspace (starts at 20 min)

HSLS-DBMI Workshops

Public Tier Access:

Introduction to AoU Researcher Workbench


Registered Tier Access:

Terminology and Data Model Training

  • Introduction to concept sets and their role in observational research with EHR data; ICD, SNOMED, LOINC, genome-related terms, etc.; the data dictionary and relationship to other AoU clinical research tools
  • Feedback

EHR and Survey Data Analysis

  • How to create a project using concept sets, datasets, and workbooks; how to use code snippets to analyze EHR or other data types; going beyond code snippets to accomplish more advanced tasks
  • Feedback

Controlled Tier Access:

Genomics: Potential Impact of PCSK9 on LDL Levels

  • This workshop will explore the impact of the PCSK9 gene on LDL levels, using EHR, survey and genomic data from the All of Us database, and will include data cleaning, filtering genes, and logistic regression analysis, providing a solid foundation for future GWAS.
  • PowerPoint Slides
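
The workshop description above mentions logistic regression. As a generic sketch of that technique (not the workshop's actual code or data), the example below fits a one-predictor logistic regression by gradient descent using only the Python standard library. The variable names carrier and low_ldl and all counts are synthetic and hypothetical. A real analysis would use cleaned All of Us data, additional covariates, and an established statistics library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, n_iter=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


# Synthetic data (invented counts, for illustration only):
# 40 hypothetical variant carriers, 30 of whom have the outcome;
# 60 non-carriers, 20 of whom have the outcome.
carrier = [1] * 40 + [0] * 60
low_ldl = [1] * 30 + [0] * 10 + [1] * 20 + [0] * 40

b0, b1 = fit_logistic(carrier, low_ldl)
odds_ratio = math.exp(b1)
print(f"intercept={b0:.3f}, slope={b1:.3f}, odds ratio={odds_ratio:.2f}")
```

With a single binary predictor, the fitted odds ratio should match the one computed directly from the 2x2 table, here (30/10) / (20/40) = 6, which gives a quick sanity check on the fit.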

Support

Q&A Sessions

Pitt


Baylor College of Medicine

  • Schedule: 1st and 3rd Wednesday and 2nd, 4th, and 5th Friday of each month, at varying times. See the full schedule for official times.
  • Zoom link: bcm.zoom.us/j/94305076343 (No registration required)
  • The first hour is a presentation on a topic that varies by session, followed by a Q&A session. Topics covered include: Introduction to Researcher Workbench, Creating a Cohort Dataset, and Introduction to Workspace Buckets Issues Common to Beginners.

  • Website: bcm.edu/allofuseveningswithgenetics 
  • Email: allofuseveningswithgenetics@bcm.edu 

All of Us

Training Links

AoURP offers New User Orientations that may be useful for new users.

HSLS Self-Paced Learning Module

Applying Probability and Data with R to All of Us Datasets

Videos

Links

Training Videos

User Guide 

Research Scope

Learn how researchers have used AoURP data through the following resources. Start by typing a topic of interest in the search field.

  1. Research Project Directory
  2. All of Us Researcher Convention 2023
  3. Preprints (most of these articles are not yet published or peer-reviewed)
  4. Publications
  5. Clinical Studies (PubMed)
  6. NIH Awarded Grants (NIH Reporter)

Learn how researchers have used UK Biobank data through the following resources. Like AoURP, UK Biobank is a large-scale longitudinal cohort study; it contains in-depth genetic and health information from half a million UK participants. Start by typing a topic of interest in the search field. Researchers could try replicating UK Biobank research findings in AoURP datasets.

  1. Publications
  2. Preprints (most of these articles are not yet published or peer-reviewed; type your topic of interest in the search box)
  3. Approved Projects Directory
  4. GeneBass - a resource of exome-based association statistics. The dataset encompasses 4,529 phenotypes with gene-based and single-variant testing across 394,841 individuals with exome sequence data from the UK Biobank. 

Student Posters:

1. Ivy Baker et al., Mental Health Disparities Between Deaf and Hard of Hearing & Hearing Peers

2. Valerie DeVos et al., Comparing Healthcare Experiences Between Deaf and Hard of Hearing vs Hearing patients During the COVID-19 Pandemic

Additional Resources

Funding Opportunities

Funding opportunities for All of Us research from NIH

Key Publications

  • All of Us Research Program Investigators. The "All of Us" Research Program. N Engl J Med. 2019 Aug 15;381(7):668-676. doi: 10.1056/NEJMsr1809937. PMID: 31412182; PMCID: PMC8291101.

  • Ramirez AH, et al.; All of Us Research Program. The All of Us Research Program: Data quality, utility, and diversity. Patterns (N Y). 2022 Aug 12;3(8):100570. doi: 10.1016/j.patter.2022.100570. PMID: 36033590; PMCID: PMC9403360.