
HSLS MolBio Workshops

Information & resources for hands-on bioinformatics classes.


This page contains resources (PowerPoint slides, lecture videos, datasets for software demonstrations, etc.) for the workshops on gene expression data mining and pathway enrichment analysis offered by the HSLS MolBio Information Service.

Part 1 (GEO Data Mining) teaches how to retrieve a list of differentially expressed genes (DEG) associated with a gene expression study (RNA-seq / microarray) by searching the Gene Expression Omnibus (GEO) database using BioJupies (RNA-seq), GREIN (RNA-seq), and GEO2R (microarray).

Part 2 (Enrichment Analysis with g:Profiler and GSEA) focuses on uncovering the biology behind the extracted list of differentially expressed genes by searching publicly available pathway enrichment analysis resources, including Gene Ontology (GO), Molecular Signatures Database (MSigDB), Reactome, PANTHER, KEGG, Pathway Commons, and WikiPathways, using GSEA and g:Profiler.

Part 3 (Cytoscape - Data Visualization) covers the visualization of gene expression data using Cytoscape. We cover how to upload gene lists to publicly available protein interaction databases such as StringDB, retrieve the relevant interaction networks, import them into Cytoscape, and analyze gene ontology terms and pathways. We also cover how to generate enrichment maps in Cytoscape from GSEA and g:Profiler results.


Target Audience

Experimental biologists seeking to analyze gene lists generated by omics experiments. The software covered in the workshop operates through a user-friendly, point-and-click graphical user interface, so neither programming experience nor familiarity with the command line is required.

Workshop Materials


Lecture Videos

Data Download


Differentially Expressed Genes (DEG)

  • DEG_biojupies (.xlsx) - human genes differentially expressed in AI-10-49 treatment vs. DMSO (GSE101788, RNA-seq), generated by BioJupies
  • DEG_geo2r (.xlsx) - DEG calculated by GEO2R using the GSE11352 dataset (microarray); filter used: adj.P.Val < 0.1
  • GSE147507_handson_GREIN - a DEG list from SARS-CoV-2 infected vs. mock-infected Calu3 cells, generated by GREIN
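
For readers who later want to reproduce the adjusted p-value filter outside the point-and-click tools, the following is a minimal sketch, not part of the workshop itself. It assumes a tab-delimited table with an `adj.P.Val` column and a `Gene.symbol` column; both column names, the helper `filter_deg`, and the sample rows are illustrative assumptions, and the rows are made up rather than taken from GSE11352.

```python
import csv
import io

# Made-up rows for illustration only; not taken from GSE11352.
# Column names are assumed and may differ in a real export.
SAMPLE_TSV = (
    "ID\tadj.P.Val\tlogFC\tGene.symbol\n"
    "probe_1\t0.004\t1.8\tGENE_A\n"
    "probe_2\t0.250\t0.3\tGENE_B\n"
    "probe_3\t0.090\t-1.1\tGENE_C\n"
    "probe_4\tNA\t0.0\tGENE_D\n"
)


def filter_deg(handle, column="adj.P.Val", cutoff=0.1):
    """Return rows whose adjusted p-value is strictly below the cutoff."""
    kept = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            adj_p = float(row[column])
        except (KeyError, ValueError):
            continue  # skip rows with a missing or non-numeric value
        if adj_p < cutoff:
            kept.append(row)
    return kept


deg = filter_deg(io.StringIO(SAMPLE_TSV))
symbols = [row["Gene.symbol"] for row in deg]
print(symbols)
```

To apply the same filter to a real file, pass an open file handle instead of the `io.StringIO` wrapper and adjust `column` to match the header in that file.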

g:Profiler




Software Download

  • GSEA Software - gene set enrichment analysis software
  • Cytoscape - network data integration, analysis, and visualization tool
  • BioRender - tool for creating scientific illustrations

Databases & Tools

GEO Data Mining

  • NCBI GEO - a public functional genomics data repository supporting MIAME-compliant data submissions.
  • Search Engine for Gene Expression Data
  • BioJupies - BioJupies Automatically Generates RNA-seq Data Analysis Notebooks.
  • GREIN - GREIN is an interactive web platform that provides user-friendly options to explore and analyze GEO RNA-seq data.
  • GEO2R - Use GEO2R to compare two or more groups of samples in order to identify genes that are differentially expressed across experimental conditions.
  • Correlation Engine* - Find pre-analyzed results from GEO hosted RNA-seq and microarray experiments.
  • CistromeDB - Retrieve pre-analyzed results generated from GEO hosted ChIP-Seq experiments.
  • CistromeDB Toolkit - Find factors that have a significant binding overlap with your ChIP-seq peak set.
  • RaNA-Seq - An open bioinformatics tool for the quick analysis of RNA-Seq data.
  • GSE101788 (RNA-Seq) - CBFb-SMMHC inhibition triggers apoptosis by disrupting MYC chromatin dynamics in acute myeloid leukemia [RNA-seq]
  • GSE11352 (Microarray) - Timecourse of estradiol (10nM) exposure in MCF7 breast cancer cells.
  • GSE101789 (ChIP-Seq) - CBFb-SMMHC inhibition triggers apoptosis by disrupting MYC chromatin dynamics in acute myeloid leukemia
  • GSE147507 (RNA-seq) - GEO dataset (SARS-CoV-2 vs. mock-infected Calu3 cells) for the hands-on practice exercise
  • BioRender - tool for creating scientific illustrations

Enrichment Analysis

Feedback & Support