
Exploring and Cleaning Data with OpenRefine

What is OpenRefine?

  • OpenRefine is a powerful, free, open-source tool for working with messy tabular data.
  • It is a desktop application that uses your web browser as a graphical interface.
  • OpenRefine does not require internet access to run its basic functions. It needs a connection only to import data from the web, reconcile data using a web service, or export data to the web.
  • None of the data or commands you enter in OpenRefine are sent to a remote server.

Pre-workshop Preparation

1. Download and Install OpenRefine


2. Run OpenRefine to Ensure That It's Working Properly


  • On Windows: Navigate to the folder where you’ve installed OpenRefine and double-click ‘openrefine.exe’ or ‘refine.bat’. A terminal window will appear. Keep it open in the background so that OpenRefine continues to run.
  • On Mac: Navigate to your Applications folder and click the OpenRefine icon.

Note: OpenRefine's interface is accessed through a web browser. When you run OpenRefine, a window should open in your computer's default web browser at the address http://127.0.0.1:3333/. If this doesn’t happen automatically, open a web browser and type the address in.
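
If the browser window does not appear and you are unsure whether OpenRefine actually started, a short script can check whether anything is answering at that address. The following is a minimal sketch in Python, not part of OpenRefine itself. The function name is_openrefine_running is hypothetical, and the script assumes the default address given above. It only confirms that some local service replies with HTTP 200; it cannot tell whether that service is OpenRefine.

```python
import http.client
import urllib.request

# Default address from the note above; change it if your setup differs.
OPENREFINE_URL = "http://127.0.0.1:3333/"


def is_openrefine_running(url=OPENREFINE_URL, timeout=3.0):
    """Return True if a service answers with HTTP 200 at the given address.

    Hypothetical helper for illustration only.
    """
    # Bypass any system proxy settings, since the address is on this computer.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, and HTTP error responses all land here.
        return False


if __name__ == "__main__":
    if is_openrefine_running():
        print("A service is responding at", OPENREFINE_URL)
    else:
        print("No response at", OPENREFINE_URL)
        print("Check that OpenRefine (and its terminal window on Windows) is still open.")
```

If the script reports no response, start OpenRefine again as described in step 2 and re-run the check before opening the address in your browser.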

3. Download the Sample Dataset


Note: The dataset is a subset of: Cincinnati (Ohio). Health Dept. (2012): Cincinnati Birth and Death Records, 1865-1912: source data. http://hdl.handle.net/2374.UC/752228