The Problem

Hansard documents

  • Huge
  • Cumbersome
  • Inaccessible

  • How do I summarize or explore them?

  • Who said what and when?
  • What are the hot topics?

The Solution

A tool to statistically process Hansard data

  • Explore “Who” is talking about “What”
    • What has my MP talked about?
  • Date sensitive
    • What happened last week?
    • What are the hot topics this year?

What it does

  • How it works
    • XML Hansard is processed to SQL
    • A set of scripts creates derived statistical data
      • Word counts
      • Distance between commonly used words
    • A publicly visible API provides uniform open access to the data
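
The steps above can be sketched in Python as follows. This is only a minimal illustration, not the project's actual scripts: the XML layout (a `speech` element with `member` and `date` attributes), the table names `speech` and `word_count`, and all function names are assumptions made for the example, and the real Hansard XML schema is more complex.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified XML; the real Hansard schema differs.
SAMPLE_XML = """
<hansard>
  <speech member="Jane Doe" date="2010-03-01">The budget must fund schools and the budget must fund hospitals.</speech>
  <speech member="John Roe" date="2010-03-02">Schools need teachers.</speech>
</hansard>
"""

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def load_speeches(conn, xml_text):
    """Step 1: parse the XML and store one SQL row per speech."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS speech (member TEXT, date TEXT, body TEXT)"
    )
    root = ET.fromstring(xml_text)
    rows = [(s.get("member"), s.get("date"), s.text or "")
            for s in root.iter("speech")]
    conn.executemany("INSERT INTO speech VALUES (?, ?, ?)", rows)
    return len(rows)

def build_word_counts(conn):
    """Step 2: derive per-speech word counts into their own table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_count "
        "(member TEXT, date TEXT, word TEXT, n INTEGER)"
    )
    speeches = conn.execute("SELECT member, date, body FROM speech").fetchall()
    for member, date, body in speeches:
        counts = Counter(tokenize(body))
        conn.executemany(
            "INSERT INTO word_count VALUES (?, ?, ?, ?)",
            [(member, date, word, n) for word, n in counts.items()],
        )

def min_word_distance(body, a, b):
    """Smallest gap, in words, between any occurrence of a and of b.

    Returns None if either word is absent from the text.
    """
    words = tokenize(body)
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)
```

A usage example: open a database with `sqlite3.connect(":memory:")`, call `load_speeches(conn, SAMPLE_XML)` and then `build_word_counts(conn)`. The `word_count` table can then be queried by member, date, or word. The minimum gap is just one possible reading of "distance between commonly used words"; an average gap or a co-occurrence window would serve equally well.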


  • Web Visualization Interface
  • Mobile Interface
  • Processed data available via API


  • Words for MP
    • Words for MP between dates
  • Words for date
  • Words that are close and used frequently
  • Members that relate to word
  • Links from each word back to the public Hansard URL where it was used, so the source of the data can be cross-referenced
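
As a rough illustration of how the queries listed above might be answered, here is a sketch over a single derived table. The `word_count` table, its columns, the function names, and the `example.org` URLs are all hypothetical; the actual API endpoints and storage schema are not described here.

```python
import sqlite3

def make_demo_db():
    """Build a tiny in-memory table of per-speech word counts (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE word_count "
        "(member TEXT, date TEXT, word TEXT, n INTEGER, url TEXT)"
    )
    conn.executemany("INSERT INTO word_count VALUES (?, ?, ?, ?, ?)", [
        ("Jane Doe", "2010-03-01", "budget", 2, "https://example.org/hansard/1"),
        ("Jane Doe", "2010-03-08", "schools", 3, "https://example.org/hansard/2"),
        ("Jane Doe", "2010-04-01", "budget", 1, "https://example.org/hansard/3"),
        ("John Roe", "2010-03-02", "schools", 1, "https://example.org/hansard/4"),
    ])
    return conn

def words_for_member(conn, member, start="0000-00-00", end="9999-99-99"):
    """Words for an MP, optionally between two ISO dates (inclusive)."""
    sql = """SELECT word, SUM(n) AS total FROM word_count
             WHERE member = ? AND date BETWEEN ? AND ?
             GROUP BY word ORDER BY total DESC, word"""
    return conn.execute(sql, (member, start, end)).fetchall()

def words_for_date(conn, date):
    """Words used on a single date, most frequent first."""
    sql = """SELECT word, SUM(n) AS total FROM word_count
             WHERE date = ? GROUP BY word ORDER BY total DESC, word"""
    return conn.execute(sql, (date,)).fetchall()

def members_for_word(conn, word):
    """Members who used a word, ranked by how often they used it."""
    sql = """SELECT member, SUM(n) AS total FROM word_count
             WHERE word = ? GROUP BY member ORDER BY total DESC, member"""
    return conn.execute(sql, (word,)).fetchall()

def sources_for_word(conn, word):
    """Source URLs where a word was used, oldest first, for cross-referencing."""
    sql = "SELECT url FROM word_count WHERE word = ? ORDER BY date"
    return [row[0] for row in conn.execute(sql, (word,))]
```

Dates are stored as ISO `YYYY-MM-DD` strings so that plain string comparison in `BETWEEN` matches chronological order. For example, `words_for_member(conn, "Jane Doe", "2010-03-01", "2010-03-31")` restricts the counts to March 2010.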

Ideas for taking this project forward

  • Daily/weekly digests of what an MP has said/done
  • Personalized alerts
  • Statistics openly available for creative visualizations
  • Historical statistical exploration
  • Hands-on exploration of 1950s or even 1800s data

Estimated costs for taking this project forward

About the data used for this project

The data officially provided by Parliament at

is poor: neither the pagination nor the names are consistent.

A cleaner version is available at:

where the community has processed the data and cleaned it up properly.

Developers: Greg Miell, Allan Callaghan, Greg Mackelden, Chris (Shish) Girling

Project URL:

Created at

Rewired State: Parliament

Flickr Photos

Add photos by tagging them with rewiredstate:project="parliamentary"