An Online Platform for Exploring the Canadian Parliamentary Debates from 1901 to Present
The Official Report of Debates, or "Hansard," is a detailed transcription of the discussions in the Canadian House of Commons. Spanning more than a century and a half, it touches upon almost every issue that has moved public opinion since Confederation. As a crucial part of Canada's heritage, the "verbatim" record of the parliamentary debates is relevant to researchers, activists, and journalists alike.
The sheer size of the Hansard has posed a challenge, however. It contains over 650 million words, plus translations of the text into the other official language, which has made querying its contents a significant needle-in-a-haystack problem.
Over the past decade, advances in machine intelligence and computing power have opened up opportunities to explore and navigate such huge amounts of text data. Digitization of the parliamentary record started in the 1990s, when the Library of Parliament began publishing the latest debates online. In 2013, Canadiana and the Library of Parliament partnered to scan the complete historical proceedings from 1867 to 1999 and release them as a searchable digital archive (the Canadian Parliamentary Historical Resources).
In 2013, a group of political scientists, computer scientists, and historians at the University of Toronto set out to structure and enrich this data by adding semantic annotations. They began by identifying the building blocks of the debates (agenda points and speeches) and associating speeches with biographical data.
To make the data accessible to as wide an audience as possible, this group created Lipad.ca, an online platform that allows users to query and explore the enriched Hansard from 1901 to present. The interface was designed to be modern, accessible, and user-friendly for both researchers and the general public.
Future development planned for the site includes interactive data visualization apps, expansion of the database to include French-language, Senate, and provincial-level debates, and aggregation of supplementary Canadian political data such as polling results and news.
The social impact of this Linked Data project can be significant. The release of substantial quantities of digitized political texts opens the door to scientific research in a number of disciplines, for Canadian and international scholars alike.