Following a BBC review of local journalism, which looked at how the corporation could contribute to local news reporting, including by helping hyperlocals, the first of a series of Masterclasses was held at BBC Birmingham on 20th January. We were invited as the team behind the Cardiff-based hyperlocal, RoathCardiff.
An excellent range of speakers shared their expertise and experience with an eager audience of hyperlocal bloggers and journalists from local and regional titles: Bella Hurrell, from the BBC’s data visualisation unit; David Ottewell, Head of Data Journalism (Regionals) at Trinity Mirror Group; Stephen Rosenthal, Director of Comms and Public Affairs at Google; and Dr Tony Hirst from the Department of Communications and Systems at the Open University.
Tony Hirst’s hands-on session on the open source data cleaning tool OpenRefine (formerly Google Refine) showed how useful it could be to journalists who want to compare data from different sources and in different formats, and then drill down into it in detail, revealing previously hidden gems of great interest to local communities.
Demonstrating the tool using spending data from Birmingham City Council (such data is available online for all local authorities, but not necessarily in a usable format), Tony showed how the data could be cleaned up, rationalised, and interrogated in many different ways. The tool was new to most of the day’s participants and, while extremely useful, it would take a bit of practice to get to grips with and to get the best from.
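To give a flavour of the kind of clean-up involved, here is a minimal Python sketch of grouping differently spelled supplier names under one key. It is loosely modelled on the "fingerprint" style of clustering that OpenRefine offers, not a copy of the tool's own code, and it is not what was shown on the day. The supplier names and the function names `fingerprint` and `cluster` are made up for illustration.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a name to a normalised key so near-duplicates collide."""
    # Trim, lowercase, and fold accented characters to plain ASCII.
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")
    # Drop punctuation, then sort and de-duplicate the remaining words.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values by fingerprint, keeping only keys with several spellings."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return {key: names for key, names in groups.items() if len(set(names)) > 1}


# Hypothetical supplier names, as they might appear in a spending spreadsheet.
suppliers = [
    "Acme Supplies Ltd",
    "ACME SUPPLIES LTD.",
    "Supplies Acme Ltd",
    "Borough Catering",
]

clusters = cluster(suppliers)
for key, names in clusters.items():
    print(key, "<-", names)
```

Once spellings are grouped like this, payments to the same supplier can be totalled reliably, which is often where the story is.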
Data mining expert Paul Bradshaw (Birmingham City University and onlinejournalismblog) took us on a whistle-stop tour of advanced search methods, including Boolean searches; search operators such as ‘inurl:’, ‘intext:’ and ‘filetype:xls’, which narrow results to particular URLs, page text or document types; sites such as gov.uk and the Met Police FOI disclosure log (or in our area, the South Wales Police disclosure log); search terms including ‘health’, ‘transport’ and ‘housing’; and scraping sites using ScraperWiki and similar.
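As a rough illustration of how these operators combine, the short Python sketch below assembles a search string from a few parts. The `build_query` helper is hypothetical and written only for this post; the operators themselves (`site:`, `filetype:`, `inurl:`, `intext:` and `OR`) are the ones mentioned above, and the resulting text is simply pasted into the search box.

```python
def build_query(terms, any_of=(), site=None, filetype=None, inurl=None, intext=None):
    """Assemble a search string from plain terms and optional operators."""

    def quote(term):
        # Multi-word phrases are wrapped in quotes so they match exactly.
        return f'"{term}"' if " " in term else term

    parts = [quote(term) for term in terms]
    if any_of:
        # Boolean OR: match any one of the alternatives.
        parts.append("(" + " OR ".join(quote(term) for term in any_of) + ")")
    for operator, value in (
        ("site", site),
        ("filetype", filetype),
        ("inurl", inurl),
        ("intext", intext),
    ):
        if value:
            parts.append(f"{operator}:{value}")
    return " ".join(parts)


# Example: spreadsheets on gov.uk mentioning a disclosure log and any of three topics.
query = build_query(
    ["disclosure log"],
    any_of=["health", "transport", "housing"],
    site="gov.uk",
    filetype="xls",
)
print(query)
```

Swapping `site` for a local council or police force domain is an easy way to turn the same search into a local one.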
David Ottewell showed how many headline stories across all Trinity Mirror’s local and regional papers had been sourced from data freely available on the ONS website and gov.uk; and how FOI and website scraping could lead to the discovery of data otherwise unavailable in any form useful for journalists. He also emphasised the importance of requesting data in the format most useful to you (usually a spreadsheet), and not necessarily how it had originally been produced.
Encouraging journalists to remember that data is not just about figures, but all information (and that you don’t need a PhD in maths to use it!), he suggested regularly checking sites such as those for local government, health boards, police crime statistics, the Missing Persons Bureau, and the Met Office.
Stephen Rosenthal added advanced search techniques on Google, covering breaking trends, YouTube trends, and headlines by UK, region and city (and did you know you can search Google by colour?). Bella Hurrell gave an instructive and illuminating presentation on using graphics to transform complex data into accessible and shareable visualisations, with thanks to her team of graphic designers and journalists. For us the training was priceless, and a huge addition to our investigative toolkit as a hyperlocal site.
Some of the tools mentioned:
- OpenRefine – to compare data from different sources and in different formats
- Boolean searches
- Sites such as gov.uk and the Met Police FOI disclosure log
- Using ScraperWiki to scrape websites
- Using the ONS website to find stories
- Submitting FOI requests
- Google advanced search
Homepage image accompanying this article is copyright Bob Mical.