
#bfong: Crowdsourcing MPs’ expenses – Simon Willison

Next up at Brighton Future of News Group meeting is software architect for the Guardian Simon Willison (@simonw) talking about crowdsourcing experiments – first is the Guardian’s MPs’ expenses crowdsourcing:

Says reason he works for a newspaper is you’ve the audience and brand to push the work further, have a deadline to work to and the resources of journalists and information to help the project.

Crowdsourcing MPs’ expenses
Short notice before documents were released; thought that they wouldn’t be searchable or well-organised.

Created application and set it live within one week.

Progress bar on homepage became popular with readers – but was created to fill a space on the homepage!

Had to play a guessing game because we didn’t know what the documents were going to look like.

Our big plan when we thought there were going to be only 100,000 documents was to get people to go through and submit all the amounts filed.

Investigate this! button alerted reporters.

Had examples of receipts where everything was redacted but the number. Page for every MP with credits for users that have been investigating the expenses claims. Could also enter your postcode to be taken to your MP.

Truth_will_out pages were set up to profile what documents/MPs individuals had been investigating.

First version was hit and miss: it did work, but went live and crashed 5 minutes later; we did get some stories out of it.

But we went live with documents from most important cabinet members and people raced to those, but when we released other documents the progress bar started crawling.

Telegraph had already seen all of the documents at this stage, so second version came out when new batch of data was released that no one had seen – a smaller batch of documents. (The team were hoping MPs might try to fix the system as this would be an even bigger story – but it didn’t happen.)

Investigate the MPs’ Expenses 2 was broken down into assignments e.g. by party, by cabinet. Could flag up the assignment we were most interested in people contributing to. People could see a progress bar for each assignment and watch it grow at the top of each page.
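As a rough illustration of the per-assignment progress bars described above, here is a minimal Python sketch. The talk did not go into how the Guardian's code was written, so the `Assignment` class, its field names and the text rendering are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    """A hypothetical batch of pages grouped for review, e.g. 'Cabinet'."""
    name: str
    total_pages: int
    reviewed_pages: int = 0

    def percent_done(self) -> float:
        # Treat an empty assignment as complete to avoid dividing by zero.
        if self.total_pages == 0:
            return 100.0
        return 100.0 * self.reviewed_pages / self.total_pages


def progress_bar(assignment: Assignment, width: int = 20) -> str:
    """Render a plain-text progress bar for one assignment."""
    filled = int(width * assignment.percent_done() / 100)
    return "[" + "#" * filled + "-" * (width - filled) + "]"


if __name__ == "__main__":
    cabinet = Assignment("Cabinet", total_pages=200, reviewed_pages=50)
    print(cabinet.name, progress_bar(cabinet), f"{cabinet.percent_done():.0f}%")
```

Keeping one counter per assignment, rather than one for the whole site, is what lets each group of documents show its own progress.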

Changed the questions we were asking too e.g. is the document “very interesting”, “handwritten”, about “soft furnishings” e.g. something that captures the public’s imagination.

Much more engagement and bigger sense of involvement – people using the site at the same time, felt they were working together. Version 1 felt as if you were working in isolation.

Trading ideas with other teams at newspapers e.g. New York Times’ DocumentCloud took some inspiration from the Guardian’s uploading of PDF expenses documents in bulk.

Some other crowdsourcing projects Simon has been involved with/flags up:

Wildlifenearyou.com

Encouraging people to upload images of animals they’ve seen – map this, create lists of favourite animals, let other users identify animals in images that are unknown. Increased crowdsourcing gives a greater set of data.

Built another crowdsourcing tool, which makes you choose which animal is better. One user got hooked and voted on more than 1,000 images. From this site can build up top 10 lists. Can give users medal if their photo is voted “best” – considering developing a leaderboard for this.
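One way a "which is better" vote could feed a top 10 list is sketched below in Python. The talk did not say how the site actually ranks photos, so ranking by win rate is an assumption, and the `top_photos` function and the photo names are hypothetical.

```python
from collections import Counter


def top_photos(votes, n=10):
    """Rank photos from pairwise votes.

    `votes` is an iterable of (winner, loser) pairs. Photos are ordered by
    win rate (an assumed scoring rule), then by number of match-ups, then
    by name so the result is deterministic.
    """
    wins = Counter()
    appearances = Counter()
    for winner, loser in votes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    ranked = sorted(
        appearances,
        key=lambda p: (-wins[p] / appearances[p], -appearances[p], p),
    )
    return ranked[:n]


if __name__ == "__main__":
    votes = [("owl", "otter"), ("owl", "lemur"), ("otter", "lemur")]
    print(top_photos(votes, n=2))
```

A plain win rate is easy to explain to users but favours photos with few match-ups; a site with heavy voters might prefer a rule that also weighs how many votes a photo has received.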

Owlsnearyou.com launched just a few weeks ago – a very specific version of Wildlifenearyou. Piggybacked on Superbowl hashtag on Twitter by creating Superb Owl day!

OpenStreetMap
Trying to create a user-generated (UG) map of the entire world that can be edited in a Wikipedia style. Has become really good at responding to crises: high-res photos were donated and traced to create the best digital map of Haiti available. Has become the default map for rescue teams.

OpenStreetMap makes data more available/usable than Google Street Maps.

Blair’s finances crowdsourcing – small number of individuals doing the bulk of the work. High number of votes per user.

The end product of crowdsourcing?

A way to export data/allow users to pull the data that they’ve added back out would be another way of giving users in crowdsourcing experiments ownership. What you do with the end product is an interesting problem that still needs to be solved.

3 Responses

  1. […] As Laura notes, a specific version of Wildlifenearyou.com, Owlsnearyou.com launched just a few weeks ago. Getting the site some extra coverage, Owlsnearyou cannily “piggybacked” on the Superbowl hashtag on Twitter by creating “Superb Owl Day”… Geddit? […]

  2. […] Laura Oliver provides excellent coverage of both speakers which you can read here and here. […]

  3. […] Oliver, editor of Journalism.co.uk also blogged about Jo Wadsworth’s and Simon Willison’s presentations, as did John […]
