12. The basics of data journalism

Like fact-checking, data journalism is one of the practices that have emerged as online newsrooms have developed since the mid-2000s. It makes it possible to exploit the wealth of data now available online on almost every subject.

A brief history of data journalism

Although data journalism has been much talked about over the last ten years or so, it is far from being a new practice. The oldest examples date back to the middle of the 19th century. Florence Nightingale, a nurse and statistician, was already publishing mortality data on British soldiers in the Crimean War back in 1858.

What has changed since then is the advent of computers and the democratisation of public data. Every journalist now has at hand a tool that can perform calculations and searches extremely efficiently and process vast amounts of data. All that is left to do is to get stuck in (virtually, of course).

What is data?

Often, when speaking of data journalism, people think of unemployment rates. That is understandable: it is probably the graph seen most often in the newspapers. However, data and statistics are not the same thing. The UK's headline unemployment rate, for example, is a statistic: ONS statisticians calculate it from underlying data (chiefly the Labour Force Survey, while Jobcentre Plus records feed the separate claimant count) using specific formulas.

A piece of data is a precise, unique, defined element. There are four types of data:

  • data can be text: your forename is a piece of data
  • data can be a number: your age is a piece of data
  • data can be something true or false, which is known in the business as a Boolean data type: are you British? Yes? No? The answer is a piece of data.
  • data can be a grouping of several other pieces of data, which is known as an array: “Clive, 18, No” is an array that contains pieces of text data, number data and Boolean data.
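
These four types map onto the basic types found in most programming languages. As a minimal sketch in Python (the variable names are illustrative assumptions; the values are those of the example above):

```python
# The four kinds of data described above, using Python's built-in types.
forename = "Clive"      # text (a string)
age = 18                # a number (an integer)
is_british = False      # true or false (a Boolean)

# An array groups several pieces of data; Python calls this a list.
person = [forename, age, is_british]

for value in person:
    # type(value).__name__ is the name of the value's type
    print(value, "is of type", type(value).__name__)
```

A spreadsheet works in the same way: each cell holds one piece of data, and each row is an array of cells.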

Spreadsheets and pivot tables

Beyond the theory, data journalism means, above all else, using a piece of software that you almost certainly know but may find intimidating: Excel (or any other spreadsheet software). Excel is the tool par excellence for data journalism. With a little practice, it lets you carry out complex calculations with ease: working out averages, counting occurrences, searching for particular pieces of text, and so on.
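
For readers who prefer code to formulas, the same three operations (an average, a count of occurrences and a text search) can be sketched in Python. The rows below are invented purely for illustration, and the comments name the spreadsheet functions they roughly correspond to:

```python
from collections import Counter
from statistics import mean

# Invented sample rows, for illustration only: (town, category, amount)
rows = [
    ("Leeds", "transport", 120),
    ("Leeds", "housing", 300),
    ("York", "transport", 80),
    ("York", "housing", 250),
    ("Hull", "transport", 100),
]

# Average of the amount column, roughly what =AVERAGE(...) does
average_amount = mean(amount for _, _, amount in rows)

# Number of rows per category, roughly what =COUNTIF(...) does
category_counts = Counter(category for _, category, _ in rows)

# Rows whose town contains the text "ee", like a case-insensitive text search
matches = [row for row in rows if "ee" in row[0].lower()]

print(average_amount)
print(dict(category_counts))
print(matches)
```

This is only a sketch: a spreadsheet does the same work without any code, which is why it remains the usual starting point.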

If you delve a little deeper into the subject, you can move on to pivot tables (they are not that complicated, we promise). A pivot table groups the rows of a table and totals or counts their values, so you can sort through huge tables of several thousand rows and columns and cut to the core material that will help your investigations.
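
To show what a pivot table actually does, here is a minimal sketch in plain Python. It is not how Excel implements the feature; the rows are invented for illustration, and the choice of towns as rows and categories as columns is an assumption:

```python
from collections import defaultdict

# Invented sample rows, for illustration only: (town, category, amount)
rows = [
    ("Leeds", "transport", 120),
    ("Leeds", "transport", 30),
    ("Leeds", "housing", 300),
    ("York", "transport", 80),
    ("York", "housing", 250),
    ("Hull", "transport", 100),
]

# pivot[town][category] holds the total amount for that pair;
# a pair that never appears in the rows reads as 0.
pivot = defaultdict(lambda: defaultdict(int))
for town, category, amount in rows:
    pivot[town][category] += amount

# Column headings: every category seen, in alphabetical order
categories = sorted({category for _, category, _ in rows})

# Print one line per town, one column per category
print("town".ljust(8) + "".join(c.rjust(12) for c in categories))
for town in sorted(pivot):
    cells = "".join(str(pivot[town][c]).rjust(12) for c in categories)
    print(town.ljust(8) + cells)
```

Six rows collapse into three lines, one per town, with one total per category. On several thousand rows the principle is exactly the same.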

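And if you really want to go a little further, OpenRefine will be your ally. This free, open-source tool is designed for exploring and cleaning up large, messy datasets, and it lets you filter and correct a very large number of cells in a few clicks rather than one by one.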