Nicolas Kayser-Bril 2017-12-24
Computer Algorithm Is Hired

We are looking to recruit an ambitious and driven Script or Algorithm to build and manage a global sub-editing team that will be in charge of producing, reviewing and editing all content released and distributed to our clients: interview invites, interview transcripts and intelligence reports.

This looks weird, but it is what a future job description might look like. Will journalists be replaced by big data? Big data itself will be replaced by machine learning and algorithms. Data is a fashionable topic, and so is big data. To work with big data, you need the ‘hardest’ software.

Either you buy a supercomputer, which is going to cost a lot of money, or, instead of having just one computer do the task, you run it on ten computers at the same time. It is not rocket science. Any good computer programmer can do it. If you are the journalist, you know where to find the programmer who will do it for you. If you are familiar with the term machine learning, you may think of using neural networks to find patterns in data or to predict future behavior.
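The idea of spreading one task over ten computers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the task (counting words in documents) and the function names are made up for the example, and a pool of threads on a single machine stands in for the ten computers. The pattern is the same either way: split the data, let each worker process its share, then merge the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks roughly equal parts."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(documents):
    """The work done by one 'computer': count words in its share of documents."""
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())
    return counts


def count_words_parallel(documents, n_workers=10):
    """Split the documents, count each share in parallel, then merge the results."""
    chunks = split_into_chunks(documents, n_workers)
    # Threads stand in for separate machines here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total
```

Calling count_words_parallel(list_of_texts) returns the same counts as processing everything on one worker. On a real cluster the workers would be separate machines, but the split, process and merge steps would not change.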

One major example of this is an investigation by BuzzFeed. They scraped a lot of data from betting websites and looked for suspicious patterns. There was a network of criminals who would pay tennis players to abandon a game when the odds were especially bad against them. If you bet against a player and this player abandoned the game, you would make a lot of money. BuzzFeed ran machine learning on this betting data set. They found that certain players looked especially suspicious.
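To make "looking for suspicious patterns" in betting data more concrete, here is a minimal sketch. It is not BuzzFeed's actual method and it is not machine learning: it is a simple threshold rule, and the records, field layout and thresholds below are all hypothetical. It flags players who repeatedly lost matches after their implied chance of winning collapsed between the opening and closing odds.

```python
from collections import defaultdict

# Hypothetical records:
# (player, opening win probability, closing win probability, lost the match)
MATCHES = [
    ("Player A", 0.60, 0.35, True),
    ("Player A", 0.55, 0.30, True),
    ("Player A", 0.50, 0.48, False),
    ("Player B", 0.40, 0.42, True),
    ("Player B", 0.65, 0.63, False),
    ("Player C", 0.70, 0.45, True),
]


def suspicious_players(matches, min_drop=0.20, min_matches=2):
    """Return players who lost at least min_matches games after their odds collapsed."""
    flagged = defaultdict(int)
    for player, opening, closing, lost in matches:
        # A sharp drop in implied win probability suggests heavy betting
        # against the player before the match.
        if lost and (opening - closing) >= min_drop:
            flagged[player] += 1
    return {player: n for player, n in flagged.items() if n >= min_matches}


print(suspicious_players(MATCHES))  # {'Player A': 2}
```

A rule like this only produces leads. Every flagged name still has to be checked by a reporter, but each flag can be traced back to the specific matches that triggered it.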

Another project, launched by ProPublica, uses machine learning to analyze all the documents they have in their huge database. In the tennis racket investigation, however, it turned out that machine learning did not work. The list of players that they had was wrong. What they found was not fraud.

The investigation was not bad. The problem is that machine learning did not bring much to it. A neural network works as a black box: you put raw data in, you train it, and you get an output. If there is an error in your ML algorithm, you have no idea where the error came from. That is a big problem when you do journalism. As a journalist, you cannot reach a conclusion based purely on a computer algorithm.