How mathematics can fight the abuse of big data algorithms

Prof Alan Champneys, Professor of Applied Non-linear Mathematics, University of Bristol

“Is maths creating an unfair society?” That seems to be the question on many people’s lips. The rise of big data and the use of algorithms by organisations has left many blaming mathematics for modern society’s ills – refusing people cheap insurance, giving false credit ratings, or even deciding who to interview for a job.

We have been here before. Following the banking crisis of 2008, some argued that it was a mathematical formula that felled Wall Street. The theory goes that the same model that was used to price subprime mortgages had been used for years to price life assurance policies. Once it was established that dying soon after a loved one (yes, of a broken heart) was a statistical probability, a formula was developed to work out what the increased risk levels were.

In the same way that an actuary can tell how likely it is that a loved one will die soon after their partner, a formula was used to predict how likely it was that a person or company would default on a loan. Specifically, it was applied to predicting the risk of two subprime mortgages co-defaulting.

The formula ended up being very wrong. If I default on my mortgage, there is a good chance it is because of a downturn in the economy. So my neighbour, who is in a similar socio-economic bracket to me, is pretty likely to default, too. This effect is an order of magnitude stronger than the broken-heart coefficient would predict. So apparently the maths was at fault. Big time.
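To see why a shared downturn matters so much, here is a toy simulation. It is only a sketch, not the pricing formula discussed above: it assumes each borrower has a hidden "financial health" score made up of a common economic shock plus a personal shock, and that default happens when the score falls below a threshold. The function name, the 5% default probability and the weight `rho` given to the common shock are all made up for illustration.

```python
import random
from statistics import NormalDist


def joint_default_rate(p_default, rho, trials=100_000, seed=0):
    """Estimate the chance that two borrowers both default.

    Illustrative assumption: each borrower's latent score is
    sqrt(rho) * shared_shock + sqrt(1 - rho) * personal_shock,
    with all shocks standard normal. rho = 0 means the two
    borrowers are independent; larger rho means the economy
    drives more of each borrower's fate.
    """
    rng = random.Random(seed)
    # Threshold chosen so that each borrower alone defaults with p_default.
    threshold = NormalDist().inv_cdf(p_default)
    shared_weight = rho ** 0.5
    personal_weight = (1.0 - rho) ** 0.5
    both = 0
    for _ in range(trials):
        economy = rng.gauss(0.0, 1.0)
        me = shared_weight * economy + personal_weight * rng.gauss(0.0, 1.0)
        neighbour = shared_weight * economy + personal_weight * rng.gauss(0.0, 1.0)
        if me < threshold and neighbour < threshold:
            both += 1
    return both / trials


if __name__ == "__main__":
    # Hypothetical inputs: 5% individual default risk in both cases.
    print("independent:", joint_default_rate(0.05, 0.0))
    print("shared shock:", joint_default_rate(0.05, 0.5))
```

In this toy setting each borrower's individual risk is the same in both runs; only the strength of the shared shock changes. A model calibrated on nearly independent events would therefore understate how often both defaults arrive together.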


People queuing outside a Northern Rock bank branch in Birmingham, United Kingdom on September 15, 2007, to withdraw their savings because of the subprime crisis. Credit – Lee Jordan/Flickr.com (CC BY-SA 2.0)

Why didn’t the mathematicians notice? Well, in fact they did, argues Paul Embrechts, a leading financial mathematician who runs the risk lab at ETH Zurich, the Swiss Federal Institute of Technology. But few were listening. Embrechts explains it was the blind use of a formula way outside of its region of validity that was at fault. There was nothing wrong with the formula; it just didn’t apply (as the mathematicians had already shown). Unfortunately, the industry was “stuck in a classic positive feedback loop which no party wanted to walk away from”. Blaming the maths “is akin to blaming Einstein’s E=mc² formula for the destruction wreaked by the atomic bomb”.

Lessons still to learn

There was a lack of appreciation of the difference between risk (something that is priced by the quants – the name the financial services industry gives to mathematicians and data analysts) and uncertainty (what can go very wrong). The Basel Committee on Banking Supervision, in response to the global banking crisis, made it clear that banks must make an explicit assessment of this uncertainty and that different scenarios must be tested.

However, it seems that the banking industry may not yet have learned this lesson. (Here I shall change a few details, for obvious reasons.) I have a friend, with a PhD in mathematics, who recently worked in the City of London, ensuring products sold by a leading financial institution were risk-free. He was shocked by what was going on.

Policies are still being sold according to a formula that predicts the company’s profitability. Then a separate team applies simple linear regression (changing a parameter to see how much a value changes by) to “assure” the product against risk. This is to satisfy the requirement of the regulatory authorities.

However, there is little understanding within that team of the mathematical theory behind what they are doing, and a strong culture of returning the answer that all is fine. No possibility is allowed that the fundamentals of the pricing model may have been wrong in the first place, or that risk and uncertainty should be handled in tandem with profitability when the product is constructed.

Critical use of formulae

So the nub of the problem is not that mathematics is to blame, but that in our quantitative world there is often a lack of mathematical understanding among those who are blindly using formulae derived by the experts.

This idea is, in fact, the key point of a recent book by Cathy O’Neil, Weapons of Math Destruction. She is not describing the dangers of mathematics per se, but the algorithms used in conjunction with “big data” that are increasingly being used by advertisers, retailers, insurers and various government authorities to make decisions based on what they have profiled about us. She is an advocate for mathematics and for “machine learning” (or artificial intelligence). But what her book seeks to argue against is the use of these algorithms without thought or without feedback.


Cartoon: Big Data. Credit – Thierry Gregorius/Wikimedia Commons (CC BY 2.0)

The popular TV sketch show series Little Britain had a recurring scene involving a member of the public repeatedly being told by a customer service assistant sat behind a computer screen that “the computer says no”. It is funny, because it is an experience that most of us can identify with.

But the problem is not the computer, nor necessarily the algorithm it is running, but the inability of the person behind the computer to use their common sense. Instead of the computer informing their decision-making process, they are ruled by what it says.

In the mathematical and data modelling classes that colleagues and I teach, we encourage students to apply the scientific method to a raft of different problems from across a variety of sectors. Predictions should not just be based on mathematical models and algorithms, but constantly tested against real data. This is an iterative process and lies at the heart of what mathematics is about.
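As a small illustration of what testing a prediction against real data can look like, the sketch below compares a model's predicted event rate with a batch of observed yes/no outcomes. It is one simple check among many, not a method taken from our classes: the function name, the three-standard-error limit and the example numbers are assumptions chosen for illustration, and the check relies on the normal approximation to the binomial distribution.

```python
def prediction_is_consistent(predicted_rate, outcomes, z_limit=3.0):
    """Return True if the observed frequency is close to the prediction.

    outcomes is a sequence of 0/1 values (1 = the event happened).
    "Close" means within z_limit standard errors of predicted_rate,
    using the normal approximation to the binomial. z_limit = 3.0 is
    an illustrative choice, not a recommended standard.
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one observed outcome")
    observed_rate = sum(outcomes) / n
    std_err = (predicted_rate * (1.0 - predicted_rate) / n) ** 0.5
    return abs(observed_rate - predicted_rate) <= z_limit * std_err


if __name__ == "__main__":
    # Hypothetical data: the model predicts a 5% default rate.
    calm_year = [1] * 5 + [0] * 95      # 5 defaults in 100 loans
    crisis_year = [1] * 30 + [0] * 70   # 30 defaults in 100 loans
    print(prediction_is_consistent(0.05, calm_year))
    print(prediction_is_consistent(0.05, crisis_year))
```

When the check fails, the sensible response is to revisit the model's assumptions rather than to explain away the data, and then to test again with the next batch of outcomes.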

The lesson would seem to be that we need to inculcate more of this kind of thinking in society. As we enter the big data era, rather than mathematics being to blame, it is the lack of mathematical understanding in many key businesses that is at fault. We need more mathematical thinking, not less.

This article is reposted from The Conversation website.
