Has the impact of analytics on modern football been overstated?

The funny thing about the analytics movement in football — which has largely been inspired by a similar movement in baseball — is that the race to tell the story has been almost as keenly contested as the race to actually crack the code.

Baseball’s analytics movement, after all, is well known because of a famous book rather than a famous match or season. Michael Lewis’s Moneyball, an account of Oakland Athletics’ use of sabermetrics to gain an advantage over richer rivals, was published in 2003 and helped to popularise the use of data in sport, even if baseball was already a sport heavily based around numbers.

It was turned into a 2011 film starring Brad Pitt, which was nominated for Best Picture at the Academy Awards. The phrase ‘Moneyball’ has crept into the wider lexicon, including in football, to the extent that ‘Moneyball’ is probably more famous than the Oakland A’s themselves.

Football is now into double figures in terms of books that have attempted to be football’s Moneyball — from Simon Kuper and Stefan Szymanski’s Soccernomics to David Sumpter’s Soccermatics. In baseball, the roles were clear: author Michael Lewis was the narrator, while general manager Billy Beane was the main protagonist. In football, things are more confused, because the same figures have played both roles.


Oakland Athletics General Manager Billy Beane (right) in July 2010 (Michael Zagaris/Getty Images)

The Numbers Game, an enjoyable book by the American duo of Chris Anderson and David Sally, was essentially the first major ‘analytics era’ football book when it was published in 2014. But then their subsequent attempt to gain control of an English club, Coventry City, and transform it through analytics was the main storyline in Rory Smith’s 2023 book Expected Goals.

Data Game by Josh Williams specifically focuses on Liverpool’s use of analytics, but one of the central characters in that book, the club’s director of research Ian Graham, recently released his own tome, entitled How To Win The Premier League. Only one-third of that book is really about his own experience, and even then, in his words, that section is mainly concerned with “telling the story of data analysis in football”, covering many of the same stories told in the other books.

The Lewises have become Beanes, and the Beanes have become Lewises. So, without accusing anyone of being untruthful or cynical, it’s hardly inconceivable that analytics’ impact has been naturally exaggerated, with an element of collective self-promotion.

So has there really been, as the books variously claim, a ‘data revolution’ in football that has ‘changed the game forever’?


The debate in analytics has inevitably, and to a certain extent rightly, been framed as ‘Spreadsheet Nerds’ on the left versus ‘Old-School Football Men’ on the right.

Personally, I’d put myself fairly far to the left, perhaps around 0.85 xSN. I accept there will always be intangibles that are near-impossible to measure, but having long admired the use of numbers in cricket, England’s second sport, I’ve been delighted with the football media’s gradual embracing of data. I’ve been a semi-regular attendee at the Opta Pro Forum, an annual conference in London about the topic, and use data in articles regularly.

I’m also very interested in the tactical evolution of the game.

But having now read and largely enjoyed 10 books about analytics (Christoph Biermann’s Football Hackers and Ryan O’Hanlon’s Net Gains are particularly good), I’ve struggled to draw a straight line from A to B.

Earlier this year, when updating and expanding my own book The Mixer, a history of Premier League tactics first published in 2017, I pencilled in a chapter about the impact of analytics, which I’d been conditioned to think of as a game-changing development in modern football. But after further consideration I dropped it, having decided that it was actually less noteworthy than even something relatively minor, like teams being able to pass the ball to a team-mate within their own penalty box at goal kicks.


Have law changes such as being able to receive goal kicks inside the penalty area made more impact than analytics? (Ryan Pierse/Getty Images)

And the more you inspect these books, the more both the Lewises and the Beanes seem doubtful that analytics has changed the way football is played.

In Football Hackers, Biermann quotes Christofer Clemens, the German national side’s head analyst at the 2014 World Cup, explaining how they used data at that tournament. “We disregarded almost everything that we had looked at in previous years,” he said. “We’re increasingly convinced that there’s a lack of data that provides real information about the things that make you successful in football.”

When Biermann catches up with him later in the book, Clemens becomes even blunter. “All usable data only illustrates the game retrospectively… it is so superficial that we cannot make any predictions based upon it. We cannot use it to draw up an idea for developing players or for coming up with concrete instructions. It tells us nothing of relevance about a player, about success or about the impact of a strategy.”

The generally accepted ‘major takeaway’ of the data revolution is teams taking fewer long-range shots and working the ball into close-range positions before having an attempt at goal. One veteran of the data industry jokes that football analytics, while a multi-million-pound industry that employs hundreds of people, is essentially about inventing increasingly sophisticated ways to tell everyone to shoot from close to the goal, rather than far away from it. This is comparable to, but less interesting than, the major takeaway in basketball, where shots are increasingly concentrated around the basket, and from certain positions that are favourable to scoring three-pointers.

In football, the data does demonstrate that shots are being struck closer to goal. But then this is a process that has, to a certain extent, always been happening — watch a game from the 1960s and you’ll find the number of hopeless long-range shots absolutely maddening.

John Muller’s excellent article about the evolution of the World Cup suggests that shot distances have come down through the years anyway.

“World Cup data makes clear that the great shot shift was going on a long time before anybody had heard of xG,” Muller explains. “In 1970, 62 per cent of shots came from outside the penalty area. By 2006, that was down to 54 per cent. Fifty years ago, it took around 15 non-penalty shots to score a goal — nowadays it’s closer to one in 10. Shooting was especially selective at the last couple of tournaments, but long-term trends make it fair to wonder how much data really had to do with it.”


Long-range shots have been decreasing as a proportion of overall attempts for decades (Mark Leech/Getty Images)

In his book, Graham is also sceptical about this apparent triumph of analytics. “I am very doubtful of this claim,” he writes. “Even today, most Premier League clubs do not take data analysis seriously, and its adoption has certainly not increased steadily like average shot distance has decreased.”

But this has been held up as the main finding from the analytics era, to the point that data has been blamed for killing the beauty and unpredictability of long-range goals, which simply isn’t true. Yes, the number of shots from outside the box, and the percentage of shots taken from outside the box, have dropped. But there have been 144, 145 and 143 goals scored from outside the box in the previous three Premier League seasons, compared to 123, 133, 144, 125 and 122 in the five seasons beforehand, which doesn’t square with the idea that long-range goals are now extinct because Premier League clubs employ people who use spreadsheets.
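For anyone who wants to check that comparison, here is a minimal sketch of the arithmetic. It uses only the eight season totals quoted above; the variable names are my own, and a per-season average is just one simple way to summarise such a small sample.

```python
from statistics import mean

# Goals scored from outside the box per Premier League season,
# exactly as quoted in the paragraph above.
recent_three = [144, 145, 143]
previous_five = [123, 133, 144, 125, 122]

recent_avg = mean(recent_three)
previous_avg = mean(previous_five)

# Relative change in the per-season average between the two periods.
change_pct = (recent_avg - previous_avg) / previous_avg * 100

print(f"Recent three seasons:  {recent_avg:.1f} per season")
print(f"Previous five seasons: {previous_avg:.1f} per season")
print(f"Change: {change_pct:+.1f}%")
```

On those figures, the average rises from 129.4 long-range goals per season to 144, an increase of roughly 11 per cent.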


Phil Foden led the Premier League in long-range goals last season, with six (Alex Livesey – Danehouse/Getty Images)

Throughout all these books, it is striking that there are no real accounts of how team strategies have altered depending upon the numbers, or even how individual football matches have been won and lost because of data. Smith concedes, in his final chapter, that it is “difficult to say, with absolute certainty, which elements of football in its current form — the style of play that is a la mode among the game’s aristocrats — can be traced back entirely to data.”

Part of this is, of course, because insiders don’t want to reveal trade secrets, which is inevitable. There might be crucial things we don’t know. These seem most likely to be related to the concept of ‘pitch control’, which is briefly covered in Graham’s book. He explains how he hired Will Spearman, who like Graham has a physics PhD, on the back of two presentations at the aforementioned Opta Pro Forum because he was the first person “to do anything sensible with tracking data”. Pitch control, in basic terms, calculates which parts of the pitch each team’s players can reach before any opponent, and therefore essentially maps which team ‘controls’ which zones.
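To make that basic idea concrete, here is a deliberately crude sketch. It is not Spearman’s model or anything used by a club: it assumes every player starts from a standstill and runs in a straight line at the same speed, and the pitch dimensions, running speed, player positions and function names are all hypothetical choices for illustration.

```python
import math

# Every value here is an illustrative assumption, not taken from any club's model.
PITCH_LENGTH = 105.0  # metres
PITCH_WIDTH = 68.0    # metres
PLAYER_SPEED = 7.0    # metres per second, identical for every player


def time_to_reach(player, target, speed=PLAYER_SPEED):
    """Seconds for a player at `player` to run in a straight line to `target`."""
    return math.dist(player, target) / speed


def controlling_team(target, home, away):
    """Return which team has the player who can reach `target` first."""
    home_time = min(time_to_reach(p, target) for p in home)
    away_time = min(time_to_reach(p, target) for p in away)
    if home_time < away_time:
        return "home"
    if away_time < home_time:
        return "away"
    return "contested"


def control_share(home, away, cell=1.0):
    """Split the pitch into square cells and return each team's share of them."""
    columns = int(PITCH_LENGTH / cell)
    rows = int(PITCH_WIDTH / cell)
    counts = {"home": 0, "away": 0, "contested": 0}
    for i in range(columns):
        for j in range(rows):
            # Judge each cell by its centre point.
            centre = ((i + 0.5) * cell, (j + 0.5) * cell)
            counts[controlling_team(centre, home, away)] += 1
    total = columns * rows
    return {team: n / total for team, n in counts.items()}


# Hypothetical player positions as (x, y) coordinates in metres.
home_players = [(30.0, 34.0), (45.0, 20.0), (45.0, 48.0)]
away_players = [(75.0, 34.0), (60.0, 20.0), (60.0, 48.0)]
print(control_share(home_players, away_players))
```

A version fed by real tracking data would at least need each player’s current velocity and direction, which is why this should be read as a cartoon of the concept rather than a working tool.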

These models — and they’re more sophisticated than this, as Graham explains in greater depth — are genuinely interesting and probably have had a major impact within clubs. But we’re still lacking case studies about how they’ve been put into practice. Maybe they have convinced a manager to change system at half-time to spark a famous fightback. Maybe they hint that speed in behind the opposition this weekend will be more promising than a false nine coming short between the lines, and that will alter team selection. But it remains something of a mystery.

The rise of Pep Guardiola, which came slightly before the data revolution, has surely been the biggest factor in the recent stylistic change of football. Guardiola is frequently mentioned in all these books, and the popularity of ‘his’ possession-based style can be easily measured. But, while he is a devotee of video analysis, no one has discovered any tales of Guardiola placing great emphasis on data.

For all the hundreds of supremely bright people employed in this field, it’s uncertain how much influence they’re having.

“Almost every Premier League team now does have at least one person with the word ‘analytics’ in their title,” writes O’Hanlon in Net Gains. “But most of the people I’ve spoken to who work for clubs, or who have occupied one of these roles, say that the majority have very little impact on the team’s decision-making. Teams hire them because it would look bad if they didn’t.”

It feels like most of football analytics, at least based upon the considerable available literature, is largely about ranking players (even Graham’s well-written book features a chapter entitled ‘Goat War’, about who is better between Lionel Messi and Cristiano Ronaldo), helping to analyse whether they’d fit into their club, and calculating a reasonable transfer fee. That is very useful, but it isn’t really ‘changing football’.

The most telling elements of Graham’s account aren’t really about data at all, but about cooperation between departments, the use of video analysis (rather than data analysis, although the two are clearly linked) and his admiration for Jurgen Klopp’s footballing and emotional intelligence. Indeed, it’s probably fair to say that Moneyball itself was more about this type of thing (wider cultural change, which incorporated increased use of data) than it was about numbers alone changing baseball.


Data is part of modern coaching but in conjunction with many elements (Adam Fradgley/via Getty Images)

Tactically, Graham’s most interesting revelation is that Klopp repeatedly said he wanted to sign players with ‘extreme characteristics’ and wasn’t bothered if they had clear weaknesses. That goes against modern football’s shift towards all-rounders and also works well with data-based scouting, where finding players who excel in one particular area of the game is very simple.

Liverpool, of course, won the Premier League partly because they made some very good signings that Graham and his colleagues can take credit for. But they also won it because of Roberto Firmino’s reinvention into a false nine after a period where no one knew what he was, because their top assisters, Trent Alexander-Arnold and Andy Robertson, repopularised crossing after a period when it had gone out of fashion, and (perhaps most pertinently) because they overperformed their expected goals numbers by around 20 in 2019-20. There might be a data-based explanation for any of these developments, but no one has offered anything substantial.


(Ian MacNicol/Getty Images)

In stark contrast, the story of the analytics movement in cricket — more or less Britain’s version of baseball — is genuinely revelatory. Hitting Against The Spin, by Nathan Leamon and Ben Jones, challenges long-held assumptions and actually prescribes playing in an unorthodox fashion. In keeping with its title, but going against what most cricketers were taught, they show that hitting ‘against’ the spin is more effective than hitting ‘with’ the spin, and they show that while the ‘good-length, good-line ball from a leg-spinner’ is the rarest type of delivery in Test cricket, it is statistically the most effective.

Cricketers will read it and genuinely change how they play. That sport, like baseball, is simpler and easier to quantify than football. But it’s nevertheless a little underwhelming to read groundbreaking statistical analysis of other sports, return to football and find that the main conclusion is that more shots go in from 10 yards than from 20 yards.


So far, it seems fair to say that data has ‘changed football’ in the same way Google reviews have ‘changed restaurants’, considering almost everyone, when planning a meal out in an unfamiliar location, will check online reviews before making their decision. If there is relevant data out there, with a decent sample size, you’ll use it — and you probably won’t go for a 3.1-rated pizza restaurant over a 4.5.

And, of course, during the period where this has been commonplace, restaurants have probably changed in various ways (more vegan food, more ‘small plates’ and more meals served on chopping boards, judging by the London restaurant scene). But it doesn’t follow that Google reviews have been the catalyst for those changes — they’ve simply changed how people select restaurants.

In other words, it is clear that analytics has changed how football clubs recruit new footballers, but it is not clear that footballers behave differently, or should behave differently, now they’re being studied in this manner. We always knew that scoring goals is good, assisting goals is good, and dribbling towards goal is good. Analytics has become better at measuring which players do these things, and the things leading up to them, and increasingly things like off-the-ball runs too, but has it influenced teams to play any differently?

Ultimately, it depends how you consider the word ‘football’. If you take ‘football’ to mean the football industry — financial transactions, contract negotiations and clubs as businesses — then analytics has definitely changed football. If you take ‘football’ to mean the actual game — 90 minutes, 11 against 11, how should we play in order to win? — then, 3,000-odd pages later, we’re still waiting for a convincing (or fully disclosed) account of analytics’ impact.

Moneyball wasn’t really a story about data; it was a story about challenging conventional wisdom. We are constantly told that data has changed football. But maybe it hasn’t changed that much at all.

(Header photo: Alexander Hassenstein via Getty Images)
