A Bit of a Deeper Understanding of the 538 Soccer Predictions Numbers

I came across a conversation the other day over at WAGNH (weaintgotnohistory.com, for those unaware) about the use of 538’s (fivethirtyeight.com) SPI ratings, and how to interpret one team having a higher score than another. It’s a pretty complex process they use, and they have a good and thorough write-up of it on their methodology page for those wanting to read up.

But for those who just want a quick glimpse behind the curtain, and the ability to talk about SPI with other folks and actually sound like you know what you are talking about, here’s a short primer, using teams in the same league as the example. Once you compare between leagues, league strength comes into play, and that’s a lot more complex.

I’m going with Serie A, because the conversation I was part of was about it, and I already have a nice chart built that I don’t feel like redoing for the PL (or maybe I will in a follow-up article). I’m also only going to show and explain Juventus, Inter Milan, and Napoli because, again, the chart’s made already.

A note for anyone reading this from the future: this article is as of mid-June, so after the end of the season and before the updated preseason rankings. When you look at the Serie A rankings page, you will see a few things:

  • Each club, obviously
  • An SPI score for each club
  • An OFF score
  • A DEF score
  • The actual table results

What we are focused on for this is the SPI score. At the start of the season, this comes from two sources:

  • 67% – The SPI score at the end of the previous season
  • 33% – The SPI rating implied by the club’s market valuation on Transfermarkt.com
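
As a rough sketch of that blend, assuming the two inputs are simply combined as a weighted average: the function name and the sample ratings below are made up for illustration and are not taken from 538.

```python
def preseason_spi(prev_spi, market_spi, prev_weight=0.67):
    """Blend last season's final SPI with the market-value-implied SPI.

    Assumes a plain weighted average of the two inputs.
    """
    return prev_weight * prev_spi + (1 - prev_weight) * market_spi


# Hypothetical club: finished last season at 85.0, market value implies 79.0
print(round(preseason_spi(85.0, 79.0), 2))  # 83.02
```

Since last season's number carries twice the weight, a big-spending summer only nudges the starting rating rather than resetting it.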

Obviously the SPI has to originate somewhere, and it also evolves as the season progresses. The folks at 538 have used data going all the way back to 1888, from more than half a million matches, to help build this system. In their own words, here is how they define the SPI rating:

At the heart of our club soccer forecasts are FiveThirtyEight’s SPI ratings, which are our best estimate of a team’s overall strength. In our system, every team has an offensive rating that represents the number of goals it would be expected to score against an average team on a neutral field, and a defensive rating that represents the number of goals it would be expected to concede. These ratings, in turn, produce an overall SPI rating, which represents the percentage of available points — a win is worth 3 points, a tie worth 1 point, and a loss worth 0 points — the team would be expected to take if that match were played over and over again.

https://fivethirtyeight.com/methodology/how-our-club-soccer-predictions-work/

So if a club has a 90 SPI, that means they are expected to take 90% of the available points over a large number of simulated matches, before external factors are applied.
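
To make the "percentage of available points" idea concrete, here is a minimal sketch of the arithmetic. The function name and the win and draw probabilities are invented for the example. Only the 3/1/0 points scheme comes from the quote above.

```python
def spi_from_probabilities(p_win, p_draw):
    """Expected share of available points, on a 0-100 scale."""
    expected_points = 3 * p_win + 1 * p_draw  # a loss adds nothing
    return 100 * expected_points / 3          # 3 points are on offer per match


# Hypothetical club that wins 85% and draws 15% against an average team
print(round(spi_from_probabilities(0.85, 0.15), 1))  # 90.0
```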

As they state, they start with the OFF and DEF scores, which are the number of goals expected to be scored or conceded, and those come from a set of three metrics:

  • Adjusted Goals – These are the actual goals scored, with some of them downweighted as need be: goals scored while the opponent is down a man, or while running up the score late in a match, for example. The other goals are then bumped up a bit to offset. It’s a part of their model, and how they’ve tuned it. Details are in the data they provide, if you want to actually go down that rabbit hole.
  • Shot-based Expected Goals – Good ol’ xG type data. Again, adjusted based on their model and fine-tuned. Dig all you want.
  • Non-shot Expected Goals – This is how many goals a team should have scored based on their non-shooting actions around the goal. It’s all defined in their model. As they say in their example: “For example, we know that intercepting the ball at the opposing team’s penalty spot results in a goal about 9 percent of the time, and a completed pass that is received at the center of the six-yard box leads to a goal about 14 percent of the time.”
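
The non-shot metric is the least familiar of the three, so here is a toy sketch of the idea using only the two action values from 538's example. It assumes a match total is just the sum of each action's goal probability. The action names and the sample match are made up, and the real model covers far more actions.

```python
# Goal probabilities per action, taken from the two examples 538 gives.
ACTION_VALUES = {
    "interception_at_penalty_spot": 0.09,
    "pass_received_center_six_yard_box": 0.14,
}


def non_shot_xg(actions):
    """Sum the goal probability of each non-shooting action (an assumption)."""
    return sum(ACTION_VALUES[action] for action in actions)


# Hypothetical match: one interception and two dangerous completed passes
match_actions = [
    "interception_at_penalty_spot",
    "pass_received_center_six_yard_box",
    "pass_received_center_six_yard_box",
]
print(round(non_shot_xg(match_actions), 2))  # 0.37
```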

These metric values are then combined into the OFF and DEF scores (538’s write-up describes the composite as an average of the three), covering what a team creates and what it concedes, respectively. And again, this gets adjusted throughout the season, as the results get compared to the expected results.

With those values, they then run the match simulations and determine the SPI value as defined above, based on the expected return of points.
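
For a feel of how OFF and DEF could turn into an SPI-style number, here is a rough sketch. It is not 538's actual formula. It assumes goals for and against follow independent Poisson distributions with means equal to OFF and DEF, and it adds up scoreline probabilities directly instead of running random simulations. All names and the sample ratings are made up.

```python
import math


def poisson_pmf(goals, expected_goals):
    """Probability of exactly `goals` when the mean is `expected_goals`."""
    return math.exp(-expected_goals) * expected_goals**goals / math.factorial(goals)


def spi_sketch(off, defense, max_goals=10):
    """Rough SPI-style number from OFF and DEF ratings (not 538's formula)."""
    p_win = p_draw = 0.0
    # Walk every scoreline up to max_goals a side and add up its probability.
    for scored in range(max_goals + 1):
        for conceded in range(max_goals + 1):
            p = poisson_pmf(scored, off) * poisson_pmf(conceded, defense)
            if scored > conceded:
                p_win += p
            elif scored == conceded:
                p_draw += p
    # Share of available points: 3 for a win, 1 for a draw, 0 for a loss.
    return 100 * (3 * p_win + p_draw) / 3


# Two hypothetical clubs: a dominant one and a merely decent one
print(round(spi_sketch(off=2.5, defense=0.5), 1))
print(round(spi_sketch(off=1.5, defense=1.0), 1))
```

The first club should come out well ahead of the second, which matches the intuition that scoring more and conceding less both push SPI up.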

In the chart below I show how the SPI values for Juventus, Inter Milan, and Napoli fluctuate over the season and, to keep things simple, plot their cumulative league points totals over the same timeline. It’s a very basic indicator of performance that most folks understand:

As can be seen, at the start of the season Juve has a much larger SPI value (the dashed glowing lines) than either of the other two clubs, which is to be expected for a perennial league title winner and all-around high-market-value club. You can also see that around the first of the year there’s a bit of a drop in SPI, even though the league points are still rolling in. This indicates that even though they were winning, they weren’t doing as well as they were expected to, likely due to match congestion, injuries, or simply a bit of fatigue and bad-weather grind.

Notice, though, that around the middle of March they start a deep slide in their SPI rating, and this is where the actual results started getting worse as well. As of March 15th, Juve’s record (wins-draws-losses) was 24-3-0, and by the end of the season it was 28-6-4, meaning over that stretch they went 4-3-4. UGLY. That goes beyond playing poorly versus expectations, to the point of not even getting results. So their SPI took a dramatic drop. Meanwhile, because Inter was pushing hard to qualify for the CL, Inter’s SPI had been climbing a bit, and the two lines eventually crossed in May. Inter finished slightly higher than Juve in the final SPI table, even though Inter finished well behind Juve in the actual league table.
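
A quick check of that arithmetic, reading the records as wins-draws-losses. The helper names are made up. Note that the share of points actually won is not the SPI itself, but it sits on the same 0-100 scale, so it shows how sharp the drop-off was.

```python
def points(wins, draws, losses):
    """League points: 3 per win, 1 per draw, 0 per loss."""
    return 3 * wins + draws


def share_of_points(wins, draws, losses):
    """Percentage of available points actually taken."""
    played = wins + draws + losses
    return 100 * points(wins, draws, losses) / (3 * played)


mid_march = (24, 3, 0)  # Juve's record as of March 15th
final = (28, 6, 4)      # Juve's record at the end of the season
run_in = tuple(f - m for f, m in zip(final, mid_march))

print(run_in)                                  # (4, 3, 4)
print(round(share_of_points(*mid_march), 1))   # 92.6
print(round(share_of_points(*run_in), 1))      # 45.5
```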

The short version of this is that the SPI rating is more of a ‘current results, for head-to-head comparisons’ type of rating, or perhaps a ‘heat check’, rather than a stand-in for league table places. And while it’s great to know this type of info, it also depends on a LOT of context, like knowing that in Serie A, Juve had the league wrapped up and ended badly, which affected their rating. That didn’t have any effect on their latest title win, though: whether you are hot or cold at any given moment, league titles are based on the efforts of the entire season.