We are often told that the beautiful game cannot be captured by mere numbers, and in terms of evaluating team performances, this may very well be true. That doesn’t stop people from trying. Today a friend of mine and I set about figuring out whether the Pythagorean method for calculating expected wins, which is employed in baseball, professional basketball and hockey, could possibly work in the soccer universe.
The Pythagorean theorem, for those who do not remember either school or the Scarecrow in The Wizard of Oz, is the expression:
a^2 + b^2 = c^2
In baseball, this idea has been fine-tuned into a predictive function:
W% = R^x/(R^x + RA^x)
where x = (RPG)^.287, R is runs scored, RA is runs allowed, and RPG is runs per game
The essence of this function is that a team’s expected success depends, in a particular way, on how many times the team has scored relative to how many times it has been scored upon. This is a nicer method than simply looking at the differential, because it tells you how many WINS you ought to expect your team to have.
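As a rough illustration, the baseball function can be sketched in a few lines of Python. The function names are made up for this example, and the sketch assumes that RPG means the combined runs scored by both teams per game, which the formula above does not spell out.

```python
def pythagenpat_win_pct(runs_scored, runs_allowed, games):
    """Expected winning percentage: R^x / (R^x + RA^x), with x = RPG^.287.

    Assumption: RPG is total runs (scored plus allowed) per game.
    """
    rpg = (runs_scored + runs_allowed) / games
    x = rpg ** 0.287
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)


def expected_wins(runs_scored, runs_allowed, games):
    """Turn the expected winning percentage into an expected win total."""
    return games * pythagenpat_win_pct(runs_scored, runs_allowed, games)


# Hypothetical season: 800 runs scored, 700 allowed, 162 games.
print(round(expected_wins(800, 700, 162), 1))
```

The output is a win total rather than a run differential, which is the appeal described above.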
Does this work for soccer? It is certainly possible to just assert that it does, as this writeup on the 08-09 La Liga table does. Its author takes the Pythagorean theorem literally, raising all the values to the second power, and simply assumes that the equation predicts the percentage of POINTS a team receives rather than the percentage of WINS.
Here are some of the problems with looking at the question this way.
- Points are not awarded in a linear fashion. You can earn 3 points in a match, you can earn 1 point in a match, you can earn 0 points in a match. You cannot get 2 points! This distorts any measurement of how much a goal is worth, because one goal can turn 0 points into 1 point, or 1 point into 3 points. One possible way around this is to count wins as two points when we calculate, but this seems a little unfair, as all clubs play knowing that they are playing to get three points for a win rather than two, and this presumably affects their performance.
- The relationship between winning and drawing is generally unclear. Even professional hockey is helped in terms of using Pythagorean methods by the fact that someone has to win every game. Not so with the beautiful game. In fact, let’s take an extreme case. If a team went an entire season playing 38 straight 0-0 draws, a Pythagorean formula would have nothing to work with: zero goals for and zero against gives 0/0. Read it charitably as a winning percentage of 0, and that would be true. They would have zero wins. But they would also have 38 points! This is why calculating a winning percentage is not the same thing as calculating a percentage of total points. If your theory is completely off on a calculation that everyone can make using common sense without a second’s thought, how good is the theory?
- Another variation on the theme. In baseball, if you scored as many runs as you allowed, we would expect a winning percentage of .500, or 81 wins and 81 losses. There seems to be no reason to expect the same of clubs that score as many as they concede. Last year, Spurs had a goal differential of zero, but had 51 points. Auxerre had 55 points with a zero differential. This year, NAC Breda has 33 points from 22 matches with a zero differential. Scoring as many as you concede doesn’t map reliably to any particular place on the table, because a match can end without a winner and because a win earns a bonus point over a draw.
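The gap between a winning percentage and a share of points can be made concrete with a small Python sketch. The function names and the two seasons below are hypothetical, chosen only to show that the same goal totals are compatible with very different points totals. The exponent of 2 follows the literal reading of the theorem described above.

```python
def pythagorean_share(goals_for, goals_against, exponent=2.0):
    """Literal Pythagorean share: GF^2 / (GF^2 + GA^2)."""
    total = goals_for ** exponent + goals_against ** exponent
    if total == 0:
        # No goals either way gives 0/0, so there is nothing to compute.
        raise ValueError("undefined when no goals are scored or conceded")
    return goals_for ** exponent / total


def points_share(wins, draws, losses, win_points=3):
    """Fraction of the available points actually earned."""
    played = wins + draws + losses
    return (win_points * wins + draws) / (win_points * played)


# Hypothetical season A: 38 draws, each 1-1 (38 goals for, 38 against).
print(pythagorean_share(38, 38), points_share(0, 38, 0))

# Hypothetical season B: 19 wins by 1-0 and 19 losses by 0-1
# (19 goals for, 19 against).
print(pythagorean_share(19, 19), points_share(19, 0, 19))
```

Both seasons have a zero goal differential and the same Pythagorean share of one half, yet the first collects a third of the available points and the second collects half of them.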
In scientific research, when we run into a situation in which we have “many variables, and small N” (N means number of cases), we use the comparative method rather than statistical analysis. We have only 20 teams, playing only 38 games, and usually somewhere between only 2.5 and 3 goals per game, which comes out to somewhere around one goal every 30-40 minutes of game play.
I, for one, am not against statistical analysis in football where it lives up to its own standards of precision and reliability. However, when it comes to explaining why a team sits where it does on the table, I suspect that a comparative look at different clubs, at the things they do the same and the things they do differently, and at the similar and different results they get, is going to have to suffice in place of mathematical certainty.
Steven Maloney is a contributing writer for Glorious Football and a Professor of Political Science at the University of Saint Thomas in Saint Paul, Minnesota. He can be reached for comment at steven.maloney@gmail.com.


1 comment
This is a very interesting read. But…
“but this seems a little unfair as all clubs play knowing that they are playing to get three points for a win, rather than two, and this presumably affects their performance.”
I have to disagree. The point of the game has, since its inception, been to win, not earn points. Whether it’s 2, 3 or 4, I don’t think it will affect the performance one bit.
Aside from that, like I said, this is extremely interesting.