My first published attempt to predict the outcomes of Premier League matches got off to a good start last week, with 23% of my predictions getting the right scoreline and 52% getting the right outcome. I have updated my prediction of the final league table, which continues to point to an incredibly tight relegation battle, especially for my team, Newcastle United.
I use a statistical approach known as Poisson Regression which I described in depth last week for round 29 of matches. If this is the first time you have seen my predictions then I strongly encourage you to click on that link to familiarise yourself with my terminology. My predictions all start with the latest form guide as shown below which I use to calculate two numbers for each team playing this weekend:
- eGS – the expected number of goals that a team will score.
- ePts – the expected number of points that a team will receive.
For round 30 this weekend, I have used my calculations of eGS & ePts (shown in the table below) to make 4 separate predictions of the scorelines for each match. The highlighted team in each match is the one with the higher eGS value but that doesn’t necessarily mean they will win.
The 4 scoreline prediction methods (ML, Med, Rdd & Int) work as follows:
- ML – Maximum Likelihood is the scoreline with the highest probability from the Scoreline Matrix as explained in step 3 of my post for round 29.
- Med – Median is derived from the median number of goals that each team is expected to score. See step 6 of my post for round 29 for a fuller explanation.
- Rdd – Rounded is a simpler predictor which just involves rounding each team’s eGS to the nearest whole number. So 1.8 for Chelsea rounds to 2 goals and 0.4 for Crystal Palace rounds to 0 goals, hence the 2-0 prediction.
- Int – Integer is simply the integer part of eGS, which is equivalent to rounding down. I include Int because, whilst Rdd is better at predicting higher scorelines, it is very poor at predicting goalless teams, whereas Int is more likely to predict them.
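For readers who prefer code, the four methods can be sketched roughly as follows in Python. This is an illustration rather than the exact calculation behind the tables. It assumes each team's goals follow an independent Poisson distribution with mean eGS, and that Med means the median of that distribution, which may differ in detail from steps 3 and 6 of the round 29 post. All function names are made up for this sketch.

```python
import math

def poisson_pmf(k, mu):
    """Probability of exactly k goals when the expected number of goals is mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def poisson_median(mu, max_goals=20):
    """Smallest goal count whose cumulative probability reaches 0.5."""
    cumulative = 0.0
    for k in range(max_goals + 1):
        cumulative += poisson_pmf(k, mu)
        if cumulative >= 0.5:
            return k
    return max_goals

def most_likely_score(egs_a, egs_b, max_goals=10):
    """Most probable cell of the scoreline matrix (assumes independent goals)."""
    cells = ((a, b) for a in range(max_goals + 1) for b in range(max_goals + 1))
    return max(cells, key=lambda s: poisson_pmf(s[0], egs_a) * poisson_pmf(s[1], egs_b))

def predict_scorelines(egs_a, egs_b):
    """Return the ML, Med, Rdd and Int scorelines for one match."""
    return {
        "ML": most_likely_score(egs_a, egs_b),
        "Med": (poisson_median(egs_a), poisson_median(egs_b)),
        # floor(x + 0.5) rounds halves up; Python's round() rounds halves to even.
        "Rdd": (math.floor(egs_a + 0.5), math.floor(egs_b + 0.5)),
        "Int": (int(egs_a), int(egs_b)),
    }

# eGS of 1.8 and 0.4, the Chelsea v Crystal Palace figures quoted above.
predictions = predict_scorelines(1.8, 0.4)
```

With these inputs Rdd gives 2-0, matching the example above, while Int gives 1-0 because 1.8 is rounded down rather than to the nearest whole number.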
One reason I publish four predictions is that they are all plausible methods of converting both teams’ eGS into a scoreline, and as yet I am unable to say which is the best option. Looking back at round 29, it is not possible to point to one being definitely better than the others.
Each prediction can be scored in one of three ways:
- Exactly Right i.e. I predicted the actual scoreline
- Partly Right i.e. I predicted the right outcome (win, draw or loss) but not the right score.
- Wrong i.e. I predicted the wrong outcome.
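The three-way scoring above is simple enough to write down as a short Python sketch. The function names are my own for illustration, and the outcome is taken from the point of view of the first-named team.

```python
def outcome(score):
    """Classify a (goals_for, goals_against) scoreline as win, draw or loss."""
    goals_for, goals_against = score
    if goals_for > goals_against:
        return "win"
    if goals_for < goals_against:
        return "loss"
    return "draw"

def grade(predicted, actual):
    """Score one prediction against the actual result."""
    if predicted == actual:
        return "Exactly Right"
    if outcome(predicted) == outcome(actual):
        return "Partly Right"
    return "Wrong"

# A 2-0 prediction checked against three possible results.
grades = [grade((2, 0), actual) for actual in [(2, 0), (1, 0), (1, 1)]]
```

Here a 2-0 prediction is Exactly Right if the match ends 2-0, Partly Right if it ends 1-0 and Wrong if it ends 1-1.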
Round 29 is summarised in the table shown. Med was the best at predicting the outcome, but ML & Int did better at predicting the exact scoreline.
To arrive at a final league table, I need to repeat this process for rounds 31 to 38 and then combine the predictions into a predicted final table. As I explained last week, I am making two separate predictions of the final table and I will demonstrate with my team Newcastle United by showing the predictions for all their remaining games.
My preferred method of estimating the final table is to total up the ePts values for all remaining games. Newcastle are currently on 29 points and if you total up the ePts, you find I am expecting them to get another 10.2 points, which when rounded comes out at 39 points. Repeat this for all teams and you get the final table shown here, which has Man City winning the league with a new record points tally, Newcastle in 13th and Crystal Palace, Stoke & WBA relegated. However, only 3 points separate Newcastle from 18th place, and the margin of error shown by the LCI & UCI columns indicates that relegation is still a distinct possibility.
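The arithmetic behind this first method is just a running total, as the small Python sketch below shows. The function name is made up for illustration. The per-match ePts values are not listed in this post, so the 10.2 total quoted above stands in for the full list of remaining games.

```python
def projected_points(current_points, remaining_epts):
    """Current points plus the sum of ePts over the remaining matches."""
    return current_points + sum(remaining_epts)

# Newcastle: 29 points so far and 10.2 expected points still to come.
# The single 10.2 entry is a stand-in for the individual match values.
newcastle_total = projected_points(29, [10.2])
newcastle_rounded = round(newcastle_total)
```

Sorting every team by its projected total, highest first, would then give the predicted final table.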
My second method of estimating the final league table is to use the 4 scoreline methods described earlier. For each method, I work out the expected number of points given the predicted scores and then take an average across the 4 methods. This method tends to give more points to teams at the top and fewer points to teams at the bottom but it does result in an explicit prediction of the W-D-L record for each team. Again we see the same 3 teams being relegated, Man City winning the title and Newcastle in 15th place but with 2 points fewer as shown by the DIFF column.
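As a rough Python sketch of this second method, each predicted scoreline is converted into 3, 1 or 0 points and the four results are averaged. The function names and the four example scorelines are invented purely for illustration and are not taken from any table in this post.

```python
def points_from_score(goals_for, goals_against):
    """League points earned from one scoreline: 3 for a win, 1 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0

def average_points(predicted_scores):
    """Average the points implied by each method's scoreline for one match."""
    points = [points_from_score(gf, ga) for gf, ga in predicted_scores.values()]
    return sum(points) / len(points)

# Illustrative only: two methods predict a draw and two predict a win.
example = {"ML": (1, 1), "Med": (2, 1), "Rdd": (2, 1), "Int": (1, 1)}
avg = average_points(example)  # (1 + 3 + 3 + 1) / 4
```

Because every scoreline awards a whole 3, 1 or 0 points, a clear favourite tends to collect the full 3 points under most of the methods, which is one plausible reason this table stretches the gap between top and bottom compared with the ePts totals.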
For the moment, I consider the 1st predicted table to be superior but when we get down to the last few matches, the 2nd table is likely to be a better predictor. I will make it clear when I think we have reached this point.