After many weeks of a relegation dogfight involving up to 10 teams, it now appears that an exciting finish to the 2018 EPL season is becoming less likely as a gap opens up between the bottom 3 and the rest. After 34 matches, another gap is opening up, this time between the 4 prediction methods I use: one method so far has a 74% success rate in predicting the match outcome.
I use a statistical approach known as Poisson Regression which I described in depth for round 29 of matches. If this is the first time you have seen my predictions then I strongly encourage you to click on that link to familiarise yourself with my terminology. My predictions all start with the latest form guide as shown below which I use to calculate two numbers for each team playing this weekend:
- eGS – the expected number of goals that a team will score.
- ePts – the expected number of points that a team will receive.
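As a rough illustration of how the two numbers relate, here is a minimal sketch that turns a pair of eGS values into ePts. It assumes each team's goals follow independent Poisson distributions with means equal to their eGS, which may not be exactly the calculation described in the round 29 post. The function names and the `max_goals` cut-off are my own choices for the example.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals when the expected number is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def expected_points(egs_for, egs_against, max_goals=10):
    """ePts = 3 * P(win) + 1 * P(draw), summed over a grid of scorelines.

    Assumption: the two teams' goal counts are independent Poisson variables.
    Scorelines above max_goals are ignored as negligibly unlikely.
    """
    p_win = 0.0
    p_draw = 0.0
    for goals_for in range(max_goals + 1):
        for goals_against in range(max_goals + 1):
            p = poisson_pmf(goals_for, egs_for) * poisson_pmf(goals_against, egs_against)
            if goals_for > goals_against:
                p_win += p
            elif goals_for == goals_against:
                p_draw += p
    return 3 * p_win + p_draw


# Example using the eGS values quoted later for Chelsea (1.8) and Crystal Palace (0.4)
print(round(expected_points(1.8, 0.4), 2), round(expected_points(0.4, 1.8), 2))
```

Under this assumption the team with the higher eGS always gets the higher ePts, but the underdog's ePts never drops to zero, which is why a higher eGS does not guarantee a win.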
For round 33 this weekend, I have used my calculations of eGS & ePts (shown in the table below) to make 4 separate predictions of the scorelines for each match. The highlighted team in each match is the one with the higher eGS value but that doesn’t necessarily mean they will win.
The 4 scoreline prediction methods (ML, Med, Rdd & Int) work as follows:
- ML – Maximum Likelihood is the scoreline with the highest probability from the Scoreline Matrix as explained in step 3 of my post for round 29.
- Med – Median is derived from the median number of goals that each team is expected to score. See step 6 of my post for round 29 for a fuller explanation.
- Rdd – Rounded is a simpler predictor which just involves each team's eGS being rounded to the nearest whole number. So 1.8 for Chelsea rounds to 2 goals and 0.4 for Crystal Palace rounds to 0 goals, hence the 2-0 prediction.
- Int – Integer is simply the integer part of eGS, which is equivalent to rounding down. Int is considered because whilst Rdd is better at predicting higher scorelines, it is very poor at predicting goalless teams, whereas Int is more likely to predict this.
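The four methods can be sketched in a few lines of code. This is only an approximation of what I do: it assumes each team's goals are independent Poisson variables with mean eGS, so the Scoreline Matrix is just the product of the two distributions, and the function names are made up for the example.

```python
from math import exp, factorial, floor


def poisson_pmf(k, mean):
    """Probability of exactly k goals when the expected number is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def ml_scoreline(egs_home, egs_away, max_goals=10):
    """ML: the most probable cell in the scoreline matrix (independence assumed)."""
    cells = [(h, a) for h in range(max_goals + 1) for a in range(max_goals + 1)]
    return max(cells, key=lambda s: poisson_pmf(s[0], egs_home) * poisson_pmf(s[1], egs_away))


def median_goals(mean, max_goals=20):
    """Med: smallest goal count whose cumulative probability reaches 50%."""
    cumulative = 0.0
    for k in range(max_goals + 1):
        cumulative += poisson_pmf(k, mean)
        if cumulative >= 0.5:
            return k
    return max_goals


def predict(egs_home, egs_away):
    """Return the four scoreline predictions for one match."""
    return {
        "ML": ml_scoreline(egs_home, egs_away),
        "Med": (median_goals(egs_home), median_goals(egs_away)),
        # floor(x + 0.5) rounds halves up; Python's round() rounds halves to even
        "Rdd": (floor(egs_home + 0.5), floor(egs_away + 0.5)),
        "Int": (int(egs_home), int(egs_away)),
    }


# Chelsea (eGS 1.8) v Crystal Palace (eGS 0.4), the example used for Rdd above
print(predict(1.8, 0.4))
```

For Chelsea v Crystal Palace this sketch gives 2-0 for Rdd and Med and 1-0 for Int and ML, which shows how the methods can disagree on the score even when they agree on the outcome.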
One reason I publish four predictions is that they are all plausible methods of converting both teams’ eGS into a scoreline. Whilst I am as yet unable to say definitively which is the better option, it does appear from rounds 29 to 32 that Med is edging ahead, as shown in the table below. For each match prediction, I have scored them in one of three ways:
- Exactly Right i.e. I predicted the actual scoreline
- Partly Right i.e. I predicted the right outcome (win, draw or loss) but not the right score.
- Wrong i.e. I predicted the wrong outcome.
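The scoring rules above are simple enough to write down as code. In this sketch the scorelines are (home goals, away goals) pairs, the function names are hypothetical, and the example results are made up purely for illustration. I am also assuming that the outcome success rate counts both Exactly Right and Partly Right predictions.

```python
def outcome(home_goals, away_goals):
    """Reduce a scoreline to its match outcome."""
    if home_goals > away_goals:
        return "home win"
    if home_goals < away_goals:
        return "away win"
    return "draw"


def grade(predicted, actual):
    """Score one prediction as Exactly Right, Partly Right or Wrong."""
    if predicted == actual:
        return "Exactly Right"
    if outcome(*predicted) == outcome(*actual):
        return "Partly Right"
    return "Wrong"


def outcome_success_rate(pairs):
    """Share of predictions with the right outcome (exact or partly right)."""
    grades = [grade(predicted, actual) for predicted, actual in pairs]
    return sum(g != "Wrong" for g in grades) / len(grades)


# Made-up (predicted, actual) scorelines, not real results
example = [((2, 0), (2, 0)), ((1, 0), (3, 1)), ((1, 1), (0, 2)), ((0, 1), (0, 1))]
print([grade(p, a) for p, a in example])
print(outcome_success_rate(example))
```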
The table shows that Med is the most accurate so far in terms of match outcomes (74% correct) but it is not the best at predicting the specific scoreline. Rdd is so far the worst performing prediction method.
To arrive at a final league table, I need to repeat this process for rounds 34 to 38 and then combine the predictions into a predicted final table. As I explained in round 29, I am making two separate predictions of the final table and I will demonstrate with my team Newcastle United by showing the predictions for all their remaining games.
My preferred method of estimating the final table is to total up the ePts values for all remaining games. Newcastle are currently on 35 points and if you total up the ePts, you find I am expecting them to get another 7.8 points, which takes them to 42.8, or 43 points when rounded. Repeat this for all teams and you get the final table shown here, which has Man City winning the league with a new record points tally, Newcastle in 12th and Southampton, Stoke & WBA relegated. As noted at the start, a gap is now opening up between the bottom 3 and the rest. Newcastle are now projected to be 10 points above Southampton and even when you take the margin of error into account (as shown by the LCI & UCI columns) it is beginning to look like they are safe.
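The arithmetic behind this first method is just a running total. In the sketch below the function name is hypothetical and the six per-match ePts values are invented so that they add up to the 7.8 quoted for Newcastle; they are not my actual match-by-match figures.

```python
def projected_points(current_points, remaining_epts):
    """Current points plus the sum of ePts over the remaining games, rounded."""
    return round(current_points + sum(remaining_epts))


# Illustrative split of Newcastle's 7.8 expected points over 6 remaining games
# (made-up values that sum to 7.8)
newcastle_epts = [1.5, 1.2, 1.4, 1.3, 1.1, 1.3]

print(projected_points(35, newcastle_epts))
```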
My second method of estimating the final league table is to use the 4 scoreline methods described earlier. For each method, I work out the expected number of points given the predicted scores and then take an average across the 4 methods. This method tends to give more points to teams at the top and fewer points to teams at the bottom but it does result in an explicit prediction of the W-D-L record for each team. Again we see the same 3 teams being relegated, Man City winning the title and Newcastle in 11th place this time with the same number of points. This is not the case for all teams and the DIFF column shows the difference between the two predicted tables.
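A minimal sketch of this second method for a single team is shown below. Scorelines are written as (goals for, goals against) from that team's point of view, the function names are hypothetical, and the predicted scores are made up for illustration rather than taken from my tables.

```python
def points_from_score(goals_for, goals_against):
    """League points earned from one predicted scoreline: 3, 1 or 0."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0


def wdl_record(scores):
    """Explicit W-D-L record implied by a list of predicted scorelines."""
    points = [points_from_score(gf, ga) for gf, ga in scores]
    return points.count(3), points.count(1), points.count(0)


def average_points(predictions_by_method):
    """Total the points per method, then average across the methods."""
    totals = [
        sum(points_from_score(gf, ga) for gf, ga in scores)
        for scores in predictions_by_method.values()
    ]
    return sum(totals) / len(totals)


# Made-up predictions for one team's next three games under each method
predictions = {
    "ML": [(1, 0), (1, 1), (0, 1)],
    "Med": [(2, 0), (1, 1), (1, 1)],
    "Rdd": [(2, 0), (1, 1), (1, 2)],
    "Int": [(1, 0), (0, 0), (0, 1)],
}

print(average_points(predictions))
print({method: wdl_record(scores) for method, scores in predictions.items()})
```

Because every predicted scoreline is converted to a whole 3, 1 or 0, a strong team tends to be credited with the full 3 points each time, which is consistent with this method giving more points to teams at the top than the ePts totals do.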
For the moment, I consider the 1st predicted table to be superior but when we get down to the last few matches, the 2nd table is likely to be a better predictor. I will make it clear when I think we have reached this point.