During the first wave of the COVID19 epidemic, the daily number of deaths published by Public Health England (PHE) has been the main headline in the news. On 17th July, Matt Hancock, Secretary of State for Health, called for a review of this time series after a blog post by Yoon Loke & Carl Heneghan of Oxford University questioned whether the definition used by this time series was appropriate. I myself had noticed a change in the PHEr time series in my tracker of COVID19 deaths in England but I hadn’t understood why this might have been the case. After looking at the data again in more detail, I have concluded that this time series has been overestimating the number of deaths by 42 +/- 13 per day since 23rd May, and that it needs to be revised; otherwise it will create confusion should a second wave come.
This article was edited on 26th July to add confidence intervals and links to related articles.
4 time series for COVID19 related deaths in England
My analysis is based on the 4 time series for COVID19 deaths in England listed in the table below. I track these in my post “COVID19 Deaths #1 – Latest Data & Trends for England” and I will be using the same terminology in this article.
It is the PHEr series we are concerned with and I will be comparing this with the ONSr series to decide if there is an issue. Both measure registration of death certificates throughout England but they differ on their definition. ONSr requires a mention of COVID19 on the death certificate but does not require a positive test for COVID19. PHEr is the reverse; it requires a positive test for COVID19 but doesn’t actually require that COVID19 is mentioned on the death certificate. It was this point that was picked up by Loke & Heneghan and they pointed out that if someone had COVID19, received a positive test, recovered from it and was then run over by a bus, the death would be counted in the PHEr series but not in the ONSr series. The point is well made but how big a difference does it make to the numbers?
Other people have already attempted to answer this question so my analysis here should be read in addition to these articles.
- Loke & Heneghan’s original blog post.
- A good summary of the issue by Anthony Masters, RSS Ambassador.
How the ONSr/PHEr ratio has changed over time
If we look at the most recent week for ONSr data (week ending 10th July), a total of 344 deaths with COVID19 on the death certificate were registered in England. For the same week, PHEr recorded 587 deaths, which is about 70% higher. At the peak of the first wave after Easter, ONSr recorded 8379 deaths and PHEr 5800 deaths, i.e. PHEr was about 30% lower than ONSr, not higher. Logic tells us that PHEr should be lower than ONSr, since PHEr only counts deaths with a positive COVID19 test whilst ONSr counts these plus deaths suspected to be from COVID19 but not confirmed with a positive test.
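The two percentages above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the helper name `percent_difference` is mine and the four counts are the ones quoted in the paragraph above.

```python
def percent_difference(value, baseline):
    """Return how far `value` sits above (+) or below (-) `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline

# Weekly registration counts quoted in the text.
latest_week = percent_difference(587, 344)   # PHEr relative to ONSr, week ending 10th July
peak_week = percent_difference(5800, 8379)   # PHEr relative to ONSr, peak week after Easter

print(f"Latest week: PHEr is {latest_week:+.0f}% relative to ONSr")
print(f"Peak week:   PHEr is {peak_week:+.0f}% relative to ONSr")
```

The sign of the result is the point of interest: it is positive for the latest week and negative at the peak, which is the reversal discussed in the rest of this section.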
If I plot ONSr divided by PHEr over time, this is what I see.
At the start of the first wave, the ONS took a while to catch up with COVID19 death registrations, partly because they work standard office hours whereas PHE have more staff working at weekends. This is why the ratio was below 1 before Easter instead of being above 1 as logic tells us. At the peak, the ratio hit 1.6, but once all the Easter bank holidays were out of the way and the ONS had caught up, the cumulative ONSr count by 22nd May was 41,891 deaths, which was 1.28 times the cumulative PHEr count of 32,775 deaths. That ratio of 1.28 had been stable for at least 2 weeks.
Is it reasonable to expect this ratio to remain stable over time for the remainder of the first wave? The evidence from the other two series, ONSo & NHSo (which are based on the date the death occurred), suggests that it is reasonable to assume this. NHSo only covers deaths in NHS commissioned services whilst ONSo covers all locations, but both require that COVID19 is listed on the death certificate. You may be aware that the first wave took a particularly heavy toll on residents in care homes and consequently, over Easter, ONSo was much higher than the NHSo figure. The care home epidemic is now more or less over (according to CQC data) and since the middle of May, the cumulative ONSo count has been a steady 1.6 times the NHSo figure.
I need to digress at this point to point out a wrinkle with the NHSo figures. Like PHEr, this series is published daily, but the headline figure only covers deaths with a positive COVID19 test. From 24th April, the NHS started to publish deaths with COVID19 on the death certificate but where a positive COVID19 test was not forthcoming. These are published separately and are not included in the headline total, but in the chart above I have included both sets of data in the NHSo figure. The chart below makes it clear how much of a difference this makes.
What is the effect of assuming a stable cumulative ONSr / PHEr ratio of 1.28?
As I explain in section 1 of my tracker post of COVID19 deaths in England, on 23rd May, PHEr began to include deaths where the positive test for COVID19 came from pillar 2 testing (tests undertaken by commercial laboratories in the wider population). Until then, PHEr had only been counting deaths connected with a pillar 1 test (key workers and clinical needs in a PHE/NHS laboratory). Whilst PHE, in their revision, did include deaths that had occurred before 23rd May, they did not revise the time series before 23rd May. Instead they made a one-off addition of 445 deaths to the week commencing 23rd May, which created a distinct kink in the time series.
I find it notable that the ratio of cumulative ONSr to cumulative PHEr began to fall from that date forward, has continued to fall in a straight line since and is now 1.2. I use a solid triangle on the chart to highlight this.
The change is more dramatic if I look at the other line, which is the ratio of the 7 day centred moving averages. This went below 1 on 12th June and is continuing to fall in a largely straight line.
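For readers unfamiliar with a 7 day centred moving average, here is a minimal sketch of how the second line on the chart could be built. The function names are illustrative and the daily counts at the bottom are made up purely to show the mechanics; they are not the real ONSr or PHEr figures.

```python
def centred_moving_average(values, window=7):
    """Average each value with its neighbours either side; None where the window does not fit."""
    if window % 2 == 0:
        raise ValueError("a centred window needs an odd length")
    half = window // 2
    averages = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            averages.append(None)  # not enough days before or after this one
        else:
            averages.append(sum(values[i - half:i + half + 1]) / window)
    return averages

def ratio_of_averages(ons_daily, phe_daily, window=7):
    """Ratio of the ONSr moving average to the PHEr moving average, day by day."""
    ons_avg = centred_moving_average(ons_daily, window)
    phe_avg = centred_moving_average(phe_daily, window)
    # A missing or zero PHEr average gives None rather than a division error.
    return [None if o is None or not p else o / p for o, p in zip(ons_avg, phe_avg)]

# Made-up daily counts, for illustration only.
ons_daily = [60, 55, 0, 0, 70, 65, 58, 50, 48]
phe_daily = [80, 75, 40, 35, 90, 85, 78, 70, 66]
print(ratio_of_averages(ons_daily, phe_daily))
```

Centring the window on each day smooths out the weekend dip in registrations without shifting the series in time, at the cost of losing the first and last three days.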
What if the cumulative ratio had remained at 1.28 since 23rd May? Let’s take the latest date of 10th July, when the cumulative ONSr count was 48,388. If I divide that by 1.28, I would expect the cumulative PHEr count to have been about 37,800. This is 2413 lower than the actual figure of 40,213. The same calculation for 9th July tells me cumulative PHEr should have been 2313 lower. The difference between these two gaps tells me that the daily PHEr figure for 10th July of 147 was 100 higher than it should have been. I can repeat this calculation for all dates since 23rd May and I end up with a noisy time series. If I then plot a 7 day centred moving average of this estimated change in the daily PHEr figures, I get this chart.
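The calculation in the paragraph above can be written out as a short Python sketch. The function names are illustrative; the counts and the 1.28 ratio come from the text. Note that the unrounded expected count is slightly above 37,800, so the gap computed here comes out a few deaths below the 2413 quoted above, which was based on the rounded figure.

```python
STABLE_RATIO = 1.28  # cumulative ONSr / PHEr ratio up to 22nd May (from the text)

def expected_cumulative_phe(cumulative_ons, ratio=STABLE_RATIO):
    """Cumulative PHEr count implied by a fixed ONSr / PHEr ratio."""
    return cumulative_ons / ratio

def cumulative_excess(cumulative_phe, cumulative_ons, ratio=STABLE_RATIO):
    """How far the actual cumulative PHEr count sits above the implied one."""
    return cumulative_phe - expected_cumulative_phe(cumulative_ons, ratio)

# Cumulative counts for 10th July quoted in the text.
ons_10_july = 48_388
phe_10_july = 40_213

expected_10_july = expected_cumulative_phe(ons_10_july)
excess_10_july = cumulative_excess(phe_10_july, ons_10_july)
print(f"Implied cumulative PHEr: {expected_10_july:,.0f}")
print(f"Cumulative excess:       {excess_10_july:,.0f}")

# The daily excess is the change in the cumulative excess from one day to the next.
# 2413 and 2313 are the rounded gaps quoted in the text for 10th and 9th July.
daily_excess_10_july = 2413 - 2313
print(f"Daily excess on 10th July: {daily_excess_10_july}")
```

Repeating the last step for every pair of consecutive days since 23rd May gives the noisy daily series that is then smoothed for the chart.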
This chart shows that on average, the daily PHEr figure would have been 42 deaths per day lower (95% confidence interval +/- 13) had the cumulative ratio been unchanged since 22nd May. I find it very striking that the difference is stable over time (trendline is not statistically significant) and to me, this backs up Loke & Heneghan’s contention about the PHEr figure. If people are catching COVID19, testing positive for it, recovering from it and then subsequently dying from completely unrelated causes, such deaths should be random over time and on average the same each day. This is what we see in the chart above.
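As an illustration of how an average and a 95% interval of this kind could be obtained from a series of daily differences, here is a minimal sketch. It assumes the daily values are independent and uses a normal approximation, which may not match the calculation behind the 42 +/- 13 quoted above. The sample values are made up.

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Return (mean, half-width) of an approximate 95% confidence interval.

    Assumes independent observations and a normal approximation.
    """
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, z * standard_error

# Made-up daily differences, for illustration only.
sample = [40, 44, 38, 46]
mean, half_width = mean_with_ci(sample)
print(f"{mean:.0f} +/- {half_width:.1f} deaths per day")
```

A smoothed series such as a moving average is correlated from day to day, so an interval computed this way on smoothed values would tend to be too narrow; applying it to the unsmoothed daily differences is the safer choice.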
EDIT 26th July – I am in a twitter discussion about what I’ve said in this paragraph which has thrown up some additional points to consider.
My Recommendations for the PHEr series
I recommend PHE publish 2 time series going forward.
- A new headline PHEr series which excludes deaths that were registered more than 28 days after the date of a positive test for COVID19. I understand this is the definition used in Scotland and Northern Ireland. This will require the historical data from 23rd May onwards to be restated.
- For deaths registered more than 28 days after a positive COVID19 test, a separate time series should be published.
The reason why I recommend retaining the additional deaths as per recommendation 2 is that we still don’t know the long term effect of COVID19. Just because you recover from an infection now doesn’t mean it won’t have an effect on your long term health. Although the death certificates of these people may not necessarily mention COVID19, it’s possible the time series may throw up something in the future that is indicative of a long term COVID19 effect.
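The split described in the two recommendations above could be sketched as follows. The record layout (a pair of test date and registration date) and the function name are assumptions for illustration; I do not know how PHE actually holds this data. The three records are made up.

```python
from datetime import date

CUTOFF_DAYS = 28  # the limit I understand is used in Scotland and Northern Ireland

def split_by_cutoff(deaths, cutoff_days=CUTOFF_DAYS):
    """Split (test_date, registration_date) pairs into headline and later deaths."""
    headline, later = [], []
    for test_date, registration_date in deaths:
        gap = (registration_date - test_date).days
        if gap <= cutoff_days:
            headline.append((test_date, registration_date))  # first series
        else:
            later.append((test_date, registration_date))     # second, separate series
    return headline, later

# Made-up records, for illustration only.
records = [
    (date(2020, 4, 1), date(2020, 4, 10)),   # 9 days after the test
    (date(2020, 4, 1), date(2020, 4, 29)),   # exactly 28 days after the test
    (date(2020, 4, 1), date(2020, 7, 15)),   # more than 28 days after the test
]
headline, later = split_by_cutoff(records)
print(len(headline), "headline deaths,", len(later), "later deaths")
```

A death registered exactly 28 days after the test stays in the headline series here, since only deaths registered more than 28 days afterwards are excluded.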
Until these changes have been made, you should mentally subtract 42 (+/- 13) from any daily PHEr number published and 293 (+/- 35) from any weekly number published. So for the week ending yesterday (24th July) PHEr had counted 463 deaths but if we now subtract 293, this would be 170 deaths instead. This would be consistent with my extrapolated 252 deaths for ONSr (344 for week ending 10th July). Effectively it will be impossible for weekly PHEr deaths to fall below 300 until my recommended changes are made.
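The mental subtraction above can be written as a small helper. This is a sketch only; the function name is mine and the corrections and margins are the ones quoted in the paragraph above.

```python
def adjust(published, correction, margin):
    """Return (low, central, high) estimates after subtracting correction +/- margin."""
    central = published - correction
    return central - margin, central, central + margin

# Weekly example from the text: 463 deaths published for the week ending 24th July.
low, central, high = adjust(463, correction=293, margin=35)
print(f"Adjusted weekly PHEr: {central} (range {low} to {high})")

# Daily example using the 10th July figure of 147 quoted earlier.
print("Adjusted daily PHEr:", adjust(147, correction=42, margin=13))
```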
– More posts about COVID19 –
- A very useful guide to interpreting statistics of COVID19 published by the Royal Statistical Society.
- My collection of links to all kinds of material related to the statistics of COVID19, epidemiological modelling and testing.
- How large a sample is needed in order to decide whether COVID19 restrictions can be lifted? A lot, lot less than you think!
- How many excess deaths will there be as of 19th June? This is my estimate of excess deaths using a statistical model.
- Latest trends in number of cases of COVID19 in England
- Latest trends in number of deaths due to COVID19 in England