Last updated on 27th September 2020 but downloadable spreadsheet in section 3a was updated on 19th October 2020. I will update the post when I get the time!
The latest data for COVID19 (Coronavirus) cases in England as of Saturday 26th September 2020 shows the number of people testing positive for COVID19 is up 60% from a week ago but this masks extreme regional disparities that make the national trend meaningless. The North is in the grip of a second wave unlike the South which is not. Unless recent trends in the North abate, the scenario of 50k positive tests per day by the end of October recently postulated by the Chief Scientific Officer remains feasible.
If you are familiar with the 5 data sources I use in this blog, you can skip to my thoughts at the end of this post. I first go through each source to explain what they are and what they are showing in the last week before I pull the whole thing together into a commentary.
I plan to update this post every Sunday. You can follow me on Twitter to be told when I have made updates.
The 5 time series for COVID19 Cases in England
Each time series is denoted with a 4 letter code which I will use throughout. Clicking on the 4 letter code will take you to the source data. I have only extracted data for England from these sources but some also cover Scotland, Wales & Northern Ireland.
- PHEs – Public Health England Positive Tests for COVID19 by Date of Specimen – Daily number of positive tests for COVID19 from either an NHS/PHE laboratory (Pillar 1) or a commercial laboratory (Pillar 2). Data is published daily.
- DHSCp – Department of Health & Social Care COVID19 Tests by Date of Processing – Daily number of tests (pillar 1 & 2) for the United Kingdom along with % of tests giving a positive result. However, publication of daily data is due to cease as of 20th August, after which the data will only be published weekly.
- Local Authority Data – PHEs & DHSC data broken down by local authority. Clicking the link downloads a spreadsheet I’ve created which allows you to choose a selection of local authorities to include in a number of charts. You can also find PHE’s watchlist of local areas of concern here.
- NHSt – Number of Positive COVID19 cases transferred to NHS Track and Trace Service – Weekly number of positive cases from the PHEs time series above transferred to NHS T&T. This is then broken down into cases where the NHS were able to obtain the contact details of those who came into contact with the infected person and those cases where this wasn’t achieved.
- ONSe – ONS COVID19 Household Infection Survey – Using a random sample of households in England who are tested every week, the % of people infected with COVID19 is estimated on a fortnightly basis. Sample excludes institutional settings such as Hospitals & Care Homes. Data is published weekly on a Friday.
1 – PHEs – Public Health England COVID19 Positive Tests
As of 26th September 2020, a total of 373,717 tests in England gave a positive result for the SARS-COV-2 virus which causes the disease COVID19. The breakdown by the date the specimen was taken is shown in the chart below as bars.
The chart format is identical to the PHEr series plotted in section 1 of my post “Latest Trends for COVID19 deaths in England”. The extrapolated number of positive tests is based on the last 3 weeks of data and the method I use is the same as explained in that link. Here, I display the same calculation in a different chart format, shown below, which uses a vertical log scale to show two things: the natural logarithm of the cumulative number of positive tests (diamonds), and the difference between the logarithm of the cumulative number and the logarithm of the cumulative number 7 days previously (small circles). Mathematically, the second line is the same as the natural logarithm of the dashed black line in the chart above.
By fitting a straight line through the small circles, I can use it to extrapolate the log of the geometric trend into the future. It is apparent though that a straight line fit is not possible for the whole timeframe. From the end of March to the beginning of June, a straight line fit is apparent as shown by the red dashed line. From 6th June the geometric trend turned upwards but at a slow rate and a new straight line fit as shown by the solid blue line is apparent. Then on 31st August, the sharp upward spike occurred as shown by the dashed gold line. My extrapolation model is based on the last 3 weeks of data so as to avoid overreacting to the last few days but in my comments at the end, I will explore other extrapolations.
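To make the mechanics concrete, here is a minimal sketch of this kind of calculation. It is not the exact method from the linked post: the data is synthetic, the function names are hypothetical, and it assumes an ordinary least-squares line fitted directly to the 7-day log differences over a 21-day window that excludes the 4 most recent days.

```python
import math


def log_weekly_growth(cumulative):
    """ln(C_t) - ln(C_{t-7}) for every day t with a value 7 days earlier."""
    return [math.log(cumulative[t]) - math.log(cumulative[t - 7])
            for t in range(7, len(cumulative))]


def fit_line(xs, ys):
    """Least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic series (NOT real PHEs data): daily positives growing 5% a day.
daily = [100 * 1.05 ** day for day in range(60)]
cumulative = []
running_total = 0.0
for count in daily:
    running_total += count
    cumulative.append(running_total)

growth = log_weekly_growth(cumulative)

# Assumed windowing: drop the 4 most recent days, then keep the latest 21.
window = growth[:-4][-21:]
days = list(range(len(window)))
slope, intercept = fit_line(days, window)

# Project the fitted line 14 days beyond the end of the window.
projected = intercept + slope * (days[-1] + 14)
print(f"slope per day: {slope:.5f}, projected log growth: {projected:.4f}")
```

Applying `math.exp` to the projected value converts it back into a ratio of cumulative totals 7 days apart.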
Unlike PHEr which was based on the date the death was registered, PHEs is based on the date the specimen was taken. Given that it might take a few days for the specimen to be sent to a laboratory, be tested and then the results to be communicated to PHE, this means that the data for the most recent dates are always being revised, sometimes by a considerable amount. Typically it is the last 4 days that see the largest changes so these days are not included in the extrapolation and are shaded in the first chart above.
However, that doesn’t mean I have to ignore the last 4 days. I have been tracking on a weekly basis the extent to which the last 4 days are revised a week after they were first published. The extent is fairly obvious from the chart below.
Each line is coded with a V number. This refers to the version number of the spreadsheet you can download in section 3a. The chart shows a weekly pattern to the data that has been fairly consistent. You can see that the biggest differences between the latest data and what is restated a week later are for the last 4 days, which is why I exclude these from the underlying trend and the extrapolation of that trend. Over the last few weeks, I have been experimenting with ways of estimating what the last 4 days will be in a week’s time and what I have learned is that the last 2 days are very hard to predict but the 3rd & 4th days are easier to predict. As yet, I have not worked out the best way to present these estimates in this blog post but I use them to track whether the revisions over the next few days are in line with my expectations.
2 – DHSCp – Dept Health & Social Care Number of COVID19 Tests
What I have plotted in the charts in section 1 is the sum of “Pillar 1” and “Pillar 2” testing data as explained in the detailed notes about the data supplied by PHE. The DHSC (Department of Health & Social Care) provides further details but briefly the difference is this –
- Pillar 1 are tests carried out in NHS/PHE laboratories focusing on key workers and those with a clinical need.
- Pillar 2 are tests carried out by commercial laboratories which tend to focus on the wider community.
DHSC compile this data for all 4 nations and publish the data by the date the test was processed rather than by date of specimen. They report both the total number of tests undertaken and the number of tests with positive result for COVID19. The chart below shows both the number of tests processed by pillar and the overall % of tests that were positive. Data is for the whole of the UK rather than England only.
You can see that the Pillar 2 testing process only got up to speed at the beginning of May. Total number of tests then averaged 80k per day until the end of June whereupon the numbers rose considerably and continue to increase with the latest 7 day CMA at 230k tests per day.
Since the number of tests is now on the rise, it is possible that any increase in the number of positive tests could simply be due to more tests. Therefore, the above chart also shows the percentage of tests that give a positive result (known as Positivity) as black diamonds with a black line showing the 7 day CMA. This was broadly stable in the range 0.6%-0.7% for a number of weeks up to the end of August but has almost quadrupled in the last 3 weeks to 2.3%.
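The two quantities used here, the 7 day centred moving average (CMA) and positivity, are simple to compute. The sketch below uses made-up figures rather than DHSC data, and the function names are hypothetical; it only illustrates why positivity separates "more tests" from "more infection".

```python
def centred_moving_average(values, window=7):
    """Centred moving average; positions without a full window are None."""
    half = window // 2
    averages = [None] * len(values)
    for i in range(half, len(values) - half):
        averages[i] = sum(values[i - half:i + half + 1]) / window
    return averages


def positivity(positives, tests):
    """Percentage of processed tests that came back positive, day by day."""
    return [100.0 * p / t for p, t in zip(positives, tests)]


# Synthetic figures (NOT DHSC data): tests double while positives quadruple.
tests_before, positives_before = [100_000] * 7, [650] * 7
tests_after, positives_after = [200_000] * 7, [2_600] * 7

rate_before = centred_moving_average(positivity(positives_before, tests_before))[3]
rate_after = centred_moving_average(positivity(positives_after, tests_after))[3]
print(f"positivity before: {rate_before:.2f}%, after: {rate_after:.2f}%")
```

In this made-up case positivity doubles even though testing has doubled, so the extra tests alone cannot account for the rise in positive results.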
3a – Regional Data
For the 7 days to 22nd September, a total of 30,014 positive tests were recorded across England which is up by almost 60% on the previous 7-day period. PHEs can be broken down by regions and local authorities. At the regional level, this table shows a significant regional divide in England with the 3 Northern regions and West Midlands recording levels higher than that seen in the spring whilst the South remains well below the peaks recorded in Q2.
I am recommending that all organisations vulnerable to local lockdowns keep a close eye on local case count. The government has given local authorities the power to reimpose lockdowns in their areas like those that have occurred already in Leicester, Blackburn, Oldham and Preston. To help you analyse your own area, I created a spreadsheet which recreates the PHEs charts from section 1 for any area you wish. Click on the link below to download the file and please read the HELP sheet to get started.
Purely for archive purposes, the links to previous versions are here – v1.00, v1.01, v2.02, v2.03, v2.04, v2.05, v2.06, v2.07, v2.08, v2.09, v2.11, v2.12 . They are mostly 7 days apart. Version 1s contain pillar 1 data only whilst version 2s contain pillar 1 & 2 data.
As well as PHEs data, my spreadsheet also includes charts showing the number of positive tests per 100,000 population by week supplied by DHSC. This provides a useful variant on the daily charts shown before. At a regional level, DHSC also provide data showing the % of COVID19 tests giving a positive result (known as Positivity) by pillars 1 & 2 and my spreadsheet includes the chart format below. Here I have shown all 9 regions together as a matrix. A new feature of these charts is the light green bars, which are regional estimates of positivity made by the Office for National Statistics (ONS), which I talk about in section 5. The reason I’ve added this to the chart is that the pillar 2 data is not collected from a sample designed in accordance with statistical principles whereas the ONS data is so collected. What this matrix of charts is showing is that, for now, concerns about the statistical reliability of pillar 2 numbers are overblown.
The obvious conclusion from the regional positivity matrix is that the regional divide between North & South is extraordinary and means that national trends are unreliable. Inferences must be made from regional & local data instead.
3b – Local Authority Watchlist
PHE publishes a local authority watchlist of areas that it is concerned about. I recommend you click on this link as it includes additional data which is not included in my spreadsheet including number of tests by local authority and the breakdown of number of positive tests by age. Where necessary I will comment on this watchlist but I’ve also created my own watchlist based on 5 criteria:-
- Number of positive tests per 100K population for latest week
- How that figure for the latest week compares with the highest weekly figure seen in Q2 2020.
- The % growth in number of positive tests over the last 3 weeks
- The % growth in number of positive tests over the last week
- The number of positive tests for the latest week
I can only do this for the 150 Upper Tier Local Authorities which are unitary authorities, county councils and metropolitan district and London borough councils. Each UTLA has been ranked on each of the 5 measures and I have arbitrarily then taken an average of the 5 ranks with the 1st rank given twice as much weight. This is the figure (AVG column) I have sorted the UTLAs on in the table below and next to it is the range of the 5 ranks used in that calculation. Given that the maximum rank is 150, this can be used to place these numbers in context. 30 UTLAs are shown who together make up the top 20% of UTLAs of concern.
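The ranking step can be sketched as follows. This is an illustration only: the three areas and their figures are invented, the function names are hypothetical, and it assumes that a higher value on every measure means more concern (rank 1) and that "twice as much weight" means weights of 2, 1, 1, 1, 1 on the five measures in the order listed above.

```python
def rank_descending(values):
    """Rank 1 for the largest value; tied values share the same rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def watchlist_score(ranks, weights=(2, 1, 1, 1, 1)):
    """Weighted average of the 5 ranks, first measure counted twice."""
    return sum(w * r for w, r in zip(weights, ranks)) / sum(weights)


# Invented figures for three hypothetical areas, one tuple per area:
# (cases per 100k, ratio to Q2 peak, 3-week growth %, 1-week growth %, cases)
areas = {
    "A": (120.0, 1.5, 300.0, 40.0, 350),
    "B": (60.0, 0.8, 500.0, 90.0, 200),
    "C": (20.0, 0.3, 50.0, 10.0, 80),
}

names = list(areas)
# Rank every area on each measure, then regroup the ranks by area.
ranks_by_measure = [rank_descending([areas[n][m] for n in names])
                    for m in range(5)]
ranks_by_area = {n: [ranks_by_measure[m][i] for m in range(5)]
                 for i, n in enumerate(names)}

scores = {n: watchlist_score(r) for n, r in ranks_by_area.items()}
order = sorted(names, key=scores.get)  # lowest score = greatest concern
for n in order:
    low, high = min(ranks_by_area[n]), max(ranks_by_area[n])
    print(f"{n}: AVG {scores[n]:.2f}, rank range {low}-{high}")
```

Showing the range of ranks next to the average makes it visible when an area scores badly on one measure but well on the others.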
One problem when interpreting PHEs at a local level is putting the data into context. I would much rather have positivity data at a local level but that is not available as a data download. So to provide some context for PHEs, I’ve created a new chart format below which is based on the DHSC number of positive tests per 100K population, which is available weekly for the 150 UTLAs. On this chart, I’ve plotted the data for Bolton (the area where the recent surge started) against the distribution seen each week using a box plot format, a format that I’ve also explained in more depth in this twitter thread. Bolton has dropped to number 25 in my watchlist and the recent pause looks like it might be genuine. It is, however, still the UTLA with the highest number of positive tests per 100k population.
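For readers unfamiliar with box plots, the sketch below shows the summary such a chart draws for one week and where a single area sits within it. The rates are invented, not DHSC figures, and the helper names are hypothetical; the quartile convention (Python's "inclusive" method) is also an assumption, since spreadsheet and charting tools differ on this.

```python
import statistics


def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile and maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def share_below(values, x):
    """Percentage of areas with a rate strictly below x."""
    return 100.0 * sum(v < x for v in values) / len(values)


# Invented weekly rates per 100k for nine hypothetical areas.
rates = [5, 8, 10, 12, 15, 20, 25, 40, 200]
area_rate = 200  # the hypothetical area being highlighted

print("min, Q1, median, Q3, max:", five_number_summary(rates))
print(f"areas below the highlighted one: {share_below(rates, area_rate):.0f}%")
```

An area plotted far above the upper quartile, as in this invented example, stands out immediately against the weekly distribution.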
A few points to bear in mind when analysing local authority data:-
- Be aware of the “speed camera fallacy” which I explain in this twitter thread when trying to claim that local interventions are effective. As I pointed out there, it could be that any decline after a spike is just regression to the mean and it is difficult (but not impossible) to design a suitable experiment that could test for this effect.
- One way to partly overcome the speed camera fallacy is to look at what’s happening in neighbouring areas as I explained in this twitter thread I wrote a few weeks ago about trends in London, Birmingham and Leicester.
- For a better understanding of how local restrictions will work, this collection of comments from scientists published by the Science Media Centre in reaction to Boris Johnson’s announcement of a delay to further easing of restrictions at the beginning of August is worth reading. Many of the scientists have clarified that restrictions can be eased in situations where tracking and tracing of people is easy to do e.g. workplaces & schools. It is areas where tracking and tracing is hard to do that may be subject to more severe restrictions. Hence the next section looks at the effectiveness of the NHS Track & Trace Service.
4 – NHSt – Referrals to NHS Track & Trace Service
The NHS Track & Trace service was launched at the end of May and weekly data has been published ever since. Of most interest to me is whether the service is being successful in handling the number of cases referred to it for which the first question is, are they able to identify who needs to be tracked down? The chart here gives an answer to this question.
3 outcomes are tracked by this chart –
- Are those who test positive being referred to the service? In the last week recorded, 14% of those who tested positive have not been transferred which is an improvement on the previous week. Clearly referrals need to be as fast as possible.
- If transferred, can the service obtain contact details for everyone who has come into contact with the infected person? In the last week recorded, this was not achieved with 16% of those transferred which is the lowest figure seen since the service started. This is an improvement and hopefully it will continue since the higher it is, the more chance there is for the virus to be spread by unsuspecting carriers.
- Of those transferred and whose contacts can be traced, the actual number was the best since the service was launched. I had been concerned that the static numbers in July & early August might have been an indication of the service’s capacity in the event of a second wave.
- Of those contacts traced, contacted and advised to self isolate, are they complying? This is not measured but this is the essential step that has to take place. Otherwise, all of the above will be pointless.
I note that NHSt data is available by local authority as well but I haven’t published anything on this.
What has been causing a lot of concern is the speed with which people are given their test results. For pillar 1 tests, the metric used is the % of people given results within 24 hours. For pillar 2, the metric is the median time in hours to receive the results, split by the 5 testing channels used for pillar 2. This shows that for Regional, Local & Mobile channels, there is generally a 24 hour wait which has increased a little in the last 2 weeks but is not out of the ordinary. The bigger concern is the tests undertaken by the Satellite & Home delivery channels where the delay has doubled from 48 to 96 hours before falling back in the last week. These last 2 channels account for over 50% of all pillar 2 tests so this is an issue that has to be addressed.
5 – ONSe – ONS COVID19 Household Infection Survey
For the week ending 19th September 2020, the ONS estimated that 1 in 500 people will test positive for COVID19 in England which is the highest positivity rate seen in England since the end of May. The ONS survey is focused on the wider community rather than institutions and as such is similar in scope to the pillar 2 data stream of the PHEs series.
Going forward, ONS intend to increase the sample size considerably to allow for more granular results. At present, it cannot provide a reliable local estimate given that only 163 people out of a total of 79,901 from the whole of England tested positive for SARS-COV-2 in the most recent fortnight. However, I think the regional estimates are more reliable now and I have included these in my regional charts in section 3a.
The margin of error in the estimates is quite large, as shown by the vertical bars which represent 95% confidence intervals. The picture in July & August suggested the overall positivity was stable in England but that is clearly not the case now.
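To get a feel for the size of that margin of error, the raw counts quoted above (163 positives out of 79,901 people) can be put through a simple binomial interval. This is only a rough sketch: it uses a Wilson score interval on the unweighted counts, whereas the published ONS estimates come from their own weighting and modelling, so the figures will not match theirs exactly. The function name is hypothetical.

```python
import math


def wilson_interval(positives, sample_size, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p_hat = positives / sample_size
    denominator = 1 + z ** 2 / sample_size
    centre = (p_hat + z ** 2 / (2 * sample_size)) / denominator
    half_width = (z / denominator) * math.sqrt(
        p_hat * (1 - p_hat) / sample_size + z ** 2 / (4 * sample_size ** 2)
    )
    return centre - half_width, centre + half_width


# Raw counts quoted in the text for the most recent fortnight.
positives, sample_size = 163, 79_901

estimate = positives / sample_size
low, high = wilson_interval(positives, sample_size)
print(f"point estimate: {100 * estimate:.3f}% (about 1 in {1 / estimate:.0f})")
print(f"approx 95% interval: {100 * low:.3f}% to {100 * high:.3f}%")
```

With so few positives nationally, splitting the sample into local areas would leave only a handful of positives in each, which is why local estimates are not yet reliable.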
ONSe has the potential to give good answers to a number of questions we want answered since it is a properly designed sample based on statistical principles (I run a training course on Statistical Sampling) but for now it is primarily a national trend tracking tool.
ONS also have a sample of households & individuals for antibody testing which measures whether or not you’ve had the disease. Since the 26th April, a total of 9,343 people in England have been tested with 476 testing positive for antibodies. Antibody prevalence is highest in London and lowest in the South West which is completely consistent with the number of cases these regions had at the peak of the outbreak as shown by the regional table in section 3a.
Will we have 50k cases per day by end of October? My thoughts
On Monday 21st, the Chief Scientific Officer gave a presentation which included a scenario (not a forecast!) where the number of positive tests reached 50,000 per day by the end of October in the UK. For England, that would work out at about 42k per day. If, like much of the media, you do not know the difference between a scenario and a forecast, please read the article I wrote 3 years ago on “How to identify a good forecaster”. The question I will consider here is whether a scenario of this magnitude is feasible or not.
In section 1, I stated that I use the last 3 weeks of data to extrapolate the trend into the future. This is an arbitrary choice I made but I have found it a useful mechanism to check whether the recent trend is diverging from this extrapolation or not. The chart below shows the same extrapolation for England as the light purple line but I have also added two more extrapolations as explained by the boxes. Together the 3 extrapolations are based on the 7 day centred moving averages from these time periods.
- 3 weeks between 18th July to 7th August
- 3 weeks between 8th August & 28th August i.e. immediately before the surge kicked in at the beginning of September
- Last 3 weeks between 30th August & 19th September. The 19th is the latest date for the 7 day CMA since the 23rd-26th are excluded as they are subject to considerable revision.
The reason why I chose these extrapolations will become clearer soon.
The last 3 weeks extrapolated using all England data would give 39k positive tests as of 31st October which is in the same ballpark as the CSO scenario. However, I explained in section 3a that the regional variation is enormous and as such, it makes no sense to use an all England extrapolation. I therefore repeated this chart & the 3 extrapolations for each of the 9 regions along with 3 aggregations for the North, Midlands and South and obtained this table.
By doing separate extrapolations for each region and each aggregation, I can see if the underlying trends are basically the same across an area. For example in the South, if I total the 4 extrapolations of LON, E, SE & SW I get 4,831 for the middle extrapolation which is virtually identical to the 4,775 when I extrapolate the aggregated South instead. I get the same outcome if I use the last 3 weeks for extrapolation instead and roughly the same order of magnitude with the 3rd extrapolation model.
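The logic of this consistency check can be illustrated with synthetic data. The sketch below assumes a log-linear (geometric) least-squares extrapolation, which may differ from the method behind the table, and the function names and series are invented. When regions share the same growth rate, the sum of the separate extrapolations matches the extrapolation of the aggregate; when growth rates differ, the two diverge.

```python
import math


def loglinear_extrapolate(series, days_ahead):
    """Fit ln(y) = a + b*t by least squares; project past the last point."""
    n = len(series)
    ts = range(n)
    logs = [math.log(y) for y in series]
    mean_t = sum(ts) / n
    mean_log = sum(logs) / n
    slope = (sum((t - mean_t) * (v - mean_log) for t, v in zip(ts, logs))
             / sum((t - mean_t) ** 2 for t in ts))
    intercept = mean_log - slope * mean_t
    return math.exp(intercept + slope * (n - 1 + days_ahead))


def compare(region_series, days_ahead):
    """Return (sum of regional extrapolations, extrapolation of the total)."""
    separate = sum(loglinear_extrapolate(s, days_ahead) for s in region_series)
    combined = [sum(day) for day in zip(*region_series)]
    return separate, loglinear_extrapolate(combined, days_ahead)


# Synthetic 21-day series (NOT real data) for two hypothetical regions.
same_growth = [[100 * 1.03 ** t for t in range(21)],
               [250 * 1.03 ** t for t in range(21)]]
mixed_growth = [[100 * 1.01 ** t for t in range(21)],
                [100 * 1.08 ** t for t in range(21)]]

for label, data in (("same growth", same_growth), ("mixed growth", mixed_growth)):
    separate, aggregate = compare(data, days_ahead=42)
    print(f"{label}: separate sum {separate:.0f}, aggregate {aggregate:.0f}")
```

In the mixed case the aggregate extrapolation falls short of the sum of the parts, because the fastest-growing region increasingly dominates the total; a large gap between the two figures is therefore a sign that the regions are not on a common trend.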
From this, I conclude that the underlying trend in the South is the same everywhere. If nothing changes, I would also conclude from the chart below that the middle extrapolation model (8/8 to 28/8) currently fits the data best, which leads to almost 5k cases in the South by 31st October. In effect, the South had a spike at the beginning of September which reversed itself and is now back on the trend set in August. Note this trend is mostly (but not completely) explained by increased testing since positivity has increased mildly in the South.
When I look at the Midlands, I see that East & West are following broadly the same trends as I get the same broad outcomes whether I extrapolate east & west separately or as an aggregate. This time, it is the last 3 weeks extrapolation that explains the current trend best and if nothing changes, this also extrapolates to just under 5k positive tests in the Midlands by 31st October. This time, increased testing mostly explains the East Midlands but not the West Midlands which has seen genuine increases in positivity (see the regional positivity matrix in section 3a).
The North is completely different. Extrapolating the 3 regions separately gives very different outcomes than if I extrapolate the North combined. The 3 extrapolation models give wildly different answers and some scenarios are way in excess of the CSO scenario. The current trend is very steep and so whilst one would expect the trend to falter and reverse itself at some point, the number of days it takes for this to happen is critical as the chart below shows that cases will quickly be over 5k by the end of September. Without a rapid slowdown, the North could easily exceed 10k by the end of October.
The North West is the key here since the number of positive tests per day in this region is already double the peak seen in the spring and the trend shows no sign of slowing down. What I find interesting here is that the current trend is more or less the same as the 3rd extrapolation model based on data for the 3 weeks 18th July to 7th August. During that period a number of local lockdowns occurred in Oldham, Preston, Bolton, Rochdale, etc and looking at the next 3 weeks of 8th August to 28th August, it was easy to conclude that those interventions were working as the upward trend was halted and reversed. Then we had the shocking reversal at the end of August and it’s as though what happened in August was a mirage to mislead us and in fact the interventions had no effect at all. If so, that is of huge concern and implies the trend here is going to continue unless stronger measures are taken.
I still have not heard any explanation for the explosion in the North West which appears to have started in Bolton. The one silver lining is that Bolton (the worst affected UTLA) appears to have halted its surge as shown in section 3b, and one hopes from the chart below that future revisions to the last 4 days here are minimal, which would mean a pause to the surge. If so, that would give hope that the North West surge will halt sooner rather than later but note the word hope. I have no evidence to back this up.
So is the CSO’s scenario of 40k+ positive tests in England by the end of October feasible? Unless current trends change, especially in the North & North West, the answer is yes, a scenario of roughly that order of magnitude is feasible. The question then becomes will the current trend moderate and pause in the next few weeks? I simply have no idea. If Bolton is a harbinger of an imminent pause in the North West, one thing we need to avoid is the northern surge spreading to the South, especially London. If that happens then any northern pause will get overwhelmed by a southern surge instead.
Bottom line is that the situation is finely poised and we need to be monitoring regional and sub regional trends rather than national trends to make decisions on interventions.
– More posts about COVID19 –
- A very useful guidance to interpreting statistics of COVID19 published by the Royal Statistical Society.
- My collection of links to all kinds of material related to the statistics of COVID19, epidemiological modelling and testing.
- How large a sample is needed in order to decide whether COVID19 restrictions can be lifted? A lot, lot less than you think!
- Latest trends in COVID19 deaths in England using 6 time series
- How many excess deaths will there be as of 19th June? This is my estimate of excess deaths using a statistical model.