Updated on 14th May 2020. New and modified links are italicised.
The Coronavirus Pandemic is a worldwide challenge that many of us will not have experienced before. It is natural to want to seek information on the risks and, in our world today, it has never been easier to find data, analyses and opinions. Unfortunately, a lot of what you will read out there is either unhelpful or actively misleading. As an independent statistician with 30 years' experience of explaining statistics to non-statisticians, my contribution to this crisis will be to try to sort the good from the bad, hence this post.
Since 15th March 2020, I have been rapidly bringing myself up to date on the data, statistics and science of Coronavirus/Covid19/SARS-CoV-2. The result has been an ever-expanding list of links to useful material and I have organised these links below into various groups. I will be updating this list throughout the crisis so please feel free to bookmark this link. I will try to alert people to fresh updates using my Twitter feed.
Please note, since I am based in the UK, my links will be biased towards material of most interest to people living in the UK. If you come across a link that you think should be added to/removed from this list then please email me. However, please bear in mind that the prime purpose of this post is to demystify coronavirus for non-statisticians and non-scientists. Therefore, I will not be linking to material I consider to be too technical for the general public.
Currently, I have organised my links into 8 sections:
- A: Good sources of data, analysis and scientific information
- B: Links to UK government advice, planning and decisions
- C: Statistical models of the pandemic
- D: Comments from the Science Media Centre
- E: Journalists who have impressed me
- F: All things to do with Testing
- G: All things to do with Sampling
- Z: Miscellaneous material
Before you start reading my list of links, please read this guidance to interpreting statistics of COVID19 published by the Royal Statistical Society.
A: Good sources of data, analysis and scientific information
When it comes to getting data, these sources have been useful to me.
- Daily number of cases, deaths and recoveries in the UK by local authority – This is provided by Public Health England. To get data for individual local authorities, you need to zoom in on the map and select the relevant one. A table then appears in the map giving you the figures. You can also get the same data from the BBC here which also provides population figures for each local authority.
- Update 31/3/20 – awareness of data limitations has risen among the public and PHE are being criticised for not making these limitations clear in their data. See this blog by Simon Briscoe for a description of issues.
- Update 7/4/20 – NHS England have started publishing deaths due to or with COVID19 by the date of death. PHE data above is by the date that the death certificate was filed, which can be later. Please bear in mind that the NHS data for the last few days are underestimates since the death certificates will not yet have been processed by NHS England.
- Update 14/4/20 – ONS (Office for National Statistics) are also publishing deaths due to or with COVID19 by registration date and date of death. Whilst publication is slower and only weekly, it does include all deaths whereas NHS & PHE data is just deaths within NHS-commissioned services.
- Update 14/4/20 – with 3 different data sources for UK deaths (PHE, NHS & ONS), you might find this article by Anthony Masters (see link E4 below) which explains the difference very useful.
- Update 14/4/20 – again with 3 different data sources, you might think they give 3 different answers to questions. In this twitter thread I show that in fact all 3 are showing the same trends which is the most important factor in estimating when the current outbreak will peak.
- Update 1/5/20 – The Care Quality Commission (CQC) is now collecting daily data on deaths in care homes where COVID19 is suspected. The data is published weekly by the ONS and starts from 10th April.
- Daily number of cases, deaths and recoveries by Country – This is a GitHub link prepared by Johns Hopkins University in the USA. Many media platforms are using charts created by Johns Hopkins. I have not yet looked at this in great depth but it does have a lot of links.
- Analysis of trends in cases & deaths by country – This is a fantastic resource prepared by the Our World in Data organisation. If you want to read just one link, start with this one. It takes the same data as given in link A2 above but adds commentary, context and interpretation. A number of the charts are interactive, allowing you to choose which countries you want displayed and whether the scales should be linear or logarithmic. As far as I can tell, this is being updated daily so is worth bookmarking.
- Update 11/5/20 – my analysis of latest trends in England using the 6 available datasets. This is updated weekly.
- What is Coronavirus and the science behind it – This is a slide deck prepared by Michael Lin, an academic from California. It is a good introduction to the science of SARS-CoV-2 (the correct name for the virus) and what was known and not known as of 13th March 2020. Some of the information may be out of date by the time you come to read it so use this as background before reading link A3 above.
- Comparing Covid19 death rates to normal death rates – This is an article by David Spiegelhalter, former President of the Royal Statistical Society, who looks at how the known death rates of Coronavirus compare with normal mortality in the UK across different ages. It provides a method for future statisticians to determine the overall additional number of deaths as a result of this pandemic.
- Update 11/4/20 – Using mortality data available as of end of March, David has been able to give more refined estimates of the risks by age.
- Update 12/5/20 – Yet another informative update from David. He makes a particularly important point about the difference between PFR & IFR which has been badly misunderstood by the media.
- Advice from Information Commissioner on data collection – The virus has prompted a desire to collect information from the public about their symptoms and other details. Since this has potential data privacy implications, the Information Commissioner has put out this statement and Q&A.
- Update 15/4/20 – a briefing pack on patient records and data records put together by the House of Commons Research Library.
- What are Excess Deaths and why they are the key metric – Excess Deaths is the only metric by which the entire effect of the pandemic can be evaluated. It is also the only way to compare the effect between countries, regions and cities. This article by Anthony Masters (see link E4 below) explains how excess deaths are measured.
- Update 4/5/20 – an excellent article by the FullFact charity which explores some of the implications of excess mortality statistics seen so far in the UK and other countries.
- Update 11/5/20 – my attempt to estimate weekly excess deaths in England given that the official figures are published at least 2 weeks in arrears.
- Why counting deaths is more difficult than you think – David Spiegelhalter, former President of the Royal Statistical Society, explains the difficulty that statisticians have in counting the number of deaths due to COVID19.
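To make the idea of excess deaths in the links above concrete, here is a minimal sketch of the basic arithmetic: observed deaths in a period minus the deaths you would normally expect in that period. The weekly figures and the flat baseline are invented purely for illustration, and the function name is my own; the linked articles discuss how a realistic baseline (for example a five-year average) is chosen.

```python
def excess_deaths(observed, expected):
    """Return excess deaths per period: observed minus expected (baseline) deaths."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must cover the same periods")
    return [o - e for o, e in zip(observed, expected)]


# Hypothetical weekly death counts, made up for illustration only.
observed_deaths = [10_000, 12_000, 15_000, 18_000]
# Hypothetical baseline, e.g. an average of the same weeks in earlier years.
expected_deaths = [10_000, 10_000, 10_000, 10_000]

weekly_excess = excess_deaths(observed_deaths, expected_deaths)
total_excess = sum(weekly_excess)

print(weekly_excess)  # [0, 2000, 5000, 8000]
print(total_excess)   # 15000
```

The point of the metric is that it does not depend on how a death was certified, which is why it allows comparisons between countries that count COVID19 deaths differently.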
B: Links to UK Government advice, planning and decisions
These are all links to official government advice on what UK citizens should be doing to help prevent the spread of the virus.
- Latest advice by the NHS
- Government advice on what to do and not do
- Government advice on social distancing
- Useful collection of government links
- The original government plan for Pandemics prepared in 2011 and updated in 2013. Undoubtedly this is what the government started with.
- An article written by the Chief Scientific Officer on 15th March 2020
- Links to all papers used by SAGE group. SAGE is the scientific body that advises COBRA, the government crisis management body. Where relevant I have linked to specific SAGE papers elsewhere.
- Update 7/5/20 – SAGE Membership List – Here is a list of the members of SAGE and its various working groups which do much of the heavy lifting.
- Options offered by the government to businesses affected by government restrictions
- Options offered by the government to employees affected by government restrictions. This link includes some advice given in links B1 to B4 above.
- Options offered by the government to self-employed people affected by government restrictions
- All Slides & Data used in Daily Press Briefings – Every day, the government has been giving a public press briefing and often these are accompanied by presentations. This link collates all the slides and data used in those briefings.
- (long) Presentation by the Chief Medical Officer – On 30th April, Gresham College published this 80-minute PowerPoint presentation by Chris Whitty, Chief Medical Officer. It is a comprehensive overview of what we know and, more importantly, what we don’t know about COVID19. It looks at the epidemiology, clinical treatment, countermeasures, vaccines and drugs. There are a number of pointers to what will drive future government decisions as well.
- Guidance to employers about COVID19 risk assessments for workplaces – On 11th May 2020, the government published comprehensive guidance to be used by employers to assess the COVID19 risks for their employees. 8 different types of workplaces are covered and the process is clearly laid out. This guidance supplements, not replaces, existing statutory requirements to manage health & safety risks in the workplace.
C: Statistical models of the pandemic
Statistical models of the pandemic are important tools to help governments plan their response to pandemics. Models also help you to identify the areas of greatest uncertainty which are either the factors that have the largest impact or factors that are very hard to measure or both.
- Imperial College simulation model – This paper was published on 16th March 2020 and played a major part in the government’s decision to ramp up measures the same day. Unlike a lot of academic papers, this is quite short at 20 pages and is very readable.
- Comments on Imperial College Paper 1 – It is wrong to assume all scientists agree with each other. There are always debates and the Imperial College paper has prompted a series of comments. These comments are made by a group of people which includes Nassim Taleb.
- Comments on Imperial College Paper 2 – This is an article by John Ioannidis, Professor of Epidemiology at Stanford University. Whilst not directly addressing the paper, it was published the day after the paper and makes a number of points relevant to it.
- Comments on Imperial College Paper 3 – This is a twitter thread by Trevor Bedford, an epidemiologist from Washington USA, published 19th March 2020.
- Demonstration of an Epidemic Simulation Model – This is a neat interactive demonstration by The Washington Post of how you can build a model to simulate epidemics and thus identify which measures are best for slowing down an epidemic.
- Measuring model parameters in real time – This is an interesting twitter thread by Alex Adamou, Director of London Mathematical Laboratory, which discusses the parameters that epidemiological models need to know and the difficulties in estimating these parameters during a pandemic.
- First estimates of Case Mixes of UK Intensive Care Patients – On 27th March 2020, the UK Intensive Care National Audit & Research Centre (ICNARC), published its analysis of the first 775 patients to be admitted with Covid19. The 9 page report summarises the case mix of these patients i.e. how they break down by demographics and status of health. These estimates are essential to modelling future scenarios and answering the question whether the NHS will be able to cope with peak demand. The Science Media Centre (explained in section D) released a number of expert comments on this paper here.
- Update 8/4/20 – Latest data from NHS England which breaks this down by gender, ethnicity, BMI, pregnancy and age. Link in that tweet doesn’t work at the moment so I will try and find the correct link.
- Update 28/4/20 – Latest data from the UK ISARIC protocol. This is a panel of 16,749 patients with COVID19 admitted to 166 hospitals in the UK whose clinical symptoms have been tracked through to completion (death or discharge), covering the period between 6th February and 18th April. The UK government has been tracking this on a weekly basis.
- Update 7/5/20 – Latest data from the ONS looking at the risks by ethnicity and how much of the increased risk for BAME can be explained by factors such as population density, health, housing, deprivation, etc. The one factor not accounted for in this work is occupation since this variable can be quite subjective.
- Difficulties in measuring model parameters – This is a similar article to link C8 above but I think Tom Chivers, a very good science journalist, does a better job of getting the same points across.
- The story of how decisions were made in the UK (so far) – It is way too soon to be drawing lessons on how the UK should have made its decisions on COVID19. That can’t be done until the whole pandemic is over, which may take 2 years, as no-one (and I stress no-one) knows what is the best way to deal with it. But this Reuters article looks like a good start and has the feel of something that will be updated as the pandemic progresses.
- Why modelling the COVID19 pandemic is so hard – Possibly Nate Silver’s (See E3 below) best article yet! It uses a comic book format to explain the sheer number of variables that have to be taken into account.
- Why COVID19 is different from SARS, MERS & Ebola – At the start of the outbreak, I said in a webinar to my local business network that Coronavirus is the worst kind of virus for decision making. In this article, Nate Silver (See E3 below) explains the differences between this pandemic and previous pandemics.
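For readers curious about what an epidemic simulation model looks like in practice, the sketch below is a textbook SIR (Susceptible, Infected, Recovered) model stepped forward one day at a time. It is not the Imperial College model or the Washington Post demonstration, both of which are far richer; the population size and the transmission and recovery rates are hypothetical values I have chosen only to show how lowering the transmission rate flattens the peak.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Simulate a basic SIR epidemic with daily time steps.

    beta  = average new infections caused per infected person per day
            (when everyone else is susceptible) - hypothetical value.
    gamma = fraction of infected people who recover each day - hypothetical value.
    Returns a list of (susceptible, infected, recovered) tuples, one per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest number of people infected at the same time."""
    return max(infected for _, infected, _ in history)


# Hypothetical scenarios: same disease, but contacts halved in the second run.
no_measures = simulate_sir(1_000_000, 10, beta=0.30, gamma=0.10, days=365)
distancing = simulate_sir(1_000_000, 10, beta=0.15, gamma=0.10, days=365)

print(round(peak_infected(no_measures)))
print(round(peak_infected(distancing)))
```

Even in a toy model like this, the results are very sensitive to beta and gamma, which is exactly why the links above spend so much time on the difficulty of estimating parameters during a live pandemic.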
If you want to learn more about statistical modelling in general, you can find a collection of blog posts about modelling that I have written here.
D: Comments from the Science Media Centre
The Science Media Centre is an organisation that seeks comments on papers, press releases, government decisions, etc from a panel of relevant scientists. The panel includes statisticians for when statistical comments are needed and I am one of the statisticians on that panel. The SMC collates these comments and circulates the commentary to journalists to assist them in the writing of their articles. Journalist feedback is that they find these commentaries to be very useful.
You can follow the Science Media Centre on Twitter which is a good way of being informed of new comments. As you can imagine, the coronavirus has prompted a regular series of comments and I have chosen to list some of them below. If you would like to see a full list of their comments please click on the 1st link below.
- Complete list of comments on Covid19
- Comments on the safety of home deliveries – useful pointers about the relative risk of buying your groceries from supermarkets compared to on-line deliveries
- Comments on why antibody tests are so important – the UK Chief Medical Officer stated these would be “a game changer”.
- Comments on government announcement of a COVID19 tracking panel – Details of this new panel of 20,000 households who will be tested every week can be found in link G2 below.
E: Journalists who have impressed me
One side effect of COVID19 is how revealing it has been of journalistic quality. A number of journalists who I thought were OK have been shown up to be really poor when confronted with the issues and uncertainties of this virus. However, some have really stepped up and impressed me.
Here is a list of journalists who have written multiple (5 or more) articles which in my opinion are easy to read and do an excellent job of explaining the issues. All links take you to their Twitter feeds. The first two names stood out straightaway.
- Tom Chivers – a freelance science journalist whose work I have followed for a while
- Paul Nuki – Health correspondent for the Daily Telegraph. When I first noticed him, I came close to including one of his articles in this collection of links. In the end, I put links to the data he was using but he put me on the track of the data links given in section A.
- Nate Silver – better known for his political analysis and is more like me in that he is a Statistician rather than a Scientist. To be honest he was not that hot at the beginning of the pandemic. However, around the end of March, I noticed a distinct improvement in the quality of his articles and often he was answering statistical questions that I was trying to answer myself!
- Anthony Masters – is actually an ambassador of the Royal Statistical Society rather than a journalist but he does write a lot of articles explaining statistical concepts, some of which I have linked to elsewhere.
F: All things to do with Testing
Testing is crucial for bringing this pandemic to an end in two ways. The first part is developing reliable tests that can be used to make decisions both at the individual level and at a population level. The second part is testing for vaccines and treatments in order to prevent future outbreaks and treat victims with severe symptoms.
Both parts cannot succeed without statistics and statistical thinking. Many statisticians work in these fields and have seen many ways statistics has been used and misused in these fields. COVID19 could see these errors amplified if the stats are not used correctly. So the links I list below are ones that do a good job of illustrating these issues.
- Why Immunity Passports are harder than you think – This article by Tom Chivers (one of my trusted journalists in link E1 above) gives an excellent explanation as to why we have to get antibody tests right and can’t rush into the first one that appears to work.
- How to use chemical warfare to validate tests – This article by the second of my trusted journalists, Paul Nuki (link E2 above), is a fascinating read about how Porton Down, the UK’s Chemical Warfare establishment, is at the centre of the validation of antibody tests and the problems they are encountering.
- “We need mass unreliable testing now!” – The cry for mass testing is near universal but the callers are either assuming testing is perfect or that errors are minimal. This is not true and this article by my colleague Sophie Carr explains how test results that appear to be clear can in fact be very unclear. This article won Sophie the title of World’s Most Interesting Mathematician in 2019 and it is the second of the two articles in the link.
- Why Sensitivity & Specificity will be hashtags before long – This article by Royal Statistical Society Ambassador, Anthony Masters, explains the difference between sensitivity and specificity of tests and why these two words are so important.
- How unreliable tests screw up Coronavirus Case Counts – Nate Silver (3rd journalist on my list in E3) wrote this article looking at the impact of unreliable tests on the daily case counts. I was actually doing the exact same thing myself when he published this article and beat me to it! What’s good about this article is that you can download his spreadsheet and change his assumptions to see what the effect is.
- The pointlessness of the 100K tests per day target – Tom Chivers (see link E1 above) explains why just doing more and more tests is pointless without an objective in mind.
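The articles above on sensitivity, specificity and unreliable tests all rest on the same piece of arithmetic, which can be sketched in a few lines. The example uses Bayes' theorem to work out the chance that a positive result is a true positive. The prevalence and accuracy figures are hypothetical and chosen only for illustration; they are not measurements of any real COVID19 test.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that someone who tests positive really has the condition.

    prevalence  = proportion of the tested population with the condition
    sensitivity = proportion of people with the condition who test positive
    specificity = proportion of people without the condition who test negative
    """
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test that is "95% accurate" in both directions.
ppv_low_prevalence = positive_predictive_value(0.02, 0.95, 0.95)
ppv_high_prevalence = positive_predictive_value(0.20, 0.95, 0.95)

print(f"{ppv_low_prevalence:.0%}")   # roughly 28%
print(f"{ppv_high_prevalence:.0%}")  # roughly 83%
```

Under these assumptions, when only 2% of those tested have the condition, fewer than 3 in 10 positive results are genuine, even though the test sounds very accurate. This is why a result that appears clear can in fact be very unclear.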
If you want to know about how statistics can be used to draw conclusions from tests and experiments, you can find a collection of blog posts about Statistical Inference that I have written here.
G: All things to do with Sampling
The corollary to good testing is good sampling i.e. testing the right people in the first place. There is constant noise in the media calling for more testing but sample size is meaningless if you are not clear about why you are testing in the first place and you end up testing the wrong people. This is where the statistics of sampling comes in and the links below explain this.
- Estimating the sample size for making decisions about COVID19 – This is an article I wrote on 2nd April 2020 in response to the twitter hive mind asking what sample size would be needed for mass testing. I prefaced the article by holding a Twitter poll here.
- Announcing the UK Covid19 Household tracking panel – The UK government is setting up a panel of 20,000 households who will be tested every week and tracked over time. This is a perfect example of how to do testing properly i.e. do it with an objective in mind. Comments from the Science Media Centre can be read in link D8 above.
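As a rough guide to how sample size relates to precision, the sketch below uses the standard textbook formula for estimating a proportion from a simple random sample. It is not necessarily the calculation used in my article or in the design of the household panel, and it assumes the sample is genuinely random, which is the hard part in practice. The function name and the example margins of error are my own choices for illustration.

```python
import math


def sample_size_for_proportion(expected_proportion, margin_of_error, z=1.96):
    """Sample size needed to estimate a proportion to within +/- margin_of_error.

    Uses n = z^2 * p * (1 - p) / e^2, where z = 1.96 corresponds to a 95%
    confidence level. Assumes a simple random sample from a large population.
    """
    n = z ** 2 * expected_proportion * (1 - expected_proportion) / margin_of_error ** 2
    # Round first so floating-point noise cannot push an exact answer up by one.
    return math.ceil(round(n, 6))


# Worst case is a proportion of 50%, which gives the largest sample size.
n_one_point = sample_size_for_proportion(0.5, 0.01)   # within +/- 1 percentage point
n_five_points = sample_size_for_proportion(0.5, 0.05)  # within +/- 5 percentage points

print(n_one_point)    # 9604
print(n_five_points)  # 385
```

Note that the required sample size depends on the precision you need, not on the size of the population, which is why being clear about the objective matters more than the raw number of tests.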
If you want to know about how statistics can be used to design samples for testing and other purposes, you can find a collection of blog posts about Sampling & Surveys that I have written here.
Z: Miscellaneous material
This was section D but was relabelled section Z on 25th March 2020.
This collection of links will be very eclectic. They are either posts on points related to the Coronavirus that I may want to talk about in future, or links to material that doesn’t yet fit into the categories above, or articles that appear to be relevant but where I am unsure as to the expertise of the author.
- Who are the experts that the media should be talking to? – An article by a statistician, Graeme Archer, who makes a plea to the media to be careful about who they designate as “experts”.
- Update 4th May – A superb article by Fiona Fox, Science Media Centre, on why the media need to defer to their science & health reporters rather than their political reporters in this pandemic.
- Agreement & disagreements among experts – A short article by Michael Blastland on the nature of disagreement among experts.
- Update 27th April – An article by Freddie Sayers contrasting two epidemiologists, Neil Ferguson of the UK and Johan Giesecke of Sweden, looking at their different approaches.
- Mapping facilities for older people in the UK – This interactive map was developed by my colleague and client Mark Thurstain-Goodwin of GeoFutures who specialise in mapping data. It shows the prevalence of older people in the UK along with information on key support and facilities for the more vulnerable such as supermarkets, foodbanks, etc. It’s aimed at charities who provide support and need to know this kind of information during the pandemic.
- Are you an Armchair Epidemiologist? – Anyone following social media will have to put up with know-it-alls. In normal times, such people would be irritating but in these more sombre times, they can be actively misleading. So I was delighted to find this very funny sketch by Australian comedian Mark Humphries which you should watch so you can get better at spotting these people.
- Why capture-recapture sampling can be more accurate – This is a more technical paper than I would normally allow in this collection but the maths is of interest to me. Christoph Kuzmics, an Austrian economist, describes a method of sampling that can be used to estimate the extent of COVID19 infection in a population. The method is normally used by zoologists to estimate how many animals of a rare species are still living in the wild.
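To give a flavour of the capture-recapture idea, here is a minimal sketch of the classic Lincoln-Petersen estimator that zoologists use. It is the simplest version of the technique and is not taken from the linked paper, which adapts the idea to COVID19 in its own way. The counts below are hypothetical.

```python
def lincoln_petersen(first_sample_size, second_sample_size, marked_in_second):
    """Estimate total population size from two samples.

    first_sample_size  = individuals caught, marked and released in sample 1
    second_sample_size = individuals caught in sample 2
    marked_in_second   = individuals in sample 2 that were already marked
    """
    if marked_in_second == 0:
        raise ValueError("no marked individuals recaptured; estimate is undefined")
    return first_sample_size * second_sample_size / marked_in_second


# Hypothetical example: mark 200 animals, later catch 150, of which 30 are marked.
estimated_population = lincoln_petersen(200, 150, 30)
print(estimated_population)  # 1000.0
```

The logic is that if 30 out of 150 (20%) of the second sample were marked, then the 200 marked animals should be about 20% of the whole population, giving an estimate of 1,000.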