On 30th September 2020, CL:AIRE (the industry body for the land contamination & remediation sector) published new professional guidance for “Comparing Soil Contamination Data with a Critical Concentration”. The 46-page document advises those responsible for deciding whether contaminated land needs to be made safe for human use on how to use statistics to make their decisions. I was the lead author of the guidance and I spent 4 years working with CL:AIRE’s steering committee on what the guidance should cover. The 4 years were bookended by two statements published by the ASA (American Statistical Association) on the use & misuse of P-Values in 2016 & 2019. In writing this guidance, I felt I was an ambassador for turning those statements into something that could be used by non-statisticians to make real-life decisions that have an impact on us all.
I will be making a number of presentations about this guidance in the future and where possible I will include links to those presentations here as well.
- My presentation to the SILC conference on 8th March 2020 with the subtitle “What’s changed in the guidance and why”
My approach to writing the new guidance
In the late 90s, I bought a house in Reading which was a new development built on a former industrial estate. I received a survey report which summarised the tests made on the soil in my garden and how much risk it presented to human life. My brother, who was a laboratory scientist at the time, remarked I could mine my garden for metals and suggested I shouldn’t grow fruit and vegetables in the garden. I had no interest in doing so but it was my first contact with the land contamination industry.
This industry surveys land intended for human use and advises on how safe it is and what can be done to make it safer, a process known as remediation. Planning officers are key decision makers in the process and they base their decisions on surveys and analysis undertaken by industry experts. The experts are a mixture of scientists and engineers and, whilst they will have received training in statistics, they are not experts in statistical inference, hence the need for professional guidance in statistics.
Whilst writing the new guidance, I realised that the concluding paragraph from the 2016 ASA statement perfectly captured what I wanted the guidance to convey and I reproduce it here broken down as 6 bullet points –
“Good statistical practice, as an essential component of good scientific practice, emphasizes …
- … principles of good study design and conduct,
- … a variety of numerical and graphical summaries of data,
- … understanding of the phenomenon under study,
- … interpretation of results in context,
- … complete reporting and
- … proper logical and quantitative understanding of what data summaries mean.
… No single index should substitute for scientific reasoning.”
I wish I had fully realised the importance of this paragraph at the beginning of the project, as I could then have recommended that the guidance be laid out in this fashion. For reasons that were perfectly understandable at the time given the wishes of the steering committee, the draft guidance followed a different layout. During the revision process, I tried to steer the layout back towards the ASA layout, with the result that the final version ended up somewhere in between. However, I did add Appendix A1 to the final version, where I explicitly made the link between what was written and the ASA statement.
When taking a sample for the purpose of making decisions, the first thing a statistician wants to know is what population has to be sampled and what the criteria for making decisions are. In the contaminated land industry, this is delivered by something called the Conceptual Site Model (CSM). An appropriate expert pulls together all that is already known about the site and combines that knowledge with his or her understanding of how contaminants behave in soil and how humans are likely to make use of the soil. The final result is a model of the site called the CSM. He or she then uses the CSM to break the site down into 3 parts:
- Areas that can be declared safe for human use.
- Areas that are not safe and need to be remediated.
- Areas that are unclear and need to be sampled in order for a decision to be made.
For areas of type 3, a suitable sampling & measurement plan using statistical principles will then need to be developed and a threshold for decision making, known as a Critical Concentration, needs to be specified in advance. The results of the land survey can then be analysed and interpreted using the new guidance, hence its title of “Comparing Soil Contamination Data with a Critical Concentration”.
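As a rough illustration only of what comparing survey results with a Critical Concentration can involve, here is a minimal Python sketch. It is not the method set out in the guidance. The function name, the measurements and the threshold are all hypothetical, and the interval for the mean relies on a normal approximation, which is an assumption that may be poor for small or skewed samples. In the spirit of the ASA statement, it reports several numerical summaries rather than a single index.

```python
from math import sqrt
from statistics import NormalDist, mean, median, stdev


def summarise_against_threshold(concentrations, critical_concentration, confidence=0.95):
    """Summarise measurements relative to a pre-specified threshold.

    Hypothetical helper for illustration; the confidence interval for the
    mean uses a normal approximation (an assumption, not the guidance's method).
    """
    n = len(concentrations)
    if n < 2:
        raise ValueError("need at least two measurements")
    m = mean(concentrations)
    s = stdev(concentrations)  # sample standard deviation
    se = s / sqrt(n)           # standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return {
        "n": n,
        "mean": m,
        "median": median(concentrations),
        "std_dev": s,
        "ci_lower": m - z * se,
        "ci_upper": m + z * se,
        # How many individual measurements exceed the threshold
        "n_above": sum(1 for c in concentrations if c > critical_concentration),
    }


# Made-up measurements (mg/kg) and a made-up Critical Concentration
sample = [12.0, 15.5, 9.8, 22.1, 18.4, 11.3, 14.7, 16.2]
summary = summarise_against_threshold(sample, critical_concentration=20.0)
for name, value in summary.items():
    print(name, value)
```

None of these numbers makes the decision on its own. They only mean something when read alongside the CSM and the sampling design that produced the data.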
It is important to appreciate that the new guidance only covers the end of this process, the statistical analysis and decision making, and to my mind focuses on the even numbered bullet points of the ASA statement. The odd numbered bullet points are covered by the CSM and Sample Design which are not covered in the guidance but are essential pre-requisites in order to use the guidance. This explains the copious number of caveats and pre-requisites at the beginning of the document as the steering committee was fully aware that some people will jump to the analysis without having done the CSM and Sample Design work. These are large subjects in their own right and they need separate guidance to be written. It was this debate over the pre-requisites and the extent to which they should be referred to in the new guidance that explains why it took 4 years to publish it.
I would like to thank CL:AIRE for asking me to write the new guidance. It was a hugely educational process and one that forced me to examine my understanding of some basic statistical ideas (such as the Central Limit Theorem) as well as teaching me about the issues the land contamination sector has to deal with. I sincerely hope the outcome is that the sentiment I expressed in this paragraph in Appendix A1 is the one that comes to pass.
“The guidance is written on the assumption that it will be read and used by people with a scientific training who are capable of exercising scientific judgement and who wish to use statistics to SUPPLEMENT their professional judgement, not to REPLACE their professional judgement.”