On 30th September 2020, CL:AIRE (the industry body for the land contamination & remediation sector) published new professional guidance for “Comparing Soil Contamination Data with a Critical Concentration”. The 46-page document advises how to use statistics when assessing land contamination and whether land is safe for development. I was the lead author of the guidance and I spent 4 years working with CL:AIRE’s steering committee on what the guidance should cover. Those 4 years were bookended by two statements published by the ASA (American Statistical Association) on the use & misuse of P-values, in 2016 and 2019. In writing this guidance, I felt I was an ambassador for turning those statements into something that could be used by non-statisticians to make real-life decisions that have an impact on us all.
I will be making a number of presentations about this guidance in the future and where possible I will include links to those presentations here as well.
- My presentation to the SILC conference on 8th March 2020 with the subtitle “What’s changed in the guidance and why”
- My presentation to the SOBRA virtual conference on 2nd December 2020. Clicking on the link takes you to the whole conference and I am the first speaker about 10 mins in. The talk lasts about 40 mins. My thanks to the Society of Brownfield Risk Assessment and the other presenters for allowing me to share this link.
- My presentation to the SCLF AGM on 10th December 2020 – This was a longer presentation than the one I gave to SOBRA. My thanks to the Scottish Contaminated Land Forum for allowing me to share this YouTube link. The presentation starts 2m30s in.
My approach to writing the new guidance
In the late 90s, I bought a house in Reading which was a new development built on a former industrial estate. I received a survey report which summarised the tests made on the soil in my garden and how much risk it presented to potential occupants and the wider environment. My brother, who was a laboratory scientist at the time, remarked I could mine my garden for metals and suggested I shouldn’t grow fruit and vegetables in the garden. I had no interest in doing so but it was my first contact with the land contamination industry.
This industry surveys land to ensure a site is suitable for its new use and to prevent unacceptable risks from contamination. Planning officers then decide whether industry practitioners have followed appropriate processes for surveys and analysis and whether the right decisions have been made. Practitioners working in the land contamination industry are a mixture of scientists and engineers. Whilst many will have received basic training in statistics, they are not experts in statistical inference, hence the need for professional guidance in statistics.
Whilst writing the new guidance, I realised that the concluding paragraph from the 2016 ASA statement perfectly captured what I wanted the guidance to convey and I reproduce it here broken down as 6 bullet points –
“Good statistical practice, as an essential component of good scientific practice, emphasizes …
- … principles of good study design and conduct,
- … a variety of numerical and graphical summaries of data,
- … understanding of the phenomenon under study,
- … interpretation of results in context,
- … complete reporting and
- … proper logical and quantitative understanding of what data summaries mean.
… No single index should substitute for scientific reasoning.”
I wish I had fully realised the importance of this paragraph at the beginning of the project, as I could then have recommended that the guidance be laid out in this fashion. For reasons that were perfectly understandable at the time given the wishes of the steering committee, the draft guidance followed a different layout. During the revision process, I tried to steer the layout back towards the ASA layout, with the result that the final version ended up somewhere in between. However, I did add Appendix A1 to the final version, where I explicitly made the link between what was written and the ASA statement.
When taking a sample for the purpose of making decisions, the first thing a statistician wants to know is what population has to be sampled and what the criteria for making decisions are. In the land contamination industry, this is delivered by something called the Conceptual Site Model (CSM). A competent practitioner pulls together all that is already known about the site and combines that knowledge with his or her understanding of how contaminants behave in soil & groundwater and what the potential risks are to humans and the wider environment. The result is a model of the site, the CSM, which he or she then uses to break the site down into 3 parts –
- Areas that can be considered suitable for use and safe for development.
- Areas that are not suitable for use, where the risks will need to be addressed and may require remediation.
- Areas that are unclear and need to be sampled further in order for a decision to be made.
For the third type of area, a suitable sampling & measurement plan using statistical principles will then need to be developed, and a threshold for decision making, known as a Critical Concentration, needs to be specified in advance. The results of the land survey can then be analysed and interpreted using the new guidance, hence its title of “Comparing Soil Contamination Data with a Critical Concentration”.
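To give a flavour of what such a comparison can look like, here is a minimal Python sketch. It is an illustration only and not a procedure taken from the guidance. It assumes the samples were collected under a sound sampling plan and that a Student's t interval for the mean is appropriate for the data. The function name `compare_with_critical`, the sample values and the Critical Concentration of 60 mg/kg are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev

# One-sided 95% Student's t critical values, keyed by degrees of freedom.
# Only a short table is included to keep this sketch self-contained.
T95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015, 6: 1.943,
       7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812, 11: 1.796, 12: 1.782,
       13: 1.771, 14: 1.761, 15: 1.753, 16: 1.746, 17: 1.740,
       18: 1.734, 19: 1.729, 20: 1.725}


@dataclass
class Comparison:
    n: int
    mean: float
    sd: float
    lower: float   # lower limit of the two-sided 90% interval for the mean
    upper: float   # upper limit of the two-sided 90% interval for the mean
    position: str  # where the interval sits relative to the threshold


def compare_with_critical(values, critical):
    """Summarise the data and locate a 90% interval for the mean
    relative to a pre-specified critical concentration (hypothetical helper)."""
    n = len(values)
    if not 2 <= n <= 21:
        raise ValueError("this sketch only tabulates t for 2 to 21 samples")
    m, s = mean(values), stdev(values)
    half_width = T95[n - 1] * s / sqrt(n)
    lower, upper = m - half_width, m + half_width
    if upper < critical:
        position = "interval below critical concentration"
    elif lower > critical:
        position = "interval above critical concentration"
    else:
        position = "interval straddles critical concentration"
    return Comparison(n, m, s, lower, upper, position)


# Hypothetical measurements in mg/kg, compared with a hypothetical threshold.
samples = [32.0, 41.5, 28.3, 55.1, 38.7, 47.2, 30.9, 44.0]
print(compare_with_critical(samples, critical=60.0))
```

The sketch deliberately returns the sample size, mean, standard deviation and interval limits alongside the position of the interval, rather than a single pass/fail flag, so that the numbers can be read in the context of the CSM.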
It is important to appreciate that the new guidance only covers the end of this process, the statistical analysis and decision making, and to my mind focuses on the even-numbered bullet points of the ASA statement. The odd-numbered bullet points are covered by the CSM and Sample Design, which are not covered in the guidance but are essential pre-requisites for using it. This explains the copious number of caveats and pre-requisites at the beginning of the document, as the steering committee was fully aware that some people will jump to the analysis without having done the CSM and Sample Design work. These are large subjects in their own right and they need separate guidance to be written. The debate over the pre-requisites, and the extent to which the new guidance should refer to them, is why it took 4 years to publish.
I would like to thank CL:AIRE for asking me to write the new guidance. It was a hugely educational process and one that forced me to examine my understanding of some basic statistical ideas (such as the Central Limit Theorem) as well as teaching me about the issues the land contamination industry has to deal with. I sincerely hope that the sentiment I expressed in this paragraph in Appendix A1 is what comes to pass.
“The guidance is written on the assumption that it will be read and used by people with a scientific training who are capable of exercising scientific judgement and who wish to use statistics to SUPPLEMENT their professional judgement, not to REPLACE their professional judgement.”