{"id":1612,"date":"2025-03-08T18:10:59","date_gmt":"2025-03-08T18:10:59","guid":{"rendered":"https:\/\/marriott-stats.com\/nigels-blog\/?p=1612"},"modified":"2026-01-23T13:58:25","modified_gmt":"2026-01-23T13:58:25","slug":"stats-training-materials-hypothesis-testing","status":"publish","type":"post","link":"https:\/\/marriott-stats.com\/nigels-blog\/stats-training-materials-hypothesis-testing\/","title":{"rendered":"Stats Training Materials &#8211; Statistical Inference &amp; Hypothesis Testing"},"content":{"rendered":"<p>Mention P-values and most people will probably shudder at some memory of an incomprehensible lecture or lesson on statistical tests.\u00a0 Words like null hypotheses, t-tests, statistical significance might pop into your mind with little understanding of what they are about.\u00a0 What you may know is that scientists have to report a p-value for any experiment they do or do they?<\/p>\n<p>The area of Statistical Inference is a core area of study for any statistician.\u00a0 Put simply, Inference means to infer from the observations you&#8217;ve made about your data and to draw conclusions about what might be happening in real life.\u00a0 There are two parts to Inference.<\/p>\n<p><!--more--><\/p>\n<ol>\n<li><strong>Exploratory analysis<\/strong> &#8211; where you explore your data through charts, tables and other statistics and end up with one or more hypotheses about what might be going on.<\/li>\n<li><strong>Confirmatory Analysis<\/strong> &#8211; where you seek to confirm your hypothesis which can often be through the use of statistical tests but should not be exclusively confirmed through such tests.<\/li>\n<\/ol>\n<p>I am a fan of using the criminal justice system as an analogy to explain this.\u00a0 When a crime occurs, the police investigate and collect evidence i.e. 
they undertake an exploratory analysis of the data.\u00a0 The outcome of this is a hypothesis that a person is guilty of the crime.\u00a0 That person is then tried in a court where the null hypothesis is that the person is innocent.\u00a0 The evidence is then examined via a statistical test and the outcome is a p-value that the jury uses to come to a verdict.\u00a0 Either the verdict is to reject the null hypothesis of innocence and therefore find the person guilty, or the null hypothesis cannot be rejected and therefore the verdict is not guilty.\u00a0 At no point does a court conclude that the person is innocent; that is not the outcome of a statistical test.<\/p>\n<p>Below is a list of various materials that you can use to learn more about hypothesis testing.<\/p>\n<hr \/>\n<h4><span style=\"color: #008000;\"><strong>A. Experimental Design<\/strong><\/span><\/h4>\n<p>Classically, a hypothesis should be specified before any data is collected.\u00a0 This leads you into the area of Experimental Design (or DOE) which is a vast area of statistics.\u00a0 If you do this, then conclusions drawn once the data has been analysed are usually sounder than those drawn from data collected by other means.<\/p>\n<p>More commonly, a hypothesis is generated after some data has been collected and analysed.\u00a0 The problem with this approach is that the way the data was collected may not be sufficient for you to draw firm conclusions.\u00a0 In reality, any conclusions should be treated as hypotheses for a proposed experiment.<\/p>\n<p>Two blog posts of mine explain more.<\/p>\n<ol>\n<li>Find out the difference between experiments and observations in my <a href=\"https:\/\/marriott-stats.com\/nigels-blog\/stats-in-the-news-0-the-evidence-hierarchy-and-how-to-use-it\/\" target=\"_blank\" rel=\"noopener noreferrer\">Evidence Hierarchy<\/a>.<\/li>\n<li>See an example of an experiment and how it can be improved in &#8220;<a 
href=\"https:\/\/marriott-stats.com\/nigels-blog\/stats-in-the-news-2-who-reads-fake-news\/\" target=\"_blank\" rel=\"noopener noreferrer\">Who reads fake news?<\/a>&#8221;<\/li>\n<li>What is the gold standard for an experiment?\u00a0 The answer is <strong>GRRaCE<\/strong> (<strong>G<\/strong>eneralisable, <strong>R<\/strong>eproducible, <strong>RA<\/strong>ndomised, <strong>C<\/strong>ontrolled <strong>E<\/strong>xperiment) which I will expand upon in a post soon.<\/li>\n<\/ol>\n<hr \/>\n<h4><span style=\"color: #008000;\"><strong>B. Statistical Tests<\/strong><\/span><\/h4>\n<p>Hypothesis testing causes a lot of confusion and is often explained badly.\u00a0 I intend to add more links to articles that do a good job on this.<\/p>\n<ol>\n<li>Why do women like my logo?\u00a0 To be published soon as an example of doing a 2-way Chi-Squared test in Microsoft Excel.<\/li>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/the-changing-diversity-of-the-conservative-party-mps-and-leaders\/\" target=\"_blank\" rel=\"noopener\">Is the Conservative Party intersectional for ethnicity &amp; gender?<\/a> This blog looks at the changing gender &amp; ethnic diversity of Conservative MPs since 2001 and one section uses a 2-way Chi-Squared Test to examine the interaction between gender &amp; ethnicity.<\/li>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/uk-general-elections-4-how-accurate-are-the-polls-updated-with-ge19\/\" target=\"_blank\" rel=\"noopener\">Do opinion polls tend to underestimate the gap between the Conservative &amp; Labour parties?<\/a> I use a simple T-test to examine this hypothesis towards the end of this blog though I don&#8217;t explain the method.<\/li>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/uk-weather-trends-2022-meteorological-year\/\" target=\"_blank\" rel=\"noopener\">Has there been a step change in UK annual temperatures<\/a>?\u00a0 I use a 2-sample t-test to see if temperatures in the 21st century are different from the 
20th century.\u00a0 This article also introduces the basic principles of SPC (Statistical Process Control) which uses confidence intervals covered in section C below.<\/li>\n<li><a href=\"https:\/\/twitter.com\/MarriottNigel\/status\/1339171936824406018\" target=\"_blank\" rel=\"noopener\">Did the Mayor of Paris discriminate in favour of women<\/a>? In 2019, the Mayor of Paris was fined for having too many women in her leadership team.\u00a0 I show in this tweet how stupid this was, as a simple Binomial Test demonstrates that what happened was completely consistent with a null hypothesis of no discrimination.<\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=PJVo_aGHMaE\" target=\"_blank\" rel=\"noopener\">You the Jury!\u00a0 Is this case a prosecutor&#8217;s fallacy or not?<\/a> In 2021, I gave evidence at a Medical Practitioners Tribunal where a doctor had been charged with cheating in an exam.\u00a0 The case came to light when the examination board noted an unusual degree of similarity between her answers and those of another candidate.\u00a0 I was required to test whether this similarity was unusual, which I did using a Chi-Squared test for a Null Hypothesis which assumed the answers of the 2 candidates were independent.\u00a0 The link takes you to a YouTube recording of an event hosted by <a href=\"https:\/\/rss.org.uk\/membership\/rss-groups-and-committees\/sections\/statistics-law\/\" target=\"_blank\" rel=\"noopener\">the Statistics &amp; Law section<\/a> of the Royal Statistical Society where I show how I performed this test and the conclusions.\u00a0 I then went on to list the other evidence for and against the defendant before revealing the verdict.\u00a0 The recording contains <a href=\"https:\/\/docs.google.com\/forms\/d\/e\/1FAIpQLSdk3z6f0u1b0ExeaRoyUuvu2zgf-LORbqFAWABVaKi0Q_pLqA\/viewform\" target=\"_blank\" rel=\"noopener\">a QR code which opens this form<\/a> allowing you to vote on the probability of guilt as the evidence is revealed.<\/li>\n<\/ol>\n<hr 
\/>\n<h4><strong><span style=\"color: #008000;\">C. Confidence Intervals<\/span><\/strong><\/h4>\n<p>Confidence intervals are often recommended as an alternative to using P-values when assessing statistical significance.\u00a0 Together they are like two sides of the same coin and a case can be made that communication of results is easier with confidence intervals rather than p-values.<\/p>\n<p>Here are some examples of confidence intervals in action.<\/p>\n<ol>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/gender-pay-gap-and-life-on-mars\/\" target=\"_blank\" rel=\"noopener noreferrer\">When is a gender pay gap statistically significant<\/a>?<\/li>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/gpg-yoy-trends-unilever-2\/\" target=\"_blank\" rel=\"noopener noreferrer\">Is the published gender pay gap data for an employer correct<\/a>?\u00a0 I show how SPC can be used to conclude whether the year on year change is plausible or not.<\/li>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/data-driven-decision-making-1-new-statistical-guidance-for-contaminated-land-surveys\/\" target=\"_blank\" rel=\"noopener noreferrer\">Is the land safe for human activities?<\/a>\u00a0 I was the lead author of a professional guidance document for the contaminated land industry which explains how statistics (specifically confidence intervals) can be used to make decisions on whether land is safe or not.<\/li>\n<\/ol>\n<hr \/>\n<h4><span style=\"color: #008000;\"><strong>D. 
P-Values<\/strong><\/span><\/h4>\n<p>The heart of traditional hypothesis testing is the calculation and the interpretation of P-Values.\u00a0 Many scientists and researchers in many fields have learned that this is how you decide if your research is statistically significant.<\/p>\n<p>Unfortunately, the use of p-values has not conformed to good statistical practice and a number of issues have emerged.\u00a0 As a result the American Statistical Association (ASA) undertook a widespread consultation to see if these issues could be addressed.\u00a0 The outcome of the consultation has been a series of guidance documents which are listed below.<\/p>\n<ol>\n<li>In March 2016, <a href=\"https:\/\/www.amstat.org\/\/asa\/files\/pdfs\/P-ValueStatement.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">the ASA Statement on P-Values<\/a> was published which explained how P-values can be misused. <a href=\"https:\/\/amstat.tandfonline.com\/doi\/full\/10.1080\/00031305.2016.1154108?scroll=top&amp;needAccess=true#.XJls3PZ2tdg\" target=\"_blank\" rel=\"noopener noreferrer\">The full statement can be downloaded here.<\/a>\u00a0 This was widely discussed throughout the world of research.<\/li>\n<li>In September 2016, the statement was the subject of <a href=\"https:\/\/www.youtube.com\/watch?v=B7mvbOK1ipA\" target=\"_blank\" rel=\"noopener noreferrer\">a keynote session at the Royal Statistical Society&#8217;s (RSS) conference in Manchester<\/a>.\u00a0 I am in the front row of the YouTube clip taking many notes!<\/li>\n<li>In March 2019, the ASA published new guidance &#8220;<a href=\"https:\/\/www.tandfonline.com\/doi\/full\/10.1080\/00031305.2019.1583913\" target=\"_blank\" rel=\"noopener noreferrer\">Moving to a world beyond p&lt;0.05<\/a>&#8221;.\u00a0 This note is intended to be guidance on what alternatives there are to using p-values to undertake statistical inference.<\/li>\n<li>In August 2019, Significance magazine published this article by William Cready which explored <a 
href=\"https:\/\/rss.onlinelibrary.wiley.com\/doi\/10.1111\/j.1740-9713.2019.01297.x\" target=\"_blank\" rel=\"noopener noreferrer\">some of the difficulties in implementing the ASA guidance<\/a>.<\/li>\n<li>In February 2020, I coined the hashtag #Pexit as a shorthand for &#8220;Exit from P-Values&#8221; to describe the 2019 ASA statement.\u00a0 In this Twitter thread, <a href=\"https:\/\/twitter.com\/MarriottNigel\/status\/1227944307979640835\" target=\"_blank\" rel=\"noopener noreferrer\">I pointed out that Brexit and Pexit have a lot in common<\/a>!<\/li>\n<\/ol>\n<p>The conclusion from the 1st link really resonates with me and is the basis of how I teach hypothesis testing in my courses.<\/p>\n<p><span style=\"color: #993366;\"><em><span style=\"text-align: left; text-transform: none; text-indent: 0px; letter-spacing: normal; font-family: 'Open Sans',Sans-Serif; font-size: 17.6px; font-variant: normal; font-weight: 400; text-decoration: none; float: none; background-color: transparent;\">&#8220;Good statistical practice, as an essential component of good scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning.&#8221;<\/span><\/em><\/span><\/p>\n<p>For an interesting discussion on the meaning of the word &#8220;significant&#8221;, <a href=\"https:\/\/rss.onlinelibrary.wiley.com\/doi\/10.1111\/j.1740-9713.2019.01296.x\" target=\"_blank\" rel=\"noopener noreferrer\">have a read of this article by Neil Sheldon where he recommends substituting the word &#8220;outlier&#8221;.<\/a><\/p>\n<hr \/>\n<h4><span style=\"color: #008000;\"><strong>E. 
Inference\u00a0<\/strong><\/span><\/h4>\n<p>Inference comes from the verb &#8220;to infer&#8221; and is about the drawing of conclusions (both strong and weak) from data.\u00a0 Hypothesis Testing &amp; Confidence Intervals are the main statistical methods by which we do this but they are not the only methods.\u00a0 <a href=\"https:\/\/marriott-stats.com\/nigels-blog\/stats-training-materials-forecasting-risk-modelling\/\" target=\"_blank\" rel=\"noopener noreferrer\">Forecasting and Risk Modelling<\/a> are two other options available among many.<\/p>\n<p>Here is a list of blog posts where I draw conclusions from the available data.<\/p>\n<ol>\n<li>I was asked to provide an expert opinion of the <a href=\"https:\/\/marriott-stats.com\/nigels-blog\/stats-in-the-news-3-bath-clean-air-zone-caz\/\" target=\"_blank\" rel=\"noopener noreferrer\">claim made by Bath &amp; North East Somerset Council about their proposed Clean Air Zone plan<\/a>.\u00a0 Did I change their plans?<\/li>\n<li>If you were to join an organisation where everyone was white, <a href=\"https:\/\/marriott-stats.com\/nigels-blog\/ethnicity-1-is-all-white-alright\/\" target=\"_blank\" rel=\"noopener noreferrer\">could you conclude that this might be due to racial discrimination<\/a>?\u00a0 I give an answer by introducing the idea of Bayesian Inference.<\/li>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/epl-201718-1-what-are-you-expecting-for-your-team-this-season\/\" target=\"_blank\" rel=\"noopener noreferrer\">Has the gap between top and bottom in the English Premier League widened?<\/a>\u00a0 I look at trends in the league placings since 1993.<\/li>\n<li><a href=\"https:\/\/marriott-stats.com\/nigels-blog\/rugby-world-cup-who-will-win-in-2019-3\/\" target=\"_blank\" rel=\"noopener noreferrer\">Are stronger teams doing better than expected in the 2019 Rugby World Cup?<\/a> I used World Rugby&#8217;s rankings to predict matches for the 2019 World Cup and ahead of the final, I evaluate the 
model&#8217;s performance.<\/li>\n<li><a href=\"https:\/\/rss.org.uk\/RSS\/media\/File-library\/Publications\/ICCA-RSS-guide-version-6-branded-171019-REV03-designed-covers.pdf\" target=\"_blank\" rel=\"noopener\">Understanding the use of statistics evidence in courts &amp; tribunals<\/a> &#8211; This is a joint publication by the Royal Statistical Society and the Inns of Court College of Advocacy in 2017.\u00a0 It is intended for legal professionals when confronted with statistical evidence.\u00a0 I refer to page 24 of this publication in the Youtube recording in link B6 above when I discuss whether this case was a prosecutor&#8217;s fallacy or not.<\/li>\n<\/ol>\n<hr \/>\n<p>If you would like to book a training course in <a href=\"https:\/\/marriott-stats.com\/turn-data-into-insights-with-statistical-inference\/\" target=\"_blank\" rel=\"noopener noreferrer\">Statistical Inference<\/a>, then please <a href=\"https:\/\/marriott-stats.com\/contact-us\/\" target=\"_blank\" rel=\"noopener noreferrer\">contact me<\/a>.<\/p>\n<p>For more information about my other training courses in statistics, please visit my <a href=\"https:\/\/marriott-stats.com\/training\/\" target=\"_blank\" rel=\"noopener noreferrer\">Statistical Training homepage<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Mention P-values and most people will probably shudder at some memory of an incomprehensible lecture or lesson on statistical tests.\u00a0 Words like null hypotheses, t-tests, statistical significance might pop into your mind with little understanding of what they are about.\u00a0 What you may know is that scientists have to report a p-value for any experiment 
[&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","footnotes":""},"categories":[7],"tags":[101,36,35,99,100,93],"class_list":{"0":"post-1612","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-stats-training","7":"tag-confidence-intervals","8":"tag-evidence","9":"tag-evidence-hierarchy","10":"tag-hypothesis-testing","11":"tag-p-values","12":"tag-statistical-training","13":"entry","14":"override"},"_links":{"self":[{"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/posts\/1612","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/comments?post=1612"}],"version-history":[{"count":16,"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/posts\/1612\/revisions"}],"predecessor-version":[{"id":6772,"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/posts\/1612\/revisions\/6772"}],"wp:attachment":[{"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/media?parent=1612"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/categories?post=1612"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/marriott-stats.com\/nigels-blog\/wp-json\/wp\/v2\/tags?post=1612"}],"curies":[{"name":"wp","href":"https:\
/\/api.w.org\/{rel}","templated":true}]}}