Hypothesis Testing

Nadeeha Salam
5 min read · Jul 5, 2021

“Science is advanced by proposing and testing hypotheses, not by declaring questions unsolvable” - Nick Matzke

We make assumptions or claims almost every day, but have you ever wondered how we can support those assumptions mathematically? This is exactly where hypothesis testing comes into the picture. Put simply, hypothesis testing checks whether the data support our claims and assumptions. So, how can we make use of hypothesis testing as data scientists?

Let’s first understand hypothesis testing in general terms and then dig into its statistical concepts.

Presumption of Innocence: Are you really guilty or innocent?

I don’t think there’s a simpler example than the presumption of innocence for explaining hypothesis testing. The presumption of innocence is a legal principle stating that the accused is innocent until proven guilty. Hence, if you are accused of a crime, you are innocent in the eyes of the court until the prosecution proves that you are guilty. That is, your innocence plays the role of the null hypothesis, your guilt is the prosecutor’s claim (the alternative hypothesis), and it is up to the prosecution to provide enough evidence to reject your innocence, which is essentially a hypothesis test.

In a data science problem, the evidence for or against a claim is expressed as a probability, and the assumptions in the problem form the null hypothesis and the alternative hypothesis.

Terms related to Hypothesis Testing

First, let’s define a problem statement and solve it as we go through the concepts.

Suppose that the average marks of a sample of 50 grade 10 students in a subject were found to be 81. Carry out a statistical test to determine whether the average marks are greater than 77, assuming that marks are normally distributed. It is known that the standard deviation of marks is approximately 10.

Now, let’s understand the different terms related to Hypothesis Testing.

Hypothesis: An assumption about the parameters of a population distribution.

Hypothesis Test: A standard procedure for testing a claim about the parameters of a population.

Rare Event Rule for Inferential Statistics: If, under a given assumption, the probability of an observed event is very small, we conclude that the assumption is probably not correct.

Components of a formal hypothesis test:

  1. Null Hypothesis & Alternate Hypothesis:

Null hypothesis: For a statement or claim, identify the null hypothesis (the statement that the value of a population parameter is equal to a proposed value).

E.g., in our example, 77 is the proposed value of the population mean (the mean of the entire population), i.e., the established value against which the new claim is tested; hence the null hypothesis is that the average marks are equal to 77.

Alternate hypothesis: While the established statement becomes the null hypothesis, the new claim against the null hypothesis becomes the alternate hypothesis.

E.g., here we are asked to test whether the average marks are greater than 77; hence this is our alternate hypothesis.

i.e., for the given problem:

Null hypothesis H0: μ = 77
Alternate hypothesis H1: μ > 77

Two-tailed test: null hypothesis: μ = μ0, alternate hypothesis: μ ≠ μ0

Right-tailed test: null hypothesis: μ = μ0, alternate hypothesis: μ > μ0

Left-tailed test: null hypothesis: μ = μ0, alternate hypothesis: μ < μ0

(Here μ is the population parameter and μ0 is its proposed value.)

For the given problem, it is a right-tailed test.

Test Statistic: To compute the value of a test statistic, we first assume that the null hypothesis is true; the sample statistic is then converted to a score, which is the test statistic.

Different formulas are used to calculate the test statistic depending on whether the test concerns a proportion, a mean or a variance.

Test statistic for Proportion
Test Statistic for Mean
Test Statistic for variance
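
The formulas themselves are not reproduced in the text. As a hedged reference, the standard textbook forms are sketched below; the symbols used are the usual conventions and are an assumption here, not necessarily the notation of the original figures.

```latex
% Proportion (p is the proportion claimed under the null hypothesis, q = 1 - p)
z = \frac{\hat{p} - p}{\sqrt{pq / n}}

% Mean, population standard deviation \sigma known
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}

% Mean, \sigma unknown (s is the sample standard deviation)
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}

% Variance (s^2 is the sample variance, \sigma^2 the claimed population variance)
\chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}
```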

In the example, since the hypothesis concerns the population mean and the standard deviation is known, let’s use the test statistic formula for the mean to calculate the z-score.

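i.e., with sample mean x̄ = 81, hypothesised mean μ = 77, standard deviation σ = 10 and sample size n = 50:

z = (x̄ − μ) / (σ / √n) = (81 − 77) / (10 / √50) ≈ 2.828, taken as 2.82 in what follows

Test statistic and other parameters for the problem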
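
As a quick check of the arithmetic, here is a minimal Python sketch of the z-score for the problem statement. It assumes the usual z formula for a mean with known standard deviation, z = (x̄ − μ) / (σ / √n); the variable names are illustrative only.

```python
from math import sqrt

# Values taken from the problem statement
sample_mean = 81   # average marks observed in the sample
mu_0 = 77          # mean assumed under the null hypothesis
sigma = 10         # known standard deviation of marks
n = 50             # number of students in the sample

# Standard error of the sample mean
standard_error = sigma / sqrt(n)

# z-score: how many standard errors the sample mean lies above mu_0
z = (sample_mean - mu_0) / standard_error
print(f"z = {z:.3f}")
```

The unrounded value is about 2.828; the article works with 2.82.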

Significance Level: The significance level (α) acts as a threshold: it defines how strong the evidence must be to reject the null hypothesis in favour of the alternate hypothesis. It is the probability of rejecting the null hypothesis when it is actually true.

Here, let’s assume the value of alpha is 0.05, or 5 percent, i.e., α = 0.05.

Critical Value & Critical Region: The critical region contains all the values of the test statistic that lead to the rejection of the null hypothesis. Critical values separate the critical region from the values of the test statistic that do not lead to rejection.
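
For the right-tailed example with a significance level of 0.05, the critical value can be sketched as follows. This assumes a z-test and uses the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05   # significance level assumed in the article
z = 2.82       # test statistic used in the article

# Right-tailed test: the critical value cuts off an area of alpha
# in the upper tail of the standard normal distribution
z_critical = NormalDist().inv_cdf(1 - alpha)

# The critical region is every z value above the critical value
in_critical_region = z > z_critical

print(f"critical value = {z_critical:.3f}")
print(f"test statistic in critical region: {in_critical_region}")
```

The critical value is roughly 1.645, so a test statistic of 2.82 falls inside the critical region.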

P-Value: The p-value is the probability of getting a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.

Note: If the p-value is very small (for example, less than 0.05), the null hypothesis is usually rejected.

E.g., using Python’s statistics packages, the p-value corresponding to the test statistic value 2.82 was found to be 0.0024.
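
The original snippet is not shown, so here is one way the quoted p-value could be reproduced. This sketch assumes the standard-library `statistics.NormalDist`; `scipy.stats.norm.sf(2.82)` would be an equivalent alternative if SciPy is available.

```python
from statistics import NormalDist

z = 2.82  # test statistic used in the article

# Right-tailed test: the p-value is the area under the standard
# normal curve to the right of the test statistic
p_value = 1 - NormalDist().cdf(z)

print(f"p-value = {p_value:.4f}")
```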

When do we reject a null hypothesis or fail to reject a null hypothesis?

Case 1. P-value < significance level: reject the null hypothesis

Case 2. P-value > significance level: fail to reject the null hypothesis

E.g., here the p-value is less than the significance level, i.e., 0.0024 < 0.05, and hence we reject the null hypothesis that the average mark is equal to 77.

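Type I Error & Type II Error: Just like any other testing process, hypothesis testing is prone to errors. A Type I error is rejecting a null hypothesis that is actually true; a Type II error is failing to reject a null hypothesis that is actually false. The significance level is the probability of a Type I error, so choosing it can make or break the test.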
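
A Type I error means rejecting a null hypothesis that is actually true, and its long-run rate should be close to the significance level. The simulation below is a rough sketch of that idea for the marks example; the number of trials and the random seed are arbitrary choices, not part of the original problem.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # arbitrary seed so the run is repeatable

mu_0, sigma, n, alpha = 77, 10, 50, 0.05
standard_error = sigma / sqrt(n)
z_critical = NormalDist().inv_cdf(1 - alpha)  # right-tailed cutoff

trials = 5000  # arbitrary number of simulated studies
false_rejections = 0
for _ in range(trials):
    # Draw a sample mean from a world where the null hypothesis is true
    sample_mean = random.gauss(mu_0, standard_error)
    z = (sample_mean - mu_0) / standard_error
    if z > z_critical:
        false_rejections += 1  # rejected a true null: Type I error

type_1_rate = false_rejections / trials
print(f"Observed Type I error rate: {type_1_rate:.3f}")
```

The observed rate should land near 0.05; lowering alpha reduces it, at the cost of making true effects harder to detect.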

Power of Hypothesis Test: The probability of rejecting a false null hypothesis; in other words, the probability of supporting an alternative hypothesis that is true.
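
Power can only be computed once a specific true value of the parameter is assumed. The sketch below assumes, purely for illustration, that the true average mark is 80; that value is hypothetical and does not come from the problem statement.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, alpha = 77, 10, 50, 0.05
true_mean = 80  # hypothetical true average, chosen for illustration

standard_error = sigma / sqrt(n)
z_critical = NormalDist().inv_cdf(1 - alpha)

# Smallest sample mean that leads to rejecting the null hypothesis
cutoff = mu_0 + z_critical * standard_error

# Power: probability that the sample mean exceeds the cutoff
# when the true mean is `true_mean`
power = 1 - NormalDist(true_mean, standard_error).cdf(cutoff)

# Probability of a Type II error under the same assumption
beta = 1 - power

print(f"power = {power:.2f}, beta = {beta:.2f}")
```

Under these assumptions the power comes out at roughly 0.68; a larger sample or a true mean further from 77 would raise it.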

To conclude, a hypothesis test has the following steps:

  1. Formulate the null hypothesis and alternate hypothesis from the problem statement.
  2. Identify the type of tail: two-tail, right tail or left tail.
  3. Calculate the value of the test statistic.
  4. For the chosen significance level, find the critical values.
  5. For the test statistic found, identify the p-value.
  6. Compare the p-value with the significance level to get inferences.
  7. If p-value < significance level, reject the null hypothesis; else fail to reject the null hypothesis.
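
The steps above can be sketched as one small function. This is an illustrative one-sample z-test for a mean with known standard deviation, not a general-purpose routine; the function name `z_test_mean` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu_0, sigma, n, alpha=0.05, tail="right"):
    """One-sample z-test for a mean, assuming sigma is known."""
    std_normal = NormalDist()

    # Step 3: test statistic
    z = (sample_mean - mu_0) / (sigma / sqrt(n))

    # Steps 2, 4 and 5: critical value and p-value depend on the tail
    if tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
        p_value = std_normal.cdf(z)
    elif tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")

    # Steps 6 and 7: compare the p-value with the significance level
    return {
        "z": z,
        "critical_value": critical,
        "p_value": p_value,
        "reject_null": p_value < alpha,
    }


# The marks problem: H0: mean = 77, H1: mean > 77
result = z_test_mean(81, 77, 10, 50, alpha=0.05, tail="right")
print(result)
```

For the marks problem this rejects the null hypothesis. The p-value is about 0.0023 here rather than 0.0024 because the z-score is not truncated to 2.82.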
