In mathematical statistics, the null hypothesis is a general statement that there is no relationship between two measured groups or phenomena. Testing the null hypothesis (rejecting it or failing to reject it) – and thereby deciding whether there is reason to believe the two phenomena are connected – is a central task of statistics, which provides precise criteria for rejection. For instance, to test the claim that an increase in the interest rate on loans decreases the number of borrowers, the null hypothesis would state that the interest rate has no effect on the number of borrowers.
The null hypothesis is the default assertion. It is presumed true until the evidence indicates otherwise. In other words, the claim that there is no relationship between rate hikes and a drop in borrowers is considered valid until the data show otherwise.
The null hypothesis is used differently in the two main approaches to statistical inference. In Ronald Fisher’s approach, the null hypothesis is rejected if the observed data would be improbable were the null hypothesis true. In that case, the scientist rejects the null hypothesis in favor of the alternative hypothesis.
If the data are consistent with the null hypothesis, it is not rejected. In neither case, however, is the null hypothesis or the alternative proved. Instead, the null hypothesis is tested against the data, and a decision is made based on how likely or unlikely the data would be if it were true. In the hypothesis testing approach proposed by Neyman and Pearson, the null hypothesis is set against an explicitly stated alternative.
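The idea of rejecting the null hypothesis when the observed data would be improbable under it can be sketched with a permutation test. This is only an illustration, not a method prescribed here: the function name `permutation_p_value`, the borrower counts, and the choice of a permutation test are all assumptions made for the example.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    Under the null hypothesis the group labels carry no information,
    so shuffling them should produce differences as large as the
    observed one reasonably often.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # The +1 terms count the observed labelling itself, so the
    # estimate can never be exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Invented monthly counts of new borrowers (not real data).
before_rate_hike = [120, 131, 118, 125, 129, 122]
after_rate_hike = [101, 97, 110, 104, 99, 108]

p = permutation_p_value(before_rate_hike, after_rate_hike)
print(f"estimated p-value: {p:.4f}")
```

A small p-value means that a difference this large would rarely arise from shuffled labels alone, which counts as evidence against the null hypothesis. A large p-value means the data are consistent with it, not that it has been proved.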
The null hypothesis states that the distribution under consideration belongs to a particular class. Often it is formulated by specifying the value of some parameter. The adjective “null” entered statistical terminology because the hypothesis being tested often asserted that there was no difference.
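As a sketch of a null hypothesis that fixes the value of a parameter, consider the claim that a coin is fair, that is, that the probability of heads is 0.5. The coin example, the function name `binomial_upper_tail`, and the parameter name `p_null` are assumptions introduced for illustration.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null=0.5):
    """Return P(X >= successes) for X ~ Binomial(trials, p_null).

    This is the probability, assuming the null value p_null is correct,
    of seeing at least as many successes as were actually observed.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical observation: 9 heads in 10 tosses.
tail = binomial_upper_tail(9, 10)
print(f"P(at least 9 heads in 10 tosses | fair coin) = {tail:.4f}")
```

Here the null hypothesis picks out one member of the binomial family of distributions by setting its parameter to 0.5, and the tail probability measures how surprising the observation would be if that value were correct.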
The alternative hypothesis is the assumption adopted when the null hypothesis is rejected. Typically, it is the statement that logically negates the null hypothesis. Often, the alternative hypothesis states that there is a relationship between the studied variables.
When conducting a scientific experiment, we analyze the data we collect in order to choose between hypotheses. For example, suppose you believe that nature should behave in a certain way in a given situation, and you run an experiment to test this. You then want to be able to say that the experimental data support your hypothesis and not someone else’s. In other words, we expect the data to show that the results depend on the variables under study rather than on something else. In most cases, there is no single “clean” experiment, so we have to repeat measurements many times to make the result reliable. We therefore often need statistical analysis of the data. It often turns out that the result depends on many factors, and we need to separate the main ones from the minor ones – the wheat from the chaff.
For example, when a scientist wants to find a link between lung cancer and smoking, it is not enough to find one smoker who has (or has not) developed lung cancer. A significant amount of data must be collected and analyzed before the scientist can argue for a relationship between lung cancer and smoking. In this kind of research, the null hypothesis plays a key role. The null hypothesis essentially assumes that the effect being sought, the very thing the research hopes to find, does not exist. In the search for a relationship between smoking and lung cancer, the null hypothesis says that no such relationship exists. The question is at what point the collected data will be sufficient to reject this claim.
In the case of smoking and lung cancer, the null hypothesis was rejected long ago: no self-respecting scientist would defend it now. But there was a time when there was simply not enough data to reject it, and researchers could not show that the difference in lung cancer incidence between smokers and nonsmokers was not just a matter of chance. Only with a large amount of data, which reduces the possibility of a chance result to a minimum, can the null hypothesis be rejected.
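The effect of sample size can be sketched with a two-proportion z-test, which compares the disease rate in two groups. The counts below are invented and are not epidemiological data. The function name `two_proportion_p_value` and the use of the normal approximation are assumptions made for this illustration.

```python
from math import erfc, sqrt


def two_proportion_p_value(cases_a, n_a, cases_b, n_b):
    """Two-sided p-value for the null hypothesis that both groups
    share the same underlying rate (normal approximation).

    Assumes the pooled rate is strictly between 0 and 1.
    """
    rate_a = cases_a / n_a
    rate_b = cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Same observed rates (15% vs 10%) at two invented sample sizes.
small_sample = two_proportion_p_value(15, 100, 10, 100)
large_sample = two_proportion_p_value(1500, 10_000, 1000, 10_000)

print(f"100 people per group:    p = {small_sample:.3g}")
print(f"10,000 people per group: p = {large_sample:.3g}")
```

With 100 people per group, a gap of this size is easily explained by chance, so the null hypothesis cannot be rejected. With 10,000 per group, the same gap would be extremely unlikely under the null hypothesis.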
In our example, a large amount of data – scientists would say a “large sample” – had to be accumulated to rule out the null hypothesis. But this is not always so. For example, Tycho Brahe, whose many years of observations led to Kepler’s laws of planetary motion, relied instead on exceptionally accurate measurements, which were enough to reject the null hypothesis and establish the result.
So, the next time you read a paper that claims there is a correlation between a disease and its putative cause, ask yourself if the researchers looked at enough cases before ruling out the null hypothesis.