Introduction
The Mann-Whitney U Test is a non-parametric statistical technique used to investigate whether two independent samples come from the same distribution. It is widely used to compare a numerical or ordinal outcome between two groups when normality cannot be assumed, and it does not require the two groups to be the same size. Typical situations include two groups of different sizes, two groups whose values are not normally distributed, and data measured only on an ordinal scale. It is not appropriate for paired data, because it assumes that the two samples are independent; paired observations call for the Wilcoxon signed-rank test instead. The Mann-Whitney U Test is also known as the Wilcoxon rank-sum test.
Background
The Mann-Whitney U Test is a non-parametric test used to compare two independent sets of data. It allows us to determine whether the distributions of the two data sets are significantly different, and it is often used when the assumptions of parametric tests cannot be satisfied. The test is named after Henry Mann and Donald Whitney, who developed it in 1947. While the Mann-Whitney U Test is similar to the t-test in its ability to identify significant differences between two data sets, it does not require the data to be normally distributed. It also makes no equal-variance assumption of the kind the standard t-test makes, although interpreting a significant result as a shift in location (for example, a difference in medians) assumes that the two distributions have a similar shape.
Methodology
The steps required to apply the Mann-Whitney U Test are outlined here. First, the two data sets are combined and every observation is ranked within the pooled sample. That is, the values from both groups are ordered together from smallest to largest, not separately within each group. Ranks are assigned starting from one for the smallest value. If several observations share the same value, each of them receives the same rank, which is the average of the ranks those observations would otherwise occupy. The ranks are then summed separately for each group, and the Mann-Whitney U statistic is calculated from these rank sums.
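The ranking step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `average_ranks` and the two small groups are hypothetical and chosen only to show how a tie is handled.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share the mean of their ranks."""
    # Indices of `values` in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Sorted positions i..j correspond to ranks i+1..j+1; use their mean.
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Hypothetical data: the value 5 appears once in each group.
group_a = [3, 5, 8]
group_b = [5, 9]

# Rank the pooled sample, then split the ranks back into the two groups.
ranks = average_ranks(group_a + group_b)
rank_sum_a = sum(ranks[:len(group_a)])
rank_sum_b = sum(ranks[len(group_a):])

print(ranks)                   # [1.0, 2.5, 4.0, 2.5, 5.0]
print(rank_sum_a, rank_sum_b)  # 7.5 7.5
```

The two tied observations would otherwise occupy ranks 2 and 3, so each receives 2.5. As a quick check, the two rank sums always add up to N(N + 1)/2 for N pooled observations, here 5 × 6 / 2 = 15.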
The formula used to calculate the Mann-Whitney U statistic is:
U1 = N1 N2 + N1 (N1 + 1) / 2 - R1
where N1 and N2 are the numbers of observations in the first and second groups, respectively, and R1 is the sum of the ranks in the first group. The corresponding value for the second group is U2 = N1 N2 - U1, and the test statistic U is the smaller of U1 and U2.
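As a worked example, the sketch below computes the statistic in its standard form, U1 = N1 N2 + N1 (N1 + 1) / 2 - R1, and cross-checks it against a direct count of pairs. The function names and the data are hypothetical, and the rank sum of the first group is simply written out by hand for this small example.

```python
def u_from_rank_sum(n1, n2, r1):
    """Return (U1, U2) from the group sizes and the first group's rank sum."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1  # the two values always sum to n1 * n2
    return u1, u2


def u_by_counting(first, second):
    """Count pairs (a, b) with b > a; a tied pair counts as one half."""
    total = 0.0
    for a in first:
        for b in second:
            if b > a:
                total += 1.0
            elif b == a:
                total += 0.5
    return total


# Hypothetical data: pooled ranks are 1, 2.5, 4 (first) and 2.5, 5 (second).
first, second = [3, 5, 8], [5, 9]
r1 = 1 + 2.5 + 4  # rank sum of the first group

u1, u2 = u_from_rank_sum(len(first), len(second), r1)
u = min(u1, u2)  # the value looked up in a critical-value table

print(u1, u2, u)                     # 4.5 1.5 1.5
print(u_by_counting(first, second))  # 4.5
```

The agreement between the two functions shows what the statistic measures: with this form of the formula, U1 is the number of pairs in which the second group's value exceeds the first group's value, with ties counted as one half.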
The Mann-Whitney U statistic is then compared to a critical value from a published table, selected for the two sample sizes and the chosen significance level, to determine the significance of the result. The null hypothesis for the Mann-Whitney U Test is that the two data sets are drawn from the same population. The statistic used for the comparison is the smaller of the two U values that can be computed, one for each group. If this statistic is less than or equal to the critical value, the null hypothesis is rejected and the data are taken as evidence that the two data sets come from different populations. If it is greater than the critical value, the null hypothesis is not rejected. This means that the data do not provide sufficient evidence of a difference, not that the two populations have been shown to be the same.
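For small samples, the decision can also be illustrated without a printed table by enumerating every way the pooled ranks could be divided between the two groups, which is the same reasoning that underlies the tabulated critical values. The sketch below is an assumption-laden illustration, not the table lookup described above: it computes an exact two-sided p-value by brute force, the function names and the data are hypothetical, and the significance level of 0.05 is chosen only for the example. Enumeration becomes impractical as the groups grow.

```python
from itertools import combinations


def midranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    return [
        sum(x < v for x in values) + (sum(x == v for x in values) + 1) / 2
        for v in values
    ]


def exact_two_sided_p(first, second):
    """Return (U, p): the smaller U value and its exact two-sided p-value."""
    n1, n2 = len(first), len(second)
    ranks = midranks(list(first) + list(second))

    def smaller_u(r1):
        u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
        return min(u1, n1 * n2 - u1)

    observed = smaller_u(sum(ranks[:n1]))

    # Under the null hypothesis every choice of n1 ranks for the first
    # group is equally likely, so count the choices at least as extreme.
    extreme = total = 0
    for chosen in combinations(range(n1 + n2), n1):
        total += 1
        if smaller_u(sum(ranks[i] for i in chosen)) <= observed:
            extreme += 1
    return observed, extreme / total


# Hypothetical data with complete separation between the groups.
group_a = [1.1, 2.0, 2.7, 3.4]
group_b = [4.2, 5.0, 5.9, 6.3]

u, p = exact_two_sided_p(group_a, group_b)
alpha = 0.05  # assumed significance level for this example
reject_null = p <= alpha

print(u, reject_null)  # 0.0 True
```

Here only 2 of the 70 possible splits give a U of 0, so the p-value is 2/70 (about 0.029) and the null hypothesis is rejected at the 0.05 level.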
Conclusion
In conclusion, the Mann-Whitney U Test is a non-parametric hypothesis test used to compare two independent data sets. It is suitable in many circumstances where the assumptions of parametric tests cannot be met, such as when the underlying distribution is not normal, and it works with groups of different sizes. The test combines the two data sets, ranks every observation in the pooled sample, and uses the rank sums to calculate the U statistic, which is compared to a critical value from a published table. If U is less than or equal to the critical value, the null hypothesis that both data sets come from the same population is rejected; otherwise it is not rejected.