Structure of the Chi-Squared Test

The chi-square test is used to determine if a distribution of observed frequencies differs from the theoretical expected frequencies.

Learning Objective

Apply the chi-square test to approximate the probability of an event, distinguishing the different sample conditions in which it can be applied

Key Points

A chi-square test statistic is a measure of how different the data we observe are from what we would expect to observe if the variables were truly independent.
The higher the test statistic, the stronger the evidence that the data we observe did not come from independent variables.
The chi-square distribution shows us how likely it is that a test statistic at least this large would arise by chance alone.
If the difference between what we observe and what we expect from independent variables is large (and not just by chance), then we reject the null hypothesis that the two variables are independent and conclude that there is a relationship between the variables.
Two types of chi-square tests include the test for goodness of fit and the test for independence.
Certain assumptions must be made when conducting a goodness of fit test, including a simple random sample, a large enough sample size, independence, and adequate expected cell count.

Terms

degrees of freedom
the number of values in a frequency distribution that are free to vary once the totals are fixed
Fisher's exact test
a statistical significance test used in the analysis of contingency tables, in which the significance of the deviation from a null hypothesis can be calculated exactly, rather than relying on an approximation that becomes exact in the limit as the sample size grows to infinity

Full Text

The chi-square ($\chi^2$) test is a nonparametric statistical technique used to determine if a distribution of observed frequencies differs from the theoretical expected frequencies. Chi-square statistics use nominal (categorical) or ordinal level data. Thus, instead of using means and variances, this test uses frequencies.

Generally, the chi-squared statistic summarizes the discrepancies between the expected number of times each outcome occurs (assuming that the model is true) and the observed number of times each outcome occurs, by summing the squares of the discrepancies, normalized by the expected numbers, over all the categories.
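
The verbal definition above can be written compactly. The notation is an assumption made here for illustration: $O_i$ is the observed count and $E_i$ the expected count in category $i$, with $k$ categories in total.

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Each term is a squared discrepancy divided by the expected count, so a discrepancy of a given size contributes more to the statistic when few observations were expected in that category.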

Data used in a chi-square analysis has to satisfy the following conditions:

Simple random sample – The sample data is a random sampling from a fixed distribution or population where each member of the population has an equal probability of selection. Variants of the test have been developed for complex samples, such as where the data is weighted.
Sample size (whole table) – A sufficiently large sample is assumed. If a chi-squared test is conducted on too small a sample, the resulting inference may be inaccurate; in particular, the researcher might end up committing a Type II error.
Expected cell count – Expected cell counts must be adequate. Some authors require 5 or more, and others require 10 or more. A common rule is 5 or more in all cells of a 2-by-2 table, and 5 or more in 80% of cells in larger tables, but no cells with zero expected count.
Independence – The observations are always assumed to be independent of each other. This means chi-squared cannot be used to test correlated data (like matched pairs or panel data).
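
The expected cell count condition can be checked mechanically before running a test. The following is a minimal sketch in Python, assuming a contingency table stored as a list of rows; the function names `expected_counts` and `counts_are_adequate` are hypothetical, and the thresholds encode only the common rule quoted above (5 or more in every cell of a 2-by-2 table, 5 or more in 80% of cells otherwise, no zero cells).

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def counts_are_adequate(expected, minimum=5, share=0.8):
    """Apply the common rule of thumb for expected cell counts."""
    cells = [e for row in expected for e in row]
    if any(e == 0 for e in cells):
        return False  # no cell may have a zero expected count
    is_2x2 = len(expected) == 2 and all(len(row) == 2 for row in expected)
    if is_2x2:
        return all(e >= minimum for e in cells)
    # Larger tables: at least `share` of the cells must reach the minimum.
    large_enough = sum(1 for e in cells if e >= minimum)
    return large_enough >= share * len(cells)


# Made-up 2-by-2 table, used only to illustrate the check.
table = [[20, 30], [30, 20]]
expected = expected_counts(table)       # every expected count is 25.0
print(counts_are_adequate(expected))    # True
```

Other authors use stricter thresholds, so `minimum` and `share` are left as parameters rather than fixed values.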

There are two types of chi-square test:

The chi-square test for goodness of fit, which compares the expected and observed values to determine how well an experimenter's predictions fit the data.
The chi-square test for independence, which compares two sets of categories to determine whether the two groups are distributed differently among the categories.
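
As an illustration of the goodness of fit case, the sketch below computes the statistic for a hypothetical experiment: a die rolled 60 times, with the prediction that every face is equally likely. The data are made up for this example and the function name `goodness_of_fit_statistic` is an assumption, not part of any library.

```python
def goodness_of_fit_statistic(observed, expected):
    """Sum of squared discrepancies, each divided by its expected count."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for faces 1..6 over 60 rolls.
observed = [5, 8, 9, 8, 10, 20]
# A fair die predicts the same count for every face: 60 / 6 = 10.
expected = [sum(observed) / len(observed)] * len(observed)

statistic = goodness_of_fit_statistic(observed, expected)
print(round(statistic, 2))  # 13.4
```

The test for independence uses the same sum; the only difference is that the expected counts are derived from the row and column totals of a contingency table rather than from a stated prediction.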

How Do We Perform a Chi-Square Test?

First, we calculate a chi-square test statistic. The higher the test statistic, the stronger the evidence that the data we observe did not come from independent variables.

Second, we use the chi-square distribution. We may observe data that give us a high test statistic just by chance, but the chi-square distribution shows us how likely that is. The chi-square distribution takes slightly different shapes depending on its degrees of freedom, which are determined by how many categories our variables have. Interestingly, when the degrees of freedom get very large, the shape begins to look like the bell curve we know and love. This is a property shared by the $t$-distribution.
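
To make this step concrete, here is a minimal sketch of how the degrees of freedom and the tail probability could be computed using only the Python standard library. It assumes the usual conventions: a goodness of fit test on $k$ categories has $k - 1$ degrees of freedom, and a test for independence on an $r$-by-$c$ table has $(r - 1)(c - 1)$. The function names are hypothetical. The tail probability is built up from the closed forms for 1 and 2 degrees of freedom with a standard recurrence for the incomplete gamma function, so it is limited to whole-number degrees of freedom.

```python
import math


def df_goodness_of_fit(n_categories):
    return n_categories - 1


def df_independence(n_rows, n_cols):
    return (n_rows - 1) * (n_cols - 1)


def chi2_sf(x, df):
    """P(X >= x) for a chi-square variable X with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(half))   # closed form for df = 1
    else:
        a, q = 1.0, math.exp(-half)              # closed form for df = 2
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


# A statistic of 3.841 with 1 degree of freedom sits at the familiar 5% cutoff.
print(round(chi2_sf(3.841, df_independence(2, 2)), 3))  # 0.05
```

The same statistic is less surprising when there are more degrees of freedom, which is why the number of categories has to be taken into account before judging whether a value is large.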

If the difference between what we observe and what we expect from independent variables is large (that is, the chi-square distribution tells us it is unlikely to be that large just by chance) then we reject the null hypothesis that the two variables are independent. Instead, we favor the alternative that there is a relationship between the variables. Therefore, chi-square can help us discover that there is a relationship but cannot look too deeply into what that relationship is.
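
The decision rule can be sketched end to end for a test of independence. The example below assumes a significance level of 0.05 and compares the statistic with the standard tabulated critical values for 1 to 4 degrees of freedom; the table of counts is made up and the name `independence_test` is hypothetical.

```python
# Standard chi-square critical values at the 0.05 level, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def independence_test(table):
    """Return (statistic, degrees of freedom, reject_null) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, statistic > CRITICAL_05[df]


# Hypothetical 2-by-2 table: rows are two groups, columns are two outcomes.
statistic, df, reject = independence_test([[20, 30], [30, 20]])
print(statistic, df, reject)  # 4.0 1 True
```

Rejecting the null hypothesis here says only that the two groups are distributed differently across the outcomes; the statistic does not describe the direction or form of the relationship.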

Problems

The approximation to the chi-squared distribution breaks down if expected frequencies are too low. It will normally be acceptable so long as no more than 20% of the events have expected frequencies below 5. Where there is only 1 degree of freedom, the approximation is not reliable if expected frequencies are below 10. In this case, a better approximation can be obtained by reducing the absolute value of each difference between observed and expected frequencies by 0.5 before squaring. This is called Yates's correction for continuity.
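
The effect of Yates's correction can be seen in a small sketch for a 2-by-2 table. The counts are made up and the function name `chi_square_2x2` is hypothetical. One assumption beyond the description above: the reduced difference is not allowed to drop below zero, so that a discrepancy smaller than 0.5 is not turned into a larger one.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a 2-by-2 table, optionally with Yates's correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            diff = abs(observed - expected)
            if yates:
                # Reduce the absolute difference by 0.5 before squaring
                # (floored at zero, an assumption of this sketch).
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic


table = [[20, 30], [30, 20]]  # hypothetical counts
print(round(chi_square_2x2(table), 2))              # 4.0
print(round(chi_square_2x2(table, yates=True), 2))  # 3.24
```

With 1 degree of freedom the 0.05 critical value is 3.841, so in this example the uncorrected statistic would lead to rejecting the null hypothesis while the corrected one would not. The correction always moves the statistic toward zero, which makes the test more conservative.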

In cases where the expected value, $E$, is found to be small (indicating a small underlying population probability, and/or a small number of observations), the normal approximation of the multinomial distribution can fail. In such cases it is found to be more appropriate to use the $G$-test, a likelihood ratio-based test statistic. Where the total sample size is small, it is necessary to use an appropriate exact test, typically either the binomial test or (for contingency tables) Fisher's exact test. Note, however, that Fisher's exact test assumes fixed and known totals in all margins, an assumption which is typically false.
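
Both alternatives can be sketched briefly with the Python standard library. The sketch assumes the usual definition of the $G$ statistic, twice the sum of each observed count times the natural logarithm of the ratio of observed to expected, and a two-sided Fisher's exact test for a 2-by-2 table that adds up the probabilities of all tables with the same margins that are no more likely than the one observed. Other definitions of a two-sided exact p-value exist; the function names and data are hypothetical.

```python
import math


def g_statistic(observed, expected):
    """Likelihood-ratio statistic; cells with an observed count of zero contribute 0."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def fisher_exact_2x2(table):
    """Two-sided exact p-value for a 2-by-2 table with all margins held fixed."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so that ties with the observed table are included.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


print(round(g_statistic([5, 8, 9, 8, 10, 20], [10] * 6), 2))  # 11.76
print(round(fisher_exact_2x2([[3, 1], [1, 3]]), 4))           # 0.4857
```

The exact test enumerates every table compatible with the margins, which is practical only when the counts are small; that is also the situation in which the chi-square approximation is least reliable.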
