Simple Interactive Statistical Analysis

Two by two tables

Input.

Fill in the values in the table. The values are supposed to be integers: whole positive numbers without decimals.

Explanation.

Pearson Chi-square
Likelihood Ratio Chi-Square
Yates Chi-square
Mantel-Haenszel Chi-square
Risk Ratio
Efficacy (VE)
Odds Ratio
Log Odds Ratio
Yule's Q
Yule's Y
Phi-square
Pearson correlation
Kappa
McNemar Test
Fisher Exact

Two by two tables provides you with various statistics and measures of association. On this page we will try to explain the measures concerned; however, many of these measures are difficult to understand. Before applying these statistics, acquaint yourself very well with your data and the meaning of your variables, and build up a good understanding of what "relationship" and "independence" mean in the case of your data.

All four Chi-squares presented give probability values for the relationship between two dichotomous variables. They calculate the difference between the data observed and the data expected, given the marginals and the assumptions of the model of independence. The four Chi-squares give only an estimate of the true Chi-square and associated probability value. This estimate might not be very good if the marginals are very uneven or if one of the cells has a small value (roughly less than five). In that case the Fisher Exact is a good alternative to the Chi-square. However, with a large number of cases the Chi-square is preferred, as the Fisher is difficult to calculate.

Learn more about the Chi-square-test from Statistics at Square One.

Pearson's Goodness of Fit Chi-square (GFX) is most often used in research. Pearson's Chi-square is mathematically related to the classical Pearson's Correlation coefficient and to Analysis of Variance.
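
As a rough illustration of the observed-versus-expected comparison, the sketch below computes the Pearson Chi-square for a single two by two table. The cell labels a, b, c, d (top left, top right, bottom left, bottom right) and the example counts are assumptions made for this illustration; this is not the SISA code itself.

```python
def pearson_chi_square(a, b, c, d):
    """Pearson Chi-square for the 2x2 table [[a, b], [c, d]].

    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    rows = (a + b, c + d)          # row marginals
    cols = (a + c, b + d)          # column marginals
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell under independence
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical counts: 10 of 100 versus 10 of 80 with complications
chi2 = pearson_chi_square(10, 90, 10, 70)
print(round(chi2, 5))  # about 0.28125
```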

Likelihood Ratio Chi-square (LRX) was developed more recently than the Pearson Chi-square and is the second most frequently used Chi-square. It is directly related to log-linear analysis and logistic regression. The LRX has the important property that an LRX with more than one degree of freedom can be partitioned over a number of smaller tables, each with its own (smaller) LRX and (lower number of) degrees of freedom. The sum of the partial LRXs and of the associated partial degrees of freedom, as found in the smaller tables, equals the original LRX and the original number of degrees of freedom.
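
A minimal sketch of the Likelihood Ratio Chi-square for one two by two table, using the usual form 2 * sum(O * ln(O / E)). The cell labels a, b, c, d and the example counts are assumptions for illustration, and empty cells are taken to contribute zero.

```python
import math


def likelihood_ratio_chi_square(a, b, c, d):
    """Likelihood Ratio Chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2 * total


# Hypothetical counts; the result is close to the Pearson Chi-square
lrx = likelihood_ratio_chi_square(10, 90, 10, 70)
print(round(lrx, 4))
```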

Yates' Chi-square is equivalent to Pearson's Chi-square with continuity correction.
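
The continuity correction can be sketched with the shortcut formula for two by two tables, in which the absolute difference |ad - bc| is reduced by half the number of cases before squaring. The cell labels and counts are assumptions for illustration; clamping the corrected difference at zero is a common convention, not necessarily what SISA does.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected Chi-square for the 2x2 table [[a, b], [c, d]].

    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    r1, r2 = a + b, c + d
    c1, c2 = a + c, b + d
    # Reduce |ad - bc| by n/2, but never below zero (assumed convention)
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / (r1 * r2 * c1 * c2)


# Hypothetical counts; the corrected value is smaller than the
# uncorrected Pearson Chi-square for the same table
yates = yates_chi_square(10, 90, 10, 70)
print(round(yates, 6))  # about 0.085078
```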

Mantel-Haenszel Chi-square is thought to be closer to the "true" Chi-square if small numbers of cases are involved. It is not often used. If you have doubts about your results, use Fisher Exact instead.
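
For a single two by two table, one widely used definition of the Mantel-Haenszel Chi-square multiplies by N - 1 instead of N, which makes it equal to the Pearson Chi-square times (N - 1) / N. The sketch below assumes that definition; whether SISA uses exactly this form is not confirmed by this page. Cell labels and counts are hypothetical.

```python
def mantel_haenszel_chi_square(a, b, c, d):
    """Mantel-Haenszel Chi-square for one 2x2 table [[a, b], [c, d]].

    Assumed form: (N - 1) * (ad - bc)^2 / (r1 * r2 * c1 * c2).
    """
    n = a + b + c + d
    r1, r2 = a + b, c + d
    c1, c2 = a + c, b + d
    return (n - 1) * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)


# Hypothetical counts; slightly smaller than the Pearson value
mh = mantel_haenszel_chi_square(10, 90, 10, 70)
print(round(mh, 7))  # about 0.2796875
```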

Risk-ratio. The risk ratio takes on values between zero ("0") and infinity. One ("1") is the neutral value and means that there is no difference between the groups compared; a value close to zero or infinity means a large difference between the two groups on the variable concerned. A risk ratio larger than one means that group one has a larger proportion than group two; if the opposite is true the risk ratio will be smaller than one. If you swap the two proportions, the risk ratio will take on its inverse (1/RR).

The risk ratio gives you the percentage difference in classification between group one and group two. For example, the proportion of people suffering from complications after traditional surgery equals 0.10 (10%), while the proportion suffering from complications after alternative surgery equals 0.125 (12.5%). The risk ratio equals 0.8 (0.1/0.125); 20% ((1-0.8)*100) fewer patients treated by the traditional method suffer from complications. Another example: 8% of freezers produced without quality control have paint scratches. This percentage is reduced to 5% if quality control is introduced. The risk ratio equals 1.6 (8/5); 60% more freezers are damaged if there is no quality control.
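
The two examples above can be reproduced with a few lines. The function names are chosen for this illustration only.

```python
def risk_ratio(p1, p2):
    """Ratio of the proportion in group one to the proportion in group two."""
    return p1 / p2


def percent_difference(rr):
    """Percentage more (positive) or fewer (negative) cases in group one."""
    return (rr - 1) * 100


# Traditional (10%) versus alternative (12.5%) surgery
rr_surgery = risk_ratio(0.10, 0.125)   # about 0.8
# Freezers without (8%) versus with (5%) quality control
rr_freezers = risk_ratio(0.08, 0.05)   # about 1.6

print(round(rr_surgery, 3), round(percent_difference(rr_surgery), 1))
print(round(rr_freezers, 3), round(percent_difference(rr_freezers), 1))
```

Swapping the two proportions gives the inverse, for example 1/0.8 = 1.25 for the surgery example.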

Efficacy (VE). The efficacy of a medicine or vaccine (VE) is an outcome measure from a randomized controlled trial. The measure shows the proportional reduction in the negative outcome in the treated experimental group as opposed to the untreated control group. Note that the efficacy can only be generalized to the population from which the study population is obtained. Ideally the study population is a random sample from this population. If diseases change, the efficacy of a treatment will also change over time, often not for the better.

The efficacy is the upper limit of the effectiveness, the real-world effect of a treatment or vaccine. For example, if a treatment has many side effects, uptake might be low and the effectiveness of the treatment in solving public health problems will then also be low. The same applies if there are quality issues with the vaccine, logistical difficulties, or many people in the population who fall in a disease risk group.

In the t-test procedure the outcome in the experimental group goes in the top (E) box; in the two by two table procedure it goes in the top left (1,1) cell. Negative efficacy means that you are better off in the control group.

The Pfizer/BioNTech Covid-19 vaccine (Polack FP et al., 2020) shows 8 positive cases in 18,198 people observed (0.044%) in the vaccinated group, and 162 in 18,325 (0.884%) in the unvaccinated group (table 2). Efficacy according to Pfizer is (0.884-0.044)/0.884 = 95.0% (95% CI: 90.3-97.6); according to the SISA t-test procedure it is 95.0% (95% CI: 89.9-97.6). So, the probability of becoming ill from Covid-19 is about 95% lower in the vaccinated than in the unvaccinated group.
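
The point estimate in this example can be checked as one minus the risk ratio. This is a sketch of the point estimate only; it does not reproduce the confidence intervals, and the function name is chosen for this illustration.

```python
def efficacy(cases_treated, n_treated, cases_control, n_control):
    """Proportional reduction in risk in the treated versus the control group."""
    risk_treated = cases_treated / n_treated
    risk_control = cases_control / n_control
    return 1 - risk_treated / risk_control


# Counts quoted above: 8 of 18,198 vaccinated, 162 of 18,325 unvaccinated
ve = efficacy(8, 18198, 162, 18325)
print(f"{ve:.1%}")  # about 95.0%
```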

Odds-ratio. The odds ratio takes values between zero ("0") and infinity. One ("1") is the neutral value and means that there is no difference between the groups compared; close to zero or infinity means a large difference. An odds ratio larger than one means that group one has a larger proportion than group two; if the opposite is true the odds ratio will be smaller than one. If you swap the two proportions, the odds ratio will take on its inverse (1/OR).

The odds ratio gives the ratio of the odds of suffering some fate. The odds themselves are also a ratio. To explain this we will take the example of traditional versus alternative surgery. If 10% of operations using traditional surgery result in complications, the odds of having complications equal 0.11 (0.1/0.9; complications are 0.11 times as likely as no complications). 12.5% of the operations using the alternative method result in complications, giving odds of 0.143 (0.125/0.875). The odds ratio equals 0.778 (0.11/0.143): the odds of complications with traditional surgery are 0.778 times the odds with alternative surgery. The inverse of the odds ratio equals 1.286: the odds of complications with alternative surgery are 1.286 times the odds with traditional surgery. This takes some getting used to, we admit, but it has its advantages.

The odds ratio can be compared with the Risk Ratio. The risk ratio is easier to interpret than the odds ratio. However, in practice the odds ratio is used more often, because it is more closely related to frequently used statistical techniques such as logistic regression. Also, the odds ratio has the attractive property that, however you turn the table, it will always take on the same value or the inverse (1/OR) of that value.
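
The surgery example can be worked through in code, both from the proportions and from a table of counts. The counts (10 of 100 and 10 of 80) are hypothetical values chosen to match the 10% and 12.5% in the text; the cell labels a, b, c, d are also an assumption.

```python
def odds(p):
    """Odds of an outcome that has probability p (p must be below 1)."""
    return p / (1 - p)


def odds_ratio(p1, p2):
    """Ratio of the odds in group one to the odds in group two."""
    return odds(p1) / odds(p2)


def odds_ratio_from_table(a, b, c, d):
    """Cross-product ratio ad/bc for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


or_surgery = odds_ratio(0.10, 0.125)
print(round(odds(0.10), 3), round(odds(0.125), 3))   # about 0.111 and 0.143
print(round(or_surgery, 3), round(1 / or_surgery, 3))  # about 0.778 and 1.286

# Same result from hypothetical counts; transposing the table changes nothing
or_counts = odds_ratio_from_table(10, 90, 10, 70)
or_transposed = odds_ratio_from_table(10, 10, 90, 70)
```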

Log odds ratio, the natural logarithm of the odds ratio, does not have an easy to understand meaning. However, the log odds ratio is symmetric, running from minus infinity to plus infinity, with zero being the neutral value. This makes it easier to compare negative with positive associations.
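
The symmetry is easy to see numerically: an odds ratio and its inverse give log values of the same size with opposite sign. The odds ratio of 7/9 (about 0.778) used here is an illustrative value.

```python
import math

odds_ratio = 7 / 9                      # about 0.778, illustrative value
log_or = math.log(odds_ratio)           # negative association
log_inverse = math.log(1 / odds_ratio)  # same strength, opposite direction

print(round(log_or, 4), round(log_inverse, 4))  # about -0.2513 and 0.2513
print(math.log(1.0))  # the neutral odds ratio of one maps to zero
```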

Yule's Q. Yule's Q is based on the odds ratio and is a symmetric measure taking on values between -1 and +1. Minus one or plus one implies perfect negative or positive association, 0 (zero) no association. In two by two tables Yule's Q is equal to Goodman and Kruskal's Gamma. The interpretation of Q as Gamma is easiest to understand. Each observation is compared with each other observation; each such comparison is called a pair. If one observation is higher in value than the other on both the horizontal and the vertical variable, the pair is called concordant; if it is higher on one variable and lower on the other, the pair is discordant. Pairs tied on either variable are left out. Gamma is the number of concordant pairs minus the number of discordant pairs, divided by the total number of concordant and discordant pairs. A high Gamma means that there is a high proportion of concordant pairs: high values on the vertical variable tend to go with high values on the horizontal variable.
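
In a two by two table with cells a, b, c, d (top left, top right, bottom left, bottom right; labels assumed for this illustration), the cross-products ad and bc count the pairs falling on the two diagonals, so Q can be sketched as below. The counts are hypothetical.

```python
def yules_q(a, b, c, d):
    """Yule's Q for the 2x2 table [[a, b], [c, d]]: (ad - bc) / (ad + bc)."""
    ad = a * d  # pairs on the main diagonal
    bc = b * c  # pairs on the other diagonal
    return (ad - bc) / (ad + bc)


q = yules_q(10, 90, 10, 70)
print(q)  # -0.125

# Equivalent form in terms of the odds ratio: (OR - 1) / (OR + 1)
odds_ratio = (10 * 70) / (90 * 10)
q_from_or = (odds_ratio - 1) / (odds_ratio + 1)
```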

Yule's Y is based on the odds ratio and is a symmetric measure taking on values between -1 and +1. Minus one or plus one implies perfect negative or positive association, 0 (zero) no association. The measure tends to estimate associations more conservatively than Yule's Q. The measure has little substantive or theoretical meaning.
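
Yule's Y uses the square roots of the same cross-products that enter Yule's Q, which is why it comes out closer to zero. A minimal sketch, with assumed cell labels a, b, c, d and hypothetical counts:

```python
import math


def yules_y(a, b, c, d):
    """Yule's Y for the 2x2 table [[a, b], [c, d]]."""
    root_ad = math.sqrt(a * d)
    root_bc = math.sqrt(b * c)
    return (root_ad - root_bc) / (root_ad + root_bc)


y = yules_y(10, 90, 10, 70)
# Yule's Q for the same table, for comparison
q = (10 * 70 - 90 * 10) / (10 * 70 + 90 * 10)
print(round(y, 4), q)  # Y is closer to zero than Q
```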

Phi-square is simply the Pearson Chi-square divided by the number of cases. It takes on the value 0 (zero) if there is no association between the variables and the value one if there is perfect association. The Phi-square is equal to the square of the Pearson correlation coefficient. This relationship with the correlation coefficient means that the Phi-square gives you the proportion of variance in one variable explained by the variance in the other variable. (This is considered very meaningful by many. However, Reynolds correctly remarked that although explained variance might have mathematical meaning, it does not necessarily mean anything substantive or theoretical.)
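
Dividing the shortcut formula for the Pearson Chi-square by the number of cases gives Phi-square directly. The sketch assumes cell labels a, b, c, d and uses hypothetical counts.

```python
def phi_square(a, b, c, d):
    """Phi-square for the 2x2 table [[a, b], [c, d]].

    Equals the Pearson Chi-square divided by the number of cases.
    Assumes no row or column total is zero.
    """
    r1, r2 = a + b, c + d
    c1, c2 = a + c, b + d
    return (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)


phi2 = phi_square(10, 90, 10, 70)
print(phi2)                        # 0.0015625, that is 0.28125 / 180
print(phi_square(10, 0, 0, 10))    # perfect association gives 1.0
print(phi_square(10, 10, 10, 10))  # no association gives 0.0
```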

Pearson correlation coefficient. Square it to obtain the proportion of explained variance (the Phi-square).
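
For two dichotomous variables the Pearson correlation reduces to a simple expression in the cell counts. The sketch assumes cell labels a, b, c, d and hypothetical counts; the sign depends on how the rows and columns are ordered.

```python
import math


def pearson_correlation_2x2(a, b, c, d):
    """Pearson correlation for the 2x2 table [[a, b], [c, d]].

    Assumes no row or column total is zero.
    """
    r1, r2 = a + b, c + d
    c1, c2 = a + c, b + d
    return (a * d - b * c) / math.sqrt(r1 * r2 * c1 * c2)


r = pearson_correlation_2x2(10, 90, 10, 70)
print(round(r, 4))       # about -0.0395
print(round(r ** 2, 7))  # explained variance, about 0.0015625
```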

Kappa measure of agreement. Kappa takes on the value zero if there is no more agreement between two judges or tests than can be expected on the basis of chance. Kappa takes on the value 1 if there is perfect agreement: all observations are on the diagonal from the upper left to the bottom right, the diagonal of agreement. It is considered that Kappa values lower than 0.4 represent poor agreement, values between 0.4 and 0.75 fair to good agreement, and values higher than 0.75 excellent agreement. A negative Kappa means less agreement than expected by chance and indicates an application problem.
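
Kappa compares the observed proportion on the diagonal of agreement with the proportion expected there by chance. A minimal sketch, assuming cell labels a, b, c, d with a and d on the diagonal of agreement, and hypothetical counts for two judges:

```python
def kappa(a, b, c, d):
    """Kappa for the 2x2 table [[a, b], [c, d]], with a and d the agreements.

    Assumes chance agreement is below one (not all cases in one category).
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the row and column marginals
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: the judges agree on 85 of 100 cases
k = kappa(40, 10, 5, 45)
print(round(k, 3))  # about 0.7, "fair to good" on the scale above
```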

McNemar Change Test. This test studies the change in a group of respondents measured twice on a dichotomous variable. It is customary in that case to tabulate the data in a two by two table. If as many respondents changed from A to B as changed from B to A, then the numbers of respondents in the bottom left and top right cells, the diagonal of changers, would be equal. If the numbers in the two cells are not equal, this indicates a certain direction in the change observed. The McNemar test indicates how likely it is that the observed direction in the change arose by chance. For a slightly more thorough discussion of this test consult the pairwise help page.
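
Only the two cells with changers enter the test. The sketch below uses the basic Chi-square form of the statistic, (b - c)^2 / (b + c), with b the top right and c the bottom left cell (labels assumed here). SISA may apply a continuity correction or an exact version instead; the counts are hypothetical.

```python
def mcnemar_chi_square(b, c):
    """Uncorrected McNemar Chi-square from the two cells with changers.

    b: changed from A to B, c: changed from B to A (or the reverse).
    Assumes at least one respondent changed.
    """
    return (b - c) ** 2 / (b + c)


print(mcnemar_chi_square(15, 5))    # 5.0: more change in one direction
print(mcnemar_chi_square(10, 10))   # 0.0: equal change both ways
```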

Fisher Exact. Gives various exact probabilities for this table. An extensive description of Fisher Exact analysis is provided on the Fisher help page.
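
As a rough sketch of what "exact" means here: with the marginals fixed, the probability of each possible table follows the hypergeometric distribution. The code computes the probability of the observed table and one common two-sided probability (the sum over all tables that are no more likely than the observed one). Which probabilities SISA reports is described on the Fisher help page, not here; the cell labels and counts are assumptions.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    return comb(r1, a) * comb(r2, c) / comb(n, c1)


def fisher_two_sided(a, b, c, d):
    """Sum of probabilities of all tables no more likely than the observed."""
    r1, r2, c1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    total = 0.0
    # x runs over every possible top left cell given the fixed marginals
    for x in range(max(0, c1 - r2), min(r1, c1) + 1):
        p = table_probability(x, r1 - x, c1 - x, r2 - (c1 - x))
        if p <= p_observed * (1 + 1e-7):  # small tolerance for rounding
            total += p
    return total


p_point = table_probability(3, 1, 1, 3)
p_two_sided = fisher_two_sided(3, 1, 1, 3)
print(round(p_point, 4), round(p_two_sided, 4))  # about 0.2286 and 0.4857
```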

Technical Discussion.

The algorithm to calculate the significance of the Chi-square comes from Poole et al. The algorithm is also mentioned in the "Epi-Info" manual (1994).
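
The Poole et al. algorithm is not reproduced here. As a stand-in for illustration only, the probability value for a Chi-square with one degree of freedom, as in a two by two table, can be obtained from the complementary error function, a standard identity. The function name is chosen for this sketch.

```python
import math


def chi_square_p_value_1df(chi2):
    """Upper-tail probability of a Chi-square value with one degree of freedom.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)); not the Poole et al.
    algorithm mentioned in the text.
    """
    return math.erfc(math.sqrt(chi2 / 2))


print(round(chi_square_p_value_1df(3.841459), 4))  # about 0.05
print(round(chi_square_p_value_1df(0.28125), 3))   # far from significant
```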

All software and text copyright by SISA