A discussion of the intra correlation coefficient

SISA Research Paper

About the Intra Correlation Spreadsheet

The intra-correlation coefficient is used for a number of purposes, two of which are discussed here. We use the examples of Fleiss to discuss the methods for the single cluster. Fleiss’s method is far better known; however, the method by Bennett et al. seems to produce better results in practice.

First, the intra-correlation is used as an alternative to the kappa measure of agreement when multiple raters or judges consider the presence or absence of a trait in a larger number of items (objects) or individuals (subjects). A high kappa means that there is a high level of agreement between the judges with regard to the presence of the trait. The method proposed by Fleiss (1982) concerns the example of a single cluster of judges. The data could have been produced by a group of doctors judging 25 patients (i=25) on the absence or presence of a disease. In the spreadsheet the doctors make 81 judgments, and in 46 of them they declare the patient diseased. The kappa equals 0.54577, which is a reasonable level of agreement. The expansion of Fleiss by Donner and Klar (1994) adds an additional dimension. You can imagine this example as a group of doctors judging two groups of patients, one having had treatment (table 1, 12 patients), the other not having had treatment (table 2, also 12 patients). In table 1 the doctors made 1341 judgments and found the 12 patients diseased on 58 occasions; in table 2 they made 1479 judgments and found the patients diseased on 91 occasions. The kappa equals 0.011, so there is very little agreement between the doctors. For a further discussion of kappa please see the SISA-tables help-file.
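As a rough illustration of the single-cluster case, the sketch below computes a kappa for a binary trait from per-patient counts. It uses the form of the estimator commonly attributed to Fleiss for a varying number of judgments per subject: one minus the pooled within-subject disagreement divided by the disagreement expected from the overall proportion. The per-patient counts are invented for illustration (the spreadsheet's own data are not reproduced here), the function name is hypothetical, and the spreadsheet may compute the coefficient differently.

```python
def fleiss_binary_kappa(positives, ratings):
    """Kappa for a binary trait with a varying number of ratings per subject.

    positives[i] = number of judgments declaring subject i diseased
    ratings[i]   = total number of judgments made on subject i
    Assumed form: 1 - sum(x_i * (m_i - x_i) / m_i) / (n * (m_bar - 1) * p * q)
    """
    n = len(ratings)
    total = sum(ratings)
    m_bar = total / n                    # mean number of judgments per subject
    p_bar = sum(positives) / total       # overall proportion judged diseased
    q_bar = 1.0 - p_bar
    if m_bar <= 1 or p_bar in (0.0, 1.0):
        raise ValueError("kappa is undefined for these data")
    # Observed within-subject disagreement, summed over subjects.
    within = sum(x * (m - x) / m for x, m in zip(positives, ratings))
    return 1.0 - within / (n * (m_bar - 1.0) * p_bar * q_bar)


# Hypothetical data: six patients, two to four judgments each.
hypothetical_positives = [3, 0, 2, 1, 4, 0]
hypothetical_ratings = [3, 3, 2, 3, 4, 2]
print(fleiss_binary_kappa(hypothetical_positives, hypothetical_ratings))
```

When every patient is judged unanimously the within-subject term is zero and the function returns 1; when the judges split evenly on every patient it becomes negative.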

Second, the intracorrelation is used as a measure of clustering in data collected using a multistage (multilevel) sampling procedure. This is the case if, for example, a number of schools are selected randomly and in each school a number of pupils is sampled for study. The analysis takes place at the pupil level. If there is intracorrelation, the standard errors of statistics at the pupil level are underestimated, tests become statistically significant too easily and confidence intervals are too narrow. The intracorrelation coefficient gives you an insight into the extent to which the observations at the individual level are influenced by clustering of observations in higher level groups, which is the case if, for example, some schools are particularly good, and others particularly bad, at the trait measured with the dependent variable. If the intracorrelation coefficient equals 0, the schools do not differ with regard to the dependent variable; all the pupils could have been sampled from a single school and the result of the analysis would have been the same. The number of pupils is then the correct sample size. If the intracorrelation coefficient equals 1, the schools are totally different and pupil performance is entirely determined by the school; the effective sample size is the number of schools, not the number of pupils. The example by Fleiss can be seen as the case of a single proportion, the proportion of pupils that passed a test. In the spreadsheet the data concern 25 schools with 81 pupils, of whom 46 tested positive. The example of Donner and Klar is the case of two proportions, treatment and no-treatment, being compared in a two by two table. In their article Donner and Klar discuss various adjustment procedures that take intracorrelation into account for common statistics used in two by two tables, such as the Chi-square and the t-test.
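The two limiting cases above can be made concrete with the usual design effect, 1 + (m - 1) × ICC, where m is the average number of pupils per school. The sketch below is a minimal illustration under that assumption. The design-effect formula and the function names are not taken from the spreadsheet, and using the mean cluster size is only an approximation when schools differ in size.

```python
import math


def design_effect(icc, mean_cluster_size):
    """Variance inflation due to clustering: 1 + (m - 1) * icc."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_individuals, n_clusters, icc):
    """Number of independent observations the clustered sample is worth."""
    mean_size = n_individuals / n_clusters
    return n_individuals / design_effect(icc, mean_size)


def adjusted_standard_error(naive_se, icc, mean_cluster_size):
    """Inflate a standard error that was computed as if pupils were independent."""
    return naive_se * math.sqrt(design_effect(icc, mean_cluster_size))


# Figures quoted in the text: 25 schools, 81 pupils.
print(effective_sample_size(81, 25, 0.0))  # ICC = 0: all 81 pupils count
print(effective_sample_size(81, 25, 1.0))  # ICC = 1: about 25, the number of schools
```

For intermediate values the effective sample size lies between the number of schools and the number of pupils, and the naive standard error is multiplied by the square root of the design effect.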
It should be noted that in this particular field the fast-developing resampling (bootstrapping) methodology is a powerful, but unfortunately still complex, alternative to the use of the intracorrelation coefficient in clustered data. You should consider using a computer package such as Wesvar instead of this spreadsheet. Resampling procedures are also increasingly available in standard packages, such as “R”. For a simple discussion of the performance of the spreadsheet compared with Wesvar go here. You will find the spreadsheet itself here.
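To give an idea of what the resampling alternative involves, the sketch below estimates the standard error of a proportion by resampling whole clusters (schools) with replacement, so that the dependence between pupils in the same school is preserved. This is a simplified illustration, not the procedure used by Wesvar or by any particular R package; the data and the function name are hypothetical.

```python
import random
import statistics


def cluster_bootstrap_se(clusters, n_boot=2000, seed=0):
    """Bootstrap standard error of a proportion, resampling whole clusters.

    clusters: list of (positives, size) pairs, one pair per school.
    """
    rng = random.Random(seed)            # fixed seed so runs are repeatable
    k = len(clusters)
    estimates = []
    for _ in range(n_boot):
        # Draw k schools with replacement; each drawn school keeps all its pupils.
        sample = [clusters[rng.randrange(k)] for _ in range(k)]
        positives = sum(x for x, _ in sample)
        total = sum(m for _, m in sample)
        estimates.append(positives / total)
    return statistics.stdev(estimates)


# Hypothetical data: (pupils testing positive, pupils sampled) for six schools.
hypothetical_schools = [(3, 3), (0, 3), (2, 4), (4, 4), (1, 3), (0, 2)]
print(cluster_bootstrap_se(hypothetical_schools))
```

Resampling individual pupils instead of schools would ignore the clustering and would tend to give a standard error that is too small when schools differ.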

Bennett S, Woods T, Liyanage WM, Smith DL. A simplified general method for cluster-sample surveys of health in developing countries. World Health Statistics Quarterly 1991;44(3):98-106.

Donner A, Klar N. Methods for comparing event rates in intervention studies when the unit of allocation is a cluster. American Journal of Epidemiology 1994;140(3):279-289.

Fleiss JL. Statistical methods for rates and proportions, 2nd edition. New York [etc.]: John Wiley 1982.
