Article about the Poisson test for use in accident analysis


SISA Research Paper

On the mathematical relationship between the number of events in which people are injured and the number of people injured.

Previously published as: Uitenbroek DG. The mathematical relationship between the number of events in which people are injured and the number of people injured. British Journal of Sports Medicine 1995 (29/2): 126-128.

A complicating factor in measuring the number of injuries and establishing injury rates is that two different ways of counting can be used. Counting the occasions on which people were injured (accidents or events) within a particular time frame and counting the people injured within a similar time frame are two methods which give very different results from a mathematical perspective. For example, if one takes throughput in an accident and emergency department as the basis for calculating injury rates, then one counts the number of individual events, i.e. the number of injuries. This is because a person who reported to the accident and emergency department again the next day, having sustained another injury, would be counted again as a new case. However, if patient files are used as the basis for calculating injury rates, one often does this by taking the proportion of files in which injuries are reported. In this case the number of people injured is counted. The total will be lower than for 'events', as some of those counted will have sustained an injury on more than one occasion. A similar problem arises in retrospective surveys in which the question asked is: "how many patients or respondents were injured in the previous month or year?" In this case, again, it is the number of injured that is counted, and one has to multiply the number of injured by the average number of events per injured individual to arrive at the total number of events for the population.

This problem has been recognized by researchers, and the number of occasions on which people were injured within the period under study is often established by using follow-up questions. A second method is to relate the number of injured and the number of events observed in a particular time frame using mathematical theory. This is relatively easy and can be done using classical and well tested statistical methods. However, one has to be able to support some assumptions to do these calculations reliably and validly. Two assumptions are considered to be particularly important: 1) that every member of a population has an equal chance of being involved in an 'event' and therefore of sustaining an injury, and 2) that the chance of sustaining an injury on another occasion is not influenced by having had an injury recently. On the basis of these assumptions, the chance of getting an injury can be viewed as a random process with replacement; the victim is put back in the population immediately after being 'drawn'. It will not always be possible to support these assumptions. For example, a serious or fatal injury would lower the victim's chance of sustaining an injury on another occasion quite dramatically. The psychological literature also mentions the injury prone or clumsy personality and 'risk-seekers'; these individuals have a higher than random chance of sustaining injuries (Kerr JH, 1991; Kelley MJ, 1990; Yaffé M, 1983).

However, even if the assumption that injury causing 'events' are a random process with replacement is not fully met, mathematical models can still be used to describe injuries and to see whether sustaining injuries is a random process or whether other factors also need to be taken into consideration.

The Poisson distribution

The first basic rule to consider is that the chance of an individual sustaining at least one injury ($p_{\geq 1}$) is one minus the chance of this individual sustaining no injuries ($p_0$), thus:

$p_{\geq 1} = 1 - p_0$

The proportions of individuals in a population who sustained injuries on exactly one, two, three or more occasions follow a multiplication rule: the proportion of people who sustained one more injury ($p_{n+1}$) is calculated by multiplying the proportion who sustained a particular number of injuries ($p_n$) by a factor. The proportions of people sustaining injuries on none, one, two, three or more occasions are thus related through this multiplication factor. The Poisson distribution is well suited to expressing the relation between the number of events and the number of injured:

$p_{n+1} = p_n \cdot \frac{-\ln(p_0)}{n+1}$ (n = number of events)
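
The multiplication rule above can be sketched in a few lines of Python. This is only an illustration, not the author's program: the function name `poisson_proportions` and the input value 0.76 are assumptions chosen for the example.

```python
import math

def poisson_proportions(p0, max_events):
    """Return [p_0, p_1, ..., p_max_events] using the recurrence
    p_{n+1} = p_n * (-ln(p0)) / (n + 1)."""
    if not 0.0 < p0 <= 1.0:
        raise ValueError("p0 must lie in (0, 1]")
    factor = -math.log(p0)  # the multiplication factor before dividing by n + 1
    proportions = [p0]
    for n in range(max_events):
        proportions.append(proportions[-1] * factor / (n + 1))
    return proportions

# Illustrative input: 76% of a population sustained no injury.
props = poisson_proportions(0.76, 3)
for n, p in enumerate(props):
    print(f"p_{n} = {p:.4f}")
```

Each value equals the usual Poisson probability with mean -ln(p0), so the recurrence is simply a convenient way of computing the distribution from the proportion of uninjured people alone.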

The basic rule that all proportions together sum to one of course applies:

$p_0 + p_1 + p_2 + \cdots = \sum_{n=0}^{\infty} p_n = 1$

On the basis of this rule the proportion of individuals who were injured on more than a particular number of injury causing occasions ($p_{>n}$) is one minus the chance of being involved in nil, one, two, up to the number in which one is interested:

$p_{>n} = 1 - p_0 - p_1 - p_2 - \cdots - p_n$
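
As a minimal sketch of this subtraction rule, the hypothetical helper `proportion_more_than` below accumulates the proportions up to n and subtracts the sum from one. The function name and the input value 0.76 are assumptions made for illustration.

```python
import math

def proportion_more_than(p0, n):
    """Return p_{>n} = 1 - p_0 - p_1 - ... - p_n under the Poisson model."""
    mu = -math.log(p0)
    term = p0          # current p_k, starting at p_0
    cumulative = p0    # running sum p_0 + ... + p_k
    for k in range(n):
        term *= mu / (k + 1)   # step from p_k to p_{k+1}
        cumulative += term
    return 1.0 - cumulative

# With n = 0 this reduces to the first rule, p_{>=1} = 1 - p_0.
print(proportion_more_than(0.76, 0))
print(proportion_more_than(0.76, 2))
```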

These basic rules of statistics can be applied to study the distribution of injuries in a population and to see how many people sustain an injury on one occasion, how many on two or more occasions, and how many sustain no injuries. These rules ensure that the results obtained in mathematical calculations are also 'reasonable' in a more intuitive sense. For example, under these rules there will always be a lucky few who sustain no injuries, while, on the other hand, it is possible that there will be some unlucky individuals who are injured on a very large number of occasions. Further, these rules ensure that in the calculations the number of victims of injuries does not exceed the size of the population. However, the total number of occasions on which people are injured can be larger than the size of the population. This is the case if on average the number of 'events' per head of the population is higher than one, which would inevitably happen if injury causing events are studied using a long time frame. The expected average number of events per head of the population ($\mu$) is related to the chance of not sustaining an injury ($p_0$) according to the following formula:

$\mu = -\ln(p_0)$ and $p_0 = \exp(-\mu)$
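
The relation between the average number of events and the proportion uninjured can be illustrated as follows. The sketch also computes the average number of events per injured individual, which under these assumptions is the mean divided by the proportion injured. The population size of 1000, the value 0.76 and all function names are hypothetical choices for the example.

```python
import math

def mean_events_per_head(p0):
    """mu = -ln(p0): expected events per head of the population."""
    return -math.log(p0)

def proportion_uninjured(mu):
    """p0 = exp(-mu): proportion sustaining no injury."""
    return math.exp(-mu)

def events_per_injured_person(p0):
    """Average events among those injured at least once: mu / (1 - p0)."""
    return mean_events_per_head(p0) / (1.0 - p0)

population = 1000   # hypothetical population size
p0 = 0.76           # illustrative proportion uninjured

mu = mean_events_per_head(p0)
total_events = mu * population          # events = mean per head * population
injured = (1.0 - p0) * population       # people injured at least once

print(f"mean events per head: {mu:.4f}")
print(f"total events: {total_events:.1f}")
print(f"events per injured person: {events_per_injured_person(p0):.4f}")
```

Multiplying the number of injured by the events per injured person gives back the total number of events, which is the conversion needed when a survey counts injured people instead of events.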

Of course, the total number of events in a population is the average number of events per head of the population times the size of the population. Lastly, two formulae are given to convert data which use the month as frame of reference into data which use the year as frame of reference. First, to obtain the average number of events per head of the population per year one takes the sum of the averages for the twelve months:

$\mu_{\text{year}} = \mu_{\text{jan}} + \mu_{\text{feb}} + \mu_{\text{mar}} + \cdots + \mu_{\text{dec}}$

However, to obtain the proportion of individuals who sustained no injury in a year, the product of the proportions found for each month needs to be considered, thus:

$p_{0,\text{year}} = p_{0,\text{jan}} \cdot p_{0,\text{feb}} \cdot p_{0,\text{mar}} \cdots p_{0,\text{dec}}$
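
The two conversion formulae are consistent with each other, as the following sketch shows. The twelve monthly proportions are invented values used only for illustration, and the function names are assumptions.

```python
import math

def yearly_mean(monthly_means):
    """mu_year is the sum of the twelve monthly means."""
    return sum(monthly_means)

def yearly_proportion_uninjured(monthly_p0):
    """p0_year is the product of the twelve monthly p0 values."""
    return math.prod(monthly_p0)

# Hypothetical monthly proportions of people who sustained no injury.
monthly_p0 = [0.99, 0.99, 0.98, 0.97, 0.96, 0.95,
              0.95, 0.96, 0.97, 0.98, 0.99, 0.99]
monthly_mu = [-math.log(p) for p in monthly_p0]

mu_year = yearly_mean(monthly_mu)
p0_year = yearly_proportion_uninjured(monthly_p0)

print(f"mu_year = {mu_year:.4f}")
print(f"p0_year = {p0_year:.4f}")
# Both routes agree: exp(-mu_year) equals the product of the monthly p0 values.
print(f"exp(-mu_year) = {math.exp(-mu_year):.4f}")
```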

A practical example

Table 1. Observed and predicted number of running injuries in a Dallas fitness club.

Number of Injuries	Observed	Predicted
0 (p0)	76.0%	76.0%
1 (p1)	16.8%	20.9%
2 (p2)	5.0%	2.9%
More than 2 (p>2)	2.2%	0.3%

Source: Blair SN, Kohl HW, Goodyear NN. Rates and Risks for Running and Exercise Injuries: Studies in Three Populations. Res Qua Exerc Spt 1987; 58: 221-8.
Blair et al (1987) report data on running injuries sustained by members of a health and fitness club in Dallas, Texas. The data concern 438 runners. Runners who had to stop running for at least 7 days in the last 12 months because of an injury were classified as injured. For each injured individual the number of occasions on which they were injured was also registered. Table 1 shows the distribution of running injuries reported by Blair et al and the distribution predicted on the basis of the Poisson distribution. The Chi-square for the difference between the observed and predicted scores, calculated for the injured only, equals 62.7, which gives a p-value smaller than .001 with two degrees of freedom. Thus, the observed distribution of injuries is non-random.

Compared with the Poisson prediction, fewer runners than expected sustained a single injury and more runners than expected sustained two or more injuries. This points to the injured runners with a higher number of injuries being injury prone; being injured more than once is not only a question of bad luck. Blair et al discuss stretching habits, age, time and place of running, weight relative to squared height (BMI), average speed of running and average distance run per week as factors which might be related to sustaining running injuries. Only BMI and average distance run were found to be related to an elevated risk of being injured. However, the non-randomness observed is very pronounced: seven times more runners with more than two injuries were observed than could be expected had sustaining this number of injuries been due to random chance. Therefore other explanations, such as the psychological factors mentioned above, deserve future study.

The chronic nature of many running injuries should also be considered. Among those who stopped running repeatedly because of injury, it might have been the same injury causing repeated problems. In that case, calculating an average number of newly sustained injuries on the basis of the data of Blair et al would overestimate the actual number of newly sustained injuries.
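
The predicted column of Table 1 can be reproduced from the observed proportion of uninjured runners alone, as in the sketch below. It also computes a Chi-square over the three injured categories from counts based on the 438 runners. Whether the published statistic was computed this way is an assumption, and because the table reports rounded percentages the value obtained here need not match the published 62.7 exactly.

```python
import math

N_RUNNERS = 438  # sample size reported in the text

# Observed percentages from Table 1, keyed by number of injuries.
observed_pct = {"0": 76.0, "1": 16.8, "2": 5.0, ">2": 2.2}

# Poisson prediction driven only by the observed proportion uninjured.
p0 = observed_pct["0"] / 100.0
mu = -math.log(p0)
p1 = p0 * mu
p2 = p1 * mu / 2.0
predicted_pct = {
    "0": 100.0 * p0,
    "1": 100.0 * p1,
    "2": 100.0 * p2,
    ">2": 100.0 * (1.0 - p0 - p1 - p2),
}

def chi_square_injured(observed, predicted, n):
    """Chi-square over the injured categories only, computed on counts
    (assumed approach; the source does not spell out its calculation)."""
    total = 0.0
    for key in ("1", "2", ">2"):
        obs_count = observed[key] / 100.0 * n
        exp_count = predicted[key] / 100.0 * n
        total += (obs_count - exp_count) ** 2 / exp_count
    return total

chi_square = chi_square_injured(observed_pct, predicted_pct, N_RUNNERS)

for key in observed_pct:
    print(f"{key:>2}: observed {observed_pct[key]:5.1f}%  "
          f"predicted {predicted_pct[key]:5.1f}%")
print(f"Chi-square (injured only): {chi_square:.1f}")
```

Any value well above 13.82, the .001 critical value for two degrees of freedom, supports the conclusion that the observed distribution departs from the Poisson prediction.
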

Conclusion

In this paper the mathematical relationship between the number of injury causing events and the number of injured in a particular time frame was studied. These are two very different figures which have different mathematical properties. Particularly if a relatively long time frame is used in counting injury causing events and injured individuals, the nature of the data one collects needs to be carefully considered. A practical example was given. This example showed that running injuries are not randomly distributed: injury proneness could be observed, i.e. some personal or situational factor must be related to sustaining more injuries. Comparing a distribution of running injuries with a random distribution can in this way be helpful in interpreting the collected data. Lastly, the author has a computer program available which can be used to calculate Poisson distributions. To obtain this program send a DOS formatted floppy disk, packaging material and a self addressed envelope. This will be returned with the computer program using 2nd class surface mail. (The program is also available online, including a version for MS-Windows.)

Blair SN, Kohl HW, Goodyear NN. Rates and Risks for Running and Exercise Injuries: Studies in Three Populations. Res Qua Exerc Spt 1987; 58: 221-8.
Kelley MJ. Psychological Risk Factors and Sport Injuries. J Spt Med Phys Fitn 1990; 30: 202-21.
Kerr JH. Arousal-Seeking in Risk Sport Participants. Per Ind Diff 1991; 12: 613-16.
Yaffé M. Sports Injuries: Psychological Aspects. Brit J Hosp Med 1983; 29: 224-32.
