Simple Interactive Statistical Analysis


Random Numbers

Input.

Push one of the distribution buttons to generate random numbers from the selected distribution. The two top boxes set the range, or the scale and shape, of the selected distribution. Give the number of random numbers you want in the third box, 'numbers'. The 'default' seed is derived from the time and will be different each time you use the procedure; fill in another value for the seed if you want to. Method is the method by which the numbers are generated. The default method, 'MINSTD', is very good, but you can select another method if you want.

Applications | Discussion | LCG | Non LCG | Uniform | Weibull | Normal | Gamma | Beta

Applications.

Random numbers are used surprisingly often in daily life, usually with the aim of excluding human preferences and dishonesty. A basic example is a die in home games. Random numbers are also used in gambling games such as bingo, poker or baccarat. If a selection has to be made, or something has to be divided among people, a random process is often the most honest "referee". The SISA website has a number of tools to decide home issues where randomness is critical, such as a procedure to allocate people to groups and a procedure to randomly select cases out of a larger pool. The latter can also be used to simulate a die by picking one case out of six.

Discussion.

It is amazing how much is written about the simple matter of drawing a random number. However, for a logical machine such as a computer, doing something as illogical as producing a truly random number is impossible. Computers therefore produce random numbers using some kind of mathematical approach. Precisely what the right approach is remains the topic of much debate.

Basically, the approach is to start with a number, called the seed, and then to apply a mathematical formula which produces another number in a way that is relatively difficult to predict. A different seed will produce a completely different number; the same seed, however, will always produce the same number. This is called a pseudo random procedure: the outcome of applying the formula to a seed is difficult to predict, yet applied to the same seed the outcome is completely determined and always the same. One advantage of pseudo random procedures is that the process can easily be replicated, as long as you remember the seed, the initial value. SISA will use a seed provided by the user or will, by default, use a seed it generates itself from the computer's internal clock. The seed used is given in the program's output.
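The seed behaviour can be sketched in a few lines of R. The function names `next_number` and `clock_seed` are made up for this illustration, and the multiplier and modulus are the published Park and Miller 'minimal standard' constants; that SISA's own code looks like this is an assumption.

```r
# Illustrative one-step generator: new = (old * 16807) mod (2^31 - 1).
# Constants are the published Park-Miller "minimal standard" values;
# whether SISA uses exactly this code is an assumption.
next_number <- function(seed) {
  (seed * 16807) %% 2147483647
}

# Hypothetical clock-based default seed: whole seconds since 1970,
# reduced into the valid range 1 .. 2^31 - 2.
clock_seed <- function() {
  as.numeric(Sys.time()) %/% 1 %% 2147483646 + 1
}

seed <- 12345
first  <- next_number(seed)
second <- next_number(seed)
identical(first, second)         # TRUE: same seed, same number
next_number(seed + 1) != first   # TRUE: another seed, another number
```

Writing down the seed (here 12345) is all that is needed to reproduce the draw later.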

The methods most often used take the random number produced by applying the formula to a seed and use it as the seed for the next number drawn. In this way a large number of pseudo random, i.e. seemingly unrelated, numbers can be produced from a single seed. All methods provided in this SISA procedure work according to this principle. The most common procedures take the set of integers between '0' and 'max' and use a formula to 'trek' through this set, each number pointing to a seemingly unrelated other number from the set, until, after a very large number of seemingly unrelated numbers has been produced, one arrives back at the original seed. This is called the 'cycle'. The quality of the random numbers generated depends on a number of factors: 1) the length of the cycle, which depends on 'max', which in turn depends on the number of bits your computer can handle, the more the better; 2) the chance of each number of the set becoming a selected number within a cycle, which should be 1.0 for each number; 3) the extent to which the 1st, 2nd, 3rd, etc. number selected after each selected number is seemingly unrelated to the previous numbers.
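A toy example makes the 'trek' and the 'cycle' visible. The sketch below uses a deliberately tiny set (0 to 15) and constants chosen only for illustration; `toy_step` and `walk_cycle` are hypothetical names and this is not one of the SISA methods.

```r
# Toy generator on the set 0..15: new = (old * 5 + 3) mod 16.
# Constants are for illustration only; they happen to give a full cycle.
toy_step <- function(x) (x * 5 + 3) %% 16

# Follow the chain from the seed until the seed comes back.
walk_cycle <- function(seed) {
  visited <- seed
  x <- toy_step(seed)
  while (x != seed) {
    visited <- c(visited, x)
    x <- toy_step(x)
  }
  visited
}

cycle <- walk_cycle(7)
length(cycle)          # 16: the whole set is used before the seed returns
anyDuplicated(cycle)   # 0: no number is visited twice within the cycle
sort(cycle)            # 0 1 2 ... 15: every number has been selected
```

With a set this small the pattern is easy to spot by eye, which is why real generators use a 'max' of 31 bits or more.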

A linear congruential generator (LCG) is based on the formula developed by Lehmer (1948), which is written {rnd(i+1)=(rnd(i)*b+a) mod max}. LCGs are at the basis of almost all random number generators currently in use, ranging from random numbers used for screensavers to random numbers used in Fortran, MsWindows and SPSS. From the method box you can select a number of the better known linear congruential generators. SISA selected these generators because they use different sizes of 'max'. Mostly you would use the highest number of bits your computer can handle. However, do not use a procedure with more bits than your computer can handle, as you can easily run into the 'overflow problem'. The 'overflow problem' arises because computers handle very large numbers by chopping off the rightmost digits. This can easily endanger the pseudo randomness of a pseudo random procedure. Most browsers will handle 32 bit numbers; some browsers can handle 35 or even 45 bit numbers. Do not forget, however, that cycle length is not the only quality consideration! MINSTD, the default for SISA, is only 31 bits in length but has been extensively studied and found to produce pseudo random numbers of a very high quality. Lastly, although Lehmer's formula is widely used, problems do occur and you must inspect a set of random numbers before use. Pay particular attention to the rightmost digits: if they show a short cycle you should try again with another seed or another method.
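Lehmer's formula translates almost literally into R. The sketch below is an illustration, not SISA's code: the function name `lcg` is hypothetical, the MINSTD constants are those published by Park and Miller, and the overflow guard relies on the fact that R stores these numbers as doubles, which are exact only up to 2^53.

```r
# Lehmer's formula as written in the text: new = (old * b + a) mod max.
lcg <- function(seed, n, b, a, max) {
  # Doubles are exact only up to 2^53; refuse parameter sets whose
  # intermediate result could exceed that (the 'overflow problem').
  if ((max - 1) * b + a >= 2^53) stop("overflow: result exceeds 53 bits")
  out <- numeric(n)
  x <- seed
  for (i in seq_len(n)) {
    x <- (x * b + a) %% max
    out[i] <- x
  }
  out
}

# MINSTD as published: b = 16807, a = 0, max = 2^31 - 1.
minstd <- lcg(seed = 1, n = 5, b = 16807, a = 0, max = 2147483647)
minstd   # 16807 282475249 1622650073 984943658 1144108930

# Inspecting the rightmost bit of a small power-of-two generator
# (illustrative constants) reveals a very short cycle.
toy <- lcg(seed = 7, n = 8, b = 5, a = 3, max = 16)
toy %% 2   # 0 1 0 1 0 1 0 1
```

The alternating last line is the kind of short cycle in the rightmost digits that the inspection advice is about.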

Lower bit procedures can be used to exploit a peculiar characteristic of linear congruential generators: they draw numbers from a set without replacement. Thus, once a certain number has been selected it will not return in the same cycle. Although this characteristic is not often discussed by mathematicians, it can be handy from a practical perspective. If you want a set of all different numbers, randomly 'mixed' and in a certain range, use the following procedure: a) draw from a 'uniform' distribution between '0' and 'max'; and b) remove the numbers which fall outside your range. Using a lower bit procedure might make this approach considerably more efficient, because fewer numbers fall outside the range and have to be removed.
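Steps a) and b) might look as follows in R. This is only a sketch: `mixed_range` is a hypothetical name, and the small full-cycle generator inside it uses illustrative constants that are not one of the SISA methods and would be a poor choice where quality matters.

```r
# Return all whole numbers lo..hi (with 0 <= lo <= hi) in a mixed order,
# each exactly once, by walking one full cycle of a small generator
# new = (old * 5 + 3) mod 2^k and discarding out-of-range values.
mixed_range <- function(lo, hi, seed = 0) {
  k <- 2
  while (2^k <= hi) k <- k + 1       # smallest power of two above hi
  size <- 2^k                        # plays the role of 'max'
  x <- seed %% size
  kept <- numeric(0)
  for (i in seq_len(size)) {         # a) one full cycle visits 0..size-1 once
    x <- (x * 5 + 3) %% size
    if (x >= lo && x <= hi) kept <- c(kept, x)   # b) drop values out of range
  }
  kept
}

deck <- mixed_range(1, 10, seed = 7)
deck                 # 6 1 8 10 5 9 3 2 4 7
anyDuplicated(deck)  # 0: all different
```

Because 'size' is the smallest suitable power of two, at most about half of the draws are thrown away; with a 31 bit generator nearly all of them would be.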

Non Linear Congruential Generators are designed to address some of the problems with LCGs. The problem is that they have not been studied very well, so nobody really knows how they perform. SISA has incorporated two non-LCG generators. Both are based on the principle of producing two random numbers in each draw, which are then put together.

L'Ecuyer's portable combined generator produces numbers with replacement, so the same number can return within a cycle. An advantage of this procedure is that it only requires a 32 bit computer, while the cycle length is approximately 43 bits.
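The idea of producing two numbers per draw and putting them together can be sketched as below. The constants are those of L'Ecuyer's published 1988 combined generator; the function name `lecuyer` is hypothetical, and whether SISA uses these exact constants or this way of combining is an assumption.

```r
# Two multiplicative generators run side by side; each draw combines
# their outputs into one number. Seeds must lie in 1..m1-1 and 1..m2-1.
lecuyer <- function(seed1, seed2, n) {
  m1 <- 2147483563; a1 <- 40014
  m2 <- 2147483399; a2 <- 40692
  s1 <- seed1
  s2 <- seed2
  out <- numeric(n)
  for (i in seq_len(n)) {
    s1 <- (a1 * s1) %% m1          # first component
    s2 <- (a2 * s2) %% m2          # second component
    z <- s1 - s2                   # put the two numbers together
    if (z < 1) z <- z + (m1 - 1)   # wrap into 1 .. m1 - 1
    out[i] <- z
  }
  out
}

x <- lecuyer(seed1 = 12345, seed2 = 67890, n = 1000)
range(x)   # all values lie between 1 and 2147483562
```

Each component stays below 32 bits, but an output value can recur before the pair of internal states does, which is why numbers are not guaranteed to be unique.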

PaMi is the generator suggested by Park and Miller in their article 'Random Number Generators: Good Ones Are Hard to Find' (1988).

Uniform. In the uniform distribution all values between 'lower value' and 'higher value' have an equal chance of being selected. By default, SISA selects numbers between '0' (zero) and 'max', the maximum value the selected method can produce. You can rescale the numbers to a range between 'lower value' and 'higher value' by giving your own values in the appropriate boxes. For example, if you want to draw a number between zero and one, fill in '0.0' and '1.0'. The values produced stay unique after rescaling, i.e. they are sampled without replacement, but if you round your numbers after rescaling you will lose this uniqueness.
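The rescaling and the effect of rounding can be illustrated as follows. The function name `rescale` and the eight raw values (taken from a toy generator with 'max' equal to 16) are assumptions made for the example.

```r
# Rescale raw generator output (whole numbers in 0..max) to lower..higher.
rescale <- function(raw, max, lower, higher) {
  lower + (higher - lower) * raw / max
}

raw <- c(6, 1, 8, 11, 10, 5, 12, 15)   # eight distinct raw draws, max = 16
u <- rescale(raw, max = 16, lower = 0, higher = 1)
anyDuplicated(u)          # 0: still unique after rescaling
anyDuplicated(round(u))   # greater than 0: rounding makes values coincide
```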

Weibull allows you to draw random numbers from the Weibull distribution with the given shape and scale parameter. For a discussion of the Weibull distribution see the Weibull distribution page. The Weibull random number generator can also be used to generate numbers from the Rayleigh distribution (set alpha, the shape, at 2) and the exponential distribution (set alpha, the shape, at 1; and beta, the scale, at 1/lambda). The exponential distribution is often used in discrete simulation to model the arrival of, for example, customers at a counter with the mean waiting time until the next customer arrives being beta. Lambda (1/beta) is the hazard rate.
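One common way to turn uniform numbers into Weibull numbers is the inverse transform, sketched below. This is an illustration under assumptions: the function name `weibull_from_uniform` is made up, R's own `runif` stands in for the SISA generator, and the text does not say which method SISA itself uses.

```r
# Inverse transform: if u is uniform on (0, 1) then
# scale * (-log(1 - u))^(1 / shape) is Weibull(shape, scale).
weibull_from_uniform <- function(u, shape, scale) {
  scale * (-log(1 - u))^(1 / shape)
}

set.seed(1)                # R's generator stands in for SISA's
u <- runif(5000)
w    <- weibull_from_uniform(u, shape = 1.5, scale = 2)
rayl <- weibull_from_uniform(u, shape = 2, scale = 1)            # Rayleigh case
lambda <- 0.25
expo <- weibull_from_uniform(u, shape = 1, scale = 1 / lambda)   # exponential
mean(expo)   # close to beta = 1 / lambda = 4
```

In the exponential case the mean of the draws estimates beta, the mean waiting time until the next arrival.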

[Figure: Weibull QQ plot, SISA against R]

The figure shows the qq-plot of 5000 SISA Weibull random numbers generated with the MINSTD procedure against 5000 R random numbers generated with R's default generator. As can be seen, the two perform very similarly, except at the very extreme values, which the SISA MINSTD procedure generates less often than the R default. A footnote below shows how this figure was made.

Normal. To draw pseudo random numbers from a normal distribution one provides a mean and a standard deviation. "Normal distribution" means that the probability of a number appearing is high for values close to the mean and becomes lower and lower the further the value is removed from the mean. How 'flat' the distribution is depends on the value of the standard deviation. If the standard deviation equals '1' (one) then 68% of observations will fall within the range mean ± 1 and 95% of observations within the range mean ± 1.96. By default the mean is zero and the standard deviation is 'max'. Given that 32% of observations will then fall outside the range ± 'max', you might have an 'overflow' problem if you run at the maximum number of bits for your computer.
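The 68% and 95% figures can be checked directly in R. The helper name `share_within` and the example mean and standard deviation (100 and 15) are assumptions for this illustration.

```r
# Share of a normal distribution within k standard deviations of the mean.
share_within <- function(k) pnorm(k) - pnorm(-k)

round(share_within(1), 4)      # 0.6827
round(share_within(1.96), 4)   # 0.95

# A standard normal draw z is moved to any mean and standard deviation
# by x = mean + sd * z; the shares above do not change.
set.seed(1)
x <- 100 + 15 * rnorm(5000)
mean(abs(x - 100) <= 15)       # roughly 0.68
```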

[Figure: Normal QQ plot, SISA against R]

Normal I is SISA's old random number generator. This generator works on the principle that the sum of a very large number (vln) of random numbers drawn uniformly from the range 0 to 1 is normally distributed with mean vln/2 and variance vln/12. In the case of SISA vln equals 12, which is sufficient for most purposes. This method, based on the central limit theorem, is conceptually correct and a very elegant way to produce normal random numbers, with one flaw: it requires a large number of good random numbers from another distribution. Numbers selected using this procedure are not necessarily unique vis-a-vis each other.
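With vln equal to 12 the sum has mean 6 and variance 1, so subtracting 6 gives an approximately standard normal number. A minimal sketch, assuming R's `runif` as the source of uniform numbers and using the hypothetical name `normal_sum12`:

```r
# Sum of 12 uniform(0, 1) numbers: mean 12/2 = 6, variance 12/12 = 1.
normal_sum12 <- function(n, mean = 0, sd = 1) {
  u <- matrix(runif(12 * n), nrow = n, ncol = 12)   # 12 uniforms per draw
  z <- rowSums(u) - 6                               # centre on zero
  mean + sd * z
}

set.seed(1)
z <- normal_sum12(5000)
c(mean(z), sd(z))   # close to 0 and 1
range(z)            # can never fall outside -6 .. 6
```

The hard limit of six standard deviations is one reason a generator of this kind produces fewer very extreme values than an exact method.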

The figure shows the qq-plot of the SISA Normal I Super Duper random number generator against R normal random numbers generated with R's default generator. Again, the two perform very similarly, except at the very extreme values. The Shapiro test on the 5000 SISA random numbers gives W = 0.9995, p-value = 0.2086, so the observed deviation from normality has a high probability of being a chance phenomenon.

Normal II has been more recently implemented. This generator converts numbers from the uniform distribution into normally distributed numbers using a mathematical formula. The polar form of the well known Box-Muller (1958) transformation is used. This is an often used normal random number generator. Numbers selected using this procedure are not necessarily unique vis-a-vis each other.
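The polar form of the transformation can be sketched as follows. The function name `normal_polar` is hypothetical and R's `runif` stands in for the SISA uniform generator; this shows the general technique, not SISA's own code.

```r
# Polar form of the Box-Muller transformation: pairs of uniform numbers
# on (-1, 1) become pairs of independent standard normal numbers.
normal_polar <- function(n, mean = 0, sd = 1) {
  out <- numeric(0)
  while (length(out) < n) {
    u <- runif(1, -1, 1)
    v <- runif(1, -1, 1)
    s <- u^2 + v^2
    if (s > 0 && s < 1) {            # keep only points inside the unit circle
      f <- sqrt(-2 * log(s) / s)
      out <- c(out, u * f, v * f)    # each accepted pair yields two numbers
    }
  }
  mean + sd * out[seq_len(n)]
}

set.seed(1)
z <- normal_polar(5000)
c(mean(z), sd(z))   # close to 0 and 1
```

About 21% of the uniform pairs fall outside the unit circle and are rejected, which is the price paid for avoiding sine and cosine calls.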

Gamma draws random numbers from the gamma distribution. Input the gamma scale and shape parameters in the scale and shape boxes. The shape must be an integer, so the procedure is in fact limited to the Erlang distribution. Please note that you have to give the numbers in the opposite order to that used in the Gamma & Beta procedure.
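An integer shape makes a simple construction possible: an Erlang number is the sum of 'shape' independent exponential numbers. The sketch below shows that standard construction; the name `erlang` is hypothetical and it is an assumption that SISA works this way.

```r
# Erlang (gamma with whole-number shape): sum of 'shape' independent
# exponential numbers, each with mean 'scale'.
erlang <- function(n, shape, scale) {
  stopifnot(shape >= 1, shape == round(shape))
  u <- matrix(runif(n * shape), nrow = n, ncol = shape)
  -scale * rowSums(log(u))     # -scale * log(u) is exponential with mean scale
}

set.seed(1)
g <- erlang(5000, shape = 3, scale = 2)
mean(g)   # close to shape * scale = 6
var(g)    # close to shape * scale^2 = 12
```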

Beta draws random numbers from the beta distribution. Input the two beta shape parameters in the scale and shape boxes. Both must be integers. Please note that you have to give the numbers in the opposite order to that used in the Gamma & Beta procedure.
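With integer shape parameters a and b, a beta number can be obtained as the a-th smallest of a + b - 1 uniform numbers. The sketch below uses that standard result; the name `beta_int` is hypothetical and the text does not confirm that SISA uses this method.

```r
# Beta(a, b) for whole-number a and b: the a-th smallest of
# a + b - 1 independent uniform(0, 1) numbers.
beta_int <- function(n, a, b) {
  stopifnot(a >= 1, b >= 1, a == round(a), b == round(b))
  replicate(n, sort(runif(a + b - 1))[a])
}

set.seed(1)
x <- beta_int(5000, a = 2, b = 5)
mean(x)   # close to a / (a + b) = 2/7, about 0.286
```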

All software and text copyright by SISA

R job for figures above:
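# Read the SISA output, a text file with a column named 'outcome'
setwd("D:/quantitativeskills/sisa/calculations")
mydata <- read.table("rweibull.txt", header = TRUE)
# Reference sample from R; for the normal figure use: dist <- rnorm(5000, 0, 1)
dist <- rweibull(5000, 1, 2)
par(mar = c(5, 5, 5, 2) + 0.1)
qqplot(dist, mydata$outcome, main = "Weibull Q-Q Plot", xlab = "R Quantiles", ylab = "Sisa Quantiles", cex.lab = 2, cex.axis = 1.5, cex.main = 3)
# Reference line y = x: points on it mean both samples have equal quantiles
abline(0, 1, col = "red", lwd = 3)
# For the normal figure: shapiro.test(mydata$outcome)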