The program provides a full analysis of a two-by-r (2xr) table. It mainly concerns the description and analysis of differences in a dichotomous (percentage) outcome between groups or empirical categories. The interest is in the presence or absence of a certain characteristic in each group, for example the difference in the proportion of smokers between age groups or geographical regions. The usual instruments of contingency table analysis, such as chi-squares and various statistical tests, are available. In addition, a few concepts from one-way analysis of variance are applied, such as basic multiple contrasts and (LRX) variance analysis. A basic 2xr table is shown below. It has two columns and five rows; the integer counts in the yellow boxes are the data used for the analysis.
To do a similar analysis on data with a continuous dependent variable, such as weight in kilos or pounds, or length in cm or inches, use a one-way ANOVA combined with the means procedure.
Table 1. Example of a 2xr table:
| Active and non-active respondents | Active | Not-active |
|---|---|---|
| very young | 10 | 24 |
| young | 15 | 15 |
| intermediate | 19 | 21 |
| old | 18 | 17 |
| very old | 22 | 14 |

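
As an illustration of the kind of analysis applied to such a table, the following minimal sketch computes the proportion of active respondents per row and a Pearson chi-square for the counts of Table 1. The function name `pearson_chi_square` is hypothetical, and it is an assumption that the program's chi-square corresponds to this textbook formula.

```python
# Counts from Table 1: (row label, Active, Not-active)
table = [
    ("very young", 10, 24),
    ("young", 15, 15),
    ("intermediate", 19, 21),
    ("old", 18, 17),
    ("very old", 22, 14),
]


def pearson_chi_square(rows):
    """Return (chi-square, degrees of freedom) for a 2xr table of counts."""
    col1 = sum(a for _, a, _ in rows)
    col2 = sum(b for _, _, b in rows)
    total = col1 + col2
    chi2 = 0.0
    for _, a, b in rows:
        row_total = a + b
        for observed, col_total in ((a, col1), (b, col2)):
            # Expected count under independence of row and column.
            expected = row_total * col_total / total
            chi2 += (observed - expected) ** 2 / expected
    # A 2xr table has (2 - 1) * (r - 1) = r - 1 degrees of freedom.
    return chi2, len(rows) - 1


# Proportion of "Active" respondents within each row.
proportions = {label: a / (a + b) for label, a, b in table}
chi2, df = pearson_chi_square(table)
print(proportions)
print(chi2, df)
```

The proportions are the dichotomous percentage outcome the analysis compares between rows; the chi-square tests whether those proportions differ more than chance would suggest.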
The procedure consists of three web pages that follow one another.
This help page is about how to input data or a table for analysis. Another help page explains how to edit the data and specify the analysis.
The data input procedure requires two columns of individual-level data on persons or objects. The first column is treated as an explanatory, multinomial, categorical factor variable that explains a dichotomous numerical outcome (the dependent variable) in the second column. The factor variable in the first column can be a name, a number or text. The outcome variable in the second column must always be numerical; a non-numerical value is invalid.
Read weights. With this option every third value, in a third column, is read as the case weight of the two values before it. The case weight must be numerical; if it is not, the case and its values are ignored and counted as invalid. A weighting-corrected table is produced. For a discussion of data weighting and the correction applied, please read this paper.
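The reading of weighted cases can be sketched as follows. This is only an illustration under stated assumptions: the function `weighted_table` is hypothetical, the weights are simply summed per cell, and the correction discussed in the paper is not reproduced. Which of the two groups ends up in which column is also an assumption.

```python
def weighted_table(tokens, split):
    """Sum case weights per category and outcome group.

    tokens is a flat list of strings: label, value, weight, label, value, ...
    Returns (table, invalid), where table maps each label to
    [weight below split, weight at or above split].
    """
    table = {}
    invalid = 0
    # Walk the flat list three items at a time; an incomplete trailing
    # case is skipped.
    for i in range(0, len(tokens) - 2, 3):
        label, raw_value, raw_weight = tokens[i:i + 3]
        try:
            value = float(raw_value)
            weight = float(raw_weight)
        except ValueError:
            # Non-numerical value or weight: ignore the case, count it.
            invalid += 1
            continue
        row = table.setdefault(label, [0.0, 0.0])
        row[0 if value < split else 1] += weight
    return table, invalid


tokens = ["a", "1", "2.0", "a", "5", "1.5", "b", "5", "x", "b", "0", "1"]
table, invalid = weighted_table(tokens, split=3)
print(table, invalid)
```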
The procedure dichotomizes the variable in the second column into two groups: values below the number in the "Split data at value" box, and values equal to or above it.
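A minimal sketch of this dichotomization, assuming the data has already been read as (label, value) pairs of strings. The function name `build_table` is hypothetical, and it is an assumption that the group below the split value goes in the first column.

```python
from collections import defaultdict


def build_table(pairs, split):
    """Count cases per category in two groups: below split, at or above split."""
    counts = defaultdict(lambda: [0, 0])
    invalid = 0
    for label, raw in pairs:
        try:
            value = float(raw)
        except ValueError:
            # The outcome must be numerical; otherwise the case is invalid.
            invalid += 1
            continue
        counts[label][0 if value < split else 1] += 1
    return dict(counts), invalid


pairs = [("young", "1"), ("young", "3"), ("old", "2"), ("old", "x"), ("old", "2.0")]
table, invalid = build_table(pairs, split=2)
print(table, invalid)
```

A value equal to the split value, such as the `2` above, falls in the second group.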
The data can be copied from, for example, a spreadsheet or word processor, or typed manually into the "Input data in here:" field. The numbers in the input field must be separated by spaces, returns, semicolons, colons or tabs; they can NOT be separated by commas or full stops. The data in the input field is read row-wise: first the data in the first row, then the second, the third and all other rows.
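The separator rule can be illustrated with a short sketch that splits pasted text on spaces, returns, tabs, semicolons and colons, and then pairs the items row-wise. The helper names `tokenize` and `to_pairs` are hypothetical, and the regular expression is one possible reading of the rule, not the program's actual parser.

```python
import re


def tokenize(text):
    """Split input on whitespace (spaces, returns, tabs), semicolons and colons."""
    return [token for token in re.split(r"[\s;:]+", text) if token]


def to_pairs(tokens):
    """Pair the items row-wise: label, value, label, value, ..."""
    return list(zip(tokens[0::2], tokens[1::2]))


text = "young 1;old:0\tyoung\r\n1.5"
tokens = tokenize(text)
pairs = to_pairs(tokens)
print(pairs)
```

Commas and full stops are deliberately not in the separator set, so a decimal value such as `1.5` stays in one piece.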
Sort Descending. Sorts the categories (labels) of the factor variable in the first column in descending order; otherwise they are sorted in ascending order.
LowerCase. Converts all non-numerical text characters of the factor variable to lower case. Use this option if you want to categorize the data case-insensitively.
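The effect of these two options on the row categories can be sketched as below. The function `category_order` is hypothetical, and plain string comparison is an assumption; the program may order numerical labels differently.

```python
def category_order(labels, lower_case=False, sort_descending=False):
    """Return the distinct categories in the order the rows would appear."""
    if lower_case:
        # Lower-casing merges categories that differ only in case.
        labels = [label.lower() for label in labels]
    return sorted(set(labels), reverse=sort_descending)


labels = ["Old", "old", "young"]
as_is = category_order(labels)
merged = category_order(labels, lower_case=True, sort_descending=True)
print(as_is)
print(merged)
```

Without the LowerCase option "Old" and "old" are two separate rows; with it they become one.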
Show Rows limits or expands the number of rows displayed. Because the procedure cannot handle very large tables with many rows, you can limit the number of rows by entering an integer value in the "Show Rows" box. This box can also be used to exclude particularly high or, after "Sort Descending", particularly low (missing) value categories from the analysis. If you want to input more data, the number of rows can be expanded by entering a higher number in this box.
Solve problems into 99999.9. Changes the data sequence carriage return, line feed, tab and the sequence tab, carriage return, line feed into 99999.9 if the missing item is a label, or deletes the case if it is a value. This will mostly solve the problem of system-missing values in data copied and pasted from SPSS. It might cause other problems.
You can also specify a table in the input field. A table consists of counts, such as those in the yellow fields of "Table 1. Example of a 2xr table" shown above. The table can be copied from, for example, a spreadsheet or word processor, or typed in manually. The numbers in the input field must be separated by spaces, returns, semicolons, colons or tabs; they can NOT be separated by commas or full stops. The table in the input field is read row-wise: first the count for the first column of the first row (the males, say), then the count for the second column (the females), and then the same for the second, the third and all other rows. The table specification above takes precedence over the input field. Thus, if the input field contains more values than the table needs, the surplus is ignored; if it contains fewer, zeros are added to the table generated for the next "check, edit and input, and specify analysis" web page.
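The precedence of the table specification over the input field can be sketched as follows, assuming the counts have already been read as a flat list and the number of rows has been chosen. The function `fill_table` is hypothetical.

```python
def fill_table(numbers, n_rows):
    """Arrange a flat, row-wise list of counts into n_rows rows of two columns.

    Surplus values are ignored; missing values are filled with zeros.
    """
    needed = 2 * n_rows
    cells = list(numbers[:needed])
    cells += [0] * (needed - len(cells))
    return [cells[i:i + 2] for i in range(0, needed, 2)]


too_little = fill_table([10, 24, 15], n_rows=2)
too_much = fill_table([1, 2, 3, 4, 5, 6], n_rows=2)
print(too_little)
print(too_much)
```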
For this procedure you must first define the number of rows in your table. A table filled with zeros, with two columns and the number of rows you requested, is generated in a new web page titled "check, edit and input, and specify analysis". There you can edit data that was read from the input field or insert values. You can also limit the number of rows to exclude particularly high or low (missing) values. On the same page you can check boxes for the procedures you want; pushing the "calculate" button opens yet another web page with the results of the analysis.
Check "Read labels in 1st column" if you want to read the row labels from the input field. The data then consists of three columns: first the row label, which can be any text, word or value, followed by two columns of numeric data. Use the usual separators and take care that they are not part of the label. The column headings must be entered manually in the next form; they cannot be read from the input field.
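A minimal sketch of reading such a labelled table, assuming the same separators as for the other input (spaces, returns, tabs, semicolons, colons). The function `read_labelled_table` is hypothetical and does no error handling for non-numeric counts.

```python
import re


def read_labelled_table(text):
    """Read rows of label, count, count from separator-delimited text."""
    tokens = [t for t in re.split(r"[\s;:]+", text) if t]
    rows = []
    # Three items per row; an incomplete trailing row is skipped.
    for i in range(0, len(tokens) - 2, 3):
        label, first, second = tokens[i:i + 3]
        rows.append((label, int(first), int(second)))
    return rows


rows = read_labelled_table("young 15 15\nold 18 17")
print(rows)
```

This also shows why separators must not be part of a label: a label such as "very young" contains a space and would be read as two separate items, shifting every value after it.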
The procedure is meant for relatively small tables. The number of rows is in principle limited to 120, but the limit may be lower depending on your browser and other settings. It is also rather lower with weighted data, as more information has to be transferred.