Contents: Ordered categories as dependent variables; Introduction to LIMDEP; Basic LIMDEP commands; Using LIMDEP to carry out ordered logit analysis; Interpreting the results from ordered logit analysis; Further reading and acknowledgement; References; Examples.

If we have a dependent variable which is measured only on an ordinal
scale, strictly speaking we cannot use linear regression to examine it.
However, in practice, so long as the dependent variable has a reasonable
number of levels, regression will work perfectly adequately. If the dependent
variable is dichotomous, we can use discriminant analysis or logistic regression.
But what about the intermediate case, where the dependent variable has
3 to perhaps 6 different levels? In such cases, ordinary linear regression
may give misleading results. We need to use **ordered logit** analysis.
There are a number of types of ordered logit model; what is described here
is the one most commonly used, called the **proportional odds** model,
which uses **cumulative logits**.

Like logistic regression, ordered logit uses maximum likelihood methods,
and finds the best set of regression coefficients to predict values of
the logit-transformed probability that the dependent variable falls into
one category rather than another. Logistic regression assumes that if the
fitted probability, *p*, is greater than 0.5, the dependent variable
should have value 1 rather than 0. Ordered logit doesn't have such a fixed
assumption. Instead, it fits a set of cutoff points. If there are *r*
levels of the dependent variable (coded 0 to *r*-1), it will find *r*-1
cutoff values *k*_{1} to *k*_{r-1} such that
if the fitted value of logit(*p*) is below *k*_{1}, the
dependent variable is predicted to take value 0, if the fitted value of
logit(*p*) is between *k*_{1} and *k*_{2},
the dependent variable is predicted to take value 1, and so on. As with
logistic regression, we get an overall chi-squared for the goodness of fit
of the entire fitted model, and we can also use a chi-squared test to assess
the improvement due to adding an extra independent variable or group of
independent variables. As with logistic regression, a crucial piece of
information for evaluating the fit of the model is a table of predicted
versus observed category membership.
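
The cutoff idea can be made concrete with a small numerical sketch. It is illustrative only: it assumes the common way of writing the proportional odds model, in which the cumulative probability that the dependent variable is at or below category *j* is the logistic function of the cutoff *k*_{j+1} minus the fitted linear predictor, with categories coded from 0. The function names, cutoffs and fitted values are made up, and LIMDEP's own normalisation of the cutoffs may differ.

```python
import math

def logistic(z):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(linear_predictor, cutoffs):
    """Return P(Y = 0), ..., P(Y = r-1) under a proportional odds model.

    Assumes P(Y <= j) = logistic(cutoffs[j] - linear_predictor),
    with the cutoffs in strictly increasing order.
    """
    cumulative = [logistic(k - linear_predictor) for k in cutoffs] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

def predicted_category(linear_predictor, cutoffs):
    """Predict a category by counting the cutoffs that lie below the fitted value."""
    return sum(1 for k in cutoffs if linear_predictor > k)

cutoffs = [-1.0, 1.5]  # hypothetical k_1 and k_2 for a 3-level dependent variable
for eta in (-2.0, 0.0, 3.0):
    probs = category_probabilities(eta, cutoffs)
    print(eta, predicted_category(eta, cutoffs), [round(p, 3) for p in probs])
```

With *r* = 3 there are two cutoffs, and the three category probabilities always sum to 1 whatever the fitted value.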


Unfortunately ordered logit is not available in SPSS. It can be done in SAS, but we have no licence for that at Exeter. However, ordered logit can also be found in a package called LIMDEP, which specialises in fitting models with LIMited DEPendent variables (though it also contains procedures for doing ordinary regression). It was written with econometricians in mind, and is most used in economics departments; however, we have a licence for Version 6.0 on Singer. LIMDEP is not as comprehensive as SPSS, and not as easy to use as Minitab, but it is not difficult to use it for a restricted purpose such as doing an ordered logit analysis on data which we have already prepared for SPSS.

Even if we only want to use LIMDEP to carry out this single type of analysis, however, we need to know a little bit about it: its command syntax, how to prepare a data file for the analysis, and how it deals with the basic functions every statistics package must implement. These basic functions include starting a session, reading in a text file, assigning names to columns of data, transformations and other calculations based on columns of data, dealing with missing values, creating and using dummy variables, finishing a session, creating a file of commands that can be edited and reused, and sending output to a file for editing or printing.

All these are described in the comprehensive LIMDEP manual (800+ pages). The abridged version (200 pages) is fine if you have once known what to do and just need reminding, and for some purposes is as good as the full manual. The manuals refer to our version as the "mainframe" version; the PC version, around which the manuals are based, contains some additional facilities. For example, there is no usable HELP facility on the Singer version.


This section aims to give you just enough information to enable you
to take data that you have prepared for SPSS (or produced from SPSS or
Minitab using a **WRITE** command) and get them into LIMDEP ready for
an ordered logit analysis. It does not cover all the facilities of LIMDEP,
which include some useful short cuts not available in SPSS. If you find
yourself using LIMDEP a lot, read through the abridged manual to find out
what is available.

In this section, LIMDEP commands are printed in **bold** type; these
should be typed in to the computer exactly as given here. The bits in *italic*
type are where you have to substitute in information that is specific to
your project.

- LIMDEP command syntax. Like SPSS and Minitab, LIMDEP has commands and
subcommands (though the LIMDEP manual does not use the word "subcommands"
for the latter). All commands and subcommands can be typed in upper or
lower case or any mixture. In these notes, commands will be put in UPPER
CASE and subcommands in lower case. Subcommands are separated by semi-colons
(same as Minitab; SPSS uses /). The entire command is terminated by $ (Singer
SPSS uses .). LIMDEP variable names consist of up to 8 letters or digits, starting
with a letter (similar to SPSS).

- Preparing data files for LIMDEP. As for SPSS or Minitab, we want the
data to appear in orderly columns, with all the data for the first person
followed by all the data for the second person, etc. Text data files prepared
for use with SPSS or Minitab, or produced from either with its WRITE command,
should be suitable provided they are sensibly formatted. There is one catch,
however: the dependent variable for an ordered logit analysis MUST be coded
with values 0, 1, 2..., NOT 1, 2, 3..., -1, 0, 1, or any of the other logically
equivalent possibilities. If an alternative code has been used, the values
must be converted either before transferring the data to LIMDEP, or after
they have been read in but before an ordered logit analysis is attempted
(you would use the **CREATE** command for this, see below). Missing values should preferably be coded with a digit or pattern of digits that will never occur in real data (as in SPSS), though an asterisk (as in Minitab) can be used (but see below).

- Starting a session. At the Singer prompt, type
**limdep**. This will produce an introductory screen. Type **start**. This will give you the LIMDEP prompt; confusingly, you are taken to the line below to type in your response.

- Reading in a text file. All data must be numerical. The command is

**READ;nvar=***number of variables***;file=***name of file***;names=***full list of names***$**

This version of the command assumes that the data are typed case by case, as for the SPSS **DATA LIST** command, or Minitab **READ**. The subcommands **nvar**, **file**, and **names** must all be present. *Note that the names in the list must be separated by commas, but there must not be a comma after the last name.*

- Assigning names to columns of data. This is done directly by the
**READ** command, or by other commands such as **CREATE**, see below (similar to SPSS). There is no way of assigning names to levels of variables (i.e. no equivalent of SPSS **VALUE LABELS**).

- Transformations and computations based on columns. The usual command
is

**CREATE;if(***logical expression***)***name***=***expression***$**

This sets the value of a variable called *name* equal to *expression* for all cases in which *logical expression* is true; if *name* already exists, its values are overwritten. For cases where *logical expression* is false, *name* is set to 0 if it is a new variable, or left unchanged if it already exists. **if(***logical expression***)** can be omitted. If **if** is present, the subcommand **else** *name***=***expression* may be used; the two *names* need not be the same. All the usual logical and arithmetic operators (>, <, =, &, +, -, etc) are available, as are standard mathematical functions. The procedure is very similar to SPSS **IF** or **COMPUTE**, and in simple cases it is similar to Minitab **LET**.

- Missing values. Numerical codes (from SPSS) or
alphabetical codes (in particular the Minitab * code) can both be used
in LIMDEP data files. However, alphabetical codes (including *) are changed
to -999 on input.

We have to tell LIMDEP explicitly to ignore cases including the specified missing values. This can be done with the commands

**SAMPLE;all$**

**REJECT;***logical expression***$**

An example of a *logical expression* would be **age=99+incomegp=9**, where 99 is the missing value code for **age** and 9 is the missing value code for **incomegp**. Note the use of **+** to mean 'or', and the absence of parentheses between expressions linked by **+**. The **SAMPLE;all** command restores the full data set; successive **REJECT**s without intervening **SAMPLE** commands would have the same effect as linking in further logical expressions by **+**. Note that we have to set up the correct **REJECT**s before doing our analysis; the analysis commands do not themselves detect missing values.

- Creating and using dummy variables. This has to be done by brute force,
as in the following example, based on a 5-level categorical variable,
**worktype**:

**CREATE;if (worktype=1) fulltime=1$**

**CREATE;if (worktype=2) parttime=1$**

**CREATE;if (worktype=3) housespo=1$**

**CREATE;if (worktype=4) unemploy=1$**

**CREATE;if (worktype=5) retired=1$**

Note that we don't have to worry about missing values in the dummy variables, because we will deal with them by setting up a **REJECT** based on **worktype**.

- Finishing a session. To leave LIMDEP, type
**STOP$** (same as Minitab; SPSS uses **FINISH**).

- Setting up a file of commands to re-use. Strings of commands, using
exactly the same syntax as outlined here, can be prepared as text files
using any Singer text editor. Then if you enter LIMDEP as usual, and enter
the command

**OPEN;input=***filename***$**

your commands will be read in and executed. If you don't want to see intermediate results, precede **OPEN** by the command **FAST$**. When all the commands in the file have been executed, the LIMDEP prompt will appear, and you can continue working interactively.

- Sending output to a file for editing or printing. This is done by the
command

**OPEN;output=***filename***$**

Like Minitab **OUTFILE**, this copies subsequent screen output (or most of it) to *filename*. To stop copying, use the command **CLOSE$** (compare Minitab **NOOUTFILE**).
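
The data-preparation steps in this list (rejecting cases that carry missing-value codes, recoding the dependent variable so that it starts at 0, and building dummy variables by brute force) can be mimicked outside LIMDEP as a check on one's understanding. The following Python sketch is not LIMDEP code: the helper functions, the missing-value code 9 and the five-case data set are all invented for illustration.

```python
# Each case is a dict; 9 is the (hypothetical) missing-value code for both variables.
cases = [
    {"rating": 1, "worktype": 1},
    {"rating": 3, "worktype": 2},
    {"rating": 2, "worktype": 9},   # missing worktype
    {"rating": 9, "worktype": 5},   # missing rating
    {"rating": 2, "worktype": 5},
]
MISSING = {"rating": 9, "worktype": 9}

def reject_missing(rows, missing_codes):
    """Drop a case if ANY listed variable holds its missing code (the 'or' rule)."""
    return [
        row for row in rows
        if not any(row[name] == code for name, code in missing_codes.items())
    ]

def recode_from_zero(rows, name):
    """Shift an ordinal variable so that its lowest observed value becomes 0."""
    lowest = min(row[name] for row in rows)
    return [{**row, name: row[name] - lowest} for row in rows]

def add_dummies(rows, name, labels):
    """Create one 0/1 variable per level, e.g. labels = {1: 'fulltime', ...}."""
    return [
        {**row, **{label: int(row[name] == level) for level, label in labels.items()}}
        for row in rows
    ]

worktype_labels = {1: "fulltime", 2: "parttime", 3: "housespo", 4: "unemploy", 5: "retired"}

kept = reject_missing(cases, MISSING)        # missing cases go first
kept = recode_from_zero(kept, "rating")      # 1, 2, 3 becomes 0, 1, 2
kept = add_dummies(kept, "worktype", worktype_labels)
for row in kept:
    print(row)
```

The order matters in this sketch: cases are rejected before recoding, so that the missing-value code is never shifted into the range of real values.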


The command for ordered logit in LIMDEP is the following:

**ORDERED PROBIT;lhs=***name of DV***;rhs=one,***names of IVs***;logit;output=5$**

The command name can be abbreviated to

According to the manuals, a subcommand

Output from the program is pretty well self-explanatory, except for
three problems. The first problem is that the output is headed "Ordered
Probit Model", which is confusing since ordered probit and ordered
logit are two different kinds of analysis. The heading arises because LIMDEP
uses the same command for both (the subcommand **logit** tells it which
version we want). Second, before LIMDEP embarks on the ordered logit analysis,
it does an approximate linear regression, and reports the results. It's
easy to read these by mistake instead of the results we actually want.
It is probably sensible to delete them from the output file before printing
it out, to save paper and to avoid confusion. Finally, if you forget to
recode the dependent variable so it starts from level 0, you will get the
message "Insufficient variation in dependent variable", which
is unlikely to help you realise what has gone wrong.

The same command syntax is used for all LIMDEP's model-fitting commands.
For example, a linear regression would be carried out by

**REGRESS;lhs=***name of DV***;rhs=one,***names of IVs***$**


The same five questions can be addressed as with other regression-type analyses that we have considered previously:

*How well does the model account for the data?* Ordered logit does not produce an *R*^{2}_{adj} statistic. It does produce a chi-squared value for the model, and this could probably be converted to the *LRFC*_{1} statistic recommended by Darlington (1990, chapter 18) for logistic regression, but I have not found a statistical authority for doing this. However the most useful process is probably to examine the **classification table** produced at the end of the analysis, comparing actual group membership with membership predicted on the basis of the model. As well as giving us a measure of goodness of fit (what proportion of cases were correctly predicted?) this may alert us to problems with the analysis - for example if the model does not predict any cases in one or more of the categories.
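
As a rough illustration of how a classification table is read, the sketch below cross-tabulates observed against predicted categories, computes the proportion of cases correctly predicted, and flags any category the model never predicts. The ten cases are invented, the categories are assumed to be coded 0 to 2, and the layout (observed in rows, predicted in columns) is an assumption rather than a description of LIMDEP's printed table.

```python
from collections import Counter

def classification_table(observed, predicted, n_levels):
    """Cross-tabulate observed (rows) against predicted (columns) categories."""
    counts = Counter(zip(observed, predicted))
    return [[counts[(o, p)] for p in range(n_levels)] for o in range(n_levels)]

def proportion_correct(table):
    """Share of cases lying on the diagonal of the classification table."""
    total = sum(sum(row) for row in table)
    correct = sum(table[i][i] for i in range(len(table)))
    return correct / total

# Hypothetical observed and predicted categories for ten cases, coded 0 to 2.
observed  = [0, 0, 1, 1, 1, 2, 2, 2, 2, 1]
predicted = [1, 1, 1, 1, 2, 2, 2, 1, 2, 1]

table = classification_table(observed, predicted, 3)
for row in table:
    print(row)
print("proportion correct:", proportion_correct(table))

# Categories that the model never predicts show up as all-zero columns.
never_predicted = [p for p in range(3) if all(row[p] == 0 for row in table)]
print("never predicted:", never_predicted)
```

In this made-up example six of the ten cases fall on the diagonal, and the first column is empty: the model never predicts the lowest category, which is exactly the kind of problem the table is meant to reveal.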

*Is the overall relationship between the IVs and the DV significant?* This is addressed by the **log likelihoods** for the model. The LIMDEP output will include the log likelihood for the null model, in which the coefficients for all regressors are taken as zero, and also for the fitted model. Twice the difference between these two log likelihoods (fitted minus null) is distributed approximately as chi-squared, with degrees of freedom equal to the number of IVs, and so can be used to test the overall significance of the model. In the same way, we can test the significance of adding a group of regressors to a model, with degrees of freedom equal to the number of regressors added.
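
The arithmetic of this test is simple enough to sketch. The code below takes the two log-likelihood values reported for the null and fitted models, doubles their difference, and looks up the upper-tail chi-squared probability. The log likelihoods and the number of IVs are hypothetical, and the chi-squared tail function is a plain series approximation written for this example rather than a library routine.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-squared with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function,
    which is adequate for the moderate values of x met in this kind of test.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def likelihood_ratio_test(loglik_null, loglik_fitted, n_extra_parameters):
    """Return (chi-squared statistic, p-value) for a pair of nested models."""
    statistic = 2.0 * (loglik_fitted - loglik_null)
    return statistic, chi2_sf(statistic, n_extra_parameters)

# Hypothetical log likelihoods, as might be read off the output for a model with 4 IVs.
statistic, p_value = likelihood_ratio_test(-105.3, -98.1, 4)
print(round(statistic, 2), round(p_value, 4))
```

The same function serves for testing an added group of regressors: pass the log likelihoods of the smaller and larger models, and the number of regressors added.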

*What is the effect of an individual IV on the DV in the presence of all the other IVs?* LIMDEP gives regression coefficients which can be interpreted in the usual way, though note that as in logistic regression they give the effects of a unit increase of the IV on the log odds of the DV taking a higher value, not on the DV itself.

*Is the effect of an individual IV significant in the presence of all the other IVs?* This is tested by a *t* value associated with each IV, exactly as for linear regression. Note that unlike SPSS's **LOGISTIC REGRESSION** command, it is a *t* rather than a chi-squared test statistic that is produced. The mathematics of this means that marginally significant values should be regarded with caution where sample sizes are small.

*What are the relative importances of IVs in predicting the DV value?* As in logistic regression, we cannot calculate **beta-weights** from the regression coefficients, but we can get a measure of the relative importance of different IVs by multiplying each coefficient by the standard deviation of the corresponding independent variable. (LIMDEP does not, unfortunately, do this for you, though it does give the standard deviation of each IV in the regression table.) Note, though, that these products do not have the interpretation they have in linear regression, of being the regression coefficients you would get if you reran the regression after **standardising** the variables: because of the categorical nature of the DV, it would not be meaningful to standardise it.
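
Since LIMDEP leaves this last calculation to the user, a short sketch of it may help. The coefficients and standard deviations below are invented, not taken from any real output, and the variable names are only examples. The sketch also exponentiates each coefficient, which turns the effect on the log odds into an odds ratio.

```python
import math

# Hypothetical coefficients and IV standard deviations, as might be copied
# from the regression table; none of these numbers come from real output.
coefficients = {"age": 0.04, "incomegp": -0.30, "fulltime": 0.85}
std_devs     = {"age": 12.0, "incomegp": 1.5,   "fulltime": 0.45}

# exp(b) is the multiplicative change in the odds of a higher category
# for a one-unit increase in the IV, holding the other IVs constant.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

# b * SD puts the IVs on a common footing: the change in log odds
# associated with a one-standard-deviation increase in each IV.
scaled = {name: coefficients[name] * std_devs[name] for name in coefficients}

# Rank by absolute size, since the sign only gives the direction of the effect.
ranking = sorted(scaled, key=lambda name: abs(scaled[name]), reverse=True)

for name in ranking:
    print(f"{name:10s} odds ratio {odds_ratios[name]:.3f}   b*SD {scaled[name]:+.3f}")
```

In this invented case the IV with the smallest raw coefficient comes out as the most important once its large standard deviation is taken into account, which is why the raw coefficients should not be compared directly.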


If you can cope with the maths, much information is to be found in the books by Agresti (1984, 1990). For further details about LIMDEP, see Greene (1992). For examples of the use, interpretation, and presentation of the results of ordered logit analysis, see Lea, Webley and Levine (1993) and Lea, Webley and Walker (1995).

My knowledge of both ordered logit and LIMDEP is largely owed to help from Dr Nichola Crichton, formerly of the MSOR department. I have a set of notes she wrote for me which are more help than any of the books, and I will lend them to anyone who has to get into this analysis.


- Agresti, A. (1984). *Analysis of ordinal categorical data*. New York: Wiley.
- Agresti, A. (1990). *Categorical data analysis*. New York: Wiley.
- Darlington, R. B. (1990). *Regression and linear models*. New York: McGraw-Hill.
- Greene, W. H. (1992). *LIMDEP user's manual and reference guide*. Bellport, NY: Econometric Software.
- Lea, S. E. G., Webley, P. & Levine, R. M. (1993). The economic psychology of consumer debt. *Journal of Economic Psychology*, 14, 85-119.
- Lea, S. E. G., Webley, P. & Walker, C. M. (1995). Psychological factors in consumer debt: money management, economic socialization, and credit use. *Journal of Economic Psychology*, 16, 681-701.


The file **/singer1/eps/psybin/stats/hefce.txt** contains (hypothetical)
data from a study in which 100 recent graduates rated the teaching in their
departments as "excellent", "satisfactory" or "unsatisfactory".
It could be read into SPSS with the following command file, which is
available as /singer1/eps/psybin/stats/hefce.in on Singer; it is also available
on the fileserver PSYCHO.

    DATA LIST file='/singer1/eps/psybin/stats/hefce.txt'
      / gender 1 dept 3 origin 5 examclas 7 rating 9.
    VALUE LABELS gender 1 'male' 2 'female'
      / dept 1 'chemistry' 2 'law' 3 'history'
      / origin 1 'UK' 2 'EU' 3 'overseas'
      / examclas 1 'first' 2 'two1' 3 'two2' 4 'third'
      / rating 1 'unsatisfactory' 2 'good' 3 'excellent'.
    MISSING VALUES gender dept origin examclas rating (9).
    DO REPEAT x=UKorigin,EUorigin,OSorigin / i=1 to 3.
    COMPUTE x=0.
    IF origin=i x=1.
    IF missing(origin) x=9.
    END REPEAT.
    DO REPEAT x=deptchem,deptlaw,depthist / i=1 to 3.
    COMPUTE x=0.
    IF dept=i x=1.
    IF missing(dept) x=9.
    END REPEAT.
    MISSING VALUES UKorigin,EUorigin,OSorigin,deptlaw,deptchem,depthist (9).

- Write out the corresponding command file for LIMDEP.
- Prepare the LIMDEP version of the command file, using the Singer editor. Use each version to read the data into the package concerned.
- Use linear regression to predict ratings from the other variables in both packages, print your output, and compare their results.
- Use LIMDEP to analyse the same data using ordered logit (*Hint: don't forget to think about the coding of the dependent variable*). Print your output, and compare the results with those you obtained using linear regression.


Stephen Lea

University of Exeter Department of Psychology

Washington Singer Laboratories

Exeter EX4 4QG

United Kingdom

Tel +44 1392 264626

Fax +44 1392 264623
