All these examples use data held
in the Singer file /singer1/eps/psybin/stats/debt.MTW. This is a Minitab
worksheet containing some of the data from a large postal survey on the
psychology of debt. The data in the file are, for each of 464 respondents,
- income group (1=lowest, 5=highest)
- security of housing tenure (1=rent, 2=mortgage, 3=owned outright)
- number of children in household
- is the respondent a single parent?
- age group (1=youngest)
- does the respondent have a bank account?
- does the respondent have a building society account?
- self-rating of money management skill (high values=high skill)
- how often did s/he use credit cards (1=never... 3=regularly)
- does s/he buy cigarettes?
- does s/he buy Christmas presents for children?
- score on a locus of control scale (high values=internal)
- score on a scale of attitudes to debt (high values=favourable to debt)
All yes/no questions are coded 0=no, 1=yes. These are real data (Lea,
Webley & Walker, 1995, Journal
of Economic Psychology, 16, 681-701), though the published paper
also deals with many other variables. Locus of control is a personality
measure introduced by Rotter, which claims to differentiate people according
to how much they feel things that happen to them are as a result of processes
within themselves (internal locus of control) or outside events (external
locus of control).
- Get the data into Minitab. Use INFO to find out what columns are in
use; use PRINT on some of these columns to see how Minitab reports missing
values; and use DESCRIBE on these columns to see what Minitab does when
there are values missing in data on which it is doing calculations.
- Store this worksheet into your own filespace. Use the command SYSTEM
ls to check that you have stored the worksheet correctly (note that ls
is a Unix command, so it must be in lower case).
- Use simple t-tests to find out whether there are significant differences
in debt attitudes between (a) smokers and non-smokers, and (b) those with
and without bank accounts. Repeat these tests for locus of control.
- Use BREG to find what combination of all the other variables in the
list above gives the best explanation of variations in attitude to debt.
- Use REGRESS to find out which of those variables are significantly
associated with attitude, and to discover what the nature of the associations
is. You may find that the adjusted R-squared value reported
by REGRESS is not the same as the one you obtained from BREG; can you see
why?
- Get a printout of the full results of your best regression model.
- Use BREG to find out what combination of variables gives the most efficient
explanation of variations in attitude to debt.
- (Optional). Use STEPWISE to answer the previous question in a different
way, and see whether you get the same results as you did before. HELP STEPWISE
will tell you more about how STEPWISE works.
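The t-test exercise above compares a score between two groups defined by a 0/1 coded variable. As a rough illustration of the arithmetic involved, the following Python sketch (not Minitab code) computes an unpooled (Welch) two-sample t statistic and its approximate degrees of freedom, dropping any case with a missing value. The function name welch_t, the use of None for a missing value, and all the numbers are assumptions made for illustration; none of them come from the debt worksheet, and Minitab's own output will be laid out differently.

```python
import math
from statistics import mean, variance


def welch_t(scores, group):
    """Unpooled (Welch) two-sample t statistic for scores split by a 0/1 code.

    Cases with a missing value (None) in either list are dropped.
    Returns (t, df), where df is the Welch-Satterthwaite approximation.
    """
    pairs = [(s, g) for s, g in zip(scores, group)
             if s is not None and g is not None]
    a = [s for s, g in pairs if g == 0]   # e.g. the "no" group
    b = [s for s, g in pairs if g == 1]   # e.g. the "yes" group
    va = variance(a) / len(a)             # squared standard error, group 0
    vb = variance(b) / len(b)             # squared standard error, group 1
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up illustrative values, NOT taken from the debt worksheet.
attitude = [10, 12, None, 14, 20, 22, 24, 18]
smoker = [0, 0, 0, 0, 1, 1, 1, None]
t, df = welch_t(attitude, smoker)
print(round(t, 3), round(df, 1))
```

A negative t here simply means that the group coded 0 has the lower mean. Turning t and df into a significance level needs the t distribution, which the sketch leaves out.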
Sample of BREG output
This sample shows how BREG would be used to look for the best model
to fit the teenage gambling data used in the introductory multiple regression
examples. It assumes we have already read in the data and named the columns
appropriately.
MTB > BREG C6 C2-C5
Best Subsets Regression of gambling

                                           p  v
                                           o  e
                                        s  c  r
                                        t  m  b
                                     m  a  o  i
                                     0  t  n  n
              Adj.                   f  u  e  t
Vars   R-sq   R-sq    C-p        s   1  s  y  l
   1   38.7   37.3   11.4   24.948         X
   1   16.6   14.8   31.0   29.094   X
   2   50.1   47.9    3.2   22.754   X     X
   2   40.3   37.6   12.0   24.904      X  X
   3   52.6   49.3    3.0   22.434   X     X  X
   3   50.6   47.1    4.9   22.915   X  X  X
   4   52.7   48.2    5.0   22.690   X  X  X  X
Note the following:
- In calling BREG, the dependent variable comes first, then the full
set of possible independent variables.
- BREG reports the R-squared and adjusted R-squared values (the R-sq and
Adj. R-sq columns) for the best and second best model for each number of regressors.
The first column gives the number of variables included in each model.
Why does BREG only report on one 4-regressor model?
- You can ignore the columns labelled C-p and s.
- BREG lists all the possible independent variables, by name, spelt vertically.
This can be quite difficult to spot, and it is still difficult to read
even when you have spotted it.
- It then puts an X in the column corresponding to each independent variable
that is included in the model being reported in a given row. So in the
example above, the best 1-variable model includes the regressor 'pocmoney'.
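To make the idea of a best subsets search concrete, here is a minimal Python sketch (not Minitab code) that fits every possible subset of regressors by ordinary least squares and keeps the two with the highest R-squared for each subset size. It also computes adjusted R-squared from the usual formula 1 - (1 - R-squared)(n - 1)/(n - p - 1), where n is the number of cases and p the number of regressors. This is a brute-force illustration only: the way BREG actually searches may well differ, and the function names, variable names and data below are all invented for the example.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, columns):
    """R-squared of a least squares fit of y on the given columns plus a constant."""
    rows = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def best_subsets(y, predictors, keep=2):
    """For each subset size, list up to `keep` models with the highest R-squared."""
    n = len(y)
    report = []
    for size in range(1, len(predictors) + 1):
        fits = []
        for names in combinations(sorted(predictors), size):
            r2 = r_squared(y, [predictors[name] for name in names])
            adj = 1.0 - (1.0 - r2) * (n - 1) / (n - size - 1)
            fits.append((size, r2, adj, names))
        fits.sort(key=lambda f: f[1], reverse=True)
        report.extend(fits[:keep])
    return report


# Made-up illustrative data, NOT the teenage gambling data.
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [3, 1, 4, 1, 5, 9],
    "x3": [1, 0, 1, 0, 1, 0],
}
report = best_subsets(y, predictors)
for size, r2, adj, names in report:
    print(size, round(100 * r2, 1), round(100 * adj, 1), " ".join(names))
```

Each printed row corresponds to one row of a BREG-style table: the number of regressors, R-squared and adjusted R-squared as percentages, and the names of the regressors included (where BREG would print an X under each name).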
Stephen Lea
University of Exeter
Department of Psychology
Washington Singer Laboratories
Exeter EX4 4QG
United Kingdom
Tel +44 1392 264626
Fax +44 1392 264623
Document revised 10th January 1997