Using Minitab to do simple regression: Procedures and examples
Scope of this example sheet
In this set of practical work, you will first revisit some basic Minitab
skills, including how to work out the basic descriptive statistics (mean,
standard deviation, etc.) for variables, and how to use the TWOT command to
carry out a t-test for a significant difference between two independent
groups. Then you will learn how to use Minitab to calculate the regression
equation between one independent variable and one dependent variable. This
involves the following:
- Logging in to singer using one of the Mac computers. You should already
know how to do this; for more information see the handbook kept by each
computer in the undergraduate laboratory.
- Gaining access to the Minitab package. Again, you should already know
how to do this. In any case, it is extremely simple (you just type MINITAB
in response to a Unix prompt).
- Using the following Minitab commands: SET, PRINT, NAME, INFO, HELP,
LET, DESCRIBE, PLOT, TWOT, TTEST, REGRESS, and STOP. Of these, the ones
that actually do the work are SET, DESCRIBE, TWOT, LET, TTEST, REGRESS
and STOP. But to make it easy to understand the results you get, NAME is
virtually essential; and PRINT, INFO and HELP are valuable aids to doing
anything with Minitab. NB: in these notes, Minitab commands are typed in
CAPITAL LETTERS so you can recognise them. But you can put them into the
computer in lower case, or in a mixture, if you prefer. Note that with
the exception of REGRESS you should already have met all these commands.
Functions of the Minitab commands
Students often find it difficult to remember which Minitab command does
what. This list tries to help you. But don't rely on this list,
and don't rely on your memory--you don't need to. If you can only
remember that there is a command called (say) SET, then you can always
type HELP SET and you will get some information about what SET does. It
is more useful to learn how to interpret the information from HELP than
to try to memorize the individual commands. However, HELP is much more
effective for reminding you how to do something you have already tried
than telling you about a command you have never used before. So it is important
to try out the examples.
- HELP gives you information about Minitab commands
- SET enters data into Minitab
- NAME lets you give a column of data a name (e.g. 'Age', 'IQ', 'RT',
'Income')
- PRINT shows you the contents of one or more columns of data on the
screen
- INFO gives you a list of the columns of data you have in use, their
names, and the amount of data in each.
- LET does arithmetic on whole columns (e.g. works out differences between
two columns), or on individual entries within columns (e.g. corrects a
data point).
- TTEST does a related-samples (i.e. dependent, matched) t-test
(usually you have to use LET beforehand to work out the differences).
- TWOT does an independent-samples t-test.
- DESCRIBE puts out on the screen the mean, median, standard deviation,
minimum, maximum, etc, of the data in one or more columns.
- PLOT puts a simple graph of the relationship between two variables on
the screen.
- REGRESS works out a simple or multiple regression equation, and does
appropriate significance tests.
- STOP leaves Minitab.
You should already know how to use almost all these commands. But, remember:
the information is all there in HELP. Make sure you know how to get it
out.
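If you want to see why TTEST usually needs LET beforehand, the following is a minimal sketch, written in Python rather than Minitab and not part of the course software. It assumes the related-samples t-test is carried out as a one-sample t-test on the column of pairwise differences. The function name paired_t and the before/after scores are invented for illustration.

```python
import math

def paired_t(first, second):
    """Related-samples t: a one-sample t-test on the pairwise differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # This is the column that LET would create (e.g. first minus second).
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (divisor n - 1).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    std_error = math.sqrt(var_diff / n)
    # t tests whether the mean difference departs from zero; df = n - 1.
    return mean_diff / std_error, n - 1

# Hypothetical scores, invented purely for illustration.
before = [12, 15, 11, 18, 14]
after = [10, 14, 12, 15, 11]
t, df = paired_t(before, after)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The point of the sketch is that the test only ever looks at the differences, which is why they have to be worked out first.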
The REGRESS command
The syntax for the REGRESS command is very simple. If the dependent
variable (y) is held in C1, and the independent variable (x)
in C2, we type
REGRESS C1 1 C2
at a Minitab prompt. Note the figure 1 after C1: this tells Minitab
that this is a simple regression, with only one independent variable. When
we come to do multiple regressions, we will replace the number 1 with the
number of independent variables involved.
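To make clear what a simple regression produces, here is a minimal sketch of the usual least-squares calculation of the intercept (a) and slope (b) in y = a + bx. It is written in Python as an independent cross-check and is not a description of Minitab's internals. The function name simple_regression and the five data points are invented for illustration.

```python
def simple_regression(x, y):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through the means
    return a, b

# Hypothetical data: x plays the role of C2 (independent variable),
# y plays the role of C1 (dependent variable).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = simple_regression(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Note that the dependent variable comes first in the REGRESS command but is the second argument in this sketch; the order is a choice made here, not a convention.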
Examples
NB Unless you are told otherwise, throughout this course all data used
in examples are entirely imaginary and should not be taken to represent
real psychological trends.
- Work out the mean of the following numbers: 34, 278, 132, 87, 432,
276.
- The percentage scores on a statistics test were as follows:
Men: 100, 98, 43, 65, 97, 12, 55
Women: 55, 63, 98, 43, 42, 88, 72, 66, 60, 87, 39, 100.
Put all these data into a single Minitab column called 'pcent'. Put a gender
marker (1 for men, 2 for women) in a second column called 'gender'. Use
TWOT (a) to give you the mean score and standard deviation for each gender
and (b) to tell you the value of t which you can use to see whether
the difference of mean scores is statistically significant.
- The following are the IQ scores on the Verbal and Numerical scales
of a certain test for a group of students:
Verbal: 98 120 85 97 100 132 124 88 91 144
Numerical: 92 105 100 92 93 144 143 75 85 121
Calculate the mean and standard deviation of the scores on each scale.
Use LET to work out the difference between them and put it in a new column.
Use TTEST on this column to see whether there is a significant difference
between the verbal and numerical scores.
- Using the data from the previous example, work out the regression line
for predicting Numerical scores (dependent variable) from Verbal scores
(independent variable).
- A social psychologist observes the scores achieved on a video game
in a pub, by the first new (previously unobserved) player to use the machine
after each half hour through the evening. They are as follows:
Time: 6pm 6.30 7pm 7.30 8pm 8.30 9pm 9.30 10pm 10.30
Score: 1760 995 2130 770 1535 3975 2120 5660 3341 4995
Do the data support the psychologist's hypothesis that more expert players
use the machine later in the evening? What would be the most likely score
to observe at 9.45pm?
- The following data show the levels of anxiety recorded by a paper-and-pencil
test just before a group of students took an examination, together with
the exam marks obtained. Use PLOT to decide whether it would be appropriate
to use linear regression to summarize these data.
Anxiety score: 5 17 10 12 3 19 2 11 9 8 13 18 4 7
Exam mark: 45 20 55 72 45 39 50 75 60 57 58 52 43 57
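A practical point about the video game example: clock times written as 6.30 or 9.45 are not decimal numbers, so they need coding before they go into a column. One option, assumed here rather than stated in the example, is decimal hours, so that 6.30 becomes 6.5 and 9.45pm becomes 9.75. The sketch below, in Python and with the hypothetical helper name clock_to_hours, shows the conversion only.

```python
def clock_to_hours(label):
    """Convert a label such as '6pm' or '9.30' to decimal hours (9.30 -> 9.5)."""
    text = label.lower().replace("pm", "").strip()
    # Split '9.30' into hours '9' and minutes '30'; '6' has no minutes part.
    hours, _, minutes = text.partition(".")
    return int(hours) + (int(minutes) / 60 if minutes else 0.0)

# The observation times from the video game example, as written there.
labels = ["6pm", "6.30", "7pm", "7.30", "8pm", "8.30",
          "9pm", "9.30", "10pm", "10.30"]
hours = [clock_to_hours(label) for label in labels]
print(hours)
print(clock_to_hours("9.45pm"))
```

Typing 6.30 straight into a column would treat it as 6.3 hours, which distorts the spacing of the observations and changes the regression line.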
Answers for checking
These are not complete answers, but figures you can use to check whether
you have done the examples correctly. If you don't get these numerical
values, something is wrong:
- Example 1: mean 206.50
- Example 2: men: mean 67.1, s.d. 33.4; women: mean 67.7, s.d. 21.5;
t(17) = 0.05 (if you get a different number of degrees of freedom, you
forgot to use the POOLED subcommand on the TWOT command)
- Example 3: verbal: mean 107.9, s.d. 20.5; numerical: mean 105.0,
s.d. 23.6; t(9) = 0.66
- Example 4: a = 3.7, b = 0.939
- Example 5: a = -4240, b = 845, predicted score at 9.45pm = 3995 (with
time coded in decimal hours, so that 6.30 is 6.5 and 9.45pm is 9.75)
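The check figures for the men's and women's test scores can be reproduced outside Minitab. The following is a minimal sketch in Python, assuming the standard pooled-variance form of the independent-samples t-test, in which the degrees of freedom are n1 + n2 - 2. The function name pooled_t is invented for illustration; the data are those given in the example.

```python
import math

def pooled_t(group1, group2):
    """Independent-samples t with a pooled variance; df = n1 + n2 - 2."""
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    # Sums of squared deviations within each group.
    ss1 = sum((v - mean1) ** 2 for v in group1)
    ss2 = sum((v - mean2) ** 2 for v in group2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df

# Percentage scores from the statistics test example.
men = [100, 98, 43, 65, 97, 12, 55]
women = [55, 63, 98, 43, 42, 88, 72, 66, 60, 87, 39, 100]
t, df = pooled_t(men, women)
# The sign of t depends only on which group is listed first,
# so the size of t is printed without it.
print(f"t({df}) = {abs(t):.2f}")
```

With 7 men and 12 women the pooled test has 17 degrees of freedom, which is the figure to look for in the TWOT output.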
Stephen Lea
University of Exeter
Department of Psychology
Washington Singer Laboratories
Exeter EX4 4QG
United Kingdom
Tel +44 1392 264626
Fax +44 1392 264623
Document revised 1st February 1997