# Using Minitab to do simple regression: Procedures and examples

### Scope of this example sheet

In this set of practical work, you will first revisit some basic Minitab skills, including how to work out the basic descriptive statistics (mean, standard deviation, etc.) for variables, and how to use the TWOT command to carry out a t-test for a significant difference between two independent groups. Then you will learn how to use Minitab to calculate the regression equation relating one dependent variable to one independent variable. This involves the following:

• Logging in to singer using one of the Mac computers. You should already know how to do this; for more information see the handbook kept by each computer in the undergraduate laboratory.
• Gaining access to the Minitab package. Again, you should already know how to do this. In any case, it is extremely simple (you just type MINITAB in response to a Unix prompt).
• Using the following Minitab commands: SET, PRINT, NAME, INFO, HELP, LET, DESCRIBE, PLOT, TWOT, TTEST, REGRESS, and STOP. Of these, the ones that actually do the work are SET, DESCRIBE, TWOT, LET, TTEST, REGRESS and STOP. But to make it easy to understand the results you get, NAME is virtually essential; and PRINT, INFO and HELP are valuable aids to doing anything with Minitab. NB: in these notes, Minitab commands are typed in CAPITAL LETTERS so you can recognise them. But you can put them into the computer in lower case, or in a mixture, if you prefer. Note that with the exception of REGRESS you should already have met all these commands.

### Functions of the Minitab commands

Students often find it difficult to remember which Minitab command does what. This list tries to help you. But don't rely on this list, and don't rely on your memory: you don't need to. If you can only remember that there is a command called (say) SET, then you can always type HELP SET and you will get some information about what SET does. It is more useful to learn how to interpret the information from HELP than to try to memorize the individual commands. However, HELP is much more effective for reminding you how to do something you have already tried than for telling you about a command you have never used before. So it is important to try out the examples.

• HELP gives you information about Minitab commands.
• SET enters data into Minitab.
• NAME lets you give a column of data a name (e.g. 'Age', 'IQ', 'RT', 'Income').
• PRINT shows you the contents of one or more columns of data on the screen.
• INFO gives you a list of the columns of data you have in use, their names, and the amount of data in each.
• LET does arithmetic on whole columns (e.g. works out differences between two columns), or on individual entries within columns (e.g. corrects a data point).
• TTEST does a related-samples (i.e. dependent, matched) t-test (usually you have to use LET beforehand to work out the differences).
• TWOT does an independent-samples t-test.
• DESCRIBE puts out on the screen the mean, median, standard deviation, minimum, maximum, etc., of the data in one or more columns.
• PLOT draws a simple graph of the relationship between two variables.
• REGRESS works out a simple or multiple regression equation, and does appropriate significance tests.
• STOP leaves Minitab.

You should already know how to use almost all these commands. But, remember: the information is all there in HELP. Make sure you know how to get it out.
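
As a reminder of how the data-handling commands fit together, a short session might look like the sketch below. It uses the six numbers from the first example in this sheet; the column name 'nums' is only an illustrative choice, and the layout of the output will depend on your Minitab release. The commands are shown without the Minitab prompts, and lines beginning with # are comments that Minitab ignores.

```
# Enter six numbers into column C1 (END finishes data entry)
SET C1
34 278 132 87 432 276
END
# Give the column a name, then check what is in the worksheet
NAME C1 'nums'
PRINT C1
INFO
# Mean, median, standard deviation, minimum, maximum, etc.
DESCRIBE C1
# Remind yourself what a command does
HELP DESCRIBE
```

If the session has worked, the mean reported by DESCRIBE should match the check value given for the first example.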

### The REGRESS command

The syntax for the REGRESS command is very simple. If the dependent variable (y) is held in C1, and the independent variable (x) in C2, we type

REGRESS C1 1 C2

at a Minitab prompt. Note the figure 1 after C1: this tells Minitab that this is a simple regression, with only one independent variable. When we come to do multiple regressions, we will replace the number 1 with the number of independent variables involved, and list a column for each of them.
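
For instance, to predict Numerical scores from Verbal scores using the IQ data given in the examples in this sheet, a session along the following lines should work. This is a sketch rather than a prescribed method: the column names 'numer' and 'verbal' are illustrative assumptions, and the prompts are omitted. The important point is that the two columns must be entered in the same order, so that each row holds one student's pair of scores.

```
# Dependent variable (y): Numerical scores, in C1
SET C1
92 105 100 92 93 144 143 75 85 121
END
# Independent variable (x): Verbal scores, in C2, in the same student order
SET C2
98 120 85 97 100 132 124 88 91 144
END
NAME C1 'numer' C2 'verbal'
# Simple regression of C1 on one predictor, C2
REGRESS C1 1 C2
```

The constant and slope in the output should agree with the check values given for this regression in the answers (a = 3.7, b = 0.939).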

### Examples

NB Unless you are told otherwise, throughout this course all data used in examples are entirely imaginary and should not be taken to represent real psychological trends.

1. Work out the mean of the following numbers: 34, 278, 132, 87, 432, 276.
2. (a) The percentage scores on a statistics test were as follows:
   Men: 100, 98, 43, 65, 97, 12, 55
   Women: 55, 63, 98, 43, 42, 88, 72, 66, 60, 87, 39, 100
   Put all these data into a single Minitab column called 'pcent'. Put a gender marker (1 for men, 2 for women) in a second column called 'gender'. Use TWOT to give you (i) the mean score and standard deviation for each gender and (ii) the value of t, which you can use to see whether the difference between the mean scores is statistically significant.
   (b) The following are the IQ scores on the Verbal and Numerical scales of a certain test for a group of students:
   Verbal: 98 120 85 97 100 132 124 88 91 144
   Numerical: 92 105 100 92 93 144 143 75 85 121
   Calculate the mean and standard deviation of the scores on each scale. Use LET to work out the difference between them and put it in a new column. Use TTEST on this column to see whether there is a significant difference between the verbal and numerical scores.
3. Using the data from example 2(b), work out the regression line for predicting Numerical scores (dependent variable) from Verbal scores (independent variable).
4. A social psychologist observes the scores achieved on a video game in a pub by the first new (previously unobserved) player to use the machine after each half hour through the evening. They are as follows:
   Time: 6pm 6.30 7pm 7.30 8pm 8.30 9pm 9.30 10pm 10.30
   Score: 1760 995 2130 770 1535 3975 2120 5660 3341 4995
   Do the data support the psychologist's hypothesis that more expert players use the machine later in the evening? What would be the most likely score to observe at 9.45pm?
5. The following data show the levels of anxiety recorded by a paper-and-pencil test just before a group of students took an examination, together with the exam marks obtained. Use PLOT to decide whether it would be appropriate to use linear regression to summarize these data.
   Anxiety score: 5 17 10 12 3 19 2 11 9 8 13 18 4 7
   Exam mark: 45 20 55 72 45 39 50 75 60 57 58 52 43 57
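
The two t-test examples differ in how the data have to be laid out. TWOT expects all the scores stacked in one column with a group code in a second column, whereas the related-samples test needs one column per measure plus a column of differences. A minimal sketch of both layouts follows, assuming the column numbers and names shown (C5 is simply taken to be a free column for the differences); prompts are omitted and lines beginning with # are comments.

```
# Independent groups: the 7 men's scores, then the 12 women's scores, in C1
SET C1
100 98 43 65 97 12 55
55 63 98 43 42 88 72 66 60 87 39 100
END
# Gender marker in C2, in the same order: 1 = men, 2 = women
SET C2
1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2 2 2
END
NAME C1 'pcent' C2 'gender'
# POOLED asks for the equal-variance test, with n1 + n2 - 2 degrees of freedom
TWOT C1 C2;
  POOLED.

# Related samples: Verbal in C3, Numerical in C4, differences in C5
SET C3
98 120 85 97 100 132 124 88 91 144
END
SET C4
92 105 100 92 93 144 143 75 85 121
END
NAME C3 'verbal' C4 'numer' C5 'diff'
DESCRIBE C3 C4
LET C5 = C3 - C4
# Test whether the mean difference is zero
TTEST 0 C5
```

Note the semicolon at the end of the TWOT line and the full stop after POOLED: the semicolon tells Minitab that a subcommand follows, and the full stop ends the command.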

These are not complete answers, but figures you can use to check whether you have done the examples correctly. If you don't get these numerical values, something is wrong:

1. Mean = 206.50
2. (a) Men: mean 67.1, s.d. 33.4; women: mean 67.7, s.d. 21.5; t(17) = 0.05 in absolute value (if you get a different number of degrees of freedom, you forgot to use the POOLED subcommand on the TWOT command)
   (b) Verbal: mean 107.9, s.d. 20.5; numerical: mean 105.0, s.d. 23.6; t(9) = 0.66
3. Intercept a = 3.7, slope b = 0.939
4. Intercept a = -4240, slope b = 845; predicted score at 9.45pm = 3995
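
One practical point in the video game example is that clock times must be entered as numbers on an even scale, so 6.30 becomes 6.5 and the prediction time of 9.45pm becomes 9.75. The sketch below assumes that coding (hours after noon, as decimals), under which the check values above for a and b apply. The column names are illustrative, and two details are assumptions about your Minitab release that HELP PLOT and HELP REGRESS should confirm: that PLOT takes the y column followed by the x column, and that REGRESS accepts a PREDICT subcommand.

```
# Time of observation in hours after noon: 6.30 is entered as 6.5, and so on
SET C1
6 6.5 7 7.5 8 8.5 9 9.5 10 10.5
END
SET C2
1760 995 2130 770 1535 3975 2120 5660 3341 4995
END
NAME C1 'time' C2 'score'
# Look at the relationship first (y column, then x column)
PLOT C2 C1
# Regress score (y) on time (x) and ask for the fitted score at 9.45pm
REGRESS C2 1 C1;
  PREDICT 9.75.
```

If PREDICT is not available, the same figure can be worked out by hand from the regression equation as a + b × 9.75, using the unrounded coefficients from the REGRESS output.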

Stephen Lea

University of Exeter
Department of Psychology
Washington Singer Laboratories
Exeter EX4 4QG
United Kingdom
Tel +44 1392 264626
Fax +44 1392 264623