University of Exeter

DEPARTMENT OF PSYCHOLOGY


PSY2005 Statistics and Research Methods: Quantitative data analysis component

Using Minitab to do simple regression:
Procedures and examples


Scope of this example sheet

In this set of practical work, you will first revisit some basic Minitab skills, including how to work out the basic descriptive statistics (mean, standard deviation etc) for variables, and using the TWOT command to carry out a t-test for significant differences between two independent groups. Then you will learn how to use Minitab to calculate the regression equation between one independent variable and one dependent variable. This involves you in the following:

Functions of the Minitab commands

Students often find it difficult to remember which Minitab command does what. This list tries to help you. But don't rely on this list, and don't rely on your memory; you don't need to. If you can only remember that there is a command called (say) SET, then you can always type HELP SET and you will get some information about what SET does. It is more useful to learn how to interpret the information from HELP than to try to memorize the individual commands. However, HELP is much more effective for reminding you how to do something you have already tried than for telling you about a command you have never used before. So it is important to try out the examples.

You should already know how to use almost all these commands. But, remember: the information is all there in HELP. Make sure you know how to get it out.

The REGRESS command

The syntax for the REGRESS command is very simple. If the dependent variable (y) is held in C1, and the independent variable (x) in C2, we type

REGRESS C1 1 C2

at a Minitab prompt. Note the figure 1 after C1: this tells Minitab that this is a simple regression, with only one independent variable. When we come to do multiple regressions, we will replace the number 1 with the number of independent variables involved.
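As a cross-check on what REGRESS reports for a simple regression, the following minimal sketch computes the least-squares intercept a and slope b of the line y = a + bx directly. It is written in Python rather than Minitab, and it is only an illustration: the function name simple_regression and the five data points are made up for this sketch and are not part of the course materials.

```python
def simple_regression(x, y):
    """Return (a, b), the intercept and slope of the least-squares line y = a + b*x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("all x values are identical, so no slope can be estimated")
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b


# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]   # independent variable (would be C2 in the REGRESS example)
y = [2, 4, 5, 4, 5]   # dependent variable (would be C1 in the REGRESS example)

a, b = simple_regression(x, y)
print(f"y = {a:.2f} + {b:.2f} x")   # prints: y = 2.20 + 0.60 x
```

The argument order here is (x, y), whereas REGRESS takes the dependent variable first; keep that difference in mind when comparing the two.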

Examples

NB Unless you are told otherwise, throughout this course all data used in examples are entirely imaginary and should not be taken to represent real psychological trends.

  1. Work out the mean of the following numbers: 34, 278, 132, 87, 432, 276.
  2. The percentage scores on a statistics test were as follows:
    Men: 100, 98, 43, 65, 97, 12, 55
    Women: 55, 63, 98, 43, 42, 88, 72, 66, 60, 87, 39, 100
    Put all these data into a single Minitab column called 'pcent'. Put a gender marker (1 for men, 2 for women) in a second column called 'gender'. Use TWOT (a) to give you the mean score and standard deviation for each gender and (b) to tell you the value of t, which you can use to see whether the difference between the mean scores is statistically significant.
  3. The following are the IQ scores on the Verbal and Numerical scales of a certain test for a group of students:
    Verbal: 98 120 85 97 100 132 124 88 91 144
    Numerical: 92 105 100 92 93 144 143 75 85 121
    Calculate the mean and standard deviation of the scores on each scale. Use LET to work out the difference between the two scores for each student and put it in a new column. Use TTEST on this column to see whether there is a significant difference between the verbal and numerical scores.
  4. Using the data from the previous example, work out the regression line for predicting Numerical scores (dependent variable) from Verbal scores (independent variable).
  5. A social psychologist observes the scores achieved on a video game in a pub, by the first new (previously unobserved) player to use the machine after each half hour through the evening. They are as follows:
    Time: 6pm 6.30 7pm 7.30 8pm 8.30 9pm 9.30 10pm 10.30
    Score: 1760 995 2130 770 1535 3975 2120 5660 3341 4995
    Do the data support the psychologist's hypothesis that more expert players use the machine later in the evening? What would be the most likely score to observe at 9.45pm?
  6. The following data show the levels of anxiety recorded by a paper-and-pencil test just before a group of students took an examination, together with the exam marks obtained. Use PLOT to decide whether it would be appropriate to use linear regression to summarize these data.
    Anxiety score: 5 17 10 12 3 19 2 11 9 8 13 18 4 7
    Exam mark: 45 20 55 72 45 39 50 75 60 57 58 52 43 57

Answers for checking

These are not complete answers, but figures you can use to check whether you have done the examples correctly. If you don't get these numerical values, something is wrong:

  1. Mean = 206.50
  2. Men: mean 67.1, s.d. 33.4; women: mean 67.7, s.d. 21.5; t(17) = 0.05 (if you get a different number of degrees of freedom, you forgot to use the POOLED subcommand on the TWOT command)
  3. Verbal: mean 107.9, s.d. 20.5; numerical: mean 105.0, s.d. 23.6; t(9) = 0.66
  4. a = 3.7, b = 0.939
  5. a = -4240, b = 845, predicted score at 9.45pm = 3995 (with time entered in decimal hours, so 6.30 is 6.5 and 9.45pm is 9.75)
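The degrees of freedom quoted for the comparison of men's and women's test scores come from pooling the two variances: with group sizes n1 and n2, the pooled test has n1 + n2 - 2 degrees of freedom. The sketch below reproduces that check outside Minitab, in Python. It uses the statistics-test scores from the examples; the function name pooled_t is made up for this illustration, and the code is a hand calculation of the pooled-variance formula, not a description of how TWOT works internally.

```python
from math import sqrt


def pooled_t(group1, group2):
    """Return (t, df) for an independent-groups t-test using a pooled variance."""
    n1, n2 = len(group1), len(group2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 scores")
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    # Sums of squared deviations within each group
    ss1 = sum((v - mean1) ** 2 for v in group1)
    ss2 = sum((v - mean2) ** 2 for v in group2)
    df = n1 + n2 - 2                       # pooled degrees of freedom
    pooled_var = (ss1 + ss2) / df          # pooled estimate of the common variance
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("no variation in the scores, so t is undefined")
    return (mean1 - mean2) / se, df


# Percentage scores from the statistics-test example
men = [100, 98, 43, 65, 97, 12, 55]
women = [55, 63, 98, 43, 42, 88, 72, 66, 60, 87, 39, 100]

t, df = pooled_t(men, women)
print(f"t({df}) = {t:.2f}")   # prints: t(17) = -0.05
```

The sign of t depends only on which group mean is subtracted from which; its size, 0.05 on 17 degrees of freedom, is the figure given in the answers.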


Stephen Lea

University of Exeter
Department of Psychology
Washington Singer Laboratories
Exeter EX4 4QG
United Kingdom
Tel +44 1392 264626
Fax +44 1392 264623


Document revised 1st February 1997