University of Exeter

DEPARTMENT OF PSYCHOLOGY


PSY2005 Statistics and Research Methods: Quantitative data analysis component

Minitab and multiple regression: Introduction



These notes are designed to help you remember what the introductory lecture to the multiple regression part of the course was about. They are not explanations! For those, you will have to listen to the lecture and/or do some reading. In particular, the terms printed in bold type are all things which you should understand by the end of the course. Many of them you will already know; some will be explained in the course of this lecture. In some cases we will explain them later in the course.

The first multiple regression lecture has three aims:

  1. To remind you to get yourselves ready to use the department's network server computer, singer, to access the statistics package called Minitab, and use it for simple statistical operations.
  2. To explain what the statistical procedure called Multiple Regression is, how it relates to other procedures, and what its uses are.
  3. To warn you of some rules that must be obeyed if multiple regression is to give meaningful results.

1. Using Minitab on singer.

You have already been taught how to do this. Before next week's class, take a few minutes to remind yourself how to do it. Remember that you must have a network password and user name valid for the current academic year. If for some reason you have not, you must go to the computer centre in the Laver building, taking your Guild card, and get a new password. Then you must log in to singer using the password issued to you, and change it to something more sensible and memorable.
At next week's class, I shall assume that you can:

Even if you think you can do all these things, please try them again before next week's class, to make sure.

2. What is multiple regression, where does it fit in, and what is it good for?

Multiple regression is the simplest of the large family of multivariate statistical techniques. That means it deals with several variables at the same time. Other multivariate techniques used in psychology include factor analysis, item analysis, multivariate analysis of variance (manova), discriminant analysis, path analysis, cluster analysis, and multidimensional scaling. Multiple regression is a manifest variables technique (i.e. it says things about the variables you actually measured), not a latent variables technique (these use hypothetical underlying quantities to account for the observed data).
Mathematically, multiple regression is a straightforward generalisation of simple regression, the process of fitting the best straight line through the dots on an x-y plot or scattergram. We will discuss what "best" means in this context in the next lecture.
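As a rough illustration of this generalisation, the Python sketch below fits first a line and then a plane to a small set of made-up numbers. It assumes that "best" means least squares (the smallest sum of squared vertical distances), a definition these notes leave to the next lecture. The data and the function names solve and least_squares are invented for this example; they are not part of Minitab or of the course materials.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    # Augmented matrix, copied so that the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(y, *predictors):
    """Return [intercept, slope_1, ..., slope_k] minimising squared error."""
    # One column of ones for the intercept, then one column per predictor.
    columns = [[1.0] * len(y)] + [[float(v) for v in p] for p in predictors]
    # Normal equations: (X'X) coefficients = X'y.
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns]
           for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)


# Made-up illustrative data, constructed so that y = 1 + 2*x1 - 1*x2 exactly.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1, 4, 3, 6, 5]

line = least_squares(y, x1)        # simple regression:   y = a + b*x1
plane = least_squares(y, x1, x2)   # multiple regression: y = a + b1*x1 + b2*x2
print("line: ", line)
print("plane:", plane)
```

With these numbers the fitted line is y = 0.8 + 1.0*x1, whereas the fitted plane recovers y = 1 + 2*x1 - 1*x2. The same function does both jobs; the only difference is the number of predictor columns, which is the sense in which multiple regression generalises simple regression.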
Regression (simple and multiple) techniques are closely related to the analysis of variance (anova) which you studied last term. Both regression and anova are special cases of a single underlying mathematical model. You can combine the two, when what you have is an analysis of covariance (ancova), which we will introduce briefly later this term.
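One facet of this relationship can be seen in a small Python sketch. It assumes a two-group design in which group membership is coded as 0 or 1 (a "dummy" variable, a term not used in these notes), and the scores are invented for the example. Regressing the scores on the dummy variable then reproduces the two group means, which are the quantities an anova on the same two groups would compare.

```python
def simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up scores for two hypothetical groups.
control = [4, 5, 6]     # mean 5
treated = [7, 9, 11]    # mean 9

# Code group membership as a 0/1 predictor and pool the scores.
group = [0] * len(control) + [1] * len(treated)
score = control + treated

intercept, slope = simple_regression(group, score)
print("intercept (mean of group coded 0):", intercept)
print("slope (difference between group means):", slope)
```

The intercept comes out as the mean of the group coded 0, and the slope as the difference between the two group means, so testing whether the slope differs from zero asks the same question as testing whether the group means differ.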
Two main points distinguish multiple regression from these other techniques:

This means that multiple regression is useful in the following general class of situations. We observe one dependent variable, whose variation we want to explain in terms of a number of other independent variables, which we can also observe. These other variables are not under experimental control; we just have to accept the variations in them that happen to occur in the sample of people or situations we can observe. We want to know which, if any, of these independent variables is significantly correlated with the dependent variable, taking into account the various correlations that may exist between the independent variables.
So typically we use multiple regression to analyse data that come from "natural" rather than experimental situations. This makes it very useful in social psychology, and social science generally, and also in biological field work. Note, however, that it is inherently a correlational technique; it cannot of itself tell us anything about the causalities that may underlie the relationships it describes. Also, as with all statistical inference, the data need to be a random sample from some specified population; the technique will allow us to draw inferences from our sample to that population, but not to any other.
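To see why the correlations between independent variables have to be taken into account, consider the Python sketch below. It uses the first-order partial correlation as a simple stand-in for what multiple regression does with its coefficients; partial correlation is not mentioned in these notes, and the variables x, y and z and their values are invented for the example. Here x and y are each built from z plus unrelated noise, so any association between x and y exists only because both depend on z.

```python
from math import sqrt


def pearson(a, b):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / sqrt(saa * sbb)


def partial_correlation(x, y, z):
    """Correlation between x and y with z held constant (first order)."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up data: x and y each equal z plus noise that is unrelated to z
# and unrelated to the other variable's noise.
z = [1, 1, 2, 2, 3, 3, 4, 4]
x = [2, 0, 3, 1, 4, 2, 5, 3]
y = [2, 0, 1, 3, 4, 2, 3, 5]

raw = pearson(x, y)
adjusted = partial_correlation(x, y, z)
print("correlation of x with y:              ", round(raw, 3))
print("same correlation with z held constant:", round(adjusted, 3))
```

In this example x and y correlate at about 0.56, yet the correlation vanishes once z is held constant. An analysis that looked at x alone would credit it with an association that really belongs to z, which is the kind of mistake multiple regression is designed to avoid.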

3. Rules for using multiple regression

There are some additional rules that have to be obeyed if multiple regression is to be useful:
Stephen Lea

University of Exeter

Department of Psychology
Washington Singer Laboratories
Exeter EX4 4QG
United Kingdom
Tel +44 1392 264626
Fax +44 1392 264623


Document revised 2nd January 1997