The Third Variable Problem

In this video I explain the third variable problem in correlational studies, how matched samples and matched pairs can be used to eliminate a possible third variable, and why measurement alone can never truly solve the third variable problem.

Don’t forget to subscribe to the channel to see future videos! Have questions or topics you’d like to see covered in a future video? Let me know by commenting or sending me an email!

Need more explanation? Check out my full psychology guide: Master Introductory Psychology: http://amzn.to/2eTqm5s

Video transcript:

Hi, I’m Michael Corayer and this is Psych Exam Review. In this video I’m going to talk about the third variable problem, which is something that I mentioned in the last video on correlations.

So the third variable problem comes up when we do a correlational study. Let’s imagine we go out and measure two variables, X and Y, and we collect this data and see that there’s a pattern.

The problem is that we can’t conclude anything about the causation. We don’t know which direction the causation is happening. It could be the case that X is causing Y, so as X increases that causes Y to increase.

It could be the case that Y causes X, so as Y increases that causes X to increase. Or, the third variable problem is that it could be that Z causes changes in X and Y. So Z is some other third variable that I didn’t measure.

I only measured two things and it could be a third thing that wasn’t in my study that’s influencing both of the two things I measured. The problem we have is that the list of possible third things is infinite; it could be anything.
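To make this concrete, here is a minimal Python sketch (not from the video) of the third case, where Z drives both X and Y. The names `simulate` and `pearson_r`, the sample size, and the noise levels are all assumptions chosen for illustration. In this made-up data X never influences Y and Y never influences X, yet the two still come out clearly correlated.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Hypothetical data: Z causes X and Z causes Y, nothing else."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]   # the unmeasured third variable
    x = [zi + rng.gauss(0, 1) for zi in z]    # X depends only on Z plus noise
    y = [zi + rng.gauss(0, 1) for zi in z]    # Y depends only on Z plus noise
    return x, y, z


x, y, z = simulate()
r_xy = pearson_r(x, y)
# In theory this correlation is 0.5, even though X and Y never touch each other.
print("correlation between X and Y:", round(r_xy, 2))
```

If I had only measured X and Y, this pattern would look exactly like one where X causes Y or Y causes X.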

So let’s look at an actual example of this. Let’s imagine that I did my study and I went to a school and I measured tons of students.

I measured them for X, which we’ll say is their study time, and for Y, their exam performance, which is their score on the exams, and I found this data here. Now you might look at that and immediately say, “Well, Mike, this seems obvious: studying causes the scores.” So studying more causes higher exam performance.

That might seem obvious to you, but actually that’s not necessarily the case. I can’t conclude that causation, because it could be the case that exam scores cause studying. You might say, “Well, how would that work?” Well, it could be the case that students who do well on the exams throughout the course are motivated by this to study; they enjoy the class more, they’re feeling competent, and therefore they study more. That could be the case, we don’t know.

Or it could be the case that some other thing, some other Z, causes both study time to increase and exam scores to increase.

So what could this Z possibly be? Well, it could be anything but let’s start with some plausible explanations. We could say alright, a potential Z that could be influencing this data would be teaching style. So it could be the case that students who have a teacher that they really like, they study more because they enjoy the class more and then also they do better on exams, they learn more, because they like the teaching style. So it makes sense to them and they do better on the tests. That’s one possible third variable that could influence this data.

Or you could say that it might be parental pressure. So it could be the case that students whose parents put them under a lot of pressure are forced to study more and because of this pressure, they also work harder to do better on exams. And students who have low parental pressure, their parents don’t really care how they’re doing, those students don’t study as much and also they aren’t as motivated to work hard on exams because they don’t really care.

That’s another possible third variable it could be. So it could be parental pressure causing this variation. Or it could be something that we wouldn’t think of, or we wouldn’t initially think of. Could it be coffee consumption?

Could it be the case that the students who drink more coffee are awake more and so they study more? They’re able to stay awake and read more? Could it be the case that they’re more alert during the exam?

The coffee consumption is keeping them awake during a boring test and the students who don’t drink coffee are falling asleep, and that’s why they’re not doing so well. Well, I can’t rule this out because I haven’t measured it. So I don’t know for sure; it could be coffee, or it could be what they ate for breakfast that morning, or it could be literally anything else.

Because I didn’t measure it, I can’t say for sure that it’s not that other thing, that it’s not their breakfast.

So how would I go about controlling for this problem? I say maybe it is coffee, let’s look into this. Well one thing that I could do is I could do a matched sample.

And all a matched sample refers to is comparing samples that are matched for a third variable. So I have my group of students here (hopefully it’s going to be a lot more than four people in my study), and I find out that on average they drink 2 cups of coffee per day. What I want to do is find another group of students who, on average, also drink 2 cups of coffee per day, and then I want to look at their exam performance and their study time. If I find out that their performance and their study time are very different from my first group of students, even though they drink the same amount of coffee, then that tells me that the coffee is not causing the exam performance or the study time. That would be a matched sample.

Another way I could do it, a more specific way, would be to match my students in pairs. This would be a matched pair design, and in a matched pair what I would do is say OK, here’s a student, here’s a student, and both of these students drink 1 cup of coffee per day. Here’s a student who drinks 2 cups, here’s another student who also drinks 2 cups. And here’s a student who drinks 3 cups, here’s another student who drinks 3 cups. So each level of coffee consumption has a pair, and then what I do is I compare those pairs.

I say OK, this student drinks 1 cup per day and has high study time, and this student drinks 1 cup of coffee per day and has low study time. This student drinks 2 cups per day and has high exam scores, and this student drinks 2 cups per day and has low exam scores. By doing this for all these pairs, I would be able to see if there was a pattern or not. I would say OK, if they drink the same amount of coffee and they have very different scores and very different study times, then it’s not coffee that’s causing this pattern of variation.
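The matching idea can be sketched in Python as follows. This is a rough illustration, not the procedure from the video: the helper names (`make_students`, `r_within_coffee_level`) and every number in the simulated data are assumptions, and the simulation hard-codes that coffee has no effect so we can see what that situation looks like. The sketch holds coffee constant by grouping students who drink the same number of cups, then checks whether study time and exam score still move together inside each group.

```python
import random
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def make_students(n=3000, seed=1):
    """Hypothetical students as (cups_of_coffee, study_hours, exam_score)."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        cups = rng.choice([1, 2, 3])
        study_hours = rng.gauss(10, 3)  # assumption: unrelated to coffee
        exam_score = 50 + 3 * study_hours + rng.gauss(0, 5)
        students.append((cups, study_hours, exam_score))
    return students


def r_within_coffee_level(students):
    """Correlation of study time and exam score among students matched on coffee."""
    groups = defaultdict(list)
    for cups, study_hours, exam_score in students:
        groups[cups].append((study_hours, exam_score))
    return {
        cups: pearson_r([s for s, _ in rows], [e for _, e in rows])
        for cups, rows in groups.items()
    }


within = r_within_coffee_level(make_students())
for cups in sorted(within):
    print(cups, "cups per day: r =", round(within[cups], 2))
```

Because the pattern is still there among students who all drink the same amount of coffee, coffee can’t be what’s producing it in this made-up data. Note that the simulation builds in a causal link from study time to score, which is exactly the thing a real correlational study has no way of knowing.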

Now the real problem here is that even if we eliminate coffee, we’ve only eliminated one item from an infinite list of other possible items. If it’s not coffee, it could still be something else. No matter how many of these third variables I eliminate, I still can’t say that studying more increases exam scores or that higher exam scores cause people to study more. I can’t say that because there’s still the possibility of a third variable. You can measure everything that you can think of, and it could still be that thing that you didn’t think of yet. So we’re never going to get around this infinite third variable problem just by measuring.

When we only measure, there’s a chance that it’s something we didn’t measure. So in the next video I’m going to talk about the way we have for trying to get around this third variable problem. What this technique attempts to do is to eliminate all third variables at the same time. And the way that we do this is called the experimental method. That’s what we’ll be looking at in the next video. I hope you found this helpful. If so, please like the video and subscribe to the channel for more.

Thanks for watching!
