In this video I explain simple ways of describing the dispersion of data that can be ordered or ranked. These are the range, the interquartile range, and the semi-interquartile range. I also briefly describe a more precise method for calculating the range for a continuous variable, though this method is not common. I explain each of these measures of dispersion, their strengths and weaknesses, and what they can tell us about our data.
Video Transcript
Hi, I’m Michael Corayer and this is Psych Exam Review. For all types of data except nominal we have distances between our scores. This means that we can calculate measures of dispersion that are more detailed than the variation ratio. The simplest of these measures is the range, which just tells us the distance from the lowest score up to the highest score.
Now the range is easy to calculate: we simply take the highest score and subtract the lowest score from it. But it also doesn’t tell us very much. This is because it only looks at the endpoints of our data, and this makes it very sensitive to outliers. If we have one very high or very low score, this will have a dramatic influence on the range. Nevertheless, it gives us a useful way to start thinking about the spread of our data.
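Here is a minimal Python sketch of that calculation. The scores are made up for illustration, and the function name `score_range` is just a label chosen here. It shows how a single extreme score changes the range dramatically.

```python
def score_range(scores):
    """Return the highest score minus the lowest score."""
    return max(scores) - min(scores)


# Hypothetical scores, invented for illustration only.
scores = [4, 5, 6, 7, 9]
with_outlier = scores + [40]  # the same scores plus one very high score

print(score_range(scores))        # 5
print(score_range(with_outlier))  # 36
```

Only the two endpoints enter the calculation, so the scores in between have no effect on the result.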
Just like with the interpolated median, which I explained in another video, there is a more precise method for calculating the range. This takes into consideration that a continuous variable can take any fractional value, so each recorded score really stands for an interval of values. So if I measured in whole seconds, a score of 4 seconds could be as low as 3.5 or as high as 4.5. So each score has a lower limit and an upper limit. Now if we think about the range, technically we should go from the lower limit of the lowest score up to the upper limit of the highest score. So let’s say we had a range of scores from 4 to 9. The usual method for calculating the range will give us a range of 5, but if we used this more precise method we would say that the range is actually from 3.5 all the way up to 9.5 and this will give us a range of 6.
So when scores are measured in whole units, the formula for this precise range is the highest score minus the lowest score plus 1. Now this isn’t very common and you’ll probably never use it, but if you see it somewhere, hopefully you’ll have a better understanding of why the 1 is added. Of course, even this more precise method of calculating the range doesn’t address the issue of sensitivity to extreme scores.
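As a rough sketch, the precise range can be written in Python by going from the lower limit of the lowest score to the upper limit of the highest score. The function name `precise_range` and the `unit` parameter are assumptions added here for illustration. With whole units, the result is the same as the highest score minus the lowest score plus 1.

```python
def precise_range(scores, unit=1):
    """Range from the lower limit of the lowest score to the
    upper limit of the highest score.

    `unit` is the measurement precision (an assumed parameter);
    each score is treated as covering half a unit on either side.
    """
    lower_limit = min(scores) - unit / 2
    upper_limit = max(scores) + unit / 2
    return upper_limit - lower_limit


# Scores running from 4 to 9, measured in whole seconds.
print(precise_range([4, 6, 9]))  # 6.0
```

Here the limits are 3.5 and 9.5, so the precise range is 6 rather than the usual 5.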
In order to address this we can use a version of the range called the interquartile range. What this does is it ignores the bottom 25% of our data and the top 25% of our data, eliminating extremes. So we’re just looking at the middle 50% of our data and this gives us a better idea of the spread around the median. In order to understand the interquartile range we can start by noticing that there are three values that would divide our data up into four sections or quartiles. These are the 25th percentile, the 50th percentile, which is the median, and the 75th percentile.
Now we’ve already seen how to calculate the median: this is the middle score in an odd number of scores, or the mean of the two middle scores in an even number of scores. Now we simply repeat this process for the lower half of our data in order to find the 25th percentile and again for the upper half of our data in order to find the 75th percentile. The interquartile range is just the range of the middle two quartiles of our data. So we’re looking at the distance from the 25th percentile to the 75th percentile.
So to calculate it we simply take the 75th percentile and subtract the 25th percentile. This gives us the spread of our data in the 50% around the median. This means that it won’t be influenced by outliers because those would fall in the lower 25 percent or the upper 25 percent of our data that we’re ignoring.
Let’s practice finding these points with a set of 15 scores. So first we want to find the median. Since we have an odd number of scores this will be the middle score at n plus 1 divided by 2. In this case that’s the eighth position and that gives us a median of 20. Now we repeat this process for the upper and lower halves of our data. For the lower half, we look at the fourth position and this gives us a 25th percentile of 13. And for the upper half we look at the 12th position and this gives us a 75th percentile of 26. So to find the interquartile range we take the 75th percentile minus the 25th percentile, so we have 26 minus 13 and this gives us an interquartile range of 13.
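The same steps can be sketched in Python. The 15 scores below are hypothetical, chosen only so that the fourth, eighth, and twelfth positions hold 13, 20, and 26 as in the example. The function name `quartiles` is also just a label used here. The sketch assumes the median-of-halves approach described above, leaving the median itself out of both halves when the number of scores is odd.

```python
from statistics import median


def quartiles(scores):
    """Return (25th percentile, median, 75th percentile) by taking
    the median of the lower half and of the upper half of the scores."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd number of scores, skip the middle score itself.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


# Hypothetical data set of 15 scores, invented for illustration.
scores = [8, 10, 12, 13, 15, 16, 18, 20, 21, 23, 25, 26, 28, 30, 33]

q1, med, q3 = quartiles(scores)
iqr = q3 - q1

print(q1, med, q3)  # 13 20 26
print(iqr)          # 13
```

Statistical software may use other conventions for locating percentiles, so results from a package can differ slightly from this hand method.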
If we have a large interquartile range this suggests that our data is spread out around the median, and if we have a small interquartile range that suggests that our data is closer together around the median. Now we should keep in mind that this interquartile range is really only looking at two points and it’s ignoring most of our data, but it still gives us a way of thinking about the spread of our data around the median. And this is useful for ordinal data where we can’t calculate a mean or if we have an asymmetrical distribution where a mean might be misleading.
Once we have the interquartile range we can also calculate what’s called the semi-interquartile range. This is simply the interquartile range divided by 2. So in our case that would be 13 divided by 2, which gives us 6.5. This is telling us the average distance from the median down to the 25th percentile or up to the 75th percentile. So it’s saying it’s about 6.5 points down to get to the 25th percentile and about 6.5 points up to get to the 75th percentile. Now of course it’s possible that these distances are quite different and the semi-interquartile range wouldn’t be able to tell us that.
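A short Python sketch using the values from our example (a 25th percentile of 13, a median of 20, and a 75th percentile of 26) shows both the calculation and its limitation. The function name `semi_interquartile_range` is just a label chosen here.

```python
def semi_interquartile_range(q1, q3):
    """Return half of the interquartile range."""
    return (q3 - q1) / 2


# Quartile values from the worked example.
q1, med, q3 = 13, 20, 26

siqr = semi_interquartile_range(q1, q3)
distance_down = med - q1  # median down to the 25th percentile
distance_up = q3 - med    # median up to the 75th percentile

print(siqr)                        # 6.5
print(distance_down, distance_up)  # 7 6
```

The two distances here are 7 and 6, and 6.5 is their average. The semi-interquartile range reports only that average, so it would give the same value even if the two distances were very different.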
So that’s the range, interquartile range, and semi-interquartile range. I hope you found this helpful. If so, let me know in the comments, like and subscribe, and make sure to check out the hundreds of other psychology tutorials that I have on the channel. Thanks for watching!