
Intelligence

What intelligence is, how it is measured, what IQ scores do and don't tell us, and the complex factors behind individual and group differences.


What is Intelligence?

Intelligence, broadly the ability to learn from experience, solve problems, and adapt effectively to new situations, is something we all think is important, but what exactly does this term refer to? We might all hope to have intelligence, but what are we hoping for? Always knowing the right answer? Making better decisions? Earning higher grades or more income? Being more successful in reaching our goals? Having a greater sense of control over our lives?

In this chapter we'll be examining the concept of intelligence in three main ways. First, we'll consider how to define intelligence. What does it mean to have intelligence (or not) and how have definitions of intelligence changed over time? Second, once we have some ideas about how to define intelligence, we'll look at attempts to measure it. How can we assess someone's intelligence? Finally, once we have assessments of intelligence, what do these results really mean? What causes differences in scores and how should we interpret these differences?

The first thing that we should remember when considering the idea of intelligence is that it is just that; an idea. Like many properties that psychologists attempt to investigate, intelligence is a hypothetical construct. We need to avoid the temptation for reification: treating an idea as if it were a real, concrete object that can be objectively measured. When we consider someone's IQ score there can be a tendency to think of that score as a measurement of some fixed entity, but we should try to avoid thinking this way.

Since intelligence isn't a concrete object we can directly observe, we need to agree on some terms in order to make sure that we're all talking about the same thing. What should we consider intelligence to be? Let's take a look at some historical precedents for the concept of intelligence, along with some newer approaches that have attempted to clarify this potentially vague notion.

Charles Spearman (1863-1945) is considered to be one of the first researchers to provide evidence that intelligence is a single underlying trait which affects a number of abilities. While looking at relationships between performance in different areas, Spearman found that high performance seemed to be linked across domains: people who did well on one task also tended to do well on other tasks. Spearman collected results for performance on many tasks, then used factor analysis, a statistical technique for examining whether a large number of correlations can be explained using a small number of underlying factors. Spearman identified one underlying factor influencing many cognitive abilities, which he called the g-factor, for general intelligence. Spearman recognized that people also have specialized skills, but he believed that these were influenced by an individual's g-factor. A helpful mnemonic for remembering Spearman's association with the g-factor is to imagine a spear with one point, pointing to the single factor of g.
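To make the idea of extracting one general factor concrete, here is a minimal, hypothetical Python sketch. It simulates scores on six tasks that all depend on a single latent ability, then uses the largest eigenvalue of the correlation matrix as a simple stand-in for a general factor (Spearman's actual method differed, and all numbers and variable names here are invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 500 people: one latent general ability ("g") plus task-specific noise
n_people, n_tasks = 500, 6
g = rng.normal(size=(n_people, 1))                    # latent general ability
noise = rng.normal(scale=0.7, size=(n_people, n_tasks))
scores = g + noise                                    # every task partly reflects g

# Correlations between tasks all come out positive (the "positive manifold")
corr = np.corrcoef(scores, rowvar=False)
print(np.round(corr, 2))

# Largest eigenvalue of the correlation matrix ~ variance captured by one general factor
eigvals, _ = np.linalg.eigh(corr)
print(f"Top factor accounts for roughly {eigvals[-1] / eigvals.sum():.0%} of the variance")
```

In real data the question is whether a single factor captures this much of the shared variance across tasks; Spearman argued that it does.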

Louis Leon Thurstone (1887-1955) challenged Spearman's concept of g and proposed a different organization of mental abilities. Thurstone proposed a multi-factor theory of seven primary mental abilities, each of which influenced particular sets of skills. Despite Thurstone's evidence for these domains, further analysis has supported the idea that there is still a general intelligence factor which influences several primary mental abilities at once. You could visualize this as a sort of pyramid: a single g-factor (at the top) influences several different domains of mental ability (the middle), which each then influence the development of many specific skills (at the base).

We still see echoes of this debate today, and definitions of intelligence are never satisfactory for everyone. Similar to Thurstone's primary mental abilities, Howard Gardner has proposed a theory of multiple intelligences, believing that our mental abilities should be considered as separate modules. It is possible for an individual to flourish in one area of intelligence independent of others. Gardner has suggested there is evidence for the existence of at least eight separate intelligences, including verbal, mathematical, musical, spatial, bodily-kinesthetic, interpersonal, intrapersonal, and naturalistic intelligences.

Possible evidence for these separate intelligences can be seen in prodigies, children with high ability in a single area but normal development in other areas, and savants, individuals with extremely high proficiency in one area, accompanied by low ability or disability in other areas. The existence of prodigies and savants suggests that intelligences are (or at least can be) separate domains.

Perhaps the most famous savant in recent times was Kim Peek (1951-2009), the inspiration for Dustin Hoffman's character Raymond Babbitt in the film Rain Man (note: in the film, Babbitt was portrayed as autistic though Peek did not have autism). Peek's prodigious memory for facts and figures was accompanied by disabilities in other areas such as difficulties with reasoning and metaphorical thinking along with physical coordination problems.

Cases of patients with brain damage resulting in specific types of deficits also provide some support for a modular view of multiple intelligences. If a person is in an accident and subsequently loses the ability to perform a specific set of skills without loss of ability in other areas, this might indicate those skills are a separate module of intelligence. Similar evidence that intelligences are separated comes from cases of acquired savant syndrome. In these cases, brain damage actually causes a sudden heightening of ability in a particular area, which may mean that inhibition of some brain regions (via injury) has the potential to boost performance in other areas. This doesn't mean that you should start bashing your head against a wall in hopes of boosting your math scores, but perhaps there is a way to harness this inhibition on a temporary basis. At the end of this chapter we'll look at transcranial stimulation, which may allow us to temporarily knock out specific brain regions in order to allow others to shine.

While these cases of damage suggest a separation of skills, they don't necessarily suggest that overall development is not still influenced by some single underlying factor (like g). Specific damage demonstrates that some types of intelligence may be isolated in their localization in the brain but this doesn't provide conclusive evidence that these types of intelligence develop separately.

Even if we accept that there are separate multiple intelligences this still doesn't necessarily provide us with clear definitions of what intelligence is. For instance, if we were to agree on some definition of intelligence which included bodily-kinesthetic abilities, we would still have the problem of how to define this particular area and then how to assess it. Should bodily-kinesthetic intelligence include reaction time, muscle fiber types, balance, endurance, or hand-eye coordination? Should we assess Michael Jordan's bodily-kinesthetic intelligence based on his basketball playing or his baseball playing? How do we separate this intelligence from specific skills developed through extensive practice? If I practice free-throws for hours each day, I will probably improve at least a little bit, but has my bodily-kinesthetic intelligence improved, or just this specific skill? Is there a difference?

Robert Sternberg has suggested that our definitions of intelligence should really be focused on those abilities which bring us overall success in living. Excellent performance on math or reasoning questions may not accurately predict success in dealing with real-life problems, and a practical measure of intelligence should reflect this. In real life, problems aren't clearly defined and almost never have single solutions. We're often on our own in defining the problems we face, generating multiple solutions for these problems, and then weighing a number of factors in determining which courses of action we should implement and how we should adjust those courses along the way.

Sternberg's triarchic theory of intelligence proposes three main types of intelligence: analytical intelligence, the ability to generate solutions to specific problems; creative intelligence, the ability to generate novel solutions; and practical intelligence, the ability to choose the most appropriate solutions based on the situation and context. Practical intelligence means recognizing that the “best” answer isn't always the same as the “correct” answer, such as knowing when to accept your boss's flawed decision, when to tell your friend a new outfit looks great (when it doesn't), or when to give up fighting an argument even though logic is on your side.

This brings us to consider the possible role of emotional intelligence, first proposed by Peter Salovey and John Mayer (no, not that John Mayer), though perhaps most associated with Daniel Goleman, who has written several popular books on the subject. Emotional intelligence emphasizes the importance of recognizing, expressing, managing, and using emotions. The ability to console a friend, conceal contempt for a co-worker, or deliver a rousing speech can influence our likelihood of success and therefore could be considered a fundamental part of behaving intelligently.

A final idea for defining intelligence comes from Raymond Cattell, who used factor analysis to distinguish between fluid intelligence and crystallized intelligence. Fluid intelligence refers to the capacity to solve new problems and incorporate new information effectively, independent of acquired knowledge, and it tends to decline earlier with age. Crystallized intelligence, on the other hand, refers to specific knowledge and skills that have been accumulated through experience, and it tends to remain stable or even improve with age.

Imagine that you're an avid video gamer who has logged many, many (perhaps too many) hours playing Call of Duty. Now you purchase a new first-person shooter game. You are delighted to find that the controls for this new game are identical to those for Call of Duty. Your immediate success in the early levels of this new game could be attributed to your crystallized intelligence. You have previously accumulated specific skills in using this controller layout, and these existing skills can be applied directly to the new game. In later levels, however, you find that none of your old strategies from Call of Duty seem to be working, and you are forced to use your fluid intelligence to adapt to these new circumstances and come up with novel solutions in order to beat these levels.


Assessing Intelligence

Even if psychologists can't quite agree on the perfect definition of what intelligence is or should be, this hasn't stopped them from attempting to measure it. This desire to measure first and ask questions later is true for many other psychological traits, abilities, and processes. The study and design of testing for abilities and traits, including personality traits (which will be addressed in the next chapter), is known as psychometrics, and a person doing the investigating would be a psychometrician.

The normal distribution of IQ scores, with a mean of 100 and a standard deviation of 15. Approximately 68% of people score between 85 and 115.

Test Types

In designing an intelligence assessment, I may want to know the level of difficulty that someone is capable of solving. In this case, I would probably look at whether a person is able to solve a particularly difficult puzzle or not. This would be considered a power test. In this context power refers to how well a measurement can differentiate results.

A question that everyone can solve has low power because it doesn't tell us anything about how people differ. A question that some people can solve and others can't solve allows us to make finer distinctions between people and therefore would have higher power.

Designing tests isn't just about the questions but also includes other factors like time. Questions with low power may still be useful if you want to see how quickly someone can solve a number of simple puzzles. You could measure how many puzzles a person is able to solve in some set amount of time. This would be considered a speed test rather than a power test. In this case, it's ok if you use questions that everyone can solve (low power) because you're differentiating people based on their speed, not on whether they can answer the question.

Most tests you take in school measure knowledge or skills that you have learned, so they would be considered achievement tests. IQ tests are generally intended to be aptitude tests: tests that predict potential ability. Predicting ability is far more challenging than simply measuring achievement, which is why the design and implementation of intelligence tests can be controversial. While a low score on an achievement test may simply indicate someone hasn't learned something yet, a low score on an aptitude test could be interpreted as meaning the person is incapable of ever learning something. This interpretation may have major implications for a person's future, so we have to be particularly cautious when it comes to designing aptitude tests and drawing conclusions from them.

Whenever we consider some hypothetical construct or property like “intelligence” there are three points we should examine closely: the property itself (which we think exists), the behaviors we associate with this property (the observable behaviors that make us think it exists in the first place), and finally, responses to assessments aimed at measuring this property.

As we consider these three points, we can question the relationships between them. Do those behaviors really represent the property? Does this assessment actually relate to the property? If the answer to both of those questions seems to be yes, then we should find a clear relationship between the assessment and the actual behaviors. In other words, if grades truly reflect intelligence, and an IQ test is actually assessing intelligence, then we should find a relationship between grades and IQ scores. Ideally this relationship would always hold, but of course, we can imagine exceptions; the brilliant prodigy so bored in class she doesn't bother doing assignments (high IQ score but low grades) or the charming student whose charisma earns him higher grades than his IQ score might otherwise predict (low IQ score but high grades).


Validity

As with any variable we investigate, we want to reduce the possibility of bias and error in our assessment of intelligence. When it comes to intelligence tests, there are a number of techniques we can use to try to ensure validity and reliability.

Validity refers to making sure that the test actually measures what we want it to measure. It's tough to determine how to define intelligence, but once we've decided upon a possible definition, we want to design a test which will actually measure it. There are many terms for different types of validity below, but the point is not to torture you with a laundry list of definitions; once learned, these terms make it easier to talk about potential flaws in a test. Ideally a test could satisfy all the types of validity below, but in practice, most tests can be criticized in at least one of these areas.

Construct validity refers to ensuring that there is actually a relationship between what we measure on a test and the property (or construct) of intelligence. We want to be sure that there is a relationship between what we're measuring and what we're claiming to assess. This is probably the most essential of the types of validity discussed here, because without a clear relationship to the property, even the most carefully-collected data is useless for drawing conclusions.

Face validity refers to a surface-level assessment that a test relates to what it is intended to measure. For instance, if I wanted to assess an artist's skill, I might create a test of different brush techniques. It would be easy to see that this test is inappropriate for assessing the skill of a chef, but for an artist it seems to make sense. This doesn't mean my test is a detailed assessment of artistic skill, just that it looks like it is related to the property I'm interested in.

To consider the validity of a test more carefully, content validity (also referred to as logical validity) asks whether the test is comprehensive enough to cover all the aspects needed to assess the property in question. In the case of artistic skill, I'm going to need to assess more than just brush strokes, because we all know that there is more to artistic skill than brush technique. Content validity can be problematic for assessing intelligence because concepts of what should be considered intelligence can be quite broad. An intelligence test which only assessed verbal abilities could have face validity (those abilities appear to be related to intelligence) but this test would be lacking in content validity because it fails to assess many other abilities which are also related to the concept of intelligence. Even fairly comprehensive IQ tests available today could be argued to lack content validity for their failures to fully assess creativity, divergent thinking, emotional intelligence or practical intelligence.

Finally, we may wonder how well a test compares to other assessments and how well it predicts future events, known as criterion-related validity. This can be divided into two main components. Concurrent validity refers to whether a test's results match up with other related results at the same time. For instance, if you took an IQ test today, I might want to know how your score relates to your current grades in school (just remember that concurrent means existing simultaneously). Predictive validity, however, assesses how well a test can predict future outcomes, like whether your childhood IQ score can predict your later SAT results, or whether your SAT results can predict your later college performance, etc. (remember it is about predicting future outcomes).


Reliability

Reliability refers to the idea that if I measure the same object with the same measure, I should get the same result. There are a few ways to assess this for intelligence. One way would be to look at the test questions themselves.

Split-half reliability consists of randomly splitting a test into two halves, then calculating two separate scores. If the questions are randomly divided, on average test-takers should receive about the same score on each half. If the two half-scores are drastically different for many test-takers, this would indicate a possible lack of reliability.
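As a concrete illustration of split-half reliability, here is a small, hypothetical Python sketch: item responses are split into two random halves, the two half-scores are correlated, and the Spearman-Brown formula (a standard correction for the fact that each half is shorter than the full test) estimates full-test reliability. The data and numbers are simulated, not taken from any real test.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate 200 test-takers answering 40 items (1 = correct, 0 = incorrect)
ability = rng.normal(size=(200, 1))
difficulty = rng.normal(size=(1, 40))
p_correct = 1 / (1 + np.exp(-(ability - difficulty)))     # simple item response model
responses = (rng.random((200, 40)) < p_correct).astype(int)

# Randomly split the items into two halves and score each half separately
items = rng.permutation(40)
half_a = responses[:, items[:20]].sum(axis=1)
half_b = responses[:, items[20:]].sum(axis=1)

r_half = np.corrcoef(half_a, half_b)[0, 1]
full_test = (2 * r_half) / (1 + r_half)                    # Spearman-Brown correction
print(f"Half-score correlation: {r_half:.2f}; estimated full-test reliability: {full_test:.2f}")
```

If the two half-scores barely correlate, the correction can't rescue the test; the items simply aren't measuring the same thing consistently.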

Test-retest reliability refers to a single participant completing an entire test on more than one occasion, then comparing the scores. If you're measuring the same person, you should get roughly the same result each time. This technique can also help to identify the possible presence of bias in test administration. For instance, if many people receive lower scores when retesting with a particular examiner, that might indicate bias on the part of the examiner, rather than an actual drop in all participant IQs.

If people take the same exam more than once, we need to have different versions of the test. Otherwise, there's a chance that scores could improve simply as a result of some participants already knowing the answers. Naturally, we need to ensure that alternate versions of the test are still assessing the same thing in the same way; this is known as equivalent form reliability. The SAT provides a great example of this. The test needs to be different each time, since students can take the exam more than once, but each version needs to be the same level of difficulty in order to compare scores among many students.

This brings us to a related point. In considering the validity and reliability of a test, we need to consider more than just the test itself, we need to consider who is taking the test and how it is being scored. We want a test to be standardized, meaning that the rules for administering and scoring the test are clearly dictated. The SAT, SAT II, ACT, and Advanced Placement Exams are well-known examples of tests that are standardized.

We also want to ensure that a test is appropriate for the people it is intended for. For a test like the SAT, we want to know how actual high school students will respond to the questions. We need to test these questions with a sample of high school students, who would be the standardization sample. This is something that the SAT does regularly, and in fact it does so with a very representative sample: actual SAT test-takers. New experimental questions are added to the tests students take, but these have no influence on scores. Instead, these questions are assessed for possible use on future exams. In scoring, norms are established for which questions are of the appropriate difficulty and how scores will be calculated and compared.


A Brief History of Intelligence Testing

Now that we've got the vocabulary for discussing assessment and psychometrics, let's take a brief tour through the history of different attempts to assess intelligence, followed by an overview of the main intelligence tests used by psychometricians today.

We'll begin with Sir Francis Galton, a half-cousin of Charles Darwin and a man obsessed with the measurement of man. Spurred by his relative's idea of natural selection, Galton believed that high intelligence was inherited and he spent years studying the genealogical histories of eminent minds (detailed in his book Hereditary Genius in 1869). Galton thought that in order to be passed down, intelligence must offer survival advantages and these advantages might be seen in physiological measurements. Galton invented the technique for calculating correlation, then measured reaction times, sensory acuity, muscular power, and even the body proportions and head sizes of thousands of people in an attempt to connect these physical traits to intelligence (which is why Galton is generally considered the founder of psychometrics). Despite his best efforts, however, Galton was unable to provide convincing evidence that any of these traits were strongly correlated with mental ability.

As the 19th century drew to a close, the French education system was dealing with a great difficulty. Education reforms meant that thousands of French children would receive mandatory free schooling, but a lack of previous education meant that many of these students were quite far behind. The wide variety of education levels meant that students couldn't simply be lumped together by age. Schools needed to identify which students would be able to catch up through remedial classes and determine how to place students appropriately. In order to identify and help those children who could benefit most, Alfred Binet (pronounced Bih-nay) worked to design assessments. This goal led to collaboration with Theodore Simon (try your best French accent – See-mohn) to create a multi-faceted examination. This Binet-Simon test compared a child's skills to the performance of children of different ages. This allowed Binet and Simon to compare children based on mental age. For instance, a precocious child of 8 who was able to perform as well as children aged 12 was said to have a mental age of 12. If, on the other hand, a 12-year-old only performed as well as most 10-year-old children, he would have a mental age of 10.

This concept of mental age was adopted by the German psychologist William Stern, who used it to calculate an Intelligence Quotient, or IQ. Stern divided mental age by chronological age, then multiplied that result by 100. If a child had a mental age of 5 and was actually five years old (chronological age), this would give a value of 1 (5/5), which, multiplied by 100, would give an IQ of 100. A five-year-old performing at a mental age of 6, however, would end up with an IQ of (6/5) * 100 = 120.
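As a quick sketch of Stern's calculation (the function name and ages below are just for illustration):

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Stern's ratio IQ: (mental age / chronological age) * 100."""
    return 100 * mental_age / chronological_age

print(ratio_iq(5, 5))    # 100.0 -> performing exactly at age level
print(ratio_iq(6, 5))    # 120.0 -> a five-year-old performing like most six-year-olds
print(ratio_iq(30, 60))  # 50.0  -> the formula breaks down for adults, as discussed next
```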

This type of calculation, known as a ratio IQ score, worked pretty well for comparing children. After all, a 5 year-old who can perform as well as most 8 year-olds certainly would seem to be highly intelligent, while a 9 year-old performing at a 6 year-old level could indicate low intelligence. The problem arises once ages pass a certain point. At younger ages, just one year of mental age can mean a great deal of difference in abilities. But what's the difference between a mental age of 18 and 19? What would it mean for a 60 year-old to perform at the level of most 30 year-olds? Wouldn't that be a good thing? In this equation the denominator (chronological age) is always increasing at a steady rate, while the numerator may not be, meaning that scores would appear to continuously drop with age. One way around this was to use a chronological age of 20 for all adults in order to compare them, but this wasn't an ideal solution.

Psychometricians no longer rely on ratio IQ. Standard IQ tests today calculate what's known as a deviation IQ score. A deviation IQ score is based on how far an individual's score deviates from the average score of all test-takers within that age range: the age-group average is set to 100, and scores are scaled so that each standard deviation above or below that average corresponds to 15 IQ points. This is why the average IQ score will always be 100; if your score matches the average for your age group, your deviation IQ is 100. If you perform better than the group average, your IQ will be above 100, and if you score worse then it will be below 100. Even if everyone's raw test scores improved by 10 points, the average IQ would still be 100 (though as we'll see later, this makes comparing scores over time more difficult).
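A minimal sketch of that scaling follows; the raw scores and age-group statistics are invented, and real tests use carefully constructed norm tables rather than a one-line formula.

```python
def deviation_iq(raw_score: float, age_group_mean: float, age_group_sd: float) -> float:
    """Deviation IQ: 100 plus 15 points for each standard deviation above the age-group mean."""
    z = (raw_score - age_group_mean) / age_group_sd
    return 100 + 15 * z

# Suppose 10-year-olds average 42 raw points with a standard deviation of 6
print(deviation_iq(42, 42, 6))  # 100.0 -> exactly average for the age group
print(deviation_iq(51, 42, 6))  # 122.5 -> 1.5 standard deviations above the age-group average
```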

Because your IQ score is only compared to those within your age group, IQ scores aren't meant to be assessments of intellectual ability across age ranges. In other words, a 5 year-old child with an IQ of 130 might not be able to solve problems that an 18 year-old with an IQ of only 90 can solve. Remember that IQ is meant to be about aptitude, so a high score at a young age doesn't mean that the child actually has more skills, but rather, the child is believed to have a greater potential to develop skills in the future.

In the United States, Alfred Binet's work in creating assessments was taken up by Lewis Terman at Stanford University. Drawing on the Binet-Simon test, combined with Stern's calculation of ratio IQ, Terman created the Stanford-Binet Intelligence Scale in 1916, and it is still in use today (though now in its 5th edition and calculating deviation IQ scores, not ratio IQ scores). Terman is perhaps best known for initiating a longitudinal study in 1921 which aimed to identify the long-term predictive ability of IQ assessment. Terman believed that while educational environment did play a role in reaching one's fullest IQ potential, intelligence was mostly determined by heredity and thus could be identified at an early age.

Terman assessed 168,000 children, then followed 1,500 with exceptionally high scores (above 135 on the Stanford-Binet test) throughout their lives. This was a change from previous methods of considering the relationship between intelligence and success, which were mostly done by studying successful adults and attempting to retrace their childhood experiences and family histories (such as Galton's genealogical studies).

Terman tracked the educational outcomes, occupational status, income levels, and achievements of his high-IQ children (who were affectionately referred to as his “Termites”) compared to average-scoring children to determine just how much childhood IQ mattered to later life (in fact, the study is ongoing with the remaining survivors). Terman hoped the results would dispel the notion that childhood geniuses were socially inept or tended to burn out, and he believed that early high IQ could predict later success. Terman did find that these children tended to do better in life than their peers; they published more academic papers and novels, registered more patents, attained higher levels of education and job status, and earned higher incomes, though their high IQ wasn't a guarantee of success and not all of his Termites lived happily ever after. IQ also wasn't a guarantee of truly world-class scholarship, and Terman didn't find any Nobel prize winners among his 1,500 child geniuses. Ironically, William Shockley was a young child tested by Terman who didn't score high enough to become a “Termite”, though he did go on to win a Nobel prize in Physics.

While the Stanford-Binet test developed by Terman is still in use, the most commonly administered intelligence test today is the Wechsler Adult Intelligence Scale (WAIS), first developed by David Wechsler in 1939 (as the Wechsler-Bellevue Intelligence Scale) and currently in its 4th edition. In addition to the WAIS, there are two other versions designed for younger children: the Wechsler Intelligence Scale for Children (WISC), and the Wechsler Preschool and Primary Scale of Intelligence (WPPSI). The Wechsler scales all test multiple abilities and include verbal and performance-based assessments. Despite what you might see in popular depictions of IQ testing, these tests are not predominantly pencil-and-paper exams and they are administered in person by a trained psychologist. The WAIS-IV consists of 10 core sub-tests (and 5 supplemental sub-tests) assessing areas such as general knowledge, word meanings, memory, and rule application. The majority of these sub-tests do not involve any writing, and none involve writing words.


IQ Differences

IQ scores differ and this is what makes IQ worth studying. Intelligence tests are all about differentiation. If everyone got all the questions correct, or everyone got all the questions wrong, everyone would have the same IQ score, and as a result, IQ scores would be meaningless. We want IQ scores to be as different as possible, because that means we're able to generate finer distinctions between people.

This push for differentiation may create a problem when it comes to interpreting IQ scores. Perhaps the tendency to emphasize differences leads test-creators to create artificial distinctions between people whose intelligences aren't really that different (though their IQ scores may be). In other words, the difference in intelligence between two people, one who scores 100 and one who scores 104, may actually be negligible. While it's great that a test can discriminate between these two people's abilities, this doesn't necessarily imply any practical difference in their intelligences.

This is not to suggest that IQ scores don't provide information about individuals, but rather that there are a number of caveats to keep in mind when it comes to interpreting score differences. That said, let's look at how scores are usually analyzed, and then we can consider what differences in scores may mean for understanding individuals.


Individual Variation

When we look at the frequency of different scores on IQ tests, as well as many other traits ranging from height or weight to personality traits, they tend to be distributed in something close to a normal curve or bell curve. This means that scores in the middle of the range are the most frequent, and the frequency of extreme scores falls off symmetrically in both directions. Very low scores are rare and very high scores are rare, so most scores are clustered around the mean. As we saw in the calculation of deviation IQ, the average IQ score is 100. The standard deviation for IQ tends to be about 15 points. This means that most people have an IQ score close to 100, and about 68% of people will be within 15 points of 100 in either direction (from 85 to 115).

Knowing that IQ follows a normal distribution with a standard deviation of about 15 means that one can also estimate a percentile score. A percentile score lets an individual know how his or her score compares to everyone else who took the test by telling what percentage of people received lower scores. An IQ score of 100 would be in the 50th percentile, because half of the test-takers scored below this point (which means that half also scored above). An IQ score of 115 would be around the 84th percentile, because about 84% of test-takers would score below this level. Just remember that higher percentile scores indicate better performance, while lower percentile scores indicate worse performance compared to other test-takers. You may be familiar with percentile scores if you've taken the SAT, as the College Board includes a percentile score in your score report.
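Assuming the idealized normal distribution described above (mean 100, standard deviation 15), percentiles can be approximated directly; actual score reports rely on normed tables, so treat this Python sketch as an approximation.

```python
from scipy.stats import norm

def iq_percentile(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Approximate percentile: the percentage of test-takers scoring below this IQ."""
    return 100 * norm.cdf(iq, loc=mean, scale=sd)

print(round(iq_percentile(100)))  # ~50th percentile
print(round(iq_percentile(115)))  # ~84th percentile (one standard deviation above the mean)
print(round(iq_percentile(130)))  # ~98th percentile (only about 2% score higher)
```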

So what can be said about individuals who score above, below, or at the mean? What predictions can we make about other behaviors? What real-life outcomes are associated with IQ scores?

It turns out that an IQ score can act as a fairly good predictor of future success. Higher IQ is associated with a number of positive outcomes, including higher levels of education, job status, income, and job performance. While it could be argued that many of the above consequences are too closely related (higher IQ leads to higher grades, which lead to better college acceptances, which lead to better job opportunities and higher pay), higher IQ is also related to somewhat unexpected outcomes such as greater longevity. Perhaps we could see this as the ultimate measure of “success in living”, as those with higher IQ generally manage to live longer. Whether this increased longevity is a result of better “system integrity”, health management, accident and disease prevention, or access to greater resources is still open to debate, though we may expect that all of these factors (and perhaps others) play a role.

IQ is also correlated with other, more directly observable phenomena. Individuals with higher IQ are able to correctly judge line lengths and differentiate colors and tones with shorter “inspection times” than others, and these individuals also show faster and less variable reaction times to a number of different stimuli. These results suggest that despite flaws and controversies around what intelligence is and how it can be tested, IQ scores actually reveal some ways in which individuals differ, and scores do allow some behavioral predictions.

IQ Extremes

IQ extremes can be defined in a statistical manner, according to the normal curve. Once a score is more than 2 standard deviations (30 points) away from the mean, it can be considered exceptional, as only about 2% of scores fall beyond this point in each direction.

High levels of intelligence are referred to as intellectual giftedness. While this is often considered to be an IQ over 130 (2 standard deviations above the mean), there is no precise score value that defines giftedness, and as the rest of this chapter has already indicated, conceptions of intelligence can vary widely. So what does giftedness mean? Despite some popular misconceptions, intellectual giftedness does not necessarily imply introversion, poor social skills, or depression. In fact, most people with high IQ seem to be well-adjusted and have fulfilling work, family, and social lives. Giftedness generally refers to children who have the ability to learn material quickly, with fewer repetitions, and in broader areas compared to their peers.

People with IQs below 70 are considered to have an intellectual disability (ID). This was previously regarded as the cutoff point for mental retardation, but this term is no longer used. The farther below 70 an individual's score, the more severe the disability tends to be, though extremely low scores (more than 3 standard deviations below the mean) are difficult to assess reliably.

What Causes IQ Differences?

When we think about IQ differences, we inevitably wonder what causes these differences. You probably already know that the answer won't be as simple as “genes” or “environment”, but rather a complex interaction of these forces.

Evidence for genetic influence on intelligence comes from twin and family studies which have demonstrated that identical twins tend to have more similar intelligence scores than fraternal twins, who in turn are more similar than random unrelated individuals. It seems that the more genes people share, the more similar their IQ scores are. Twin studies can also tell us about the importance of the environment by comparing shared vs. nonshared environments. Will identical twins raised in different homes still have similar IQ scores? Will two non-related adoptive children have more similar IQ scores when they are raised in the same home environment?

Studies of monozygotic (identical) twins raised in the same home have found an IQ score correlation of 0.85 but a correlation of 0.6 for dizygotic (fraternal) twins raised together. Nonbiological siblings raised in the same home show an IQ score correlation of only 0.25 in childhood and this correlation drops to zero by adulthood, suggesting that an early shared environment does not have a powerful long-lasting effect on general cognitive ability (Plomin et al., 2013, pp. 30, 97).

From these types of studies, researchers calculate a heritability score to represent the strength of genetic and environmental influences. Heritability scores can range from 0 to 1 and tell us how much of the variance in a particular trait can be explained by genetic forces. The heritability score for IQ is generally estimated to be somewhere around 0.5, which means that about half of the variance in IQ scores in the population is due to genetic factors. Remember that the concept of heritability is not about individuals; heritability only tells us about groups. It doesn't explain the scores themselves, it explains why the scores differ. Saying that heritability is 0.5 doesn't mean that half of your IQ score comes from your genes, or even that your genes account for half of your intelligence. A heritability score of 0.5 means that when we attempt to answer the question of why people have different IQs, the fact that people have different genes provides about 50 percent of the explanation, while differences in environments account for the other half.
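One classic back-of-the-envelope way to turn twin correlations into a heritability estimate is Falconer's formula, which doubles the difference between the identical-twin and fraternal-twin correlations. Using the correlations quoted above gives roughly the 0.5 figure mentioned here; note that this is a rough heuristic, not how modern behavioral-genetic models are actually fit.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's formula: h^2 is approximately 2 * (identical-twin r - fraternal-twin r)."""
    return 2 * (r_mz - r_dz)

# Correlations cited above: 0.85 for identical twins, 0.6 for fraternal twins raised together
print(round(falconer_heritability(0.85, 0.6), 2))  # 0.5 -> about half the variance attributed to genes
```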

We should also remember that heritability scores vary. When environments are more similar, genes may matter more. For example, if all students had equal nutrition, identical educational opportunities, the same parenting, etc. then any differences in their IQ scores would mostly come from their differing genes (so heritability would be high). If students had very different environments (some went to school, others didn't, some were malnourished, others weren't) then these environmental differences would contribute more to explaining any IQ differences, and thus the heritability score for this population would be lower. Heritability for intelligence has been shown to increase with age and heritability scores as high as 0.8 have been calculated in older populations, suggesting that genes may have a snowball effect of increasing influence over time.

A heritability score doesn't identify which genes or which aspects of the environment are involved. At this point, other than identifying genes for specific disorders which affect cognitive development (discussed below), we don't have particularly precise ideas of which individual genes are “responsible” for IQ and it seems exceedingly unlikely that there is a single intelligence gene. Intelligence is likely the result of the complex interactions between many genes and many environmental factors.

The role of genes can be seen in genetic and chromosomal abnormalities which are known to affect intelligence and cognitive development. Down Syndrome is a disorder that results from an extra copy of the 21st chromosome (known as trisomy 21) and causes cognitive impairments and other symptoms. Down Syndrome is one of the most common causes of intellectual disability, affecting about 1 in 1,000 births. Down Syndrome results from an error in chromosome division, and thus is genetic but not typically inherited and does not run in families; most cases represent a new occurrence of the chromosomal abnormality rather than one passed down from a parent.

Fragile X syndrome, resulting from repeated triplets of base pairs on the X chromosome, is the second most common cause of cognitive disability, affecting approximately 1 in 5,000 males (and 1 in 10,000 females). It is the most common inherited cause of cognitive impairment (the triplet repeats can expand over successive generations, meaning that the likelihood and severity of impairment can increase within a family over time).

Another case of genetic influence on IQ is Williams Syndrome, a disorder caused by the deletion of a number of genes on chromosome 7, affecting about 1 in 10,000 births. Williams Syndrome is an intriguing example because it highlights the difficulties of defining what “intelligence” means. People with Williams Syndrome tend to score below 70 on IQ tests and have difficulty with everyday tasks such as counting, tying their shoes, or determining left from right. Yet, these same people often show exceptional gifts in areas of language and music which defy our usual expectations for people with low IQ scores and raise questions about how genes may influence different areas of cognitive development.

Another example highlights the importance of the interaction between genes and environment in understanding intellectual development. Phenylketonuria (PKU) is a genetic mutation which prevents the body from making phenylalanine hydroxylase (PAH), an enzyme that breaks down phenylalanine, a common substance found in many foods which can be toxic to neurons. Most people naturally produce this enzyme and as a result, consuming phenylalanine is harmless. Those with PKU, however, accumulate toxic levels of phenylalanine, destroying neurons and stunting their intellectual development. While this sounds entirely genetic, the key idea here is that by controlling the environment (avoiding foods with phenylalanine), the risk of cognitive impairment is greatly reduced. Despite obvious genetic factors, environmental interaction is still fundamental for understanding PKU's influence on development.

Now imagine that instead of a single gene, we had hundreds of genes that each might be involved in intelligence in different ways. Each gene may help or hinder cognitive development, depending on interaction with hundreds of other environmental factors. The same environmental factor could trigger the expression of one gene while inhibiting another. The “best” environment could depend on one's genes, or the “best” genes could depend on features of one's environment. Now we begin to see the mind-boggling level of complexity involved in understanding the role of genes on something as broad as intelligence.

There is not necessarily a single way that genes operate and the environment is almost always fundamental for understanding traits and behaviors, even when specific genes are known to be involved. This applies whether we are trying to understand the role of genes on intelligence, a personality trait, or a mental disorder.

Interpreting Group Differences

We've already seen that individual IQ scores vary. If we were to collect IQ scores from many different people, we might not be surprised to find that some groups of people (such as neurosurgeons) outscore other groups of people (such as janitors). We might even use this information to estimate someone's IQ based on their group membership (so without any other information, I might guess that a neurosurgeon's IQ is fairly high). But we may find ourselves more hesitant when we consider the possibility of group variation in IQ scores according to gender, nationality, or ethnicity. How do these groups differ when it comes to average IQ scores, and how can we interpret these differences?

Potential group differences also complicate the question of exam design, even before we begin to consider the social and political implications of research on this sensitive topic. Our social ideals and our scientific objectivity may occasionally appear at odds with one another. Addressing our scientific curiosity to understand how people differ has the potential to uncover differences we might prefer not to discover. We may also wonder whether we are really uncovering fundamental ways that humans can vary or if our current methods for collecting or analyzing data are flawed.

While we don't want hindsight to reveal that we were blinded by an overly-idealist preconception of complete equality, we also don't want to jump to hasty conclusions that could potentially marginalize entire groups of people or limit their opportunities. Perhaps the best we can do is admit the complexity of the questions we face and proceed with caution.

What do we do when tests reveal differences between groups of people? A big part of this question is tied to atrocities that have been committed by those who have used flawed intelligence research to rationalize racism and legitimize discrimination. The history of group differences in intelligence testing is a dark one, as early researchers combined views of heredity with a distorted interpretation of natural selection in order to promote the concept of eugenics, a belief in the inherent racial superiority of certain groups (the term was actually coined by Sir Francis Galton himself, the Greek prefix eu added to genic to refer to “good genes”).

The eugenics movement suggested that the human gene pool could be improved by only allowing the reproduction of those believed to be superior. This was sometimes also referred to as social Darwinism – a disgrace to Darwin's good name. To be clear, Darwin's notion of natural selection doesn't necessarily suggest inherent superiority or inferiority and the oft-used expression “survival of the fittest” (not coined by Darwin) should refer to organisms best suited to their living environment with the “fittest” referring to the “best fit” not the “most fit”. A particular trait may be advantageous in some environments but not others, so a “best fit” at a particular time isn't necessarily a sign of inherent superiority of that trait.

While eugenicists used early IQ results as evidence of racial inferiority, it's quite likely that bias was to blame for the group differences observed. Robert M. Yerkes (pronounced Yer-keys) formed a committee including Lewis Terman and Henry Goddard to create intelligence tests that were eventually used on over 1.75 million army recruits during the first world war. Many of these recruits were recent immigrants to the United States, and their results were used to place them in ranks and determine their suitability for officer training. There were two tests designed by Yerkes, the Army Alpha test, for literate soldiers, and the Army Beta test, for those who were illiterate.

Ideally, these tests would have been administered in a standardized way, but this was not the case. In “A Nation of Morons,” Stephen Jay Gould suggested that incorrect test forms were often used (giving the Alpha test to those who probably should have taken the Beta) and that procedures weren't standardized for instructions, timing of sections, or even scoring. For the Beta test, Gould noted: “The Beta examination contained only pictures, numbers and symbols. But it still required pencil work and, on three of its seven parts, a knowledge of numbers and how to write them.” As a result, scores were inaccurate, and the tests placed the average mental age of an army recruit at about 13. In this first mass-produced intelligence test, there were really two types of bias occurring: bias in the administration of the exams, and cultural bias in the formation of test questions.

Those who had been living in the United States longer performed better on the Alpha Test, while more recent immigrants struggled with questions that had nothing to do with innate intelligence, such as “Cornell University is at: Ithaca / Cambridge / Annapolis / New Haven” and “'There's a reason' is an 'ad' for a: drink / revolver / flour / cleanser”.

In retrospect, the obvious cultural bias in these questions seems almost comical but the implications were far from funny. Unfortunately, the eugenics movement developed powerful political influence in the United States, and played a role in the adoption of the Immigration Act of 1924, which restricted the immigration of people from southern and eastern European countries. This immigration policy prevented millions of people from escaping the later rise of fascism and its adoption was also seen as endorsement of the “science” of eugenics.

It's tempting to look back and think that the eugenics movement must have been the result of second-rate scientists, but this isn't the case. Eugenics provided a way to rationalize racism and warped the views of some of the most eminent minds, including Nobel Laureate William Shockley, who supported sterilization, and fellow Nobel Laureate Konrad Lorenz, who wrote:

“Usually, a man of high value is disgusted with special intensity by slight symptoms of degeneracy in men of the other race...The selection for toughness, heroism, social utility...must be accomplished by some human institution if mankind, in default of selective factors, is not to be ruined by domestication-induced degeneracy. The racial idea as the basis of our state has already accomplished much in this respect...” (Lorenz, 1940, quoted in Lerner, 1992, pp. 63-64)

This “human institution” referred to artificial selection to refine the gene pool and remove the “bad” genes believed to be holding the human species back. This was done by controlling the reproduction of those believed to be inferior, resulting in tens of thousands of sterilizations in the United States. The mentally ill, those on the fringes of society (such as prostitutes and criminals), members of ethnic minorities, and people with disabilities believed to be of genetic origin had their reproductive rights taken from them forcibly and sometimes without their knowledge.

The widespread adoption of eugenic policies by Nazis during the 1930s is common knowledge, but the early precedents set by American eugenicists and the fact that some sterilization practices continued into the 1970s is less well-known. It's not a pleasant part of the past to ponder, but it should provide us an important warning of the possible ramifications when discriminatory beliefs are allowed to masquerade as science. We should never forget that psychometric tests have been used as justification for discrimination and violation of basic human rights.

While early group differences can be mostly attributed to bias in the administration, scoring, and interpretation of tests, bias doesn't seem to account for group differences found in modern test results. Bias in testing design has been greatly reduced in the past few decades (some might say eliminated) and yet we still find performance gaps on the basis of race and gender. Are modern tests really providing an accurate depiction of differing abilities? What might be causing these performance gaps and how should we interpret them?

Gender Gaps

When we compare male and female performance on IQ tests, we don't find a difference in general intelligence, and average scores are comparable. That said, studies have found that each gender has a tendency to do better on certain types of tasks. In a summary of this research, Diane Halpern noted that males tend to outperform females on tasks involving visual and spatial transformations, some motor skills, fluid reasoning, and abstract scientific reasoning, while females tend to perform better than males on tasks involving semantic information, complex prose, fine motor skills, and perceptual speed of verbal information. Another interesting gender difference is that males tend to be overrepresented at the extremes of the score distribution, and males are more frequently diagnosed with learning disabilities than females (Halpern, 1997). This may be partly explained by the prevalence of disorders such as Fragile X syndrome (discussed above), which is more common in males.

So what do these gender differences mean? Well, first we must remember that these differences are only about group tendencies, not individuals. There are millions of females who can outperform most males in abstract reasoning and millions of males who outperform most females in production and comprehension of complex language. Even so, we may wonder what is causing these overall group differences.

Addressing this question is complex because being male or female affects a huge number of other variables that may play a role in intelligence development. It's not just that males and females have different genes, different hormones, different levels of risk-taking, or that they spend different amounts of time on different activities; it's also relevant that cultural expectations and preferences exert pressure on the options and opportunities available for males and females. In fact, countries which are more gender-neutral tend to have smaller performance gaps. With all of these factors combined, it's clear that observed differences in task performance are not simply genetic and are the result of a complex biopsychosocial process which is not yet fully understood.

Race Gaps

When comparing the IQ distributions of different racial groups, researchers have found consistent performance gaps. Might these differences in average performance be caused by genetic differences between races?

In their book The Bell Curve, Richard Herrnstein and Charles Murray (1994) suggested that genetic factors could be responsible for observed racial differences in IQ and proposed using intelligence scores in the formation of social policies. Subsequent critiques challenged many of Herrnstein and Murray's assumptions, as well as their statistics and data analysis. In response to the controversy, a series of statements signed by a group of intelligence researchers was published in the Wall Street Journal; it acknowledged that the bell curve for black Americans was centered approximately 15 points lower than the curve for whites, with Hispanics falling somewhere between the two, and the curves for Jewish and Asian groups perhaps somewhat higher (Arvey et al., 1994).

This response was criticized for appearing to support some of Herrnstein and Murray's claims, and the American Psychological Association created a task force to report on the current state of intelligence research. This report (Neisser et al., 1996) clarified that the causes of group differences are not known and emphasized the possible roles of socioeconomic status and culture, noting that even when some within-group differences are genetic, this does not mean that genes explain average differences between groups. In addition, the report noted that disadvantaged groups in caste-like systems tend to have lower scores than dominant groups, even when these groups are not considered racially distinct.

Some evidence indicates that the 15-point Black/White gap has closed somewhat since the publication of The Bell Curve. More recent estimates put the gap around 10 points and suggest it may continue to close (Dickens and Flynn, 2006). James Flynn has also suggested that the higher average estimates for Asian populations were miscalculated due to improperly normed tests and nonrepresentative samples. Nevertheless, at present, performance gaps persist. What should we make of these differences?

What is Race?

A fundamental assumption behind considering group differences by race is that race is a category that can meaningfully be used to group people. But just how useful is a racial category like “black” or “white”? In terms of genetics, skin color certainly can't tell the whole story, especially when we consider that dark-skinned Africans have more in common genetically with light-skinned Europeans than they do with similarly dark-skinned Australian Aborigines.

While there are genetic differences between people that lead to what we might identify as “race”, these differences are far from clear-cut. Ancestries get complicated quickly and the lines between supposedly different races are blurred. Racial categorizations aren't made through any sort of genetic evaluation; they're often self-selected by an individual checking a box on a form. When it comes to what being “black” means, it's worth noting that "Almost all Americans who identify themselves as black have white ancestors" (Arvey et al., 1994). The same can be said of many other supposedly distinct racial groups. As Sternberg, Grigorenko, and Kidd noted in their review of intelligence, race, and genetics: "Race is a socially constructed concept, not a biological one. It derives from people's desire to classify."

Environmental Forces

As we saw with gender, performance gaps may also be influenced by differing cultural pressures. For example, the fact that Asian students routinely outscore their white American counterparts in mathematics might suggest genetics at first glance, but when we consider that Asian students spend approximately 30% more class and homework time on math, we might be more likely to consider the gap a result of environmental differences.

James Flynn has found that IQ scores have risen dramatically over the past several decades, with the average score gaining approximately 0.3 points each year, or 3 points per decade, a phenomenon referred to as the Flynn EffectFlynn EffectThe documented rise in average IQ scores across generations in many countries, likely reflecting environmental improvements.. Tests are periodically renormed so that the average stays at 100, even though raw performance keeps improving. This means that a person who scores 100 today would score above average compared to earlier test-takers. If we extrapolate from this increase, however, we might wonder whether we are now a nation of geniuses, or whether our ancestors really were a nation of morons. Are we really getting this much smarter this quickly?
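To put that rate in concrete terms, here is a small back-of-the-envelope sketch (my own illustration, assuming the commonly cited rate of roughly 3 points per decade holds constant over the period in question):

    # Back-of-the-envelope Flynn Effect arithmetic.
    # Assumes a constant gain of roughly 3 IQ points per decade, which is a simplification.
    GAIN_PER_DECADE = 3.0

    def score_on_old_norms(score_today, decades_ago):
        """Estimate how a present-day score would look against norms set decades_ago decades in the past."""
        return score_today + GAIN_PER_DECADE * decades_ago

    # An average score of 100 today corresponds to roughly 115 on norms from 50 years ago,
    # a full standard deviation above that earlier average.
    print(score_on_old_norms(100, 5))  # -> 115.0

Run in reverse, the same arithmetic implies that an average person from several decades ago would land well below 100 on today's norms, which is the sort of implausible-sounding conclusion that leads researchers to attribute the gains to environmental change rather than a sudden surge in innate ability.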

What could account for this increase? It's probably not genes; it's far too rapid for higher IQ to be selected for. More likely, it's a number of other factors. Improvements in nutrition are certainly part of the explanation. This can be seen in developing countries; as nutrition improves and fewer children are malnourished, average IQ rises. It has even been suggested that the introduction of iodine into table salt was responsible for several points of average IQ increase in the United States by preventing cases of iodine deficiency disorder and the intellectual impairment it causes.

Education is another factor. Most people today have far more education than was common just a few decades ago. Flynn has noted that, in addition to more time spent in school, the type of education we receive has also changed. Modern schools place great emphasis on abstract reasoning, something that wasn't as important in the past. Many people in previous generations weren't asked to reason through purely hypothetical situations in order to make decisions, but young children today are trained to do this day after day, year after year.

When we see how environmental forces can create gaps between groups of people over time, it becomes easier to see how environmental influences could be related to the group performance gaps we observe today. It's probably not the case that the smart are getting smarter, but rather that lower-scoring individuals who were previously holding the average down (due to factors like poor nutrition, lack of educational opportunity, and testing bias) now have far greater opportunities to demonstrate their true potential. Evidence also suggests that the Flynn Effect has been slowing in recent years, but perhaps this can be seen as a triumph of the greater equalization of opportunity. Those who had previously been held back have made significant gains and we are now closer to a level playing field than ever before.

Expectations Matter

Perhaps gender- or race-based expectations are partially responsible for some observed group differences. Teachers may create self-fulfilling prophecies in which expectations actually cause the expected results. In a study reported by Robert Rosenthal and Lenore Jacobson in “Pygmalion in the Classroom”, all students at a California elementary school took an IQ test at the beginning of the school year. Teachers were then given a list of the 20% of students who were identified as “spurters”: children expected to make great progress in the coming year. At the end of the year, all students completed another IQ test, and the 20% labeled “spurters” did show more improvement than the other students.

The twist was that these 20% had actually been randomly selected at the beginning of the year. Just by believing that these students would improve, teachers managed to create this improvement. When teachers believe students are gifted they may raise expectations, call on them in class more often, and encourage them more persistently, all of which may turn that expectation into reality. What if teachers believed that every student would make a breakthrough? Why shouldn't we expect all students to make improvements?

The fact that students experience different environments is also seen in the “tracking” that many schools practice. Early on, students are placed into tracks based on performance. This might sound appropriate (similar to what Binet was trying to accomplish), but schools may have a tendency to focus limited resources on the higher tracks and view the lower tracks as lost causes. When we try to estimate the long-term consequences of high IQ in young children, it can be difficult because we have created situations in which the good get better and the bad get worse. When it comes to outcomes like occupational status, educational achievement, or income later in life, it may be that children with high early IQs tend to do better because of additional educational opportunities and resources. Students who struggle initially may get the message that school is not for them, closing doors that could have been opened with a little more support.

Programs like Head Start seek to improve opportunities for children and enhance their cognitive development. There's evidence that Head Start does have a positive effect and can indeed boost IQ scores. Critics of the program, however, point out that these gains seem to dissipate over time, suggesting that the program isn't able to create permanent change. But maybe we shouldn't expect lasting improvement from a time-limited intervention. After all, if we remove the enriched environment, perhaps we shouldn't be surprised that the benefits fade. We wouldn't expect a body that was once well nourished to maintain itself indefinitely in the face of later malnutrition or starvation. We shouldn't think of enriched environments as one-shot approaches to boosting IQ, but as long-term interventions that need to be maintained.

This relates to another problem in IQ research, which is the assumption that abilities are fixed and stable. IQ improvements which are only temporary are often dismissed, implying that intelligence is stable and short-term changes represent error rather than actual fluctuation. This message that IQ doesn't change is one that many students receive from an early age, causing them to believe that their IQ is a set number and there's nothing they can do about it. This might seem fine for those who do well, but for students who initially score poorly it may reduce motivation to work hard.

Carol Dweck has contrasted a Fixed Mindset with a Growth Mindset, based on whether people believe that change is possible. She has shown how these two mindsets influence behavior when it comes to effort and problem-solving. Those who hold a fixed mindset believe that abilities are stable and innate, and they tend to avoid hard work, even if they believe their own ability is high. When ability is believed to be high, it can be accompanied by a fear of failure: people with fixed mindsets who initially perform well may avoid future challenges because failure could indicate that they weren't really “smart” in the first place. With a growth mindset, in which IQ is seen as malleable and improvable through effort, the possibility of receiving a lower score is not so frightening. A low score represents a temporary setback rather than a fixed part of one's identity.

Situations Matter

What about gaps that are found even when environments are similar? Before thinking that this must mean genes, we may wonder if subtle differences in testing situations are causing these gaps. Claude Steele and Joshua Aronson had black and white undergraduate students at Stanford answer tough GRE questions. These students were told that the questions would assess their innate intelligence. Results showed that white students outperformed black students by a significant margin, even when adjusted for SAT score. The race gap seemed to be alive and well.

In a second condition, however, students were given the same GRE questions, but were told that this was a problem-solving task (rather than an assessment of intelligence). Now the race gap between SAT-adjusted scores disappeared and black students performed comparably to white students.

Steele and Aronson proposed a theory of stereotype threatstereotype threatThe performance-disrupting concern that one's behavior will confirm a negative stereotype about one's group.. Stereotype threat suggests that when people are placed in situations where there is the possibility of confirming a negative stereotype about their group, this creates additional anxiety which affects performance on the task. An IQ test may cause stereotype threat for black students and this means that the experience of taking an IQ test is different for black and white students.

Steele has referred to these circumstances as “identity contingencies”, suggesting that we all encounter situations where some aspect of our identity matters. Whether we are black students taking an IQ test, females taking a math exam, or older adults visiting a youthful start-up, there are circumstances in which possible negative stereotypes about our identity are relevant and these can influence behavior. Even if we don't believe these stereotypes, our knowledge of their existence can be enough to disrupt focus and impede performance.

This concept of how our identity influences us has direct implications for how we think about the way the world works. Stereotype threat isn't just about our ethnicity or our gender; it's about the fundamental nature of how our minds process social information. It shows how subtle cues in the environment can have surprising effects on behavior and performance, and, on the positive side, how minor changes can break down these nearly invisible barriers and help bring out everyone's best performance.

Steele and Aronson's work also demonstrates that even if there is no bias in the administration of IQ testing, stereotypes and biases could still be influencing results. Neither the white nor black students needed to actually believe stereotypes regarding race and intelligence. The test administrators and instructions were not intentionally or overtly prejudiced. Awareness of possible stereotypes combined with a scenario which makes them relevant can be enough to negatively affect performance. Non-prejudiced people with the best of intentions may inadvertently be creating situations with subtle but powerful identity contingencies. You can have a prejudice-free test administrator giving a reliable and unbiased test to people who do not believe the negative stereotypes associated with their group and yet these stereotypes can still be responsible for negative effects on their scores. For more on this, I highly recommend reading Claude Steele's book, Whistling Vivaldi: How Stereotypes Affect Us and What We Can Do.

Smart Shortcuts?

It's inspiring to see environmental interventions that can help to raise group averages overall and reduce performance gaps, but what about your individual score? Are there ways of quickly increasing an individual's intelligence or improving cognitive performance?

Stimulation

Two newer approaches for temporarily modifying the brain's activity are transcranial direct current stimulation (tDCS) and transcranial magnetic stimulation (TMS). These technologies can temporarily influence activity in particular brain regions, nudging neurons toward depolarization or hyperpolarization and thereby changing their likelihood of firing. tDCS delivers a weak electrical current through electrodes placed on the scalp, while TMS uses a magnetic coil held over the skull to induce currents in the underlying tissue. There's some evidence that stimulating different brain regions can enhance or inhibit certain skills for a short period of time, possibly including some cognitive abilities and motor skills, though this research is still in its initial stages.

“Cognitive Enhancers”

Another avenue for attempting to alter cognitive performance is pharmaceutical intervention in the form of drugs, most of which were created for the treatment of attention- and wakefulness-related disorders like ADHD and narcolepsy. Drugs like methylphenidate (Ritalin), modafinil (Provigil), and mixed amphetamine salts (Adderall) are increasingly used by students and workers attempting to increase concentration and reduce mental fatigue, allowing them to work or study for longer hours. Off-label and recreational use can be dangerous, and it's important to remember that these drugs may be addictive and have potential side effects, particularly when taken without medical supervision.

In addition to the prescription drugs above, there are other substances known as “nootropics” which are claimed by some to offer cognitive benefits. These include a class of drugs known as racetams, as well as nutritional supplements like fish oil, B vitamins, ginseng, and others, all purportedly providing a cognitive edge. While some of these supplements may have benefits, they are part of an industry where limited regulation means that marketing can overshadow merit, and the placebo effect may convince users of an efficacy that isn't really there.

Training Your Brain

In 2008, a study by Susanne Jaeggi (pronounced YAH-kee) and colleagues raised the possibility that fluid intelligence could be improved through training on the dual-n-back task, a working memory task in which you must indicate whether the current visual and auditory stimuli match those presented n steps earlier in the sequence. If you'd like to try the dual-n-back task, you can play a free version at www.soakyourhead.com. In recent years other “brain training” games and apps have gained tens of millions of users, most notably through subscription sites like Lumosity, BrainHQ, and CogMed. Some question whether these sites offer more hype than science, and evidence that improved performance transfers to other tasks is weak or nonexistent. Some of these sites have even claimed their games can stave off cognitive decline (leading to fines for deceptive advertising from the FTC), though it's not certain these tasks are any more effective than real-world cognitive challenges like reading, solving puzzles, or learning a new language or musical instrument.
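To make the mechanics of the task concrete, here is a minimal sketch of the matching rule behind an n-back stream (my own illustration, not the software used in the Jaeggi study): a trial counts as a target whenever the current stimulus matches the one presented n steps earlier, and the dual version simply runs this check over a visual stream and an auditory stream at the same time.

    # Minimal n-back matching logic (an illustrative sketch, not the Jaeggi et al. task software).
    def n_back_targets(stimuli, n):
        """Return the indices at which the current stimulus matches the one n steps back."""
        return [i for i in range(n, len(stimuli)) if stimuli[i] == stimuli[i - n]]

    # Dual n-back applies the same rule to two independent streams simultaneously.
    positions = ["A1", "B2", "A1", "C3", "A1", "B2"]   # visual stream: highlighted grid squares
    letters   = ["k",  "m",  "k",  "m",  "t",  "k"]    # auditory stream: spoken letters

    print(n_back_targets(positions, 2))  # [2, 4] -> positions matching two steps back
    print(n_back_targets(letters, 2))    # [2, 3] -> letters matching two steps back

As n increases, the number of items that must be held and continuously updated in working memory grows, which is what makes the task so demanding.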

The Flynn Effect — average IQ scores have risen across generations in many countries, likely reflecting improved nutrition, education, and more abstract thinking demands.

Before you commit yourself to hours of daily dual-n-back in hopes it will raise your grades, you may want to consider a more focused approach. While we don't yet have much evidence brain training improves performance on other tasks, we do know that practicing specific tasks can lead to improvements in those skills. In other words, if you're hoping to boost your intelligence because you think it would make calculus class easier, you're probably better off just spending that time working on calculus problems.

In addition to spending more time on the types of activities you'd like to improve at, there are a few other simple interventions that can help you to maximize your performance on a wide range of tasks. You've undoubtedly heard them all before, probably from your mother. Proper sleep (deprivation is known to reduce performance on cognitive and physical tasks), regular exercise, and a healthy diet are far more likely to help you than any games or nootropic powders. When it comes to reaching your fullest potential, my advice is get your sleep, exercise, and nutrition on track, maintain a growth mindset, and then just put in the time on the tasks you want to improve.

Chapter Summary

Key takeaways — Chapter 9
  • Theorists have considered a single factor (g), multiple intelligences, practical intelligence, and emotional intelligence in their definitions of intelligence.
  • The most commonly used intelligence assessments today include the Wechsler Adult Intelligence Scale and the Stanford-Binet Intelligence Scale.
  • Scores on intelligence tests tend to follow a normal distribution with a mean of 100 and a standard deviation of about 15 points. A score below 70 (more than two standard deviations below the mean) is considered a sign of intellectual disability, while a score above 130 is considered evidence of intellectual giftedness; see the short calculation after this list.
  • The heritability estimate for intelligence seems to be around 0.5, though this may vary. Though specific genes for intelligence have not been identified, genetic influences on cognitive ability can be seen in some disorders, including Down syndrome, Fragile X syndrome, and Williams syndrome.
  • Gender and race-based performance gaps in intelligence have been observed. Some interventions have successfully reduced these gaps, which probably result from a complex interaction of biological, environmental, and social forces.
  • Average intelligence has risen over the past several decades, though this Flynn Effect appears to be slowing. The possibility of individual IQ improvement from training, supplements, and brain stimulation has received some attention recently, though the efficacy of any of these approaches remains unclear.
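
As a quick illustration of what those score cutoffs mean in practice, the short calculation below (a sketch using Python's standard library, with the usual mean of 100 and standard deviation of 15) estimates the share of the population falling more than two standard deviations from the mean:

    from statistics import NormalDist

    # IQ scores are normed to a mean of 100 with a standard deviation of about 15.
    iq = NormalDist(mu=100, sigma=15)

    below_70 = iq.cdf(70)         # proportion scoring more than 2 SDs below the mean
    above_130 = 1 - iq.cdf(130)   # proportion scoring more than 2 SDs above the mean

    print(f"Below 70:  {below_70:.1%}")   # about 2.3%
    print(f"Above 130: {above_130:.1%}")  # about 2.3%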

Review Questions

Chapter 9 — Intelligence
10 multiple choice questions