How legitimate is the Flynn effect for the gifted?

The Flynn Effect in Gifted Samples: Status as of 2007

by John D. Wasserman
George Mason University

The rise in cognitive and intellectual test scores over at least three generations has been termed the Flynn effect (Herrnstein & Murray, 1994). First reported in 1984, the Flynn effect refers to a robust pattern of massive IQ gains over time and across nations (Flynn, 1984, 1987, 1994, 1999). For IQ tests, the rate of gain is about 0.3 IQ points per year (or 3 IQ points per decade), roughly uniform over time and similar for all ages (Flynn, 1999). Flynn (2006) clearly intends that his finding of IQ gains over time should be applied to scores in all parts of the score distribution, from very low to average to very high. He even advocates that the Flynn effect, which was derived from large group studies, be used to generate corrected scores for individual test findings, even though such corrections are likely to contain much more error than accurate prediction.
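To make the arithmetic of such a correction concrete, here is a minimal sketch of a linear adjustment based on the roughly 0.3 points per year rate cited above. The function name, its signature, and the example score are illustrative assumptions, not a procedure taken from Flynn's papers, and the sketch says nothing about whether the adjustment is valid for any particular individual or ability level.

```python
def flynn_adjusted_iq(observed_iq, years_since_norming, rate_per_year=0.3):
    """Subtract the expected norm-obsolescence gain from an observed IQ score.

    rate_per_year defaults to the approximate 0.3 points per year cited in
    the text. The name and signature are hypothetical, for illustration only.
    """
    if years_since_norming < 0:
        raise ValueError("years_since_norming must be non-negative")
    return observed_iq - rate_per_year * years_since_norming


# Hypothetical case: a score of 130 on a test normed 20 years before testing.
adjusted = flynn_adjusted_iq(130, 20)
print(round(adjusted, 1))
```

In this hypothetical case, 20 years at 0.3 points per year removes 6 points, so an observed 130 becomes an adjusted 124. Near a cutoff such as 130, an adjustment of that size can change a classification, which is why its accuracy for individuals matters.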

My January 2007 examination of psychological research databases suggests that the Flynn effect has not yet been adequately demonstrated for all levels of ability; there is some support for its validity with low ability individuals (e.g., those with intellectual disabilities or learning disabilities), but there is no substantive evidence for its validity with high ability individuals (particularly those who are intellectually gifted). Evidence supporting the Flynn effect has been reported for the mild mentally retarded range as well as the borderline range, with IQ gains over time similar in magnitude to those Flynn found in the middle of the IQ distribution (Kanaya, Scullin, & Ceci, 2003). The idea that IQ gains are concentrated in the lower half of the distribution was asserted by the researchers who originally coined the term (Herrnstein & Murray, 1994). The impact of the Flynn effect upon classification of individuals with learning disabilities has recently been addressed by Truscott and Volker (2005; see also Sanborn, Truscott, & Phelps, 2003), although their research is based on LD diagnosis using IQ tests (a practice that has been seriously challenged in recent years). I have yet to see any sound empirical studies of the Flynn effect in gifted samples.

In his insightful critique of the Flynn effect, Rodgers (1999; see also Rowe & Rodgers, 2002) notes that a change in the mean of a distribution of IQ scores does not identify which (sub)groups in the distribution actually experienced change. For example, IQ gains over time can be produced by improving harmful environments (e.g., through improved education and better nutritional practices) for low-IQ individuals at risk. Some evidence supports the contribution of low-IQ individuals to rising test scores: In large samples of Danish draft-age males, for example, Teasdale and Owen (1989) reported that test score gains appear to be concentrated among low ability examinees.
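Rodgers's point can be illustrated with simple arithmetic. The sketch below, which uses made-up proportions and gains rather than figures from any cited study, shows two hypothetical populations with the same overall mean gain: one in which everyone gains equally, and one in which only an at-risk quarter of the population gains at all.

```python
def overall_mean_gain(subgroups):
    """Return the population mean gain from (proportion, gain) pairs.

    Each pair gives a subgroup's share of the population and its mean gain
    in IQ points. All values used below are hypothetical.
    """
    total = sum(proportion for proportion, _ in subgroups)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("subgroup proportions must sum to 1")
    return sum(proportion * gain for proportion, gain in subgroups)


# Scenario A: every examinee gains 3 points.
uniform = [(1.0, 3.0)]
# Scenario B: a low-scoring 25% gains 12 points; the other 75% gain nothing.
concentrated = [(0.25, 12.0), (0.75, 0.0)]

print(overall_mean_gain(uniform), overall_mean_gain(concentrated))
```

Both scenarios yield a mean gain of 3 points, so the mean alone cannot distinguish them. Under the second scenario, applying the population-wide gain to a high-scoring individual would adjust for a change that individual's subgroup never experienced.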

Rowe and Rodgers (2002) also note that changes in the variance of a distribution of scores can mimic changes in means. A gain in IQ scores over time can be produced by decreasing the variability of scores at the low ability end of the distribution, or alternatively by increasing the variability of scores at the high ability end of the distribution.
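The first of these mechanisms can be sketched numerically. The example below assumes a normal distribution with mean 100 and standard deviation 15, approximates it with evenly spaced quantiles, and then shrinks the spread of below-average scores by a hypothetical 20 percent. The shrinkage factor and the helper function are illustrative assumptions, not estimates from Rowe and Rodgers or any other cited source.

```python
from statistics import NormalDist, mean


def compress_low_end(scores, center=100.0, factor=0.8):
    """Pull scores below `center` toward it by `factor`; leave the rest as is."""
    return [center - factor * (center - s) if s < center else s for s in scores]


# Deterministic stand-in for a large sample: evenly spaced normal quantiles.
n = 10_000
iq = NormalDist(mu=100, sigma=15)
before = [iq.inv_cdf((i + 0.5) / n) for i in range(n)]
after = compress_low_end(before)

gain = mean(after) - mean(before)
top_before = [s for s in before if s >= 100]
top_after = [s for s in after if s >= 100]

print(f"mean gain: {gain:.2f}")
print("upper half unchanged:", top_before == top_after)
```

Under these assumptions the overall mean rises by roughly 1.2 points even though no score at or above 100 moves, which is the sense in which a variance change confined to one end of the distribution can look like a general gain.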

The Flynn effect was originally derived from research with tests that are good measures of general intellectual ability (Spearman’s g factor), such as Raven’s Progressive Matrices. If the effect does indeed reflect gains in g over time, then Kane and Oakland (2000) may have found an explanation for its relative absence in the highly gifted. Kane and Oakland reported that on the Wechsler intelligence scales, g accounts for less variance in high IQ individuals than low IQ individuals. Based on the argument advanced by Rowe and Rodgers (2002) that changes in the variance in selected parts of a distribution can account for the appearance of mean score changes in IQ, the Kane and Oakland finding may explain why extremes in the intelligence test score distribution do not all show the same Flynn effect.

Demonstrating the Flynn effect in high ability samples requires tests that measure comparable constructs and have adequate ceilings, along with samples that are free from selection bias across cohorts, all of which pose substantial challenges. To further complicate matters, some recent findings suggest that the Flynn effect has run its course and that general population IQ gains over time are reaching plateaus (Teasdale & Owen, 2005; Sundet, Barlaug, & Torjussen, 2004).


John D. Wasserman, Ph.D., Associate Professor of Psychology at George Mason University, is an educator, practitioner, and researcher in psychology. After earning his Ph.D. in clinical psychology from the University of Miami, he completed a fellowship in clinical neuropsychology at Louisiana State University and Tulane Medical Centers in New Orleans. He developed and directed a pediatric neuropsychology service at Children’s Hospital in New Orleans, before spending 8 years in the test publishing industry, directing the development of psychological tests to measure cognitive, neuropsychological, social and emotional functioning—including the Stanford-Binet Intelligence Scale (5th edition). From 2001 to 2005, Dr. Wasserman directed the George Mason University Gifted Assessment Program in Fairfax, Virginia.

