Test names:
Reynolds Intellectual Assessment Scales (RIAS)
Snijders Oomen Nonverbal Intelligence Test (SON-R 6-40)
Intelligence and Development Scales (IDS)
Wechsler Intelligence Scale for Children, 4th edition (WISC-IV)
Culture Fair Intelligence Test Scale 2 (CFT 20-R)
Test scores:
Mean test scores were similar, ranging between 100.40 (RIAS) and 102.96 (WISC-IV). Individual children’s scores, however, differed significantly in 12% to 38% of cases, depending on the tests compared. The greatest discrepancy was observed between the SON-R 6-40 and the WISC-IV, with 38% of children showing discrepant scores. These differences were larger than expected, given that the total intelligence test score is an aggregate and is therefore regarded as a more reliable measure than the score on any individual subtest.
Additional analyses revealed that differences in test scores did not depend on which test was used, but rather on unexplained errors and the interaction between the examinee and the test situation. The reliability of intelligence test scores was greater when two or more tests were combined. Out of 10 possible combinations of two tests, 5 met or exceeded the reliability index of 0.80: RIAS and IDS, RIAS and WISC-IV, SON-R 6-40 and IDS, SON-R 6-40 and CFT 20-R, and IDS and WISC-IV.
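The count of 10 follows from choosing 2 tests out of 5. The short Python sketch below enumerates the pairs and separates the five reported as reaching 0.80 from the rest. The variable names are illustrative, and the only data used are the test names and pairings stated above; no reliability values are assumed.

```python
from itertools import combinations

# The five tests compared in the study, as listed above.
tests = ["RIAS", "SON-R 6-40", "IDS", "WISC-IV", "CFT 20-R"]

# Every unordered pair of two distinct tests: C(5, 2) = 10.
pairs = list(combinations(tests, 2))

# Pairs the text reports as meeting or exceeding the 0.80 reliability index.
reported = {
    frozenset(("RIAS", "IDS")),
    frozenset(("RIAS", "WISC-IV")),
    frozenset(("SON-R 6-40", "IDS")),
    frozenset(("SON-R 6-40", "CFT 20-R")),
    frozenset(("IDS", "WISC-IV")),
}

meets_threshold = [p for p in pairs if frozenset(p) in reported]
below_threshold = [p for p in pairs if frozenset(p) not in reported]

print(len(pairs), "pairs in total")
for first, second in below_threshold:
    print("not reported as reaching 0.80:", first, "and", second)
```

By elimination, the pairs not reported as reaching the index are RIAS with SON-R 6-40, RIAS with CFT 20-R, SON-R 6-40 with WISC-IV, IDS with CFT 20-R, and WISC-IV with CFT 20-R.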
Summary:
The study by Hagmann-von Arx et al. (2016) investigated the comparability of results from five different intelligence tests in typically developing children. The research sought to determine whether these tests produce similar results and whether they can be considered reliable measures of intelligence.
The authors found that although the mean scores of the different tests were similar and strongly correlated, indicating that they measure a similar underlying construct (general intelligence), children’s individual scores varied considerably between tests. This variation was attributed to unexplained errors and the interaction between the individual and the testing situation rather than to systematic differences between the tests themselves.
Test reliability:
The study suggests that the reliability of a single intelligence test is limited, with a considerable margin of error. However, reliability increases significantly when two or more tests are combined. Therefore, the study highlights the importance of using multiple tests for a more accurate and reliable assessment of intelligence, especially in high-stakes situations such as important educational or career decisions.
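One way to see why combining tests can raise reliability is the classical test theory formula for a composite of two standardized scores. The sketch below is an illustration under stated assumptions, not the index computed in the study: it assumes equal weighting, equal score variances, and uncorrelated measurement errors. The function name and the input values (0.70, 0.70, 0.60) are hypothetical and do not come from the paper.

```python
def composite_reliability(r_a: float, r_b: float, r_ab: float) -> float:
    """Reliability of the unit-weighted sum of two standardized scores.

    r_a, r_b: reliabilities of the two individual tests (hypothetical inputs).
    r_ab:     observed correlation between the two tests.

    Assumes classical test theory with uncorrelated errors, so the variance
    of the sum is 2 + 2*r_ab and its true-score part is r_a + r_b + 2*r_ab.
    """
    if not (0.0 <= r_a <= 1.0 and 0.0 <= r_b <= 1.0):
        raise ValueError("reliabilities must lie in [0, 1]")
    if not (-1.0 < r_ab <= 1.0):
        raise ValueError("correlation must lie in (-1, 1]")
    return (r_a + r_b + 2.0 * r_ab) / (2.0 + 2.0 * r_ab)


# Hypothetical values chosen only to illustrate the effect.
single = 0.70
combined = composite_reliability(single, single, 0.60)
print(f"single test: {single:.2f}, two tests combined: {combined:.4f}")
```

With these made-up inputs, each test on its own falls below 0.80 while the composite rises above it, which matches the direction of the effect described above.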
In summary, the study does not claim that individual tests are completely reliable, but emphasizes that combining different tests can increase the reliability of intelligence assessment. The research also highlights the need for caution in interpreting individual test results, due to individual variability in scores.
Hagmann-von Arx, P., Lemola, S., & Grob, A. (2016). Does IQ = IQ? Comparability of intelligence test scores in typically developing children. Assessment, 23(6), 688-700.