A Statistical Study of the Responses of a Group of Normal Children to the Individual Tests in The Stanford-Revision of The Binet-Simon Scale

Author:

Magda Skalet

Research Associate, Brush Foundation, Western Reserve University

It is a recognized fact that the intelligence quotient and mental age, either separately or together, constitute an altogether inadequate description of any child’s mental equipment, and yet these are usually the only data that are utilized for classification or analytical purposes. According to Bronner, Healy, Lowe, and Shimberg (1), a numerical evaluation is very limited in its revelations. As put together, the Stanford-Binet forms a hodge-podge of tests, and the values given for success in each are quite arbitrary.

Thorndike (7) criticizes the scale for ambiguity of content, arbitrariness in units, and ambiguity in significance. What the individual tests purport to measure, what types of tests are included in the series, and what the differences are between the responses made by children of various ages and mental levels, are questions that have been almost wholly neglected. Although there have been numerous investigations made of the responses of normal and feebleminded children of a limited age range to the individual tests in the Stanford Revision of the Binet-Simon Scale, there has been no detailed analysis of the responses of a group of normal children to these tests. McFadden (4) and Wallin (9) have given comprehensive reviews of the literature concerning this question of success on the sub-tests in the scale. They have compared the relative difficulty of the tests for various grades of mental development and found that many of the tests were misplaced in the series according to the mental ages gained by the children, not only in the sub-normal group, but in the normal as well. Some were too difficult, and others too easy for the general year-level in which they were placed. Specific comparisons between the results of these two investigators and those obtained in this study will be made later.

In the present study, the following specific problems were considered: the responses of children of various mental ages to the individual tests in the series, types of test according to the mental ability measured, a comparison of the percentage of each type passed by children of various chronological ages, mental ages and intelligence quotients, and a determination of the range of the tests passed according to these three criteria.

The data were obtained from the mental test records taken by the Bureau of Educational Experiments, New York City, of children enrolled in the City and Country School. The children came mostly from the professional and managerial classes, and hence formed a rather highly selected group. A very rigid selection from the Bureau’s records was made in order to insure accuracy of the data to be analyzed. The tests had been given from 1919 to 1929 by three competent psychologists. Success in the individual tests, the chronological and mental ages, and the intelligence quotients were checked for each test. In Table I is given the distribution of the tests included in this study.

Table I
The Number of Tests Included in the Study

No. of tests per child                  1      2      3      4      5   Total
Total no. of children                 133     26     15     10      5     189
Total no. of tests                    133     52     45     40     25     295
Mean of deviations between highest
  and lowest I.Q. for each child              8.2   17.6   16.5   17.4

There were a total of 189 children included in the study, to whom 295 tests had been given. No attempt was made to differentiate between the first and later tests in the analysis.

The distribution of the tests according to C.A., M.A., and I.Q., is given, in another connection, in Table IV. The range in chronological age is from 2 years and 10 months to 13 years and 7 months, with a mean of 7 years and 1 month. The mental age range is from 3 years to 14 years and 10 months, with a mean of 8 years and 2 months. The I.Q. range is from 79 to 165, with all but 15 cases falling between 90 and 148, with a mean for the whole group at 115.2. The group is thus somewhat superior.

It would, no doubt, have been better to determine upon the type of sample desired before giving the tests so as to obtain a more representative normal group, and to use only the first test on each child, but since all of the data had been collected before this study was contemplated, it was thought advisable to retain all of the records available. The number of cases does not provide an adequate basis for a conclusive study, but provides material for indicating the importance of an analysis of the individual tests and each child’s performances in any general test such as the Stanford-Binet. A comparison of the intelligence quotients obtained indicates several marked differences in repeated tests. Two samples are given to indicate extreme and slight variation in repeated tests.

             C.A.    M.A.    I.Q.
Child A.     3-4     4-0     120
             5-3     8-8     165
Child B.     3-0     3-0     100
             4-6     5-0     111
             5-5     5-6     102
             6-6     7-4     113
             7-8     8-0     105

The mean of the deviations between the highest and lowest I.Q. for each child is also presented in Table I. For the 26 children given two tests the mean deviation is 8.2 points, while for those given 3, 4, and 5 tests the mean range is approximately 17 points in each case. According to Terman (5) repeated tests show a change of not more than 8 points, and only 4 points on the average. In a later book (6), however, he cites positive changes of as much as 49 points, and negative changes of 36 points and less. Gates (3) reports deviations of plus 18 to minus 12 with an average change of 6 points in consecutive tests.
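The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, assuming the conventional ratio I.Q. (mental age divided by chronological age, times 100) and ages written as years-months; the helper names `months`, `ratio_iq`, and `iq_range` are hypothetical, and the ages and I.Q.s are those read from the two sample records for Child A and Child B.

```python
def months(age: str) -> int:
    """Convert an age written as 'years-months' (e.g. '5-3') to months."""
    years, extra = age.split("-")
    return int(years) * 12 + int(extra)


def ratio_iq(ca: str, ma: str) -> float:
    """Ratio I.Q. (assumed formula): mental age over chronological age, times 100."""
    return 100 * months(ma) / months(ca)


def iq_range(iqs: list[int]) -> int:
    """Deviation between the highest and lowest I.Q. obtained by one child."""
    return max(iqs) - min(iqs)


# Child A, two tests: (C.A., M.A., reported I.Q.).
child_a = [("3-4", "4-0", 120), ("5-3", "8-8", 165)]
# Child B, five tests; only the reported I.Q.s are needed for the range.
child_b_iqs = [100, 111, 102, 113, 105]

for ca, ma, reported in child_a:
    # Recomputed I.Q. next to the reported one.
    print(ca, ma, round(ratio_iq(ca, ma)), reported)

print(iq_range([iq for _, _, iq in child_a]))  # 45
print(iq_range(child_b_iqs))                   # 13
```

Averaging such per-child ranges over all children given the same number of tests would yield the mean deviations reported in Table I.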

The problem of variations in intelligence ratings of children in repeated tests is by no means solved. Our results merely point out that great variations in the I.Q.s obtained by the same children at different times occur frequently enough to warrant further study of the alleged constancy of this measure. The causation of these changes is not entered into here because of lack of the necessary data, but it is of very great importance in studying individual children.

In standardizing the Binet-Simon test, Terman (5) placed the tests at those ages where approximately two-thirds of the children of a given chronological age passed them, and also so as to obtain a normal distribution of mental ratings around the norm for each age group. The actual distribution of these percentages has never been published, but studies reporting the results secured by other investigators show that there is an unequal proportion of children passing the individual tests in each age series. In Table II is presented the percentage of subjects in the different mental age groups passing the sub-tests in the Stanford-Binet. Since the alternates were not given in the majority of cases, they were not included in the table. From those that were given, however, it appears that the alternates are misplaced in a greater number of instances than the tests of the main series.

Table II
Percentage of Subjects in the Different M.A. Groups Passing the Sub-tests in the Stanford-Binet

Mental age. No. of cases. 32 11
Test. Per Cent Passing.

III. 1. Points body parts. 2. Names objects. 3. Describes pictures. 4. Gives sex. 5. Gives last name. 6. Repeats syllables.
IV. 1. Compare lines. 2. Form discrimination. 3. Counting (4). 4. Copying square. 5. Comprehension. 6. Rep. 4 digits.
V. 1. Compare weights. 2. Naming colors. 3. Aesthetic comp. 4. Definition (use). 5. “Patience.” 6. Commissions.
VI. 1. Right and left. 2. Omissions, pict. 3. Counts 13. 4. Comprehension. 5. Names coins. 6. Repeats syllables.
VII. 1. No. of fingers. 2. Describ. pictures. 3. Repeats 5 digits. 4. Bow-knot. 5. Differences. 6. Copy diamond.
VIII. 1. Ball and field. 2. Counts 20-1. 3. Comprehension. 4. Similarities. 5. Definitions. 6. Vocabulary (20).

100 100 86 100 82 91 91 55 27 5 55 18 41 32 18 5 5 23 9 0 0 0 0 0 100 100 100 100 93 100 89 86 61 32 93 82 73 70 59 45 34 64 25 2 0 18 5 14 0 7 7 2 2 0 100 100 96 87 100 98 96 71 75 95 78 96 60 30 27 71 9 40 2 30 32 6 10 12 0 0 4 2 4 0 100 100 91 98 98 100 75 80 84 86 34 82 45 75 50 61 61 34 9 5 20 18 43 0 100 100 98 100 100 100 98 100 96 96 84 92 92 78 88 98 80 45 37 51 47 61 22 97 100 100 100 100 100 100 100 97 94 100 94 78 50 94 75 100 63 92 92 100 96 96 96 86 100 100 100 100 100

Table II, Continued

Mental age. No. of cases. 32
Test. Per Cent Passing.

IX. 1. Date. 2. Arranges weights. 3. Makes change. 4. Reverse 4 digits. 5. Sentence construct. 6. Rhymes.
X. 1. Vocabulary (30). 2. Absurdities. 3. Designs, memory. 4. Reading and report. 5. Comprehension. 6. Names 60 words.
XII. 1. Vocabulary (40). 2. Abstract words. 3. Ball and field. 4. Dissected sentences. 5. Fables. 6. 5 digits reversed. 7. Picture interpretation. 8. Similarities.
XIV. 1. Vocabulary (50). 2. Induction test. 3. President and king. 4. Problem questions. 5. Arith. reasoning. 6. Reversing clock.
XVI. 1. Vocabulary (65). 2. Fables. 3. Differences, abstract words. 4. Enclosed boxes. 5. 6 digits reversed. 6. Code.
XVIII. 1. Vocabulary (75). 2. Paper cutting test. 3. Repeats 8 digits. 4. Repeats thought. 5. 7 digits reversed. 6. Ingenuity test.

2 49 4 14 4 18 0 4 2 0 2 13 66 41 25 34 81 0 22 3 0 19 22 73 88 73 73 73 92 35 58 46 27 58 62 4 4 12 4 4 15 23 19 100 86 100 86 100 100 57 100 57 57 86 29 14 43 29 57 43 43 29 0 0 14 14 0 0 100 100 100 100 100 100 67 100 67 100 83 67 50 33 50 33 33 67 67 83 17 17 0 0 17 17 0 0 0 0 33 0 0 0 17 0 0 0 100 100 100 80 100 100 100 100 60 80 100 100 100 100 60 40 80 40 100 80 60 40 80 40 40 40 0 20 0 60 0 0 0 0 20 0 0 0 100 100 100 80 100 100 100 80 80 80 80 20 80 80 0 60 20 80 40 20 0 20 20 20 0 0

The intervals in each case include the 11 months immediately following the age cited, as for example, 3 includes all cases with mental ages from 3 years and no months to 3 years and 11 months. The number of children tested is given directly underneath the ages in the second row of figures. In case 60 to 90 per cent of the children pass the test for their mental age, this test is considered as correctly placed. Ninety per cent is used as the upper limit because of the inclusion of all cases with mental ages up to the next unit year. Any deviation from this rule gives definite indication that for this group the test is misplaced in that year-level, when compared with the other tests in that series.
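The placement rule just stated can be written out as a short sketch. The 60 and 90 per cent limits come from the text; treating both limits as inclusive, the function name `placement`, and the sample percentages are assumptions made only for illustration.

```python
def placement(pct_passing: float, low: float = 60, high: float = 90) -> str:
    """Classify a test by the per cent of children passing it at its own M.A. level.

    Limits are taken as inclusive, which is an assumption; the source says
    only "60 to 90 per cent".
    """
    if pct_passing < low:
        return "too difficult"
    if pct_passing > high:
        return "too easy"
    return "correctly placed"


# Illustrative percentages only, not entries copied from Table II.
for pct in (34, 75, 96):
    print(pct, placement(pct))
```

A test passed by about one-third of the children at its own mental age would thus be flagged as too difficult for that year-level, and one passed by nearly all of them as too easy.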

The tests in the three year series are of about the same difficulty for all of the children obtaining mental ages of three to three years and eleven months. According to the percentages passing the tests in the four year series, the comparison of lines (IV-1) should be placed in the three instead of the four year group, whereas copying a square (IV-4) is too difficult for the four year old and is more typical of a five year’s performance. Comparing weights (V-1), naming colors (V-2), and carrying out three commands (V-6) appear to be too easy, since two-thirds of the four year old group pass these tests. An analysis of the responses made by children having mental ages of three, four, and five years has not been reported previously so that no comparisons can be made between these results and those of other investigators.

The tests for distinguishing right and left (VI-1) and the ability to comprehend situations (VI-4) are relatively easier for children of five years than the other tests in the group. Only one-third of the six year old children are able to name three of four coins (VI-5), and it seems that this test would be more correctly placed at seven years. McFadden (4) also finds naming coins to be too difficult. He indicates that giving the number of fingers (VII-1) is too simple for the child with a mental age of seven as compared with the general difficulty of the year-level in which the test was placed, but this seems to be correctly placed according to the results presented here. Describing pictures (VII-2) was less difficult for the six year old children than any of the other seven year tests.

Counting backwards from 20 to 1 (VIII-2), and giving definitions superior to use (VIII-5) are too hard as demonstrated both by these results and by those reported by McFadden (4). Wallin (9) found that the ball and field test (VIII-1) is too complex for the eight year old, but this finding is not substantiated here.

Making rhymes (IX-6) is too easy at nine years and should properly be placed in the eight year series in accordance with these findings as well as those of McFadden (4) and Wallin (9). Arranging weights (IX-2) is passed by two-thirds of the children in the eight year mental age group.

The number of cases included above the nine year group is too limited to permit of anything but a tentative interpretation. The number of words required in the vocabulary is too great in both the ten and fourteen year-level, and too small in the twelve year series, according to the percentage passing these tests at the various mental ages. The detection of absurdities (X-2), pointing out similarities (XII-8), giving differences between a president and a king (XIV-3) and the problem of enclosed boxes (XVI-4) are easier than the other tests in those age groups.

These internal inconsistencies indicate conclusively that the individual’s detailed performances must be considered together with his general rating in order to describe his mental ability. As pointed out (1) there is an abundance of evidence to indicate that the rating achieved is markedly influenced by specialized abilities and disabilities. If these are not adequately tested in the Stanford-Binet, other tests must be utilized, but it is apparent that the contribution of the individual tests in the Binet to this knowledge of each child’s specific abilities has been seriously neglected. It is to be regretted that the number of records is so limited in the upper age-levels, but the results indicate fruitful possibilities for further study along these lines.

Instead of attempting a comparison of the individual tests, with the small number of cases available, it was decided to group the tests into seven general types, in order to determine whether or not there are any differences between the children of various chronological and mental ages and intelligence quotients in their ability to pass the tests in each classification. A list of the groupings is given here, and the specific tests referred to can be seen in Table II.

A Classification of the Individual Tests in the Stanford Revision of the Binet-Simon Scale into Seven Types
(Numbers refer to: year-test no.)

1. Immediate memory.
   1) Repeating digits forwards: 3-a, 4-6, 7-3, 10-a1, 14-a, 18-3.
   2) Repeating digits backwards: 7-a2, 9-4, 12-6, 16-5, 18-5.
   3) Repeat syllables: 3-6, 4-a, 6-6, 10-a2, 16-a1.
   4) Memory for ideas: 18-4.
2. Comprehending situations.
   a) From verbal description.
      1) Associating words and objects: 3-1, 3-2, 3-4, 3-5, 5-2, 6-1.
      2) Associating words and situations: 4-5, 6-4, 8-3, 10-2, 10-4, 10-5, 12-4, 12-5, 14-4, 16-2, 16-4, 16-a2.
   b) Comprehending pictured situations: 3-3, 7-2, 12-7.
3. Spontaneous interest in number and time.
   1) Numbers: 4-3, 5-a, 6-3, 6-5, 7-1, 8-a1, 9-a2, 14-5.
   2) Time: 6-a, 7-a1, 9-1, 9-a1.
4. Geometrical forms.
   1) Identification: 4-2, 5-3, 6-2, 10-a3.
   2) Reproduction: 4-4, 7-6, 10-3.
   3) Construction: 5-5, 8-1, 12-3, 16-6, 18-2.
5. Following a guiding idea: 4-1, 5-1, 5-6, 7-4, 8-2, 8-a2, 9-5, 9-2, 9-3, 9-6, 14-6, 18-6.
6. Vocabulary: 5-4, 8-5, 8-6, 10-1, 10-6, 12-1, 12-2, 14-1, 16-1, 18-1.
7. Differences and similarities: 7-5, 8-4, 12-8, 14-2, 14-3, 16-3.

As presented, all of the classes but the fifth are self-explanatory. Under “following a guiding idea” are included those tests generally thought of as measuring the ability to follow directions in solving more or less practical problems. Since the overlapping of such factors as attention, memory, reasoning, and imagination in the tests is so great, it is impractical to make any classification other than this objective one based upon the specific performances tested.

In tabulating the individual records, only those tests were included which were given above the basal year, and only for those years in which at least one test was passed. This was done so as not to include all of the tests passed below the basal year or all those failed above the last year where a success occurred, which would conceal any differences that might exist. The purpose was to determine if there were some types of tests which were passed more frequently than other types classed as equal in difficulty. The per cent passed of each type was computed in the following manner. This example consists of the group of 55 children from five to five years and eleven months of age.

Type of Test    Total No. Given    No. Passed    No. Failed    Per Cent Passed
1                     166               95            71              57
2                     209              129            80              62
3                     253              107           146              42
4                     157               74           123              37
5                     185               60           125              32
6                     100               39            61              39
7                      77               46            31              60
Total                1147              550           597              48

It can be seen from this that the bases for the percentages are different in each instance. Because of the unequal number of tests of each type at the various year-levels, the percentages are not of equal significance. They do, nevertheless, permit of a comparison between the relative difficulty of the various kinds of tests for each age and between the ages for the same type.
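The computation described for the five-year group can be reproduced in a few lines. This is a sketch, assuming each percentage is simply the number passed divided by the number given, rounded to the nearest whole number; the variable and function names are hypothetical. Only the "given" and "passed" counts are used, since the row for type 4 as it survives in the text does not satisfy given = passed + failed, so the recomputed value for that type differs from the printed one (which may reflect a misprint or a scanning error; the source does not say).

```python
# Number given and number passed for test types 1-7 (five-year group).
given = [166, 209, 253, 157, 185, 100, 77]
passed = [95, 129, 107, 74, 60, 39, 46]


def pct(n_passed: int, n_given: int) -> int:
    """Per cent passed, rounded to the nearest whole number."""
    return round(100 * n_passed / n_given)


per_type = [pct(p, g) for p, g in zip(passed, given)]
overall = pct(sum(passed), sum(given))

for test_type, value in enumerate(per_type, start=1):
    print(f"type {test_type}: {value}% of {given[test_type - 1]} tests given")
print(f"all types: {overall}% of {sum(given)} tests given")
```

Printing the base alongside each percentage makes the point of the paragraph above visible: 60 per cent of 77 tests and 42 per cent of 253 tests rest on quite different amounts of evidence.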

Although these percentages were computed for each chronological and mental age group, they are not included here because there was no internal consistency in the increase or decrease of these proportions for the various ages. The differences observed were probably due to the sampling and not to any real differences between the types of tests at each age.

In the case of the intelligence quotient ratings, however, there are significant differences in the proportion of the various types of tests passed above the basal year. On the basis of several trial groupings, the ones indicated in Table III appeared to distinguish the most clearly between the different classes.

Table III
The Proportion of the Various Types of Tests Passed Above the Basal Year for Three I.Q. Groups

Intelligence quotient                        70-89    90-109    110-169    Totals
No. of cases included                            9        87        199       295
Total no. of tests                             118      1381       3956      5455

Type of test    No. in each    Per cent passed of each type given in the group
1                   774                         33        42         56        52
2                  1120                         56        58         57        57
3                   920                         33        40         39        39
4                   843                         19        41         47        44
5                   882                         67        55         48        50
6                   595                         28        46         47        46
7                   321                         40        43         56        53

Total per cent passed                           42        48         50        49
Average no. of tests per record               13.1      15.9       19.9      18.5

The percent of tests passed of each type given in that group is presented separately for all yielding I.Q.s from 70 to 89 (below normal rating), from 90 to 109 (normal rating), and from 110 to 169 (above normal rating). A total of 5455 sub-tests were studied from the 295 separate records included. These figures enable the evaluation of the relative importance that should be attached to the different percentages given.
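The weight that each I.Q. group carries can be checked directly from the counts in Table III. The sketch below assumes nothing beyond those counts; the dictionary layout and names are hypothetical. It recomputes the average number of sub-tests per record and each group's share of the 5455 sub-tests.

```python
# (number of records, number of sub-tests above the basal year) per I.Q. group.
groups = {
    "70-89": (9, 118),
    "90-109": (87, 1381),
    "110-169": (199, 3956),
}

total_cases = sum(cases for cases, _ in groups.values())
total_tests = sum(tests for _, tests in groups.values())

# Average number of sub-tests per record, one decimal as in the table.
avg_per_record = {iq: round(tests / cases, 1) for iq, (cases, tests) in groups.items()}
overall_avg = round(total_tests / total_cases, 1)

# Share of all sub-tests contributed by each group, in per cent.
share = {iq: round(100 * tests / total_tests, 1) for iq, (_, tests) in groups.items()}

print(total_cases, total_tests)     # 295 5455
print(avg_per_record, overall_avg)
print(share)
```

The lowest group contributes only about 2 per cent of the sub-tests, so its percentages rest on a far smaller base than those of the other two groups.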

The number of sub-tests of each type included in the entire study is given in the second column of figures. These form the bases for only the last column of percentages. There is a consistent increase in the proportion of the tests of immediate memory (type 1), and the recognition of differences and similarities (type 7) passed with increasing intelligence ratings in the case of these children. Comprehending situations (type 2) and tests of spontaneous interest in numbers and time (type 3) are approximately as difficult as the other tests in the same year levels for all of the children regardless of their rating. Tests of the perception of geometrical form (type 4), following a guiding idea (type 5), and vocabulary (type 6), do not reveal any notable differences in their relative difficulty except between the children of normal rating and those below this.

In the last column on the page are given the percentages passing each type for the group as a whole. Approximately half (49%) of all of the tests given above the basal year are passed in those age series where at least one test was completed satisfactorily. Comprehending situations (type 2) seems to be slightly easier for the children than the other tests at the same year levels as indicated by the fact that a greater proportion of these tests are passed than any other type. Those tests of spontaneous interest in numbers and time (type 3) and perception of geometrical forms (type 4) are more difficult than the other types. Individual differences are concealed in these percentages for the whole group, but the general differences are indicated in the comparative difficulty of each of the various types of tests included in the Stanford-Binet series. There is a slight increase in the percentage of tests passed at the higher I.Q. levels, as well as in the average number of tests given above the basal year per record.

These conclusions hold for the group averages but are not applicable to individual cases either for single records or for particular children. For those children who were given from three to five consecutive tests, the following analysis failed to distinguish any important characteristics. In each case, however, there were great individual differences in the proportions and ratios obtained. There was no tendency noted in the percentages of each type of tests passed which would definitely delineate the individuals with higher or lower I.Q.s. There were marked variations between individuals, but not with regard to I.Q. groupings, in the proportion contributed to the mental age by the various types above the basal age. This was true also with respect to the ratio between the amount that was contributed by each type and the amount that would have been contributed had all of the tests been of equal difficulty for the individual children. In the analysis of the individual records, the number of sub-tests was too limited for computing any percentages of value. There was a marked inconsistency noted, however, between the types of tests above the basal age contributing the most to the mental age in successive tests. A study of the proportion of the mental age earned that was due to mixed tests of the basal year and below, showed that there was no relation between the I.Q. rating and this percentage. In an analysis of the number of year-series in which tests were passed above the basal year, several differences were noted. The data in Table IV present the mean age range in which tests were passed above the basal year for each chronological and mental age, and intelligence quotient rating.

Table IV
The Range in Which Tests Are Passed Above the Basal Year

C.A.: 2-2,11 3 4 5 6 7 8 9 10 11 12 13
No. of cases: 4 50 62 55 48 30 20 10 5 5 4 2
Mean range: 2.0 2.6 2.6 3.2 3.0 2.8 2.5 2.2 2.6 3.0 4.0 2.0

M.A.: 3-3,11 4 5 6 7 8 9 10 11 12 13 14
No. of cases: 22 44 55 44 49 32 26 7 6 5
Mean range: 2.2 2.7 2.4 3.6 2.9 2.9 2.8 2.9 2.7 4.0 3.0

I.Q.: 70-79 80 90 100 110 120 130 140 150 160
No. of cases: 14 73 80 61 35 17 5 1
Mean range: 2.0 2.1 2.4 2.4 2.8 2.9 3.4 3.5 2.6 3.0

Totals (C.A., M.A., and I.Q. alike): No. of cases 295, mean range 2.8

The range in the scatter varies for individuals from zero to six years. For instance, if the range for child W was three years, this would mean that he passed tests in three year-levels above the year in which he passed all of the tests. The mean range for all of the 295 records was 2.8 years above the basal year obtained in each case. In the chronological and mental age levels there is no consistent tendency indicated for either an increase or decrease in the scatter in the tests passed. The Pearsonian coefficient of correlation of this variability in the range with C.A. is +.079, and with M.A. it is +.173. This shows that there is a very slight and negligible relationship between the increase in age and the scatter in the tests that were passed. There is a significant increase in the number of years included for the higher intelligence quotients, except for the six tests with I.Q. ratings of from 150 to 165. The correlation between I.Q. and the extent of scatter is .292, indicating that the children of the higher mental ratings tend to have a wider age range in which they pass tests than do those of lower mental ratings. Although the means increase consistently, the variability within each I.Q. interval is wide, hence the relationship is not very close.
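Two of the quantities in this paragraph can be illustrated with a short sketch. The first part recomputes the overall mean range as a case-weighted mean of the chronological-age group means in Table IV. The second part defines a product-moment (Pearson) coefficient; because the per-record ranges are not given in the paper, it is exercised only on hypothetical values that are not from the study. The function name `pearson_r` and all variable names are assumptions.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# C.A. groups from Table IV: number of cases and mean range per group.
cases = [4, 50, 62, 55, 48, 30, 20, 10, 5, 5, 4, 2]
mean_range = [2.0, 2.6, 2.6, 3.2, 3.0, 2.8, 2.5, 2.2, 2.6, 3.0, 4.0, 2.0]

# Case-weighted mean of the group means.
overall = sum(n * m for n, m in zip(cases, mean_range)) / sum(cases)
print(round(overall, 1))  # 2.8

# Hypothetical per-record values (NOT data from the study), used only to
# show how the coefficient would be obtained from individual records.
iq = [95, 104, 112, 121, 133, 140]
scatter_years = [2, 3, 2, 3, 4, 3]
print(round(pearson_r(iq, scatter_years), 2))
```

A coefficient near zero, such as those reported for C.A. and M.A., means that the scatter hardly changes with age, while the larger value for I.Q. points to a modest tendency only.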

There is a difference of opinion expressed in the literature as to the amount of variability or scatter in the tests passed by children of different mental levels. Wallin (8) holds that the normals succeed in tests over a larger age range, and Doll (2) thinks that the feebleminded individuals vary more. Scattering in the development of different mental functions according to Wallin (8) is a perfectly normal and typical phenomenon among all classes of human beings.

Summary

Many of the individual tests in the Stanford-Binet Scale appear to be of unequal difficulty when compared with the other tests in the year level where they are placed. Some of the tests are too complex, and others are too simple, not only according to these findings but also according to those reported by other investigators. This inequality indicates the necessity of a careful analysis of the successes and failures in order to differentiate between children of similar mental levels but dissimilar special abilities. The classification of the sub-tests into types shows that those involving spontaneous interest in numbers and time, and geometrical form are relatively more difficult for all of the children, and comprehending situations easier than the tests of the other types in each year series. There is no indication of any significant differences between the various C.A. and M.A. groups as to the relative complexity of the different types of tests. The tests of immediate memory, perception of geometrical forms, and the recognition of differences and similarities are more difficult for children of the lower than the higher I.Q.s. The other types are approximately of equal difficulty for children of all levels.

The analysis of individual performances indicates great differences with respect to the relative proportion of the types of tests passed in the successive tests and the proportion contributed by each to the mental ages earned. There was no consistent tendency noted in any of the percentages of the various types of tests that were passed which would apply to individual intelligence quotient ratings.

The correlations and means calculated indicate that there is a negligible relationship existing between the scatter of tests passed for the different C.A. and M.A. classes, but that children of higher I.Q.s tend to pass tests in a greater age range than those with lower ratings.

The value of this analysis lies primarily in its emphasis upon the individual differences in the types of tests passed above the basal year and the necessity of an adequate recognition of this in the interpretation of any mental test. It has not been possible, on the basis of these results, to make any conclusive statements as to the differences between single children of different mental ratings, but a few marked group tendencies have been pointed out. It is not surprising that the I.Q. alone is insufficient as a description of a child’s mental status, in view of the wide variation in the individual tests contributing to the same mental age scores and the inconsistency in the relative proportion of the various types of tests passed from year to year by the same child.

Bibliography

1. Bronner, A. F., Healy, W., Lowe, G. M., and Shimberg, M. E.: A manual of individual mental testing. Boston: Little, Brown and Co., 1928.
2. Doll, E. A.: A brief Binet-Simon scale. Psychol. Clin., 1917-18, 11, 197-211, 254-257.
3. Gates, A. I.: Psychology for students of education. New York: Macmillan Co., 1927.
4. McFadden, J. H.: Differential responses of normal and feebleminded subjects of equal mental age, on the Kent-Rosanoff free association test and the Stanford Revision of the Binet-Simon intelligence test. Mental Measurement Monographs, 1931, No. 7.
5. Terman, L. M.: The measurement of intelligence. New York: Houghton, Mifflin and Co., 1916.
6. Terman, L. M.: Genetic studies of genius, Vol. 3. Stanford University Press, 1930.
7. Thorndike, E. L.: Measurement of intelligence. Psychol. Rev., 1924, 31, 219-253.
8. Wallin, J. E. W.: The phenomenon of scattering in the Binet-Simon Scale. Psychol. Clin., 1917-18, 11, 179-185.
9. Wallin, J. E. W.: A statistical study of the individual tests in ages VIII and IX in the Stanford-Binet Scale. Mental Measurement Monographs, 1929, No. 6.
