A Picture Arrangement Test 

The Psychological Clinic. Copyright, 1917, by Lightner Witmer, Editor. Vol. XI, No. 2. April 15, 1917. Contributed from the Bureau of Juvenile Research, Columbus, Ohio. By Alida C. Bowler, A.M., Mental Examiner.

1. Introduction. In its study of juvenile delinquents, so many of whom are extremely ignorant and some of whom are either foreign-born or come from homes where English is seldom spoken, the Bureau of Juvenile Research continually feels the need of adequately standardized performance tests with which to supplement the Year or Point-Scale findings. Some desirable characteristics of such tests are as follows: 1. That they shall require little or no comprehension or use of language. 2. That they shall catch and hold the interest of the subject. 3. That they be graded in difficulty. 4. That they admit of a scoring which gives some credit for partial success. The suggestion of a picture-arrangement test, designed primarily to measure logical judgment, seemed to offer some hope of fulfilling these requirements.

The suggestion came from Dr. O. Decroly, who tried out a series of such tests with about five hundred school children in Brussels. His material consisted of eleven series of pictures taken from children’s books, each series telling a complete though simple story when the members were arranged in the right order. Six of these series contained four scenes each, one contained five, three had six, and one as many as eight scenes. A series, with the scenes arranged in illogical order, was placed in the child’s hands with the request that he lay them out in such a way as to make them tell a continuous story. The order in which he placed them and the time taken were recorded. If the order was incorrect he was asked to tell the story as he understood it and his account was noted. This often resulted in the correction of the errors. If he did not seem to understand what was wanted a simple series was used for an example.

The subjects were for the most part children in the public and private schools of Brussels. They ranged in age from four to fourteen. There were quite a number of backward and retarded children among them. A few adults were included.
Apparently no attempt was made to score partial successes. The aim was rather to find series adapted to different ages. He concludes in general that a series of such tests can be found which will indicate approximately the mental age, and that the measure of time is an important feature of the test. He also finds that children of the leisure classes are on the whole superior to the children of the laboring classes, thus apparently sustaining the contention of Binet with regard to judgment.

So far as is known to the writer the only attempt in this country to make use of Decroly’s suggestion is the work of D. K. Frazer at Cornell University. He assembled a collection of “Foxy Grandpa” pictures, each story containing six scenes. There were fifteen series in all, arranged in three groups; I, designated “easy,” containing series lettered A-E inclusive; II, designated “medium,” H-L inclusive; and III, designated “hard,” O-S inclusive. Frazer’s method of presentation was very similar to that of Decroly. A standard illogical order was adopted for each series. The general nature of the task was explained to the subject and one of the series was laid out before him with the request that he place the pictures in the right order to make a sensible story. If he arranged it incorrectly he was told that it was wrong and was asked to correct it. If he still thought he was right he was asked to tell the story and this usually caused him to detect his errors. He was always encouraged as much as possible to continue until he discovered his errors. When he had finally arranged it correctly or had “given up” the time was recorded. The only data at hand concerning the Cornell work with these pictures is a table of grouped results from twenty adults (students, men 10, women 10). This table gives the median, average, minimum and maximum times for the different series but no other data.

We obtained copies of these fifteen “Foxy Grandpa” picture series from Mr. Frazer and in the spring of 1915 we tried them out, using his method of presentation, with twenty-five delinquent girls at the Ohio Girls’ Industrial Home. It soon became evident that certain of the series permitted more than one logical arrangement. That is, some of the girls were able to give logical stories, without any very great gaps between scenes, for arrangements which were not the story and arrangement looked for and expected by the examiner. These series were eliminated. Six series (B, D, I, J, R and S) were dropped at this time for various reasons. It was also observed that after the first three or four series several of the brighter girls always picked out and placed the last picture first. This was easily explained by the fact that on the last member of each series appeared the signature “Bunny,” accompanied by the figure of a rabbit. Therefore, on all the pictures retained, this was obliterated by means of ink, colors and paint-brush.

We now had nine “Foxy Grandpa” stories, each consisting of six scenes. It seemed desirable to have some shorter, simpler series if we were to have a graded test. After much searching in children’s books and Sunday supplements, and some wielding of the pen and brush, we had assembled four series of four scenes, and three of five scenes each. We were continually surprised at the difficulty encountered in finding a series of scenes which would tell a story without the aid of written or spoken words, and which would permit of but one logical arrangement. One experience in particular serves to illustrate how worthless an a priori judgment may be as to the value of a picture. A four-member series which seemed very simple and to permit of but one logical arrangement, was exactly reversed by about one-fourth of the children, who were able to furnish a perfectly legitimate, though somewhat fanciful, account.

Our picture-stories now numbered sixteen, roughly graded in difficulty. Still adhering to the Frazer method of presentation these were given to about forty children at a public play-ground in Columbus in the summer of 1915. A careful study of their records resulted in the dropping of one four-member, one five-member, and two six-member series, and the revision of the order. Moreover, the longer we worked the greater became the dissatisfaction with the method of presentation. In the first place, although we were desirous of eliminating the use of language, we were asking the subject to tell a story! In the second place, by letting the subject know he was wrong, again and again, we were deliberately opening the door to some very unwelcome visitors, namely, discouragement, embarrassment, and loss of confidence, all factors which tend so to influence the attitude of a subject as to render him incapable of putting forth his best efforts. And lastly, how were we ever to standardize a method which included in its instructions “encourage the subject as much as possible” and “record the time when he either succeeds or gives up?” Would not the quantity and quality of encouragement vary with every examiner and even with the same examiner from day to day?

The two particular problems demanding solution at this point in the work were: (1) to determine the most satisfactory method of procedure in administering the test; and (2) to discover the less desirable pictures and discard them until we should have a test which could be completed in from five to ten minutes, yet be graded in difficulty. The first of these problems was solved before the main body of work was begun. But the full twelve series was given to about six hundred children before it was considered that sufficient data had been gathered to warrant the choosing of the final test series. The percentages of these six hundred children correctly completing each picture at each age from ten to fourteen inclusive, were computed and curves plotted from these figures. The curves seemed to indicate that some of the series would be of little value as mental tests, being quite as difficult for fourteen-year-old children as for ten-year olds, more difficult for one sex than the other, and showing some very curious ups and downs. After careful consideration six of the original twelve were chosen to constitute the graded series for a picture arrangement test. The materials and method of procedure finally decided upon, and the results obtained from their use with some one thousand and twelve individuals, will be discussed in the following pages.

2. Materials. The test material includes six series of pictures, each series telling a complete story. They will be designated hereafter by the letters X, A, B, C, D, and E. The plan of each story follows:

X. The Stolen Slipper (4 scenes). X1, an old woman, evidently just awakened, is seated in an arm-chair. She has on one slipper, while a pup is making off with the other. X2, the old woman, standing in the door, sees the pup in the yard with her slipper. X3, she gives chase. X4, she has caught the poor pup and is spanking him with the recovered slipper.

A. The Spilled Ink (4 scenes). A1, a little girl is standing beside a table on which are a blank sheet of paper, a pen, a bottle of ink, and a sleeping kitten. A2, the little girl has climbed on a chair and is scribbling on the paper; the kitten has awakened. A3, the kitten upsets the ink. A4, the little girl is standing on the floor, ink dripping from her hands and dress and tears from her eyes.

B. The Little Flirt (4 scenes). B1, on a bench beside a road are seated a little girl and boy. Some distance up the road is another little boy with a bag in his hand. B2, the second boy has come up to the bench. B3, he has seated himself beside the girl and is evidently inviting her to go with him. B4, the little girl, with a stick of candy in her hand, goes off with the second boy, much to the chagrin of her former companion.

C. Foxy Grandpa and the Swans (6 scenes). C1, the boys, in a shed, are dressing up like swans while in the distance Foxy Grandpa and little brother are visible, walking towards the pond. C2, Foxy Grandpa and brother, about to feed the real swans, are startled by the appearance of the make-believe birds. C3, they start to run. C4, the real swans take after the make-believe. C5, hot in pursuit. C6, the false heads have fallen off, the boys have climbed a tree to escape the angry birds, and Foxy Grandpa and brother have returned to laugh at them.

D. The Elephant and the Bees (6 scenes). D1, little brother is excitedly telling Foxy Grandpa something and pointing out into the yard. D2, Foxy Grandpa, having gone out to investigate, is frightened by the appearance of an elephant. D3, it chases him. D4, he runs among the bee-hives, overturning one. D5, the angry bees attack the elephant which comes apart, revealing the boys inside. D6, Foxy Grandpa and brother are laughing at the boys, whose hands, arms and legs are swollen and bandaged.

E. Foxy Grandpa and the Tramp (6 scenes). E1, Foxy Grandpa, who has been reading a newspaper, is seated in an armchair in the yard. Little brother is telling him to look at the tough tramp who is peering over the high board fence. E2, Foxy Grandpa picks up the foot-stool. E3, he hurls it, hitting the tramp squarely in the head. E4, the stool falls to the ground but to the amazement of Foxy Grandpa and brother the tramp is apparently unharmed. E5, Foxy Grandpa has made a noose of the clothesline which lay near by and is lassoing the impudent tramp. E6, he has captured the intruder which proves to be merely a clothes-pole dressed up, and in its place appear the grinning faces of the boys.

The size of each individual pictured scene is about 4 by 4½ inches, so that the whole packet of pictures is small and easily carried. They are done in colors to attract the child’s attention. Moreover, the stories are purposely humorous in character with a view to holding his attention by introducing the element of amusement.

3. Procedure. The subject (S) was seated at a table opposite the examiner (E), who recorded his name, age, birthday, and school grade. E then laid out series X, in its standard illogical order, directly in front of S, saying as he did so, “These little pictures will tell a funny story if you put them in the right order. They are all mixed up now. You put them in a row here (pointing) so that they will tell a good story.” Usually S started in at once. If he hesitated and seemed at a loss, E asked “Which one do you think ought to come first?” and when he pointed to one said, “All right, that’s good, put that one here (placing it) and now put up the one you think is next (and so on).” If he completed X correctly he was commended. If it was incorrect he was told that it was not quite right and asked if he could fix it. If he was unable to do so it was arranged for him. Records of the time and arrangements of X were not kept as it was intended solely for purposes of illustration, to make sure that S was given a complete exposition of just what was wanted.

As soon as X was finished it was removed and A laid out in its standard illogical order with the remark, “And now make these little pictures tell a good story.” The stop-watch was started as the last picture was placed in front of S. When he indicated that he had finished the time was taken, the series removed, and his arrangement recorded. B, C, D, and E were then given in exactly the same manner. After the X series S was never told when he had made mistakes. He was made to feel that he was doing well. Inconspicuous lettering and numbering on the back of each picture rendered it easy for E to see at a glance as he picked up the finished series what the arrangement was. The “standard illogical order” adopted for the six series is as follows:

X, 2-4-3-1
A, 3-2-4-1
B, 4-2-1-3
C, 2-4-6-5-3-1
D, 6-4-1-5-3-2
E, 2-4-6-5-3-1

4. Subjects.1 During the school year 1915-1916 the test was given by the author to some 961 children in the public schools of Columbus. Of these, 710 were in two grade schools in different sections of the city, 95 were in a junior high school, and 156 were in the Commercial High School. They came, of course, from different social classes, but there were very few cases of actual poverty, or of foreign parentage, among them. No attempt was made to select subjects. They were taken just as they came, one after another, straight through the grades. Only one individual was rejected and that because of extreme myopia. It is possible that the fifteen and sixteen year results might have been somewhat higher had more subjects been secured or had a general high school been invaded, for the commercial high school is, to some extent, a selective agent. The distribution by sex was about even, there being in all 490 boys and 471 girls. The ages ranged from six to sixteen. Table I shows the distribution of the 961 cases by age and grade.

1 The writer wishes here to acknowledge the kindness and courtesy of Mrs. Margaret McNamara, Chief Matron of the Ohio Girls’ Industrial School, Mr. J. A. Shawan, Superintendent of Schools, Miss Lucy Thompson, Principal of Avondale School, Miss Margaret H. Mulligan, Principal of Ohio Avenue School, Mr. Townsend, Principal of the Commercial High School, Columbus, Ohio, and the Department of Psychology of Ohio State University.

Table I. The distribution by age and grade of the 961 school children.
Age (half-year intervals): 5.5-6.0, 6.0-6.5, 6.5-7.0, 7.0-7.5, 7.5-8.0, 8.0-8.5, 8.5-9.0, 9.0-9.5, 9.5-10.0, 10.0-10.5, 10.5-11.0, 11.0-11.5, 11.5-12.0, 12.0-12.5, 12.5-13.0, 13.0-13.5, 13.5-14.0, 14.0-14.5, 14.5-15.0, 15.0-15.5, 15.5-16.0, 16.0-16.5.
Grades: 1B, 1A, 2B, 2A, 3B, 3A, 4B, 4A, 5B, 5A, 6B, 6A, 7B, 7A, 8B, 8A; High School IB, IA, IIB, IIA, IIIB, IIIA.
Numbers as printed: 961; 22, 17, 36, 40, 47, 52, 44, 71, 51, 83, 65, 40, 51, 61, 130, 95; 105, 28.

In addition to the school children the test was performed by fifty-one adults, who were students at the Ohio State University summer school. Their ages run from nineteen to forty-nine, the median being twenty-six. Thirty-four of them are teachers, fourteen are undergraduate and two are graduate students, and one is a Y. W. C. A. secretary.

5. Results. The first attempt to discover just what there was of value in this accumulating mass of data came with the plotting of the curves shown in Figure I, which indicate the percentage of correct arrangements at each age for the five series A to E inclusive. In grouping by ages, age was reckoned from the nearest birthday. That is, the six-year group includes all those from five years and six months to six years and five months, the seven-year group includes all those from six years and six months to seven years and five months, etc. Table II gives the actual percentages from which these curves were drawn, together with the number of boys and girls tested at each age. The curves show clearly that we have achieved a graded series, ranging from A, which is extremely easy for all children who are nine years or more, to E, which is too difficult to be correctly arranged by fifty per cent at any age.
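The nearest-birthday grouping described above can be sketched as a small function. This is only an illustration: the function name `age_group` and the representation of an age as completed years plus months are assumptions, not part of the original procedure.

```python
def age_group(years: int, months: int) -> int:
    """Return the age group reckoned from the nearest birthday.

    `years` and `months` give the age in completed years and months.
    Six or more months past a birthday counts toward the next year,
    so 5 years 6 months through 6 years 5 months all fall in group 6.
    """
    if not 0 <= months <= 11:
        raise ValueError("months must be between 0 and 11")
    return years + 1 if months >= 6 else years


if __name__ == "__main__":
    for y, m in [(5, 6), (6, 5), (6, 6), (7, 5)]:
        print(f"{y} yr {m} mo -> {age_group(y, m)}-year group")
```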

The next step was to regard the test as a whole and determine what percentage at each age correctly arranged one or more series, what percentage correctly arranged two or more, etc.

Figure I. Curves showing percentage of correct arrangements for each sex and age group (ages 6 to 16 and adults); curve A for series A, curve B for series B, etc.

Table II. Number of boys and girls tested at each age and percentage of correct arrangements of each series for each sex and age group.

Table III. Percentages correctly arranging one or more series, two or more, etc., for each age and sex group.
Age: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, Adult
Number tested (total): 20, 40, 78, 95, 111, 100, 99, 113, 109, 123, 73, 51
1 or more correct
Boys: 18, 50, 82, 85, 95, 96, 100, 100, 100
Girls: 33, 40, 66, 94, 96, 100, 100, 95, 100, 98, 97, 100
Total: 25, 45, 74, 89, 95, 98, 99, 96, 100
2 or more correct
Boys: 9, 15, 50, 69, 75, 81, 94, 85, 91, 95, 97, 84
Total: 5, 15, 49, 64, 80, 84, 89, 86, 92, 91, 94, 90
3 or more correct
Girls: 0, 5, 10, 37, 68, 53, 69, 62, 67, 74, 61, 72
Total: 0, 5, 13, 41, 55, 61, 72, 66, 72, 72, 74, 68
4 or more correct
Girls: 0, 0, 0, 18, 33, 23, 46, 36, 52, 35, 33, 52
Total: 0, 0, 3, 17, 29, 29, 46, 41, 50, 38, 44, 49

The figures yielded by this method appear in Table III. An inspection of these figures, with a view to standardization, seems to indicate: (1) That a normal eight-year-old ought to arrange one out of five correctly, 74 per cent doing so at that age, while only 45 per cent did so at seven years. (2) That two out of five would be a hard nine-year test, rising from 49 per cent at eight to 64 per cent at nine, or an easy ten-year test, at which point 80 per cent are able to pass it. (3) That while three out of five are arranged correctly by 68 per cent of the girls at ten years and but 45 per cent of the boys, at eleven years the conditions are exactly reversed, 68 per cent of the boys passing and but 53 per cent of the girls; from twelve years on no age drops below 60 per cent, but further data are necessary before this can be straightened up, there being no plausible explanation of the irregularity. (4) That to arrange four out of five would be too great a demand at any age, the curve reaching 60 per cent at no point. (5) That the greater the demand the more irregular the girls’ curve becomes. (6) That there is no decided or sustained increase in ability beyond twelve years (adults included) shown in any of these curves.
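The tabulation behind Table III (percentage arranging one or more series correctly, two or more, and so on) can be sketched as follows. The function name and the sample list of per-child counts are hypothetical and serve only to illustrate the arithmetic; they are not data from the study.

```python
from collections import Counter


def percent_at_least(correct_counts, n_series=5):
    """Percentage of subjects with at least k series correct, for k = 1..n_series.

    `correct_counts` holds, for each subject in one age group, the number of
    series (0 to n_series) arranged correctly. Returns {k: rounded percentage}.
    """
    n = len(correct_counts)
    if n == 0:
        raise ValueError("need at least one subject")
    tally = Counter(correct_counts)
    result = {}
    running = 0
    # Accumulate from the hardest criterion downward so each k includes
    # everyone who met a stricter criterion.
    for k in range(n_series, 0, -1):
        running += tally.get(k, 0)
        result[k] = round(100 * running / n)
    return result


if __name__ == "__main__":
    # Hypothetical age group of ten children (illustration only).
    sample = [0, 1, 1, 2, 3, 5, 0, 2, 4, 1]
    table = percent_at_least(sample)
    for k in sorted(table):
        print(f"{k} or more correct: {table[k]} per cent")
```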

Up to this point no attempt had been made to evaluate partially correct responses. This we were especially desirous of doing but the devising of a method of scoring which should be reasonably free from objections proved to be an extremely difficult task. Four different schemes were tried out before a decision was reached. A description of each follows.

6. Methods of Scoring. Method 1. A simple mechanical device which we designated the gain-in-place method was first suggested. If a member was shifted one place from its proper position in the series one point was scored against it, if it were three places removed, three against it, and so on. In order to make the score magnitude and consequently the score differences greater, the one point was increased to three. Thus a 1-3-2-4 arrangement of A would receive -3 as its score, a 1-3-2-5-4-6 arrangement of C would score -6, etc. Only those which gained place were scored. The best possible score for the test would therefore be 0, the worst possible score -105.
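A minimal sketch of the gain-in-place rule, as read from the description and the worked examples in the text: a member counts against the score only when it stands earlier than its proper place, at three points per place gained. The function names are illustrative assumptions.

```python
def parse_arrangement(text: str) -> list[int]:
    """Turn a string such as '1-3-2-4' into [1, 3, 2, 4]."""
    return [int(part) for part in text.split("-")]


def gain_in_place_score(arrangement: str, points_per_place: int = 3) -> int:
    """Score one series by the gain-in-place rule (Method 1).

    A member whose position is earlier than its proper place has "gained"
    place; each place gained costs `points_per_place`. Members that lost
    place are not scored.
    """
    penalty = 0
    for position, member in enumerate(parse_arrangement(arrangement), start=1):
        gain = member - position
        if gain > 0:
            penalty += gain * points_per_place
    return -penalty


if __name__ == "__main__":
    for arr in ["1-3-2-4", "1-3-2-5-4-6", "4-1-2-3", "6-1-2-3-4-5"]:
        print(arr, gain_in_place_score(arr))
    # Exact reversal of two 4-scene and three 6-scene series.
    worst = 2 * gain_in_place_score("4-3-2-1") + 3 * gain_in_place_score("6-5-4-3-2-1")
    print("reversal of every series:", worst)
```

Under this reading, reversing all five series gives the -105 quoted as the worst possible score.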

Objections to this method were that it gave no more credit for the correct arrangement of E than for A; and that it assumed the gaps between scenes to be of equal weight, whereas this was obviously not true. For example a 1-2-4-3 arrangement of B and a 2-1-3-4 would both be scored -3. Yet the former occurred 69 times, the latter but once. Another objection to this method is that it laid too much stress upon the mere position of a series member with respect to the perfect arrangement and too little upon possible relations between the members as placed. Is 1-4-3-2 (scored -6) better than 4-1-2-3 (scored -9)? In the former there is a correct placing of 1, but no logical sequence, in the latter no correct position of members with respect to the perfect arrangement, but 1, 2, and 3 correctly placed with respect to each other. Likewise a 6-1-2-3-4-5 arrangement of D would be scored -15, a 1-2-3-4-6-5 arrangement -3. Yet the former occurred 178 times, the latter once. Which would seem to have a more reasonable basis? In other words this method was too mechanical.

Method 2. Twenty points credit were given for the correct arrangement of each of the five series. Arbitrary assignments of 15, 10, and 5 points credit were made for such partially correct performances as seemed warranted by the frequency of occurrence. The guide for scoring by this method is:

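Series A: 1-2-3-4 (20)
Series B: 1-2-3-4 (20); partial credit for 1-2-4-3, 1-3-2-4
Series C: 1-2-3-4-5-6 (20); partial credit for 1-2-3-4-6-5, 2-3-4-5-6-1
Series D: 1-2-3-4-5-6 (20); partial credit for 1-6-2-3-4-5, 2-3-4-5-1-6, 6-1-2-3-4-5
Series E: 1-2-3-4-5-6 (20); partial credit for 1-2-4-3-5-6, 1-2-4-5-3-6, 1-2-4-5-6-3, 1-4-2-3-5-6, 2-1-3-4-5-6, 2-1-4-3-5-6, 2-4-1-3-5-6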

The same objection held for this method as for the first in so far as it gave the same credit for complete success in each series from the easiest to the hardest. Moreover, the credit given to partial successes was determined by individual judgment, aided and guided, to be sure, by a consideration of the frequency of occurrence, but even so not resting upon a sufficiently solid foundation to be easily defensible in the face of criticism. Still another solution was therefore sought.

Method 3. This scheme rests upon a purely empirical base. All arrangements of each series which were made by the 961 school children were recorded, together with the number of times each occurred. The number of occurrences was then converted into percentage of the whole. It was found that A was correctly arranged by 84 per cent, B by 72 per cent, C by 51 per cent, D by 38 per cent, and E by 23 per cent. Assigning 38 points credit to E, by inverse proportion D would then be worth 22, C 17, B 12, and A 9 points. By increasing A to 10 and C to 18 the perfect total score for the test became 100 points. Similarly, by means of proportion, score values for the partially correct responses were worked out. Thus:

Series E
Arrangement    Freq.  Per cent  Score
1-2-3-4-5-6    225    23.0      38.0
1-2-4-3-5-6     88     9.1      14.8
2-1-4-3-5-6     64     6.6      10.6

Series D
Arrangement    Freq.
1-2-3-4-5-6    372
6-1-2-3-4-5    178
1-6-2-3-4-5    124

All arrangements which commanded a score of less than .5 were scored 0. But it was observed that some arrangements, occurring frequently among the very young or very dull but rarely among the brighter children, namely, the placing of the pictures in the same order in which they were laid out or beginning at the other end and exactly reversing them, would receive credit, which they evidently did not deserve. Such arrangements were therefore thrown into the no-credit group. The arrangements receiving credit by this method, with the scores for each, are as follows:

Series A: 1-2-3-4, 10
Series B. Arrangements: 1-2-3-4, 1-2-4-3, 1-3-2-4. Scores as printed: 12.0, 1.2
Series C: 1-2-3-4-5-6, 18.0; 1-2-3-4-6-5, 1.6; 1-2-4-5-6-3, .7; 1-2-6-3-4-5, .7; 2-3-4-5-6-1, 2.5; 2-4-5-6-3-1, .5
Series D: 1-2-3-4-5-6, 22.0; 1-2-3-5-4-6, .5; 1-6-2-3-4-5, .7; 1-6-2-3-5-4, .7; 2-3-4-5-1-6, 1.3; 6-1-2-3-4-5, 10.0; 6-1-2-3-5-4, .7; 6-1-2-4-3-5, .5; 6-1-4-5-3-2, .7
Series E. Arrangements: 1-2-3-4-5-6, 1-2-3-4-6-5, 1-2-3-5-6-4, 1-2-4-3-5-6, 1-2-4-3-6-5, 1-2-3-5-4-6, 1-2-4-5-3-6, 1-2-4-5-6-3, 1-2-4-6-5-3, 1-2-4-6-3-5, 1-2-5-3-4-6, 1-3-2-4-5-6, 1-3-5-2-4-6, 1-3-5-6-2-4, 1-4-2-3-5-6, 1-4-2-5-3-6, 1-4-2-5-6-3, 1-4-5-6-2-3, 2-1-3-4-5-6, 2-1-4-3-5-6, 2-1-4-5-3-6, 2-1-4-3-6-5, 2-1-4-5-6-3, 2-1-4-6-5-3, 2-3-1-4-5-6, 2-3-4-5-6-1, 2-4-1-3-5-6, 2-4-1-5-3-6, 2-4-1-5-6-3, 2-4-1-6-5-3, 2-4-3-1-5-6, 2-4-3-5-1-6, 2-4-5-3-1-6, 2-4-5-3-6-1, 2-4-5-6-1-3, 2-4-5-6-3-1, 2-4-6-3-5-1, 2-4-6-5-1-3, 2-6-4-5-3-1, 4-1-2-3-5-6, 4-2-3-1-5-6, 4-2-6-5-3-1. Scores as printed: 38.0, 1.1, .8, 14.8, .8, .5, 6.8, 7.2, 3.4, .5, .5, .6, .8, .5, 10., .8, .8, 1.5, 4.2, 10.6, 2.3, .6, 3.8, 1.1, .6, .6, 7.2, 1.1, 1.1, .6, .5, .5, 1.5, 4.9, .6, .5, .5, 1.1

Method 4. In order to determine definitely whether these arrangements referred to above (putting up the pictures in the same illogical order or its exact reversal) were not typical of the very dull or very young children and whether they could not drop into the no-credit group if these were excluded, still another device was tried. This time only those records were used which showed two or more of the series correctly arranged. This would seem to insure that the individuals on whom we were basing our credit system had a definite idea of what was desired and possessed a certain amount of logical judgment. There were 748 cases fulfilling this requirement. From their records, in the manner described in Method 3, the following score-system was developed. The score was made to read in half-credits each time. That is, .3 to .7 inclusive was scored .5, .8 to 1.3 was scored 1.0, 1.3 to 1.7 was scored 1.5, etc. In the case of all series except the most difficult one the expected happened, the replacing in illogical order dropping to the no-credit class. In the case of E the number of such arrangements dropped from 34 to 9 but would still have received some credit, had they not been eliminated. No credit was given if less than one per cent showed the arrangement. Below is the guide for scoring in this manner:

Series A: 1-2-3-4, 11
Series B: 1-2-3-4, 13.0; partial credit for 1-2-4-3, 1-3-2-4
Series C: 1-2-3-4-5-6, 17.; 1-2-3-4-6-5, 1.; 1-2-4-5-3-6, .5; 1-2-4-5-6-3, .5; 1-2-6-3-4-5, .5; 1-2-6-5-3-4, .5; 2-1-3-4-5-6, .5; 2-3-4-5-6-1, 2.5
Series D: 1-2-3-4-5-6, 22.; 1-2-3-5-4-6, .5; 1-6-2-3-4-5, 5.5; 1-6-2-3-5-4, .5; 2-3-4-5-1-6, 1.; 6-1-2-3-4-5, 8.; 6-1-2-3-5-4, .5
Series E: 1-2-3-4-5-6, 37.; 1-2-3-4-6-5, 1.5; 1-2-4-3-5-6, 12.; 1-2-4-5-3-6, 5.5; 1-2-4-5-6-3, 6.5; 1-2-4-6-5-3, 2.; 1-4-2-3-5-6, 9.5; 2-1-3-4-5-6, 4.; 2-1-4-3-5-6, 10.; 2-1-4-5-3-6, 1.5; 2-1-4-5-6-3, 3.; 2-4-1-3-5-6, 6.; 2-4-5-6-3-1, 3.
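Scoring a record by the Method 4 guide amounts to a table lookup, and the half-credit rule is a rounding to the nearest .5. The sketch below illustrates both under stated assumptions: the credit of 11 for series A is inferred from the later remark that the least credit for a correctly completed series is 11; the partial credits for series B are not legible in the guide and are treated here as 0; and a raw value of exactly 1.3, which the text assigns to two ranges, is sent to 1.5. The names `METHOD_4_GUIDE`, `score_record`, and `to_half_credit` are illustrative only.

```python
# Credits per arrangement, transcribed from the Method 4 guide.
# Assumed: A's credit of 11 is inferred; B's partial credits are unknown (0 here).
METHOD_4_GUIDE = {
    "A": {"1-2-3-4": 11.0},
    "B": {"1-2-3-4": 13.0},
    "C": {
        "1-2-3-4-5-6": 17.0, "1-2-3-4-6-5": 1.0, "1-2-4-5-3-6": 0.5,
        "1-2-4-5-6-3": 0.5, "1-2-6-3-4-5": 0.5, "1-2-6-5-3-4": 0.5,
        "2-1-3-4-5-6": 0.5, "2-3-4-5-6-1": 2.5,
    },
    "D": {
        "1-2-3-4-5-6": 22.0, "1-2-3-5-4-6": 0.5, "1-6-2-3-4-5": 5.5,
        "1-6-2-3-5-4": 0.5, "2-3-4-5-1-6": 1.0, "6-1-2-3-4-5": 8.0,
        "6-1-2-3-5-4": 0.5,
    },
    "E": {
        "1-2-3-4-5-6": 37.0, "1-2-3-4-6-5": 1.5, "1-2-4-3-5-6": 12.0,
        "1-2-4-5-3-6": 5.5, "1-2-4-5-6-3": 6.5, "1-2-4-6-5-3": 2.0,
        "1-4-2-3-5-6": 9.5, "2-1-3-4-5-6": 4.0, "2-1-4-3-5-6": 10.0,
        "2-1-4-5-3-6": 1.5, "2-1-4-5-6-3": 3.0, "2-4-1-3-5-6": 6.0,
        "2-4-5-6-3-1": 3.0,
    },
}


def score_record(record: dict[str, str]) -> float:
    """Total credit for one subject; arrangements not in the guide score 0."""
    return sum(METHOD_4_GUIDE[series].get(arr, 0.0) for series, arr in record.items())


def to_half_credit(raw: float) -> float:
    """Round a raw proportional score to the nearest half-credit (.3-.7 -> .5, etc.)."""
    tenths = round(raw * 10)
    return ((tenths + 2) // 5) / 2


if __name__ == "__main__":
    record = {"A": "1-2-3-4", "B": "1-2-3-4", "C": "2-3-4-5-6-1",
              "D": "6-1-2-3-4-5", "E": "1-2-4-3-5-6"}
    print(score_record(record))
    print(to_half_credit(0.7), to_half_credit(0.8))
```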

The records of all of the school children were scored by each of the above methods. Below are several sample scores. They are placed in pairs for purposes of comparison, so as to emphasize the fact that identical scores by Method 1 show very great differences when scored by other methods.

Sample records (arrangements of series A to E, placed in pairs), with scores by the four methods. Partially correct arrangements appearing in the sample records: 2-1-4-3, 1-3-2-4, 3-4-1-2, 3-2-4-1, 1-2-3-4-6-5, 1-2-6-4-5-3, 2-3-4-5-6-1, 1-2-4-6-5-3, 2-6-3-4-5-1, 1-5-2-3-4-6, 1-2-3-5-4-6, 6-1-2-3-4-5, 1-6-3-2-4-5, 1-2-4-3-5-6, 6-1-2-4-5-3, 2-4-1-6-5-3, 2-4-3-5-6-1.
Scores as printed. Method 1: -15, -15, -33, -33, -45, -45. Method 3: 100, 100, 35.6, 78, 37.3, 11.6, 49.3, 32. Method 4: 100, 36.5, 86, 12, 46.5.

Curves were plotted showing the median scores at each age from six to sixteen for the four methods. Bearing in mind the mechanical character of Method 1 and the objections that arose as it was used, it is surprising to note how even is the curve that rises from its medians. The curve for Method 2 was particularly gratifying from the point of view of one seeking a standardization by age, but as remarked above, the method by which it was derived savors too much of the “this must be best because we think it is” attitude. Therefore, in the end, it was deemed best to adopt Method 4 as the most reasonable device for scoring this particular picture arrangement test. In the following discussions scorings by this method only are used.

Graphs were drawn of the distribution of the scores made in the 8, 9, 10, 11, 12, 14, and 16 year groups, and the adults. For convenience, the one hundred possible scores were divided into groups as indicated (a = 0-10.5, b = 11-23.5, c = 24-40.5, d = 41-62.5, e = 63-99.5, f = 100). The division was made in this manner so that the first group would include those who did not correctly complete any one (the least credit for correct completion of any series being 11), the second group would include those who did one correctly with partial success in one or two others, etc. The ability (whatever it may be) seems rather widely distributed. Eleven and twelve years each show one decided mode, falling at d (41-62.5) for eleven, and e (63-99.5) for twelve. At nine years practically the same number make b, c, and d (ranging from 11 to 62.5). The adult group shows a mode at e (63-99.5) with rather high identical levels for c, d, and f. Indications are that a few quite young children are very well endowed with this particular line of ability while a corresponding number of older children and adults are poorly equipped with it.
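The grouping of scores into the six classes a to f can be written out directly from the limits given above. The function name `score_group` is an assumption for illustration; scores are taken to be in half-credits, so no value falls between the printed limits.

```python
def score_group(score: float) -> str:
    """Return the distribution group (a-f) for a Method 4 score of 0 to 100.

    Limits follow the text: a = 0-10.5, b = 11-23.5, c = 24-40.5,
    d = 41-62.5, e = 63-99.5, f = 100.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    if score >= 100:
        return "f"
    if score >= 63:
        return "e"
    if score >= 41:
        return "d"
    if score >= 24:
        return "c"
    if score >= 11:
        return "b"
    return "a"


if __name__ == "__main__":
    for s in [0, 10.5, 11, 40.5, 46.5, 99.5, 100]:
        print(s, score_group(s))
```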

A somewhat different device for showing the distribution of the scores by ages is that used in Table IV from the figures in which the curves in Figure II are plotted. In this table are given the maximum, the twenty-five percentile (below which 75 per cent fall), the median, the 75 percentile (below which 25 per cent fall) and the minimum scores for each age. The adult group is included. The sexes are separated.
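The five figures reported for each age can be sketched with the standard library. Note the naming used in the text: the "25 percentile" is the score below which 75 per cent fall (the upper quartile) and the "75 percentile" the score below which 25 per cent fall (the lower quartile). How the quartiles were actually interpolated in the original work is not stated; the inclusive method of `statistics.quantiles` is an assumption, as are the function name and the sample scores.

```python
import statistics


def score_summary(scores):
    """Minimum, '75 percentile', median, '25 percentile' and maximum of a score list.

    Follows the text's naming: '75 percentile' is the lower quartile and
    '25 percentile' the upper quartile. Requires at least two scores.
    """
    lower_q, median, upper_q = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),
        "75 percentile": lower_q,
        "median": median,
        "25 percentile": upper_q,
        "max": max(scores),
    }


if __name__ == "__main__":
    # Hypothetical scores for one age group (illustration only).
    print(score_summary([0, 10, 20, 30, 40]))
```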

Table IV. Showing minimum, 75 percentile, median, 25 percentile and maximum scores.

Boys
Age: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, Adult
No.: 11, 20, 40, 46, 60, 53, 56, 55, 57, 58, 34, 26
Min.: 0, 0, 0, 0, 0.5, 0, 11, 1, 12, 13, 18, 17
75 %ile: 0, 0, 12, 17.5, 27.5, 41, 46.5, 39, 50, 42.5, 52, 43.5
Med.: 0, 10.5, 24, 40, 43.5, 53, 64.5, 67, 63, 56, 71.5, 66
25 %ile: 5.0, 17.5, 36, 59, 66, 72.5, 85.5, 90.5, 78, 78, 86.5, 100
Max.: 41, 49, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100

Girls
Age: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, Adult
No.: 9, 20, 38, 49, 51, 47, 43, 58, 52, 65, 39, 25
Min.: 0, 0, 0, 0, 0, 16.5, 11, 0, 12.5, 10, 10, 29.5
75 %ile: 0, 0, 5.5, 18.5, 40, 31.5, 40, 35, 40, 41.5, 38.5, 47.5
Med.: 0, 1.0, 15.5, 31.5, 52, 46.5, 63, 57, 67, 56.5, 53, 67
25 %ile (ages 6 to 16): 13, 18.5, 37.5, 58, 75.5, 65, 80, 78, 86, 73, 63.5
Max.: 25, 56.5, 69, 100, 100, 100, 100, 100, 100, 100, 100, 100

Figure II. Curves of the maximum, 25 percentile, median, 75 percentile, and minimum scores at each age (adults included), plotted from Table IV.

It will be observed from a comparison of the charts in Figure II, that the boys are slightly superior to the girls at all ages except 10, 14, and 15. The greatest difference is at sixteen, one of the smallest groups. Since the difference is small and not persistent the chances are that a multiplication of cases will tend to bring them closer together. Neither is the range of variation very different for the two sexes. Mrs. Woolley,1 in her work with adolescents, found that the girls showed a greater number of very poor individuals and a smaller number of very good as compared with the boys. In this test 11 per cent of the boys and 10 per cent of the girls score 100, and 17 per cent of the boys and 18 per cent of the girls score less than 24. The numbers of very good and of very poor individuals seem therefore to be about evenly divided.

One striking feature of the curves is the failure to show any decided or sustained increase in ability beyond the tenth or at most beyond the twelfth year. This was also observed when the results were treated in the all-or-none, correct-or-not-correct manner. In an effort to discover an explanation of the relatively low 15 and 16 year levels, it was observed that at sixteen years 30 per cent of the subjects were more than one year behind the grade expected if they entered at six years and progressed at the normal rate. At fifteen 18 per cent were more than one year behind, at fourteen 17 per cent, while at the relatively high ten-year level only 3 per cent were so retarded. It was thought that this might explain the irregularity, but a closer study revealed that when those individuals who were more than one year behind grade were eliminated from the sixteen year group the median remained at exactly the same point, and the minimum also remained the same. So that apparently, so far as this test is concerned, the ability of these educationally retarded individuals parallels that of the up to grade group. It would not do to assume that a low or high level was caused by an over-weighting of the group with educationally retarded or accelerated individuals.

So far the time element, which Decroly considered of prime importance, has been disregarded. This is almost rendered necessary by the new procedure which accepts incorrect as well as correct responses. The time spent upon an incorrect completion would have little meaning. As a matter of interest the time medians were determined for those cases correctly completing each series. These medians (expressed in seconds) are shown in the table that follows. There is for each series a definite decrease in time with increase in age, but the number of cases on which the figures are based is in many instances comparatively small. Again it will be observed that the most decided differences occur before the twelfth year.

1 Woolley, Helen Thompson. New Scale of Mental and Physical Measurements for Adolescents and Some of its Uses. Jour. of Educational Psychol. Vol. VI, No. 9, p. 521.

Age             6    7    8    9   10   11   12   13   14   15   16

No. of Cases    2   11   45   70   87   90   90   99  102  102   69
Time           68   49   31   29   24   21   20   15   17   16   16

No. of Cases    2    9   38   53   79   74   79   83   90  100   63
Time           42   42   36   31   24   22   22   19   18   19   18

No. of Cases    1    4   15   36   43   50   53   68   75   81   43
Time          134  115   81   69   61   51   50   48   44   45   44

No. of Cases    1    0    2   19   41   37   47   54   63   62   37
Time           90    -   71   72   53   52   46   50   40   41   39

No. of Cases    0    0    5   12   22   14   37   41   32   33   26
Time            -    -   90   90   71   57   63   60   50   52   44
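As a rough illustration of how such time medians could be tabulated, the following sketch keeps only the correct completions of one series and takes the median time at each age. The record layout, the function name `median_times_by_age`, and the sample figures are all hypothetical; they are not data from this study.

```python
from collections import defaultdict
from statistics import median


def median_times_by_age(records):
    """Return {age: (number of correct cases, median time in seconds)}.

    `records` is an iterable of (age, correct, seconds) tuples for one series.
    Incorrect completions are ignored, since the time spent on them
    has little meaning.
    """
    times = defaultdict(list)
    for age, correct, seconds in records:
        if correct:
            times[age].append(seconds)
    return {age: (len(t), median(t)) for age, t in sorted(times.items())}


# Hypothetical records, invented for illustration only.
sample = [
    (8, True, 30), (8, True, 36), (8, False, 95),
    (9, True, 28), (9, True, 31), (9, True, 25),
]
result = median_times_by_age(sample)
print(result)
```

An age at which no subject completed the series correctly simply does not appear in the result, which corresponds to an empty time entry in a table of this kind.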

Of particular interest to the examiner who is using the test for practical purposes are the different types of reaction to the task. There is the careful worker who looks at all the pictures and evidently “gets” the scheme before he begins to arrange at all. There is the one who starts out hastily and works by a sort of trial and error method, putting them up and changing them about as he detects his errors until he is finally satisfied. Some work so rapidly and carelessly as not to seem to think at all. Especially interesting is the one who is evidently influenced by suggestion; that is, he can not get away from the illogical order in which he first sees them. He can not strike out on his own initiative, or even if he does place one or two correctly, when he comes to a difficult point he is likely to accept the suggestion offered by whatever order the remainder of the scenes happen to have assumed. This type is especially prevalent among the younger and the very dull children. Then there is the one who can not grasp the idea of connection between the scenes. Quite frequently among the six year olds the child wanted to tell a little story about each separate picture.

Just what the nature of the ability measured by this test may be is uncertain. Decroly seems to consider it as testing primarily logical judgment. Yet he too points out that the difficulties involved are dependent not only on the length of a series but upon differences of meaning, and details of form, of color, of arrangement, and of perspective. Certainly it does call for rather close attention, keen perception, appreciation of the meaning of the perceived details, and logical judgment, based on analysis and imagination. The mental activity involved is of a decidedly complex nature. Successful performance of the task implies not only that the subject is possessed of several very different mental powers but that he is able to direct their harmonious working together to achieve a desired end.

7. Summary. Briefly summarized, the work thus far seems to have achieved the following results:

A picture-arrangement test has been devised which avoids the use of language by the subject; which rarely fails to attract and hold the attention; which is almost entirely independent of “school learning”; which can be given in from five to ten minutes; which is graded in difficulty; which admits of a scoring which gives credit for partial success; and whose method of administering and scoring eliminates almost wholly the personal equation of the examiner. The data at hand indicate that the ability to perform the test is almost entirely lacking at seven years or below, emerges rapidly from seven to ten, and beyond twelve is a very variable quantity. The adults in our study showed little or no improvement over twelve years. (It should be remembered that they were a small group of summer students.) It seems quite possible to the author that here is a test which, though of little value in establishing “mental age,” at least beyond ten years, may prove to be of great significance to the clinician seeking to determine the possibilities of a subject for independent work in which he may have to meet new situations and be able to “put two and two together and get four.” Much further experimentation with different types and classes of normal and subnormal subjects is necessary before a final conclusion can be reached as to the value of the test, but it seems to offer great promise.

At present, for all practical purposes, the best interpretation of results obtained from its use with any one subject can be made by referring his score to the proper age and sex group and noting where it falls within that group, whether at or near the median, twenty-five percentile, seventy-five percentile, etc. For example, suppose that a nine-year-old girl made a score of forty-two points. Referring to Figure II (Girls) we see that she has considerably surpassed the median score for her age. Or if a twelve-year-old boy makes a score of twenty-five points, by the same procedure it would be evident that he had made an exceedingly poor performance, falling about half-way between the seventy-five percentile and the minimum. Any score falling within the upper shaded portions represents an exceptionally good performance, any within the lower shaded portions an extremely poor one, while those located between the shaded portions represent fair, medium, or rather good performances.

Since the completion of the above study this picture arrangement test has been given along with the Yerkes-Bridges Point Scale and other mental tests to seventy girls at the Ohio Girls’ Industrial Home. The correlation between the picture arrangement scores and other point scale scores of these girls, obtained by the Spearman “foot-rule” method, is 0.50 (P. E. 0.58).
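The procedure of referring a score to its age and sex group can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `place_score` and the group scores are hypothetical, not norms from this study, and the quartiles are named "lower" and "upper" here to avoid any dependence on the direction in which the percentiles of the figures are counted.

```python
from statistics import quantiles


def place_score(score, group_scores):
    """Say roughly where `score` falls within one age-and-sex group."""
    # Quartile cut points of the group (assumed stand-ins for the
    # percentile lines of the figures).
    lower_q, med, upper_q = quantiles(group_scores, n=4, method="inclusive")
    if score < lower_q:
        return "below the lower quartile"
    if score > upper_q:
        return "above the upper quartile"
    if score < med:
        return "between the lower quartile and the median"
    if score > med:
        return "between the median and the upper quartile"
    return "at the median"


# Hypothetical scores for one group, invented for illustration only.
group = [10, 18, 24, 30, 36, 42, 50, 64, 100]
print(place_score(42, group))
print(place_score(12, group))
```

With real norms the cut points would be read from the figure for the subject's own age and sex rather than computed from a handful of scores.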
