A Judgment Test for Measuring Intelligence

Author:

Cyril Burt, M.A., D.Sc.

Psychologists are often asked for a good test of intelligence to be used with adults. Of those at present available none is wholly satisfactory. The harder problems in the Terman-Binet scale sound too much like a teacher’s cross-questioning in school: “If two pencils cost fivepence, how many pencils can you buy for fifty pence?”* On the other hand, the booklets of logical problems used for ordinary group-testing are to some examinees a painful reminder of the printed papers set at the end of a summer term and productive of so much dread and humiliation. Indeed, are not all adults, and particularly those whose education finished at fourteen in Standard V or VI, likely to feel embarrassed, not to say resentful, when some critical stranger arrives and proceeds openly to probe the shortcomings of their intellectual capacities or attainments?

  * Test 4 for “Average Adults” in the new Terman-Merrill scale.

It is partly in the hope of avoiding these and other difficulties that the following test has been constructed. The examinee is given a number of short reasoned statements, each typed out on separate cards, and relating to everyday facts and problems. He is asked to arrange them, so far as he can, in an order of merit, indicating which he thinks are the more justifiable and which he thinks the most stupid. Each statement is, as it were, a potted argument; and the wording is so chosen that the validity of the conclusion depends not on the truth of the premises but on the reasoning involved. The examinee’s intelligence is then measured by the correlation between his order and a standard order, which is taken to be the arrangement which a perfectly intelligent person would make. The dullest do little more than sort the statements into two main groups: those that they are willing to accept and those that they feel to be manifestly foolish or absurd. The abler persons can discriminate much finer shades of validity; and can sort the cards into five or ten groups, or even rank them in a sequence.
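The scoring rule just described may be illustrated by a short sketch. The article does not say which coefficient of correlation is used; the sketch assumes a rank correlation of the Spearman kind, with cards placed in the same group sharing the average of the ranks they occupy. The function names and the six-card orders are hypothetical and serve for illustration only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next card has the same placing (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("an order with no variation cannot be correlated")
    return sxy / sqrt(sxx * syy)


def judgment_score(examinee_order, standard_order):
    """Assumed scoring: rank correlation between the examinee's placing of
    each card and the standard placing (1 = most justifiable)."""
    return pearson(average_ranks(examinee_order), average_ranks(standard_order))


# Hypothetical data: the standard placing of six cards, one examinee who
# ranks them in a full sequence, and one who sorts them into two groups only.
standard = [1, 2, 3, 4, 5, 6]
full_sequence = [2, 1, 3, 4, 6, 5]
two_groups = [1, 1, 1, 2, 2, 2]

print(round(judgment_score(full_sequence, standard), 3))
print(round(judgment_score(two_groups, standard), 3))
```

Averaging the tied ranks allows an examinee who merely sorts the cards into groups to be scored on the same scale as one who ranks them in a full sequence.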

The test has been tried out in two forms. The harder contains thirty reasoned statements, of which only a few are flagrantly absurd and the majority involve fallacies that are more subtle or concealed. This was the earliest form attempted. It was originally devised as a vocational test to measure the intelligence of abler adults, for whom the Binet scale or the written group tests were far too childish; and was first tentatively employed at the National Institute of Industrial Psychology by Miss Gaw and myself. We found that it worked quite smoothly, and apparently gave reliable results in cases of vocational selection. More recently, after some revision of the material, it has been applied to forty-five senior students and members of a College staff, whose relative intelligence could be fairly accurately assessed; and, on correlating the results with the order of intelligence, a co-efficient of .67 has been obtained.

This has encouraged us to attempt a more elementary form of the test; and, in spite of difficulties in choosing a sufficiently simple wording, we find that the general method is almost as effective with duller adults as with bright. Mr. Russell has tried out this simplified form with a group of thirty-three adults at a working men’s club, and obtains a correlation of .73.

Both groups were small; but, with adults, it is exceedingly difficult to procure any reliable estimation for the intelligence of larger groups against which the test can be checked. The test material itself will no doubt need further amendment and extension before it can be recommended for general use: Mr. Russell, for example, has been for some time experimenting with performance material, in order to avoid the handicap which all verbal tests at times impose. Meanwhile, we believe the underlying principle may be of interest to those who desire to construct tests of their own for similar purposes.

The special advantages of the method would seem to be these. It is well known that tests that depend on logical reasoning, and presuppose no special knowledge of the subject-matter, yield by far the best measurements of intelligence. If the questions and their wording are simple enough, such problems can be solved by children with a mental age of 6 or 7. But tests of reasoning are of two main kinds, “constructive” and “critical”. In the former, which make up the bulk of tests in common use, premises are given, a question raised, and the examinee has mentally to construct an argument leading to the right solution: the arithmetical problem quoted above from Terman is of this type. In the other form, the whole argument is presented to the examinee ready-made, and the examinee has to judge its logical value: Binet’s absurdity-tests are the best known examples.

Now the orthodox view of the theoretical psychologist maintains that intelligence essentially consists in explicitly “educing relations and correlates”. So it does, when the reasoner is an adult theorist. But in a very early paper on mental testing (the first, I believe, in which the importance of logical relations was stressed) I pointed out that, among children and practical people, intelligence was more frequently displayed, not by the explicit, step-by-step inference of the logician and the scientists, but by a “complex synthetic activity, comparable to what is popularly described as ‘intuition’, whereby we implicitly comprehend the intelligible character of a whole, without explicitly analysing it into its component parts or distinctly formulating their relations”†. With children, tests of “judgment” (as this intuitive process may be called) gave even higher correlations than tests requiring formal syllogistic inference. Further experience has led me to the conviction that the more familiar tests of the latter type may at times do grave injustice to the practical or intuitive intelligence of the “average adult” as we find him in daily life.

For this reason absurdity tests prove extremely effective so far as they go. But with adults, at any rate, a long series of statements containing nothing but absurdities is more likely to insult their intelligence than to measure it. Hence the fallacies employed must be for the most part mild rather than obtrusive. But then how are you to tell whether the examinee has really detected the fallacy or not? Once again the practical man may be quick to perceive the flaw, and yet find it very hard to put his criticism into words. In testing aesthetic appreciation we should never judge a person by his ability to append detailed reasons for his intuitive preferences; we should be satisfied if he could grade the test-specimens (pictures, patterns, musical extracts, or the like) in an order which agreed with that of accredited experts on art. Similarly, I suggest that the best way to test appreciation of what is rational or irrational is to be found in a double comparison: the examinee compares the items and we compare the results of his comparison with those of the logical expert.

  * For fuller explanations, with examples, see “The Backward Child,” pp. 523 et seq.

  † “Experimental Tests of Higher Mental Processes and their Relation to Intelligence,” J. Exp. Ped., I, 1911, pp. 101 et seq. Cf. Board of Education, “Report on Secondary Education, 1939,” p. 436.

The essential principle is by no means new. It was briefly described in a review of psychological tests drawn up for the Consultative Committee of the Board of Education*; but it has mainly been employed for testing artistic and moral judgment rather than intellectual. But after all, plenty of intelligence tests require the examinee to arrange objects in order and then measure his accuracy by the correlation between his order and the true order: Binet’s test of weights, for example, where five boxes are to be arranged according to their heaviness, really exploits the same procedure in an elementary guise.

Perhaps the greatest merit of the test lies in the fact that its aim is not too plainly manifest. When the test was given to a class of students in 1935, not one suspected that the object was to measure their intelligence, though several of them volunteered the comment that intelligence was the essential quality required to carry out the task efficiently. The majority thought that the object of the test was to measure their suggestibility or credulity; and fortunately no person objects to demonstrating that he is by no means credulous or suggestible. Finally, though such proposals are perhaps irrelevant to the main purpose of this article, I may point out that the same principle may be applied to test the concrete nature of different persons’ judgments on a wide variety of topics. “Mass observation” is the fashion at the moment. By getting various individuals to rank beliefs in order of acceptability, and then applying a factor-analysis to the results, striking sidelights may be thrown on current views in regard to social, industrial, and even political problems. The method is therefore one of wide possibilities for the social psychologist.

  * Report on Psychological Tests of Educable Capacity, 1924, p. 58.
