We’ve all taken an exam or two in our lifetime, but how did the process of examinations and testing first develop? James Elwick, author of Making a Grade, discusses his new book and looks at how standardized testing practices quietly appeared during the Victorian era and then spread worldwide.
From James Elwick
Examinations test for knowledge and skill, but they really gain their power when they become standardized, which allows more people to write them and trust in the results. Making a Grade: Victorian Examinations and the Rise of Standardized Testing is a book about exams and how and when this standardization began. Some key points from Making a Grade can be found in a video posted here.
Making a Grade emerged partly out of my curiosity about some unexplored parts of the life of the famous Victorian biologist T.H. Huxley. While Huxley is noted mostly for his defense of Darwinian evolution by natural selection, it’s virtually unknown that he also ran large-scale physiology and zoology exams. A chapter recreates one of Huxley’s large-scale 1873 exams. Over the decades, these were taken by tens of thousands of candidates, many of whom in turn went on to teach biology, imitating Huxley’s system. There were others involved with exams too: Alfred Russel Wallace, who co-discovered evolution by natural selection, made extra money by marking thousands of exams in physical geography. Meanwhile, the first-place result of Philippa Fawcett on the 1890 Cambridge University Mathematics Tripos demonstrated women’s intellectual equality.
Huxley, Wallace, and Fawcett were part of a larger culture that was obsessed not with teaching, but with examinations. Between about 1850 and 1890, both in the British Isles and then across much of the English-speaking world, examinations came to be seen as a key device not only of education but of identifying talented people. Huxley, Fawcett, and other talented individuals were discovered through their exceptional performances in exams. The same is true today.
My interest in exams was inevitably coloured by my early teaching experiences. “How much of this will they remember in two weeks?” my inner sceptic’s voice asked as I watched my students write their tests, scribbling out answers to exam questions in pre-printed blank booklets. “Do they truly understand the material?” that same sceptic’s voice asked when I marked those tests. Many answers simply repeated phrases, sometimes word for word, from lectures or readings, often jumbled together in strange new formulations. I suspect other readers of this blog may ask themselves the same questions. Huxley certainly did.
This problem of “regurgitation” (a word Making a Grade studies in detail) arises not only in fields requiring memorization but also where concepts must be applied, such as in mathematics or the “harder” sciences. Eric Mazur administered “traditional” exams, testing his Harvard students’ knowledge of Newtonian mechanics, and got the expected normal results. However, when he tested them again on their actual understanding of the underlying concepts, he found that half “had no clue as to what Newtonian mechanics were about.” Making a Grade discusses earlier, similar cases – the Royal Society mathematician tutoring Euclidean geometry to students who had memorized the entire Elements word for word but did not seem to understand it, or examinees who did well on the Cambridge Mathematics Tripos but then confessed they did not actually understand the mathematics they were being tested on. Such cases illustrate a gap between what exams are supposed to do (test knowledge) and what people actually understand. To close this gap, people come up with ingenious tactics (e.g. forms of rote learning we call cramming; ways to subvert the standardization we call “cheating”). As examiners respond in turn, an arms race emerges. Think about today’s students using Chegg on their remote exams, and the scramble of many instructors to respond in order to protect the credibility of these tests.
Making a Grade also takes up other topics – a growing reliance on paper credentials to certify experts; the rise of cheating and other forms of academic misconduct; an individualistic insistence on academic “integrity”; the use of exam-derived metrics as measures of “accountability” – and so covers many issues that still matter in our own suspicious time. If the examination is indeed a hidden engine of meritocracy, a key device that certifies the experts to whom we are supposed to listen… can it be trusted?
However, it is hard to build anything on such sceptical foundations. One can ask more constructive questions as well. How can exams be trusted? How can one examinee trust that their test will be marked fairly, in exactly the same way as others’ exams? How can a person be accredited as knowing something (correctly)? How do people come to mean the same thing when they use complicated words like “chromosome,” “evolution,” or “iambic pentameter,” and trust that others roughly know the same information as they do?
Such questions focus on how knowledge becomes a collective attainment that can be trusted. They can be answered by looking at the physical objects, rules, and routines that make exams possible. Making a Grade calls these, collectively, an exam’s “infrastructure.” We tend to think of “infrastructure” as big invisible systems, like roads or bridges. But scholars such as Geoffrey Bowker and Martha Lampland have lately argued that infrastructure is anything that makes large-scale activities possible. Infrastructure also tends to be boring, in that its very success means it gets taken for granted.
By focusing on objects, rules, and routines, it’s possible to also study how large-scale exams travelled overseas from the British Isles to places like Trinidad: for instance, how in 1867 blank templates of exam answer booklets were sent from the University of London to Mauritius for local printers to copy, and then to ten exam centres in Canada the next year. Similarly, we can see how the controversial idea of paying teachers or students for good exam grades – called “payment by results” – travelled from the UK to India around 1885. What about influences in the other direction? Did the idea for widespread testing come from the famous Chinese imperial exams? This is less clear – there were certainly French influences, which had been inspired by accounts from China. A recurring point made by many Britons was that exams were making their country more “Chinese,” and Making a Grade discusses why.
Every reader of this blog has probably taken an exam or two in their lives, and possibly given an exam or two. Making a Grade tries to show that their experiences are widely shared. It also tries to show that abstract concepts such as “meritocracy” or “accountability” depend not only on concrete devices like examinations, but also on systems too often taken for granted – systems we rediscovered when COVID-19 forced us to test remotely.