The future of the essay as an assignment type is reportedly in growing doubt as new AI bots become cleverer. The Guardian reported that academics were “stunned” by the latest outputs and opined that the job security of academics, journalists, and programmers was in doubt (not the programmers… won’t someone think of the programmers?!).
As someone who has had to write more than enough essays on various topics in history in the last few years, I have a selfish interest in hoping that the demise of this format is rapidly accelerated. I decided to try the latest bot getting attention (ChatGPT), and there is no doubt that the output is very impressive. I put in three different prompts from across disciplines: (1) to outline the limitations of qualitative research (heheh), (2) to explain, using an example, the SN2 reaction mechanism in chemistry, and (3) from my recent studies, to explain some peculiarities of the Welsh aspects of the Reformation in early modern Europe (with a note to tutors that this search came after my essay submission, yn amlwg).
The bot did a very good job on all three, and while I didn’t join my Grauniad colleagues in being stunned, I was impressed. The question prompts were slightly different. The first is a classic “what are the disadvantages” type of essay question, and the bot dutifully summarised issues relating to sample size and generalisability that anyone versed in the discipline would be expected to recount, structuring its answer as an opening gambit followed by 2-3 bullet points.
In the second, I asked about the chemistry mechanism with a request that a particular example be cited (of the bot’s choice, I am not a monster). The format of the answer was very similar to the first: an opening statement, 2-3 bullet points covering the headline points of the mechanism, and then a discussion of the mechanism as it applies to a particular reaction. It was impressive, but like my banking bot’s friendly-seeming demeanour, the structure began to look suspiciously repetitive.
My Welsh history prompt was meant to catch it out. But the bot is clever, and it did highlight some issues particular to Wales (the conflicts over the vernacular, the role of the laity), naming some of the key players. Still, the structure was looking increasingly familiar, and the level of detail on the particular nuances would not be enough to impress my tutor, who likes a much more in-depth analysis.
Across all three prompts, then, the bot proved remarkably good at summarising the headline themes raised in the prompt and presenting them in the order they appeared there. And that word, summarise, should be our key to how we think about bots and their role in education and assessment. The level of detail, while very impressive, is essentially (in my opinion) a nice repackaging of the kind of top-level information Google would throw up in its knowledge panel about a particular topic. So concerns about academics’ future job prospects could be mitigated if we think about assessment prompts that incorporate some kind of synthesis, problematising, reference to particular publications, or anything else where a human insight is needed. There will likely be much more to say about this in the coming years as the bots get smarter and the concerns about academic integrity grow, but there is an age-old lesson in here: assessment types that facilitate easy or lazy approaches are likely to be those at greatest risk of integrity issues, be it straightforward Ctrl-C/Ctrl-V or some bot that can assemble readily available information in a click.