In case you missed my first post, I am blogging an unpublished paper as a series over several days. You can read that post to understand the story and the reasoning behind this. Comments on each post are welcome. This is the seventh post, and it covers Attitudes Towards Automated Essay Grading, part of the Findings section. Part 1 covered the abstract/references, Part 2 the intro/lit review, Part 3 the methodology/positionality, Part 4 Findings: General Attitudes Towards AI, Part 5 Attitudes Towards Turnitin, and Part 6 Attitudes Towards Teacher Bots.
Attitudes Towards Automated Essay Grading
Participants were quite skeptical of this application of AI. None of them could imagine it working well for the more difficult task of understanding diverse ways of writing and structuring text. They could see it working for grammar and the more technical aspects of writing, but not for actually assessing writing quality. Most said they would be willing to use it as a first line of assessment before a human looked at the writing in more depth. A few participants expressed concern that an automated grader would reproduce biases and expect more standard or dominant modes of writing, and would therefore wrongly mark down writing that was less common or more creative.
AUC1 and AUC5’s immediate reaction to this was a firm “no”. AUC5 believed that a human may need to read something several times to understand it correctly, and that a machine would miss nuances, perhaps misunderstand colloquial language, and might be unable to judge whether someone had made the perfect word choice rather than merely a correct one. They also said “when dealing with students, you have a history”: for example, you know whether you had taught a student a particular word or structure before and can refer to that.
AUC3 reminded us that even when two humans grade the same paper, there are discrepancies, sometimes up to a full grade up or down. AUC1 raised a similar concern:
“I teach and I know how difficult.. interrater reliability and bias and standardizing [are] and I feel the writing part has to have a human element involved. You can do this [use software] in grammar, count mistakes, but for, like, ideas and doing thesis statement and details and examples… You can’t do that. I wouldn’t trust the number. Could be like a starting point to filter, then [you would need] a second eye, second grader”, a human one.

Several participants (e.g. AUC1, AUC2, SAU1, SAU4) were comfortable allowing an automated essay grading system to give preliminary feedback before a human looked at the essay, what SAU2 called a “balance” of “moderating” it afterwards. SAU4 put it this way: “don’t take the human out, but take the boring stuff that the human has to do”, so as to get higher value from face-to-face time for human interaction; they likened this to the rationale for flipped classroom teaching.
SAU4 had some experience teaching writing. When asked about automated essay grading, she said:
“On the face of it “ew”… coz you want a human to be reading… but research I found, is that machine grading compared to human grading is not as far off. We valorize how good human grading is in the first place”.
Her point was that in large first-year classes, grading is often done by tutors with less experience, and students either don’t receive quality feedback or it takes so long to reach them that it is too late to be useful. SAU3 made a similar point: students currently are probably not getting a great experience anyway. Both SAU3 and SAU4 suggested we collect evidence on the quality of these tools and test them before we judge.
SAU3 and AUC5 expressed concern about how these services would assess the writing of non-native speakers: whether they would be trained on more dominant ways of writing or expression, and whether they might unnecessarily mark down text that is correct but less common, or colloquial. In a similar vein, AUC3 was concerned the software would be biased towards one standard way of writing, because people have different writing styles and “it’s not necessary that every A paper looks the same”; she worried that such software would expect exactly that. SAU1 raised the concern that if the knowledge bank used to train the AI were “Northern or Western”, such that it disempowered the expression of local knowledge in local ways and instead assumed “standardized means a particular [Western or Northern] discourse”, then it would be “quite problematic”.
Participants across AUC all mentioned particular aspects they felt the software would be unable to judge, such as nuance, metaphor, context, and actual meaning beyond how the sentences looked on the surface. A couple of people at both institutions mentioned Grammarly as a tool that did help with grammar, since grammar is more rule-based and easier to work on than other aspects of language that are more nuanced and culturally dependent.
While AUC4 recognized that even human grading of essays could be highly problematic, they remained skeptical of automated essay grading: we should always ask “does it make education better?” before we try something, and we should not use a tool “just because you can”.
SAU5 said “I suppose it really challenges what we understand as a teacher and the position of the teacher and lecturer. It’s quite a new role that is being developed and one needs to kind of see how that feeds into our understanding of teaching and learning.”
That’s it for now – what do you think? Will we someday have tools that help students automatically write their papers, only to be graded automatically and plagiarism-checked automatically? What the heck? It does not feel like an exaggeration to imagine this slippery slope to me!
Photo by Setyaki Irham on Unsplash This is another one that is not literal because I thought when I searched for grading I’d find something with a teacher’s markings over a paper or something but I got odd things. This one showed up because of “color grading” and it reminded me of how much they use color in things like Turnitin.com so it looks kinda pretty and a bit messy.