Estimated reading time: 5 minutes, 46 seconds
In case you missed my first post, I am blogging an unpublished paper as a series over several days. You can read that post to understand the story and the reasoning behind this. Comments on each post are welcome. This is the eighth post, and it covers Attitudes Towards Voice/Speech Recognition and Automated Translation, as part of the Findings section. Part 1 covered the abstract/references, Part 2 the intro/lit review, Part 3 methodology/positionality, Part 4 Findings: General Attitudes Towards AI, Part 5 Attitudes Towards Turnitin, Part 6 Attitudes Towards Teacher Bots, and Part 7 Attitudes Towards Automated Essay Grading.
Attitudes Towards Voice/Speech Recognition
Most participants had some experience with speech recognition like Siri, their car’s speech recognition software, or Google Docs, and some had experience with packages that were individually trained on the person’s voice (such as Dragon). AUC3 thought speech recognition was useful for people “while on the go” or who don’t like to write, but that it often made mistakes and needed a layer of editing afterwards. Interestingly, two participants (AUC1, AUC2) mentioned using speech recognition software to teach students to speak with a clearer/better English accent. AUC1 had actually already used it this way in class; AUC2 was imagining it as an option. That said, several participants recognized the potential for bias in speech recognition software, for example towards native accents, such that when trying to use it for other purposes like actually writing text or transcribing recordings, it would not work as well with non-native accents (SAU4, SAU5, AUC1, AUC2, AUC5). In particular, SAU4 and SAU5 had experience using such software to create text scripts for videos and found it sometimes useful, though less so with technical terms and different Englishes. SAU5 mentioned how even using human transcribers from a different culture had been problematic, because they did not understand the accents and context of South African material, and a local person was needed to correct some of the mistakes made by the overseas transcribers.
Most participants felt that this software could be most beneficial for people with disabilities. AUC4 specifically talked about how this software needed extensive training in order to work accurately, and that no one usually took the time to train it – unless it was someone who would use it a lot, such as someone with a disability, in which case they might invest the time. This was perhaps the first instance, in all of the interviews so far, of placing responsibility and agency on the human user to help improve the software, rather than just filling in the gaps where the software fell short. Perhaps this is because speech recognition software is indeed individually trainable (especially dedicated tools like Dragon, rather than generic ones like YouTube’s free closed captioning) in ways accessible to the lay user. SAU1 said she had experience with early versions of such software, and the new versions are much improved at dealing with non-native accents and jargon. SAU2 and SAU3 seem to have had good experiences using it with participants who have disabilities (e.g. paraplegics), which probably comes back to what AUC4 was saying about those particular individuals investing time to train the software on their speech. Of course, it is also useful for people with a hearing disability, or for non-native speakers who may not easily understand native speakers, and vice versa.
Across the board, participants recognized that speech recognition tended to work best with native speaker accents and was more accurate in a more generic context free of jargon and technical terms. They also tended to recognize that human transcribers could have similar limitations.
Attitudes Towards Automated Translation
Before sharing findings here, it is important to note that Google Translate is not of equal quality across languages. It is much better at translating between English and French/German/Spanish, but not very good at translating between Arabic and English. There are many reasons why this might be the case, including simply the amount of data available to train on and the differences in grammar structures across languages, but also the deep contextual knowledge needed to understand Arabic writing: the diacritics needed to identify a word (the vowel sounds) are almost never written in digital texts, so the translation software must rely on context much more than usual to figure out which word it is translating, let alone what that word means in this context. This is already a difficult task, even for native speakers of Arabic.
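To make the diacritics problem concrete, here is a minimal Python sketch (my own illustration, not from the paper) showing how three distinct Arabic words built on the same letters collapse into one identical string once the short-vowel marks are stripped away, as they almost always are in digital text:

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Remove Unicode combining marks, which include Arabic short-vowel diacritics."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))

# Three distinct vocalized words built on the same root letters k-t-b:
kataba = "كَتَبَ"  # "he wrote"
kutub = "كُتُب"   # "books"
kutiba = "كُتِبَ"  # "it was written"

# With the diacritics removed, all three become the identical string
# that actually appears in most digital Arabic text:
print(strip_diacritics(kataba))  # كتب
print(strip_diacritics(kataba) == strip_diacritics(kutub) == strip_diacritics(kutiba))  # True
```

A translation system usually only ever sees the bare string “كتب”, so it has to infer from surrounding words which of these readings is intended before it can even begin to translate.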
All Arabic-speaking participants felt that Google Translate was of very poor quality for Arabic beyond words and phrases, and the one non-Egyptian living in Egypt said it was particularly problematic across different Arabic dialects, sometimes offering words in non-Egyptian dialects and confusing people in Egypt. Participants in Egypt mentioned that students sometimes use it, with very poor results, such as “sophisticated” words used incorrectly in an essay (AUC3). One of them (AUC2) said she tried it once for work, thinking she could do a first-round translation and then edit, but it turned out to need a lot more work than expected. Another participant (AUC1) mentioned how Arabic words can be spelled the same while their pronunciation (because of unwritten diacritics) and meaning cannot be understood without context, and the AI would not necessarily understand the context. Most participants said they mainly used it for translating words or phrases but rarely entire paragraphs or articles; they felt it missed nuances and contexts.
SAU3 felt auto-translation was a potentially “huge asset” with “massive advantage to education in future” but that it would take a while to become accurate. The main problem is that when you are translating into a language you do not know, you have no “quality control”, unlike transcription, where you can tell whether the output is right. At the moment, the quality of existing tools requires human moderation, but if the quality improves, it would be beneficial in diverse classrooms to empower students to “read or listen – to choose to read or listen in a language they are more comfortable in”.
SAU2 had limited experience with automated translation, but she felt the “possibility is wonderful” to be able to read academic texts written in different languages. Still, she could see a challenge in the nuances, in how “heritage and culture get lost in translation”, and in how language and culture can be commodified and appropriated with such tools.
That’s it for now. What are the latest developments in these tools? Are they getting better? Can they solve some huge problems?
Photo by Jason Rosewell on Unsplash. I loved this image of the boy who looks like he’s screaming to be heard, because I think humans need to recognize other humans’ voices, to hear them, to truly listen, to respond and reciprocate, maybe more than we want machines to recognize our voices and follow our commands.