Blogging an Unpublished Paper: South African & Egyptian Academic Developers’ Perceptions of AI in Education Part 8: Attitudes Towards Voice/Speech Recognition & Automated Translation

Estimated reading time: 5 minutes, 46 seconds

In case you missed my first post, I am blogging an unpublished paper as a series of posts over several days. You can read that post to understand the story and the reasoning behind this. Comments on each post are welcome. This is the eighth post, and it covers Attitudes Towards Voice/Speech Recognition and Automated Translation, as part of the Findings section. Part 1 covered the abstract/references, Part 2 the intro/lit review, Part 3 the methodology/positionality, Part 4 Findings: General Attitudes Towards AI, Part 5 Attitudes Towards Turnitin, Part 6 Attitudes Towards Teacher Bots, and Part 7 Attitudes Towards Automated Essay Grading.

Findings (continued)

Attitudes Towards Voice/Speech Recognition 

Most participants had some experience with speech recognition, such as Siri, their car’s speech recognition software, or Google Docs, and some had experience with packages trained individually on a person’s voice (such as Dragon). AUC3 thought speech recognition was useful for people “while on the go” or who don’t like to write, but that it often made mistakes and needed a layer of editing afterwards. Interestingly, two participants (AUC1, AUC2) mentioned using speech recognition software to teach students to speak with a clearer/better English accent. AUC1 had already used it this way in class; AUC2 was imagining it as an option. That said, several participants recognized the potential for bias in speech recognition software, for example towards native accents, such that when trying to use it for other purposes, like actually writing text or transcribing recordings, it would not work as well with non-native accents (SAU4, SAU5, AUC1, AUC2, AUC5). In particular, SAU4 and SAU5 had used such software to create text scripts for videos and found it sometimes useful, but less so with technical terms and various Englishes. SAU5 mentioned how even using human transcribers from a different culture had been problematic, because they did not understand the accents and the context of South African material, and a local person was needed to correct some of the mistakes made by the overseas transcribers.

Most participants felt that this software could be most beneficial for people with disabilities. AUC4 specifically talked about how these tools needed extensive training in order to work accurately, and that no one usually took the time to train them – unless it was someone who would use them a lot, such as someone with a disability, in which case they might invest the time. This was perhaps the first instance across all the interviews so far of placing responsibility and agency on the human user to help improve the software, not just fill in the gaps where the software fell short. Perhaps this is because speech recognition software is indeed individually trainable (especially non-generic tools like Dragon, rather than YouTube’s free closed captioning) in ways accessible to the lay user. SAU1 said she had experience with the early versions of such software and that the newer versions are much improved at dealing with non-native accents and jargon. SAU2 and SAU3 seem to have had good experiences using it with participants who have disabilities (e.g. paraplegics), and this probably comes back to what AUC4 was saying about those particular individuals investing time to train the software on their speech. Of course, it is also useful for people who have a hearing disability, or for non-native speakers who may not easily understand native speakers, or vice versa.

Across the board, participants recognized that speech recognition tended to work best with native speaker accents and was more accurate in a more generic context, free of jargon and technical terms. They also tended to recognize that human transcribers could have similar limitations.

Attitudes Towards Automated Translation

Before sharing findings here, it is important to note that Google Translate is not of equal quality across languages. It is much better at translating English to French, German, or Spanish and vice versa, but not very good at translating between Arabic and English. There are many reasons why this might be the case, including simply the amount of data available to train on, the differences in grammar structures across languages, and the deeply important role of context in understanding Arabic writing: the diacritics needed to understand a word (the vowel sounds) are almost never written in digital texts, so the translation software has to rely on context much more than usual to figure out which word it is translating, let alone what it means in that context. This is already a difficult task, even for native speakers of Arabic.
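For readers who don’t know Arabic, here is a tiny sketch (my own illustration, not from the paper) of what this ambiguity looks like in practice: the same undiacritized written form can correspond to several different words, so any translator, human or machine, has to infer the intended reading from context.

```python
# Minimal illustration (my example, not from the study): the undiacritized Arabic
# string "كتب" can be read as several different words, so translation software
# must infer the intended reading from the surrounding context.
readings = {
    "كتب": [
        ("كَتَبَ", "kataba", "he wrote"),
        ("كُتُب", "kutub", "books"),
        ("كُتِبَ", "kutiba", "it was written"),
    ],
}

for surface_form, options in readings.items():
    print(f"Written form: {surface_form}")
    for diacritized, transliteration, english_gloss in options:
        print(f"  {diacritized} ({transliteration}) -> {english_gloss}")
```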

All Arabic-speaking participants felt that Google Translate was of very poor quality for Arabic beyond words and phrases, and the one non-Egyptian living in Egypt said it was particularly problematic across different Arabic dialects, sometimes offering words in non-Egyptian dialects and confusing people in Egypt. Participants in Egypt mentioned that students sometimes use it, with very poor results, such as “sophisticated” words used incorrectly in an essay (AUC3). One of them (AUC2) said she tried it once for work, thinking she could translate a first round and then edit, but it turned out to need a lot more work than expected. Another participant (AUC1) mentioned how Arabic words could be spelled the same, but their pronunciation (because of diacritics) and meaning could not be understood without context, and the AI would not necessarily understand the context. Most participants said they mainly used it for translating words or phrases but rarely entire paragraphs or articles; they felt it missed nuances and context.

SAU3 felt auto-translation was potentially a “huge asset” with “massive advantage to education in future,” but that it would take a while to become accurate. The main problem is that when you are translating into a language you do not know, you have no “quality control,” unlike transcription, where you can tell whether the output is right. At the moment, the quality of existing tools requires human moderation, but if the quality improves, it would be beneficial in diverse classrooms to empower students to “read or listen – to choose to read or listen in a language they are more comfortable in”.

SAU2 had limited experience with automated translation, but she felt the “possibility is wonderful” of being able to read academic texts written in different languages. She could, however, see a challenge in the nuances, how “heritage and culture get lost in translation,” and how language and culture can be commodified and appropriated with such tools.

That’s it for now. What are the latest developments in these tools? Are they getting better? Can they solve some of these huge problems?

Photo by Jason Rosewell on Unsplash. I loved this image of the boy who looks like he’s screaming to be heard, because I think humans need to recognize other humans’ voices, to hear them, to truly listen, to respond and reciprocate, maybe more than we want machines to recognize our voices and follow our commands.

2 thoughts on “Blogging an Unpublished Paper: South African & Egyptian Academic Developers’ Perceptions of AI in Education Part 8: Attitudes Towards Voice/Speech Recognition & Automated Translation”

  1. When I saw the photo above the article, I thought the boy was singing his heart out, using music to express his deepest thoughts. Your idea of humans needing to listen to each other, respond, and reciprocate really resonates with me. I feel that connects well with the idea of the social brain. (For those not familiar with the social brain, Lieberman’s “Social: Why our brains are wired to connect” is a good starting point. This is a useful review of that book: https://www.mindbrained.org/2020/09/social-why-our-brains-are-wired-to-connect/ .) I also recently read your blog post on “A New Approach for Listening” and have been reflecting on the ideas there: https://blog.mahabali.me/pedagogy/critical-pedagogy/a-new-approach-for-listening/ .

    I work as a foreign language instructor in Germany in HE, and Google Translate does a really good job translating German to English and vice versa. Foreign language teachers here are thinking about what kinds of writing tasks to give students if all they need to do is copy and paste their German text into Google Translate. I had a conversation about this earlier this week with colleagues from Japan and Turkey, and we bounced around the idea of giving our students a text from Google Translate and having them work with that text. Some students (and this may be culture-bound) believe that what Google Translate produces is the end product. So, our job as language teachers is to change students’ mindsets and help them adapt the Google text to a specific purpose and/or to different audiences. So, to move beyond the foreign language teaching context, I think all of us can help students think critically about automated translation, check what the software produces, and adjust that product if necessary. It’s my hope that we use automated translation to facilitate communication and understanding.

    1. Thank you so much for sharing that. I think you’re pointing to something so important here…
      That even as AI gets better at the technical aspects of writing and translation, there is still human work to be done in order to actually connect and communicate. This changes the role of language teaching and, as you said, what we actually ask learners to do to demonstrate their learning. I love that you’re being explicit by bringing in Google Translate text and looking at it critically, rather than what most people do, which is to tell students to avoid it altogether! That won’t work, and perhaps isn’t even the best approach or the best use of their time.
