Utilizing Speech-to-Text Technology for Self-Assessment of English Pronunciation among EFL Learners

Rini Puspasari; Maharani Nur Khafifah; Ulil Albab

doi:10.65837/p5pqcm76

Published: Jul 28, 2025

Keywords:

Automatic Speech Recognition (ASR), Speech-to-Text Technology, EFL Pronunciation, Pronunciation Training, Learner Autonomy, Language Learning Technology, Independent Learning

Abstract

Recent developments in Automatic Speech Recognition (ASR) technology have transformed speaking practice for English as a Foreign Language (EFL) students. This systematic literature review of 25 peer-reviewed research articles published between 2020 and 2025 examines the effectiveness of speech-to-text tools in facilitating self-assessment of speaking practice. The studies were selected and analysed using a mixed-methods approach with a primarily qualitative focus, following the PRISMA guidelines. The results reveal three central findings: (1) ASR tools significantly improve students’ speaking accuracy, especially for vowel sounds and the regular past tense suffix; (2) the tools give immediate, visual, and objective feedback that improves learner autonomy and motivation; and (3) although effective, ASR systems still face challenges in recognizing non-standard accents and require optimal audio conditions to perform well. Recent evidence also shows that regular, structured ASR practice supports long-term retention of improvements in speaking skill, though suprasegmental features such as intonation and stress remain difficult to master. The article also offers practical recommendations for integrating the technology into language curricula, as well as suggestions for future research on improving accent recognition and on implementation in real-world contexts. By bridging technological innovation and evidence-based pedagogy, this study provides practical knowledge for teachers, curriculum designers, and researchers who wish to implement ASR as a motivating and sustainable speaking practice tool in EFL contexts, whether fully digital or blended.

Issue

Vol. 1 No. 1 (2025): July: Edulingua: Journal of Language Education

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.
