Discrepancies Between ChatGPT and Vietnamese EFL Teachers in Writing Assessment
DOI:
https://doi.org/10.54855/ijaile.26311
Keywords:
ChatGPT, Vietnamese EFL teacher, writing, discrepancy, rubric
Abstract
Artificial intelligence, particularly ChatGPT, strongly influences education, especially in modern EFL classrooms. However, little is known in Vietnam about how ChatGPT’s writing assessments differ from those of EFL teachers. To address this gap, this quantitative study directly compares ChatGPT’s writing assessments with those of Vietnamese EFL teachers to identify key discrepancies. Twenty experienced teachers working at English centers and universities in Ho Chi Minh City participated. ChatGPT and the human raters assessed twenty students’ essays using an analytic rubric adapted from a standardized rubric, and a paired-sample t-test was then used to compare their scores across the rubric criteria. The findings revealed a statistically significant difference between ChatGPT and Vietnamese EFL teachers in writing evaluations (p = .034), with ChatGPT (M = 16.08, SD = 3.29) assigning a higher total score than the human raters (M = 14.75, SD = 2.96). In particular, ChatGPT tended to give higher scores than the human raters on every criterion (content, organization, vocabulary, and grammar), with the content criterion showing the greatest discrepancy.
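As a rough illustration of the paired-sample t-test named in the abstract, the sketch below computes the test statistic from two sets of scores given to the same essays. It is not the authors' analysis script: the function name `paired_t` and the five pairs of totals are hypothetical and were invented for illustration only, so they do not reproduce the reported means or p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for a paired-sample t-test.

    Each position in scores_a and scores_b must refer to the same essay.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    # Work on per-essay differences, which is what makes the test "paired".
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(n))
    return t, n - 1


# Hypothetical total scores for five essays (NOT data from the study).
chatgpt_totals = [16.0, 18.5, 14.0, 17.0, 15.5]
teacher_totals = [15.0, 16.5, 13.5, 15.0, 15.0]

t_stat, df = paired_t(chatgpt_totals, teacher_totals)
print(f"t({df}) = {t_stat:.3f}")
```

A positive t here means the first rater scored higher on average. To obtain a p-value, the statistic would be compared against a t distribution with n - 1 degrees of freedom, for example with a statistics package.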
License
Copyright (c) 2026 Ho Nhut Nam, Tra Thi Cam Thu, Pham Thi Quyen, Tran Phuong Thanh (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.










