Discrepancies Between ChatGPT and Vietnamese EFL Teachers in Writing Assessment

DOI:

https://doi.org/10.54855/ijaile.26311

Keywords:

ChatGPT, Vietnamese EFL teacher, writing, discrepancy, rubric

Abstract

Artificial intelligence, particularly ChatGPT, strongly influences education, especially in modern EFL classrooms. However, a significant gap still exists in Vietnam regarding how ChatGPT's writing assessments differ from those of EFL teachers. To address this gap, this study directly compares ChatGPT's writing assessments with those of Vietnamese EFL teachers to identify key discrepancies. Twenty experienced teachers working at English centers and universities in Ho Chi Minh City participated in this quantitative study. ChatGPT and the human raters used an analytic rubric adapted from a standardized rubric to assess the written work of twenty students. A paired-sample t-test was then employed to compare the scores of the human raters and ChatGPT across the rubric criteria. The findings revealed a statistically significant difference between ChatGPT and the Vietnamese EFL teachers in writing evaluation (p = .034), with ChatGPT (M = 16.08, SD = 3.29) assigning a higher total score than the human raters (M = 14.75, SD = 2.96). In particular, ChatGPT tended to give higher scores than the human raters across all criteria, including content, organization, vocabulary, and grammar, with the content criterion showing the greatest discrepancy.
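The paired-sample t-test described above can be sketched in a few lines of Python. The scores below are made-up placeholder values, not the study's data; the point is only to show how each essay contributes one (teacher, ChatGPT) pair and how the t statistic is computed from the paired differences.

```python
import math
import statistics

# Hypothetical total scores (max 20) for the same 20 essays,
# one score per essay from each rater group. These values are
# illustrative only, NOT the study's actual data.
teacher = [14, 15, 13, 16, 12, 17, 14, 15, 13, 16,
           15, 14, 12, 18, 13, 16, 15, 14, 17, 16]
chatgpt = [16, 16, 14, 18, 13, 19, 15, 17, 14, 17,
           16, 15, 14, 19, 15, 17, 16, 15, 18, 17]

# Paired design: work with the per-essay score differences.
diffs = [c - t for c, t in zip(chatgpt, teacher)]
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)            # sample SD of the differences
t_stat = mean_d / (sd_d / math.sqrt(n))   # paired-sample t statistic

# Two-tailed critical value for df = 19 at alpha = .05 is about 2.093;
# |t| above this indicates a significant rater difference.
print(f"mean difference = {mean_d:.2f}, t({n - 1}) = {t_stat:.2f}")
```

In practice one would use `scipy.stats.ttest_rel` to obtain the exact p-value, but the hand computation above makes the pairing logic explicit.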

Author Biographies

  • Ho Nhut Nam, Industrial University of Ho Chi Minh City

    Ho Nhut Nam completed a Bachelor's program in English Language at the Industrial University of Ho Chi Minh City, Vietnam, and is pursuing a master's degree at the same institution. He has many years of teaching young learners and adults. His research interests include the integration of AI into language education, second-language acquisition, and interdisciplinary studies of language and culture.

  • Tra Thi Cam Thu, Industrial University of Ho Chi Minh City

    Tra Thi Cam Thu is currently pursuing a master’s degree in English Language at the Industrial University of Ho Chi Minh City. She is an experienced English teacher who has worked with a wide range of students at a public school and at VUS. Her practical classroom experience has shaped her academic focus on developing effective strategies that enhance both student engagement and the language-learning process for EFL learners.

  • Pham Thi Quyen, Industrial University of Ho Chi Minh City

    Pham Thi Quyen is currently pursuing her master’s degree in English language at the Industrial University of Ho Chi Minh City, Vietnam. She is an officer at the Department of International Cooperation at a university in Ho Chi Minh City and has three years of experience in teaching English to young learners. Her interest in doing research is in English teaching and learning methodology.

  • Tran Phuong Thanh, Industrial University of Ho Chi Minh City

    Tran Phuong Thanh is currently pursuing an MA in English Language at the Industrial University of Ho Chi Minh City, Vietnam, building on her Bachelor of Arts in the same field. She has over three years of experience in teaching English communication courses to young learners and adults at private English centers. Her current academic and professional focus is on advancing teaching methodology and exploring the effective integration of AI into educational contexts.

References

Atasoy, A., & Moslemi Nezhad Arani, S. (2025). ChatGPT: A reliable assistant for the evaluation of students’ written texts?. Education and Information Technologies. https://doi.org/10.1007/s10639-025-13553-1

Baker, N. L. (2014). “Get it off my stack”: Teachers’ tools for grading papers. Assessing Writing, 19, 36–50. https://doi.org/10.1016/j.asw.2013.11.005

Berry, V., Sheehan, S., & Munro, S. (2019). What does language assessment literacy mean to teachers?. ELT Journal, 73(2), 113–123. https://doi.org/10.1093/elt/ccy055

Bucol, J. L., & Sangkawong, N. (2024). Exploring ChatGPT as a writing assessment tool. Innovations in Education and Teaching International, 62(6), 1–16. https://doi.org/10.1080/14703297.2024.2363901

Bui, N. M., & Barrot, J. S. (2025). ChatGPT as an automated essay scoring tool in the writing classrooms: How it compares with human scoring. Education and Information Technologies, 30, 2041–2058. https://doi.org/10.1007/s10639-024-12891-w

Geckin, V., Kızıltaş, E., & Çınar, Ç. (2023). Assessing second-language academic writing: AI vs. Human raters. Journal of Educational Technology and Online Learning, 6(4), 1096–1108. https://doi.org/10.31681/jetol.1336599

González, E. F., Trejo, N. P., & Roux, R. (2017). Assessing EFL university students’ writing: A study of score reliability. Revista Electrónica de Investigación Educativa, 19(2), 91–103. https://doi.org/10.24320/redie.2017.19.2.928

Hang, N. T. T. (2021). Vietnamese upper-high school teachers’ views, practices, difficulties, and expectations on teaching EFL writing. Journal on English as a Foreign Language, 11(1), 1–20. https://doi.org/10.23971/jefl.v11i1.2228

Hoang, T. T. H., Dang, H. N., Pham, N. B. Q., & Truong, T. K. X. (2025). EFL teachers’ perceptions of utilizing ChatGPT in designing lesson plans for IELTS reading skills. International Journal of AI in Language Education, 2(3), 19–39. https://doi.org/10.54855/ijaile.25232

Hong, W. C. H. (2023). The impact of ChatGPT on foreign language teaching and learning: Opportunities in education and research. Journal of Educational Technology and Innovation, 3(1), 37–45. https://doi.org/10.61414/jeti.v5i1.103

International English Language Testing System (IELTS). (2023). IELTS Writing Band Descriptors. https://ielts.org/cdn/Guides/ielts-writing-band-descriptors.pdf

Jackaria, P. M., Hajan, B. H., & Mastul, A. R. H. (2024). A comparative analysis of the rating of college students’ essays by ChatGPT versus human raters. International Journal of Learning, Teaching and Educational Research, 23(2), 478–492. https://doi.org/10.26803/ijlter.23.2.23

Kim, H., Baghestani, Sh., Yin, Sh., Karatay, Y., Kurt, S., Beck, J., & Karatay, L. (2024). ChatGPT for writing evaluation: Examining the accuracy and reliability of AI-generated scores compared to human raters. Exploring artificial intelligence in applied linguistics (pp. 73–95). Iowa State University Digital Press. https://doi.org/10.31274/isudp.2024.154.06

Koraishi, O. (2024). The intersection of AI and language assessment: A study on the reliability of ChatGPT in grading IELTS writing task 2. Language Teaching Research Quarterly, 43, 22–42. https://doi.org/10.32038/ltrq.2024.43.02

Li, J., Huang, J., Wu, W., & Whipple, P. B. (2024). Evaluating the role of ChatGPT in enhancing EFL writing assessments in classroom settings: A preliminary investigation. Humanities and Social Sciences Communications, 11(1), 1–9. https://doi.org/10.1057/s41599-024-03755-2

Ludwig, S., Mayer, C., Hansen, C. L., Eilers, K., & Brandt, S. (2021). Automated essay scoring using transformer models. Psych, 3(4), 897–915. https://doi.org/10.3390/psych3040056

Nguyen, T. T. H. (2023). EFL Teachers’ Perspectives toward the Use of ChatGPT in Writing Classes: A Case Study at Van Lang University. International Journal of Language Instruction, 2(3), 1–47. https://doi.org/10.54855/ijli.23231

Nguyen, T. H. B., & Tran, T. D. H. (2023). Exploring the Efficacy of ChatGPT in Language Teaching. AsiaCALL Online Journal, 14(2), 156–167. https://doi.org/10.54855/acoj.2314210

Palmer, J., Williams, R. E., & Dreher, H. (2002). Automated essay grading system applied to a first-year university subject - How can we do it better?. Proceedings of IS2002 Informing Science and IT Education Conference (pp. 1221–1229). Informing Science Institute. https://doi.org/10.28945/2553

Pham, M. T., & Cao, T. X. T. (2025). The Practice of ChatGPT in English Teaching and Learning in Vietnam: A Systematic Review. International Journal of TESOL & Education, 5(1), 50–70. https://doi.org/10.54855/ijte.25513

Poláková, P., Ivenz, P., & Klímová, B. (2024). Examining the reliability of ChatGPT as an assessment tool compared to human evaluators. Procedia Computer Science, 246, 2332–2341. https://doi.org/10.1016/j.procs.2024.09.543

Rudolph, J., Tan, S., & Tan, S. (2023). ChatGPT: Bullshit spewer or the end of traditional assessments in higher education?. Journal of Applied Learning and Teaching, 6(1), 342–363. https://doi.org/10.37074/jalt.2023.6.1.9

Steiss, J., Tate, T., Graham, S., Cruz, J., Hebert, M., Wang, J., Moon, Y., Tseng, W., Warschauer, M., & Olson, C. B. (2024). Comparing the quality of human and ChatGPT feedback of students’ writing. Learning and Instruction, 91, 101894. https://doi.org/10.1016/j.learninstruc.2024.101894

Topuz, A. C., Yıldız, M., Taşlıbeyaz, E., Polat, H., & Kurşun, E. (2025). Is generative AI ready to replace human raters in scoring EFL writing? Comparison of human and automated essay evaluation. Educational Technology & Society, 28(3), 36–50. https://doi.org/10.30191/ETS.202507_28(3).SP04

Uyar, A. C., & Büyükahıska, D. (2025). Artificial intelligence as an automated essay scoring tool: A focus on ChatGPT. International Journal of Assessment Tools in Education, 12(1), 20–32. https://doi.org/10.21449/ijate.1517994

Vo, L., & Huynh, N. (2025). Vietnamese EFL Teachers’ Perspectives on ChatGPT: A Conceptual Metaphor Analysis. Arab World English Journal, 16(1), 162–178. https://doi.org/10.24093/awej/vol16no1.10

Yang, Y. (2024). The Reliability of Using ChatGPT in Rating EFL Writings. Shanlax International Journal of Education, 12(4), 49–59. https://doi.org/10.34293/education.v12i4.7855

Yoon, S. Y., Miszoglad, E., & Pierce, L. R. (2023). Evaluation of ChatGPT Feedback on ELL Writers’ Coherence and Cohesion. arXiv. https://doi.org/10.48550/arXiv.2310.06505

Published

04/29/2026

Issue

Section

Research Articles

How to Cite

Ho, N. N., Tra, T. C. T., Pham, T. Q., & Tran, P. T. (2026). Discrepancies between ChatGPT and Vietnamese EFL teachers in writing assessment. International Journal of AI in Language Education, 3(1), 1–16. https://doi.org/10.54855/ijaile.26311
