DebugProGrade: Improving Automated Assessment of Coding Assignments with a Focus on Debugging
Abstract
Evaluating programming assignments in education is challenging, especially where debugging is concerned. Automated grading technologies fail to capture students' depth of understanding and their context-dependent responses. To address these limitations, we developed DebugProGrade, which augments conventional grading with semantic analysis and keyword extraction. We collected responses from 1000 first-year BCA students who, through Google Forms, answered error-detection and solution-proposal questions on a basic C programming assignment. To judge the specificity of explanations and to evaluate them in context, the system uses SBERT (Sentence-BERT) embeddings, a sentence-transformer extension of Bidirectional Encoder Representations from Transformers. The methods are applied with tuned parameters, and their evaluations are measured against academic criteria. A further key feature of DebugProGrade is the classification of debugging skill into competence levels, giving a more comprehensive view of student proficiency in the abilities that traditional grading systems leave unaddressed: identifying and fixing bugs. After tuning, the Gradient Boosting Regressor produces outstanding results in evaluating and predicting student scores: the mean squared error is very low (MSE = 0.025107), the mean absolute error is similarly low (MAE = 0.031335), and the high R² score of 0.99932 shows that the target variable is predicted with high accuracy on the given dataset. DebugProGrade thus reshapes the paradigm of conventional grading and gives a much clearer picture of exactly where students' strengths lie.
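The abstract describes the SBERT-based similarity step without implementation detail. The following is a minimal sketch, assuming the sentence-transformers Python library and the all-MiniLM-L6-v2 checkpoint (both assumptions; the abstract names neither), of how an SBERT embedding can score a student's debugging explanation against a reference answer:

```python
# Minimal sketch (not the authors' code): score semantic similarity between
# a reference explanation and a student's answer with SBERT embeddings.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice

reference = "The loop condition uses <= instead of <, causing an off-by-one array access."
student = "The for loop goes one step too far because of <=, so it reads past the array."

# Encode both texts into dense sentence embeddings.
emb_ref, emb_student = model.encode([reference, student])

# Cosine similarity in [-1, 1]; higher means the explanation is closer in meaning.
similarity = util.cos_sim(emb_ref, emb_student).item()
print(f"semantic similarity: {similarity:.3f}")
```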
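Likewise, a minimal sketch of the scoring model, assuming scikit-learn and synthetic stand-in features (the actual feature set, e.g. similarity and keyword-coverage scores, is not specified in the abstract), showing how the quoted MSE, MAE, and R² metrics would be computed:

```python
# Minimal sketch: fit a Gradient Boosting Regressor on stand-in features and
# report the same metrics quoted in the abstract (MSE, MAE, R^2).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((1000, 4))                 # e.g. similarity, keyword coverage, ...
y = X @ np.array([0.4, 0.3, 0.2, 0.1])    # synthetic target score

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Hyperparameters here are illustrative, not the paper's tuned values.
model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
print("MSE:", mean_squared_error(y_te, pred))
print("MAE:", mean_absolute_error(y_te, pred))
print("R2 :", r2_score(y_te, pred))
```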
Article Details

All articles published in JIWE, including this work, are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) License. Readers are allowed to
- Share — copy and redistribute the material in any medium or format, under the following conditions:
  - Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use;
  - NonCommercial — You may not use the material for commercial purposes;
  - NoDerivatives — If you remix, transform, or build upon the material, you may not distribute the modified material.