Using fine-tuned GPT-3 models to evaluate difficulty levels for auto-generated computer science exercises

This is a Bachelor's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: George Dawod Bassilious; [2023]

Abstract: Technological advancements in large language models, such as OpenAI’s GPT-3 and the subsequent release of GPT-3.5, have opened up new paths for research. While the potential applications of these models in education are yet to be fully explored, there is an opportunity to streamline educators’ workload and enhance students’ learning experiences. This study investigates the use of large language models for auto-generating programming exercises, which can save time for educators and dynamically adapt questions to individual students’ needs. By assessing the difficulty levels of exercises, this study addresses the challenge of accommodating varied learning rates and ensuring the correctness of the generated questions. The research methodology involves fine-tuning GPT-3 models using LeetCode’s programming exercise dataset and classifying exercises generated by GPT-3.5 using these fine-tuned models. The difficulty levels considered for classification are easy, medium, and hard. The findings indicate that, except for one fine-tuned model, the models tended to underestimate the difficulty of exercises. Furthermore, the GPT-3.5 base model showed higher accuracy in evaluating difficulty than the fine-tuned models. Although the inclusion of real-life data during the fine-tuning process influenced the models’ performance, it proved insufficient on its own. Achieving optimal accuracy requires balanced and high-quality training data. Future work could explore alternative data sources, involve input from educators and students, and refine the fine-tuning process. Additional research is necessary to improve the models’ accuracy in evaluating exercise difficulty. Despite the current limitations, these models have the potential to revolutionize the education sector by providing more efficient methods for exercise generation and assessment.
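
The abstract reports two kinds of results: how often a model's difficulty label matched the reference label, and whether its errors leaned toward underestimation. The following is a minimal sketch of how such a comparison could be computed. It is not the thesis's evaluation code; the function name `evaluate`, the integer ordering of the three levels, and the sample labels are all assumptions made for illustration.

```python
# Illustrative sketch only: the thesis does not publish its evaluation code here.
# Assumption: the three difficulty levels are treated as an ordered scale.
LEVELS = {"easy": 0, "medium": 1, "hard": 2}


def evaluate(reference, predicted):
    """Compare predicted difficulty labels against reference labels.

    Returns the share of exact matches, the shares of under- and
    overestimated exercises, and the mean signed error in levels
    (negative means the model tends to rate exercises as easier).
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("label lists must be non-empty and of equal length")

    # Signed difference per exercise: predicted level minus reference level.
    diffs = [LEVELS[p] - LEVELS[r] for r, p in zip(reference, predicted)]
    n = len(diffs)
    return {
        "accuracy": sum(d == 0 for d in diffs) / n,
        "underestimated": sum(d < 0 for d in diffs) / n,
        "overestimated": sum(d > 0 for d in diffs) / n,
        "mean_signed_error": sum(diffs) / n,
    }


# Hypothetical labels, invented for this example (not data from the thesis).
reference = ["easy", "medium", "hard", "hard", "medium"]
predicted = ["easy", "easy", "medium", "hard", "medium"]

report = evaluate(reference, predicted)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

A negative `mean_signed_error` together with a larger `underestimated` than `overestimated` share would correspond to the underestimation tendency described in the abstract.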
