Evaluating Questions About Learners’ Code Generated with OpenAI’s GPT-4

This is a Bachelor’s thesis from KTH, School of Electrical Engineering and Computer Science (EECS)

Authors: Emil Pernler; Leo Vainio [2023]

Abstract: OpenAI’s ChatGPT has sparked a wave of interest worldwide since its public release in November 2022. While concerns have been raised about students using the tool to cheat, researchers have recently started exploring how such models can be integrated into education to benefit learning. This report explores the potential of using GPT-4 to generate multiple-choice questions about student code, assessing the generated questions with qualitative metrics and numerical analysis to evaluate the extent to which they are correct, relevant, and reasonable to ask a student. Overall, the results show good potential in using GPT-4 to generate questions about student code: the questions were free of logical errors, used programming concepts correctly, and analyzed the program code correctly. However, the results also indicate that a significant proportion of the generated questions were deemed not sensible to ask a student, often because they were too easy and did not assess the students’ comprehension of the program in any meaningful way. Furthermore, we found that it was easy to steer which aspects of the program the generated questions assessed by providing a set of keywords in the prompt.
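
To make the keyword-steering idea concrete, the sketch below shows one plausible way to request such a question through the OpenAI Python chat API. It is an illustrative assumption, not the prompt used in the thesis: the prompt wording, the example student code, and the keyword list are all hypothetical.

    # Minimal sketch (assumed, not the thesis's actual prompt): ask GPT-4 for a
    # multiple-choice question about student code, with keywords steering which
    # aspects of the program the question assesses.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical student submission to ask about.
    student_code = """
    def mean(values):
        total = 0
        for v in values:
            total += v
        return total / len(values)
    """

    # Illustrative keywords; changing these shifts the question's focus.
    keywords = ["loops", "division by zero"]

    prompt = (
        "Generate one multiple-choice question with four options, exactly one "
        "of them correct, that tests a student's comprehension of the "
        f"following Python code. Focus on: {', '.join(keywords)}.\n\n"
        f"{student_code}"
    )

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)

Under this setup, swapping the keyword list (for example, from "loops" to "return values") would be the mechanism for influencing which aspects of the program the generated question targets.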
