Evaluating ChatGPT’s Ability to Compose Music Using the MIDI File Format

This is a Bachelor's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Marcus Warnerfjord; [2023]

Abstract: This thesis examines the ability of the artificial intelligence (AI) model ChatGPT 3.5-turbo to compose valuable music in a digital format (MIDI) from natural language prompts. Using a multifaceted quantitative approach, the study combines objective musical metrics with subjective user evaluations. The proof-of-concept system developed for this research used ChatGPT to generate MIDI files, which were compared against human-composed music from the Lakh MIDI dataset. The objective measures (pitch-class distributions, inter-onset interval (IOI), pitch range, average pitch interval, pitch count, note length, and transition matrices) enabled a comprehensive comparison. The findings showed that although the AI model’s output demonstrated stylistic consistency and a certain level of musical texture, it was less complex and less varied than human compositions. Subjective evaluations, gathered through a feedback survey, showed moderate to low satisfaction with the AI-generated music. Users with more musical experience were less satisfied with the compositions, which suggests a correlation between musical experience and perception of the AI-generated music. Despite its limitations, ChatGPT is capable of generating valuable music from natural language prompts. However, improvements are needed to better mimic the complexity and variance of human compositions before it can be applied in music production.
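As an illustration of the objective measures named in the abstract, the following minimal Python sketch computes several of them from a list of notes. It is not the thesis's implementation: the note representation (onset, MIDI pitch, duration), the function names, and the sample melody are all assumptions made for this example, and the thesis may define or normalize these metrics differently.

```python
from collections import Counter

# Hypothetical note representation (an assumption for this sketch):
# (onset_in_beats, midi_pitch, duration_in_beats)
notes = [(0.0, 60, 1.0), (1.0, 62, 0.5), (1.5, 64, 0.5), (2.0, 60, 2.0)]


def pitch_class_histogram(notes):
    """Relative frequency of each of the 12 pitch classes (C = 0)."""
    counts = Counter(pitch % 12 for _, pitch, _ in notes)
    total = len(notes)
    return [counts.get(pc, 0) / total for pc in range(12)] if total else [0.0] * 12


def inter_onset_intervals(notes):
    """Time differences between consecutive note onsets."""
    onsets = sorted(onset for onset, _, _ in notes)
    return [b - a for a, b in zip(onsets, onsets[1:])]


def pitch_range(notes):
    """Distance in semitones between the highest and lowest pitch."""
    pitches = [pitch for _, pitch, _ in notes]
    return max(pitches) - min(pitches) if pitches else 0


def average_pitch_interval(notes):
    """Mean absolute semitone step between consecutive notes (by onset)."""
    pitches = [pitch for _, pitch, _ in sorted(notes)]
    steps = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    return sum(steps) / len(steps) if steps else 0.0


def pitch_class_transition_matrix(notes):
    """12x12 count matrix: entry [a][b] counts moves from pitch class a to b."""
    pcs = [pitch % 12 for _, pitch, _ in sorted(notes)]
    matrix = [[0] * 12 for _ in range(12)]
    for a, b in zip(pcs, pcs[1:]):
        matrix[a][b] += 1
    return matrix
```

Computing the same measures for a generated piece and for a human-composed reference would allow a side-by-side comparison of the kind the abstract describes; reading notes from actual MIDI files would require a MIDI parsing library, which this sketch deliberately leaves out.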
