Abstract:
Automating melody generation from lyrics has long been a challenging research task in artificial intelligence. Recently, however, music-related datasets have become available at large scale, and advances in deep learning have made it possible to explore this task more thoroughly. In particular, Generative Adversarial Networks (GANs) have shown great potential in generation tasks involving continuous-valued data such as images. In this work, by contrast, we explore Conditional Generative Adversarial Networks (CGANs) for discrete-valued sequence generation; specifically, we exploit the Gumbel-Softmax relaxation technique to train GANs on discrete sequences. We propose a novel architecture, the Three Branch Conditional (TBC) LSTM-GAN, for melody generation from lyrics. Through extensive experimentation, we show that our proposed model outperforms the baseline models, generating tuneful and plausible melodies from the given lyrics.
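To make the Gumbel-Softmax relaxation mentioned above concrete, the following is a minimal, standard-library sketch of how a relaxed one-hot sample can be drawn from a vector of logits. It is an illustration of the general technique only, not the TBC-LSTM-GAN implementation: the function name `gumbel_softmax_sample`, the example logits, and the temperature values are all assumptions introduced here.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over len(logits) categories.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logs finite
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 0.5, -1.0]  # assumed scores for three discrete note values
    for tau in (5.0, 1.0, 0.1):
        sample = gumbel_softmax_sample(logits, temperature=tau, rng=rng)
        print(tau, [round(p, 3) for p in sample])
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures smooth it toward uniform. Because the output is a continuous function of the logits, gradients from a discriminator can in principle flow back to a generator even though the underlying music attributes are discrete.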