Music Genre Classification using Transfer Learning on log-based MEL Spectrogram
Published in IEEE, 2021
Abstract:
Deep Learning, a branch of Machine Learning, is a rapidly expanding field in the Industry 4.0 revolution. The number of applications of Deep Learning is enormous, with multiple uses often found within a single domain. Deep Learning enhances current research and provides better perception across a wide spectrum of domains, including Music Information Retrieval. Music is an art that is celebrated worldwide, with countless songs released or published every day. Music can be classified into many "genres". A statistical study conducted in 2018 shows that there are more than 10 genres of music, with "Hip-Hop/Rap" being the genre with the most music composed. When listeners decide to listen to music, they usually have a particular genre in mind and expect their music application to provide numerous songs in that genre. A good music application would not only provide many songs, but also make it easier to reach a particular song or genre. Due to the tremendous increase in the amount of digital data available on the internet, accurately classifying music files has become a major problem for music applications like YouTube Music, iTunes or Spotify, and it is a problem that Deep Learning techniques can help solve. This paper presents an in-depth comparison of four transfer learning architectures, namely ResNet34, ResNet50, VGG16 and AlexNet, which were the best-performing models on ImageNet at different times, for the task of classifying songs on the basis of their genre.
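As a rough illustration of the log-based mel spectrogram named in the title, the sketch below turns a raw audio signal into a small grid of decibel-scaled mel-band energies, which is the kind of image-like input a pretrained network can consume. It is a minimal pure-Python sketch, not the paper's pipeline: the sample rate, FFT size, hop length, number of mel bands, the HTK-style mel formula and all function names are assumptions chosen for illustration, since the abstract does not state them. In practice an audio library such as librosa would normally be used instead of a hand-written DFT.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style mel scale (assumed here; other variants exist)
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies: lower edge, band centres, upper edge
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power of the one-sided DFT of a Hann-windowed frame (naive O(n^2) DFT)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each holding n_mels log-scaled (dB) energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        mel_energy = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Log compression; the floor avoids log10(0) on silent bands
        frames.append([10.0 * math.log10(max(e, 1e-10)) for e in mel_energy])
    return frames


# Illustrative input: a synthetic 1 kHz tone sampled at 8 kHz (not real music)
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / sample_rate) for t in range(512)]
spec = log_mel_spectrogram(tone, sample_rate)
print(len(spec), "frames x", len(spec[0]), "mel bands")
```

Each frame becomes one column of the spectrogram image. In a transfer learning setting, such an image would typically be resized and replicated across three channels to match the input expected by an ImageNet-pretrained network, whose final classification layer is then replaced by one with as many outputs as there are genres.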