Modeling temporal tonal relations in polyphonic music through deep networks with a novel image-based representation

Ching Hua Chuan, Dorien Herremans

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

We propose an end-to-end approach for modeling polyphonic music in a deep neural network with a novel graphical representation based on music theory. Despite the success of deep learning in various applications, it remains a challenge to incorporate existing domain knowledge into a network without affecting its training routines. In this paper we present a novel approach for predictive music modeling and music generation that incorporates domain knowledge in its representation: music is transformed into a 2D representation, inspired by the Tonnetz from music theory, which graphically encodes musical relationships between pitches. This representation is incorporated in a deep network structure consisting of multilayered convolutional neural networks (CNN, for learning an efficient abstract encoding of the representation) and recurrent neural networks with long short-term memory cells (LSTM, for capturing temporal dependencies in music sequences). We empirically evaluate the nature and the effectiveness of the network using a dataset of classical music from various composers. We investigate the effect of parameters including the number of convolution feature maps, pooling strategies, and three network configurations: LSTM without a CNN, and LSTM with a CNN that is either pre-trained or not pre-trained. Visualizations of the feature maps and filters in the CNN are explored, and a comparison is made between the proposed Tonnetz-inspired representation and pianoroll, a commonly used representation of music in computational systems. Experimental results show that the Tonnetz representation produces musical sequences that are more tonally stable and contain more repeated patterns than sequences generated by pianoroll-based models, a finding that is directly useful for tackling current challenges in music and AI such as smart music generation.
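To make the Tonnetz-inspired 2D encoding concrete, here is a minimal sketch in Python. The grid layout (moving right = up a perfect fifth, +7 semitones; moving up = up a major third, +4 semitones), the grid dimensions, and the function names are illustrative assumptions, not the paper's exact representation:

```python
# Sketch of a Tonnetz-style 2D pitch-class grid (assumed layout:
# right neighbor = perfect fifth, upper neighbor = major third;
# the paper's actual grid may differ).

def tonnetz_grid(rows, cols):
    """Return a rows x cols grid of pitch classes (0 = C, ..., 11 = B)."""
    return [[(7 * c + 4 * r) % 12 for c in range(cols)] for r in range(rows)]

def encode_chord(pitch_classes, rows, cols):
    """Binary 2D 'image' of a sonority: 1 where the cell's pitch class sounds."""
    grid = tonnetz_grid(rows, cols)
    return [[1 if grid[r][c] in pitch_classes else 0 for c in range(cols)]
            for r in range(rows)]

grid = tonnetz_grid(3, 4)
# Because neighboring cells are consonant intervals apart, a triad such as
# C major {0, 4, 7} lights up a compact cluster rather than scattered pixels.
c_major = encode_chord({0, 4, 7}, 3, 4)
```

A sequence of such 2D frames, one per time step, could then be fed to a CNN for spatial features and an LSTM for temporal dependencies, in the spirit of the architecture described in the abstract.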

Original language: English (US)
Title of host publication: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Publisher: AAAI Press
Pages: 2159-2166
Number of pages: 8
ISBN (Electronic): 9781577358008
State: Published - Jan 1 2018
Event: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 - New Orleans, United States
Duration: Feb 2 2018 - Feb 7 2018

Publication series

Name: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018

Other

Other: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Country: United States
City: New Orleans
Period: 2/2/18 - 2/7/18

ASJC Scopus subject areas

  • Artificial Intelligence


Cite this

Chuan, C. H., & Herremans, D. (2018). Modeling temporal tonal relations in polyphonic music through deep networks with a novel image-based representation. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 2159-2166). (32nd AAAI Conference on Artificial Intelligence, AAAI 2018). AAAI Press.