Computer Music Journal, 35:4, pp. 64-82, Winter 2011. © 2011 Massachusetts Institute of Technology.
In this article we present a hybrid approach to the design of an automatic, style-specific accompaniment system that combines statistical learning with a music-theoretic framework, and we propose quantitative methods for evaluating the results of machine-generated accompaniment. The system is capable of learning accompaniment style from sparse input information, and of capturing style over a variety of musical genres. Generating accompaniments involves several tasks, including choosing chords, determining the bass line, arranging chords for voicing, and instrumentation. In this article we focus on harmonization: selecting chords for melodies, with an emphasis on style. Given exemplar songs as MIDI melodies with corresponding chords labeled as text, the system uses decision trees to learn the melody-chord relations shared among the songs. Markov chains on the neo-Riemannian framework are applied to model the likelihood of chord patterns. Harmonization is generated in a divide-and-conquer manner: melody fragments that strongly imply certain triads are designated as checkpoints, which are in turn connected by chord progressions generated using the Markov model. Chord subsequences are then refined and combined to form the final sequence. We propose two types of measures to quantify the degree to which a machine-generated accompaniment achieves its style emulation goal: one based on chord distance, and the other based on the statistical metrics entropy and perplexity. Using these measures, we conduct two sets of experiments using Western popular songs. Two albums by each of three artists (Green Day, Keane, and Radiohead), for a total of six albums, are used to evaluate the proposed system.
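The idea of connecting checkpoints with Markov-generated progressions can be illustrated with a minimal sketch. It is not the system described here: it assumes a first-order Markov chain over plain chord labels rather than over the neo-Riemannian framework, and the function names (`fit_transitions`, `connect_checkpoints`), the add-k smoothing, and the toy training progressions are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
import math


def fit_transitions(progressions, smoothing=1.0):
    """Estimate first-order chord transition probabilities.

    Assumption: add-k smoothing over the observed chord vocabulary,
    so every transition has a non-zero probability.
    """
    vocab = sorted({chord for prog in progressions for chord in prog})
    counts = defaultdict(Counter)
    for prog in progressions:
        for prev, nxt in zip(prog, prog[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev in vocab:
        total = sum(counts[prev].values()) + smoothing * len(vocab)
        probs[prev] = {
            nxt: (counts[prev][nxt] + smoothing) / total for nxt in vocab
        }
    return probs


def connect_checkpoints(probs, start, end, n_between):
    """Return the most likely chord path from `start` to `end`.

    `n_between` chords are placed between the two checkpoint chords.
    Dynamic programming keeps, for each chord, the best path ending there.
    """
    vocab = list(probs)
    best = {start: (0.0, [start])}  # chord -> (log-probability, path)
    for _ in range(n_between):
        step = {}
        for prev, (logp, path) in best.items():
            for nxt in vocab:
                cand = logp + math.log(probs[prev][nxt])
                if nxt not in step or cand > step[nxt][0]:
                    step[nxt] = (cand, path + [nxt])
        best = step
    # Close every candidate path on the end checkpoint and keep the best.
    score, path = max(
        (
            (logp + math.log(probs[prev][end]), path + [end])
            for prev, (logp, path) in best.items()
        ),
        key=lambda item: item[0],
    )
    return path, score


# Toy exemplar progressions (made up for illustration only).
training = [["C", "G", "Am", "F"]] * 3 + [["C", "F", "G", "C"]]
probs = fit_transitions(training)
path, score = connect_checkpoints(probs, "C", "F", n_between=2)
```

Fixing the two endpoints and searching only over the chords between them mirrors the divide-and-conquer structure: each pair of adjacent checkpoints can be solved independently and the resulting subsequences concatenated.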
The first set of experiments consists of inter-system tests that compare the output of the proposed system with that of the Temperley-Sleator (T-S) Harmonic Analyzer, a rule-based system, and that of a naïve harmonization system. The results show that the hybrid system produces harmonizations that have more chords identical to those in the original song, and that are more "consistent" with the original. Consistent harmonizations contain more chord phrases (a chord phrase being a sequence of chords that harmonizes a vocal melody phrase) that are frequently observed in the original, as indicated by lower entropy and perplexity values. The second set of experiments consists of intra-system comparisons that test the effect of training data (one song versus many) on style emulation effectiveness, and tests that investigate the performance of each system component separately. The results show that the system trained on a single song generates, on average, more chords that are identical to the original than systems trained on all songs in the album. The results also demonstrate the robustness of the music-theoretic framework against erroneous information. Although the system is evaluated using popular music, it can be generalized to produce style-specific harmonization for other music genres where Western music theory is applied.
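As a rough illustration of how entropy and perplexity can score consistency, the sketch below treats each chord phrase as a single symbol. The exact definitions used in the article are not given in this abstract, so this is only one plausible reading: it assumes an add-k smoothed phrase distribution estimated from the original song, and the function names and toy phrases are hypothetical.

```python
from collections import Counter
import math


def entropy(phrases):
    """Shannon entropy (in bits) of the empirical chord-phrase distribution."""
    counts = Counter(phrases)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def cross_entropy(reference, generated, smoothing=1.0):
    """Average negative log2-probability of generated phrases under a
    smoothed phrase distribution estimated from the reference song.

    Assumption: add-k smoothing over the union of both phrase sets, so
    phrases never seen in the reference still get a small probability.
    """
    counts = Counter(reference)
    vocab = set(reference) | set(generated)
    total = len(reference) + smoothing * len(vocab)
    return -sum(
        math.log2((counts[p] + smoothing) / total) for p in generated
    ) / len(generated)


def perplexity(bits):
    """Perplexity corresponding to an entropy expressed in bits."""
    return 2.0 ** bits


# Toy chord phrases (tuples of chord labels), made up for illustration.
original = [("C", "G")] * 3 + [("Am", "F")]
consistent = [("C", "G"), ("C", "G")]
inconsistent = [("Dm", "E"), ("Dm", "E")]

h_consistent = cross_entropy(original, consistent)
h_inconsistent = cross_entropy(original, inconsistent)
```

Under this reading, a harmonization that reuses phrases common in the original yields a lower cross-entropy, and therefore a lower perplexity, than one built from phrases the original never uses.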
ASJC Scopus subject areas
- Media Technology
- Computer Science Applications