November 11, 2009: Matt Hoffman

Princeton University

Probabilistic Graphical Models for the Analysis (and Synthesis) of Musical Audio

I will present applications of new and existing generative probabilistic models to several problems related to the extraction of musically meaningful information from audio: timbral similarity estimation, semantic annotation and retrieval, and latent source discovery and separation. I will conclude by demonstrating an example of how these models can also be repurposed to generate novel musical audio.

To estimate how similar two songs sound, we employ a Hierarchical Dirichlet Process (HDP) mixture model to discover a shared representation of the distribution of timbres in each song. Comparing songs under this shared representation yields better query-by-example retrieval quality and scalability than previous approaches.

To predict which tags are likely to apply to a song (e.g., “rap,” “happy,” or “driving music”), we develop the Codeword Bernoulli Average (CBA) model, a simple and fast mixture-of-experts model. Despite its simplicity, CBA performs at least as well as state-of-the-art approaches at automatically annotating songs and at finding the songs in a database to which a given tag most applies.

Finally, we extend the HDP to discover the latent sonic sources (e.g., bass drums or guitar chords) that are present in sets of songs and to allow the isolation or suppression of individual sources. The ability of our Shift-Invariant HDP (SIHDP) to decide how many latent sources are necessary to model the data is particularly valuable in this application, since it is impossible to guess a priori how many sounds will appear in a given song or set of songs.

We can also adapt the SIHDP model to create new versions of input audio from arbitrary sample sets. For example, we can combine spoken text into a sound file that matches a song as closely as possible, producing a sort of “laptop a cappella” arrangement.
