Talk titles and abstracts


The neural representation of pitch in primate auditory cortex

A fundamental component of both music and speech is pitch, our perception of how high or low a sound is on a musical scale. Theoretically, pitch can be computed using either spectral or temporal information in the acoustic signal. We have investigated what type of information is used for pitch extraction by the common marmoset (Callithrix jacchus), by measuring pitch discrimination behaviorally and examining pitch-selective single-unit responses in auditory cortex. Our data support a hybrid model of pitch processing whereby both spectral and temporal information contribute to the encoding of pitch.

Jennifer BIZLEY

Neural correlates of pitch and timbre perception in ferret auditory cortex

We can describe a sound in a number of different ways; sounds originate from locations in space, and their source is often readily identifiable. In music the perceptual qualities of pitch and timbre are key; any periodic sound will elicit a pitch percept, yet when different musical instruments play the same note we are able to identify the sounds as characteristically different from one another due to differences in their timbre. Differences in sound timbre also underlie our ability to discriminate vowel sounds, such as an /i/ from a /u/, spoken at the same voice pitch. We investigated whether the independence of the perceptual qualities of pitch, timbre and spatial location was reflected in the tuning properties of neurons in auditory cortex. Our results suggested that neurons in auditory cortex were sensitive to multiple sound features, but while they did not represent pitch, timbre or location independently of the other two features, it was possible to extract different features unambiguously from a single neuron by considering its spiking response at different time points. In order to better understand the relationship between pitch perception and neural activity, we trained ferrets in a pitch discrimination task and then recorded from auditory cortex whilst they performed it. We examined the local field potential activity and observed activity that was informative about the pitch of the sounds that the animal was discriminating. Moreover, the neural response was also informative about the decision (i.e. was it “higher” or “lower” in pitch) that the animal made. This suggests that auditory cortical neurons are not simply static filters, tuned to detect a particular acoustic feature, but rather activity in auditory cortex may be intimately related to our perception of sounds and may even reflect the decisions we make based on those sounds.


Speaking the Emotional Language of Music

A day in the life of a film composer, highlighting the techniques, anecdotes, and experiences of creating emotions through music. This talk will explore both specific and abstract ways in which music can bring meaning and magic to words and images, and serve as an underscore to our daily lives.


Musical Surprisal Modeling with Audio Oracle

Audio Oracle (AO) represents musical structure in terms of a graph that links similar segments of variable length found in an audio recording. This representation has been used successfully for improvisation by creating smooth recombinations of recorded segments that share a common history. Moreover, the AO graph can be used to analyze the dynamics of music complexity by characterizing every moment in a piece in terms of its relation to the past, the recall distance and length of shared context. Surprisal occurs at moments that display a sudden drop in the size of common history relative to some generic complexity of the music when history is not taken into account. In the lecture I will demonstrate the use of AO on live improvisation and on recorded music and present a formalism for estimating entropy and information rate as characteristics of musical form and composition design. Biological relevance of the proposed model will be discussed in relation to memory-prediction and temporal learning models.


All singing, all dancing. Sound production and its control in birds

Like human infants, most songbirds acquire their beautiful songs by imitation. They can generate an incredible range of sounds that are controlled by complicated neural networks driving the uniquely avian sound-producing organ: the syrinx. To understand how neural signals are translated into acoustic signals, I study the neuromuscular control and biomechanics of sound production in the syrinx. Instead of working from the brain down to the vocal organ, my aim is to define neural control parameters and constraints by understanding the function of the syrinx. I use a variety of in vivo and in vitro experimental techniques such as electro- and muscle physiology and am developing a preparation to study sound production in vitro. Hand in hand with experimental work, I also use theoretical approaches to understand neural control by developing mathematical models of sound production. Although my main focus is on the songbird system, I recently also broadened my scope and worked on other model vertebrate systems for sound production, such as emergent complexity in toadfish vocalizations and neuromuscular control of echolocation in bats and whales.



Psychoacoustic abilities as predictors of vocal emotion recognition in autism

Individuals with Autism Spectrum Disorders (ASD) show deficiencies in prosodic abilities, both pragmatic and affective. Their deficiencies in affective prosody have been mostly related to cognitive difficulties in emotion recognition. The current study tested an alternative hypothesis, linking vocal emotion recognition difficulties in ASD to lower-level auditory perceptual deficiencies. Twenty-one high-functioning male adults with ASD and 32 male adults from the general population, matched on age and verbal abilities, and screened for normal hearing limits, undertook a battery of auditory tasks. Results demonstrated that individuals with ASD scored significantly lower than controls on vocal emotion recognition. Psychoacoustic abilities were strong predictors of vocal emotion recognition in both the ASD and control groups. Psychoacoustic abilities explained 48.1% of the variance of vocal emotion recognition scores in the ASD group, and 28.0% of the variance in the general population group. These results highlight the importance of lower-level psychoacoustic factors in the perception of prosody in autism.

Benjamin GOLD

The Effects of Musical Pleasure on Dopaminergic Learning

Neuroimaging has linked music listening with brain areas implicated in emotion and reward, such as the ventral striatum, that are regulated by endogenous dopamine transmission (Blood and Zatorre 2001; Menon and Levitin 2005; Salimpoor et al. 2009). Levels of striatal dopamine seem to influence reinforcement learning behavior: subjects with more dopamine tend to learn better from rewards, while those with less dopamine tend to learn better from punishments (Frank et al. 2004; Frank et al. 2007). In this study, we explored the practical implications of how music, through its ability to induce pleasure and enhance dopamine release, affects reward-based learning in a task dependent on dopamine transmission (Frank et al. 2004). Subjects acquired and expressed learning through a probabilistic selection (PS) task (Frank et al. 2004), once while listening to pre-selected pleasurable or neutral music, and once in silence. The musical and silent conditions were pseudo-randomized and counterbalanced across subjects. The PS task consisted of a training phase with three stimulus pairs of different reward probabilities that subjects learned via trial-by-trial feedback followed by a test phase of recombined stimulus pairs and no feedback.
Throughout the training phase, subjects learned to distinguish frequently rewarded symbols from infrequently rewarded ones. Those who first learned the task without music tended to choose between stimuli significantly more accurately and faster than those who learned with music, and there was a trend of neutral music, when compared to pleasurable music, improving training accuracy. By the end of training, subjects across musical conditions learned the task to similar levels, but those who had learned with neutral music performed significantly better at testing compared to those who had learned with pleasurable music regardless of what music they were currently hearing. Furthermore, subjects were best at avoiding punishing symbols when listening to pleasurable music, and best at approaching rewarding symbols during silence. Overall, initial learning and reward approaching seem to be better without music, but within musical conditions these results suggest that pleasurable music might be distracting during training but, possibly via its associated dopamine transmission, could benefit harm-avoidance during testing. Future experiments will directly assess the effects of music on striatal dopamine activity during reinforcement learning.

Blood AJ and Zatorre RJ. 2001. Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proc Natl Acad Sci U S A 98(20):11818-23.
Frank MJ, Seeberger LC, O’Reilly RC. 2004. By carrot or by stick: Cognitive reinforcement learning in parkinsonism. Science 306(5703):1940-3.
Frank MJ, Moustafa AA, Haughey HM, Curran T, Hutchison KE. 2007. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proc Natl Acad Sci U S A 104(41):16311-6.
Menon V and Levitin DJ. 2005. The rewards of music listening: Response and physiological connectivity of the mesolimbic system. Neuroimage 28(1):175-84.
Salimpoor VN, Benovoy M, Longo G, Cooperstock JR, Zatorre RJ. 2009. The rewarding aspects of music listening are related to degree of emotional arousal. PLoS One 4(10):e7487.


Kernel Information Bottleneck as a Metaphor for Music and Art Perception

The information bottleneck method was proposed by Tishby et al. in 1999 as a general computational principle for information processing in the brain. It has been a useful tool in machine learning and data analysis, as well as in explaining neuronal activity and cognitive behavior. The idea behind this method is to extract efficient compressions – minimal sufficient statistics – of a source variable X that preserve as much information as possible on a target variable Y. In this talk we develop a novel extension of the Gaussian Information Bottleneck, suggested by Chechik and Globerson in 2005, to a wide family of continuous cases using Vapnik’s kernel trick.
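For orientation, the trade-off described above is commonly written as the following variational problem. This is the standard textbook form rather than the kernel extension developed in the talk; the symbol T for the compressed representation and the trade-off parameter β are notation assumed here, not taken from the abstract.

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y), \qquad \beta \ge 0
```

Here I(X;T) measures how much of the source X the representation T retains (the cost of compression), I(T;Y) measures how much T tells us about the target Y, and larger values of β favor preserving information about Y over compressing X.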

Based on this technique we will argue that the kernel Information Bottleneck method provides a suggestive metaphor for repeated exposure to art. Repeated exposure to an artwork leads the observer to create a refined internal representation of the work at hand, and this representation can be described quantitatively in terms of the Information Bottleneck approach. In this talk we present the mathematical background as well as illustrative examples from the domains of music and visual art, and discuss further applications of similar ideas in the domains of music cognition and sensorimotor synchronization.



Mapping music to memories in the human brain

Music-evoked autobiographical memories and associated emotions are poignant examples of how music engages the brain. Janata binds music theory, cognitive psychology, and computational modeling to illustrate how music moves about in tonal space (the system of major and minor keys). He then shows how the unique tonal movements of individual excerpts of popular music can be used in conjunction with neuroimaging experiments to identify brain networks that support the experiencing of memories and emotions evoked by the music.

Dezhe JIN

Bird song syntax: statistical model and neural mechanism

Music and language consist of sequences of musical notes or words. The sequences follow rules with both restriction and flexibility. Bird songs of species such as the Bengalese finch have similar traits. With about ten syllable types, a Bengalese finch sings songs with syllable sequences that are variable but not random. Statistical analysis shows that the sequences can be described accurately using a probabilistic state transition model. A state is associated with a single syllable, but a syllable can be represented by multiple states. A state branches out to several states, and the next state is determined using the transition probabilities. A state sequence thus generated determines a syllable sequence. The transition probabilities can adapt if the states are repeated. The resulting model is a partially observable Markov model with adaptation (POMMA). The POMMA-generated syllable sequences are statistically equivalent to the observed syllable sequences. A POMMA can be mapped into a network model of the projection neurons in HVC (proper name), a critical pre-motor nucleus in the song control neural pathway. A state corresponds to a chain network of the HVC projection neurons. Spike propagation through the chain produces the associated syllable. A chain branches into other chains. At a branching point, the spikes select one of the connected chains to continue the propagation due to mutual inhibition mediated by the inhibitory HVC inter-neurons and noise. This selection corresponds to the probabilistic syllable transition. The model also allows influence of auditory feedback on the transition probabilities, which can adapt when a syllable is sung repeatedly. Recent experimental results support the branching chain neural network model of the song syntax in the Bengalese finch.
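The state-transition idea in this abstract can be illustrated with a small sketch. Everything concrete below is hypothetical and chosen only for illustration: the four states, their syllable labels, the transition probabilities, and the adaptation rule (a self-transition is assumed to weaken by a fixed factor each time the state repeats). It is not the fitted model from the talk.

```python
import random

# Hypothetical toy model: states, syllables, and probabilities are invented
# for illustration. Syllable "a" is represented by two states (s1 and s3),
# so the syllable sequence alone does not reveal the underlying state.
SYLLABLE_OF = {"s1": "a", "s2": "b", "s3": "a", "s4": "c"}
TRANSITIONS = {
    "s1": {"s2": 0.7, "s1": 0.3},
    "s2": {"s3": 0.5, "s4": 0.5},
    "s3": {"s4": 1.0},
    "s4": {},  # no outgoing transitions: the song ends here
}
ADAPTATION = 0.5  # assumed factor applied to a self-transition per repeat


def next_state(state, repeats, rng):
    """Draw the next state; self-transitions weaken as a state is repeated."""
    options = TRANSITIONS[state]
    if not options:
        return None
    targets = list(options)
    weights = [
        options[t] * (ADAPTATION ** repeats if t == state else 1.0)
        for t in targets
    ]
    # random.choices normalizes the relative weights internally.
    return rng.choices(targets, weights=weights, k=1)[0]


def sing(start="s1", seed=0, max_len=50):
    """Generate one syllable sequence from the state-transition model."""
    rng = random.Random(seed)
    state, repeats, song = start, 0, []
    while state is not None and len(song) < max_len:
        song.append(SYLLABLE_OF[state])
        new_state = next_state(state, repeats, rng)
        repeats = repeats + 1 if new_state == state else 0
        state = new_state
    return "".join(song)


if __name__ == "__main__":
    for seed in range(5):
        print(sing(seed=seed))
```

Each generated song is variable but not random: it starts with one or more "a" syllables, continues with "b", and ends with "c", optionally passing through the second "a" state on the way.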

Jason KERR



Music and primary auditory cortex

In order to connect music and the auditory system, it is necessary to reduce musical terms such as tones, pitch, intervals, octave equivalence and the like into concepts that are used in auditory neuroscience, such as excitation patterns, receptive fields and so on. I would like to argue that on the one hand, at least as far as primary auditory cortex, neurons represent to a large extent the physical vibrations at the eardrum rather than any perceptual property of sounds such as pitch, loudness or timbre. In that respect, it is impossible to reduce music to auditory responses at the level of primary auditory cortex, and in fact music begins to a large extent at the place where the auditory system ends, in the transition from the physical code to the perceptual code that underlies music. On the other hand, neurons in the same stations, even in anesthetized rats, carry information about the short-term history of sounds, including sensitivity to the probability of sounds, to periodic vs. random tone patterns, and even to surprising notes played in the context of musical pieces. Thus, inasmuch as music is organized sound, organization is represented early, while sound is represented only later.

Frank OHL

Cortical neurodynamics of auditory category learning

This talk will focus on constructive aspects of auditory perception. It has long been hypothesized that certain features of auditory cortex physiology reflect endogenous processes supporting constructive perceptual functions, rather than providing “bottom-up representations” of physical stimulus features.

The talk will give an overview of recent experimental evidence for this hypothesis using data from rodent behavioral electrophysiology. A central aspect will be auditory category formation, an elemental process of auditory perception, including music perception. Specifically, auditory category formation is a perceptual process which allows distinguishing experimentally between bottom-up and top-down aspects of stimulus processing. Behaviorally, category formation becomes apparent as a sudden state transition in the behavior of a subject engaged in the task of discriminating sensory stimuli (“Aha event”). This state transition is accompanied by the emergence of identifiable spatiotemporal patterns of neocortical neuronal activity obtained from multichannel recordings of epidural electrocorticograms. State space analysis of these patterns reveals that the initial response components to stimuli are predominantly determined by physical stimulus features and anatomical connectivity patterns. Later response components show transient epochs of clustering into particular subregions of the state space after learning. The topology of these clusters reflects the perceptual scaling exhibited by the animals in their behavioral category selection. These data suggest the coexistence of separate coding principles for representing physical stimulus attributes and subjectively relevant information about stimuli, respectively. Some implications of these findings for (a) transmodal (audiovisual) category transfer and (b) construction of interactive cortical sensory neuroprostheses are also discussed.


Musical Syntax: Computational and Empirical Perspectives

Tonal harmony constitutes a major building block for Western music and has been the object of numerous theoretical and empirical research approaches. Although there are many textbooks on tonal harmony, there are comparatively few descriptions of how tonal harmonic sequences are constructed. In my contribution I will present reasons to assume that harmony is governed by context-free dependency relationships and discuss what it means to *hear* musical syntax, to hear dependency trees and how these affect perceptual tasks. These theoretical points yield various predictions for music cognition as well as cognitive computational models. I will compare these predictions with recent evidence from cognitive experiments as well as computational models of music.

Constance SCHARFF

Do birds tango?

I will report on different aspects of a project that is based on two general hypotheses:

(1) human-specific traits like music and language must have evolved upon biological substrates that exist in non-human animals

(2) affect in language and music is partly conveyed through rhythmic structures and differences in rhythmic structure may also distinguish signals that are produced in different biological contexts, such as aggression or courtship. Taking these two hypotheses together yielded our specific hypothesis: affective rhythmic manifestations in humans and non-human animals are constrained by overlapping biological mechanisms, i.e. how sounds and body movements are produced and perceived. Specifically we proposed to investigate whether

(a) we can find regular rhythmic structure in courtship song of male birds
(b) females can distinguish rhythmic differences and prefer species-typical rhythms over others and
(c) courtship song and dance are rhythmically coordinated.


Taupe grey and Eyes Wide Shut: A journey into a violinist’s mind

Studying a new violin piece is not just about putting your fingers in the right place. Far more important is to define a series of mental associations with the music and with the technical aspects. Technically difficult spots need clear commands such as ‘left hand first, right hand second’ or ‘third finger’. Associations linked to musical interpretation, on the other hand, can go in any direction: colours, feelings, dream-like images of people or landscapes, even movie titles. As a professional violinist, I have documented all these processes while studying a movement by 18th-century composer Giuseppe Tartini. This diary shows how mental connections appear, develop and are finally selected to ensure a most personal performance.

Daniele SCHÖN

Temporal structures in music and speech

Music and speech are both hierarchically organised, rule-based systems which are dependent on how acoustic events unfold over time. The extent to which these two human capabilities are entwined has been the subject of much interest, and valuable comparisons between the two domains have allowed us to consider to what extent their processing draws on similar or different mechanisms. Of the several parallels which can be drawn between music and speech, of present interest is ‘rhythm’. More specifically, we are interested in metrical structure: the regular alternation of elements (beats or syllables) that are perceived as ‘strong’ or ‘weak’. I will suggest that there is a preference for regularity, and that this regularity enhances music and speech processing via entrainment of neuronal oscillations to a regular rhythm, as suggested by the Dynamic Attending Theory, whereby attentional rhythms are said to become phase-locked to the external auditory events. I will present some data suggesting that metre perception in speech and music is dependent on similar mechanisms, and, to take this further, that carry-over effects after a regular stimulus has ended (such as entrainment or the continuation of a memory trace) may impinge upon the perception of a following auditory event in a cross-domain manner.


Learning in embodied action-perception loops

To become a musician one needs to rehearse. Can we understand from a theoretical perspective what music rehearsal is and how it relates to learning and exploration? To address these questions, we applied information theory and machine learning theory to embodied agents. This approach can help to build agents that produce interesting behaviors in the absence of external rewards.

The first part of the talk describes a theory for simple agents that explore maze worlds in order to build an internal model of how their actions affect their motion in the maze. I will describe how to derive an action policy that optimizes the expected information gain of the internal model and thus maximizes learning speed. We refer to these action policies as learning-driven exploration.
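A minimal sketch of a learning-driven policy of this kind is given below. It rests on assumptions that are not stated in the abstract: the internal model is taken to be a table of Dirichlet pseudo-counts over next states for each (state, action) pair, and the expected information gain of an action is taken to be the expected Kullback-Leibler divergence between the model's prediction after one more observation and its current prediction. The three-state world and the action names are hypothetical.

```python
import math


def predictive(counts):
    """Predictive distribution over next states from Dirichlet pseudo-counts."""
    total = sum(counts)
    return [c / total for c in counts]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_information_gain(counts):
    """Expected KL between the updated and the current predictive distribution.

    The expectation runs over the next state, which is drawn from the
    current predictive distribution (an assumption of this sketch).
    """
    current = predictive(counts)
    gain = 0.0
    for s_next, p in enumerate(current):
        updated = list(counts)
        updated[s_next] += 1  # pretend we observed a transition to s_next
        gain += p * kl(predictive(updated), current)
    return gain


def choose_action(model, state):
    """Greedy learning-driven policy: take the action expected to teach most."""
    gains = {a: expected_information_gain(c) for a, c in model[state].items()}
    return max(gains, key=gains.get)


# Hypothetical model for one state of a three-state world. The outcome of
# "left" has never been observed (uniform prior); "right" is well explored.
model = {0: {"left": [1.0, 1.0, 1.0], "right": [20.0, 1.0, 1.0]}}

if __name__ == "__main__":
    print(choose_action(model, 0))
```

In this toy setting the agent prefers the action whose outcome it knows least about, because one more observation is expected to change that part of the model the most.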

Other labs (Nihat Ay, Ralph Der, Susanne Still) have proposed a different objective, predictive information, to drive actions of agents that already possess an internal model of the world. Predictive information leads to behavior that is varied, yet predictable to the internal model. We refer to these action policies as play-driven. In the second part of the talk I will show how the two principles, expected information gain and predictive information, can be combined to establish an action policy that switches between learning-driven exploration, when the internal model is inaccurate, and play-driven behavior, when the internal model is accurate.

Daniel Little & Friedrich T. Sommer


Birdsong concert

The pieces in this concert are all based on pied butcherbird (Cracticus nigrogularis) songs. This species’ range includes much of the Australian mainland. They learn their songs, but no two mature pied butcherbirds sing entirely the same phrases, and their solo songs change partially or wholly each spring.
My (re)compositions do not seek to develop so much as to illuminate and celebrate these vocalisations, including the melodies, rhythms, timbres (when possible), and other conventions. Most of this concert’s material was initially delivered as nocturnal solo song. My accompanying field recordings include various birds, insects, mammals (kangaroo, dingo, Homo sapiens), the Australian Air Force taking off in a helicopter as I recorded on a remote Arnhemland airstrip—whatever I encounter on my trips.
The American ornithologist and philosopher Charles Hartshorne describes the pied butcherbird as “the true ‘magic flute’, the perfection of musical tonality coming from a bird. I doubt any European will have heard anything so richly musical from birds” (1953: 118). But can the musicality of a pied butcherbird phrase survive transcription and reassignment to an instrument of a different range, timbre, and facility? If so, what might the music of nature have to tell us about the nature of music?
While birdsong drives the compositional decisions and many of these pieces are almost direct transcriptions, I do not bind myself to that rule. Nevertheless, if for a moment here and there I have the conceit that I have improved upon their phrases, I more often have the sense that it is they who have improved me as a musician.



Music Theory for Birds: Searching for Structural Organization in the Interactive Vocalizations of Songbirds

Much is known about how birds sing, but we know very little about why they sing. What features of birdsong make it sexually attractive or appealing?

How can vocal communication trigger and maintain social affiliation? How can vocalizations alter the behavioral state of a listener? We will present four methods for approaching these questions:

  1. a bird-computer interface for studying how non-singing female Zebra Finches may piece together their own song compositions,
  2. a bird-computer interface for studying rhythmic interaction between calling Zebra Finches,
  3. analysis of improvisational patterns in the Australian Pied Butcherbird song,
  4. a test of David Huron’s hypothesis (about the role of anticipation in music’s effect on emotion) using controlled early exposure to syllable combinations.

Eathan Janney, Hollis Taylor, Lucas Parra, Austen Gess, Jon Benichov, David Rothenberg & Ofer Tchernichovski

Mark Jude TRAMO

Neural Coding of Tonal Harmony

This talk describes how single neurons and populations of neurons represent fundamental bass and roughness and, consequently, the perceived consonance and dissonance of harmonic intervals in Western tonal music. The material is heavy in mathematics and basic neurobiology and includes some music theory and psychoacoustics.


Auditory feedback and song learning

What is the role of the auditory forebrain and auditory feedback in vocal learners such as the zebra finch? To support motor learning based on an auditory template, motor neurons must receive acoustic information about the template as well as auditory feedback. It is currently unknown how motor, memory, and feedback signals interact in the brain, though several models have been proposed, including forward and inverse models, and reinforcement learning. In my talk I will present data recorded in singing zebra finches that provide new insights into the algorithms for song learning.