Exploring the Expressive Range of Conversational Laughter with AI

How can our laughter express a wide range of emotions, and how does it function in conversation? Could discerning a range of expressive laughter types help us interpret human emotion or recognize different kinds of expression, such as sarcasm, which is notoriously difficult to detect? And, perhaps most importantly, what can conversational laughter convey about a person’s health and wellness? These questions prompted our investigation into giggles, chuckles, and chortles. As delightful as laughter is, it is serious business.

This week at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), our team will present new research that explores what this understudied element of human expression can tell us [5]. To do this, we turned to crowdsourcing, machine learning, and other AI techniques to dive deeper into the analysis and meaning of different types of laughter. Our goals included not only discovering and recognizing the range of perceived laughter types within conversational speech, but also understanding the expressive purpose behind each of them. The purpose behind a laugh often conveys different emotions [3], which could offer clues to an individual’s health.

Laughter is often associated with humor. However, people often laugh to express a range of other emotions and intentions. Laughter can signal empathy, encouragement, agreement, approval, sarcasm, social dominance, and connection, to name a few of these social signals [1, 2, 3]. It also provides a mechanism for emotional relief when people are faced with stressful social situations, when they’re experiencing pain, or when they have to talk about difficult topics or sad memories.

Until recently, most computational explorations of laughter treated it as a single type of utterance, without exploring its range and function. Much of this prior work was primarily motivated by the need to distinguish laughter from speech sounds in order to improve the accuracy of speech recognition. Other work analyzed laughter linguistically, in the context of speech prosody, and focused on characterizing it via syllables, pitch, loudness, speed, and duration. While this identified some measurable properties of laughter, it did not identify the different expressive purposes of conversational laughter or distinguish one kind of laughter from another.

Still other approaches to laughter analysis focused on social interactions and included exploring the differences between speech, laughter, joint laughter, and simultaneous speech and laughter, as well as the function of laughter within specific social scenarios, such as communicating empathy, fostering collaboration, and encouraging desired behavior change. There have also been studies to understand how humans respond to robots that laugh [4].

Applying AI to laughter could lay the groundwork to help health professionals one day use technology to collect more data about their patients’ verbal signals and potentially help to paint a more complete picture of an individual’s overall health. Stress, pain, and different emotions are all indicators of a holistic health picture and—when combined with other markers—could potentially indicate a risk for different physical and mental conditions.

We began by asking over 1,200 people to describe what they heard across 120 representative episodes of unscripted, natural, conversational laughter. These voluntarily submitted laughter samples came from oral history interviews contributed to the Library of Congress Veterans History Project. Over half of the descriptions related to perceived emotion in laughter, and about another 40 percent related to perceived voice quality (e.g., breathiness, resonance, and other qualities) and prosody (e.g., loudness, pitch, speed, duration, and articulation). Examples are shown in the graph below.

We applied computational linguistic analysis techniques (specifically latent semantic analysis, which analyzes the relationships between words) to the words that listeners used to describe the sound clips and found twelve frequently described types of laughter.
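To illustrate the intuition, here is a minimal, pure-Python sketch that groups listener descriptors by how similarly they co-occur across clips. The clips, descriptor words, and similarity threshold are all invented for illustration, and simple cosine grouping is a stand-in for the latent semantic analysis used in the actual study.

```python
from math import sqrt

# Hypothetical listener descriptions: clip id -> words used to describe it.
descriptions = {
    "clip1": ["happy", "genuine", "giggly", "rhythmic"],
    "clip2": ["happy", "giggly", "loud"],
    "clip3": ["sad", "short", "quiet", "chuckle"],
    "clip4": ["sad", "quiet", "chuckle"],
    "clip5": ["sarcastic", "confident", "short"],
}

# Build word -> clip-occurrence vectors (a tiny term-document matrix).
vocab = sorted({w for words in descriptions.values() for w in words})
clips = sorted(descriptions)
vec = {w: [1.0 if w in descriptions[c] else 0.0 for c in clips] for w in vocab}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Greedily group words that tend to describe the same clips.
groups = []
for w in vocab:
    for g in groups:
        if cosine(vec[w], vec[g[0]]) > 0.7:
            g.append(w)
            break
    else:
        groups.append([w])

for g in groups:
    print(g)
```

Words that listeners applied to the same clips ("happy", "giggly", "rhythmic") end up grouped together, hinting at a shared laughter type, while descriptors for different clips ("sad", "sarcastic") land in separate groups.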

Four of the more compelling laughter types are Genuine Happiness, Sad Chuckles, Energized Laugh-Talk, and Sarcastic Confidence. Our analytic technique allows us to understand not only what each laughter type is, but also what it clearly is not.

| Laughter type | What this laughter is | What this laughter is not |
| --- | --- | --- |
| Genuine Happiness | Happy and genuine; a mixture of sustained, rhythmic giggles and chuckles | Airy, breathy, quiet, short, or soft (no gasping, sighing, or expressing negative emotions) |
| Sad Chuckles | Very short, low-pitched chuckles, about the same volume as the surrounding speech | Happy, long, airy, or giggly (no inhaling, exhaling, or gasping) |
| Energized Laugh-Talk | Fast, simultaneous speech and laughter | Surprised or nervous |
| Sarcastic Confidence | Sarcastic, sure, confident | Surprised, tentative, or sincere |

We then created AI models that measure how closely different laughter samples match the top eight laughter types we discovered. These models mapped measurable acoustic qualities to the laughter types that people perceived. This process is the reverse of what is often done in machine learning and modeling, but it resulted in models that are better suited to building applications, because they are aligned with what people actually perceive. In this way, we can create laughter “fingerprints” using the collective output of our models.
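As a rough illustration of the fingerprint idea, the sketch below scores a laughter sample against hand-invented acoustic prototypes for four of the types. The feature triples, prototype values, and distance-based scoring are all hypothetical; the models in the paper were learned from data, not written by hand.

```python
from math import sqrt

# Hypothetical acoustic prototypes for four laughter types, as
# (duration_s, relative_loudness, pulse_rate_hz) triples.
PROTOTYPES = {
    "genuine_happiness": (2.0, 0.9, 5.0),
    "sad_chuckles": (0.5, 0.4, 4.0),
    "energized_laugh_talk": (1.5, 0.7, 7.0),
    "sarcastic_confidence": (0.2, 0.2, 0.0),
}

def fingerprint(sample):
    """Score a laughter sample against each prototype.

    Returns a dict of type -> closeness in (0, 1], where 1 means the
    sample's features exactly match that type's prototype.
    """
    scores = {}
    for name, proto in PROTOTYPES.items():
        dist = sqrt(sum((s - p) ** 2 for s, p in zip(sample, proto)))
        scores[name] = 1.0 / (1.0 + dist)  # squash distance into (0, 1]
    return scores

# A long, loud, rhythmic laugh scores highest for Genuine Happiness.
fp = fingerprint((1.9, 0.85, 5.2))
print(max(fp, key=fp.get))
```

The full dict of scores, rather than just the top match, is the “fingerprint”: a compact vector describing how much of each laughter type a sample contains.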

The waveforms for these four distinctive laughter types are shown below, and some of the distinguishing features are visually apparent. A Genuine Happiness laugh has regular pulses and is relatively long and loud compared to the other utterances. Sad Chuckles are shorter and softer but still have visible pulses. Energized Laugh-Talk is simultaneous laughter and talking, so the pulses are present but muddied by speech; the pulses also occur at a faster rate, and the speaker’s lack of pauses and moderate volume are visible. Sarcastic Confidence laughter is a syllable of derision, an uttered “Tuh.” It is very short and quiet (barely voiced and very airy-sounding), with no visible pulses; its irregular appearance is consistent with the white noise that occurs when a person sighs.

We also explored whether a laughter fingerprint could help us identify general modes of human vocal expression. The answer is “yes” for some difficult-to-recognize kinds of vocal expression. For example, we were better able to recognize sarcastic speech that included laughter by evaluating the laughter fingerprint by itself than by looking at standard acoustic features alone [5]. We also saw a 14 percent improvement in the recognition of “positive, reflective calm” speech by using the laughter fingerprint alone rather than standard acoustic features [5].
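A fingerprint like this can serve directly as a feature vector for a downstream classifier. The sketch below trains a tiny perceptron to flag sarcastic speech from hypothetical four-dimensional fingerprint vectors; the data, labels, and model are toy stand-ins for illustration only, not the classifiers evaluated in the paper.

```python
# Hypothetical fingerprint vectors (scores for four laughter types),
# labeled sarcastic (1) or not (0).
data = [
    ((0.1, 0.2, 0.1, 0.9), 1),
    ((0.2, 0.1, 0.2, 0.8), 1),
    ((0.8, 0.2, 0.3, 0.1), 0),
    ((0.3, 0.7, 0.2, 0.2), 0),
]

# Train a perceptron: nudge the weights whenever a sample is misclassified.
w = [0.0] * 4
b = 0.0
for _ in range(20):
    for x, y in data:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        if pred != y:
            w = [wi + (y - pred) * xi for wi, xi in zip(w, x)]
            b += (y - pred)

correct = sum(
    (1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0) == y
    for x, y in data
)
print(f"{correct}/{len(data)} training samples classified correctly")
```

Because the toy data is linearly separable (sarcastic samples score high on the fourth type), the perceptron converges after a couple of passes.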

Equipped with these results, we are expanding our explorations of laughter and how other nonverbal elements of vocal expression—when analyzed with AI—can give clues about an individual’s well-being, from stress levels to signals of pain to emotional distress. We hope to explore how these natural vocal gestures, and their resulting “fingerprints,” could potentially one day help clinicians to detect and monitor disease in non-invasive, accessible, and simple ways.

  1. Gregory A. Bryant et al., “The Perception of Spontaneous and Volitional Laughter Across 21 Societies,” Psychological Science 29(9): 1515–1525, 2018.
  2. Sophie Scott, Nadine Lavan, Sinead Chen, and Carolyn McGettigan, “The social life of laughter,” Trends in Cognitive Sciences 18(12): 618–629, 2014.
  3. Diana P. Szameitat, Benjamin Kreifelts, Kai Alter, Andre J. Szameitat, Anette Sterr, Wolfgang Grodd, and Dirk Wildgruber, “It is not always tickling: Distinct cerebral responses during perception of different laughter types,” NeuroImage 53: 1264–1271, 2010.
  4. Anton Batliner, Stefan Steidl, Florian Eyben, and Björn Schuller, “On Laughter and Speech-Laugh, Based on Observations of Child-Robot Interaction,” in Jürgen Trouvain and Nick Campbell (eds.), The Phonetics of Laughing, Saarland University Press, January 2011.
  5. Mary Pietrowicz, Carla Agurto, Jonah Casebeer, Mark Hasegawa-Johnson, Karrie Karahalios, and Guillermo Cecchi, “Dimensional Analysis of Laughter in Female Conversational Speech,” ICASSP 2019.

IBM Research
