About us


Leadership team

Erin M. Buchanan
Data Lead

Timo B. Roettger
Analysis Lead

Chenzi Xu
Scientific Lead, Method Lead

Xinbing Luo
Project Manager

Indranil Dutta
Outreach Lead

Cong Zhang
Ethics Lead



ManyTones chenchenzi.github.io/manytones/

Big Team Collaboration


Join as a collaborator

  • Open to research labs, fieldworkers, linguists, cognitive scientists, and musicians
  • Collaborate on data collection and analysis
  • Co-author high-impact, multi-author publications


Collaboration Agreement

Roadmap



Introduction and Aims


Pitch in Communication



  • Pitch carries information in nonhuman vocalisations, human music, and human speech
    • Physical
    • Emotional
    • Social
    • Semantic
  • Pitch: A percept of sound correlated primarily with fundamental frequency
  • Fundamental frequency (\(f_0\)): The lowest frequency of the vocal fold vibration

Primate evolutionary relationships.
Source: The Guardian, 2016

Pitch in Human Language



Macroprosody

  • The number, alignment, heights, and shape of \(f_0\) peaks (Kohler, 1990)
  • Stress and intonation
  • Pitch accent and lexical tone

Microprosody

  • Vowel intrinsic \(f_0\) due to vowel height
  • Consonant-related \(f_0\) perturbations
  • \(f_0\) masking or variations in non-modal phonation

Macroprosody: Lexical Tone

Standard Mandarin tones

Cross-linguistic CF0


Smooths of the F0 difference between voiceless and voiced obstruents over normalized time across 20 languages (Ting et al., 2025)


CF0

\(f_0\), especially at vowel onset, is higher following a voiceless obstruent than following a voiced one.

CF0 effect size ranges from 0.4 to 3.9 semitones across 20 languages (Ting et al., 2025).

Reported CF0 duration ranges from 20 to 140 ms across languages.

The temporal extent and magnitude of CF0 vary considerably.
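To relate effect sizes in semitones to differences in Hz, the standard conversion is 12 times the base-2 logarithm of the frequency ratio. The sketch below applies it to a 150 Hz reference, the baseline used for the stimuli in this deck. The function name `hz_to_semitones` is a hypothetical helper, not part of the project's code.

```python
import math


def hz_to_semitones(f_ref_hz: float, f_hz: float) -> float:
    """Return the interval from f_ref_hz to f_hz in semitones.

    Uses the standard equal-tempered definition: 12 * log2(f / f_ref).
    Positive values mean f_hz is higher than the reference.
    """
    if f_ref_hz <= 0 or f_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_hz / f_ref_hz)


# A 10 Hz rise above a 150 Hz reference is a little over one semitone.
rise_10hz = hz_to_semitones(150.0, 160.0)
# Doubling the frequency is one octave, i.e. 12 semitones.
octave = hz_to_semitones(150.0, 300.0)
```

Because the scale is logarithmic, the same difference in Hz corresponds to fewer semitones at a higher reference frequency, which is one reason cross-linguistic comparisons are usually reported in semitones.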

The Development of Contrastive Tones



Five stages of tonogenesis based on Maran (1973). VOT = voice onset time.

Source: Kang, 2014


Production and perception in tonal development

Stage II - III: Small CF0 perturbations are detectable by listeners

Stage III - IV: CF0 is one of the perceptual cues to the consonant contrast

Aims and Research Questions




1. To investigate minimal CF0 for onset pitch differentiation


2. To enhance our knowledge of microprosodic pitch perception in speech


3. To create an online framework for large-scale auditory perceptual research


To what extent can \(f_0\) perturbations be perceived?


How is the ability to perceive \(f_0\) perturbations distributed across the population?


Do language experience and musical competence affect CF0 perception, and if so, how?

Methods and Pilot Study


The Psychoacoustic Experiment



The pitch-matching paradigm

Schematic representation of the experimental paradigm of Hombert (1975, p.223 Part I).



Note

In Hombert (1975), participants adjusted the fundamental frequency using a knob.

Our Stimuli



Stimuli design

  • 6 perturbation durations: \(\Delta t = 40, 60, 80, 100, 120, 250\) ms

  • 8 perturbation frequencies: \(\Delta f = \pm 10, \pm 20, \pm 30, \pm 40\) Hz

  • 3 sound token types, each with a baseline \(f_0\) of 150 Hz

    • Complex tone with 12 harmonics
    • Vowel [i:]
    • Consonant-vowel syllable [ti:] (with short-lag VOT: \(\approx 12\) ms)


  • Resynthesised from a male recording (44.1 kHz, 16 bit, mono)

  • Intensity normalised to 75 dB

  • Fixed token length of 250 ms
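The design above can be sketched as a condition grid plus an \(f_0\) contour per condition. This is a minimal illustration under two assumptions that the slides do not confirm: that the three factors are fully crossed, and that the perturbation is a linear glide from the offset value back to the baseline over \(\Delta t\). All names (`f0_contour`, `conditions`, the token labels) are hypothetical.

```python
from itertools import product

BASELINE_F0_HZ = 150.0                      # baseline f0 from the stimuli design
TOKEN_MS = 250                              # fixed token length
DURATIONS_MS = [40, 60, 80, 100, 120, 250]  # perturbation durations (delta t)
DELTAS_HZ = [sign * d for d in (10, 20, 30, 40) for sign in (-1, 1)]
TOKEN_TYPES = ["complex_tone", "vowel_i", "syllable_ti"]  # hypothetical labels


def f0_contour(delta_f_hz, delta_t_ms, step_ms=10):
    """Return (time_ms, f0_hz) pairs for one token.

    Assumption: f0 starts at baseline + delta_f and returns linearly to the
    baseline over delta_t, then stays flat until the end of the token.
    """
    points = []
    for t in range(0, TOKEN_MS + 1, step_ms):
        if t < delta_t_ms:
            f0 = BASELINE_F0_HZ + delta_f_hz * (1 - t / delta_t_ms)
        else:
            f0 = BASELINE_F0_HZ
        points.append((t, f0))
    return points


# Assumption: a fully crossed design, giving 3 x 8 x 6 conditions.
conditions = list(product(TOKEN_TYPES, DELTAS_HZ, DURATIONS_MS))
```

For example, `f0_contour(40, 100)` starts at 190 Hz, passes 170 Hz at 50 ms, and reaches the 150 Hz baseline at 100 ms. If the actual design is not fully crossed, the grid would be a subset of `conditions`.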

Our Pipeline


Source: Musical Ear Test (Wallentin et al., 2010)

Our Hypothesis



Categorical effect in perceiving onset differentiation

Possible functions fitted to subjects’ response frequency against slope duration.

Possible functions fitted to subjects’ accuracy in detecting the direction of ΔF in ΔT condition.
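One way to picture the contrast between a gradual and a categorical effect is to compare a linear function with a logistic one. The sketch below is illustrative only: the threshold, slope, and 0.5 chance level (assuming a two-way up/down direction judgement) are hypothetical parameters, not values estimated from the pilot.

```python
import math


def linear_accuracy(delta_t_ms, t_max_ms=250.0, chance=0.5):
    """Gradual alternative: accuracy rises in proportion to delta t."""
    proportion = min(max(delta_t_ms / t_max_ms, 0.0), 1.0)
    return chance + (1.0 - chance) * proportion


def categorical_accuracy(delta_t_ms, threshold_ms=80.0, slope=0.1, chance=0.5):
    """Categorical alternative: a logistic curve with a sharp rise near
    threshold_ms. Both threshold_ms and slope are hypothetical."""
    logistic = 1.0 / (1.0 + math.exp(-slope * (delta_t_ms - threshold_ms)))
    return chance + (1.0 - chance) * logistic


# Under the logistic sketch, accuracy changes far more between 60 and 100 ms
# than between 120 and 160 ms; the linear sketch changes equally in both.
jump_near_threshold = categorical_accuracy(100) - categorical_accuracy(60)
jump_far_from_threshold = categorical_accuracy(160) - categorical_accuracy(120)
```

A categorical effect would show up as a better fit for the logistic shape, with most of the change in accuracy concentrated around a threshold duration.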

Pilot Study



Accuracy of 36 subjects in judging the direction of ΔF.

Accuracy (A) of judging ΔF direction

  • Generally:

\[A \propto (\left| \Delta F \right|,\Delta T)\]

  • When \(\left| \Delta F \right|\) = 10 Hz, accuracy is very low even with ΔT = 120 ms.
  • The accuracy increase is not linear.
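A direction-accuracy score of this kind could be computed per condition as sketched below. The scoring rule is an assumption: a response counts as correct when it deviates from the baseline in the same direction as ΔF. The trial tuples are made up purely to exercise the function and are not pilot data. All names are hypothetical.

```python
from collections import defaultdict


def direction_correct(delta_f_hz, response_hz):
    """True when the response has the same sign as delta F (assumed rule)."""
    return response_hz != 0 and (response_hz > 0) == (delta_f_hz > 0)


def accuracy_by_condition(trials):
    """Aggregate proportion correct per (|delta F|, delta T) cell.

    trials: iterable of (delta_f_hz, delta_t_ms, response_hz) tuples, where
    response_hz is the matched offset relative to the baseline.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_correct, n_total]
    for delta_f, delta_t, response in trials:
        cell = counts[(abs(delta_f), delta_t)]
        cell[0] += int(direction_correct(delta_f, response))
        cell[1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}


# Made-up trials for illustration only (not pilot data).
example_trials = [
    (10, 120, -2.0),
    (-10, 120, -1.0),
    (40, 120, 6.0),
    (-40, 120, -8.0),
]
example_accuracy = accuracy_by_condition(example_trials)
```

Pooling over the sign of ΔF, as done here, matches the use of \(\left| \Delta F \right|\) in the summary above. Keeping rises and falls separate would only require dropping the `abs` call.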

Pilot Study



Average response (black) of 36 subjects, in Hz, by ΔF. Some raw responses (grey) are not shown because of the restricted range of the \(y\) axis.

Average perceived ΔF (R)

  • The average perceived \(\Delta F\) is relatively small (\(\left| R \right| < 10\) Hz).
  • In some cases (e.g. 20 Hz), R reaches a plateau even with increased \(\Delta T\).
  • There is considerable individual variation.

Pilot Study



Average response (black) of 36 subjects, in Hz, by ΔT. Some raw responses (grey) are not shown because of the restricted range of the \(y\) axis.

Average perceived ΔF (R)

  • R is roughly linear with respect to \(\left| \Delta F \right|\) (only) when \(\Delta T\) is large.

\[\frac{\partial^2 R}{\partial (\Delta F)^2} \to 0\]

Timeline and Challenges


Our Next Steps



Step 1: Establish the leadership team

Step 2: Apply for ethical approval and prepare a pilot study

Step 3: Recruit collaborators, collect feedback on pilot, and improve experiment

Step 4: Collaboratively write and submit Registered Report

Step 5: Recruit more collaborators once in-principle acceptance (IPA) is granted

Step 6: Conduct study

Step 7: Complete and publish the journal article

Challenges



Human auditory perception is highly complex

  • Multiple factors influence our auditory perception
    • Language background
    • Music competence
    • Age
    • Cognitive factors (e.g. attention, memory)

Considerable individual variation

  • “Pitch-perfect” vs “tone-deaf”

Trade-off between experiment length and participant number

  • Uncontrollable environment for non-lab experiments
  • Hard to find a large number of participants in some minority language communities

Thank you


Please reach out if you are interested in our project!

https://chenchenzi.github.io/manytones/
manytones@many-languages.com