MIT OpenCourseWare

Syllabus

Organization of the Course: Lectures and Labs

  • Two 1-hour lectures per week
  • Two labs per week

Students are organized into groups of 2 or 3. Laboratory sessions are usually 2-3 hours, and occur on Tuesday and Thursday. Sets of readings for each lab are to be read before class. Some readings are in the text, and others will be handed out.

Lectures will cover background material pertinent to the labs, in these areas:

  • The Acoustics and Acoustic Analysis of Speech
  • The Physiology of Speech Production
  • Sentence-level Phenomena
  • The Perception of Speech
  • Speech Disorders and Development
  • Speech Synthesis and Speech Recognition

Written Requirements

Brief report on each lab: Summarize and interpret your results; details on the method are not necessary. The written report is to be handed in on the next class day after the lab.

Short term paper: Select a topic around the middle of the semester and propose a limited experiment. An oral report on the results will be presented during the last few class meetings. Submit the written version (in the format of the Journal of the Acoustical Society of America) at the end of term.

No exam: Grading is based roughly 40% on the term paper with the rest on lab reports and participation in class.

Required Textbook

Stevens, K. N. Acoustic Phonetics. Cambridge, MA: MIT Press, 1999. ISBN: 026219404X.

Other Reference Books

Wherever possible, the book citations below reflect the specific editions used in the course.

Beranek, L. Acoustics. Revised ed. New York, NY: Acoustical Society of America, 1986. ISBN: 088318494X.

Chomsky, N., and M. Halle. The Sound Pattern of English. Reprint ed. Cambridge, MA: MIT Press, 1991. ISBN: 026253097X.

Denes, P. B., and E. N. Pinson. The Speech Chain: The Physics and Biology of Spoken Language. 2nd ed. New York, NY: W.H. Freeman and Company, 1993. ISBN: 0716723441.

Flanagan, J. L. Speech Analysis, Synthesis and Perception. 2nd ed. New York, NY: Springer-Verlag, 1972. ISBN: 0387055614.

Hardcastle, W. J., and J. Laver, eds. The Handbook of Phonetic Sciences. Oxford, UK: Blackwell Publishers, 1997. ISBN: 0631188487.

Kent, Raymond D., Bishnu S. Atal, and Joanne L. Miller, eds. Papers in Speech Communication: Speech Production. New York, NY: Acoustical Society of America, 1991. ISBN: 0883189585.

———. Papers in Speech Communication: Speech Perception. New York, NY: Acoustical Society of America, 1991. ISBN: 0883189593.

———. Papers in Speech Communication: Speech Processing. New York, NY: Acoustical Society of America, 1991. ISBN: 0883189607.

O'Shaughnessy, D. Speech Communications: Human and Machine. 2nd ed. New York, NY: Wiley-IEEE Press, 1999. ISBN: 0780334493.

Pisoni, D., and R. Remez, eds. The Handbook of Speech Perception. Cambridge, MA: Blackwell Publishing, 2005. ISBN: 0631229272.

The Speech Chain

The study of speech is often summarized as the study of a chain of events, beginning with what goes on in a speaker's brain to plan an utterance, moving through the acoustics of speech, and ending with the steps in the listener's brain that result in comprehension of the utterance:

[Figure: The speech chain.]

This approach makes clear the diversity of topics one must understand in some depth in order to do basic research in speech.

Diversity of Topics that Relate to Speech Research

Linguistics

  • Semantics: The meanings of words, and relations among them.
  • Syntax: The order of words, role of function words.
  • Phonology: Individual phonemic segments, features, stressed and unstressed vowels.
  • For example,
    • What is the phonemic inventory of English?
    • How does it function? The concept of contrast (e.g., pat vs. bat).
    • Why do we believe that it is psychologically real?
    • Why does the same phoneme give rise to different acoustic realizations in different utterances? (e.g., in fluent speech, "Joe ate his soup" loses the /h/ of "his", and the /t/ of "ate" doesn't look like the /t/ in "Tom".)
    • What are the principles that lead to modifications of segments in different environments?
    • How are phonemes, which are usually described in terms of features, translated into phonetic representations? (e.g., /z/ is +voiced and /s/ is -voiced; the same relation holds for many pairs, such as f-v; the patterning of sounds is beautifully captured by the feature concept.)
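
The feature idea in the last question can be illustrated with a small sketch. The feature table below is a deliberately simplified assumption made for illustration (only voicing, place, and manner, for six consonants); it is not the feature system used in the course or in the textbook. It shows how pairs such as s-z and f-v fall out as "identical except for voicing".

```python
# Hypothetical, simplified feature table (an assumption for illustration only).
FEATURES = {
    "p": {"voiced": False, "place": "labial", "manner": "stop"},
    "b": {"voiced": True, "place": "labial", "manner": "stop"},
    "f": {"voiced": False, "place": "labiodental", "manner": "fricative"},
    "v": {"voiced": True, "place": "labiodental", "manner": "fricative"},
    "s": {"voiced": False, "place": "alveolar", "manner": "fricative"},
    "z": {"voiced": True, "place": "alveolar", "manner": "fricative"},
}


def differing_features(a, b):
    """Return the sorted names of the features on which phonemes a and b differ."""
    fa, fb = FEATURES[a], FEATURES[b]
    return sorted(name for name in fa if fa[name] != fb[name])


def voicing_pairs():
    """Return (voiceless, voiced) pairs that differ only in the voicing feature."""
    pairs = []
    for a in FEATURES:
        for b in FEATURES:
            if (not FEATURES[a]["voiced"] and FEATURES[b]["voiced"]
                    and differing_features(a, b) == ["voiced"]):
                pairs.append((a, b))
    return sorted(pairs)


if __name__ == "__main__":
    print(differing_features("s", "z"))
    print(voicing_pairs())
```

With this toy table, the same single-feature relation links p-b, f-v, and s-z, which is the kind of patterning the feature concept is meant to capture.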

The course spans the topics of acoustic analysis, mechanics of human speech production, and human speech processing with an emphasis on relating each area back to linguistic contrasts.

Physiology of Speech Production

Structures capable of generating and modifying speech sounds, including respiration for speech, laryngeal structures, vocal-tract structures and their control, and the nasal tract.

Acoustics

  • General (sound sources: vowels are produced by vibration of the vocal folds, while some sounds are produced with a turbulence noise source; differences among vowels arise from changes in the size and shape of the vocal tract).
  • Resonant properties of the vocal tract, nasal cavities.
  • Sound radiation at the lips.
  • Acoustics of speech sounds traveling through air.
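
As a rough illustration of vocal-tract resonance, a common textbook idealization treats the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of the quarter-wavelength frequency, F_n = (2n - 1)c / (4L). The sketch below assumes this idealized model; the tube length (17.5 cm) and speed of sound (350 m/s) are illustrative assumptions, not values taken from the course materials.

```python
def tube_resonances(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonance frequencies (Hz) of a uniform tube closed at one end, open at the other.

    Uses F_n = (2n - 1) * c / (4 * L). The default speed of sound is an
    assumed round value for air, chosen for illustration.
    """
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_formants + 1)]


if __name__ == "__main__":
    # Assumed tube length of 17.5 cm; gives roughly 500, 1500, and 2500 Hz.
    for n, f in enumerate(tube_resonances(0.175), start=1):
        print(f"F{n} is approximately {f:.0f} Hz")
```

Shortening the tube raises every resonance in proportion, which is one simple way to see why changes in the size and shape of the vocal tract change the sound.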

Acoustic Phonetics

Description of important attributes of speech sounds, especially English sounds; discussion of prosody (i.e., durations, fundamental frequency of vocal fold vibration, and amplitude).

Auditory Nervous System

  • Peripheral: Middle and inner ear; recent progress has come from recording signals from the ears of animals; knowledge about the coding there has implications for which acoustic properties of speech sounds can be discriminated.
  • Central: Because there is less physiological data on central processes, we rely mostly on psychophysics and cognitive psychology, with emerging information from fMRI and other imaging techniques.

Psychophysics and Cognitive Psychology

Studies of people's responses to simple and complex sounds.

Summary of Topics Important to Speech Research

  • Linguistics
  • Physiology of Speech Production and Perception Systems
  • General Acoustics
  • Acoustic Characteristics of Speech Sounds
  • Psychophysics of Auditory System
  • Cognitive Psychology
  • Computer-based Algorithms

In this course, we stress the use of computer algorithms, and their individual strengths and weaknesses, rather than the mathematics of the algorithms themselves (references available).