An Overview of Modern Speech Recognition

Xuedong Huang and Li Deng

The task of speech recognition is to convert speech into a sequence of words by a computer program. Because speech is the most natural communication modality for humans, the ultimate dream of speech recognition is to enable people to communicate more naturally and effectively. While this long-term objective requires deep integration with many NLP components discussed in this book, many emerging applications can be readily deployed with the core speech recognition module we review in this chapter. Typical applications include voice dialing, call routing, data entry and dictation, command and control, and computer-aided language learning. In this chapter, we provide an overview of the main components of speech recognition in Section II, followed by a critical review of the historically significant developments in the field in Section III. We devote Section IV to speech recognition applications, including some recent case studies. Section V presents an in-depth analysis of the current state of speech recognition and detailed discussions of a number of future research directions.

BibTeX Citation

    author = {Xuedong Huang and Li Deng},
    title = {An Overview of Modern Speech Recognition},
    booktitle = {Handbook of Natural Language Processing, Second Edition},
    editor = {Nitin Indurkhya and Fred J. Damerau},
    publisher = {CRC Press, Taylor and Francis Group},
    address = {Boca Raton, FL},
    year = {2010},
    note = {ISBN 978-1420085921}