Advances in Audio and Speech Signal Processing: Technologies and Applications

Hector Perez-Meana
National Polytechnic Institute, Mexico

IDEA GROUP PUBLISHING
Hershey • London • Melbourne • Singapore

Acquisition Editor: Kristin Klinger
Senior Managing Editor: Jennifer Neidig
Managing Editor: Sara Reed
Assistant Managing Editor: Sharon Berger
Development Editor: Kristin Roth
Copy Editor: Kim Barger
Typesetter: Jamie Snavely
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.

Published in the United States of America by
Idea Group Publishing (an imprint of Idea Group Inc.)
701 E. Chocolate Avenue
Hershey PA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: cust@idea-group.com
Web site: http://www.idea-group.com

and in the United Kingdom by
Idea Group Publishing (an imprint of Idea Group Inc.)
3 Henrietta Street
Covent Garden
London WC2E 8LU
Tel: 44 20 7240 0856
Fax: 44 20 7379 0609
Web site: http://www.eurospanonline.com

Copyright © 2007 by Idea Group Inc. All rights reserved. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.

Product or company names used in this book are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI of the trademark or registered trademark.

Library of Congress Cataloging-in-Publication Data

Advances in audio and speech signal processing : technologies and applications / Hector Perez Meana, editor.
p. cm.
Summary: “This book provides a comprehensive approach of signal processing tools regarding the enhancement, recognition, and protection of speech and audio signals. It offers researchers and practitioners the information they need to develop and implement efficient signal processing algorithms in the enhancement field” Provided by publisher.
Includes bibliographical references and index.
ISBN 978-1-59904-132-2 (hardcover)
ISBN 978-1-59904-134-6 (ebook)
1. Sound Recording and reproducing. 2. Signal processing Digital techniques. 3. Speech processing systems. I. Meana, Hector Perez, 1954-
TK7881.4.A33 2007
621.389’32 dc22
2006033759

British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.

All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.

Table of Contents

Foreword
Preface

Chapter I
Introduction to Audio and Speech Signal Processing
Hector Perez-Meana, National Polytechnic Institute, Mexico
Mariko Nakano-Miyatake, National Polytechnic Institute, Mexico

Section I
Audio and Speech Signal Processing Technology

Chapter II
Digital Filters for Digital Audio Effects
Gordana Jovanovic Dolecek, National Institute of Astrophysics, Mexico
Alfonso Fernandez-Vazquez, National Institute of Astrophysics, Mexico

Chapter III
Spectral-Based Analysis and Synthesis of Audio Signals
Paulo A.A. Esquef, Nokia Institute of Technology, Brazil
Luiz W.P. Biscainho, Federal University of Rio de Janeiro, Brazil

Chapter IV
DSP Techniques for Sound Enhancement of Old Recordings
Paulo A.A. Esquef, Nokia Institute of Technology, Brazil
Luiz W.P. Biscainho, Federal University of Rio de Janeiro, Brazil

Section II
Speech and Audio Watermarking Methods

Chapter V
Digital Watermarking Techniques for Audio and Speech Signals
Aparna Gurijala, Michigan State University, USA
John R. Deller, Jr., Michigan State University, USA

Chapter VI
Audio and Speech Watermarking and Quality Evaluation
Ronghui Tu, University of Ottawa, Canada
Jiying Zhao, University of Ottawa, Canada

Section III
Adaptive Filter Algorithms

Chapter VII
Adaptive Filters: Structures, Algorithms, and Applications
Sergio L. Netto, Federal University of Rio de Janeiro, Brazil
Luiz W.P. Biscainho, Federal University of Rio de Janeiro, Brazil

Chapter VIII
Adaptive Digital Filtering and Its Algorithms for Acoustic Echo Canceling
Mohammad Reza Asharif, University of Okinawa, Japan
Rui Chen, University of Okinawa, Japan

Chapter IX
Active Noise Canceling: Structures and Adaption Algorithms
Hector Perez-Meana, National Polytechnic Institute, Mexico
Mariko Nakano-Miyatake, National Polytechnic Institute, Mexico

Chapter X
Differentially Fed Artificial Neural Networks for Speech Signal Prediction
Manjunath Ramachandra Iyer, Bangalore University, India

Section IV
Feature Extraction Algorithms and Speech Speaker Recognition

Chapter XI
Introduction to Speech Recognition
Sergio Suárez-Guerra, National Polytechnic Institute, Mexico
Jose Luis Oropeza-Rodriguez, National Polytechnic Institute, Mexico

Chapter XII
Advanced Techniques in Speech Recognition
Jose Luis Oropeza-Rodriguez, National Polytechnic Institute, Mexico
Sergio Suárez-Guerra, National Polytechnic Institute, Mexico

Chapter XIII
Speaker Recognition
Shung-Yung Lung, National University of Taiwan, Taiwan

Chapter XIV
Speech Technologies for Language Therapy
Ingrid Kirschning, University de las Americas, Mexico
Ronald Cole, University of Colorado, USA

About the Authors
Index

Foreword

Speech is no doubt the most essential medium of human interaction. By means of modern digital signal processing, we can interact, not only with others, but also with machines.
The importance of speech/audio signal processing lies in preserving and improving the quality of speech/audio signals. These signals are treated in a digital representation where various advanced digital-signal-processing schemes can be carried out adaptively to enhance the quality. Here, special care should be paid to defining the goal of “quality.” In its simplest form, signal quality can be measured in terms of signal distortion (distance between signals). However, more sophisticated measures such as perceptual quality (the distance between human perceptual representations), or even service quality (the distance between human user experiences), should be carefully chosen and utilized according to applications, the environment, and user preferences. Only with proper measures can we extract the best performance from signal processing.

Thanks to recent advances in signal processing theory, together with advances in signal processing devices, the applications of audio/speech signal processing have become ubiquitous over the last decade. This book covers various aspects of recent advances in speech/audio signal processing technologies, such as audio signal enhancement, speech and speaker recognition, adaptive filters, active noise canceling, echo canceling, audio quality evaluation, audio and speech watermarking, digital filters for audio effects, and speech technologies for language therapy.

I am very pleased to have had the opportunity to write this foreword. I hope the appearance of this book stimulates the interest of future researchers in the area and brings about further progress in the field of audio/speech signal processing.

Tomohiko Taniguchi, PhD
Fujitsu Laboratories Limited

Tomohiko Taniguchi (PhD) was born in Wakayama, Japan, on March 7, 1960. In 1982 he joined Fujitsu Laboratories Ltd., where he has been engaged in the research and development of speech coding technologies.
In 1988 he was a visiting scholar at the Information System Laboratory, Stanford University, CA, where he did research on speech signal processing. He is director of The Mobile Access Laboratory of Fujitsu Laboratories Ltd., Yokosuka, Japan. Dr. Taniguchi has made important contributions to the speech and audio processing field, which are published in a large number of papers, international conferences, and patents. In 2006, Dr. Taniguchi became a fellow member of the IEEE in recognition of his contributions to speech coding technologies and the development of digital signal processing (DSP) based communication systems. Dr. Taniguchi is also a member of the IEICE of Japan.

Preface

With the development of VLSI technology, the performance of signal processing devices (DSPs) has greatly improved, making possible the implementation of very efficient signal processing algorithms that have had a great impact on, and contributed in a very important way to, the development of a large number of industrial fields. One of the fields that has experienced impressive development in recent years, with the use of many signal processing tools, is the telecommunication field. Several important developments have contributed to this, such as efficient speech coding algorithms (Bosi & Goldberg, 2002), equalizers (Haykin, 1991), echo cancellers (Amano, Perez-Meana, De Luca, & Duchen, 1995), and so forth.

During the last several years, very efficient speech coding algorithms have been developed that have reduced the bit rate required in a digital telephone system from 32 kbit/s, provided by standard adaptive differential pulse code modulation (ADPCM), to 4.8 kbit/s or even 2.4 kbit/s, provided by some of the most efficient speech coders. This reduction was achieved while keeping a reasonably good speech quality (Kondoz, 1994).
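As a rough illustration of the bit rates quoted above, the following Python sketch computes a raw bit rate from sampling parameters and the compression ratio of the lower-rate coders relative to 32 kbit/s ADPCM. The 8 kHz sampling rate and 4 bits per sample are assumptions chosen to match the usual telephone-band ADPCM configuration, and the function names are hypothetical; none of them come from the book.

```python
def bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw bit rate in bit/s of a uniformly sampled signal."""
    return sample_rate_hz * bits_per_sample * channels


def bytes_per_minute(rate_bps: int) -> float:
    """Storage needed for one minute of signal at the given bit rate."""
    return rate_bps * 60 / 8


# Assumed telephone-band ADPCM parameters: 8 kHz sampling, 4 bits per sample.
adpcm_rate = bit_rate(8000, 4)

# Bit rates of the low-rate speech coders mentioned in the text.
coder_rates = [4800, 2400]

# Compression ratio of each coder relative to ADPCM.
ratios = {rate: adpcm_rate / rate for rate in coder_rates}

if __name__ == "__main__":
    print(f"ADPCM: {adpcm_rate} bit/s, "
          f"{bytes_per_minute(adpcm_rate):.0f} bytes per minute")
    for rate, ratio in ratios.items():
        print(f"{rate} bit/s coder: {ratio:.2f} times fewer bits, "
              f"{bytes_per_minute(rate):.0f} bytes per minute")
```

Under these assumptions, one minute of speech drops from 240,000 bytes with ADPCM to 18,000 bytes at 2.4 kbit/s, which is the kind of saving the preface refers to.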
Another important development with a great impact on modern communication systems is echo cancellation (Messershmitt, 1984), which reduces the distortion introduced by the conversion from a bidirectional to a one-directional channel required in long distance communication systems. Echo cancellation technology has also been used to improve the development of efficient full duplex data communication devices. Other important devices are equalizers, which are used to reduce intersymbol interference, allowing the development of efficient data communications and telephone systems (Proakis, 1985).

In the music field, the advantages of digital technology have allowed the development of efficient algorithms for generating audio effects, such as the introduction of reverberation in music generated in a studio to make it sound more natural. Signal processing technology also allows the development of new musical instruments or the synthesis of musical sounds produced by already available musical instruments, as well as the generation of audio effects required in the movie industry.

Digital audio technology is also found in many consumer electronics devices, where it modifies audio signal characteristics, for example through modification of the spectral characteristics of the audio signal, recording and reproduction of digital audio and video, editing of digital material, and so forth. Another important application of digital technology in the audio field is the restoration of old analog recordings, achieving an adequate balance between the storage space, transmission requirements, and sound quality. To this end, several signal processing algorithms have been developed in recent years using analysis and synthesis techniques for audio signals (Childers, 2000).
These techniques are very useful for the generation of new and already known musical sounds, as well as for the restoration of already recorded audio signals, especially old recordings, concert recordings, or recordings obtained in any other situation in which it is not possible to record the audio signal again (Madisetti & Williams, 1998).

One of the most successful applications of digital signal processing technology in the audio field is the development of efficient audio compression algorithms that allow very important reductions in storage requirements while keeping a good audio signal quality (Bosi & Goldberg, 2002; Kondoz, 1994). Thus the research carried out in this field has allowed reducing the 10 Mbits required by the WAV format to the 1.41 Mbit/s required by the compact disc standard and, more recently, to the 64 kbit/s required by the MP3PRO standard. These advances in digital technology have allowed the transmission of digital audio over the Internet and the development of audio devices that are able to store several hundred songs with reasonably low memory requirements while keeping a good audio signal quality (Perez-Meana & Nakano-Miyatake, 2005). Digital TV and radio broadcasting over the Internet are other systems that have taken advantage of audio signal compression technology.

In recent years, the acoustic noise problem has become more important as the use of large industrial equipment such as engines, blowers, fans, transformers, air conditioners, motors, and so forth increases. Because of its importance, several methods have been proposed to solve this problem, such as enclosures, barriers, silencers, and other passive techniques that attenuate the undesirable noise (Tapia-Sánchez, Bustamante, Pérez-Meana, & Nakano-Miyatake, 2005; Kuo & Morgan, 1996).
There are mainly two types of passive techniques. The first type uses the concept of impedance change caused by a combination of baffles and tubes to silence the undesirable sound. This type, called a reactive silencer, is commonly used as a muffler in internal combustion engines. The second type, called a resistive silencer, uses the energy loss caused by sound propagation in a duct lined with sound-absorbing material. These silencers are usually used in ducts for fan noise. Both types of passive silencers have been used successfully for many years in several applications; however, the attenuation of passive silencers is low when the acoustic wavelength is large compared with the silencer’s dimension (Kuo & Morgan, 1996).

Recently, with the development of signal processing technology, efficient active noise cancellation algorithms have been developed using single- and multi-channel structures, which use a secondary noise source that destructively interferes with the unwanted noise. In addition, because these systems are adaptive, they are able to track the amplitude, phase, and sound velocity of the undesirable noise, which is in most cases non-stationary. Using active noise canceling technology, headphones with noise canceling capability, systems to reduce the noise in aircraft and cabins, air conditioning ducts, and so forth have been developed. This technology, which must still be improved, is expected to become an important tool to reduce the acoustic noise problem (Tapia et al., 2005).

Another important field in which digital signal processing technology has been successfully applied is the development of hearing aid systems and speech enhancement for persons with oral communication problems, such as alaryngeal speakers. In the first case, the signal processing device performs selective signal amplification on some specific frequency bands, in a similar way to an audio equalizer, to improve the patient’s hearing capacity.
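The band-selective amplification just described can be sketched in a few lines of Python. This is only a minimal illustration, assuming a single frame processed with a naive discrete Fourier transform and a fixed gain over one band; practical hearing aids typically rely on filter banks and level-dependent gain, and the function names and parameter values here are hypothetical rather than taken from the book.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def amplify_band(frame, sample_rate, f_low, f_high, gain):
    """Scale every frequency component between f_low and f_high (Hz) by gain."""
    n = len(frame)
    spectrum = dft(frame)
    for k in range(n):
        # Bins k and n - k describe the same physical frequency of a real
        # signal, so both receive the same gain and the output stays real.
        freq = min(k, n - k) * sample_rate / n
        if f_low <= freq <= f_high:
            spectrum[k] *= gain
    return idft(spectrum)


# Assumed test frame: a 500 Hz tone plus a weaker 2000 Hz tone at 8 kHz sampling.
sample_rate = 8000
n_samples = 64
frame = [math.sin(2 * math.pi * 500 * t / sample_rate)
         + 0.5 * math.sin(2 * math.pi * 2000 * t / sample_rate)
         for t in range(n_samples)]

# Boost only the 1500-2500 Hz band by a factor of 4 (about 12 dB).
boosted = amplify_band(frame, sample_rate, 1500.0, 2500.0, 4.0)
```

In this sketch the 500 Hz component passes through unchanged while the 2000 Hz component is scaled by the chosen gain, which is the equalizer-like behaviour the paragraph refers to.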
For improving alaryngeal speech, several algorithms have been proposed. Some of them [...]

... chapters included is provided. Chapter I provides an overview of some of the most successful applications of signal processing algorithms in the speech and audio field. This introductory chapter provides an introduction to speech and audio signal analysis and synthesis, audio and speech coding, noise and echo canceling, and recently proposed signal processing methods to solve several problems in the medical ...

... Morgan, 1996).

Speech and Audio Coding

Besides interference cancellation, speech and audio signal coding are other very important signal processing applications (Gold & Morgan, 2000; Schroeder & Atal, 1985). This is ...

... However, this will result in the necessity of storing and transmitting a much larger amount of data, unless efficient wideband coding schemes are used. Wideband speech and audio coding intend to minimize the storage and transmission costs while providing an audio and speech signal with no audible differences between the compressed and the actual signals with 20 kHz or higher bandwidth and a dynamic range equal ...

... cancellation and speech compression and enhancement in telephone and data communication systems, high fidelity broadband coding in audio and digital TV systems, speech enhancement for speech and speaker recognition systems, and so forth. However, despite the development that speech and audio systems have achieved, the research in those fields is increasing in order to provide new and more efficient solutions in ...

... used in speech and audio fields. It is intended for scientists and engineers working in enhancing, restoration, and protection of audio and speech signals. The book is also expected to be a valuable reference for graduate students in the fields of electrical engineering and computer science. The book is organized into XIV chapters, divided in four sections. Next a brief description of each section and the ...

... field. A brief introduction of watermarking technology as well as speech and speaker recognition is also provided. Most topics described in this chapter are analyzed with more depth in the remaining chapters of this book. Section I analyzes some successful applications of the audio and speech signal processing technology, specifically in applications regarding the audio effects, audio synthesis, and restoration ...

... an adequate speech recognizer for this application and provides the design features and other elements required to support effective interactions. This chapter provides to developers and educators the tools required to work in the developing of learning methods for individuals with cognitive, physical, and sensory disabilities. Advances in Audio and Speech Signal Processing: Technologies and Applications, ... which includes contributions of scientists and researchers of several countries around the world and analyzes several important topics in the audio and speech signal processing, is expected to be a valuable reference for graduate students and scientists working in this exciting field, especially those involved in the fields of audio restoration and synthesis, watermarking, interference cancellation, and ...

... distinguish two different groups: the narrowband speech coders used in telephone and some video telephone systems, in which the quality of telephone-bandwidth speech is acceptable, and the wideband coders used in audio applications, which require a bandwidth of at least 20 kHz for high fidelity (Madisetti & Williams, 1998).

Narrowband Speech Coding

The most efficient speech coding systems for narrowband ...

... resulting watermarked signal remains with nearly the same quality as the original one. Watermarks can be embedded into audio, image, video, and other formats of digital data in either the temporal or spectral domains. Here the temporal watermarking algorithms embed watermarks into audio signals in their temporal domain, while the spectral watermarking algorithms embed watermarks in certain transform domain ...

Date posted: 01/06/2014, 01:20
