WELCOME TO BSDU - KNOWLEDGE RESOURCE CENTER


BHARTIYA SKILL DEVELOPMENT UNIVERSITY, JAIPUR
KNOWLEDGE RESOURCE CENTER (LIBRARY)
Online Public Access Catalogue (OPAC)

Speech and Audio Processing

By: Apte, Shaila D.
Material type: Book
Publisher: New Delhi: Wiley India Pvt. Ltd., India, 2012, c2012
Description: 438
ISBN: 9788126534081
Subject(s): Electronics
DDC classification: 621.382
Item type | Current location | Collection | Call number | URL | Status | Date due | Barcode
CDs & DVDs | BSDU Knowledge Resource Center, Jaipur | Audio Visual | 621.382 APT | - | Not For Loan | - | CD89
CDs & DVDs | BSDU Knowledge Resource Center, Jaipur | Audio Visual | 621.382 APT | - | Not For Loan | - | CD90
CDs & DVDs | BSDU Knowledge Resource Center, Jaipur | Audio Visual | 621.382 APT | - | Not For Loan | - | CD91
Books | BSDU Knowledge Resource Center, Jaipur | - | 621.382 APT | With CD | Available | - | 001998
Books | BSDU Knowledge Resource Center, Jaipur | - | 621.382 APT | With CD | Available | - | 001999
Books | BSDU Knowledge Resource Center, Jaipur | Not for Loan | 621.382 APT | With CD | Not For Loan | - | 002000

Speech and Audio Processing is a textbook aimed at final-year undergraduates taking a Speech Processing course and at postgraduate students in the ECE, CS, and IT streams. It aims to explain the basic concepts clearly and simply. It begins with the human speech production mechanism and then covers the fundamental parameters of speech, such as pitch frequency and formants, spectral features like the log spectrum, 3-D spectrogram, cepstral features, and MFCCs, as well as linear prediction coefficients, transform-domain parameters, and template matching techniques. It also covers applications such as speech coding, speech recognition, speaker recognition, and speech synthesis.

Contents
Fundamentals of Speech

· The Human Speech Production Mechanism

· LTI Model for Speech Production

· Nature of the Speech Signal

· Linear Time-Varying Model

· Phonetics

· Types of Speech

· Voiced and Unvoiced Decision Making

· Audio File Formats: Nature of the WAV File



Parameters of Speech: Pitch and Formants

· Fundamental Frequency or Pitch Frequency

· Parallel Processing Approach for Calculation of Pitch Frequency

· Pitch Period Measurement Using Spectral Domain

· Cepstral Domain

· Formants and Their Relation With LPC

· Evaluation of Formants Using Cepstrum

· Evaluation of Formants Using Log Spectrum

· Evaluation of Formants Using Power Spectral Density Estimate

· Estimation of Formants: Other Methods



Spectral Parameters of Speech

· Homomorphic Processing

· Cepstral Analysis of Speech: Cepstral Coefficients

· The Auditory System as a Filter Bank

· Mel Frequency Cepstral Coefficients (MFCCs)

· Perceptual Linear Prediction (PLP)

· Log Frequency Power Coefficients (LFPCs)

· Relative Spectral Perceptual Linear Prediction (RASTA-PLP): Strategies for Robustness

· Short-Time Spectral Analysis of Speech: Short-Time Fourier Transform (STFT)

· Wavelet Transform Analysis of Speech



Linear Prediction of Speech

· Lattice Structure Realization

· Forward Linear Prediction

· Autocorrelation Method

· Covariance Method

· Lattice Methods

· Selection of Order of the Predictor

· Line Spectral Frequencies/Line Spectral Pair Frequencies



Speech Quantization and Coding

· Uniform and Non-Uniform Quantizers and Coder

· Companded Quantizers

· Uniform Quantization of Non-uniform Sources: Adaptive Quantizers

· Waveform Coding of Speech

· Comparison of Different Waveform Coding Techniques

· Parametric Speech Coding Techniques

· Sinusoidal Speech Coding Techniques

· Mixed Excitation Linear Prediction Coder

· Multi-Mode Speech Coding (Hybrid Coder)

· Transform Domain Coding of Speech



Speech Processing Applications

· Speech Recognition Systems

· Architecture of a Large Vocabulary Continuous Speech Recognition System

· Deterministic Sequence Recognition for ASR

· Statistical Sequence Recognition for ASR

· Statistical Pattern Recognition and Parameter Estimation

· VQ-HMM-Based Speech Recognition

· Discriminant Acoustic Probability Estimation

· Word Spotting/Keyword Spotting

· Speech Recognition and Understanding

· Speaker Recognition

· Distortion Measures: Mathematical and Perceptual

· Speech Enhancement

· Adaptive Echo Cancellation



Speech Synthesis

· A Text-to-Speech System

· Synthesizer Technologies

· Speech Synthesis Using Other Methods

· Speech Transformations

· Emotion Recognition from Speech

· Watermarking for Authentication of a Speech/Music Signal



Basics of Musical Instruments and Music Synthesis

· Indian Musical Instruments

· Features Used for Classification

· Music Synthesis

· Musical Instrument Digital Interface (MIDI)

· Streaming Audio

· Piano Note Synthesis Using LPC and WT

· Audio Standards



Summary

Key Terms

Multiple Choice Questions

Review Questions

Problems (Write MATLAB Programs)

Suggested Projects (Write MATLAB Programs)

Answers



Frequently Asked Short Questions with Answers

Frequently Asked Long Questions with Pointers

Bibliography

Index
