Speech and Audio Processing
A. Basics
- Human Speech Production & Hearing
- Speech Production: Anatomy & Models
- Hearing: Anatomy & Psychoacoustics
- Signal Representation
- Long-term description vs Short-term description
- Stochastic Properties
- Spectral representation & Cepstral representation
B. Source Coding for speech & audio signals
- Data compression
- Quantization
- Linear Prediction
- Coding in Time Domain
- Coding in Frequency Domain
C. Basics of Automatic Speech Recognition (ASR)
- Basics
- Approaches to ASR
- Acoustic-phonetic approach
- Pattern Recognition approach
- AI approach
- Acoustic Modeling
- Feature Extraction
- Feature Transformation
- Pattern Comparison
- Hidden Markov Models
- Properties of HMM
- Evaluation of model: Forward Algorithm
- Search for hidden structure of observations: Viterbi Algorithm
- Optimization of model by training: Baum-Welch Algorithm
D. Basics of Text-to-Speech (TTS) Translation
- System Architecture
- Text Analysis
- Phonetic Analysis
- Prosody
- Speech Synthesis
- Basics
- Formant Synthesis
- Concatenative Synthesis
- Prosodic Modification of Speech: OLA, SOLA, PSOLA
E. Signal Enhancement
- Signal Processing Methods
- Single-channel acquisition & reproduction
- Multi-channel acquisition & reproduction: Beamforming
- Acoustic echo cancellation (AEC)
- Noise Reduction
- Dereverberation
- MIMO Systems for Blind Signal Acquisition: TRINICON