Audio data compression is widely applied in multimedia devices, including video conferencing systems.
The human ear can nominally hear sounds in the range 20 Hz to 20,000 Hz (20 kHz), with peak sensitivity between 2 kHz and 4 kHz. According to the Nyquist sampling theorem, the minimum sampling rate required to avoid aliasing is twice the highest frequency contained in the signal. Digital audio is usually sampled at rates from 8 kHz to 48 kHz, which can represent frequencies up to 4 kHz and 24 kHz respectively. The 4 kHz bandwidth is enough for telephone-quality speech, while 24 kHz exceeds the upper limit of human hearing.
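The relationship between sampling rate and usable bandwidth can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative and not taken from any library.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency that can be represented without aliasing."""
    return sample_rate_hz / 2.0


def aliased_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling.

    Tones above half the sampling rate fold back into the range
    0 .. sample_rate_hz / 2.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


# Common sampling rates and the bandwidth each one can represent.
for fs in (8000, 16000, 44100, 48000):
    print(fs, "Hz sampling ->", nyquist_frequency(fs), "Hz bandwidth")

# A 5 kHz tone sampled at 8 kHz lies above the 4 kHz limit
# and is indistinguishable from a 3 kHz tone.
print(aliased_frequency(5000, 8000))  # prints 3000
```

This folding is why audio is low-pass filtered below half the sampling rate before it is sampled.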
Similar to image data compression, digital audio data compression often uses quantization, entropy coding, companding (the A-law and μ-law algorithms), prediction, and frequency-domain coding using filter banks, such as PQMF and PQF for MDCT. Just as image compression takes advantage of a model of the human visual system, digital audio compression takes advantage of psychoacoustics, which describes the limits of human hearing.
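To make one of these techniques concrete, the sketch below applies the continuous μ-law curve before 8-bit uniform quantization and compares the error on a quiet sample with plain uniform quantization. It assumes μ = 255 and samples normalized to [-1, 1]; the helper names are illustrative, and deployed telephony codecs use a piecewise-linear approximation of this curve rather than the exact logarithm.

```python
import math

MU = 255  # assumed value; this is the one used for 8-bit mu-law telephony


def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] to [-1, 1], expanding small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def quantize(y, bits=8):
    """Uniformly quantize a value in [-1, 1] and return the reconstruction."""
    step = 2.0 / (2 ** bits - 1)
    return round((y + 1.0) / step) * step - 1.0


x = 0.01  # a quiet sample, where uniform quantization is relatively coarse

uniform_error = abs(quantize(x) - x)
companded_error = abs(mu_law_expand(quantize(mu_law_compress(x))) - x)

print("uniform 8-bit error:  ", uniform_error)
print("companded 8-bit error:", companded_error)
```

For quiet samples the companded error is smaller, because the logarithmic curve spends more of the available quantization levels on low amplitudes, where the ear is more sensitive to noise.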
An audio compression algorithm can assign a lower priority to sounds outside the range of human hearing. WMA (Windows Media Audio) is an example of a codec that does this.
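As a rough illustration of a hearing limit that a codec might exploit, the sketch below uses Terhardt's widely cited approximation of the absolute threshold of hearing in quiet. The formula is not part of the text above and is brought in here as an assumption, and the function names are hypothetical. Practical perceptual codecs also model masking between sounds, which this sketch ignores.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in quiet, in dB SPL (Terhardt's formula)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def is_audible_in_quiet(freq_hz, level_db_spl):
    """True if a pure tone at this level is above the threshold in quiet."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)


# Scan the nominal hearing range for the frequency with the lowest threshold.
most_sensitive_hz = min(range(20, 20001, 10), key=threshold_in_quiet_db)
print("most sensitive near", most_sensitive_hz, "Hz")

for f in (100, 1000, 4000, 16000):
    print(f, "Hz threshold:", round(threshold_in_quiet_db(f), 1), "dB SPL")
```

A component whose level falls below this curve is a candidate for coarse quantization or removal. Mapping digital sample values to dB SPL requires an assumption about playback level, which is left out here.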
Echo cancellation is an essential technique for telephony and video conferencing. The following book chapter describes the principle of echo cancellation:
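Independently of that chapter, the basic idea can be sketched with a normalized least-mean-squares (NLMS) adaptive filter, one common approach to acoustic echo cancellation. The filter learns the echo path from the far-end signal and subtracts the estimated echo from the microphone signal. Everything below is an illustrative assumption: the echo path coefficients, the filter length, the step size, and the absence of near-end speech and noise.

```python
import random


def simulate_echo(far_end, echo_path):
    """Microphone signal containing only the far-end signal filtered by echo_path."""
    mic = []
    for n in range(len(far_end)):
        mic.append(sum(h * far_end[n - k]
                       for k, h in enumerate(echo_path) if n - k >= 0))
    return mic


def nlms_echo_cancel(far_end, mic, taps=8, step=0.5, eps=1e-8):
    """Return (residual signal, learned filter weights)."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # what is sent back to the far end
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.0, 0.6, 0.3, -0.1]  # hypothetical room response
mic = simulate_echo(far_end, echo_path)

residual, weights = nlms_echo_cancel(far_end, mic)

echo_energy = sum(v * v for v in mic[-200:])
residual_energy = sum(v * v for v in residual[-200:])
print("echo energy (last 200 samples):    ", echo_energy)
print("residual energy (last 200 samples):", residual_energy)
```

In this idealized setting the learned weights approach the simulated echo path and the residual becomes very small. A real canceller must also cope with near-end speech (double talk), background noise, and much longer echo paths.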
PEAQ (Perceptual Evaluation of Audio Quality) is a standardized algorithm for objectively measuring perceived audio quality.
For more, see:
Perceptual audio coding algorithms:
Coding and Standards:
MPEG4 Audio Algorithm:
Advances in Linear Prediction Techniques:
HD DVD Audio:
Audio quality test equipment: