Introduction

The MP3 algorithm is a generic audio compression standard. Unlike vocal-tract model coders specially tuned for speech signals, the MP3 audio encoder compresses the audio data without making assumptions about the nature of the audio source. Instead, the coder exploits the perceptual limitations of the human auditory system.

Much of the compression results from the removal of perceptually irrelevant parts of the audio signal. Removing such parts introduces distortions that are inaudible; thus the MP3 algorithm can compress any audio signal meant to be heard by the human ear, speech or otherwise, equally well.

Elements of Sound

Sound is a longitudinal wave, that is, the medium vibrates along the same direction in which the wave travels. Sound can be converted via a transducer to another form of energy (usually electrical) for storage.



The intensity of a sound follows the Inverse Square Law, that is, it decreases in inverse proportion to the square of the distance from the sound source, so doubling the distance reduces the intensity to one quarter. Another feature of sound is acoustic masking, i.e. one sound can cover or mask another, making it inaudible, especially in the mid- to treble range.
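The Inverse Square Law can be illustrated with a few lines of Python. This is a minimal sketch that assumes an idealized point source radiating equally in all directions with no reflections or absorption; the function names and the 1 W source power are illustrative choices, not part of the MP3 standard.

```python
import math


def intensity_at(power_watts, distance_m):
    """Intensity in W/m^2 at a given distance from an ideal point source.

    Assumption: the source power spreads evenly over a sphere of
    radius distance_m (free field, no reflections or absorption).
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return power_watts / (4.0 * math.pi * distance_m ** 2)


def level_change_db(d_near, d_far):
    """Change in intensity level, in dB, when moving from d_near to d_far."""
    ratio = intensity_at(1.0, d_far) / intensity_at(1.0, d_near)
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    for d in (1.0, 2.0, 4.0):
        print(f"{d:4.1f} m: {intensity_at(1.0, d):.5f} W/m^2")
    print(f"doubling the distance: {level_change_db(1.0, 2.0):.2f} dB")
```

Each doubling of the distance divides the intensity by four, which corresponds to a drop of roughly 6 dB.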

Generation of sound depends on the production of changes in atmospheric pressure. Sound pressure is the variation above and below normal atmospheric pressure, and the Sound Pressure Level (SPL) expresses the size of that variation in decibels relative to a reference pressure. The frequency of this “ripple” is the frequency of the sound.
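As a rough illustration, the sketch below converts an RMS sound pressure into a level in decibels. It assumes the conventional reference pressure of 20 micropascals used for sound in air, a value the text above does not state; the names `P_REF` and `spl_db` are illustrative.

```python
import math

# Assumed reference pressure: 20 micropascals, the usual convention for air.
P_REF = 20e-6  # pascals


def spl_db(p_rms):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    if p_rms <= 0:
        raise ValueError("RMS pressure must be positive")
    return 20.0 * math.log10(p_rms / P_REF)


if __name__ == "__main__":
    for p in (20e-6, 2e-3, 1.0):
        print(f"{p:10.6f} Pa -> {spl_db(p):6.2f} dB SPL")
```

Because the scale is logarithmic, multiplying the pressure by ten adds 20 dB to the level.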

The human ear is sensitive to frequencies in the range 20 Hz – 20 kHz, and this is referred to as the Audio Frequency Band. Waves with frequencies in this band are called audio waves, or simply sound waves. Waves with frequencies below 20 Hz are called infrasonic and those above 20 kHz are called ultrasonic.
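The band limits above can be written as a small classifier. This is only a sketch: the function name is hypothetical, and treating exactly 20 Hz and exactly 20 kHz as audible is an assumption, since real hearing limits vary from person to person.

```python
AUDIO_LOW_HZ = 20.0       # lower edge of the Audio Frequency Band
AUDIO_HIGH_HZ = 20_000.0  # upper edge of the Audio Frequency Band


def classify_frequency(frequency_hz):
    """Return 'infrasonic', 'audio', or 'ultrasonic' for a frequency in Hz."""
    if frequency_hz < 0:
        raise ValueError("frequency must be non-negative")
    if frequency_hz < AUDIO_LOW_HZ:
        return "infrasonic"
    if frequency_hz <= AUDIO_HIGH_HZ:
        return "audio"
    return "ultrasonic"


if __name__ == "__main__":
    for f in (5, 440, 15_000, 40_000):
        print(f"{f} Hz: {classify_frequency(f)}")
```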

The basic components of a sound wave are frequency (or equivalently wavelength, which is inversely proportional to frequency) and amplitude. A sound wave’s amplitude determines its strength (or loudness), and its frequency determines its pitch (or shrillness).
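The relationship between frequency, wavelength, and amplitude can be sketched as follows. The speed of sound used here (343 m/s, roughly dry air at 20 °C) and the 8 kHz sample rate are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Assumed speed of sound: about 343 m/s in dry air at 20 degrees Celsius.
SPEED_OF_SOUND = 343.0  # metres per second


def wavelength_m(frequency_hz):
    """Wavelength in metres; inversely proportional to frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_SOUND / frequency_hz


def sine_samples(frequency_hz, amplitude, sample_rate=8000, count=8):
    """Sample a pure tone: amplitude sets its strength, frequency its pitch."""
    step = 2.0 * math.pi * frequency_hz / sample_rate
    return [amplitude * math.sin(step * n) for n in range(count)]


if __name__ == "__main__":
    for f in (20.0, 1000.0, 20_000.0):
        print(f"{f:8.0f} Hz -> wavelength {wavelength_m(f):.5f} m")
    print([round(s, 3) for s in sine_samples(1000.0, 0.5)])
```

Raising `amplitude` scales every sample without changing where the zero crossings fall, while raising `frequency_hz` packs more cycles into the same number of samples.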