MP3 - Wikipedia

MP3 (or, more precisely, MPEG-1/2 Audio Layer 3) is an audio compression algorithm (a.k.a. codec) capable of greatly reducing the amount of data required to reproduce audio, while sounding like a faithful reproduction of the original uncompressed audio to the listener.

Table of contents

1 History

2 Quality of MP3 audio

3 Bit Rate

4 Design bugs of MP3

5 Encoding of MP3 audio

6 Alternatives to MP3

7 Licensing and patent issues

8 External links

History

MPEG-1/2 Layer 2 encoding started as the Digital Audio Broadcast (DAB) project initiated by Fraunhofer IIS-A. The project was financed by the European Union as part of the EUREKA research program, where it was commonly known as EU-147.

EU-147 ran from 1987 to 1994. In 1991 two proposals were available: Musicam (known as Layer II) and ASPEC (Adaptive Spectral Perceptual Entropy Coding), which had similarities to MP3. Musicam was chosen for its simplicity and error resistance.

A working group around Karlheinz Brandenburg and Jürgen Herre combined ideas from Musicam and ASPEC with ideas of their own to create MP3, which was designed to achieve the same quality at 128 kbps as MP2 at 192 kbps.

Both algorithms were finalized in 1992 as part of MPEG-1, the first phase of work by MPEG, which resulted in the international standard ISO/IEC 11172-3, published in 1993. Further work on MPEG Audio was finalized in 1994 as part of the second phase, MPEG-2, which resulted in the international standard ISO/IEC 13818-3, originally published in 1995.

The compression efficiency of lossy encoders is typically stated as a bitrate, because the compression ratio depends on the bit depth and sampling rate of the input signal. Nevertheless, compression ratios are often published, using the CD parameters as the reference (44.1 kHz, 2x16 bit). Sometimes the DAT SP parameters are used instead (48 kHz, 2x16 bit). The compression ratio against this reference is higher for the same bitrate, which demonstrates the problem with the term compression ratio for lossy encoders.
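
The dependence of the ratio on the chosen reference can be made concrete with a little arithmetic. The following is a minimal sketch; the function names are hypothetical and chosen only for illustration. It computes the uncompressed PCM bitrate for the CD and DAT SP parameters and the resulting ratio for the same 128 kbps stream.

```python
def pcm_bitrate_kbps(sample_rate_hz, channels=2, bits_per_sample=16):
    """Bitrate of uncompressed PCM audio in kbps (1 kbps = 1000 bits/s)."""
    return sample_rate_hz * channels * bits_per_sample / 1000.0


def compression_ratio(sample_rate_hz, encoded_kbps, channels=2, bits_per_sample=16):
    """Ratio of the uncompressed reference bitrate to the encoded bitrate."""
    return pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample) / encoded_kbps


cd_ratio = compression_ratio(44100, 128)   # CD reference: 44.1 kHz, 2x16 bit
dat_ratio = compression_ratio(48000, 128)  # DAT SP reference: 48 kHz, 2x16 bit

print(f"CD reference:  {pcm_bitrate_kbps(44100):.1f} kbps, ratio {cd_ratio:.2f}:1")
print(f"DAT reference: {pcm_bitrate_kbps(48000):.1f} kbps, ratio {dat_ratio:.2f}:1")
```

The same 128 kbps stream comes out at roughly 11:1 against the CD reference and 12:1 against the DAT SP reference, although nothing about the encoded audio has changed.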

Fraunhofer's (FhG) official web page publishes the following compression ratios and data rates for MPEG-1 Layer 1, 2 and 3:

Layer 1: 384 kbps, compression 4:1
Layer 2: 192...256 kbps, compression 6:1...8:1
Layer 3: 112...128 kbps, compression 10:1...12:1

These values are more or less public relations figures, because:

the quality depends not only on the encoding file format but also on the quality of the encoder's psychoacoustic model. Typical Layer 1 encoders use a very simple psychoacoustic model, which results in a higher bitrate being needed for transparent encoding.
Layer 1 encoding at 384 kbps, even with this simple psychoacoustic model, is better than Layer 2 at 192...256 kbps.
Layer 3 encoding at 112...128 kbps is worse than Layer 2 at 192...256 kbps.

More realistic bitrates are:

Layer 1: excellent at 384 kbps
Layer 2: excellent at 256...320 kbps, very good at 224...256 kbps, good at 192...224 kbps, should not be used below 160 kbps
Layer 3: excellent at 224...256 kbps, very good at 192...224 kbps, good at 160...192 kbps, should not be used below 128 kbps

A new file format is typically compared by pitting a highly tuned encoder for the new format against a medium-quality encoder for the old format.

The algorithm of the MP3 format uses, at its heart, a hybrid transform to transform a time domain signal into a frequency domain signal:

32 band polyphase quadrature filter
36- or 12-tap MDCT; the size can be selected independently for subbands 0...1 and 2...31
alias reduction postprocessing
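
As an illustration of the MDCT stage only, the following is a minimal sketch that evaluates the textbook MDCT formula directly. It is not taken from any particular encoder: the function name is hypothetical, and the window function, the polyphase filter bank and the alias reduction step are deliberately left out. A block of 36 input samples yields 18 coefficients, and a block of 12 yields 6.

```python
import math


def mdct(block):
    """Direct O(N^2) MDCT: 2*M time-domain samples in, M coefficients out.

    Illustrative only; no window is applied and no fast algorithm is used.
    """
    if len(block) % 2:
        raise ValueError("MDCT input length must be even")
    m = len(block) // 2
    return [
        sum(
            x * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
            for n, x in enumerate(block)
        )
        for k in range(m)
    ]


long_block = [math.sin(2 * math.pi * 3 * n / 36) for n in range(36)]
short_block = long_block[:12]

print(len(mdct(long_block)))   # number of coefficients for a 36-sample block
print(len(mdct(short_block)))  # number of coefficients for a 12-sample block
```

The shorter transform produces fewer coefficients per block but covers a shorter stretch of time, so choosing between the two sizes trades frequency resolution against time resolution.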

In terms of the MPEG specifications, AAC from MPEG-2 is intended to be the successor of the MP3 format. In practice, however, due to numerous patenting and licensing issues with various parts of the MPEG specifications, Ogg Vorbis seems positioned to be the most likely successor to MP3 as the popular format for audio interchange.

MP2 and MP3 hit the Internet

In October 1993, MP2 (MPEG-1 Audio Layer 2) files appeared on the Internet and were often played back using the Xing MPEG Audio Player, and later using MAPlay, a program for UNIX by Tobias Bading initially released on February 22, 1994. (MAPlay was also ported to Microsoft Windows.) Initially the only encoder available for MP2 production was the Xing Encoder, accompanied by the program CDDA2WAV, a CD ripper that copied CD audio to hard disks.

Beginning in the first half of 1995, MP3 files, file representations of MPEG-1 Audio Layer III data, began flourishing on the Internet. Their popularity gave rise to companies and software packages such as Nullsoft's Winamp (http://nullsoft.com/), mpg123 and the now bankrupt Napster.

Quality of MP3 audio

Many listeners accept the MP3 bitrate of 128 kilobits per second (kbps) as near enough to CD quality; this provides a compression ratio of approximately 11:1, although listening tests show that with a bit of practice, most listeners can reliably distinguish 128 kbps MP3s from CD originals. To many other listeners, 128 kbps is unacceptably low quality, which is unfortunate since many commonly-available encoders set this as their default bitrate.

Possible encoders:

ISO dist10 reference code: poor quality, invalid MP3 files (all audio blocks are marked as corrupted)
Xing: mainly based on ISO code, quality similar to ISO dist10
Blade: quality similar to ISO dist10
FhG: some are good, some have really nasty bugs
- ACM Producer Pro: some versions generate annoying artefacts
Lame
- --r3mix: outdated for more than 2 years
- --alt-preset: Alternative presets by Dibrom (Nickname of a Lame programmer) with good quality at medium bitrates.

The quality of MP3 audio depends on the quality of the encoder and the difficulty of the signal to be encoded. Good encoders give acceptable quality at 128...160 kbps and achieve near transparency at 160...192 kbps. Low-quality encoders never reach near transparency, not even at 320 kbps. It is therefore pointless to speak of 128 kbps or 192 kbps quality: a 128 kbps MP3 encoded with a good encoder typically sounds better than a 192 kbps MP3 encoded with a bad one.

An important feature of MP3 is that it is lossy -- meaning that it removes information from the input in order to save space. As with most modern lossy encoders, MP3 algorithms work hard to ensure that the sounds they remove cannot be detected by human listeners, by modelling characteristics of human hearing such as noise masking.

However, experienced listeners can tell the difference from the original at 192 kbps, and even at 256 kbps on some of the less powerful (and obsolete) encoders. If your aim is to archive sound files with no loss of quality, you may be more interested in lossless audio compression such as FLAC, SHN, or LPAC -- these will generally compress a 16-bit PCM audio stream to approximately 50-75% of the original size (depending upon the characteristics of the audio itself).

Bit Rate

The bit rate, i.e. the number of binary digits streamed per second, is variable for MP3 files. The general rule is that the higher the bitrate, the more information from the original sound file is retained, and thus the higher the quality of the played-back audio. In the early days of MP3 encoding, a fixed bit rate was used for the entire file.

Bit rates available in MPEG-1 Layer 3 are 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256 and 320 kbit (1000 bits) per second, and the available sample frequencies are 32, 44.1 and 48 kHz. 44.1 kHz is almost always used, as this is the audio CD frequency, and 128 kbit/s is a de facto "good enough" standard. MPEG-2 and the (non-official) MPEG-2.5 add more bitrates: 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160 kbps.
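
To give a feel for what these bitrates mean in practice, the following minimal sketch estimates the size of the audio data for a fixed-bitrate file. The function name and the 4-minute example track are assumptions made for illustration, and tags and other file overhead are ignored.

```python
# MPEG-1 Layer 3 bitrates in kbit/s (1 kbit = 1000 bits), as listed above.
MPEG1_LAYER3_KBPS = (32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320)


def audio_bytes(kbps, seconds):
    """Approximate size in bytes of `seconds` of audio at a fixed bitrate."""
    if kbps not in MPEG1_LAYER3_KBPS:
        raise ValueError(f"{kbps} kbps is not an MPEG-1 Layer 3 bitrate")
    return kbps * 1000 * seconds // 8


for rate in (128, 192, 320):
    size = audio_bytes(rate, 240)  # hypothetical 4-minute track
    print(f"{rate} kbps: {size / 1_000_000:.2f} MB")
```

Because the size grows linearly with the bitrate, moving from 128 kbps to 320 kbps makes the same track two and a half times larger.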

However, audio in MP3 files is divided into chunks called frames, each of which has a bitrate marker, so it is possible to change the bitrate dynamically as the file is played. This technique makes it possible to use more bits for parts of the sound with high dynamics (much "sound movement") and fewer bits for parts with low dynamics. Some encoders utilize this possibility to a greater or lesser extent.
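
The per-frame bitrate marker can be illustrated with a short sketch that reads the bitrate from a 32-bit frame header. It assumes the commonly documented MPEG audio header layout and handles only MPEG-1 Layer 3; the function names and the three sample header values are hypothetical and exist only for this example.

```python
# Bitrate index -> kbps for MPEG-1 Layer 3 (index 0 is "free", 15 is invalid).
BITRATE_KBPS = (None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None)
# Sample rate index -> Hz for MPEG-1 (index 3 is reserved).
SAMPLE_RATE_HZ = (44100, 48000, 32000, None)


def parse_header(header):
    """Return (kbps, sample_rate_hz, frame_bytes) for an MPEG-1 Layer 3 header."""
    if header >> 21 != 0x7FF:
        raise ValueError("frame sync not found")
    version = (header >> 19) & 0x3
    layer = (header >> 17) & 0x3
    if version != 0b11 or layer != 0b01:
        raise ValueError("not an MPEG-1 Layer 3 frame")
    kbps = BITRATE_KBPS[(header >> 12) & 0xF]
    rate = SAMPLE_RATE_HZ[(header >> 10) & 0x3]
    padding = (header >> 9) & 0x1
    if kbps is None or rate is None:
        raise ValueError("unsupported bitrate or sample rate index")
    # 1152 samples per frame -> 144 bytes per (bit/s divided by Hz).
    frame_bytes = 144000 * kbps // rate + padding
    return kbps, rate, frame_bytes


def average_kbps(headers):
    """Mean bitrate over frames that share one sample rate (equal durations)."""
    rates = [parse_header(h)[0] for h in headers]
    return sum(rates) / len(rates)


# Three made-up headers at 44.1 kHz whose bitrate changes from frame to frame.
frames = [0xFFFB9000, 0xFFFBB000, 0xFFFB5000]
for h in frames:
    print(hex(h), parse_header(h))
print("average kbps:", average_kbps(frames))
```

Since every frame at a given sample rate covers the same amount of playing time, a decoder only needs the header of each frame to know how many bytes to read next, which is what lets the bitrate change from one frame to the following one.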

Design bugs of MP3

There are several flaws in the MP3 file format that cannot be fixed by a good encoder. These flaws are inherent properties of the format.

the exact play length of a piece of music cannot be encoded (Vorbis)
time resolution is too low for highly transient signals (AAC, Vorbis)
overall encoder/decoder delay is not defined (Vorbis)
no scale factor band for frequencies above 15.5/15.8 kHz (AAC, Vorbis)
joint stereo is done on a per-frame basis (AAC, Vorbis)
bitrate is limited to 320 kbps (AAC, Vorbis)

The file formats in which each flaw is fixed are given in parentheses.

Encoding of MP3 audio

The MPEG-1 standard does not include a precise specification for an MP3 encoder. The decoding algorithm and file format, by contrast, are well defined. Implementors of the standard were expected to devise their own algorithms for removing parts of the information in the raw audio (or rather its MDCT representation in the frequency domain). This process is typically based on psychoacoustic coding, i.e. removing things that a human listener will not notice anyway, by modelling the human auditory system (both the ears and the brain).

As a result, there are many different MP3 encoders available, each producing files of differing quality; as of September 30, 2001, the best encoder at high bitrates (128 kbps and up) is LAME, and the best at low bitrates is Fraunhofer's own encoder. MP3 decoding, however, is carefully defined in the standard. Most decoders are "bitstream compliant", meaning that they will each produce exactly the same uncompressed output from a given MP3 file.

Many other lossy audio codecs exist, including:

MPEG-1/2 Audio Layer 2 (MP2), MP3's predecessor;
MP+, a derivative of MP2;
MPEG-2 AAC, used by LiquidAudio, but not many others due in part to stiff patent royalties;
ATRAC, used in Sony's Minidisc;
AC-3, used in Dolby Digital and DVD;
QDesign, used in QuickTime at high bitrates;
Windows Media Audio (WMA) from Microsoft;
RealAudio from RealNetworks;
mp3PRO from Thomson Multimedia;
Ogg Vorbis from the Xiph.org Foundation, a free software codec.

MP2, MP3, AAC, and mp3PRO are all members of the same technological family and depend on roughly similar psychoacoustic models. The Fraunhofer Gesellschaft owns many of the basic patents underlying these codecs, with Dolby Labs, Sony, Thomson Consumer Electronics, and AT&T holding other key patents.

Alternatives to MP3

There are also some lossless audio compression methods in use on the Internet, such as FLAC, SHN and LPAC. While they are not similar to MP3, they are good examples of other compression schemes available.

MP3, which was designed and tuned for use alongside MPEG-1/2 Video, generally performs poorly on monaural data at less than 48 kbps or in stereo at less than 80 kbps.

Though proponents of newer codecs such as WMA and RealAudio have asserted that their respective algorithms can achieve CD quality at 64 kbps, listening tests have shown otherwise; however, the quality of these codecs at 64 kbps is definitely superior to MP3 at the same bandwidth.

Thomson claims that its mp3PRO codec achieves CD quality at 64 kbps, but listeners have reported that a 64 kbps mp3PRO file compares in quality to a 112 kbps MP3 file and does not come reasonably close to CD quality until about 80 kbps.

The Xiph.org Foundation, developer of the Vorbis algorithm used in the new Ogg format, claims that Vorbis somewhat exceeds MP3 and WMA sound quality while infringing no patents, and provides a web page with listening tests to demonstrate this.

Licensing and patent issues

Thomson Consumer Electronics (http://www.mp3licensing.com) controls licensing of the MPEG-1/2 Layer 3 patents in countries such as the United States of America and Japan that recognize software patents. Thomson Consumer Electronics has, so far, decided not to cash in on the patents, but this possibility looms like a shadow over the .mp3 file. In fact Microsoft, the maker of the Windows operating system, chose to move away from MP3 to its own proprietary Windows Media formats to avoid any patent implications.

In spite of these threats, the MP3 format remains popular; the reasons for this appear to be the network effects caused by:

people's familiarity with the format,
the large quantity of music now available in the MP3 format,
the wide variety of existing software and hardware that takes advantage of the file that revolutionized the music industry and copyright law.

External links

MPEG Audio Web Page (http://www.tnt.uni-hannover.de/project/mpeg/audio/)
MPEG Audio FAQ (http://www.tnt.uni-hannover.de/project/mpeg/audio/faq/)
MPEG FAQs (http://mpeg.telecomitalialab.com/faq.htm)
MPEG Audio Resources and Software (http://www.mpeg.org/MPEG/audio.html)
Xiph.org listening test of Vorbis vs. MP3, RealAudio, Windows Media, etc. (http://www.xiph.org/ogg/vorbis/listen.html)
LAME MP3 Encoder downloads (http://mitiok.cjb.net/)
Hydrogenaudio - Forum discussing MP3 and other audio formats (http://hydrogenaudio.org/)