FLAC – Lossless Audio

The Free Lossless Audio Codec (FLAC) is a file format designed specifically for audio that compresses losslessly (i.e. the original samples can be reconstructed perfectly). Well, after MP3, it's debatable whether a perfectly lossless file is actually required, but technology development still happens 🙂 Because it is built specifically for audio, it usually compresses audio better than general-purpose lossless archive formats like zip / rar.

So what exactly makes FLAC lossless, yet smaller than the other techniques? The encoding procedure can be understood in a few simple steps. First, the samples are split into blocks. Each encoded block is packed into a frame, with a header and a footer carrying CRCs and other bookkeeping. The block size is a crucial criterion: a small block means more frame overhead per sample, while a large block means a single model has to cover a long stretch of signal, and it fits badly. By default a block size of 4096 samples is chosen for the 44.1kHz sample rate.
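To make the blocking step concrete, here is a minimal Python sketch. The function name `split_into_blocks` is made up for this post and is not part of any FLAC library; it only shows how a stream of samples gets chopped into fixed-size blocks, with a shorter block at the end.

```python
def split_into_blocks(samples, block_size=4096):
    """Split a list of samples into consecutive blocks.

    The last block may be shorter than block_size.
    """
    return [samples[i:i + block_size]
            for i in range(0, len(samples), block_size)]


# 10000 dummy samples -> two full blocks and one short one
blocks = split_into_blocks(list(range(10000)))
print([len(b) for b in blocks])  # [4096, 4096, 1808]
```

Each of these blocks would then be modelled and encoded on its own, which is why the block size trades frame overhead against how well one model fits.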

In the next step, the encoder tries to approximate the signal in each block with a model, so that the residual (what is left after subtracting the model's prediction) needs fewer bits to encode than the raw samples. The two main ways of modelling are fitting simple polynomials to the signal, or using linear predictive coding (LPC). The model parameters are part of the information to be stored. Of course LPC requires more bits than polynomials, since its coefficients have to be stored, but it generally gives more accurate models.
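The polynomial modelling can be sketched in a few lines of Python. The coefficient table below holds the usual fixed polynomial predictors of order 0 to 4; the function names `fixed_residual` and `fixed_restore` are my own, chosen for illustration, and a real encoder does a lot more (trying several orders, handling multiple channels and so on).

```python
# Fixed polynomial predictors: prediction = sum(c[k] * s[n-1-k])
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def fixed_residual(samples, order):
    """Return (warm-up samples, residual) for a fixed predictor."""
    coeffs = FIXED_COEFFS[order]
    residual = []
    for n in range(order, len(samples)):
        prediction = sum(c * samples[n - 1 - k] for k, c in enumerate(coeffs))
        residual.append(samples[n] - prediction)
    return samples[:order], residual


def fixed_restore(warmup, residual, order):
    """Rebuild the samples from the warm-up samples and the residual."""
    coeffs = FIXED_COEFFS[order]
    samples = list(warmup)
    for r in residual:
        n = len(samples)
        prediction = sum(c * samples[n - 1 - k] for k, c in enumerate(coeffs))
        samples.append(prediction + r)
    return samples


signal = [n * n for n in range(8)]           # a quadratic curve
warmup, residual = fixed_residual(signal, 3)
print(residual)                              # [0, 0, 0, 0, 0]
print(fixed_restore(warmup, residual, 3) == signal)  # True
```

A quadratic is predicted exactly by the order-3 predictor, so the residual is all zeros. Real audio is never that tidy, but the residual is usually much smaller than the signal itself.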

LPC started out as a model of the human speech production system. In extremely simple words, it assumes that the voice is mainly a stream of impulses passing through our throat and vocal tract, which act as a filter (a response function). LPC tries to model this very function, predicting each sample as a weighted sum of the previous few samples, which gives a close approximation of the audio signal's spectrum.
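For the curious, one textbook way to find LPC weights is the autocorrelation method with the Levinson-Durbin recursion. The sketch below is only that textbook method in plain Python, with function names I picked for this post; I am not claiming it matches what any particular FLAC encoder does internally.

```python
import math


def autocorrelation(x, max_lag):
    """r[lag] = sum of x[i] * x[i - lag] for lags 0..max_lag."""
    return [sum(x[i] * x[i - lag] for i in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for a[1..order] so that x[n] ~ sum(a[k] * x[n-k])."""
    a = [0.0] * (order + 1)      # a[0] is unused
    err = r[0]                   # prediction error energy so far
    for i in range(1, order + 1):
        if err == 0:             # silent block, nothing to model
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err            # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1 - k * k)
    return a[1:]


# A sine wave is almost perfectly predictable from two past samples
x = [math.sin(0.3 * n) for n in range(200)]
print(levinson_durbin(autocorrelation(x, 2), 2))
```

These weights are floating point numbers. To stay lossless, the weights have to be rounded to integers before use, so that the encoder and decoder compute exactly the same prediction.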

Once the model is generated, residual coding is carried out on the leftover (signal – model) residual. The residual is generally broken into several partitions, and is found to follow a roughly Laplacian distribution, for which a special type of prefix code called the Rice code (a special case of Golomb coding) is very efficient. Each partition has its own Rice parameter, tuned to the distribution in that small partition.
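Here is a small Python sketch of Rice coding for one partition. Signed residuals are first folded onto non-negative integers, then each value is written as a unary quotient followed by k remainder bits. The bit layout (zeros terminated by a one) and all the function names are assumptions made for illustration, and the output is a string of '0'/'1' characters rather than packed bytes.

```python
def zigzag(r):
    """Fold signed values onto 0, 1, 2, ...: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return 2 * r if r >= 0 else -2 * r - 1


def unzigzag(u):
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def rice_encode(values, k):
    """Encode signed integers with Rice parameter k as a bit string."""
    bits = []
    for r in values:
        u = zigzag(r)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("0" * quotient + "1")            # unary part
        if k:
            bits.append(format(remainder, "0{}b".format(k)))
    return "".join(bits)


def rice_decode(bits, k, count):
    """Decode `count` signed integers from a Rice-coded bit string."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "0":
            quotient += 1
            pos += 1
        pos += 1                                     # skip the terminating 1
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((quotient << k) | remainder))
    return values


def best_rice_parameter(values, max_k=14):
    """Brute force: pick the k that gives the shortest encoding."""
    return min(range(max_k + 1), key=lambda k: len(rice_encode(values, k)))


partition = [0, -1, 2, -3, 1]
k = best_rice_parameter(partition)
encoded = rice_encode(partition, k)
print(k, encoded, len(encoded))
print(rice_decode(encoded, k, len(partition)) == partition)  # True
```

Small residuals get short codes and large ones get long codes, which is exactly what suits a distribution peaked around zero. Picking k per partition lets the code follow the signal as it gets louder or quieter.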

The frame is then packed (the framing operation): a header containing crucial information like sample rate, bits per sample, etc. comes first, the data follows, and the frame is closed by a CRC of the entire encoded frame for error detection.
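The closing CRC is easy to sketch. The FLAC frame footer uses a 16-bit CRC with the polynomial x^16 + x^15 + x^2 + 1 (0x8005), initialised to zero. The Python below is a slow bit-by-bit version for illustration; the function name and the dummy frame bytes are made up, and real implementations use lookup tables.

```python
def crc16_flac(data: bytes) -> int:
    """Bitwise CRC-16, polynomial 0x8005, initial value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


frame = b"pretend this is an encoded frame"   # dummy payload, not real FLAC
footer = crc16_flac(frame).to_bytes(2, "big")

# A decoder can recompute the CRC and compare it with the footer
print(crc16_flac(frame) == int.from_bytes(footer, "big"))     # True

# Flipping a single bit makes the check fail
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]
print(crc16_flac(damaged) == int.from_bytes(footer, "big"))   # False
```

Note that the CRC only detects damage, it cannot repair it. A player that sees a mismatch knows the frame is corrupt and can skip or mute it.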

The main point to note here is that, unlike most other compression procedures, there is no quantisation of the audio samples: the residual is stored exactly. This is what makes the system entirely lossless, as all the procedures described above are perfectly reversible.
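A tiny Python comparison shows the difference. The step size of 4 and the sample values are arbitrary choices for illustration, and the "prediction" here is just the previous sample.

```python
from itertools import accumulate

samples = [3, 7, 12, 14, 13]

# Lossy route: quantise to multiples of 4, the low bits are gone for good
quantised = [(s // 4) * 4 for s in samples]
print(quantised)             # [0, 4, 12, 12, 12]
print(quantised == samples)  # False

# Lossless route: keep the first sample, then store exact differences
residual = [samples[0]] + [samples[i] - samples[i - 1]
                           for i in range(1, len(samples))]
print(residual)              # [3, 4, 5, 2, -1]

# Undoing the prediction is a running sum, and nothing is lost
restored = list(accumulate(residual))
print(restored == samples)   # True
```

The residual values are smaller than the samples, so they are cheaper to code, yet every bit of the original can be recovered.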
