PNG Compression – 2

Continuing from the previous post, the next step is called filtering, whose objective is to transform the scanline bytes so that they compress better. The W3C standard defines only filter method 0, which includes 5 filter types. A filter method applies to the entire image, while the filter type can differ from one scanline to the next. To understand filtering, let's first have a look at the byte ordering so far.

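The scanlines hold bytes as mentioned earlier. Let's label the byte (not pixel) under consideration as x. The byte corresponding to x in the preceding pixel of the same scanline is a (when a pixel takes one byte or less, this is simply the byte before x), while the byte directly above x in the previous scanline is b. The byte that precedes b in the same way is labeled c. Of course, for the first scanline both b and c are 0, and for the first pixel in each scanline both a and c are 0.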

So, in method 0, we have 5 filter types. All of the arithmetic below is done on unsigned bytes, modulo 256.
0) None: Filt(x) = Orig(x)
1) Sub : Filt(x) = Orig(x) – Orig(a)
2) Up : Filt(x) = Orig(x) – Orig(b)
3) Average: Filt(x) = Orig(x) – floor((Orig(a)+Orig(b))/2)
4) Paeth: Filt(x) = Orig(x) – PaethPredictor(Orig(a), Orig(b), Orig(c))

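The PaethPredictor function is defined in the W3C specification. It computes an initial estimate p = a + b – c and returns whichever of a, b, or c is closest to p, which in effect picks the neighbour lying in the direction in which the gradient is smallest.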
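The five filter types can be sketched in Python as follows. This is only a minimal illustration, not a reference implementation: the names `paeth_predictor` and `filter_scanline` and the `bpp` parameter are my own. Here `bpp` is the number of bytes per complete pixel (at least 1), i.e. how far back a and c lie from x and b; with one byte per pixel that is just the preceding byte.

```python
def paeth_predictor(a: int, b: int, c: int) -> int:
    """Return whichever of a (left), b (above), c (upper left) is closest to a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    # Ties are broken in the order a, b, c.
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def filter_scanline(filter_type: int, cur: bytes, prev: bytes, bpp: int) -> bytes:
    """Filter one scanline.

    `cur` is the unfiltered scanline, `prev` the unfiltered scanline above it
    (all zeros for the first scanline), `bpp` the bytes per pixel (assumed >= 1).
    """
    out = bytearray()
    for i, x in enumerate(cur):
        a = cur[i - bpp] if i >= bpp else 0   # same byte in the previous pixel
        b = prev[i]                           # byte directly above
        c = prev[i - bpp] if i >= bpp else 0  # byte above the previous pixel
        if filter_type == 0:      # None
            predicted = 0
        elif filter_type == 1:    # Sub
            predicted = a
        elif filter_type == 2:    # Up
            predicted = b
        elif filter_type == 3:    # Average
            predicted = (a + b) // 2
        elif filter_type == 4:    # Paeth
            predicted = paeth_predictor(a, b, c)
        else:
            raise ValueError(f"unknown filter type {filter_type}")
        out.append((x - predicted) % 256)  # byte arithmetic wraps modulo 256
    return bytes(out)
```

For a smooth ramp such as 10, 20, 30, 40 the Sub filter yields 10, 10, 10, 10, and that kind of repetition is exactly what the compression step likes.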

Once this is done, we come to the actual compression part. PNG uses the deflate/inflate compression method, which combines LZ77 (proposed by Lempel and Ziv) with Huffman coding and uses a sliding window of at most 32 KB. The compressed data is stored in the standard “zlib” format: 1 byte giving the compression method and flags, 1 byte of additional flags, the compressed data itself, and a 4-byte checksum (Adler-32) at the end.
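To see this framing for yourself, here is a small sketch using Python's standard `zlib` module. The helper name `describe_zlib_stream` and the sample data are made up for illustration; the field layout follows the zlib format (RFC 1950).

```python
import struct
import zlib


def describe_zlib_stream(stream: bytes) -> dict:
    """Pick apart the 2-byte header and 4-byte trailer of a zlib stream."""
    cmf, flg = stream[0], stream[1]
    (adler,) = struct.unpack(">I", stream[-4:])  # big-endian Adler-32 trailer
    return {
        "method": cmf & 0x0F,                     # 8 means deflate
        "window_size": 1 << ((cmf >> 4) + 8),     # at most 32768 bytes
        "header_ok": (cmf * 256 + flg) % 31 == 0, # header check bits
        "adler32": adler,
    }


data = b"filtered scanline bytes " * 20  # stand-in for real filtered scanlines
stream = zlib.compress(data)
info = describe_zlib_stream(stream)

# The trailer is the Adler-32 checksum of the uncompressed data.
assert info["adler32"] == zlib.adler32(data)
assert zlib.decompress(stream) == data
```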

In the deflate compression technique, each block consists of two parts. The first part (present in blocks that use dynamic Huffman codes) is a pair of Huffman code trees that describe the representation of the compressed data part; these trees are themselves compressed using Huffman coding.

The second part is the compressed data itself, which consists of a series of elements of two types: literal bytes (bytes of strings that have not been detected as duplicated within the previous 32 KB) and pointers to duplicated strings, where each pointer is a (length, backward distance) pair.
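The literal/pointer idea is easier to see in code. The sketch below only expands an already-decoded list of elements; it ignores the Huffman coding and bit packing that real deflate layers on top. The token representation (an `int` for a literal byte, a `(length, distance)` tuple for a pointer) and the name `expand_tokens` are my own assumptions for illustration.

```python
def expand_tokens(tokens) -> bytes:
    """Expand literals and (length, distance) pointers into the original bytes."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, int):
            out.append(token)  # literal byte, copied as-is
        else:
            length, distance = token
            if distance < 1 or distance > len(out):
                raise ValueError("pointer reaches before the start of the output")
            # Copy byte by byte so that a pointer may overlap the bytes it is
            # producing (length > distance), which deflate allows.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)


# "abc" as literals, then "go back 3 bytes and copy 6 bytes".
tokens = list(b"abc") + [(6, 3)]
assert expand_tokens(tokens) == b"abcabcabc"
```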

The sequence of filtered scanlines is then compressed, and the resulting zlib datastream is split into IDAT (Image Data) chunks. A decoder concatenates the data of all IDAT chunks and decompresses the result to get back the filtered scanlines, from which the actual image is recovered. The whole PNG datastream is made up of chunks, and the standard defines 18 chunk types.
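Here is a rough sketch of how one zlib datastream can be spread over several IDAT chunks and stitched back together. The helper names (`make_chunk`, `split_into_idat`, `join_idat`) and the tiny chunk size are assumptions for the example; the chunk layout (4-byte length, 4-byte type, data, 4-byte CRC over type and data) is the one given in the PNG specification.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one chunk: length, type, data, CRC (computed over type + data)."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def split_into_idat(zlib_stream: bytes, max_size: int) -> list:
    """Cut a zlib stream into IDAT chunks of at most max_size data bytes each."""
    return [
        make_chunk(b"IDAT", zlib_stream[i:i + max_size])
        for i in range(0, len(zlib_stream), max_size)
    ]


def join_idat(chunks: list) -> bytes:
    """Concatenate the data fields of IDAT chunks back into one zlib stream."""
    parts = []
    for chunk in chunks:
        (length,) = struct.unpack(">I", chunk[:4])
        parts.append(chunk[8:8 + length])  # skip length and type, drop the CRC
    return b"".join(parts)


# Four made-up filtered scanlines: a filter-type byte (0) plus 16 data bytes each.
filtered = (bytes([0]) + bytes(range(16))) * 4
chunks = split_into_idat(zlib.compress(filtered), max_size=8)
assert zlib.decompress(join_idat(chunks)) == filtered
```

The chunk boundaries carry no meaning of their own: the split can fall anywhere in the compressed stream, which is why the decoder has to join all IDAT data before inflating it.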

PNG decoders can generally detect two kinds of errors: transmission errors and syntax errors. Transmission errors are caught mainly thanks to the 4-byte CRC stored at the end of every chunk. In addition, the 8-byte PNG signature at the start of the file must be verified before going on to read the actual image datastream.
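A minimal checker along these lines might look like the sketch below. The function name `check_png` and its error messages are hypothetical; the signature bytes and the rule that the CRC covers the chunk type and data come from the PNG specification. It only checks the signature and the CRCs, not chunk ordering or contents.

```python
import struct
import zlib

PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def check_png(data: bytes) -> list:
    """Return the chunk type names in order, or raise ValueError on a bad
    signature, a truncated chunk, or a CRC mismatch."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("bad PNG signature")
    types = []
    pos = 8
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length
        if end + 4 > len(data):
            raise ValueError("truncated chunk")
        body = data[pos + 8:end]
        (stored_crc,) = struct.unpack(">I", data[end:end + 4])
        # The CRC covers the chunk type and data, but not the length field.
        if zlib.crc32(chunk_type + body) != stored_crc:
            raise ValueError("CRC mismatch in chunk %r" % chunk_type)
        types.append(chunk_type.decode("ascii"))
        pos = end + 4
    return types
```

Called on the bytes of an intact file, this returns a list that starts with `IHDR` and ends with `IEND`; flipping a single byte inside any chunk makes it raise instead.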

Well, this should be enough for a basic understanding of how PNG works. For more details, you always have the W3C page.

