In this post we will take a brief look at how the technique works. More detailed discussions of the Wavelet Transform and Arithmetic Coding, and of exactly how JPEG2000 uses them, will come in subsequent posts.
The most significant difference from JPEG is the use of the Wavelet Transform in place of the block-based Discrete Cosine Transform. The benefit shows up as a much higher compression ratio (it can offer roughly 3-4 times better compression than JPEG without perceptual loss in quality). The resource consumption is higher than for JPEG, though, and this is the major drawback and a reason we do not see JPEG2000 compressed images very often. Also, at high compression ratios JPEG breaks down due to blocking artifacts, while JPEG2000 tends to just get a little blurry.
Anyway, let's get started! We start off with the standard RGB to YCbCr conversion. Unlike JPEG, this step does not sub-sample the chroma components; the reason is discussed later.
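As a rough illustration of the colour conversion, here is a minimal per-pixel sketch in Python. The function names are made up for this post, and the coefficients are the usual YCbCr weights used by the irreversible colour transform. No +128 offset is added to the chroma values, on the assumption that the samples have already been centred around zero.

```python
def rgb_to_ycbcr(r, g, b):
    """Forward colour transform for one pixel (floating point, no offsets)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse colour transform for one pixel."""
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return r, g, b


if __name__ == "__main__":
    # A grey pixel has (almost exactly) zero chroma.
    print(rgb_to_ycbcr(100.0, 100.0, 100.0))
    # A saturated colour survives a round trip up to float rounding.
    print(ycbcr_to_rgb(*rgb_to_ycbcr(200.0, 30.0, 90.0)))
```

Note that all three channels keep the full resolution here; nothing is thrown away at this stage.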
These layers may then be tiled (this is optional). Tiling lets the decoder work with less memory, but using many small tiles can bring back blocking artifacts at the tile boundaries, much like in JPEG. Generally the whole image is processed as a single tile.
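Tiling itself is just a rectangular split of each layer. A small sketch, assuming the layer is stored as a list of rows; `split_into_tiles` is a hypothetical helper name, not something from a JPEG2000 library:

```python
def split_into_tiles(layer, tile_h, tile_w):
    """Split a 2-D list of samples into tiles in row-major order.

    Tiles on the right and bottom edges may be smaller than tile_h x tile_w.
    """
    height, width = len(layer), len(layer[0])
    tiles = []
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            tile = [row[left:left + tile_w] for row in layer[top:top + tile_h]]
            tiles.append(tile)
    return tiles


if __name__ == "__main__":
    layer = [[10 * r + c for c in range(6)] for r in range(4)]  # 4 rows x 6 columns
    for tile in split_into_tiles(layer, 2, 4):
        print(tile)
```

Passing the full height and width as the tile size gives back a single tile, which corresponds to the common case of processing the whole image at once.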
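These tiles are then individually DC level shifted: for unsigned samples of bit depth B, 2^(B-1) is subtracted from every sample (128 for 8-bit data) so that the values are roughly centred around 0. (Strictly speaking, the standard applies this shift before the colour conversion, which is why the chroma channels need no separate offset.) The next step carries out multiple levels of the wavelet transform. Three to five levels are generally applied.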
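The level shift is a one-liner. A minimal sketch, assuming unsigned integer samples stored as a list of rows (the function name is only illustrative):

```python
def dc_level_shift(samples, bit_depth=8):
    """Subtract 2^(bit_depth - 1) from every unsigned sample."""
    offset = 1 << (bit_depth - 1)  # 128 for 8-bit data
    return [[s - offset for s in row] for row in samples]


if __name__ == "__main__":
    print(dc_level_shift([[0, 128, 255]]))  # [[-128, 0, 127]]
```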
Each level of the wavelet transform on 2-D data generates one approximation sub-band (LL, placed at the top-left) and three detail sub-bands (HL, LH and HH); the next level is then applied to the approximation sub-band only. In the chroma layers, the first level of detail coefficients can simply be discarded, and thus we obtain an image similar to a sub-sampled one. The image below gives an insight into the same.
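To make the four sub-bands concrete, here is a sketch of one decomposition level using the integer 5/3 lifting steps (predict, then update) with mirrored borders. It is a simplification under stated assumptions: it handles even-length rows and columns only, the function names are made up for this post, and it filters rows first and then columns for readability, which may not match the order a real codec uses.

```python
def fwd_53_1d(x):
    """One level of the integer 5/3 lifting transform on an even-length list.

    Returns (s, d): low-pass (approximation) and high-pass (detail) halves.
    """
    assert len(x) >= 2 and len(x) % 2 == 0
    even, odd = x[0::2], x[1::2]
    half = len(even)
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at the right border
        d.append(odd[i] - (even[i] + right) // 2)          # predict step
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]                 # mirror at the left border
        s.append(even[i] + (left + d[i] + 2) // 4)         # update step
    return s, d


def inv_53_1d(s, d):
    """Exact inverse of fwd_53_1d: undo the update step, then the predict step."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - (left + d[i] + 2) // 4)
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + (even[i] + right) // 2)
    return x


def transpose(m):
    return [list(col) for col in zip(*m)]


def fwd_53_2d(tile):
    """One 2-D level: returns the LL, HL, LH and HH sub-bands."""
    row_low, row_high = [], []
    for row in tile:
        s, d = fwd_53_1d(row)
        row_low.append(s)
        row_high.append(d)

    def filter_columns(block):
        low, high = [], []
        for col in transpose(block):
            s, d = fwd_53_1d(col)
            low.append(s)
            high.append(d)
        return transpose(low), transpose(high)

    ll, lh = filter_columns(row_low)   # horizontally low-pass
    hl, hh = filter_columns(row_high)  # horizontally high-pass
    return ll, hl, lh, hh


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]
    ll, hl, lh, hh = fwd_53_2d(flat)
    print(ll, hl, lh, hh)  # a flat tile has all-zero detail sub-bands
    signal = [5, 200, 13, 7, 90, 91]
    print(inv_53_1d(*fwd_53_1d(signal)) == signal)  # True: exactly reversible
```

Dropping the first-level HL, LH and HH sub-bands of a chroma layer leaves only its half-size LL sub-band, which is why the result resembles a sub-sampled image.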
Once we have all the detail and approximation coefficients, they are quantized. Each coefficient's magnitude is divided by a step size and rounded down, which pushes a high percentage of the small coefficients to zero (in effect a dead-zone scalar quantizer). This quantization is the main reason the compression is LOSSY. JPEG2000 can also run losslessly: with the reversible 5/3 wavelet the coefficients stay integers, and setting the quantization step size to 1.0 (that is, no quantization at all) gives lossless compression.
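The quantization step can be sketched as follows. This is a simplified dead-zone scalar quantizer; the function names and the reconstruction offset `r = 0.5` are assumptions made for the illustration, not values taken from this post.

```python
import math


def quantize(coeff, step):
    """Dead-zone quantizer: sign(coeff) * floor(|coeff| / step)."""
    sign = -1 if coeff < 0 else 1
    return sign * int(math.floor(abs(coeff) / step))


def dequantize(index, step, r=0.5):
    """Reconstruct a value inside the bin; zero indices stay exactly zero."""
    if index == 0:
        return 0.0
    sign = -1 if index < 0 else 1
    return sign * (abs(index) + r) * step


if __name__ == "__main__":
    coeffs = [7.9, -7.9, 3.9, -3.9, 0.4]
    indices = [quantize(c, 4.0) for c in coeffs]
    print(indices)                                  # small values collapse to 0
    print([dequantize(q, 4.0) for q in indices])    # approximate reconstruction
    print([quantize(c, 1) for c in [5, -3, 0]])     # step 1 keeps integers intact
```

With a step of 4.0, every coefficient with magnitude below 4 lands in the zero bin, which is where the loss comes from. With a step of 1 and integer coefficients, nothing changes.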
Each tile's sub-bands are now divided into precincts, and the precincts into code-blocks. These code-blocks are coded independently using a bit-plane coding technique called EBCOT (Embedded Block Coding with Optimized Truncation).
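Bit-plane coding works on the binary digits of the coefficient magnitudes, from the most significant plane down to the least significant one. The sketch below only shows that split into sign bits and magnitude planes for one small code-block; it does not implement the actual EBCOT coding passes or context modelling, and `bit_planes` is a hypothetical name.

```python
def bit_planes(block):
    """Split integer coefficients into sign bits and magnitude bit-planes.

    Returns (signs, planes) where planes[0] is the most significant plane.
    """
    mags = [[abs(v) for v in row] for row in block]
    signs = [[1 if v < 0 else 0 for v in row] for row in block]
    max_mag = max(max(row) for row in mags)
    num_planes = max(1, max_mag.bit_length())
    planes = []
    for p in range(num_planes - 1, -1, -1):
        planes.append([[(m >> p) & 1 for m in row] for row in mags])
    return signs, planes


if __name__ == "__main__":
    signs, planes = bit_planes([[5, -3], [0, 1]])
    print(signs)
    for plane in planes:
        print(plane)
```

Because the most significant planes come first, cutting the stream short after any plane still leaves a coarser version of every coefficient, which is what makes truncation possible.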
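Within EBCOT, the binary decisions produced by the bit-plane passes are compressed by the MQ-Coder, a binary arithmetic coder. The resulting code-block bitstreams are then grouped into packets, which are finally stored with markers and in layers to enable progressive decoding by resolution or quality (similar in spirit to the progressive loading of an interlaced PNG).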
More on Wavelets and the Coding Procedure in posts to come.