Category Archives: Image Processing

Liquid Rescale

Most image resizing today works directly at the pixel level, irrespective of the content. A first step towards not distorting the information is to keep the aspect ratio the same. However, this is not very good either, especially if the content of interest in the image was small to begin with, since it becomes even smaller. Liquid rescale (also known as seam carving) is an awesome technique for content-aware image resizing, and the publication can be found here. The video below demonstrates its applications and also quickly explains the main concept.

The idea behind this is astonishingly simple. Instead of removing a straight column (row) of pixels to reduce the width (height), a so-called vertical (horizontal) seam is removed. A seam is a connected path of low-energy pixels running across the image, where energy is a measure of local detail such as gradient magnitude. Thus, removing a seam leaves the high-energy content, which is usually the semantically useful part, largely untouched.

The image is analyzed and seams are ordered from low to high energy. The seams with the lowest energy are removed first, and further seams are removed until the desired image size is reached. The image can also be expanded by inserting extra pixels at these seam locations, with values set to the average of their neighbours. The video below sheds light on how these seams look and shows some nice applications.
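To make the seam idea concrete, here is a minimal sketch in plain Python (no libraries) of finding and removing one vertical seam. It assumes a grayscale image stored as a list of rows, a simple absolute-difference energy, and a dynamic-programming search for the cheapest connected path; the function names are my own and this is not the code used by the plugins mentioned here.

```python
def energy_map(img):
    """Energy per pixel: absolute horizontal plus vertical difference (an assumed, simple choice)."""
    h, w = len(img), len(img[0])
    energy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            energy[y][x] = abs(right - left) + abs(down - up)
    return energy


def find_vertical_seam(energy):
    """Return one column index per row describing the connected path of least total energy."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest way to reach each pixel from the top row.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backward pass: start at the cheapest bottom pixel and walk upwards.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Drop the seam pixel from every row, making the image one pixel narrower."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


# Tiny example: a bright vertical stripe on a flat background.
img = [[10, 10, 10, 200, 10] for _ in range(3)]
seam = find_vertical_seam(energy_map(img))
smaller = remove_vertical_seam(img, seam)  # the stripe survives, a flat column is dropped
```

To shrink an image by several pixels, one would repeat the find-and-remove step, recomputing the energy each time.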

A GIMP plugin can be found here, and ImageMagick incorporates it here.


Census Transform and Faces

Adding another transform to the bag of so many feature extraction methods, the Census Transform (CT) and the Modified CT (MCT) are quite an interesting way of representing images in a very local neighbourhood. The transform is highly resilient to global illumination changes and is thus quite useful for face detection under widely varying illumination. Combined with standard AdaBoost, it proves to be an effective face detection scheme. It also seems to be used in image retrieval research, and histograms of the transform have been used for further analysis.

The concept of CT is extremely simple. It is used on grayscale images: consider any pixel and its 8 neighbours. Assign the boolean value 0 or 1 to each neighbour whose value is lower or higher than the center, respectively. Scanning the neighbours in row order generates an 8-bit string for each pixel, which is the new transform value at that pixel. The central pixel itself is ignored.
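The description above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the image is a list of rows, border pixels are left at 0, and a neighbour equal to the center is given the bit 0 (the text above does not say what to do with ties).

```python
def census_transform(img):
    """Return the 8-bit census code of every interior pixel of a grayscale image."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]  # border pixels stay 0 (assumption)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            # Scan the 3x3 neighbourhood in row order, skipping the center.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    bit = 1 if img[y + dy][x + dx] > center else 0
                    code = (code << 1) | bit
            out[y][x] = code
    return out


patch = [[9, 1, 8],
         [2, 5, 7],
         [6, 3, 4]]
code = census_transform(patch)[1][1]  # bits 1,0,1,0,1,1,0,0 -> 0b10101100 == 172

# A global brightness offset does not change the code.
brighter = [[v + 50 for v in row] for row in patch]
same_code = census_transform(brighter)[1][1] == code  # True
```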

The MCT, on the other hand, makes a small change: each pixel is compared with the mean of the 3×3 block rather than with the center value. The central pixel can now take part in the comparison too, so a similar operation provides 9 bits, which are the transform value for that pixel. A different block size such as 5×5 can be used for both transforms, but 3×3 is usually the favoured one.
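A minimal plain-Python sketch of the modified transform for the 3×3 case follows. The assumptions are mine: the image is a list of rows of integers, border pixels are left at 0, and a pixel gets the bit 1 only when it is strictly above the block mean.

```python
def modified_census_transform(img):
    """Return the 9-bit MCT code of every interior pixel of a grayscale image."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]  # border pixels stay 0 (assumption)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # All nine pixels of the block in row order, center included.
            block = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            total = sum(block)
            code = 0
            for v in block:
                # "v > mean" written as "9 * v > total" to avoid floating point.
                code = (code << 1) | (1 if 9 * v > total else 0)
            out[y][x] = code
    return out


patch = [[9, 1, 8],
         [2, 5, 7],
         [6, 3, 4]]
code = modified_census_transform(patch)[1][1]  # mean is 5, bits 1,0,1,0,0,1,1,0,0 -> 332
```

Since the mean shifts together with the pixels, adding a constant to the whole patch leaves the code unchanged, which is the illumination resilience mentioned earlier.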

An example figure from the paper "Face Detection with Modified Census Transform" is shown here. It clearly shows that a vast change in global illumination or gradient doesn’t affect the local pattern much.

Face Detection with MCT

Radon Transform and Tomography

The Radon transform is widely used in a whole lot of image processing applications. It is extremely powerful at detecting lines in noisy images. So what exactly is this Radon Transform?

Take an image and take its horizontal projection (sum along each row, at 0 degrees). Now rotate the image and take another projection, and so on, taking projections for various angles. What one obtains is a matrix with one column of projection values per angle.
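The projection idea can be sketched in plain Python without rotating anything: for each angle, every pixel is dropped into the bin of the line it lies on. This is a crude nearest-bin illustration with names of my own, and the angle convention is chosen so that 0 degrees gives the row sums described above; other tools may define the angle differently, and the numbers are not meant to match Matlab's radon output.

```python
import math


def radon_sketch(img, angles_deg):
    """Return a matrix with one row per line offset and one column per angle."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0       # image center
    half = int(math.ceil(math.hypot(h, w) / 2.0))  # largest possible offset
    n_bins = 2 * half + 1
    sinogram = [[0.0] * len(angles_deg) for _ in range(n_bins)]
    for j, deg in enumerate(angles_deg):
        c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
        for y in range(h):
            for x in range(w):
                # Signed distance of the pixel's line from the image center.
                offset = (x - cx) * s + (y - cy) * c
                sinogram[int(round(offset)) + half][j] += img[y][x]
    return sinogram


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
sino = radon_sketch(img, [0, 45, 90])
at_0 = [row[0] for row in sino]   # [0, 0, 6, 15, 24, 0, 0], the row sums
at_90 = [row[2] for row in sino]  # [0, 0, 12, 15, 18, 0, 0], the column sums
```

Every column of the result adds up to the same total, since each projection just redistributes the same pixel values.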

In Matlab, this can be done using [RadonImage, k] = radon(input_image, theta), where theta holds the angles over which the projections are taken and k gives the line position (the perpendicular distance of each line from the center of the image). The result can be visualised nicely with imagesc(theta, k, RadonImage).

Now, tomography is a method of imaging by sectioning. CT scans (computed tomography scans), MRIs, and many other kinds of medical, oceanographic, and geophysical imaging are carried out in this way. For now, we shall restrict ourselves to a 2d image and its one-dimensional projections. Once these projections are obtained, a back-projection can be carried out to regenerate the 2d image. The algorithm used here is the main concern, as we want it to be efficient in both time and complexity and to require fewer projections for the reconstruction.

One of these algorithms is Filtered Back-Projection, which makes use of the Fourier Slice Theorem. This tells us that the 1d Fourier transform of a projection is equal to the 2d Fourier transform of the image evaluated along a line through the origin of the frequency plane, oriented at the angle at which the projection was taken.
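Written out, the statement looks as follows. The notation is my own choice for illustration: f is the image, p_θ is its projection at angle θ, s is the position along the projection, and ω is the frequency along the slice.

```latex
\begin{align*}
p_\theta(s) &= \iint f(x, y)\, \delta(x\cos\theta + y\sin\theta - s)\, dx\, dy
  && \text{(projection at angle } \theta\text{)} \\
P_\theta(\omega) &= \int p_\theta(s)\, e^{-i 2\pi \omega s}\, ds
  && \text{(its 1d Fourier transform)} \\
F(u, v) &= \iint f(x, y)\, e^{-i 2\pi (u x + v y)}\, dx\, dy
  && \text{(2d Fourier transform of the image)} \\
P_\theta(\omega) &= F(\omega\cos\theta,\ \omega\sin\theta)
  && \text{(the slice relation)}
\end{align*}
```

So each projection fills in one radial line of the image's 2d spectrum, and enough angles cover the whole frequency plane.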

Further, the mathematics shows that the inversion involves a change from rectangular to polar coordinates, which introduces the determinant of the Jacobian. This factor can be multiplied directly with the 1d transforms of the projections, and back-projecting the result gives what is called filtered back-projection. Each back-projection contributes to the gradual build-up of the original image, and more and more back-projections improve the reconstruction. Basically, “given the projection at a specific angle, we could reconstruct the image something like this” is what the method says.
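For reference, the step can be written down as below. The symbols are my own notation: f is the image, F its 2d Fourier transform, P_θ(ω) the 1d Fourier transform of the projection at angle θ, and q_θ the filtered projection. Substituting u = ω cos θ and v = ω sin θ in the inverse 2d transform turns du dv into |ω| dω dθ, and that |ω| is the filter.

```latex
\begin{align*}
f(x, y) &= \iint F(u, v)\, e^{i 2\pi (u x + v y)}\, du\, dv \\
        &= \int_0^{\pi} \int_{-\infty}^{\infty}
           F(\omega\cos\theta, \omega\sin\theta)\, |\omega|\,
           e^{i 2\pi \omega (x\cos\theta + y\sin\theta)}\, d\omega\, d\theta \\
        &= \int_0^{\pi} q_\theta(x\cos\theta + y\sin\theta)\, d\theta,
\qquad
q_\theta(s) = \int_{-\infty}^{\infty} P_\theta(\omega)\, |\omega|\,
              e^{i 2\pi \omega s}\, d\omega
\end{align*}
```

The second line uses the fact that F along the line at angle θ equals P_θ(ω). The last line is the back-projection: each filtered projection q_θ is smeared back across the image along its own direction, and the contributions from all angles are added up.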

As far as the user is concerned, Matlab offers a function called iradon. It can be called as Reconstructed_image = iradon(RadonImage, theta). Note that the theta used here should be the same set of angles used while computing the Radon transform.

For a better mathematical treatment, I would refer the reader to this Rice Univ. page.

Webcam Capture in Matlab

The Image Acquisition Toolbox in Matlab (Windows version) allows one to interface Matlab with a webcam. It is available from R2007a (not sure about earlier versions). Similar to the audio recording object created earlier, here we create a videoinput object. But before that is done, Matlab needs to find out which webcam devices are connected to your computer.

First, imaqhwinfo gives information about the existing adaptors for your webcam device. You can get more information on each adaptor by using imaqhwinfo('winvideo'), where winvideo is one of the adaptors. In this output (if you have a device connected) you will find the Device ID assigned to your webcam. Further information pertaining to the device can be obtained with imaqhwinfo('winvideo',1), where 1 is the Device ID you saw earlier.

This gives you much-needed information regarding the capture device: the resolution (800×600, 1024×768, 1600×1200, etc.) and the format (RGB, YUV, etc.), which need to be selected when creating a video object.

Armed with all this imaqhwinfo (image acquisition hardware information) you are ready to create your own video object.

vidobj = videoinput('winvideo',1,'RGB_1024x768');

'RGB_1024x768' was just the format that I selected. You should use one of those listed in your device info query. The most important command now is the one that starts your video object: start(vidobj). It is at this point, or during the creation of the video object, that the light (if any) on your webcam starts glowing to indicate capture.

You can obtain snapshots of the capture by using frame = getsnapshot(vidobj); or view the continuous stream of frames with preview(vidobj);.

A safe closure (unlocking of the video handles) of the video object is extremely important so that it can be started again easily. A stop(vidobj) followed by delete(vidobj) is the best way to go.

Another point to note is that an external capture device is locked by whichever software is accessing it. Thus, you will get errors like Device not ready or Device already in use if you are already viewing the capture stream in another program. So it's recommended that you cleanly stop that software first and then let Matlab take over.

There are a variety of options that are not discussed here since they were not needed, like an automatic trigger (after a defined interval). All the options can be seen with imaqhelp('videoinput').

You now have the power of both Audio and Image / Video Capture, with which amazing tricks can be played 😉

UPDATE (2013-Aug-23): Might be interesting to just start exploring using imaqtool.