
Morphing-Based Interpolation Techniques for Motion Compensation in an MPEG Video Stream

By
Marios Savvides
Kiran Bhat

Abstract
We have implemented a real-time video morphing system that uses feature points in the images. Our method uses morphing for motion compensation instead of the standard block-based techniques. The input to our system can be a stored video sequence or live video data from a camera. The system automatically selects and tracks feature points in the video stream using the KLT feature tracking algorithm. The number of feature points to be tracked can be controlled by the user. The feature tracker works in real time at around 3 Hz (on a Pentium 300). The motion vectors of the feature points are passed on to the morphing algorithm. Our current morphing system produces one intermediate morphed image for every two successive feature-tracked images. The morphing algorithm uses the motion vector of the nearest feature point for texture mapping. Though simple to implement, this morphing algorithm is not computationally efficient. The overall system works at 1 Hz, which could be improved to around 3 Hz (the feature tracker frame rate) by using a standard triangulation method combined with scanline algorithms for morphing. This system has numerous applications, including video conferencing and Internet video.

KLT Feature Tracking Method
Feature tracking is a widely researched topic in the Computer Vision community. The most commonly used methods for feature tracking employ image correlation or sum of squared difference (SSD) techniques. With small inter-frame displacements, a window can be tracked by optimising some matching criterion with respect to translation and linear image deformation.
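To make the SSD matching criterion concrete, the following is a minimal sketch of a translation-only window search. It is not the KLT tracker itself (which minimizes the criterion with a gradient-based method rather than exhaustively); the function names, the window half-size and the search range are illustrative assumptions, and images are assumed to be lists of rows of grey values.

```python
def ssd(img_a, img_b, ax, ay, bx, by, half):
    """Sum of squared differences between two (2*half+1)-square windows,
    centred at (ax, ay) in img_a and at (bx, by) in img_b."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = img_a[ay + dy][ax + dx] - img_b[by + dy][bx + dx]
            total += diff * diff
    return total


def best_translation(prev, curr, x, y, half=1, search=2):
    """Try every integer shift within +/-search pixels and return the
    (dx, dy) that minimizes the SSD for the window centred at (x, y) in prev.
    Assumes the window around (x, y) lies fully inside prev."""
    h, w = len(curr), len(curr[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            # Skip shifts whose window would leave the current image.
            if not (half <= nx < w - half and half <= ny < h - half):
                continue
            score = ssd(prev, curr, x, y, nx, ny, half)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]
```

The exhaustive search is only practical for small displacements, which matches the small inter-frame motion assumed above.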
Our current system tracks features in the image using the method proposed by Tomasi et al. (the KLT system). The KLT (Kanade-Lucas-Tomasi) feature tracking system identifies and tracks features by monitoring a measure of feature dissimilarity, which quantifies the change in appearance of a feature between the previous and current frames.
Features with good texture (rather than those chosen by the traditional "interest" or "cornerness" measures) are selected and tracked using an affine motion model, which can account for translation and linear warping. The affine model is computed numerically using a Newton-Raphson minimization technique. A pure translation model gives more reliable results than the affine model when the inter-frame motion is small, but affine changes are necessary to compare frames separated by large motions when determining dissimilarity. We tested the tracker on a few face sequences and observed that the identified features were tracked satisfactorily. The feature tracker works in real time at around 3 Hz.
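As an illustration of what "good texture" means in the Shi and Tomasi selection criterion, the sketch below scores a window by the smaller eigenvalue of its 2x2 gradient matrix; a window is a good feature when this value exceeds a threshold. This is a simplified sketch, not the KLT library code: the function name, the central-difference gradients and the window half-size are assumptions made for the example.

```python
import math


def min_eigenvalue(img, x, y, half=1):
    """Smaller eigenvalue of the gradient matrix [[gxx, gxy], [gxy, gyy]]
    accumulated over the (2*half+1)-square window centred at (x, y).
    Assumes the window plus a one-pixel border lies inside the image."""
    gxx = gxy = gyy = 0.0
    for j in range(y - half, y + half + 1):
        for i in range(x - half, x + half + 1):
            # Central-difference estimates of the image gradient.
            gx = (img[j][i + 1] - img[j][i - 1]) / 2.0
            gy = (img[j + 1][i] - img[j - 1][i]) / 2.0
            gxx += gx * gx
            gxy += gx * gy
            gyy += gy * gy
    # Closed-form smaller eigenvalue of a symmetric 2x2 matrix.
    return 0.5 * (gxx + gyy - math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy * gxy))
```

With this score, a flat region and a straight edge both yield a value of zero, because the gradient varies in at most one direction, whereas a corner-like or textured window yields a positive value.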

The Morphing Algorithm
We have implemented a very simple algorithm for morphing the intermediate frame from the current and previous images of the video stream. We look at all the tracked feature points in the current image and find their corresponding positions in the previous image. From this information, we compute the motion vectors of all the feature points in the current image. Then, for each pixel (ix, iy) in the current image, we find its closest feature point and assign that feature's motion vector (mv_x, mv_y) to the pixel. We then move to the pixel location (ix - mv_x/2, iy - mv_y/2) in the intermediate image and set its value to 0.5*(val[ix, iy] + val[ix - mv_x, iy - mv_y]). Note that (ix - mv_x, iy - mv_y) is the position of the corresponding pixel in the previous image, so the second term is read from the previous frame.
We note here that even though this algorithm is simple and produces the desired morphing effect, it is slow because it processes the image pixel by pixel. We employ a few tricks for background pixels (such as skipping every 24 pixels if the distance to the nearest feature point is greater than a particular threshold) to improve the morphing speed.
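The per-pixel procedure described above can be sketched as follows. This is a minimal illustration rather than our actual implementation: the function name, the representation of features as pairs of integer positions, the use of None to mark unwritten pixels, and the floor division used to halve the motion vector are all assumptions, and the background-skipping trick is omitted.

```python
def morph_midframe(prev, curr, features):
    """Build the intermediate frame between prev and curr.

    features is a list of ((cx, cy), (px, py)) pairs giving each tracked
    point's position in the current and previous frames (hypothetical format).
    Pixels that are never written stay None, i.e. they are gaps."""
    h, w = len(curr), len(curr[0])
    mid = [[None] * w for _ in range(h)]
    for iy in range(h):
        for ix in range(w):
            # Brute-force search for the nearest feature in the current frame.
            (cx, cy), (px, py) = min(
                features,
                key=lambda f: (f[0][0] - ix) ** 2 + (f[0][1] - iy) ** 2,
            )
            mv_x, mv_y = cx - px, cy - py
            # Corresponding pixel in the previous frame.
            sx, sy = ix - mv_x, iy - mv_y
            # Target pixel in the intermediate frame (half the motion).
            tx, ty = ix - mv_x // 2, iy - mv_y // 2
            if 0 <= sx < w and 0 <= sy < h and 0 <= tx < w and 0 <= ty < h:
                mid[ty][tx] = 0.5 * (curr[iy][ix] + prev[sy][sx])
    return mid
```

Because several source pixels can map to the same target and some targets are never reached, the result contains gaps that a later pass must fill. The nearest-feature search inside the double loop is what makes this approach slow as the number of features grows.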

System Description
Our system comprises a Pentium 300 with an MRT framegrabber card and a Sony camcorder.
We have modified the KLT routines (from the Stanford Robotics Lab) and implemented the morphing algorithm to produce a visual testing framework.

    GUI
        - Choice of loading from the camera or from 5 stored video streams (faces, people walking, etc.).
        - Image display of the base frame (the image at which the feature tracker replaces lost features).
        - Image display of the current frame (for comparison between the two).
        - Image display of the morphed image (final).
        - Image display of the intermediate morphed image (before filling the gaps).
        - Image display of the averaged image (obtained by averaging the current and previous frames).
          (This is displayed for comparison with the final morphed image.)
        - Automatic frame-by-frame feature tracking (500 ms timed update interval).
        - Manual step-through of the feature tracking process.
        - Ability to specify the number of features to be found and tracked through a sequence.
        - Ability to specify the number of frames to elapse before replacing lost features.
        - Display of the number of features successfully tracked in the current frame (relative to the features in the base frame).
        - User-friendly, program-safe interface: depending on the user's action, the GUI disables unavailable controls to prevent the user
          from performing an illegal action, such as pressing the Load Stream 2 button while viewing video stream 1 in auto mode. In this way, the GUI shows
          the user which options are valid at each moment during the execution of the program.

Results and Future Work
The following snapshot illustrates the performance of our feature-based morphing system. The top left image shows the current frame of the video stream; the top right image is offset from the current image by 10 frames (the frame at which the features were last replaced). The center image shows the final morphed image, the bottom right shows the intermediate morphed image, and the bottom left shows the averaged image. From these images it is clear that the feature points are tracked over time and that the quality of the morphed image is comparable to that of the input image (for small motions). We also note that the quality of the morphed image is clearly superior to that of the averaged image.

We found that the feature tracker works at around 3-4 Hz, and the overall system works at around 1 Hz. We can improve the speed of the feature tracker by selecting feature points only on the foreground object. The morphing algorithm can be improved by computing the Voronoi regions of the feature points before texture mapping, rather than searching for the nearest feature point at each pixel. This system has numerous commercial applications and is also a good testbed for various advanced computer vision and image processing algorithms.
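One possible way to obtain the nearest-feature regions ahead of time is sketched below. It grows all feature points outward simultaneously with a breadth-first search, so every pixel is labelled in a single pass regardless of the number of features. Note the assumptions: the function name is hypothetical, and the 4-connected growth yields Voronoi regions under the city-block distance, which only approximates the Euclidean nearest-feature rule used in our morphing algorithm.

```python
from collections import deque


def voronoi_labels(width, height, seeds):
    """Label each pixel with the index of its nearest seed (city-block
    distance) using a multi-source breadth-first search.
    seeds is a list of (x, y) integer positions inside the image."""
    labels = [[-1] * width for _ in range(height)]
    queue = deque()
    for idx, (sx, sy) in enumerate(seeds):
        if labels[sy][sx] == -1:  # ignore duplicate seed positions
            labels[sy][sx] = idx
            queue.append((sx, sy))
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and labels[ny][nx] == -1:
                # The first wavefront to arrive claims the pixel.
                labels[ny][nx] = labels[y][x]
                queue.append((nx, ny))
    return labels
```

With such a label map, the motion vector for pixel (ix, iy) becomes a table lookup on labels[iy][ix] instead of a search over all feature points.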

References
1) J. Shi and C. Tomasi, "Good Features to Track", IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 94), Seattle, June 1994.
2) CMU Computer Vision home page.
3) T. Beier and S. Neely, "Feature-Based Image Metamorphosis", Computer Graphics (SIGGRAPH), 26, 2, July 1992.
4) G. Wolberg, "Digital Image Warping", IEEE Computer Society Press Monograph, 1988.
5) J. Prosise, "Programming Windows 95 with MFC", Microsoft Press, 1996.
6) C. Petzold, "Programming Windows, Fifth Edition", Microsoft Press, 1999.

Link to our code