18-796 project end report

Shape coding for MPEG-4






For the second part of our project, we were concerned with the inter-coding of shape information. To do so, we started from the context-based shape intra-codec we built for the first part and added motion estimation/compensation as well as context-based inter-coding.
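As a rough illustration, the motion estimation for the binary alpha plane can be done with simple block matching. The following C sketch is only illustrative: the helper names (bab_sad, bab_motion_search), the exhaustive search strategy and the handling of out-of-frame pixels are assumptions and do not necessarily correspond to our actual implementation; only the 16x16 binary alpha block (BAB) size is taken from MPEG-4.

#define BAB_SIZE 16   /* MPEG-4 codes binary shape in 16x16 binary alpha blocks */

/* Number of differing pixels between the current BAB at (bx, by) and the
 * reference (previous) alpha plane displaced by (dx, dy).  For binary
 * data this "SAD" is simply a count between 0 and 256. */
int bab_sad(const unsigned char *cur, const unsigned char *ref,
            int width, int height, int bx, int by, int dx, int dy)
{
    int x, y, sad = 0;
    for (y = 0; y < BAB_SIZE; y++) {
        for (x = 0; x < BAB_SIZE; x++) {
            int rx = bx + x + dx, ry = by + y + dy;
            int c = cur[(by + y) * width + (bx + x)] ? 1 : 0;
            int r = (rx >= 0 && rx < width && ry >= 0 && ry < height)
                    ? (ref[ry * width + rx] ? 1 : 0)
                    : 0;   /* pixels outside the frame count as transparent */
            sad += (c != r);
        }
    }
    return sad;
}

/* Exhaustive search in a +/- 'range' window around the zero vector;
 * the best displacement is returned through *best_dx / *best_dy. */
int bab_motion_search(const unsigned char *cur, const unsigned char *ref,
                      int width, int height, int bx, int by, int range,
                      int *best_dx, int *best_dy)
{
    int dx, dy, best = BAB_SIZE * BAB_SIZE + 1;
    *best_dx = 0;
    *best_dy = 0;
    for (dy = -range; dy <= range; dy++) {
        for (dx = -range; dx <= range; dx++) {
            int sad = bab_sad(cur, ref, width, height, bx, by, dx, dy);
            if (sad < best) {
                best = sad;
                *best_dx = dx;
                *best_dy = dy;
            }
        }
    }
    return best;
}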
 
 

Encoder:


The decoder mirrors the functionality of the encoder.

The syntax of the encoded bitstream is as follows:






In order to (try to) improve the compression ratio or the speed of the codec, we implemented two independent control parameters:

Motion threshold    Compression time    Compression ratio
0                   78 s (100%)         46
64                  45 s (60%)          35
128                 39 s (50%)          31

 

        
[Images: reconstructed shape for Threshold=0, Threshold=128 and Threshold=256]

We measured the following values:

Alpha threshold    Compression ratio (children.seg)    Compression ratio (stefan.seg)
0                  46.4                                59.7
16                 46.3                                59.6
32                 46.1                                59.7 (!)
64                 46.0                                58.9
128                45.2                                56.2
256                29.5                                22.1
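The alpha threshold is our lossy-coding control. The following sketch illustrates one plausible use of it, namely flattening a binary alpha block to all-black or all-white whenever that changes no more pixels than the threshold allows; the function name and the exact acceptance rule are assumptions and may differ from our actual codec.

#define BAB_PIXELS (16 * 16)   /* pixels in one 16x16 binary alpha block */

/* Lossy simplification of one BAB before coding: if replacing the block
 * by an all-transparent (black) or all-opaque (white) block changes at
 * most alpha_threshold pixels, code the flat block instead. */
void lossy_simplify_bab(const unsigned char *in, unsigned char *out,
                        int alpha_threshold)
{
    int i, opaque = 0;

    for (i = 0; i < BAB_PIXELS; i++)
        opaque += (in[i] != 0);

    if (opaque <= alpha_threshold) {
        for (i = 0; i < BAB_PIXELS; i++)
            out[i] = 0;            /* flatten to all-black (transparent) */
    } else if (BAB_PIXELS - opaque <= alpha_threshold) {
        for (i = 0; i < BAB_PIXELS; i++)
            out[i] = 255;          /* flatten to all-white (opaque) */
    } else {
        for (i = 0; i < BAB_PIXELS; i++)
            out[i] = in[i];        /* keep the block, code it losslessly */
    }
}

With a threshold as high as 256 (the full block size) this particular rule would flatten every block, so the real codec presumably applies a less drastic criterion; the sketch only shows the principle behind the sudden all-black or all-white blocks discussed below.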


 

We notice two facts. First, the compression ratio is not much better than that of a succession of intra-coded frames (which would give a compression ratio of about 35). This is because we always send all the motion vectors, even when, as in the children sequence, there is only little movement. MPEG-4, however, clearly specifies NOT to send the motion vectors in such cases. We did not implement this decision because it requires a more sophisticated syntax.

But we also see that the compression ratio gets worse with lossy compression! We are not certain about the reason, but it could be that the sudden creation of all-black or all-white blocks between two frames results in a big "residue" (a suddenly black block means a big change between two frames). It is less probable that the MPEG group just wants to show us how nasty lossy coding is.
 
 

Pasting:

The use of shape coding can be demonstrated with the little "paste.c" program. It takes a shape, the corresponding video file and a background file, and then shows the background where the shape is black and the texture where the shape is white. Compositing can therefore be done without any chroma keying if we have the shape information.
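A minimal sketch of the per-pixel compositing that paste.c performs (the real program additionally handles file I/O and frame looping; the function name and signature here are illustrative):

/* Composite one frame: take the background where the shape is black
 * (transparent) and the foreground texture where it is white (opaque). */
void paste_frame(const unsigned char *shape,      /* decoded binary alpha plane */
                 const unsigned char *texture,    /* foreground video frame     */
                 const unsigned char *background, /* background image           */
                 unsigned char *out,
                 int width, int height)
{
    int i, n = width * height;
    for (i = 0; i < n; i++)
        out[i] = shape[i] ? texture[i] : background[i];
}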

And the chain coder?

For the midterm part, we implemented an intra chain coder. While we did not extend it to inter-coding, this could be done with some kind of "vector tracking", where the movement of the individual vectors would be encoded in an efficient manner. However, this implies pattern-recognition as well as syntax issues that are far beyond the scope of our project (and of MPEG-4).
 
 

Conclusion: further steps

While this project gave us good insight into the methods used by the MPEG group to achieve more flexibility and higher efficiency, the compression ratios of our implementation could probably be improved by supporting all the standard modes (which would, however, require a more complex syntax). It could also be of interest to try out other methods of lossy compression and to look for speed/efficiency trade-offs based on our control parameters. Different sequences also give different results. But despite these limitations, the project provided us with a good first step into coding technology!
 
 

Source code (.zip):

C-code
Test files
Utilities
 

References:
 

Chain coding: