Motivation
When transmitting live video, we must constantly contend with limited network bandwidth. Ideally, we would spend more bandwidth on the more important parts of the scene, such as a human character, and save bandwidth on less important parts like the static environment. Unfortunately, most of today's video transmission applications are based on JPEG, MPEG-1, or MPEG-2. The data being transmitted is represented as a two-dimensional array of pixels; the codec knows nothing about the relative priority of the data, so it cannot allocate bits more efficiently even when such priority information is available.
It is well known that stereo vision techniques can provide depth information for a scene, from which it is relatively easy to segment a human character in the foreground from the surrounding environment.
Advances in stereo hardware make it possible to generate live depth information for a typical desktop scene. Our goal is to employ such a stereo camera to explore an object-oriented, network-adaptive transmission scheme.
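The segmentation step can be sketched very simply: given a per-pixel depth map from the stereo camera, pixels closer than some threshold are labeled foreground. This is a minimal illustration only; the threshold value, the NumPy representation, and the function name are assumptions, not part of any stereo camera's actual API.

```python
import numpy as np

def segment_foreground(depth, max_depth=1.5):
    """Label pixels closer than max_depth (meters) as foreground.

    depth: 2-D array of per-pixel depths, as produced by a stereo
    camera (hypothetical representation for illustration).
    Returns a boolean mask, True where the pixel is foreground.
    """
    return depth < max_depth

# Synthetic 4x4 depth map: a "person" at ~1 m against a wall at ~3 m.
depth = np.full((4, 4), 3.0)
depth[1:3, 1:3] = 1.0

mask = segment_foreground(depth)
# The 2x2 near-camera region is marked as foreground; the transmitter
# could then encode that region at higher quality than the background.
```

In a real system the threshold would be chosen adaptively (or replaced by a more robust depth-plus-color segmentation), and the resulting mask would drive the per-region bit allocation in the encoder.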