
Frame rate conversion with motion compensation

Here at Panda, we are constantly impressed with the requests that our customers have for us, and how they want to push our technology to new areas. We’ve been experimenting with more techniques over the past year, and we’ve officially pushed one of our most exciting ones to production.

Introducing frame rate conversion by motion compensation. This has been live in production for some time now, and is being used by select customers. We wanted to hold off until we saw consistent success before we officially announced it :) We’ll try to explain the very basics to help you build an intuition of how it works – however, if you have any questions about it, or about how to leverage it for your business needs, give us a shout at support@pandastream.com.

Motion compensation is a technique that was originally used for video compression, and now it’s used in virtually every video codec. Its inventors noticed that adjacent frames usually don’t differ too much (except for scene changes), and then used that fact to develop a better encoding scheme than compressing each frame separately. In short, motion-compensation-powered compression tries to detect movement that happens between frames and then use that information for more efficient encoding. Imagine two frames:

Panda on the left…
…aaand on the right.

Now, a motion compensating algorithm would detect the fact that it’s the same panda in both frames, just in different locations:

First stage of motion compensation: motion detection.

We’re still thinking about compression, so why would we want to store the same panda twice? Yep, that’s what motion-compensation-powered compression does – it stores the moving panda just once (usually, it would store the whole frame #1), but it adds information about movement. Then the decompressor uses this information to reconstruct the rest (frame #2 based on frame #1).

That’s the general idea, but in practice it’s not as smooth and easy as in the example. The objects are rarely identical from frame to frame, and usually some distortions and non-linear transformations creep in. Scanning for movements is very expensive computationally, so we have to limit the search space (and optimize the hell out of the code, even resorting to hand-written assembly).
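To make the "limited search space" idea a bit more concrete, here is a toy sketch of block matching in Python. It is not Panda's encoder code, just an illustration under simple assumptions: frames are grayscale lists of rows, the "panda" is a single bright block, and similarity is measured with a sum of absolute differences (SAD). The names `find_motion_vector`, `frame_with_block` and the `search` parameter are made up for this example.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(prev, curr, x, y, size=2, search=3):
    """Find where the block at (x, y) in `prev` ended up in `curr`.

    Only offsets within +/- `search` pixels are tried; that window is
    the limited search space. Returns the best offset as (dx, dy).
    """
    height, width = len(curr), len(curr[0])
    best_cost, best_vec = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            # Skip candidate blocks that fall outside the frame.
            if nx < 0 or ny < 0 or nx + size > width or ny + size > height:
                continue
            cost = sad(prev, curr, x, y, nx, ny, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (dx, dy)
    return best_vec


def frame_with_block(width, height, x, y, size=2):
    """Hypothetical helper: a black frame with one bright square 'panda'."""
    frame = [[0] * width for _ in range(height)]
    for dy in range(size):
        for dx in range(size):
            frame[y + dy][x + dx] = 255
    return frame


frame1 = frame_with_block(8, 6, 1, 2)  # panda on the left
frame2 = frame_with_block(8, 6, 4, 2)  # panda on the right
print(find_motion_vector(frame1, frame2, 1, 2))  # prints (3, 0)
```

Even in this tiny example the cost grows with the square of `search`, which hints at why real encoders work so hard to keep the window small.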

Okay, but compression is not the topic of this post. Frame rate conversion is, and motion compensation can be used for this task too, often with really impressive results.

For illustration, let’s go back to the moving panda example. Let’s assume we display 2 frames per second (not impressive), but we would like to display 3 frames per second (so impressive!), and the video shouldn’t play any faster when we’re done converting.

One option is to cheat a little bit and just duplicate a frame here and there, getting 3 FPS as a result. In theory we could accomplish our goal that way, but the quality would suck. Here’s how it would work:

Converting from 2 FPS to 3 FPS by duplicating frames.
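The duplication approach boils down to simple index arithmetic. Here's a minimal sketch, assuming frames are just items in a list and both frame rates are whole numbers; the function name `duplicate_frames` is ours, picked for illustration.

```python
def duplicate_frames(frames, src_fps, dst_fps):
    """Convert frame rate by repeating (or dropping) whole frames.

    The duration stays the same, so the output needs
    len(frames) * dst_fps / src_fps frames. Each output frame simply
    reuses the input frame that was on screen at that moment.
    """
    out_count = len(frames) * dst_fps // src_fps
    return [frames[i * src_fps // dst_fps] for i in range(out_count)]


print(duplicate_frames(["left", "right"], 2, 3))
# prints ['left', 'left', 'right']
```

No new picture is ever created here: the third frame is just a copy, which is exactly why the result looks jerky.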

Yes, the output has 3 frames and the input had 2, but the effect isn’t visually appealing. We need a bit of magic to create a frame that humans would see as naturally fitting between the two initial frames – the panda has to be in the middle. That is a task motion compensation can deal with – detect the motion, but instead of using it for compression, create a new frame based on the gathered information. Here’s how it should work:

Converting from 2 FPS to 3 FPS by motion compensation: panda is in the middle!
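As a rough sketch of the "panda in the middle" step, the snippet below builds an in-between frame by moving one block halfway along its motion vector. It assumes the motion vector is already known, the movement is linear, and the background behind the panda is plain black. None of that holds for real footage, and the function name `interpolate_frame` is only illustrative.

```python
def interpolate_frame(prev, x, y, size, vector, t=0.5, background=0):
    """Build an in-between frame from `prev` and one block's motion vector.

    The size x size block at (x, y) is moved a fraction `t` of the way
    along `vector` (dx, dy). The spot it leaves is filled with
    `background`, which is a simplifying assumption.
    """
    dx, dy = vector
    mx, my = x + round(dx * t), y + round(dy * t)
    middle = [row[:] for row in prev]  # start from a copy of frame #1
    for row in range(size):
        for col in range(size):
            middle[y + row][x + col] = background
    for row in range(size):
        for col in range(size):
            middle[my + row][mx + col] = prev[y + row][x + col]
    return middle


# An 8x4 black frame with a 2x2 "panda" in the top-left area.
frame1 = [[0] * 8 for _ in range(4)]
for row in (1, 2):
    for col in (0, 1):
        frame1[row][col] = 255

# The panda moves 4 pixels to the right between frame #1 and frame #2.
for row in interpolate_frame(frame1, 0, 1, 2, (4, 0)):
    print(row)
```

In the printed frame the bright block sits at columns 2 and 3, halfway between its positions in the two original frames.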

 

These are the basics of the basics of the theory. Now an example, taken straight from a Panda encoder. Let’s begin with an example of what frame duplication (the bad guy) looks like (for better illustration, after converting FPS we slowed down the video, and got slow motion as a result):

 

See that jitter on the right? Yuck. Now, what happens if we use motion compensation (the good guy) instead:

 

It looks a lot better to me: the movement is smooth and there are almost no video artifacts visible (maybe just slight noise). But, of course, other types of footage are able to fool the algorithm more easily. Motion compensation assumes simple, linear movement, so other kinds of image transformations often produce heavier artifacts (they might be acceptable, though – it all depends on the use case). Occlusions, refractions (water bubbles!) and very quick movement (which means that too much happens between frames) are the most common examples. Anyway, it’s not as terrible as it sounds, and still better than frame duplication. For illustration, let’s use a video full of occlusions and water:

 

Okay, now, let’s slow it down four times with both frame duplication and motion compensation, displayed side-by-side. Motion compensation now produces clear artifacts (see those fake electric discharges?), but still looks better than frame duplication:

 

And that’s it. The artifacts are visible, but the unanimous verdict of a short survey in our office is: the effect is a lot more pleasant for motion compensation than frame duplication. The feature is not publicly available yet, but we’re enabling it for our customers on demand. Please remember that it’s hard to guess how your videos would look when treated with our FPS converter, but if you’d like to give it a chance and experiment a bit, just drop us an email at support@pandastream.com.

SD television formats in Panda

The cutting edge of video transmission is moving quickly: HD television has been mainstream for some time and 4K is gaining traction; H.264 is ubiquitous, and HEVC is entering the stage. Yet most people still remember VHS. It’s good to be up with the latest tech, but unfortunately the world is lagging behind most of the time.

Television is a different universe than Internet transmission. The rules are made by big (usually government) bodies and rarely change. Although most countries have switched to digital transmission, standard definition isn’t gone yet – SD channels are still very popular, which forces content providers to support SD formats too.

Recently, we’ve helped a few clients craft transcoding pipelines that support all these retiring-yet-still-popular formats. We’ve noticed that it’s a huge nuisance for content makers to invest in learning old technology and that they would love to hand that duty off to someone else; so we made sure that Panda (both the platform and the team) can deal with these flawlessly.

There’s a huge variability among requirements pertaining to SD: for example, you have to decide how the image should be fitted into the screen. High-quality downsampling is always used, but you have to decide what to do when the dimensions are off: should you use letterboxing, or maybe stretch the image?

Fiordland National Park, New Zealand (Nathan Kaso)
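To show what the letterbox-or-stretch decision means in numbers, here is a small sketch. It assumes square pixels (which, as we'll see, SD formats often don't have) and a hypothetical 640x480 target, and the function names are made up for this example.

```python
from fractions import Fraction


def fit_letterbox(src_w, src_h, dst_w, dst_h):
    """Scale the image to fit inside the target, keeping its proportions.

    Returns (scaled_w, scaled_h, pad_x, pad_y), where the pads are the
    widths of the black bars on each side (left/right, top/bottom).
    """
    scale = min(Fraction(dst_w, src_w), Fraction(dst_h, src_h))
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    return scaled_w, scaled_h, (dst_w - scaled_w) // 2, (dst_h - scaled_h) // 2


def fit_stretch(src_w, src_h, dst_w, dst_h):
    """Fill the whole target and accept that the image gets distorted."""
    return dst_w, dst_h, 0, 0


print(fit_letterbox(1920, 1080, 640, 480))  # prints (640, 360, 0, 60)
print(fit_stretch(1920, 1080, 640, 480))    # prints (640, 480, 0, 0)
```

Letterboxing keeps the picture undistorted at the cost of 60-pixel black bars at the top and bottom; stretching uses every pixel but makes everything look taller than it should.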

Another decision (which usually is not up to you) is what exact format should be used. This almost always depends on the country the video is for. Although the terms NTSC, PAL and SECAM come from the analog era (digital TV uses standards like ATSC and DVB-T), they are still used to describe parameters of encoding in digital transmission (e.g. image dimensions, display aspect ratio and pixel aspect ratio). Another thing the country affects is the compression format; the most popular are MPEG-2 and H.264, though they are not the only ones.
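The relationship between image dimensions, display aspect ratio and pixel aspect ratio can be sketched in a few lines. The example below uses the commonly quoted SD frame sizes (720x576 for PAL, 720x480 for NTSC) and computes the pixel aspect ratio simply as display aspect ratio divided by the width-to-height ratio of the stored image. Treat it as a simplification: actual broadcast specifications often use slightly different values, so check the spec you are delivering to.

```python
from fractions import Fraction


def pixel_aspect_ratio(width, height, display_aspect):
    """How wide each pixel must be (relative to its height) so that a
    width x height image fills a screen with the given aspect ratio."""
    return display_aspect / Fraction(width, height)


print(pixel_aspect_ratio(720, 576, Fraction(4, 3)))   # prints 16/15
print(pixel_aspect_ratio(720, 480, Fraction(4, 3)))   # prints 8/9
print(pixel_aspect_ratio(720, 576, Fraction(16, 9)))  # prints 64/45
```

A value other than 1 means the pixels are not square, which is why an SD image can look squashed when a player ignores this parameter.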

Standard television formats also have specific requirements on frame rate. It’s a bit different than with Internet transmission, where the video is effectively a stream of images. In SD TV, transmission is interlaced, and instead of frames it uses fields (which contain only half the information that frames do, but save bandwidth).
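If fields are new to you, this toy sketch may help. It assumes a frame is a list of rows with an even number of rows, and that one field holds the even-numbered rows while the other holds the odd-numbered ones. The function names are ours, and the order in which fields are transmitted depends on the format.

```python
def split_fields(frame):
    """Split a frame into two fields, each holding every other row."""
    top = frame[0::2]     # rows 0, 2, 4, ...
    bottom = frame[1::2]  # rows 1, 3, 5, ...
    return top, bottom


def weave(top, bottom):
    """Interleave two fields back into a full frame."""
    frame = []
    for top_row, bottom_row in zip(top, bottom):
        frame.append(top_row)
        frame.append(bottom_row)
    return frame


frame = [["row0"], ["row1"], ["row2"], ["row3"]]
top, bottom = split_fields(frame)
print(top)     # prints [['row0'], ['row2']]
print(bottom)  # prints [['row1'], ['row3']]
print(weave(top, bottom) == frame)  # prints True
```

Weaving only restores the original frame when both fields come from the same moment in time. In real interlaced video the two fields are captured at different instants, which is part of what makes rate conversion for SD trickier.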

Frame rate is therefore not a very accurate term here, but the problem is still the same – we have an exact number of frames/fields to display per unit of time, and the input video might not necessarily match that number. In such a case the most popular solution is to drop and duplicate frames/fields as needed, but the quality of videos produced this way is not great.

There is a better solution, but it’s so complicated that we’ll just mention it here: motion compensation. It’s a technique originally used for video compression, but it also gives great results in frame rate conversions. It’s not only useful for SD conversions (we use it for other things at Panda too), but it helps here as well.

Well, it’s definitely not the end of the story. These are the basics, but the number of details that have to be considered is unfortunately much bigger. Anyway, if you ever happen to have to support SD television, we’re here to help! Supporting SD can be as easy as creating a profile in Panda:

Adding SD profile in Panda