Video frame interpolation (VFI) is the process of generating intermediate frames between the existing frames of a video. It is a challenging task because it requires a significant amount of computational resources, and rendering video can be a multi-step process. The quality of the interpolated video is affected by many factors, such as the frame rate, the quality of the video encoding, and the format of the video content (e.g. HD, 4K, 8K). The overall quality of the rendered video depends on the combination of all these factors. If you want media on a website to play in the highest quality, converting it from a low frame rate to a high frame rate makes a big difference. Bidirectional intermediate flow estimation is used for video interpolation because it converts between frame rates both accurately and in real time.
This article introduces RIFE (Real-time Intermediate Flow Estimation), a novel intermediate flow estimation algorithm for real-time video frame interpolation. It discusses how frame interpolation works and the role it plays in efficient video rendering. We’ll also build a small application that interpolates a video to test out some of RIFE’s features.
How Does RIFE Interpolation Work?
Recent flow-based VFI methods often first estimate the bi-directional optical flows, then scale and reverse them to approximate intermediate flows, which leads to artifacts on motion boundaries and to complex pipelines. RIFE uses a neural network named IFNet that estimates the intermediate flows directly from the two input frames, working from coarse to fine. Because it skips the scale-and-reverse step, RIFE is among the fastest existing flow-based VFI methods. RIFE can also support arbitrary timesteps for frame interpolation, which can help it achieve a higher temporal resolution than most existing methods.
It also introduces a new inference scheme that aims for high performance while remaining adaptable to a wide variety of tasks. Although it is an open-source project, it can be useful for many different applications in both commercial and research settings. Compared to the popular Super SloMo and DAIN methods, RIFE is reported to be 4 to 27 times faster, to produce better results, and to be less resource hungry.
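To make the idea of intermediate flows more concrete, here is a minimal sketch in plain Python. It is not RIFE's implementation: it assumes the two intermediate flows (from the target frame back to each input frame) and a blending mask are already known, whereas in RIFE a network predicts them. The names `bilinear_sample`, `backward_warp`, and `blend` are illustrative, and the frames are tiny grayscale grids stored as nested lists.

```python
def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at a fractional position.

    Coordinates outside the image are clamped to the border.
    """
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def backward_warp(img, flow):
    """Build the target frame by looking up, for every target pixel,
    the source position that its flow vector (dx, dy) points to."""
    h, w = len(img), len(img[0])
    return [
        [bilinear_sample(img, x + flow[y][x][0], y + flow[y][x][1]) for x in range(w)]
        for y in range(h)
    ]


def blend(warped0, warped1, mask):
    """Mix the two warped frames pixel by pixel; mask=1 keeps warped0 only."""
    return [
        [m * a + (1 - m) * b for a, b, m in zip(row0, row1, row_m)]
        for row0, row1, row_m in zip(warped0, warped1, mask)
    ]


# A bright pixel moves from x=1 (frame 0) to x=3 (frame 1).
frame0 = [[0, 255, 0, 0, 0]]
frame1 = [[0, 0, 0, 255, 0]]

# Assumed intermediate flows for the midpoint: every target pixel looks
# one pixel to the left in frame 0 and one pixel to the right in frame 1.
flow_t_to_0 = [[(-1.0, 0.0)] * 5]
flow_t_to_1 = [[(1.0, 0.0)] * 5]
mask = [[0.5] * 5]  # equal trust in both warped frames

frame_t = blend(
    backward_warp(frame0, flow_t_to_0),
    backward_warp(frame1, flow_t_to_1),
    mask,
)
print(frame_t)
```

In this toy case the bright pixel lands at x=2 in the middle frame, halfway along its motion. Estimating the flows from the target frame directly is what lets a method skip the scale-and-reverse step described above.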
Prerequisites
Although I have already included a requirements.txt file in the repository, we also need some additional resources to run the project, and these differ from machine to machine. The issue most people run into is a missing FFmpeg installation. FFmpeg is an open-source software project for handling video, audio, and other multimedia files, and our application cannot work without it.
You can first check if you already have FFmpeg or not with:
ffmpeg -version
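If you would rather do this check from Python, for example at the start of your own script, the following is a small sketch using only the standard library. The helper names `ffmpeg_available` and `ffmpeg_version_line` are made up for this example and are not part of RIFE.

```python
import shutil
import subprocess


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


def ffmpeg_version_line(executable="ffmpeg"):
    """Return the first line printed by `ffmpeg -version`,
    or None if the executable is not installed."""
    if not ffmpeg_available(executable):
        return None
    result = subprocess.run(
        [executable, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if ffmpeg_available():
    print(ffmpeg_version_line())
else:
    print("FFmpeg was not found on PATH; install it before continuing.")
```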
You can install FFmpeg in the following ways:
Linux
FFmpeg can be installed with the following command, or you can get it from the official site here.
sudo apt install ffmpeg
Mac
Static FFmpeg binaries for Macs can be found here. It can also be installed through Homebrew with the following command:
brew install ffmpeg
Windows
You can download the .exe file directly from GitHub on Windows.
Building the Application
Since this is a very complex project with a huge codebase, we are not going to write the code here. Instead, I have uploaded it to a GitHub repository. Go ahead and download the files from there, then proceed further.
You can also clone it through the following command:
git clone git@github.com:hzwer/arXiv2020-RIFE.git
Before proceeding further, it is best to create a virtual environment and perform the operation in it:
cd arXiv2020-RIFE
virtualenv rife
source rife/bin/activate
After you have downloaded and extracted or cloned the project, open its directory and install the prerequisites for it:
cd arXiv2020-RIFE
pip3 install -r requirements.txt
Now, if all the requirements are installed properly, you can start the interpolation process! Just download a video, preferably a short one, because processing can take quite a while. I have downloaded this animation video from YouTube (it’s by Mayde), but you can pick whatever you want. Just keep it short if you don’t want this to take all night!
After downloading a video, put it in the project folder. You are then ready to start the interpolation process with the following command:
python3 inference_video.py --exp=1 --video=video.mp4 # Linux/Mac
python inference_video.py --exp=1 --video=video.mp4 # Windows
The command above doubles the video’s frame rate: a 24fps video becomes 48fps. RIFE also lets us change this factor, so we can interpolate to higher frame rates. To do that, we just have to change the value of --exp=1 to something else. For example, if we change it to 2, the interpolation changes from 2 times to 4 times, and so on.
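The arithmetic behind this flag can be sketched as follows. This only illustrates the doubling rule described above, and the function names are hypothetical. It also assumes an idealized case where the new frames are spaced evenly between two neighbouring source frames.

```python
from fractions import Fraction


def interpolation_factor(exp):
    """Each increment of --exp doubles the frame rate, so the factor is 2**exp."""
    if exp < 1:
        raise ValueError("exp must be at least 1")
    return 2 ** exp


def target_fps(source_fps, exp):
    """Frame rate of the interpolated video."""
    return source_fps * interpolation_factor(exp)


def inserted_timesteps(exp):
    """Positions (between 0 and 1) of the new frames placed between
    two neighbouring source frames, assuming even spacing."""
    factor = interpolation_factor(exp)
    return [Fraction(i, factor) for i in range(1, factor)]


print(target_fps(24, 1))  # 2x: 24fps -> 48fps
print(target_fps(24, 2))  # 4x: 24fps -> 96fps
print([str(t) for t in inserted_timesteps(2)])
```

With --exp=2, three new frames sit between each pair of original frames, at one quarter, one half, and three quarters of the way.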
We can also add --scale= at the end of the command. Its value controls the processing resolution for the optical flow model. For example:
python3 inference_video.py --exp=1 --video=video.mp4 --scale=0.5
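If you plan to process several videos, it can be convenient to assemble this command from Python. The sketch below uses only the three flags shown in this article (--exp, --video, --scale). The function `build_inference_command` is a hypothetical helper, not something that ships with RIFE.

```python
import sys


def build_inference_command(video, exp=1, scale=None, python=sys.executable):
    """Assemble the argument list for inference_video.py.

    Only the flags used in this article are supported. `scale` is left
    out of the command when it is None.
    """
    command = [python, "inference_video.py", f"--exp={exp}", f"--video={video}"]
    if scale is not None:
        command.append(f"--scale={scale}")
    return command


command = build_inference_command("video.mp4", exp=1, scale=0.5, python="python3")
print(" ".join(command))

# To actually run it from the project folder (assumes the repository is
# cloned and its requirements are installed):
#
#   import subprocess
#   subprocess.run(command, check=True, cwd="arXiv2020-RIFE")
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when a file name contains spaces.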
After running the inference_video.py command, you will see a response like this in your terminal:
Here you’ll be able to see your video’s current fps and the fps it will have once the operation is done; in my case, it turns my 24FPS video into a 48FPS one. It also shows an estimate of how long the process will take, which, as you can see, is quite long for me since I don’t have a graphics card. It shouldn’t be a problem if you have even a very basic graphics card, as just having one cuts down the render time significantly.
After the process is done, you will see a new video called video_2X_48fps.mp4 (assuming you also named your video “video”). That is the final file; you can check its properties to see whether it worked.
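One way to check the result is to read the frame rate with ffprobe, which is normally installed together with FFmpeg. The sketch below assumes ffprobe is on your PATH. The helper names `parse_frame_rate` and `probe_frame_rate` are illustrative.

```python
import subprocess
from fractions import Fraction


def parse_frame_rate(text):
    """Convert ffprobe's rational frame rate, such as '48/1' or
    '24000/1001', into an exact Fraction."""
    return Fraction(text.strip())


def probe_frame_rate(path):
    """Ask ffprobe for the frame rate of the first video stream in `path`."""
    result = subprocess.run(
        [
            "ffprobe",
            "-v", "error",
            "-select_streams", "v:0",
            "-show_entries", "stream=r_frame_rate",
            "-of", "default=noprint_wrappers=1:nokey=1",
            path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the first line in case ffprobe reports the stream more than once.
    return parse_frame_rate(result.stdout.splitlines()[0])


# Parsing works without any video file at hand:
print(float(parse_frame_rate("48/1")))

# With a real file and ffprobe installed:
#   print(float(probe_frame_rate("video_2X_48fps.mp4")))
```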
Conclusion
Video interpolation is an important technology in a number of video processing applications. Its computational cost is the main obstacle that has hindered its widespread use. The authors of the RIFE paper have developed an algorithm that can interpolate video faster than existing methods while also using fewer computational resources.
RIFE is an algorithm for efficient video rendering. It is an intermediate flow estimation algorithm that combines features from other methods that have been tested and verified to work. These features also work with a number of applications. Although RIFE does not have a great deal of documentation, it overcomes the major difficulties that arise during the frame interpolation process and is easy to integrate into your own project. More information on RIFE can be found here.