Increase video resolution with an open-source machine-learning algorithm that upscales video frames, driven by an automated command-line script.
Bringing machine learning algorithms a step closer to usability.
Given a low-resolution video file, this script uses a machine-learning algorithm to increase (upscale) each frame’s resolution using information from neighbouring frames. The workhorse is the Recurrent Back-Projection Network (RBPN) algorithm from Haris, Shakhnarovich, & Ukita (2019) – all credit goes to the authors.
To upscale a full video file:
upscale input.vid
To upscale part of a video file, specify a time span, a frame range, or a mixture of both. Use “-” to indicate start and end points, or “+” to indicate start and span. For example, assuming a 10 fps video, the following are all equivalent:
upscale input.vid --scene=0:01:00-0:01:10
upscale input.vid --scene=0:01:00+0:10
upscale input.vid --scene=600-700
upscale input.vid --scene=600+100
upscale input.vid --scene=1:00.00+100
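The equivalence of these five commands can be checked with a small sketch. This is not the script's actual parser: `to_frames` and `parse_scene` are hypothetical helpers, and the rule that a value containing a colon is a time while a bare integer is a frame number is an assumption inferred from the examples. Whether the end frame is inclusive is not stated here, so the sketch only computes the two boundary numbers.

```python
def to_frames(token, fps):
    """Convert a time ([[H:]M:]S[.fraction]) or a plain frame number to frames."""
    if ":" not in token:
        return int(token)  # assumption: no colon means a frame number
    seconds = 0.0
    for part in token.split(":"):
        seconds = seconds * 60 + float(part)  # fold H:M:S into seconds
    return round(seconds * fps)


def parse_scene(spec, fps):
    """Return (start_frame, end_frame) for 'START-END' or 'START+SPAN'."""
    for sep in "-+":
        if sep in spec:
            start, rest = spec.split(sep, 1)
            first = to_frames(start, fps)
            second = to_frames(rest, fps)
            # '-' gives an absolute end point, '+' gives a span from the start
            return (first, second) if sep == "-" else (first, first + second)
    raise ValueError("scene must contain '-' or '+': " + spec)


if __name__ == "__main__":
    for spec in ("0:01:00-0:01:10", "0:01:00+0:10", "600-700",
                 "600+100", "1:00.00+100"):
        print(spec, parse_scene(spec, fps=10))
```

At 10 fps, one minute is frame 600 and ten seconds is 100 frames, which is why every form resolves to the same pair of frame numbers.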
The default zoom level is x4, but x2 and x8 are also available. Also, the default number of neighbouring frames to use is 6 (3 previous frames and 3 following), but smaller numbers can be emulated (by copying the target frame into unused neighbour frames):
upscale input.vid --zoom=2 --frames=2 | egrep -v '!!!$'
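To illustrate what "copying the target frame into unused neighbour frames" could look like, here is a minimal sketch. It assumes the model always receives 6 neighbour slots and that a smaller `--frames` value keeps only the nearest neighbours on each side. The function name and the handling of frames at the start or end of the video are assumptions, not taken from the script.

```python
MODEL_NEIGHBOURS = 6  # default window: 3 previous and 3 following frames


def neighbour_indices(target, total_frames, frames=MODEL_NEIGHBOURS):
    """Return 6 frame indices to feed alongside frame `target`.

    Slots beyond the requested `frames`, or outside the video,
    are filled with the target frame itself.
    """
    half_model = MODEL_NEIGHBOURS // 2
    half_used = frames // 2
    offsets = list(range(-half_model, 0)) + list(range(1, half_model + 1))
    indices = []
    for offset in offsets:
        idx = target + offset
        in_use = abs(offset) <= half_used      # slot requested by --frames
        in_video = 0 <= idx < total_frames     # slot exists in the video
        indices.append(idx if in_use and in_video else target)
    return indices


if __name__ == "__main__":
    print(neighbour_indices(10, 100, frames=6))
    print(neighbour_indices(10, 100, frames=2))
```

With `frames=2`, only the immediately previous and following frames are real neighbours; the other four slots repeat the target frame.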
If the upscale process is interrupted, it is automatically resumed by re-running the same command. To disable resuming (and clobber any existing output frames), use:
upscale input.vid --no-resume
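One simple way to implement this kind of resume is to skip every input frame that already has an output frame. The sketch below assumes output frames share their file names with the input `.png` frames. That naming, and the helper `frames_to_process`, are hypothetical rather than a description of the script's internals.

```python
from pathlib import Path


def frames_to_process(input_dir, output_dir, resume=True):
    """List input frames that still need upscaling.

    With resume=False every frame is returned, so existing
    output frames would be overwritten.
    """
    inputs = sorted(Path(input_dir).glob("*.png"))
    if not resume:
        return inputs
    out = Path(output_dir)
    # Keep only frames with no same-named file in the output directory
    return [frame for frame in inputs if not (out / frame.name).exists()]
```

A check like this treats any existing output file as finished, so a frame that was only partly written when the process was interrupted would need to be deleted by hand before resuming.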
Frames are extracted from video files using ffmpeg, so any video file format supported by ffmpeg will work. However, if you would like to use a different program (e.g., VLC), or prefer to handle the extraction manually, just place the frames as a sequence of .png files in a directory:
The output directory name is auto-generated as .input.out/, or can be specified manually:
upscale input.vid output-frames/
upscale input-frames/ output-frames/
The input directory name is also auto-generated – as .input.inp/. It is also possible to do all processing in a single directory. In this case, output frames clobber input frames, so the process cannot be resumed if interrupted:
upscale input.vid .input.vid.inp/
upscale input-frames/ input-frames/
This script runs on Linux, and uses Python, PyTorch, and PyFlow, and optionally ffmpeg for extracting frames. Successful compilation of PyFlow also requires C++ and Python development libraries (they can safely be removed after installation). To ensure that all dependencies exist, use the setup script included in the project.
To download and install the project from GitHub:
wget -O - https://github.com/arnon-weinberg/Upscale-video-RBPN/archive/master.tar.gz | tar xz
The setup script does not require root, and all changes are local, so uninstalling the project is a simple matter of removing its directory.
I built this script to test out the RBPN algorithm on practical examples. Now you can use it too, on your own videos!
Frame generation is computationally intensive. On my old laptop, it can take 5 minutes per frame to process a small 256×138 video (sample image below), and over 1 hour per frame at 720×480. Processing time benefits greatly from a GPU, but if you don’t want to take advantage of it, use of the GPU(s) can be turned off with the option
The zoom level and number of frames offered by the original algorithm do not seem to make much difference in my testing – neither in processing time nor in the quality of results. Some videos are greatly enhanced by super-resolution, while others are unaffected – presumably resolution is not the only factor in video quality, so this is not a universal solution for upgrading any low-quality video.
Bottom line: While this machine-learning algorithm may not be ready for prime time, this is still a fun project to play with, and perhaps you will find it useful for enhancing your own videos.
Command used:
upscale 'Lilies - S1E1.mp4' --scene=1:43+10