Each existing frame of video, especially older video, contains a limited amount of information. You can maybe do some static image upscaling – and AI upscaling is actually pretty remarkable. I was blown away by what Stable Diffusion could do with some old comic book scans.
But more than that…there’s a whole video of video of the characters and scenes. For most of the video, that information can, given the right software and a 3d model, be incorporated back into frames to generate a higher-resolution image.
To say nothing of frame interpolation to generate higher-frame-rate video.
Like, I like Lawrence of Arabia. That movie actually has pretty good-quality footage. But…there’s still film grain. And the frame rate is only so high. But there is a whole lot of footage of Lawrence in that movie, enough information to do a pretty good job, if used effectively, of dropping film grain, generating intermediate frames, and increasing the resolution.
Like, I like Lawrence of Arabia. That movie actually has pretty good-quality footage. But…there’s still film grain. And the frame rate is only so high. But there is a whole lot of footage of Lawrence in that movie, enough information to do a pretty good job, if used effectively, of dropping film grain, generating intermediate frames, and increasing the resolution.
This is possible today, and without much effort. Most Stable Diffusion kits just come with upscalers and, as long as you pick the right ones for the job, the models act like fucking magic. Way way better than any of the “nearest neighbor” algorithms image editors provide.
Video editors already have really good tools for interpolating frames for slow motion. They are a bit fiddly in high motion situations, but work well otherwise.
You can do upscaling with AI upscalers in SD today, yeah, and it’s pretty nifty, but it’s working with a 2D model. That’s nice if you have a lot of footage of Lawrence from exactly the same angle; if you train a model on the whole video, then you can use that for upscaling individual frames.
But my point is that if you have software that’s smart enough to make use of information derived with a 3D model, then you don’t need to have that identical angle to make use of the information there.
Let’s say that you’ve got a shot of Peter O’Toole like this:
But add a 3d model to the thing, and you can use data from the close-up in the first image to scale up the second. The software can rotate the data in three dimensions, understand the relationships. If you can take time into account, you could even learn how his robe flaps in the wind or whatnot.
My point is that if all you are doing is cleaning up frames and trying to upscale footage from 24fps to 60fps, you have all of the data you need from the previous/next frames to blend those into in-between frames. A model trained on the movie would help, but there’s no need to get into anything as complex as 3D models of objects. Sub-second animation data is just fine.
I’m looking forward to superresolution in video.
Each existing frame of video, especially older video, contains a limited amount of information. You can maybe do some static image upscaling – and AI upscaling is actually pretty remarkable. I was blown away by what Stable Diffusion could do with some old comic book scans.
But more than that…there’s a whole video of video of the characters and scenes. For most of the video, that information can, given the right software and a 3d model, be incorporated back into frames to generate a higher-resolution image.
To say nothing of frame interpolation to generate higher-frame-rate video.
Like, I like Lawrence of Arabia. That movie actually has pretty good-quality footage. But…there’s still film grain. And the frame rate is only so high. But there is a whole lot of footage of Lawrence in that movie, enough information to do a pretty good job, if used effectively, of dropping film grain, generating intermediate frames, and increasing the resolution.
This is possible today, and without much effort. Most Stable Diffusion kits just come with upscalers and, as long as you pick the right ones for the job, the models act like fucking magic. Way way better than any of the “nearest neighbor” algorithms image editors provide.
Video editors already have really good tools for interpolating frames for slow motion. They are a bit fiddly in high motion situations, but work well otherwise.
You can do upscaling with AI upscalers in SD today, yeah, and it’s pretty nifty, but it’s working with a 2D model. That’s nice if you have a lot of footage of Lawrence from exactly the same angle; if you train a model on the whole video, then you can use that for upscaling individual frames.
But my point is that if you have software that’s smart enough to make use of information derived with a 3D model, then you don’t need to have that identical angle to make use of the information there.
Let’s say that you’ve got a shot of Peter O’Toole like this:
https://prod-images.tcm.com/Master-Profile-Images/lawrenceofarabia1962.4455.jpg?w=824
And another like this:
https://media.vanityfair.com/photos/52d691da6088e6966a000006/master/w_2240,c_limit/1389793754760_lawrencethumb.jpg
Those aren’t from the same angle.
But add a 3d model to the thing, and you can use data from the close-up in the first image to scale up the second. The software can rotate the data in three dimensions, understand the relationships. If you can take time into account, you could even learn how his robe flaps in the wind or whatnot.
One would need something like this.
My point is that if all you are doing is cleaning up frames and trying to upscale footage from 24fps to 60fps, you have all of the data you need from the previous/next frames to blend those into in-between frames. A model trained on the movie would help, but there’s no need to get into anything as complex as 3D models of objects. Sub-second animation data is just fine.