Talking about AI video generation, and where we go from here
Over the weekend I took a look at Veo 3, a new video generation model from Google.
My first thought when seeing it in action was: damn, we are fucked as a society.
The ease of video creation, the quality of videos, and the audio generation are some of the most mind-blowing things I have ever seen.
To think that just about two years ago we had Will Smith eating spaghetti, which made it seem like usable video generation was years away.
Then a year ago OpenAI's Sora launched and blew our minds away. The pace of progress makes me think that in the next two years we could have creators making full sitcoms and movies entirely with AI.
What separates Veo 3 from all other AI video generation models (Sora, Stable Diffusion, other Veo models) is the addition of sound. This is what takes hobby creation into full-fledged, production-grade quality. I experimented with the model and created some amazing videos with surround-sound quality. It's one thing to generate audio in stereo, but adding multi-channel support on the first attempt makes me both really excited and nervous.
Veo 3 allows users to create videos in a couple of ways (a rough sketch of the text-to-video path follows the list):
Text to Video
Frames to Video
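To make the text-to-video path concrete, here is a minimal sketch using the google-genai Python SDK. The model ID, prompt, and config values are placeholders I'm assuming for illustration, and the exact parameter names may differ from the current SDK docs; video generation runs as a long-running operation that you poll until it finishes.

```python
import time

from google import genai
from google.genai import types

# Assumes the google-genai package is installed and an API key is set
# in the environment (e.g. GOOGLE_API_KEY).
client = genai.Client()

# Kick off a text-to-video request. The model ID below is a placeholder;
# check the current docs for the exact Veo 3 model name.
operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",  # assumed model ID
    prompt="A street musician playing saxophone in the rain, cinematic lighting",
    config=types.GenerateVideosConfig(
        aspect_ratio="16:9",
        number_of_videos=1,
    ),
)

# Generation is asynchronous: poll the operation until it completes.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download and save the generated clip(s).
for n, generated in enumerate(operation.response.generated_videos):
    client.files.download(file=generated.video)
    generated.video.save(f"veo_sample_{n}.mp4")
```

The frames-to-video path works the same way, except you also pass a starting image alongside the prompt so the model animates from that frame.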
We have come a long way from where we were just a couple of years ago, and I can only imagine the progress we will see over the next few months and years.