If you thought ChatGPT was revolutionary, or that creating images with DALL-E and other services was dangerous, get ready for the next step: AI video generation. As usual in this sector, OpenAI was the first to fire, presenting Sora, an AI so surprising that the organization itself decided to limit its use to keep the Internet from being flooded with fake videos.
However, this future now seems inevitable. Not only have freely available AI video applications appeared, but their quality is improving rapidly: it has already reached a level at which distinguishing a fake video from a real one is very difficult. And now Meta has taken another giant step with an often-overlooked aspect: audio.
Until now, all video generators have focused on just that: creating video, which is difficult enough on its own, and adding an audio track is even harder. AI sound generators do already exist, but the real difficulty lies in synchronizing the two tracks, video and audio, to create a convincing experience. And that is exactly what Meta claims to have achieved.
Meta’s new AI is called Movie Gen, and one of its flagship features is audio generation. Instead of creating the soundtrack at the same time as the video, the process first generates the video and then applies an audio generation model with 13 billion parameters. This model analyzes the video and adds whatever the user requests through a line of text.
For example, given a video of a quad bike, we can ask for the sound of the engine revving with guitar music in the background, and the AI will understand and generate audio aligned with the events in the video. Even though the clips released by Meta aren’t perfect, they are already a big improvement over having no sound at all.
The video creation is also striking and competes directly with Sora in terms of quality. Movie Gen is optimized for text-to-image and text-to-video generation, and the results are HD videos and images that could have been taken from a movie, hence the AI’s name.
However, movies play at 24 frames per second, and Movie Gen isn’t there yet: it can create videos up to 16 seconds long at 16 frames per second. Still, Meta boasts that its model is capable of reasoning about vital aspects such as the movement of objects.
In addition to text, the AI can also create videos from images. This allows you, for example, to become the protagonist of a film: in one of the demos, a photo of a woman is enough to tell the AI that we want her to wear a pink jacket, be a DJ, and have a cheetah next to her. Perhaps more impressive is the video editing, which lets us change things such as dressing penguins in Victorian costumes or setting our training session in an Olympic stadium.
For now, Movie Gen is only available to Meta employees and “some outside contributors,” including some filmmakers, so the company is following a strategy similar to OpenAI’s, at least for the moment. In the future, however, this AI will be integrated into WhatsApp, Instagram, and the rest of Meta’s applications, just like image creation with Meta AI.