
Photo by Levart_Photographer on Unsplash

OpenAI Will Now Allow You To Generate Videos Based on Verbal Prompts

February 19, 2024

Over the years, generative AI has made it increasingly easy to create videos without in-depth knowledge of complex editing software. Now OpenAI has taken it a step further with its new AI model, Sora, which the company claims can create “realistic” and “imaginative” 60-second videos from short text prompts, according to CNN. OpenAI said it plans to train AI models that can “help people solve problems that require real-world interaction.”

In a blog post on Wednesday, the artificial intelligence leader said Sora can generate videos up to one minute long from text prompts and can produce complex scenes with multiple characters, specific types of movement, and accurate background details. The blog post said, “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”

This latest innovation marks the newest initiative from the creators of the popular chatbot ChatGPT, demonstrating their ongoing commitment to advancing generative AI technology. While text-to-video capabilities and “multi-modal models” aren’t groundbreaking ideas, what sets this development apart from others on the market, as noted by Reece Hayden, a senior analyst at ABI Research, is OpenAI’s claim regarding Sora’s exceptional length and accuracy.

Hayden also stated, “One obvious use case is within TV; creating short scenes to support narratives. The model is still limited though, but it shows the direction of the market.” Furthermore, he noted that these types of AI models could significantly impact the digital entertainment industry by enabling the streaming of personalized content across various channels.

At the same time, OpenAI said Sora remains a work in progress, highlighting distinct “weaknesses,” especially in accurately representing spatial details and cause-and-effect relationships within prompts. For example, it said a video of someone taking a bite out of a cookie might lack a visible bite mark immediately afterward.

For the time being, OpenAI’s messaging gives the most attention to safety. The company said it is working with a team of experts to test the latest model and look closely at a range of areas, including hateful content, misinformation, and bias. The firm also said it is building tools to help detect misleading content.

To begin with, Sora will be open to cybersecurity professionals, commonly referred to as “red teamers,” allowing them to probe the product for potential harms or risks. On top of that, OpenAI will grant access to a select group of visual artists, designers, and filmmakers in order to collect feedback on how creative professionals might use Sora.
