OpenAI’s new artificial intelligence tool ‘Sora’ caused quite a stir

Could it be a springboard for more convincing 'deepfake' videos?

Artificial intelligence startup OpenAI has unveiled a new AI tool that can produce highly realistic videos of up to 60 seconds from a simple text prompt, and it has caused quite a stir.

YouTubers are in shock, tech editors think it’s a milestone, and those who have been concerned about artificial intelligence (AI) from the start say it could be a springboard for more convincing ‘deepfake’ videos. The tech world is polarized between ‘hardcore AI fans’ and ‘AI doomsayers’.

The move is both exciting and a bit scary, which is probably why OpenAI is keeping its new tool behind a very limited access program for now.

The new ‘text-to-video’ tool, called ‘Sora’, is currently available only to ‘red team’ members, who evaluate the model for potential harms and risks. OpenAI is also offering access to some visual artists, designers and filmmakers for feedback.

Sora is based on the technology behind OpenAI’s image-generating DALL-E tool. It expands a user’s prompt into a more detailed set of instructions and then uses an AI model trained on video and images to create the new video.

OpenAI CEO Sam Altman said on X: “We want to show you what Sora can do. Please write down the videos you want to see and we’ll start making them.” Some of the videos that have emerged look really stunning.

According to OpenAI’s blog post, Sora can create ‘complex scenes with multiple characters, specific types of movement, and accurate details of subject matter and background’. The company also notes that the model understands how objects ‘exist in the physical world’, and that it can accurately interpret prompts and generate characters that express vivid emotions.

The model can also create a video from a still image, fill in missing frames in an existing video, or extend the video.

Prompt: A stylish woman walks down a Tokyo street full of warm neon lights and bustling city signs. She is wearing a black leather jacket, a long red dress and black boots, and is carrying a black bag. She is wearing sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, and the colored lights create a mirror effect. Many pedestrians are walking around.

Prompt: A movie trailer featuring the adventures of a 30-year-old spaceman wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vibrant colors.

Prompt: A cat wakes up its sleeping owner and asks for breakfast. The owner tries to ignore the cat, but the cat tries new tactics. (We love it.)

Prompt: A Chinese New Year celebration video with a Chinese Dragon. (The result is eerily realistic.)

That said, many of the videos show telltale signs of being AI-generated, and the company acknowledges this. In some videos, Sora also produces physically implausible movements. OpenAI says the model ‘may struggle to accurately simulate the physics of a complex scene’ at this stage. But the results are impressive overall.

What are the first impressions of Sora? What do the experts think?

The rapid development of artificial intelligence technology is affecting many industries, from filmmaking to journalism. According to the Washington Post, there are some technologists who claim that in the near future ‘a single person could make a movie on the same visual level as a Marvel movie’.

Film director and visual effects expert Michael Gracey, who has been closely following the impact of AI on the industry, said, “Look at how far we’ve come in rendering in just one year. You won’t need a team of 100 or 200 artists for three years to make animated movies. That’s exciting for me.”

But on the other hand, he emphasizes that training AI tools with the work of artists is a big problem: “It’s not fair to take people’s creativity, their work, their ideas and their practices and not pay them what they deserve.”

Mutale Nkonde, a policy researcher at the Oxford Internet Institute, says the idea that anyone can easily turn text into video is exciting. But she has concerns about how these tools can embed social prejudices, their impact on people’s livelihoods, and their ability to turn hateful text into disturbingly realistic images.

Nkonde notes that the Hollywood strikes sought solutions to issues such as the use of AI tools in screenwriting and the cloning of actors with this technology, and then turns to ‘deepfakes’: “From a policy perspective, and when it comes to tools like this, shouldn’t we start thinking about how we can protect people?”

The technology companies that develop these tools say they monitor their use and have policies in place to prevent them from being used to produce political content. However, it is unclear how these policies are enforced.

Arvind Narayanan, a professor of computer science at Princeton University, says that based on the demo videos OpenAI shared, Sora ‘appears to be significantly more advanced than any other video creation tool’. Narayanan also thinks the move is likely to result in more convincing ‘deepfake’ videos. But “if you look closely at some of the videos, you can still see a lot of inconsistencies,” he says, pointing to the gait of the woman in the Tokyo video and the people disappearing in the background.

Ted Underwood, professor of information science at the University of Illinois, said: “I honestly didn’t expect this level of video production for another two to three years. It seems like a bit of a leap in capacity compared to other text-to-video tools.” But he cautions that OpenAI may have chosen videos that best illustrate the model.

Sources: OpenAI blog post and Washington Post article.

FİKRİKADİM


