This is the future of AI video, when videos like these are made completely by artificial intelligence.
None of these videos depict real people, places or events.
At first glance, the images amaze and confound: A woman strides along a city street alive with pedestrians and neon lights. A car kicks up a cloud of dust on a mountain road.
But upon closer inspection, anomalies appear: The dust plumes don’t always quite line up with the car’s rear wheels. And those pedestrians are stalking that woman like some eerie zombie horde.
This is Sora, a new tool from OpenAI that can create lifelike, minute-long videos from simple text prompts. When the company unveiled it on Feb. 15, experts hailed it as a major moment in the development of artificial intelligence. Google and Meta also have unveiled new AI video research in recent months. The race is on toward an era when anyone can almost instantly create realistic-looking videos without sophisticated CGI tools or expertise.
Disinformation researchers are unnerved by the prospect. Last year, fake AI photos of former president Donald Trump running from police went viral, and New Hampshire primary voters were targeted this January with fake, AI-generated audio of President Biden telling them not to vote. It’s not hard to imagine lifelike fake videos erupting on social media to further erode public trust in political leaders, institutions and the media.
For now, Sora is open only to testers and select filmmakers; OpenAI declined to say when Sora will be available to the general public. “We’re announcing this technology to show the world what’s on the horizon,” said Tim Brooks, a research scientist at OpenAI who co-leads the Sora project.
The videos that appear here were created by the company, some at The Washington Post’s request. Sora uses technology similar to artificial intelligence chatbots, such as OpenAI’s ChatGPT, to translate human-written prompts into requests with sufficient detail to produce a video.
Some are shockingly realistic. When Sora was asked to create a scene of California’s rugged Big Sur coastline, the AI tool’s output was stunning.
Although “garay point beach” is not a real place, Sora produced a video that is almost indistinguishable from this real video of the Big Sur coast near Pfeiffer Falls shot by photographer Philip Thurston. If anything, the fake scene looks more majestic than the real one.
Humans and animals are harder. But here, too, Sora produces surprisingly lifelike results. Take a look at this scene of a cat demanding breakfast.
The texture of the cat’s fur, the intricate shadows on the blankets and the way the person’s face responds to the cat’s intrusion are all realistic. But take another look at that paw.
Sora seems to have trouble with cause and effect, so when the cat moves its left front paw, another appendage sprouts to replace it. The person’s hand is accurately rendered — a detail previous AI tools have struggled with — but it’s in an odd spot.
A similar thing happens in this scene from a Holi spring festival in India, which OpenAI produced at The Post’s request.
Sora produces a realistic drone shot of the colorful celebration, but some people in the crowd blur together, while others sprout clones.
Sora was created by training an AI algorithm on countless hours of videos licensed from other companies and public data scraped from the internet, said Natalie Summers, a spokesperson for the Sora project. By ingesting all that video, the AI amasses knowledge of what certain things and concepts look like. Brooks compared the model’s growth to the way humans come intuitively to understand the world instead of explicitly learning the laws of physics.
Successive versions of the model have gotten better, said Bill Peebles, the other co-lead on the Sora project. Early versions couldn’t even make a credible dog, he said. “There would be legs coming out of places where there should not be legs.”
This video shows Sora has gotten the canine thing down. But these frolicking gray wolf pups still merge and reemerge with mesmerizing weirdness.
How about a scene from a classic Hollywood film? At The Post’s request, Sora produced an actor and a sensibility that seems plucked directly from a real movie.
But Sora clearly is confounded by how to light a cigarette. It knows the process involves hands, a lighter and smoke, but it can’t quite figure out what the hands do or in what order.
There are other problems. Look closely at the telephone. It has two handsets and a cord that stretches upward to become part of the lamp. Other items on the desk look vaguely real, but it’s unclear what they’re supposed to be.
“The model is definitely not yet perfect,” Brooks said.
Other videos show struggles, too. In this one, a man runs realistically on a treadmill — except he’s facing backward.
And even when Sora gets it right, problems may lurk. Take this video Sora made of a Victoria crowned pigeon. Tech critic and author Brian Merchant pointed out that the video looks quite similar to a real one of the same bird filmed by a photographer whose images are available on Shutterstock.
OpenAI has a partnership with Shutterstock to use its videos to train AI. But because Sora is also trained on videos taken from the public web, owners of other videos could raise legal challenges alleging copyright infringement. AI companies have argued that using publicly available online images, text and video amounts to “fair use” and is legal under copyright law. But authors, artists and news organizations have sued OpenAI and others, saying they never gave permission or received payment for their work to be used this way.
The AI field is struggling with other problems, as well. Sora and other AI video tools can’t produce sound, for example. Although there has been rapid improvement in AI tools over the past year, they are still unpredictable, often making up false information when asked for facts.
Meanwhile, “red teamers” are assessing Sora’s propensity to create hateful content and perpetuate biases, said Summers, the project spokesperson.
Still, the race to produce lifelike AI videos isn’t slowing down. One of Google’s efforts, called Lumiere, can fill in pieces cut out of real videos. Here, it fills in the black section from the video on the left.
“Our primary goal in this work is to enable novice users to generate visual content in a creative and flexible way,” Google said in a research paper. The company declined to make a Lumiere researcher available for an interview.
Other companies have begun commercializing AI video technology. New York-based start-up Runway has developed tools to help people quickly edit things into or out of real video clips.
OpenAI has even bigger dreams for its tech. Researchers say AI could one day help computers understand how to navigate physical spaces or build virtual worlds that people could explore.
“There’s definitely going to be a new class of entertainment experiences,” Peebles said, predicting a future in which “the line between video game and movie might be more blurred.”
About this story
Editing by Karly Domb Sadof and Yun-Hee Kim. Design editing by Betty Chavarria. Video production by Nicki DeMarco. Copy editing by Melissa Ngo.