2025-02-18

ByteDance, the Chinese parent company of TikTok, recently debuted a new AI model called OmniHuman-1, which allows users to take a single picture and create hyper-realistic videos with synced audio, giving the world the most powerful and yet most dangerous deepfake tool to date.

The developers of the model list OmniHuman-1’s capabilities on its website:

OmniHuman is an end-to-end AI framework developed by researchers at ByteDance. It can generate incredibly realistic human videos from just a single image and a motion signal—like audio or video. Whether it's a portrait, half-body shot, or full-body image, OmniHuman handles it all with lifelike movements, natural gestures, and stunning attention to detail. At its core, OmniHuman is a multimodality-conditioned human video generation model. This means it combines different types of inputs, such as images and audio clips, to create realistic videos.

OmniHuman can bring music to life, whether it’s opera or a pop song. The model captures the nuances of the music and translates them into natural body movements and facial expressions. For instance:

OmniHuman is highly skilled at handling gestures and lip-syncing. It generates realistic talking avatars that feel almost human. Applications include:

OmniHuman isn’t limited to humans. It can animate:

Cartoons.

Animals.

Artificial objects.

This adaptability makes it suitable for creative applications, such as animated movies or interactive gaming.

OmniHuman delivers lifelike results even in close-up scenarios. Whether it’s a subtle smile or a dramatic gesture, the model captures it all with stunning realism.

OmniHuman can also mimic specific actions from reference videos. For example:

Use a video of someone dancing as the motion signal, and OmniHuman generates a video of your chosen person performing the same dance.

Combine audio and video signals to animate specific body parts, creating a talking avatar that mimics both speech and gestures.

ByteDance did, however, note several “cons”: the model has limited availability, is resource intensive, and requires significant computational power.

The company provides a number of examples on its website, but the popular YouTube channel AI Health compiled most of them into a single video and added some commentary on the samples:

The model is certainly not perfect: the ByteDance team notes that lower-resolution images will not net the best results, and some poses and movements can still look awkward at times, as seen in this example video:

Nevertheless, the accuracy of the videos points towards larger and more malicious ramifications soon to come as this technology improves and reaches the masses.

TechCrunch writes:

Still, OmniHuman-1 is easily heads and shoulders above previous deepfake techniques, and it may well be a sign of things to come. While ByteDance hasn’t released the system, the AI community tends not to take long to reverse-engineer models like these.

The implications are worrisome.

Last year, political deepfakes spread like wildfire around the globe. On election day in Taiwan, a Chinese Communist Party-affiliated group posted AI-generated, misleading audio of a politician throwing his support behind a pro-China candidate. In Moldova, deepfake videos depicted the country’s president, Maia Sandu, resigning. And in South Africa, a deepfake of rapper Eminem supporting a South African opposition party circulated ahead of the country’s election.

Deepfakes are also increasingly being used to carry out financial crimes. Consumers are being duped by deepfakes of celebrities offering fraudulent investment opportunities, while corporations are being swindled out of millions by deepfake impersonators. According to Deloitte, AI-generated content contributed to more than $12 billion in fraud losses in 2023, and could reach $40 billion in the U.S. by 2027.

Tom’s Guide also reported that ByteDance recently unveiled another AI model called Goku, which is similar to OmniHuman-1 but appears to be geared more towards short-form content and advertising.

As the tech outlet notes: These video tools are not just destined to sell us more products, it's obvious there's a much larger agenda at work here. After advertising, the next domino to fall is almost certainly going to be animated art in all its forms. Even if we don't see full length animations using this technology in the short term, there's no question that it's already being deployed as part of the production process.

AUTHOR COMMENTARY

It’s always about the money, isn’t it? “For the love of money is the root of all evil…” (1 Timothy 6:10).

Long-time readers of The WP will recall that I've covered a number of different AI models that have been cropping up over the last several years, each one more realistic than the last. This one arguably has the greatest potential to create convincing deepfakes, mis- and disinformation, and malicious scams.

Be very careful what you are looking at and listening to online. As a general rule, I question and challenge everything, and I do mean everything. I think that that’s just being prudent, but with everything basically being a lie online you need to take the high ground and guard yourself.

Proverbs 14:8 The wisdom of the prudent is to understand his way: but the folly of fools is deceit.

