New AI Video Generator "Vidu" Competes with SORA

AI Search
28 Apr 2024 11:37

TLDR: Chinese company Shengshu has unveiled Vidu, an AI video generator rivaling OpenAI's SORA. Vidu is built on the Universal Vision Transformer (UViT) architecture, which combines diffusion and Transformer models. The company claims it can produce a 16-second 1080p video with a single click. The underlying technology, first proposed by Vidu's research team in 2022, promises more coherent and accurate video generation. The video showcases Vidu's ability to generate realistic hands and scenes, though some clips contain inconsistencies. Access to Vidu is by application through shengshu-ai.com. The emergence of Vidu highlights the growing global competition in AI, with China recently launching advanced AI models and robots.

Takeaways

  • 😀 A Chinese company, Shengshu, has announced a new AI video generator named Vidu, positioning it as a competitor to SORA.
  • 🎥 Vidu is claimed to generate a 16-second 1080p video clip with a single click.
  • 🤖 Vidu is built on a self-developed architecture called Universal Vision Transformer (UViT), which integrates diffusion and Transformer models.
  • 🔮 The integration of diffusion and Transformer models in Vidu is seen as an advancement in generative AI, potentially overcoming previous limitations.
  • 📈 Vidu's technology was first proposed by its research team in September 2022, prior to SORA's model architecture.
  • 🆚 In direct comparisons, Vidu's video quality appears to be notably better than current alternatives like Runway and Pika.
  • 🌐 The showcased Vidu videos are currently 720p, which is lower than the full HD resolution of SORA's videos.
  • 👀 Vidu generates hands and realistic elements well, but there are inconsistencies in some of the showcased videos.
  • 🔗 Interested users can apply to use Vidu through the website shengshu-ai.com, where they can leave their contact information.
  • 🌟 China has been making significant strides in AI, with recent advancements in language models and robotics, indicating a competitive landscape.

Q & A

  • What is the name of the AI video generator announced by the Chinese company Shengshu?

    -The AI video generator announced by the Chinese company Shengshu is called 'Vidu'.

  • What is the main claim of Vidu's AI video generator in comparison to SORA?

    -Vidu claims its AI video generator is on par with SORA, being able to generate high-quality videos with advanced generative AI capabilities.

  • What is the technical architecture behind Vidu's AI video generator?

    -Vidu's AI video generator is built on a self-developed visual model architecture called Universal Vision Transformer (UViT), which integrates two kinds of AI model: diffusion and Transformer.

  • How does the merging of diffusion and Transformer models in Vidu's technology aim to improve generative AI?

    -Merging diffusion and Transformer models is considered the next step in generative AI, potentially making the generated videos or images more coherent and accurate by leveraging the Transformer model's strength in understanding context.

  • Who is Zhu Jun and what is his role in the development of Vidu's AI video generator?

    -Zhu Jun is the vice dean of the Institute for Artificial Intelligence at Tsinghua University and the chief scientist at Shengshu. He played a significant role in advancing the research that led to the development of Vidu's AI video generator.

  • What is the significance of the UViT model in the context of generative AI advancements?

    -The UViT model is significant as it represents an advancement in generative AI by combining the strengths of diffusion and Transformer models, potentially leading to more coherent and accurate video generation.

  • How does the video quality of Vidu's AI video generator compare to other existing generators like Runway and Pika?

    -Vidu's AI video generator appears to outcompete existing generators like Runway and Pika based on the showcased examples, with more realistic and coherent video outputs.

  • What are some of the limitations observed in Vidu's AI video generator when compared to SORA?

    -Some limitations observed in Vidu's AI video generator compared to SORA include occasional inconsistencies in details, such as hair transforming into a red ribbon or a green leaf disappearing.

  • How can one apply to use Vidu's AI video generator?

    -To apply to use Vidu's AI video generator, one can visit the website shengshu-ai.com, scroll down to the video generation section, and fill out the application form with their name, phone number, and company name.

  • What is the current status of Vidu's AI video generator in terms of public access?

    -As of the video's release, Vidu's AI video generator is not yet publicly accessible, but interested users can apply for access through the form on the shengshu-ai.com website.

  • How does the recent development of Vidu's AI video generator reflect the global competition in AI technology?

    -The development of Vidu's AI video generator reflects a growing global competition in AI technology, showing that companies outside of the major tech giants in America are also making significant advancements and contributing to the field.

Outlines

00:00

🚀 Introduction to SORA's AI Video Generator Competitor

The video begins with the announcement of a new AI video generator called Vidu by a Chinese company. The presenter expresses urgency to cover this development, highlighting Vidu's claim to be on par with OpenAI's SORA. A showreel is played to demonstrate Vidu's capabilities. The script then delves into the technical aspects, mentioning that Vidu is built on a self-developed architecture known as Universal Vision Transformer (UViT), which integrates diffusion and Transformer models. This integration is seen as a significant advancement in generative AI, potentially overcoming previous limitations in text generation and context understanding. The presenter also discusses the implications of this technology, suggesting that it could outperform current video generators like Runway and Pika.

05:01

📊 Comparative Analysis of SORA and Vidu AI Video Generators

The second section focuses on a comparative analysis between SORA's and Vidu's AI video generators. The presenter plays side-by-side comparisons of video clips generated by both platforms, noting differences in quality and realism. While acknowledging that Vidu's showreel is impressive, the presenter maintains that SORA's videos appear to be of higher quality and more realistic. The discussion also points out inconsistencies in Vidu's generated videos, such as hair turning into a red ribbon and a green leaf disappearing, which are not present in SORA's outputs. The presenter concludes this section by mentioning the lower resolution of Vidu's videos compared to SORA's full HD, which could affect the perceived quality.

10:03

๐ŸŒ Global AI Competition and Access to VDU's Technology

In the final paragraph, the script shifts focus to the broader context of AI development, particularly in China. The presenter expresses excitement about the recent advancements in AI, not just from American tech giants but also from countries like China and India. The script mentions other recent Chinese AI developments, such as a new language model and a fast robot, suggesting a surge in innovation. The presenter then guides viewers on how to apply for access to VDU's AI video generator, providing a link to the company's website. The video concludes with a call for viewer engagement, encouraging feedback on VDU's technology and whether it can compete with or surpass SORA. The presenter also reflects on the positive impact of competition in the AI space.

Keywords

💡 AI Video Generator

An AI video generator is a software tool that uses artificial intelligence to create videos based on input data or prompts. In the context of the video, the AI video generator 'Vidu' is a new entrant in the field, competing with 'SORA'. It is designed to generate realistic videos from textual descriptions, showcasing the potential of AI in content creation.

💡 SORA

SORA is mentioned as a competitor to Vidu in the AI video generation space. It represents the state-of-the-art in AI-generated video technology, setting a benchmark for other companies like Shengshu to compare their products against. The video discusses how Vidu claims to be on par with SORA in terms of video generation capabilities.

💡 Universal Vision Transformer (UViT)

UViT refers to a self-developed visual transformation model architecture that Vidu is built upon. It integrates two AI models: the diffusion model and the Transformer model. This integration is considered an advancement in generative AI, as it aims to overcome the limitations of previous models by generating more coherent and accurate visual content.

💡 Diffusion Model

The diffusion model is a type of generative AI model that has been used to create images and videos. It has some limitations, such as difficulty in generating text and understanding complex prompts. In the video, it is noted that Vidu's UViT architecture combines the diffusion model with the Transformer model to improve upon these limitations.
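
As a rough illustration of what a diffusion model learns to undo, the sketch below applies the standard forward noising step from the usual textbook formulation, x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * noise. It is a generic toy, not Vidu's or UViT's code; the function name `add_noise` and the parameter `alpha_bar` are illustrative choices, and the "frame" is just a short list of pixel values.

```python
import math
import random

def add_noise(x0, alpha_bar, rng):
    """Forward diffusion step: blend clean values with Gaussian noise.

    alpha_bar = 1.0 keeps the clean signal; alpha_bar = 0.0 gives pure noise.
    A diffusion model is trained to reverse this corruption step by step.
    """
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in x0]

rng = random.Random(0)           # fixed seed so runs are repeatable
clean = [0.5, -1.0, 0.25]        # hypothetical pixel values
slightly_noisy = add_noise(clean, 0.9, rng)
mostly_noise = add_noise(clean, 0.1, rng)
print(slightly_noisy)
print(mostly_noise)
```

In an architecture like the one described here, the network that predicts and removes this noise would be a Transformer rather than a convolutional model.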

💡 Transformer Model

The Transformer model is a machine learning architecture introduced in the paper 'Attention Is All You Need' by researchers at Google. It is known for its ability to understand context and is the backbone of many language models like GPT and Claude. In the video, it is suggested that combining the Transformer model with the diffusion model could lead to more coherent and accurate AI-generated videos.
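
The context handling attributed to the Transformer comes from its attention mechanism, in which every token weighs every other token when computing its output. The sketch below is a minimal single-head scaled dot-product attention in plain Python, without the learned projection matrices a real model would have. It is a generic illustration rather than Vidu's code, and the function names are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Two identical keys receive equal weight, so the output is the mean of the values.
print(attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]]))
```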

💡 Generative AI

Generative AI refers to AI systems that can create new content, such as images, videos, or text, that did not exist before. The video discusses how the merging of the diffusion and Transformer models in Vidu's UViT architecture represents the next step in generative AI, aiming to produce higher quality and more realistic outputs.

💡 Stable Diffusion

Stable Diffusion is a specific diffusion-based model that has been used to generate images and videos. The video script mentions it as a predecessor with certain limitations, such as the inability to generate text well or follow complicated prompts, which Vidu's technology aims to surpass.

💡 Realism

Realism in the context of the video refers to the ability of the AI video generator to produce videos that closely resemble real-world visuals. The video compares the realism of Vidu's generated videos with those of SORA, noting that Vidu's outputs are quite realistic but may not yet match the quality of SORA.

💡 Resolution

Resolution in video refers to the number of pixels used to form the image, affecting the level of detail and sharpness. The video mentions that while Vidu's showcase videos are in 720p, the company claims it can output 1080p, which would offer a higher resolution and potentially more detailed videos.
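
To put the 720p and 1080p figures in perspective, the short calculation below compares pixels per frame. It assumes the common 16:9 frame sizes of 1280x720 and 1920x1080; the video only gives the labels, so the exact dimensions are an assumption.

```python
# Assumed 16:9 frame sizes; the source states only "720p" and "1080p".
RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080)}

def pixels_per_frame(label):
    """Return the number of pixels in one frame at the given resolution label."""
    width, height = RESOLUTIONS[label]
    return width * height

ratio = pixels_per_frame("1080p") / pixels_per_frame("720p")
print(pixels_per_frame("720p"))   # 921600
print(pixels_per_frame("1080p"))  # 2073600
print(ratio)                      # 2.25
```

Under that assumption, a 1080p frame carries 2.25 times as many pixels as a 720p frame, which is why the showcased 720p clips can look softer than full HD footage.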

💡 Competition

Competition in this context refers to the rivalry between different companies in the AI video generation market. The video highlights the positive impact of competition, such as Vidu's entry challenging SORA, which can drive innovation and improvements in AI video generation technology.

Highlights

Chinese company Shengshu announces an AI video generator, Vidu, as a competitor to SORA.

Vidu claims to be on par with OpenAI's Sora in terms of video generation capabilities.

Vidu is claimed to generate a 16-second 1080p video clip with a single click.

Vidu is powered by a self-developed visual model architecture called Universal Vision Transformer (UViT).

UViT integrates diffusion and Transformer AI models, marking a significant advancement in generative AI.

The combination of diffusion and Transformer models aims to overcome limitations of previous AI video generators.

The Transformer model, based on the 'Attention is All You Need' paper, is known for its context understanding capabilities.

The core technology of UViT was first proposed by Vidu's research team in September 2022, prior to SORA's model architecture.

Vidu showcases its ability to generate realistic hands with five fingers in its demo reel.

Comparisons between Vidu and SORA's video generation capabilities show potential for Vidu to be a strong competitor.

Vidu's video resolution is currently 720p, while SORA's videos are in full HD.

The company's website, shengshu-ai.com, allows users to apply for access to the AI video generator.

China has been releasing innovative AI technologies, indicating a global AI race with multiple strong contenders.

The release of Vidu adds competition to the AI video generation market, which is seen as beneficial for technological advancement.

Vidu's AI video generator is positioned as a potential close competitor to OpenAI's Sora.