Alibaba Releases Open-Source Wan 2.1 Suite of AI Video Generation Models, Claimed to Outperform OpenAI’s Sora

Alibaba released a suite of artificial intelligence (AI) video generation models on Wednesday. Dubbed Wan 2.1, these are open-source models that can be used for both academic and commercial purposes. The Chinese e-commerce giant released the models in several variants that differ in parameter count. Developed by the company’s Wan team, the models were first introduced in January, when the company claimed that Wan 2.1 can generate highly realistic videos. The models are currently hosted on the AI and machine learning (ML) hub Hugging Face.

Alibaba Introduces Wan 2.1 Video Generation Models

The new Alibaba video AI models are hosted on the Wan team’s Hugging Face page. The model pages also detail the Wan 2.1 suite of video generation models. There are four models in total: T2V-1.3B, T2V-14B, I2V-14B-720P, and I2V-14B-480P. T2V is short for text-to-video, while I2V stands for image-to-video.
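The naming scheme above can be read mechanically. The following is a minimal Python sketch, not an official tool: the helper `parse_variant` and the `Variant` type are hypothetical names introduced here, and the pattern assumes that every name follows the task, parameter count, optional resolution layout seen in the four listed variants.

```python
import re
from typing import NamedTuple, Optional

# Variant names exactly as listed in the article.
VARIANTS = ["T2V-1.3B", "T2V-14B", "I2V-14B-720P", "I2V-14B-480P"]

# Task prefixes as expanded in the article.
TASKS = {"T2V": "text-to-video", "I2V": "image-to-video"}


class Variant(NamedTuple):
    task: str
    params_billions: float
    resolution: Optional[str]  # None when the name has no resolution suffix


def parse_variant(name: str) -> Variant:
    """Split a name such as 'I2V-14B-720P' into its parts (hypothetical helper)."""
    match = re.fullmatch(r"(T2V|I2V)-(\d+(?:\.\d+)?)B(?:-(\d+)P)?", name)
    if match is None:
        raise ValueError(f"unrecognised variant name: {name!r}")
    task, size, res = match.groups()
    return Variant(TASKS[task], float(size), f"{res}p" if res else None)


if __name__ == "__main__":
    for name in VARIANTS:
        print(name, "->", parse_variant(name))
```

Read this way, the suite has two text-to-video models (1.3 billion and 14 billion parameters) and two 14-billion-parameter image-to-video models that differ only in output resolution.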

The researchers claim that the smallest variant, Wan 2.1 T2V-1.3B, can run on a consumer-grade GPU with as little as 8.19GB of VRAM. As per the post, the model can generate a five-second 480p video on an Nvidia RTX 4090 in about four minutes.
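To put those figures in perspective, here is a small Python sketch of the arithmetic. It uses only the numbers quoted above; the function names are hypothetical, and treating 8.19GB as a hard pass/fail threshold is a simplifying assumption, since real memory use depends on settings the article does not describe.

```python
# Figures quoted in the article for Wan 2.1 T2V-1.3B.
VIDEO_SECONDS = 5             # length of the generated clip
GENERATION_SECONDS = 4 * 60   # "about four minutes" on an RTX 4090
REQUIRED_VRAM_GB = 8.19       # minimum VRAM claimed by the researchers


def slowdown_factor(video_s: float, generation_s: float) -> float:
    """Seconds of compute needed per second of finished video."""
    return generation_s / video_s


def fits_in_vram(available_gb: float, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Crude check (assumption): the card fits if it meets the quoted minimum."""
    return available_gb >= required_gb


if __name__ == "__main__":
    print(slowdown_factor(VIDEO_SECONDS, GENERATION_SECONDS))
    print(fits_in_vram(24.0))  # an RTX 4090 ships with 24GB
    print(fits_in_vram(8.0))   # an 8GB card falls just short of 8.19GB
```

On these numbers, generation takes roughly 48 seconds of compute per second of video, so the model is practical on consumer hardware but far from real time.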

While the Wan 2.1 suite is aimed at video generation, it can also perform other functions such as image generation, video-to-audio generation, and video editing. However, the currently open-sourced models are not capable of these advanced tasks. For video generation, they accept text prompts in Chinese and English as well as image inputs.

As for the architecture, the researchers revealed that the Wan 2.1 models are built on a diffusion transformer architecture. However, the company extended the base architecture with new variational autoencoders (VAEs), training strategies, and more.

Most notably, the AI models use a new 3D causal VAE architecture dubbed Wan-VAE. It improves spatiotemporal compression and reduces memory usage. The autoencoder can encode and decode unlimited-length 1080p resolution videos without losing historical temporal information. This enables consistent video generation.
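The word "causal" here means that each encoded frame depends only on the current and earlier frames, never on later ones. The toy Python sketch below illustrates that general idea on a one-dimensional signal with one number per frame. It is not Wan-VAE's implementation, which the article does not describe: the function names, the kernel, and the caching scheme are all assumptions chosen to show why a causal design lets a long video be processed in chunks while carrying past context forward.

```python
from typing import List, Sequence, Tuple


def causal_conv(frames: Sequence[float], kernel: Sequence[float],
                history: Sequence[float] = ()) -> Tuple[List[float], List[float]]:
    """Causal convolution along the time axis (toy, one value per frame).

    Output t mixes frame t with the k-1 frames before it, never later ones.
    `history` holds the last k-1 inputs of the previous chunk; missing
    history is zero-padded. Returns (outputs, history for the next chunk).
    """
    k = len(kernel)
    pad = k - 1
    past = list(history)[-pad:] if pad else []
    past = [0.0] * (pad - len(past)) + past
    padded = past + list(frames)
    outputs = [sum(kernel[j] * padded[t + j] for j in range(k))
               for t in range(len(frames))]
    new_history = padded[len(padded) - pad:] if pad else []
    return outputs, new_history


def encode_in_chunks(frames: Sequence[float], kernel: Sequence[float],
                     chunk_size: int) -> List[float]:
    """Process a long sequence chunk by chunk, carrying the cached history."""
    history: List[float] = []
    outputs: List[float] = []
    for start in range(0, len(frames), chunk_size):
        chunk_out, history = causal_conv(frames[start:start + chunk_size],
                                         kernel, history)
        outputs.extend(chunk_out)
    return outputs


if __name__ == "__main__":
    frames = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    kernel = [0.25, 0.25, 0.5]
    full, _ = causal_conv(frames, kernel)
    # Chunked processing with a cache matches processing everything at once.
    print(encode_in_chunks(frames, kernel, chunk_size=3) == full)
```

Because only a short cache of recent frames has to be kept between chunks, memory use in this toy stays constant regardless of sequence length, which is the kind of property the article attributes to Wan-VAE.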

Based on internal testing, the company claimed that the Wan 2.1 models outperform OpenAI’s Sora AI model in consistency, scene generation quality, single object accuracy, and spatial positioning.

These models are available under the Apache 2.0 licence. It permits both academic and commercial use, subject to conditions such as retaining the licence text and copyright notices when the models are redistributed.

Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.

Editorial Team
StartupNews.fyi is a leading global startup and technology media platform known for its end-to-end coverage of the startup ecosystem across India and key international markets. Launched with the vision of becoming a single gateway for founders, investors, and ecosystem enablers, StartupNews.fyi has grown steadily over the years by publishing tens of thousands of verified news stories, insights, and ecosystem updates, reaching millions of startup enthusiasts every month through its digital platforms and communities.
