Google launches LLM to generate videos from text, audio input


OpenAI, Microsoft, and Adobe have launched AI tools powered by large language models (LLMs) that convert text input into images. Google has now released VideoPoet, an LLM that can turn text into videos. To showcase VideoPoet’s capabilities, Google Research produced a short movie composed of clips generated by the model. VideoPoet uses a pre-trained MAGVIT V2 video tokenizer and a SoundStream audio tokenizer to transform images, videos, and audio clips into sequences of discrete codes. These codes are compatible with text-based language models, allowing integration with other modalities.

Companies like OpenAI, Microsoft and Adobe have launched AI tools powered by specific types of large language models (LLMs) that turn a text prompt into an image. Google has also been in the fray, and it has now taken a step forward by releasing an LLM, called VideoPoet, that can turn text into videos.

To showcase VideoPoet’s capabilities, Google Research has produced a short movie composed of several short clips generated by the model.

How the VideoPoet model works

Google explains that, for the script of its short movie, it asked Bard to write a series of prompts detailing a short story about a travelling raccoon. It then generated a video clip for each prompt and stitched the resulting clips together into a final YouTube Short.

“VideoPoet is a simple modelling method that can convert any autoregressive language model or large language model (LLM) into a high-quality video generator,” Google said.
A pre-trained MAGVIT V2 video tokenizer and a SoundStream audio tokenizer transform images, videos and audio clips of variable length into sequences of discrete codes in a unified vocabulary.

These codes are compatible with text-based language models, making it straightforward to combine them with other modalities, such as text. The LLM then learns, across modalities, to predict the next video or audio token in the sequence.
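The idea of a "unified vocabulary" can be illustrated with a small sketch. This is not VideoPoet's actual code: the real MAGVIT V2 and SoundStream tokenizers are neural models, and the vocabulary sizes, offsets, and function names below are all hypothetical. The sketch only shows how discrete codes from different modalities can be packed into disjoint ranges of one token vocabulary, so a single autoregressive LLM can treat them all as ordinary token IDs.

```python
# Hypothetical vocabulary sizes; VideoPoet's real ones are not stated here.
TEXT_VOCAB = 256
VIDEO_VOCAB = 1024
AUDIO_VOCAB = 512

# Offsets place each modality in a disjoint slice of one unified vocabulary.
VIDEO_OFFSET = TEXT_VOCAB                 # 256
AUDIO_OFFSET = TEXT_VOCAB + VIDEO_VOCAB   # 1280
TOTAL_VOCAB = TEXT_VOCAB + VIDEO_VOCAB + AUDIO_VOCAB

def encode_text(s: str) -> list[int]:
    # Stand-in for a real text tokenizer: one token per character.
    return [ord(c) % TEXT_VOCAB for c in s]

def encode_video(codes: list[int]) -> list[int]:
    # 'codes' stands in for MAGVIT-V2-style discrete video codes.
    return [VIDEO_OFFSET + c for c in codes]

def encode_audio(codes: list[int]) -> list[int]:
    # 'codes' stands in for SoundStream-style discrete audio codes.
    return [AUDIO_OFFSET + c for c in codes]

# A text prompt followed by video codes becomes one flat token sequence
# that an LLM can be trained on with ordinary next-token prediction.
sequence = encode_text("raccoon") + encode_video([3, 17, 42])
assert all(0 <= t < TOTAL_VOCAB for t in sequence)
print(len(sequence), max(sequence))  # prints: 10 298
```

Because every modality lands in its own ID range, the model (and a decoder) can always tell which tokenizer a given token came from, which is what lets one language model mix text, video and audio in a single sequence.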

“A mixture of multimodal generative learning objectives are introduced into the LLM training framework, including text-to-video, text-to-image, image-to-video, video frame continuation, video inpainting and outpainting, video stylisation, and video-to-audio,” the company said, noting that the result is an AI-generated video.

In layman’s terms, VideoPoet folds capabilities that would otherwise require multiple separately trained components for different tasks into a single LLM.

Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.


