Elon Musk’s xAI is working on making Grok multimodal



Elon Musk’s AI company, xAI, is making progress on adding multimodal inputs to its Grok chatbot, according to public developer documents. This means users may soon be able to upload photos to Grok and receive text-based answers.

xAI first teased the feature in a blog post last month, which said Grok-1.5V will offer “multimodal models in a number of domains.” The latest update to the developer documents appears to show progress on shipping a new model.

In the developer documents, a sample Python script demonstrates how developers can use the xAI software development kit (SDK) to generate a response based on both text and images. The script reads an image file, sets up a text prompt, and uses the xAI SDK to generate a response.
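The first two steps of that script, reading an image file and pairing it with a text prompt, might look roughly like the sketch below. The article does not give the xAI SDK's actual function names, so the sketch stops short of calling the SDK. The helper `build_multimodal_request` and the field names in its return value are hypothetical and are not taken from xAI's documentation.

```python
import base64
from pathlib import Path


def build_multimodal_request(image_path: str, prompt: str) -> dict:
    """Bundle an image file and a text prompt into one request payload.

    Hypothetical helper: the field names below are illustrative only and
    are NOT taken from the xAI SDK.
    """
    image_bytes = Path(image_path).read_bytes()  # read the raw image file
    return {
        "prompt": prompt,
        # Base64 keeps binary image data safe inside a text-based payload.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "image_size_bytes": len(image_bytes),
    }
```

In a real script, a payload like this would then be handed to whatever generation call the SDK exposes.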

This is a big update for Grok, which xAI first released in November 2023 and which is available to users who pay for the X Premium Plus subscription. The last update was Grok 1.5 in March, which came with improved reasoning capabilities.

The model is trained “on a variety of text data from publicly available sources from the Internet up to Q3 2023 and data sets reviewed and curated by … human reviewers,” according to a blog post from X. Grok-1 was not trained on X data (including public X posts), the blog added. However, Grok does have “real-time knowledge of the world,” including posts on X.

xAI, founded by Elon Musk in March 2023, is relatively new in the AI field and trails competitors such as OpenAI, the maker of ChatGPT. However, according to a blog post from xAI, its Grok 1.5 model is closing the gap with GPT-4 on various benchmarks that range from grade school to high school competition problems. It’s important to note that benchmarks for large language models are often criticized because models can score well on a benchmark if its questions are included in their training data. It’s sort of like memorizing test answers, rather than actually learning the material.
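To make the "memorizing test answers" concern concrete, here is a minimal sketch of one common way to look for benchmark leakage: measuring how many of a benchmark question's word sequences also appear verbatim in a piece of training text. This is an illustration only, not a method the article attributes to xAI or OpenAI. The function names and the choice of 5-word sequences are assumptions made for the example.

```python
def ngrams(text: str, n: int) -> set:
    """Return the set of n-word sequences in text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(benchmark_item: str, training_text: str, n: int = 5) -> float:
    """Fraction of the benchmark item's n-grams that also occur in the training text."""
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        return 0.0  # item is shorter than n words, so there is nothing to compare
    shared = item_grams & ngrams(training_text, n)
    return len(shared) / len(item_grams)
```

A ratio near 1.0 suggests the question appears almost word for word in the training text, while a ratio near 0.0 suggests it does not. Paraphrased copies would slip past a check this simple.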

Multimodal conversational chatbots seem to be the next frontier for AI, with multiple advancements announced at Google I/O and OpenAI releasing GPT-4o. Grok’s lack of multimodal capabilities had put it behind the curve, until now.




Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.
