Microsoft unveils AI model that understands image content and solves visual puzzles

Microsoft unveiled Kosmos-1, a multimodal model capable of analysing images for content, solving visual puzzles, performing visual text recognition, passing visual IQ tests, and understanding natural language instructions.

The researchers believe that multimodal AI, which integrates different modes of input such as text, audio, images, and video, is a critical step towards developing artificial general intelligence (AGI) that can perform general tasks at the level of a human. In their academic paper, “Language Is Not All You Need: Aligning Perception with Language Models,” the researchers write that multimodal perception “is a necessity to achieve artificial general intelligence, in terms of knowledge acquisition and grounding to the real world.”

Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.
