Microsoft’s new safety system can catch hallucinations in its customers’ AI apps

Sarah Bird, Microsoft’s chief product officer of responsible AI, tells The Verge in an interview that her team has designed several new safety features that will be easy to use for Azure customers who aren’t hiring groups of red teamers to test the AI services they built. Microsoft says these LLM-powered tools can detect potential vulnerabilities, monitor for hallucinations “that are plausible yet unsupported,” and block malicious prompts in real time for Azure AI customers working with any model hosted on the platform.

“We know that customers don’t all have deep expertise in prompt injection attacks or hateful content, so the evaluation system generates the prompts needed to simulate these types of attacks. Customers can then get a score and see the outcomes,” she says.

Three features are now available in preview on Azure AI: Prompt Shields, which blocks prompt injections or malicious prompts from external documents that instruct models to go against their training; Groundedness Detection, which finds and blocks hallucinations; and safety evaluations, which assess model vulnerabilities. Two other features, one for directing models toward safe outputs and one for tracking prompts to flag potentially problematic users, will be coming soon.

Whether the user is typing in a prompt or the model is processing third-party data, the monitoring system evaluates the input to see if it contains any banned words or hidden prompts before deciding whether to send it to the model to answer. Afterward, the system looks at the model’s response and checks whether the model hallucinated information that is not in the document or the prompt.
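The flow described above can be sketched in a few lines of Python. This is an illustration only, not Microsoft’s implementation and not the Azure API: the banned-word list, the injection pattern, the word-overlap groundedness check, and every function name here are hypothetical stand-ins chosen to show the order of the steps (screen the input, call the model, then check the response against the prompt and document).

```python
import re
from typing import Callable

# Hypothetical placeholders: the article does not list real banned terms
# or injection patterns.
BANNED_WORDS = {"bannedword"}
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]


def input_is_blocked(text: str) -> bool:
    """Return True if the text has a banned word or a hidden-prompt pattern."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if words & BANNED_WORDS:
        return True
    return any(p.search(text) for p in INJECTION_PATTERNS)


def ungrounded_sentences(response: str, sources: list[str]) -> list[str]:
    """Naive stand-in for a groundedness check: flag response sentences that
    share no word of four or more letters with the prompt or document."""
    source_words = set(re.findall(r"[a-z]{4,}", " ".join(sources).lower()))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        sentence_words = set(re.findall(r"[a-z]{4,}", sentence.lower()))
        if sentence_words and not (sentence_words & source_words):
            flagged.append(sentence)
    return flagged


def guarded_call(prompt: str, document: str, model: Callable[[str], str]) -> dict:
    """Screen inputs, call the model, then check the response."""
    # Step 1: evaluate both the user prompt and the third-party data.
    if input_is_blocked(prompt) or input_is_blocked(document):
        return {"status": "blocked_input", "response": None, "flagged": []}
    # Step 2: only clean input reaches the model.
    response = model(f"{document}\n\n{prompt}")
    # Step 3: look for content not supported by the document or the prompt.
    flagged = ungrounded_sentences(response, [prompt, document])
    status = "ungrounded" if flagged else "ok"
    return {"status": status, "response": response, "flagged": flagged}
```

A real groundedness detector would rely on a language model rather than word overlap; the overlap heuristic is used here only to keep the sketch self-contained.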

In the case of the Google Gemini images, filters made to reduce bias had unintended effects, which is an area where Microsoft says its Azure AI tools will allow for more customized control. Bird acknowledges that there is concern Microsoft and other companies could be deciding what is or isn’t appropriate for AI models, so her team added a way for Azure customers to toggle the filtering of hate speech or violence that the model sees and blocks.

In the future, Azure users will also be able to get a report of users who attempt to trigger unsafe outputs. Bird says this allows system administrators to figure out which users belong to their own team of red teamers and which could be people with more malicious intent.

Bird says the safety features are immediately “attached” to GPT-4 and other popular models like Llama 2. However, because Azure’s model garden contains many AI models, users of smaller, less-used open-source systems may have to manually point the safety features to the models.
