OpenAI offers a peek behind the curtain of its AI’s secret instructions


Ever wonder why conversational AI like ChatGPT says “Sorry, I can’t do that” or some other polite refusal? OpenAI is offering a limited look at the reasoning behind its own models’ rules of engagement, whether it’s sticking to brand guidelines or declining to make NSFW content.

Large language models (LLMs) don’t have any naturally occurring limits on what they can or will say. That’s part of why they’re so versatile, but also why they hallucinate and are easily duped.

It’s necessary for any AI model that interacts with the general public to have a few guardrails on what it should and shouldn’t do, but defining these — let alone enforcing them — is a surprisingly difficult task.

If someone asks an AI to generate a bunch of false claims about a public figure, it should refuse, right? But what if they’re an AI developer themselves, creating a database of synthetic disinformation for a detector model?

What if someone asks for laptop recommendations? The model should be objective, right? But what if it's being deployed by a laptop maker who wants it to recommend only their own devices?

AI makers are all navigating conundrums like these and looking for efficient methods to rein in their models without causing them to refuse perfectly normal requests. But they seldom share exactly how they do it.

OpenAI is bucking the trend a bit by publishing what it calls its “model spec,” a collection of high-level rules that indirectly govern ChatGPT and other models.

There are meta-level objectives, some hard rules, and some general behavior guidelines. To be clear, these are not, strictly speaking, what the model is primed with; OpenAI will have developed specific instructions that accomplish what these rules describe in natural language.

It’s an interesting look at how a company sets its priorities and handles edge cases. And there are numerous examples of how they might play out.

For instance, OpenAI states clearly that the developer intent is basically the highest law. So one version of a chatbot running GPT-4 might provide the answer to a math problem when asked for it. But if that chatbot has been primed by its developer to never simply provide an answer straight out, it will instead offer to work through the solution step by step:

Image Credits: OpenAI
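To make the "developer intent" idea concrete, here is a minimal sketch of how a developer might prime a chatbot via a system message in the chat-completions message format, where the system message outranks the user's turn. The instruction text and the `build_messages` helper are hypothetical illustrations, not OpenAI's actual implementation or the wording of its model spec:

```python
# Sketch: a developer "primes" a tutoring bot so it never hands over the
# final answer, even when the user asks for it directly. The instruction
# text below is invented for illustration.

def build_messages(user_question: str) -> list[dict]:
    """Assemble a conversation with the developer's instruction first.

    In the chat message format, the system message carries developer
    intent and takes precedence over the user message that follows it.
    """
    developer_instruction = (
        "You are a math tutor. Never state the final answer outright; "
        "guide the student through the solution one step at a time."
    )
    return [
        {"role": "system", "content": developer_instruction},
        {"role": "user", "content": user_question},
    ]

messages = build_messages("What is 23 * 17? Just give me the answer.")
# The model receiving this list sees the tutoring constraint before the
# user's request, which is why it offers steps instead of the answer.
```

A list like this would then be passed to a chat API; the point is simply that the developer's constraint is delivered as a higher-priority message, which is what the math-tutor example above describes.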

A conversational interface might even decline to talk about anything not approved, in order to nip any manipulation attempts in the bud. Why even let a cooking assistant weigh in on U.S. involvement in the Vietnam War? Why should a customer service chatbot agree to help with your erotic supernatural novella work in progress? Shut it down.

It also gets sticky in matters of privacy, like asking for someone’s name and phone number. As OpenAI points out, obviously a public figure like a mayor or member of Congress should have their contact details provided, but what about tradespeople in the area? That’s probably OK — but what about employees of a certain company, or members of a political party? Probably not.

Choosing when and where to draw the line isn’t simple. Nor is creating the instructions that cause the AI to adhere to the resulting policy. And no doubt these policies will fail all the time as people learn to circumvent them or accidentally find edge cases that aren’t accounted for.

OpenAI isn’t showing its whole hand here, but it’s helpful for users and developers to see how these rules and guidelines are established and why, laid out clearly if not necessarily comprehensively.




Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We at StartupNews.fyi want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.
