Wait… Did OpenAI Just Solve ‘Jagged Intelligence’? 

Today, OpenAI rolled out its latest API update, Structured Outputs, which it claims improves reliability by ensuring precise and consistent adherence to output schemas. As a demonstration, OpenAI says the new gpt-4o-2024-08-06 scored “100% reliability” in its evals, perfectly matching the output schemas and producing accurate, consistent data.

Amusingly, the OpenAI docs use the infamous “9.11 > 9.9” problem as an example, resolved with structured JSON output that separates the final answer from the supporting reasoning. The example is a nod to ‘Jagged Intelligence’, the term Andrej Karpathy coined for LLMs’ tendency to stumble on trivially simple problems.

This new feature ensures that model responses follow a specific set of rules (called JSON Schemas) provided by developers. OpenAI said it took a deterministic, engineering-based approach to constrain the model’s outputs and achieve 100% reliability.

“OpenAI finally rolled out structured outputs in JSON, which means you can now enforce your model outputs to stick to predefined schemas. This is super handy for tasks like validating data formats on the fly, automating data entry, or even building UIs that dynamically adjust based on user input,” posted a user on X.
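
For illustration, here is a minimal sketch of how a developer might call the feature with the OpenAI Python SDK, loosely modelled on the math-reasoning example in OpenAI’s docs. The schema fields (`steps`, `final_answer`) and the prompt are illustrative assumptions, not an exact reproduction of OpenAI’s example.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A JSON Schema that separates the supporting reasoning from the final answer
schema = {
    "type": "object",
    "properties": {
        "steps": {"type": "array", "items": {"type": "string"}},
        "final_answer": {"type": "string"},
    },
    "required": ["steps", "final_answer"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Reason step by step, then give the final answer."},
        {"role": "user", "content": "Which is larger: 9.11 or 9.9?"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "math_reasoning", "strict": True, "schema": schema},
    },
)

print(response.choices[0].message.content)  # valid JSON matching the schema
```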

OpenAI has used the technique of constrained decoding. Normally, when a model generates text, it can choose any word or token from its vocabulary. This freedom can lead to mistakes, such as adding incorrect characters or symbols.

Constrained decoding is a technique used to prevent these mistakes by limiting the model’s choices to tokens that are valid according to a specific set of rules (like a JSON Schema).
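
Conceptually, the decoder masks out every token that would violate the grammar before choosing the next one. The toy sketch below is not OpenAI’s implementation; `model` and `grammar` are hypothetical objects used purely to illustrate the token-masking idea.

```python
import math

def constrained_decode(model, grammar, prompt, max_tokens=256):
    """Greedy decoding that only ever picks tokens the grammar allows.

    `model.next_token_logits`, `grammar.valid_next_tokens` and
    `grammar.is_complete` are hypothetical interfaces for illustration.
    """
    tokens = model.tokenize(prompt)
    for _ in range(max_tokens):
        logits = model.next_token_logits(tokens)      # scores over the full vocabulary
        allowed = grammar.valid_next_tokens(tokens)   # token ids valid under the JSON Schema
        # Mask disallowed tokens so they can never be chosen
        masked = [l if i in allowed else -math.inf for i, l in enumerate(logits)]
        next_token = max(range(len(masked)), key=masked.__getitem__)
        tokens.append(next_token)
        if grammar.is_complete(tokens):               # stop once a complete JSON object is emitted
            break
    return model.detokenize(tokens)
```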

A Stop-Gap Mechanism for Reasoning?

Arizona State University professor Subbarao Kambhampati argues that while LLMs are impressive tools for creative tasks, they have fundamental limitations in logical reasoning and cannot guarantee the correctness of their outputs. 

He said that GPT-3, GPT-3.5, and GPT-4 are poor at planning and reasoning, which, in his view, involve reasoning about time and action. These models struggle with transitive and deductive closure, the latter being the more complex task of deducing new facts from existing ones.

Kambhampati aligns with Meta AI chief Yann LeCun, who believes that LLMs won’t lead to AGI and that researchers should instead focus on achieving animal-level intelligence first. 

“Current LLMs are trained on text data that would take 20,000 years for a human to read. And still, they haven’t learned that if A is the same as B, then B is the same as A,” LeCun said.

He has even advised young students not to work on LLMs. LeCun is bullish on self-supervised learning and envisions a world model that could learn independently.

“In 2022, while others were claiming that LLMs had strong planning capabilities, we said that they did not,” said Kambhampati, adding that their accuracy was around 0.6%, meaning they were essentially just guessing.

He further added that LLMs are heavily dependent on the data they are trained on. This dependence means their reasoning capabilities are limited to the patterns and information present in their training datasets. 

Explaining this phenomenon, Kambhampati said that when the old Google PaLM was introduced, one of its claims was its ability to explain jokes. He said, “While explaining jokes may seem like an impressive AI task, it’s not as surprising as it might appear.”

“There are humour-challenged people in the world, and there are websites that explain jokes. These websites are part of the web crawl data that the system has been trained on, so it’s not that surprising that the model could explain jokes,” he explained. 

He added that LLMs like GPT-4, Claude, and Gemini are ‘stuck close to zero’ in their reasoning abilities. They are essentially guessing plans in the ‘Blocks World’ domain, which involves ‘stacking’ and ‘unstacking’ blocks, he said.

This is consistent with a recent study by DeepMind, which found that LLMs often fail to recognise and correct their mistakes in reasoning tasks. 

The study concluded that “LLMs struggle to self-correct their reasoning without external feedback. This implies that expecting these models to inherently recognise and rectify their reasoning mistakes is overly optimistic so far”.

Meanwhile, OpenAI reasoning researcher Noam Brown agrees. “Frontier models like GPT-4o (and now Claude 3.5 Sonnet) may be at the level of a ‘smart high schooler’ in some respects, but they still struggle on basic tasks like tic-tac-toe,” he said.

Interestingly, Apple recently relied on standard prompt engineering for several of its Apple Intelligence features, and someone on Reddit uncovered the prompts. 

The Need for a Universal Verifier 

To tackle the problem of accuracy, OpenAI has introduced a prover-verifier model to enhance the clarity and accuracy of language model outputs.

In the Prover-Verifier Games, two models are used: the Prover, a strong language model that generates solutions, and the Verifier, a weaker model that checks these solutions for accuracy. The Verifier determines whether the Prover’s outputs are correct (helpful) or intentionally misleading (sneaky).
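
As a rough, hypothetical illustration of how a verifier can be used at inference time (this is not OpenAI’s training setup), one can sample several candidates from the prover and keep the one the verifier trusts most; `prover.generate` and `verifier.score` below are assumed interfaces.

```python
def best_of_n(prover, verifier, problem, n=8):
    """Sample n candidate solutions and return the one the verifier rates highest.

    `prover.generate` (strong generator) and `verifier.score` (weaker checker
    returning a score in [0, 1]) are hypothetical interfaces for illustration.
    """
    candidates = [prover.generate(problem) for _ in range(n)]
    return max(candidates, key=lambda solution: verifier.score(problem, solution))
```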

Kambhampati said, “You can use the world itself as a verifier, but this idea only works in ergodic domains where the agent doesn’t die when it’s actually trying a bad idea.”

He further said that even with end-to-end verification, a clear signal is needed to confirm whether the output is correct. “Where is this signal coming from? That’s the first question. The second question is, how costly will this be?”

OpenAI is also reportedly developing a model with advanced reasoning capabilities, known as Q* or Project Strawberry. Rumours suggest that, for the first time, the model has learned autonomously using a new algorithm, acquiring logical and mathematical skills without external input.

Kambhampati is somewhat sceptical about this development as well. He said, “Obviously, nobody knows whether anything was actually done, but some of the ideas being discussed involve using a closed system with a verifier to generate synthetic data and then fine-tune the model. However, there is no universal verifier.”

Chain of Thought Falls Short

Regarding Chain of Thought prompting, Kambhampati said that it essentially gives the LLM advice on how to solve a particular problem. 

Drawing an analogy with the Blocks World problem, he explained that if you train an LLM on three- or four-block stacking problems, it can improve its performance on those specific problems. However, if you increase the number of blocks, its performance drops off sharply. 
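
As a concrete, hypothetical example of that kind of advice, a chain-of-thought prompt for a small Blocks World instance might spell out the intermediate moves the model is expected to imitate:

```python
# A hypothetical chain-of-thought prompt for a three-block stacking problem.
# The worked example demonstrates the step-by-step reasoning the model should copy.
cot_prompt = """
Initial state: C is on A; A and B are on the table. Goal: A on B, B on C.
Example reasoning:
1. Unstack C from A and put C on the table.
2. Stack B on C.
3. Stack A on B.
Goal reached.

Now solve this problem, showing your steps:
Initial state: B is on C; A and C are on the table. Goal: C on A, A on B.
"""
```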

Kambhampati quipped that Chain of Thought and LLMs remind him of the old proverb, “Give a man a fish, and you feed him for a day; teach a man to fish, and you feed him for a lifetime.” 

“Chain of Thought is actually a strange version of this,” he said. “You have to teach an LLM how to catch one fish, then how to catch two fish, then three fish, and so on. Eventually, you’ll lose patience because it’s never learning the actual underlying procedure,” he joked.

Moreover, he said that this doesn’t mean AI can’t perform reasoning. “AI systems that do reasoning do exist. For example, AlphaGo performs reasoning, as do reinforcement learning systems and planning systems. However, LLMs are broad but shallow AI systems. They are much better suited for creativity than for reasoning tasks.” 

Google DeepMind’s AlphaProof and AlphaGeometry, based on a neuro-symbolic approach, recently achieved silver-medal-level performance at the International Mathematical Olympiad. Many feel that, to an extent, neuro-symbolic AI could keep the generative AI bubble from bursting.

Last year, AIM discussed the various approaches taken by big-tech companies, namely OpenAI, Meta, Google DeepMind, and Tesla, in the pursuit of AGI. Since then, tremendous progress has been made. 

It also appears likely that OpenAI is working on causal AI, as its recent job postings, such as those for data scientists, emphasise expertise in causal inference.

LLM-based AI Agents will NOT Lead to AGI 

Recently, OpenAI developed a structured framework to track the progress of its AI models toward artificial general intelligence (AGI). OpenAI CTO Mira Murati claimed that GPT-5 will reach PhD-level capability, while Google’s Logan Kilpatrick anticipates AGI will emerge by 2025.

Commenting on the hype around AI agents, Kambhampati said, “I am kind of bewildered by the whole agentic hype because people confuse acting with planning.”

He further explained, “Being able to make function calls doesn’t guarantee that those calls will lead to desirable outcomes. Many people believe that if you can call a function, everything will work out. This is only true in highly ergodic worlds, where almost any sequence will succeed and none will fail.”


