LLMs show a “highly unreliable” capacity to describe their own internal processes

Anthropic finds some LLM “self-awareness,” but “failures of introspection remain the norm.”

If you ask an LLM to explain its own reasoning process, it may simply confabulate a plausible-sounding explanation for its actions based on text found in its training data. To get around this problem, Anthropic is expanding on its previous research into AI interpretability with a new study that aims to measure LLMs’ actual “introspective awareness” of their own inference processes.

The full paper on “Emergent Introspective Awareness in Large Language Models” uses some interesting methods to separate out the metaphorical “thought process” represented by an LLM’s artificial neurons from simple text output that purports to represent that process. In the end, though, the research finds that current AI models are “highly unreliable” at describing their own inner workings and that “failures of introspection remain the norm.”

Inception, but for AI

Anthropic’s new research is centered on a process it calls “concept injection.” The method starts by comparing the model’s internal activation states following both a control prompt and an experimental prompt (e.g. an “ALL CAPS” prompt versus the same prompt in lower case). Calculating the differences between those activations across billions of internal neurons creates what Anthropic calls a “vector” that in some sense represents how that concept is modeled in the LLM’s internal state.
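
The difference calculation described above can be sketched in a few lines of Python. This is a toy illustration only, not Anthropic's implementation: the function name `concept_vector` and the four-element activation lists are hypothetical stand-ins for activations that, in a real model, would be read from its internal layers.

```python
from typing import List


def concept_vector(experimental: List[float], control: List[float]) -> List[float]:
    """Return the element-wise difference between two activation lists.

    Hypothetical helper: each list stands in for the model's internal
    activations after one prompt, one value per "neuron".
    """
    if len(experimental) != len(control):
        raise ValueError("activation lists must have the same length")
    return [e - c for e, c in zip(experimental, control)]


# Made-up activations for an "ALL CAPS" prompt and its lowercase control.
caps_activations = [1.0, 0.5, 0.25, 2.0]
lower_activations = [0.5, 0.5, 0.75, 1.0]

# Positions where the two prompts differ carry the nonzero entries.
vector = concept_vector(caps_activations, lower_activations)
print(vector)
```

Entries near zero mark neurons that respond the same way to both prompts, so only the positions that change with capitalization contribute to the resulting vector.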

Trump on why he pardoned Binance CEO: “Are you ready? I don’t know who he is.”

Trump family business could benefit from pardon of crypto ex-con Changpeng Zhao.

President Trump says he still doesn’t know who Binance founder and former CEO Changpeng Zhao is, despite having pardoned Zhao last month.

CBS correspondent Norah O’Donnell asked Trump about the pardon in a 60 Minutes interview that aired yesterday, noting that Zhao pleaded guilty to violating anti-money laundering laws. “The government at the time said that C.Z. had caused ‘significant harm to US national security,’ essentially by allowing terrorist groups like Hamas to move millions of dollars around. Why did you pardon him?” O’Donnell asked.

“Okay, are you ready? I don’t know who he is. I know he got a four-month sentence or something like that. And I heard it was a Biden witch hunt,” answered Trump, who has criticized his predecessor for signing pardons with an autopen.

Google removes Gemma models from AI Studio after GOP senator’s complaint

Sen. Marsha Blackburn says Gemma concocted sexual misconduct allegations against her.

You may be disappointed if you go looking for Google’s open Gemma AI model in AI Studio today. Google announced late on Friday that it was pulling Gemma from the platform, but it was vague about the reasoning. The abrupt change appears to be tied to a letter from Sen. Marsha Blackburn (R-Tenn.), who claims the Gemma model generated false accusations of sexual misconduct against her.

Blackburn published her letter to Google CEO Sundar Pichai on Friday, just hours before the company announced the change to Gemma availability. She demanded Google explain how the model could fail in this way, tying the situation to ongoing hearings that accuse Google and others of creating bots that defame conservatives.

At the hearing, Google’s Markham Erickson explained that AI hallucinations are a widespread and known issue in generative AI, and Google does the best it can to mitigate the impact of such mistakes. Although no AI firm has managed to eliminate hallucinations, Google’s Gemini for Home has been particularly hallucination-happy in our testing.

Dasung’s new tablets have E Ink displays with 50 Hz refresh rate

Most of the companies that use E Ink displays for consumer electronics put them into devices made for reading eBooks. But Chinese company Dasung has carved out a niche for itself by specializing in E Ink monitors, including some models with special features like the ability to push frame rates that rival what you’d expect […]

The post Dasung’s new tablets have E Ink displays with 50 Hz refresh rate appeared first on Liliputing.

OpenAI signs massive AI compute deal with Amazon

Deal will provide access to hundreds of thousands of Nvidia chips that power ChatGPT.

On Monday, OpenAI announced it has signed a seven-year, $38 billion deal to buy cloud services from Amazon Web Services to power products like ChatGPT and Sora. It’s the company’s first big computing deal after a fundamental restructuring last week that gave OpenAI more operational and financial freedom from Microsoft.

The agreement gives OpenAI access to hundreds of thousands of Nvidia graphics processors to train and run its AI models. “Scaling frontier AI requires massive, reliable compute,” OpenAI CEO Sam Altman said in a statement. “Our partnership with AWS strengthens the broad compute ecosystem that will power this next era and bring advanced AI to everyone.”

OpenAI will reportedly use Amazon Web Services immediately, with all planned capacity set to come online by the end of 2026 and room to expand further in 2027 and beyond. Amazon plans to roll out hundreds of thousands of chips, including Nvidia’s GB200 and GB300 AI accelerators, in data clusters built to power ChatGPT’s responses, generate AI videos, and train OpenAI’s next wave of models.

Capitol Hill is abuzz with talk of the “Athena” plan for NASA

The Athena plan lays out a blueprint for Isaacman’s tenure at NASA.

In recent weeks, copies of an intriguing policy document have started to spread among space lobbyists on Capitol Hill in Washington, DC. The document bears the title “Athena,” and it purports to summarize the actions that private astronaut Jared Isaacman would have taken, were his nomination to become NASA administrator confirmed.

The 62-page plan is notable both for its ideas to remake NASA and for the manner in which it has been leaked to the space community.

After receiving a copy of this plan from an industry official, I spoke with multiple sources over the weekend to understand what is happening. Based on this reporting, there are clearly multiple layers to the story, which I want to unpack.
