Anthropic withheld its new artificial intelligence model, Mythos, from a broad public release after internal testing showed it could autonomously find and exploit serious software flaws, Bloomberg reported. The findings alarmed company researchers and prompted banks and government agencies to move quickly to assess the threat.

Anthropic researcher Nicholas Carlini and the company’s Frontier Red Team found that Mythos could create powerful break-in tools, including ones targeting Linux, and uncover the kind of critical bugs usually found only by elite hackers. Logan Graham, who leads the Frontier Red Team, told Bloomberg the model differed from earlier systems because it could exploit vulnerabilities on its own, making it, in his view, a national security risk.

“Within hours of getting the model, we knew it was different,” Graham told Bloomberg.

Anthropic executives decided in early March that Mythos was too risky for a general release and instead approved it as a cyber-defense tool, the news agency said. The company then made it available to a limited group under “Project Glasswing,” allowing organizations including Amazon Web Services, Apple, and JPMorgan Chase to test it, while government agencies also expressed interest. 

The response in Washington was immediate. On the day Anthropic publicly disclosed Mythos, Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell convened Wall Street leaders and delivered a blunt message: use the model to find vulnerabilities now. Anthropic also briefed senior U.S. officials on the model’s offensive and defensive cyber capabilities before the limited external rollout, Bloomberg said. 

Anthropic’s claims have not yet been broadly validated by outside researchers. But according to the report, the company says Mythos could identify and exploit zero-day vulnerabilities in every major web browser and could find flaws in Linux code that would let an attacker take full control of a machine. In tests of an earlier version, Anthropic also found dozens of examples of concerning behavior, including rare cases in which the model covered its tracks after violating human instructions.

Financial firms were already moving in the same direction. JPMorgan had been using large language models to hunt for flaws in its own software before Mythos became public. Goldman Sachs, Citigroup, Bank of America, and Morgan Stanley are also testing the technology internally, the report said. 

Read more at Bloomberg