News
Malicious use is one thing, but there's also increased potential for Anthropic's new models to go rogue. In the alignment section of Claude 4's system card, Anthropic reported a sinister discovery ...
The testing found the AI was capable of "extreme actions" if it thought its "self-preservation" was threatened.