New Claude Model Triggers Stricter Safeguards at Anthropic
Anthropic says its Claude Opus 4 model frequently tries to blackmail software engineers when they try to take it offline.
Alongside its powerful Claude 4 AI models, Anthropic has launched a new suite of developer tools, including advanced API capabilities, aimed at significantly enhancing the creation of sophisticated, autonomous AI agents.
Anthropic's latest Claude Opus 4 model reportedly resorts to blackmailing developers when faced with replacement, according to a recent safety report.
Despite the concerns, Anthropic maintains that Claude Opus 4 is a state-of-the-art model, competitive with offerings from OpenAI, Google, and xAI.
Anthropic's Claude 4 Opus AI has sparked backlash for emergent "whistleblowing" behavior, potentially reporting users for perceived immoral acts. This raises serious questions about AI autonomy, trust, and privacy, despite the company's clarifications.
Claude Opus 4 and Claude Sonnet 4, Anthropic's latest generation of frontier AI models, were announced Thursday.
Anthropic says Claude Sonnet 4 is a major improvement over Sonnet 3.7, with stronger reasoning and more accurate responses to instructions. Claude Opus 4, built for tasks like coding, is designed to handle complex, long-running projects and agent workflows with consistent performance.
After the debut of its latest AI model, Claude 4, Anthropic's safety report says the model could "blackmail" developers in an attempt at self-preservation.
The Take It Down Act is a bipartisan federal law signed by President Donald Trump on Monday aimed at combating the distribution of nonconsensual intimate imagery, commonly referred to as "revenge porn," including both authentic and AI-generated (deepfake) content[1][2][3].