🔬AI Research
Anthropic Launches Claude Sonnet 4.5
Admin
Anthropic Launches Claude Sonnet 4.5: The New Gold Standard for Autonomous AI Agents
Anthropic has released Claude Sonnet 4.5, setting a new benchmark for practical AI in coding, agent autonomy, and real-world computer use. The model scored 77.2% on SWE-bench Verified, ahead of both GPT-5 Codex (74.5%) and GPT-5 (72.8%). It also reached 61.4% on the OSWorld computer-use benchmark, a gain of about 45% in relative terms.
Sonnet 4.5 can sustain more than 30 hours of continuous autonomous operation without losing the thread, well beyond the previous Opus 4 limit of around 7 hours. Its “extended thinking mode” delivers industry-leading performance on long, complex workflows in coding, finance, law, medicine, and STEM research. On complex finance benchmarks, the model reached up to 72% accuracy. A 1-million-token context window, available in beta, lets users feed it book-length documents or sprawling datasets in a single session.
Integration is seamless: Claud