🔬

AI Research

Latest research and breakthroughs


Terence Tao - (one of the world's greatest mathematicians)

🔔 An AI-assisted breakthrough, this time from Terence Tao, one of the world's greatest mathematicians. He worked with an AI to tackle a MathOverflow question about whether the least common multiple sequence is a subset of a certain set. His theory suggested the answer was no, but he needed specific numerical parameters to build a counterexample. He first asked the AI for Python code to search for one, but the code ran into runtime issues and made poor parameter choices that complicated the search. He then switched to an interactive, step-by-step approach, letting the AI carry out the calculations needed to find viable parameters. This produced workable values, which he double-checked with a short, easy-to-audit Python script that the AI helped generate. Along the way the AI caught several of his math mistakes, and what could have been hours of coding and debugging became a much quicker process.
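To give a feel for what a short, easy-to-audit check of this kind can look like, here is a minimal Python sketch. It is not Tao's script: the post does not name the set in question, so `in_set` is a hypothetical placeholder predicate, and the helper names `lcm_sequence` and `first_counterexample` are invented for illustration. A brute-force loop like this only works for small values, which fits the post's point that the naive search ran into runtime issues.

```python
from math import lcm  # available in Python 3.9+
from typing import Callable, Iterator, Optional


def lcm_sequence(limit: int) -> Iterator[tuple[int, int]]:
    """Yield (n, lcm(1, ..., n)) for n = 1..limit."""
    current = 1
    for n in range(1, limit + 1):
        current = lcm(current, n)
        yield n, current


def first_counterexample(
    in_set: Callable[[int], bool], limit: int
) -> Optional[tuple[int, int]]:
    """Return the first (n, L_n) whose L_n is outside the set, else None.

    `in_set` is a hypothetical membership test standing in for the
    (unnamed) set from the MathOverflow question.
    """
    for n, value in lcm_sequence(limit):
        if not in_set(value):
            return n, value
    return None


if __name__ == "__main__":
    # Toy predicate for demonstration only: "numbers below 1000".
    print(first_counterexample(lambda x: x < 1000, 20))
```

With the toy predicate above, the search stops at n = 9, where the least common multiple first reaches 2520. Swapping in a real membership test is the only change needed, though for a hard problem the candidate values would have to be chosen by theory rather than enumerated.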

AI
Admin

OpenAI has launched

OpenAI has launched a new feature called Instant Checkout, which lets users buy products directly in ChatGPT. It is built on the open-source Agentic Commerce Protocol (ACP), created in collaboration with Stripe. For now, the feature is available only in the United States for purchases on Etsy, with Shopify support planned soon.

AI
Admin

Anthropic Launches Claude Sonnet 4.5

Anthropic Launches Claude Sonnet 4.5: The New Gold Standard for Autonomous AI Agents

Anthropic has just released Claude Sonnet 4.5, setting a new benchmark for practical AI in coding, agent autonomy, and real-world computer use. The model reached 77.2% accuracy on SWE-bench Verified, outperforming both GPT-5 Codex (74.5%) and GPT-5 (72.8%), and posted a 45% leap on the OSWorld computer task benchmark, where it now scores 61.4%. Sonnet 4.5 can run autonomously for 30+ hours without losing the thread, well beyond the previous Opus 4 limit of around 7 hours. Its “extended thinking mode” delivers industry-leading performance on long, complex workflows in coding, finance, law, medicine, and STEM research. On complex finance benchmarks, the model reached up to 72% accuracy. Support for 1 million tokens of context, currently in beta, means users can feed it book-length documents or sprawling datasets in a single session. Integration is seamless: Claud

AI
Admin