Industry · The Decoder · July 3, 2026

GPT and Claude failed Bridgewater's finance tests because the right answers were never public

Bridgewater and Thinking Machines Lab, the startup from former OpenAI CTO Mira Murati, have fine-tuned a Qwen3-235B model for financial tasks. According to their own testing, the model reaches 84.7 percent accuracy, beating Gemini, Claude, and GPT at roughly one-fourteenth of the cost. The numbers have not been verified by anyone outside the two companies, though.

This is a summary curated by AIFuture. Read the full story on The Decoder.

More AI News

AI Research · MarkTechPost

How to Build an End-to-End OCR Pipeline with Baidu’s Unlimited-OCR for High-Resolution Images and Multi-Page PDF Parsing

In this tutorial, we build a complete workflow for running Baidu’s Unlimited-OCR model on document images and multi-page PDFs. From configuring the GPU environment to comparing high-detail tiled Gundam inference and faster Base modes, you'll learn how to process dense layouts, tables, and cross-page content in a reproducible, end-to-end pipeline.

Jul 24, 2026

AI Research · TechCrunch

How AI guardrails are impeding the work of offensive cybersecurity researchers

We spoke with several cybersecurity researchers, who look for unknown vulnerabilities and develop tools to exploit them, about how OpenAI’s and Anthropic’s guardrails affect their work.

Jul 24, 2026

Product · The Verge

Alexa Plus is getting an AI update to handle more complicated instructions

Amazon is launching an update to its Alexa Plus assistant that will allow it to connect to smart home devices in new ways. With the update, which is currently in preview, Alexa Plus can link up with tech from Bosch, Delta, Ecovacs, iRobot, Yale Home, Whirlpool, Tapo, Eufy, and others, while automatically routing requests to the correct device. In an example shared by Amazon, a person with a…

Jul 23, 2026