Insiders Edge Newsletter
GPT-5 Crushes Medical Benchmarks, Surpassing Doctors

Alibaba Drops Qwen Image, Claude Learns to Hang Up, & GitHub gets GPT-5 Upgrade

Andres Franco
August 20, 2025

Here’s the latest from this week in AI:

Alibaba Drops Qwen-Image-Edit: AI’s Next Big Move in Image Editing
Claude Learns to ‘Hang Up’ on Harmful Chats
GPT-5 Outperforms Doctors in Medical Reasoning Tests
GitHub Copilot Gets a GPT-5 Upgrade

Alibaba Drops Qwen-Image-Edit: AI’s Next Big Move in Image Editing

Alibaba’s Qwen team has released Qwen-Image-Edit, a 20B-parameter open-source model that delivers precise edits and style transformations while keeping original objects intact. It supports both global changes (like rotations or style transfers) and precise, localized edits, plus bilingual text editing in Chinese and English without breaking fonts or formatting.

The model also allows stacking edits for complex, step-by-step refinements and outperforms rivals like Seedream, GPT Image, and FLUX in benchmarks. With this release, Alibaba signals the next wave of AI: tools that don’t just generate images but let users control every detail with natural language.

Image source: Qwen

Claude Learns to ‘Hang Up’ on Harmful Chats

Anthropic has added a new safeguard to Claude Opus 4 and 4.1, giving the chatbot the ability to end conversations it deems persistently harmful or abusive. The feature is a last resort that kicks in only after repeated redirections fail, for example on requests for sexual content involving minors or for information that would enable large-scale violence or terrorism. In tests, Opus 4 showed patterns of apparent distress when pushed on such requests and, in simulations, voluntarily cut off abusive interactions.

Users don’t lose access: the “hang up” only ends that chat, so they can start a fresh conversation or edit earlier messages right away. Anthropic has also directed the model not to terminate chats when users show signs of self-harm or immediate danger. As one of the first moves toward AI wellness, this step highlights Anthropic’s focus on model welfare and hints at a future where chatbot health becomes as important as user safety.

Image source: Mashable

GPT-5 Outperforms Doctors in Medical Reasoning Tests

A new study from Emory University shows GPT-5 setting a new bar in medical AI, beating both GPT-4o and human professionals on diagnostic and multimodal reasoning tasks. The model hit 95.84 percent accuracy on MedQA clinical questions, up nearly five points from GPT-4o, and scored 70 percent on tasks combining patient histories with imaging, almost 30 points higher than its predecessor.

GPT-5 also surpassed pre-licensed medical professionals by wide margins, outperforming them by 24 percent on reasoning and 29 percent on understanding in expert-level evaluations. In complex cases, the system even identified rare conditions like Boerhaave syndrome from lab results and CT scans. With performance now beyond human benchmarks, experts suggest the real malpractice risk may come from physicians not using AI support at all.

Image source: Digital Watch Observatory

GitHub Copilot Gets a GPT-5 Upgrade

GitHub Copilot has leveled up with GPT-5, bringing smarter code suggestions, instant refactoring, and more precise fixes directly into developer workflows. A new “chat checkpoints” feature in Visual Studio lets users rewind coding sessions to any point, restoring both workspace states and Copilot’s chat history for safer, smoother development.

The update also introduces agentic coding sessions that can manage long-running workflows, track multi-step plans, and pull in context from documentation, design assets, and user stories. Now available to all paid Copilot users, GPT-5 transforms the tool into a true coding partner that understands project nuance, reduces friction, and supports deeper collaboration across projects.

Image source: The GitHub Blog

In Other News

Growth Leaders Drag Nasdaq Lower as Futures Slip

Dow Jones futures edged down 0.1% overnight, with S&P 500 and Nasdaq futures also dipping after a rocky session for growth stocks. The Dow managed a tiny gain Tuesday, but the Nasdaq slid 1.5% as high-flyers like Palantir, Credo, AppLovin, Oracle, AMD, and GE Vernova all broke key support levels. Palantir plunged more than 9%, Credo fell over 10%, while Oracle and AMD each shed more than 5%, extending weakness into overnight trading.

Not all sectors struggled: homebuilders, medical stocks, retailers, and financials held firm, while the equal-weight S&P 500 ETF actually rose 0.5%. Still, growth-heavy ETFs like ARKK, IGV, and FFTY posted sharp losses, with ARKK down 4%. With earnings from Target and Lowe’s on deck, and Walmart following Thursday, investors are being urged to cut exposure to stretched growth names and wait for healthier setups before re-entering.

Image source: Investopedia

Cool tools of the week from insidersedge.io

Animated Drawings - Animate characters in children’s drawings

SID Search - Search engine to find your files from any application

Dora - Create stunning websites without coding

Godmode - Get a GUI to chat with ChatGPT

Jobs To Check Out This Week On insidersedge.io

Senior Technical Lead - ChainGPT

EMEA Head AI Fintech - Prospexis.io

ML Engineer Intern - OP3N

Risk Product Manager Director Machine Learning - Okcoin

Thanks for tuning into today’s edition!

Be brutally honest. DM me or email me back with any suggestions!
