Overview
Simon Willison compares two major AI model releases using his informal "pelican riding a bicycle" drawing benchmark. Qwen3.6-35B-A3B, running locally on a laptop, outperformed the cloud-based Claude Opus 4.7 in generating coherent SVG illustrations of animals on vehicles.
Key Facts
- Qwen3.6-35B-A3B runs locally as a 20.9GB quantized model on a MacBook Pro M5 - it produced its SVG illustrations without depending on an internet connection
- Claude Opus 4.7 struggled to draw a basic bicycle frame across multiple attempts - suggests that even premium cloud models can fail at spatial reasoning tasks
- The Qwen model added creative flourishes such as sunglasses and a bowtie to its flamingo illustration - suggests it can embellish a drawing beyond literal prompt following
- The local model outperformed the premium cloud model on these SVG drawing prompts - challenges the assumption that cloud models are always superior
Why It Matters
This informal comparison suggests that locally run models are becoming competitive with premium cloud services, at least on this kind of drawing task, which could shift more creative work toward on-device AI.