Overview
Google is internally testing a new AI model codenamed “Snow Bunny” that could be Gemini 3.5 or the general-availability (GA) version of Gemini 3 Pro. In the demos shown, the model generates complete operating system clones, functional web applications, and complex interfaces that are nearly indistinguishable from real software.
Key Takeaways
- AI coding capabilities are reaching a point where generated software becomes indistinguishable from human-created applications - complete operating system clones with functional browsers and apps can be created from a single prompt
- The quality gap between AI models is widening rapidly - newer internal models significantly outperform even recent frontier models in complex coding tasks and creative problem-solving
- Detailed prompting is becoming crucial for optimal results - the specificity of your instructions directly correlates with output quality when generating complex interfaces and applications
- AI is transitioning from simple code generation to complete system architecture - models can now handle structure, styling, and interactivity simultaneously rather than requiring separate steps for each component
- The benchmark performance improvements suggest lateral reasoning capabilities are advancing faster than traditional metrics indicate - practical coding tests reveal larger capability gaps than standard evaluations
Topics Covered
- 0:00 - Introduction to Google’s Internal Model Testing: Overview of Google testing a new LLM internally, possibly Gemini 3.5 or the GA version of Gemini 3 Pro
- 0:30 - Gemini 3 Pro Baseline Performance: Demonstration of current Gemini 3 Pro creating a browser-based operating system with functional features
- 1:00 - Snow Bunny’s macOS Recreation: New internal model generates a complete macOS clone with functional Safari, apps, and precise UI components
- 2:00 - Snow Bunny Model Overview: Introduction to the “Snow Bunny” codename and its advanced coding capabilities compared to other models
- 3:00 - Sponsor Segment: PostHog product analytics and observability platform demonstration
- 4:00 - Benchmark Comparisons: Leo’s Hieroglyph benchmark showing Snow Bunny outperforming other frontier models, including GPT and Claude
- 5:00 - Candle Animation Test: Comparison of different AI models creating realistic candle flame animations with melting effects
- 6:30 - Frontend Generation Capabilities: Snow Bunny creating complete frontend layouts and landing pages with detailed prompting
- 7:30 - 3D and Game Development: Examples of a voxel Eiffel Tower and a Game Boy Color emulator with built-in mini games
- 9:00 - Access and Future Implications: Potential availability through Google AI Studio and discussion of AI-generated software becoming indistinguishable from human-created applications