Google’s new Gemini 3 Pro is a great model

November 18, 2025 / Max Weinbach

Today, Google announced Gemini 3 as well as a bunch of new product updates powered by Gemini 3 Pro! We’ll get into those in a moment, because I want to talk about my experience using Gemini 3 Pro, codename riftrunner, as well as Google’s new developer IDE called Antigravity!

I got early access from Google (Thank you, Google) to Gemini 3 Pro and Antigravity on the 13th, so roughly five or so days of usage. While it’s not the longest period of time to test out a new platform, I did want to mention a few of my initial thoughts. 

Antigravity is one of the best IDEs I’ve used. It was created by the Windsurf team, who joined Google a few months ago, and is essentially that platform with the maturity of Google’s ecosystem. It combines all of Google’s models (Nano Banana, Gemini 3 Pro, and Gemini 2.5 Computer Use) with other models in the extended Google Cloud ecosystem, like Claude Sonnet 4.5!

This is Google’s agent-first development experience. The idea is that you send the agent off with a few tasks in the morning, it works, and when you come back it has done 80% of the work, leaving you to finish up. While it can handle the entire stack and complete everything, that takes some time, just like on all other vibe coding platforms. What I did notice is that Gemini 3 Pro was able to get to that 80% mark, while other coding agents like Claude Code or Codex get 50-60% of the way there and then require a bunch of follow-up to reach the same point.

My favorite anecdotal example is an iOS app I’ve been working on, just something basic that tracks multi-week training programs. I decided a fun way to test Antigravity would be to have it try to create a watchOS app connecting to the iOS app. 

Antigravity did it in one shot. It got the data to sync, control over the app from the watch, and heart rate data with HealthKit, and it matched the design language of the original iOS app. Just to see if this was something any other model or agent could do, I asked all of them! None could do it, apart from Gemini CLI with Gemini 3 Pro.

This is an absurdly good model.

One of the most interesting parts of this is the agent mode for web development. The Antigravity agent, while working, can actually open a Chrome browser (same Chrome as normal) and test out the website or platform it’s building. Simple and cool to test its own work, right? But it goes beyond that. If you have Antigravity’s agent implement APIs it doesn’t know or add services that need API keys, the agent can control your browser to go and search for this information and work it into the product. It also learns as you go.

In terms of research or general chat, I also had Gemini 3 Pro in Antigravity build me a chat wrapper for Gemini 3 Pro! When I asked it a few questions about AI capex, my go-to for testing new models, it was smart enough to point out and look into things no other model did. For example, when talking about hyperscaler capex, it noted that Amazon has the largest capex of any hyperscaler, but if you remove logistics and warehouses, Microsoft has the highest for AI data centers! This is a nuance I haven’t really seen from any other model, which was a little surprising to me!

I didn’t have a chance to use Gemini 3 Pro in the Gemini app or AI Search, since unfortunately I’m not as close with those teams, but I think the API and testing Gemini in agents is one of the most interesting and informative signals for broad adoption.

In terms of a model powering an agent, I do not think there is a better model. It has more nuance than GPT-5.1 Thinking (high), more drive to search than Claude Sonnet 4.5 Thinking, and otherwise about matches the best of them. When you mix in Google’s Vertex AI data privacy and Google’s sheer scale, I can see Gemini 3 Pro becoming the default engine for enterprise and consumer agents. If I were an enterprise and had to choose a model today to build an agent, it would be Gemini 3 Pro.

This is related but a little different: Google’s Vertex AI platform for enterprise AI is becoming more compelling. Each model provider has its own API for access; xAI, Anthropic, Google, OpenAI, etc. all have their own way to serve models. What’s interesting is that Google and Azure have the more compelling offerings.

Both Google’s Vertex AI and Azure AI Foundry host your classic open source models: GPT-OSS, DeepSeek, Kimi, MiniMax, Llama, etc. What makes Azure special is that it also hosts OpenAI’s models, xAI’s Grok models, and soon Anthropic’s Claude models. One Azure account gets you all of the open models, plus OpenAI, Claude, and Grok. Google, on the other hand, gets you all of the open models as well as Gemini and Claude.

Now the reality is that this isn’t really lock-in to any provider. Hosting your data in the same place as your inference is easier, because it simplifies billing and management, but it’s not really lock-in. Each of these providers is really an abstraction over the models, offered by the cloud provider you choose. You can, with a single line of code change, swap between any of the models at each provider. If you prefer Anthropic and Gemini models, GCP is a better bet. If you prefer Grok and OpenAI models, Azure is a better bet. You can obviously have both and this isn’t really changing anything, but as I said, it does make billing and management easier. Read into this as you will, but I think it’s important color when looking at cloud hosts.
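To make the "single line of code change" point concrete, here is a minimal sketch in Python. It does not call any real SDK: `ChatRequest`, `build_request`, and both model identifiers are hypothetical placeholders, and the real IDs and client calls depend on the cloud provider you pick.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatRequest:
    """Hypothetical request shape; real provider SDKs differ in detail."""
    model: str
    prompt: str


# Placeholder model identifiers (assumptions, not real catalog IDs).
# Swapping models means editing only this one line.
MODEL = "example-gemini-model"  # e.g. change to "example-claude-model"


def build_request(prompt: str, model: str = MODEL) -> ChatRequest:
    # Everything except the model string stays the same across models.
    return ChatRequest(model=model, prompt=prompt)


default_request = build_request("Summarize hyperscaler capex trends.")
swapped_request = build_request(
    "Summarize hyperscaler capex trends.", model="example-claude-model"
)
print(default_request.model)
print(swapped_request.model)
```

The rest of the application only ever sees `build_request`, which is why the model choice stays a one-line decision as long as the provider exposes its models behind one interface.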

Ok, now for the news… and because I think this is interesting, the news portion of this will be written by Gemini 3 Pro! 


The Model: A New Era of Intelligence

Google has officially introduced Gemini 3, describing it as their “most intelligent model” to date. This isn’t just an incremental update; it represents a comprehensive overhaul of reasoning, multimodal understanding, and agentic capabilities.

The headline here is the performance. Gemini 3 Pro has taken the top spot on the LMArena leaderboard with a score of 1501, surpassing the previous record-holder (Grok 4.1 Thinking) by a 20 Elo margin. It is also the highest-ranked model for vision capabilities.
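For a sense of scale, a 20-point margin can be turned into an expected head-to-head win rate. This is a rough sketch that assumes the standard Elo logistic curve with a 400-point scale; LMArena’s exact rating methodology may differ, and the function name is just illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


gemini_3_pro = 1501.0
previous_leader = gemini_3_pro - 20.0  # the 20-point margin quoted above

win_rate = elo_expected_score(gemini_3_pro, previous_leader)
print(f"{win_rate:.3f}")  # prints 0.529
```

Under that assumption, a 20-point lead means the higher-rated model would be preferred in roughly 53% of head-to-head votes, so the gap is real but modest.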

Beyond general leaderboards, the model is putting up PhD-level numbers on rigorous benchmarks:

  • 37.5% on Humanity’s Last Exam (without tool use).
  • 91.9% on GPQA Diamond, measuring graduate-level reasoning.
  • 1487 Elo on WebDev Arena, making it the highest-rated coding model in the world.

Google also teased Gemini 3 Deep Think, an enhanced reasoning mode coming soon to Google AI Ultra subscribers. This mode pushes performance even further, creating parallel reasoning structures to hypothesis-test its own answers, scoring an even higher 93.8% on GPQA Diamond.

Here is how this new intelligence is being deployed across Google’s ecosystem:

AI Search

For the first time, Google is shipping its newest frontier model directly into Search on day one. Gemini 3 is powering a significant upgrade to AI Mode, accessible via a new “Thinking” dropdown.

  • Generative UI: This is the standout feature. Gemini 3 doesn’t just retrieve information; it can architect a custom user interface on the fly. If you ask about a mortgage, it codes an interactive calculator specifically for your scenario. If you ask about the “three-body problem” in physics, it generates a custom, interactive simulation where you can adjust variables and see the gravitational chaos in real-time.
  • Deep Research: Utilizing a new “fan-out” technique, the model can execute numerous queries under the hood to answer highly nuanced or complex questions (like planning a surfing trip with specific weather and rental constraints).
  • Auto-Routing: In the coming weeks, Search will automatically route the most complex queries to Gemini 3, while handling simpler queries with faster models.

Gemini App

Starting tomorrow, Gemini 3 Pro is rolling out directly to the Gemini app.

  • Visual Layouts & Dynamic Views: Similar to Search, the app can now generate “magazine-style” immersive views. If you ask for a 3-day trip to Rome, instead of a text wall, you get interactive widgets, photos, and tables. For educational topics, like analyzing Art History, it creates a “Dynamic View”—a coded, interactive experience that lets you explore the subject step-by-step.
  • Gemini Agent: Rolling out to Ultra subscribers, this is a move toward a generalist agent. Based on “Project Mariner,” this agent can proactively manage tasks. For example, you can tell it to “control my inbox,” and it will cluster tasks, draft replies, and create calendar invites, all requiring your confirmation before execution.
  • Student Access: Google announced that every college student in the U.S. will receive one year free of Google AI Pro, which includes access to Gemini 3 Pro.

Developer Tools and API

For builders, Gemini 3 is positioning itself as the ultimate “vibe coding” and agentic model.

  • Google Antigravity: This is the new agentic development platform mentioned in the intro. It allows developers to operate at a “task-oriented level.” You act as the architect while agents (powered by Gemini 3, Gemini 2.5 Computer Use, and Nano Banana) operate across the editor, terminal, and browser autonomously to build, test, and fix applications.
  • API Availability: Gemini 3 is available immediately in Google AI Studio and via the Gemini API.
  • Enterprise: Through Vertex AI and Gemini Enterprise, businesses can now access the model for complex workflows. Early partners like Replit, Shopify, and Thomson Reuters are already leveraging the model’s advanced reasoning for code migration and complex data analysis.

Gemini 3 is available starting today in preview for developers and consumers, with the broader rollout of Deep Think and enterprise features coming in the next few weeks.

 
