Google unveiled significant advancements at the Google I/O 2024 developer conference on Tuesday. One notable announcement was the integration of Gemini Nano, Google’s smallest AI model, into the Chrome desktop client, beginning with Chrome version 126.
This integration empowers developers to harness the on-device model to enhance their own AI functionalities. Google plans to leverage this capability to bolster existing features like the “help me write” tool available in Gmail’s Workspace Lab.
According to the company, recent developments in WebGPU and WebAssembly (WASM) support within Chrome enable these AI models to operate efficiently across a variety of hardware configurations.
In a pre-conference briefing, Jon Dahlke, Google’s director of product management for Chrome, revealed that Google is in discussions with other browser vendors to incorporate this — or a similar — feature into their platforms. “We have begun engaging with other browsers and will be launching an early preview program for developers,” Dahlke mentioned in the announcement. “With WebGPU, WASM, and Gemini integrated into Chrome, we believe the web is AI-ready.”
Despite this, it’s unlikely that Chrome’s competitors will exclusively adopt Google’s AI models. A more plausible approach is for browsers and developers to have the flexibility to deploy a model of their choice. While Google would naturally implement Gemini for its applications, the compact size of these models allows developers to select the most suitable model for their needs.
Google aims to introduce several high-level APIs in Chrome to facilitate tasks such as translation, captioning, and transcription using Gemini models.
“To deliver this functionality, we fine-tuned our most efficient version of Gemini and optimized Chrome,” Dahlke stated during the keynote. “Our goal is to provide access to Gemini models in Chrome, enabling billions of users to utilize powerful AI without the complexities of prompt engineering, fine-tuning, capacity management, or cost concerns. Developers can simply use a few high-level APIs – translate, caption, transcribe. This marks a significant transformation for the web, and we’re committed to executing it correctly.”
Additionally, Google is incorporating the Gemini Nano model to enhance new features within the Chrome DevTools Console. As a result, the dev tools can now explain errors and offer debugging solutions directly within the console, streamlining the development process for programmers.