At the upcoming Google I/O 2024 developer conference, Google will unveil significant enhancements to Gemini, its AI platform, which is set to replace Google Assistant on Android devices. By drawing on its deep integration with the Android operating system and Google’s suite of applications, Gemini aims to offer a more seamless user experience.
One notable update gives users several new ways to bring up the Gemini overlay directly on top of the app they are currently using. Google is also upgrading Gemini Nano, the AI model built into Android, which is expected to gain new capabilities.
Android users will soon be able to drag and drop AI-generated images into applications such as Gmail and Google Messages. On YouTube, a new feature called “Ask this video” will let users pull specific information directly from the video they are watching.
For those who opt for the upgraded Gemini Advanced, an “Ask this PDF” feature is available, letting users get answers from a document without reading it in full. A Gemini Advanced subscription costs $19.99 per month and includes access to the AI features, 2TB of storage, and additional Google One benefits.
Gemini on Android can already generate photo captions, answer questions about articles, and perform other generative AI tasks, much like other AI chatbots. However, OpenAI has introduced a new generative AI model, GPT-4o (the “o” stands for “omni”), that works with text, speech, and video, including real-time processing of a phone’s camera feed. So despite its built-in advantages, Gemini faces notable competition on mobile platforms.
Google has announced that the latest Gemini features on Android will roll out to hundreds of millions of compatible devices over the coming months. Gemini will also begin offering additional suggestions based on what is on screen.
At the same time, Gemini Nano, the on-device foundation model for Android, will be upgraded to support multimodality. This will allow it to process not just text, but also images, sounds, and spoken language.