Google Photos is set to receive an AI upgrade through an experimental feature called “Ask Photos,” powered by Google’s Gemini AI model. The feature, scheduled for release later this summer, will let users search their Google Photos collection in natural language, drawing on the AI’s understanding of photo contents and associated metadata.
Users could already search for specific people, places, or objects in their photos. With natural language queries, the upgrade promises a more intuitive and less labor-intensive search experience. Google unveiled it at its annual Google I/O 2024 developer conference on Tuesday.
For example, rather than searching for a specific term like “Eiffel Tower,” users can now make more complex requests, such as asking for the “best photo from each of the National Parks I visited.” The AI weighs several criteria to decide which photo in a set is the “best,” including lighting, sharpness, and background clarity. It also uses geolocation data and date stamps to retrieve only images taken at U.S. National Parks.
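To make the selection logic described above concrete, here is a minimal Python sketch. It is not Google's implementation: the `Photo` fields, the equal weighting in `quality`, and the idea that each photo already carries a park name resolved from its geolocation are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Photo:
    name: str
    park: Optional[str]  # hypothetical: park resolved from geolocation, None if not in a park
    lighting: float      # hypothetical quality scores in the range 0..1
    sharpness: float
    background: float


def quality(photo: Photo) -> float:
    # Equal weights are an arbitrary choice for this sketch.
    return (photo.lighting + photo.sharpness + photo.background) / 3


def best_per_park(photos):
    """Return a dict mapping each park to its highest-scoring photo."""
    best = {}
    for photo in photos:
        if photo.park is None:
            continue  # skip photos not taken in a park
        current = best.get(photo.park)
        if current is None or quality(photo) > quality(current):
            best[photo.park] = photo
    return best


photos = [
    Photo("IMG_001", "Yosemite", 0.9, 0.8, 0.7),
    Photo("IMG_002", "Yosemite", 0.6, 0.9, 0.5),
    Photo("IMG_003", "Zion", 0.7, 0.7, 0.9),
    Photo("IMG_004", None, 1.0, 1.0, 1.0),
]
result = best_per_park(photos)
```

In this toy data set, the photo with perfect scores is ignored because it has no park attached, which mirrors how location filtering and quality ranking would work together.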
This feature extends the functionality introduced with the recent launch of Photo Stacks in Google Photos, which clusters similar photos and uses AI to spotlight the best images within each group. As with Photo Stacks, the primary goal is to help users find specific photos as their digital archives grow. For context, Google reports that over 6 billion images are uploaded daily to Google Photos.
The new “Ask Photos” feature also lets users ask questions and receive answers, not just search results. This goes beyond requesting the best photos from a vacation or a specific event; users can ask questions that require a nuanced, almost human-like understanding of photo content.
For example, a parent might inquire about the themes used for their child’s last four birthday parties. The feature could provide a concise answer accompanied by relevant photos and videos, indicating themes such as mermaids, princesses, and unicorns, along with the corresponding dates.
This query functionality is enabled by Google Photos’ ability to process not only keywords but also natural language concepts, such as “themed birthday party.” It leverages the AI’s multimodal capabilities to recognize relevant text within images.
In another example, which CEO Sundar Pichai demonstrated ahead of the Google I/O developer conference, a user could ask the AI to show a child’s swimming progress, and the AI would compile a collection of photos and videos highlighting the child’s development over time.
Another feature lets users search for information contained in text that appears in their photos. This means you could take a photo of something you’d like to recall later, such as your license plate or passport number, and subsequently ask the AI to retrieve that specific information.
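The idea of recalling text from a photo can be sketched in a few lines of Python. This is an illustration only: it assumes that labels and text have already been extracted from each image by some earlier step, and the `IndexedPhoto` type and `recall_text` helper are hypothetical names, not part of any Google API.

```python
from dataclasses import dataclass


@dataclass
class IndexedPhoto:
    name: str
    labels: set  # hypothetical labels describing what the photo shows
    text: str    # hypothetical text already extracted from the image


def recall_text(photos, label):
    """Return the extracted text of every photo tagged with the given label."""
    wanted = label.lower()
    return [
        photo.text
        for photo in photos
        if wanted in {item.lower() for item in photo.labels}
    ]


library = [
    IndexedPhoto("IMG_101", {"car", "license plate"}, "7ABC123"),
    IndexedPhoto("IMG_102", {"beach"}, ""),
]
plates = recall_text(library, "License Plate")
```

A real system would interpret the question in natural language rather than match a label exactly; the exact match here only keeps the sketch short.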
If the AI gets something wrong and you correct it, for example by flagging a photo that is not from a birthday party or one you would not highlight from your vacation, it will remember this feedback and improve over time. As a result, the AI becomes increasingly tailored to you.
When you want to share photos, the AI can help draft captions that summarize their content. Currently, these summaries are quite basic and do not offer stylistic variations. However, since the system is powered by Gemini, a carefully written prompt might produce a particular style.
Google has implemented safeguards to prevent inappropriate responses (e.g., the AI will not respond to requests for “the best nudes”). The model has also been trained to exclude potentially offensive content. Because the feature is launching as an experiment, it may need additional controls based on user feedback.
The “Ask Photos” feature will initially be available in English in the U.S. before expanding to more markets. Currently, it operates as a text-based feature akin to querying an AI chatbot, but future iterations could see deeper integration with Gemini on Android devices.
Google says that users’ personal data in Google Photos will not be used for advertising purposes. Additionally, human review of AI interactions and personal data will only occur in rare instances to address abuse or harm. Personal data from Google Photos will also not be used to train other generative AI products like Gemini.