The introduction of Gemini Nano into TalkBack, Google’s screen reader for Android, marks a notable step forward for the company’s accessibility efforts. With generative AI, Google is extending what the feature can do for the people who rely on it.
Gemini Nano is the smallest version of Google’s large language model, built to run entirely on-device, so it works without a network connection. In TalkBack, it will generate spoken descriptions of images and objects for users who are blind or have low vision.
For instance, in a recent demonstration, TalkBack described an article of clothing as follows: “A close-up of a black and white gingham dress. The dress is short, with a collar and long sleeves. It is tied at the waist with a big bow.”
Google says TalkBack users typically encounter around 90 unlabeled images per day. With a large language model generating descriptions automatically, those images no longer have to wait for someone to label them by hand before the screen reader can convey what they show.
“This update will fill in the gaps in missing information,” said Sameer Samat, Google’s president of the Android ecosystem. “Whether it’s additional details about a photo shared by family or friends or information about the style and cut of clothes when shopping online, this enhancement stands to make a significant difference.”
The feature is scheduled to launch on Android devices later this year. If it works as well in everyday use as it did in the demo, it could be transformative for people who are blind or have low vision.