Sony’s “Intelligent Vision” Can Read Lips and Possibly Improve Accessibility in the Future

As more advanced technology enters our daily lives, we are seeing capabilities that were once considered science fiction. Artificial intelligence in everyday objects is a good example: AI is now embedded in products we use regularly to make them smarter and more capable.

At CES 2021, Sony gave a glimpse of its Visual Speech Enablement technology, which uses camera sensors with built-in artificial intelligence to recognize a face, isolate that person’s lips, identify lip movements, and translate them into words. Because the technique relies only on the moving lips, foreground and background noise are not really a problem. It also does not need a microphone, since it reads the lips with a camera.
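To make those four stages concrete, here is a rough Python sketch of how such a pipeline could be wired together. It is not Sony's implementation, which has not been published in this detail. Every name in it (`lip_box`, `crop`, `read_lips`, and the `detect_face` and `classify` callbacks) is hypothetical, and the "lower third of the face" rule for finding the mouth is a simplifying assumption made only for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[List[int]]          # a grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def lip_box(face: Box) -> Box:
    """Guess the mouth region from a face box.

    Assumption for illustration only: the lips sit in the lower third
    and the middle half of the detected face.
    """
    x, y, w, h = face
    return (x + w // 4, y + (2 * h) // 3, w // 2, h // 3)


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the given box out of a frame."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def read_lips(
    frames: Sequence[Frame],
    detect_face: Callable[[Frame], Optional[Box]],
    classify: Callable[[List[Frame]], str],
) -> Optional[str]:
    """Run the four stages: find a face, isolate the lips,
    collect the lip movement over time, and map it to words.

    `detect_face` and `classify` are placeholders for real models.
    Note that no audio is passed in anywhere.
    """
    lip_frames: List[Frame] = []
    for frame in frames:
        face = detect_face(frame)
        if face is None:          # no face in this frame, skip it
            continue
        lip_frames.append(crop(frame, lip_box(face)))
    if not lip_frames:
        return None
    return classify(lip_frames)
```

Because the only input is a sequence of camera frames, the sketch also shows why surrounding noise does not matter to this kind of system: sound never enters the pipeline.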

Sony has some initial use cases in mind for this technology, such as factory automation, voice-activated ATMs, and kiosks. Currently, the technology is optimized for use on computers, although it may be available on phones in the future. It also has many potential accessibility uses, such as lip reading to enhance auto-generated captions or reducing the need for a relay operator. Sony believes visual voice activation isn’t ready for those accessibility uses just yet, but it may be in the future.

In this video, you will learn more about visual voice activation using intelligent image sensors. What other use cases can you think of where lip movement detection and word recognition using cameras could be used?

Source: PC Mag

Additional reading on Intelligent Vision: Sony
