Deep learning and audio description

The state of audio description for video content is abysmal. Only a small amount of video content on television is described, and the same goes for movies. Move into the online sphere, Netflix, Vimeo, YouTube, and there is simply no described content.

There are numerous problems like this, and addressing them creates huge possibilities for the sighted too. So when I read about Clarifai’s deep learning AI that can describe video content, I was excited. Their system has been created to increase the capabilities of video search: if the AI can identify what is in a video, it can serve more accurate search results.

But the ability to perceive and describe what is in a video has implications for the sight impaired. If this could be released as an add-on for Chrome or another browser, it would allow a whole host of video content to be described. While this may be some way off, it is easy to see how such systems can serve multiple purposes.
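To make the idea a little more concrete, here is a minimal sketch of what such an add-on might look like. Everything in it is an assumption: the captioning endpoint (describe.example.com) is hypothetical and is not Clarifai’s real API, and the timing and styling details are placeholders. The shape of it is simply: grab a frame from a playing video, ask a captioning service to describe it, and surface the result to screen readers through an ARIA live region.

```typescript
// Hypothetical content-script sketch. The endpoint below is a stand-in for
// whatever captioning service would actually be used; it is not a real API.
const DESCRIBE_ENDPOINT = "https://describe.example.com/v1/caption";

// Capture the current frame of a <video> element as a JPEG blob.
// Note: cross-origin video without CORS headers would taint the canvas.
function captureFrame(video: HTMLVideoElement): Promise<Blob | null> {
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext("2d")?.drawImage(video, 0, 0);
  return new Promise((resolve) => canvas.toBlob(resolve, "image/jpeg"));
}

// Ask the (hypothetical) captioning service to describe the frame.
async function describeFrame(frame: Blob): Promise<string> {
  const body = new FormData();
  body.append("image", frame);
  const response = await fetch(DESCRIBE_ENDPOINT, { method: "POST", body });
  const { description } = await response.json();
  return description;
}

// Announce the description to screen readers via an ARIA live region.
function announce(text: string): void {
  let region = document.getElementById("video-description-live");
  if (!region) {
    region = document.createElement("div");
    region.id = "video-description-live";
    region.setAttribute("aria-live", "polite");
    region.style.position = "absolute"; // keep it off-screen but readable
    region.style.left = "-9999px";
    document.body.appendChild(region);
  }
  region.textContent = text;
}

// Periodically describe whichever video is currently playing.
setInterval(async () => {
  const video = Array.from(document.querySelectorAll("video")).find((v) => !v.paused);
  if (!video) return;
  const frame = await captureFrame(video);
  if (frame) announce(await describeFrame(frame));
}, 10_000); // every ten seconds; real timing and pacing would need care
```

A real extension would have to deal with cross-origin video, rate limits and sensible pacing of announcements, but even this rough outline shows how little plumbing sits between a video captioning model and a usable description service.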

It also neatly highlights one of my key philosophies: ideas that solve problems for the sight impaired can often have an equal or greater use for the sighted.

You can see the system in action over at WIRED.