Deep learning and audio description

The state of audio description for video content is abysmal. Only a small amount of video content on television is described, and the same goes for movies. Move into the online sphere of Netflix, Vimeo and YouTube, and there is simply no described content.

There are numerous problems like this, and addressing them creates huge possibilities for the sighted too. Hence, when I read about Clarifai's deep learning AI that can describe video content, I was excited. Their system has been created to increase the capabilities of video search: if the AI can identify what is in a video, it can serve more accurate search results.

But the ability to perceive and describe what is in a video has implications for the sight impaired. If this could be released as an add-on for Chrome or another browser, it would allow a whole host of video content to be described. While this may be some way off, it is easy to see how such systems can serve multiple purposes.
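To make the idea a little more concrete, below is a minimal sketch of what such a browser add-on could do: periodically capture a frame from a playing video, ask an image description service what it sees, and speak the answer aloud. The endpoint URL and response shape are invented placeholders for illustration, not Clarifai's actual API.

```ts
// Sketch of an add-on content script: sample a frame from a playing video,
// send it to a (hypothetical) description service, and speak the result.
const DESCRIBE_ENDPOINT = "https://example.com/describe"; // placeholder, not a real API

async function describeFrame(video: HTMLVideoElement): Promise<void> {
  // Draw the current frame onto an off-screen canvas.
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d");
  if (!ctx || canvas.width === 0) return;
  ctx.drawImage(video, 0, 0);

  // Send the frame as a JPEG; assume the service returns { description: string }.
  const blob: Blob | null = await new Promise((resolve) =>
    canvas.toBlob(resolve, "image/jpeg", 0.7)
  );
  if (!blob) return;
  const response = await fetch(DESCRIBE_ENDPOINT, { method: "POST", body: blob });
  const { description } = await response.json();

  // Speak the description using the browser's built-in speech synthesis.
  window.speechSynthesis.speak(new SpeechSynthesisUtterance(description));
}

// Describe the first video on the page every ten seconds while it is playing.
const video = document.querySelector("video");
if (video) {
  setInterval(() => {
    if (!video.paused) describeFrame(video).catch(console.error);
  }, 10_000);
}
```

A real add-on would need to rate-limit its requests and avoid talking over the video's own audio, but the browser-side pieces already exist today.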

It also highlights one of my key philosophies: ideas that solve problems for the sight impaired can often have an equal or greater use for the sighted.

You can see the system in action over at WIRED.

VoiceOver on the Apple Watch

From 9to5Mac:

Like Apple’s other products, Apple Watch will have a series of key accessibility features.
To access Accessibility Settings on the fly, users will triple-click the Digital Crown.
The Apple Watch will have a VoiceOver feature that can speak text that is displayed on the screen. Users will be able to scroll through text to be spoken using two fingers. VoiceOver can be enabled either by merely raising a wrist or by double tapping the display.
Users will also be able to zoom on the Apple Watch’s screen: double tap with two fingers to zoom, use two fingers to pan around, and double tap while dragging to adjust the zoom.
There will also be accessibility settings to reduce motion, control stereo audio balance, reduce transparency, switch to grayscale mode, disable system animations, and enable bold text.

Great to see confirmation that the Apple Watch will support VoiceOver. From the original demo I had hoped accessibility would be baked in. Looking forward to another way to interact with my smartphone and the new possibilities that will enable. I'm particularly looking forward to the haptic navigation features, something I have been asking wearable companies to add for over two years.

Object recognition with Google Glass

When I first heard about Google Glass I imagined a future where Glass could assist in labelling objects in the environment. Well, it seems that future might be rapidly approaching.

Neurence has created a cloud-based platform called Sense, which uses pattern-based machine learning to identify objects within an environment. The system can be utilised on a number of devices, including Google Glass.

Through pattern recognition, the cloud-based platform can recognise objects in the environment, such as signs. This has incredible implications for the VI community, and as the platform expands and adds more objects to its database it will only become more valuable.

What really intrigues me about this device is how it can serve an incredible purpose for the VI community while being aimed squarely at a different market. As they attempt to build the next generation of search, which they believe will be image based, they are creating an enormous database of objects. This database is open to the public, and there is even an SDK for contributing to the platform. Therefore, it would be relatively trivial to create a system that the VI could use, and the userbase would be so large that it would actually be useful. This is unlike other devices aimed squarely at the VI community, which limits their scope and, in turn, how large the resulting database of recognised objects can be.

Pair this device with another wearable for navigation and you would have a great system to aid a VI individual. For example, a wearable with haptic feedback could aid navigation, while the Sense platform could add much-needed contextual information about the environment.
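As a thought experiment, here is a rough sketch of how the two halves might be wired together, with navigation delivered as vibration and recognition results delivered as speech. Both interfaces are hypothetical stand-ins; neither a real wearable API nor the Sense SDK is represented here.

```ts
// Hypothetical interfaces: a haptic wearable for navigation cues and a
// recognition service (something like Sense) for contextual labels.
interface HapticWearable {
  pulse(pattern: number[]): void; // vibration pattern in milliseconds
}

interface RecognitionService {
  identify(image: Blob): Promise<string[]>; // labels for objects in view
}

async function assistStep(
  heading: "left" | "right" | "straight",
  snapshot: Blob,
  wearable: HapticWearable,
  recognizer: RecognitionService
): Promise<void> {
  // Navigation goes to the wrist: short pulse for left, long pulse for right.
  if (heading === "left") wearable.pulse([100]);
  else if (heading === "right") wearable.pulse([400]);

  // Context goes to the ear: speak whatever the recognition service sees.
  const labels = await recognizer.identify(snapshot);
  if (labels.length > 0) {
    const utterance = new SpeechSynthesisUtterance(`Nearby: ${labels.join(", ")}`);
    window.speechSynthesis.speak(utterance);
  }
}
```

The point of the split is that the two channels do not compete: vibration handles the constant, low-level "which way" signal, while speech is reserved for the richer "what is around me" information.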

Now I just need to email them to make this happen!

Reading a book to my children

A wonderful article about Nas Campanella, a blind newsreader, over at Broadsheet.com.

Her studio is equipped with strategically placed Velcro patches – she operates her own panel – so she can recognise which buttons to push to air news grabs and mute or activate her mic. While she’s reading on air, that same electronic voice reads her copy down her headphones which she repeats a nanosecond later. In another ear the talking clock lets her know how much time she has left. The sound of her own voice is audible over the top of it all.

It reminded me of a problem in my own life: reading books to my children. I have often thought about using a tiny in-ear wireless headphone, such as the Earin, to solve this problem. It's interesting to hear that someone is using a similar setup on a daily basis in their work life. The article is also well worth a read, as Nas's attitude is remarkable.