OrCam’s MyEye 2.0 Uses Computer Vision to Read the World
“It’s like a walking audiobook.”
Creating accessible resources for those with disabilities is important—but new technology is trying to make all resources accessible.
Take a look at the OrCam MyEye 2.0. It’s a small and sleek rectangular box that works best when clipped onto a pair of glasses. It is standout technology that can audibly identify almost anything directly in front of the person wearing the device, whether it be an email, a close friend, or a box of Pop-Tarts.
It may sound simple in theory, but in reality this device for the visually impaired and those with certain other disabilities took more than five years to design, and it has the potential to revolutionize the accessible technology world, and maybe even beyond it.
The inventor behind OrCam is Amnon Shashua, the co-founder of Israel-based autonomous driving company Mobileye. If that name rings a bell, it’s probably because Mobileye sold to Intel for just over $15 billion in early 2017.
The latest release is a significant upgrade over the 2013 OrCam MyEye prototype, which saw a wider release in 2015—although there was a lot of technology to perfect before those unveilings.
“OrCam was established in 2010, but we didn’t start selling a device in North America until 2015,” explains Rafi Fischer, OrCam’s director of communications. “The whole time was purely R&D, and that’s still our focus far and away. We continually develop.” Fischer went on to explain that a majority of the OrCam team is dedicated to R&D.
When a wearer has the MyEye 2.0 equipped, they can place any object with text in front of them and point to where they want to read. MyEye 2.0 will snap a picture through its camera lens, then read the text to the wearer through a small speaker pointed into their right ear. The speaker is not loud enough to disrupt anyone sitting beside the wearer, and based on the demo provided to Techvibes, the camera is remarkably adept at picking up text, even from a photo of an email on a smartphone.
Pointing directly at the text is great if the wearer is partially blind or has full vision, but what if they are completely blind? In those cases, MyEye 2.0 can read the entire document in two different ways: The wearer can manually snap a photo via the shutter button on the device, after which it will begin reading the text. Alternatively, the camera can recognize familiar document shapes, like paper or a phone, automatically capture the image and read it all out loud. The wearer can then skip lines of text with hand gestures, like skipping items on a menu, or skipping paragraphs of an article. All of this takes place locally on the device; no cloud connection is required.
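For readers curious how such a pipeline fits together, here is a minimal sketch in Python of a capture, recognize, and speak loop. It uses the open-source pytesseract and pyttsx3 libraries as stand-ins for OrCam's proprietary on-device vision and voicing, and it illustrates the general approach rather than the company's actual implementation; the gesture handling and document-shape detection are only hinted at in comments.

```python
# Minimal sketch of a local capture -> OCR -> speech loop.
# pytesseract and pyttsx3 stand in for OrCam's proprietary on-device
# text recognition and voicing; nothing here leaves the machine.
import cv2            # camera capture
import pytesseract    # open-source OCR, used here as a stand-in
import pyttsx3        # offline text-to-speech


def capture_frame(camera_index: int = 0):
    """Grab a single frame from the camera, like pressing the shutter button."""
    cap = cv2.VideoCapture(camera_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Could not capture an image")
    return frame


def read_aloud(frame) -> None:
    """Recognize any text in the frame and speak it, line by line."""
    text = pytesseract.image_to_string(frame)
    engine = pyttsx3.init()
    for line in text.splitlines():
        if line.strip():          # a hand gesture could skip lines or paragraphs here
            engine.say(line)
    engine.runAndWait()


if __name__ == "__main__":
    read_aloud(capture_frame())
```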
“If you think about how much text you go through on a given day, whether it’s a phone, a newspaper, a menu, whatever you can think of—imagine being cut off from that text,” says Fischer. “It’s a huge part of your surroundings. The MyEye 2.0 is like a walking audiobook.”
The technology runs even deeper. Yes, the MyEye 2.0 can recognize billboards in the distance and read emails off a phone. But it can even recognize people in front of you, thanks to millions of pictures fed through OrCam’s machine learning algorithms. The device will tell a wearer if there is a man, woman or child in front of them, and how many, and can even tell you who they are.
For friends or colleagues, users are able to program their faces into their MyEye 2.0. If the wearer looks at someone and then long-presses the only button on the device, it will take several scans of their face, then ask for a name. It will then store that person’s features as a data set rather than actual photos and allow for easy recognition—not dissimilar from the iPhone X’s facial recognition scan.
“We convert each face into a string of numbers in such a way that each person’s face produces a different string,” explains Dr. Yonatan Wexler, OrCam’s EVP of R&D. “When the device sees a face, it then computes the string of numbers and compares that to the strings that are already stored.”
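Dr. Wexler’s description maps onto what computer vision researchers call a face embedding. The sketch below shows only the comparison step under that reading: each enrolled face is stored as a vector of numbers, and a new face is matched to the closest stored vector if it falls within a distance threshold. The embedding model itself is OrCam’s proprietary component and is represented here by an abstract input; the threshold and distance measure are illustrative assumptions, not the company’s actual parameters.

```python
# Sketch of face matching over stored "strings of numbers" (embeddings).
# The model that turns a face image into a vector is OrCam's proprietary
# piece; this only shows how stored vectors could be compared.
import numpy as np

enrolled: dict[str, np.ndarray] = {}   # name -> stored embedding


def enroll(name: str, face_scans: list[np.ndarray]) -> None:
    """Long-press flow: average several scans of a face and store them under a name."""
    enrolled[name] = np.mean(face_scans, axis=0)


def recognize(embedding: np.ndarray, threshold: float = 0.8) -> str:
    """Compare a new face's vector against every stored vector and return the closest match."""
    best_name, best_dist = "unknown person", float("inf")
    for name, stored in enrolled.items():
        dist = np.linalg.norm(embedding - stored)   # distance between the two number strings
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else "unknown person"
```

Because only these vectors are kept, the device never needs to retain actual photos of the people it recognizes.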
The OrCam team, led by Dr. Wexler, actually published a scientific paper on this process in 2016.
The device can also recognize over half a million barcodes, and store products in the same vein as faces. If that isn’t enough, the wearer can simply raise their wrist as if checking a watch and MyEye 2.0 will let the wearer know the time and date—no actual watch required.
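Conceptually, the product side can be pictured as a barcode decode followed by a lookup in a stored table. The sketch below leans on the open-source pyzbar decoder and a hypothetical product dictionary as stand-ins for OrCam’s own recognition; it is an illustration of the idea, not the device’s actual code.

```python
# Sketch of barcode-based product recognition using the open-source
# pyzbar decoder as a stand-in for OrCam's own recognition.
from pyzbar.pyzbar import decode

# Hypothetical local product table; the real device recognizes
# hundreds of thousands of barcodes and lets wearers add their own.
PRODUCTS = {
    "0123456789012": "box of Pop-Tarts (hypothetical entry)",
}


def announce_products(frame) -> list[str]:
    """Decode any barcodes in the frame and return the product names to speak."""
    names = []
    for code in decode(frame):
        barcode = code.data.decode("utf-8")
        names.append(PRODUCTS.get(barcode, f"unknown product {barcode}"))
    return names
```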
Although the MyEye 2.0 does not rely on cloud computing to operate, it does have a Wi-Fi connection for automatic software updates, a massive improvement over its predecessor, which relied on memory cards and manual updates. It is also important to note that no personal data is collected: document images are deleted immediately after they are read, and facial-recognition data is stored locally.
The tech behind OrCam’s devices spans over half a decade of intense computer vision and AI learning, all developed completely in-house. In fact, the only MyEye 2.0 tech that was not made by OrCam is the actual voicing, which is more or less the “easy part” when it comes to this device.
The first prototypes used a technique known as ShareBoost that allowed for quick recognition, but OrCam quickly realized there was more to build on, expanding the technology and incorporating new lessons from the deep learning community. There are still challenges when it comes to developing cutting-edge computer vision, though.
“The need to deal with millions of data points in each image means that the important bits have to be identified quickly as we sift through the data mass,” says Dr. Wexler. “Selecting the information that the user needs is a key challenge as otherwise the device would not stop talking about the numerous details that it perceives—but are not of interest to the wearer.”
To achieve that breakthrough, OrCam had to create a user interface that perceives the wearer as much as it perceives the environment. That meant human gestures became as important as the text itself.
“For example, when the device wearer holds a page, the OrCam MyEye deduces that this information is more important than a street sign in the background and immediately starts reading the page,” adds Dr. Wexler. “Users typically ‘get’ the UI in less than a minute as it is very responsive to their wishes.”
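One way to picture that behavior is as a ranking over everything detected in a frame, where cues from the wearer, such as a held page or a pointing finger, outrank background text so the device doesn’t narrate every detail it perceives. The sketch below is purely illustrative; the field names and priority order are assumptions, not OrCam’s actual logic.

```python
# Illustrative ranking of detections so only the most relevant item is spoken.
# Field names and priority order are assumptions, not OrCam's actual logic.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    label: str                    # e.g. "menu", "street sign", "email on phone"
    text: str                     # recognized text, if any
    held_by_wearer: bool = False  # the wearer is holding this object
    pointed_at: bool = False      # the wearer is pointing at this object


def pick_item_to_read(detections: list[Detection]) -> Optional[Detection]:
    """Prefer what the wearer is holding or pointing at; stay quiet about background text."""
    def priority(d: Detection) -> int:
        if d.pointed_at:
            return 2              # an explicit gesture wins
        if d.held_by_wearer:
            return 1              # a held page beats a sign in the background
        return 0

    candidates = [d for d in detections if d.text]
    if not candidates:
        return None
    best = max(candidates, key=priority)
    return best if priority(best) > 0 else None
```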
The tech that makes the MyEye 2.0 run is one thing, but it also has to be appealing. The device itself is sleek and relatively small, able to snap onto a regular pair of glasses. The decision to make it fashionable was important, as Fischer explains that similar devices are goggle-style wearables that some may not want to have on all the time.
The MyEye 2.0 is also one of the only accessible tech devices that completely blind wearers can make use of—devices like the eSight only serve the partially blind. The device has also found an audience with people suffering from dyslexia and prosopagnosia, a neurological disorder characterized by the inability to recognize faces, commonly referred to as face blindness.
“We have competition, but no real competitors,” says Fischer. “There’s nobody that develops on this kind of platform and with this kind of versatility.”
There’s a lot of room for OrCam to extend its offerings. During the discussion, several more ideas were brought up: a GPS connection to offer the user directions, recognizing food and counting calories, identifying people and bringing up past conversations or automatically bringing up social media profiles for known people, and more.
One thing will have to change though: the price.
The MyEye 2.0 costs close to $4,500 USD, and at that price point it may not be accessible to everyone who could use it. The company is continually working with governments and private corporations to offer funding, but as a spokesperson notes, from a government perspective, “20-year-old devices are considered high-tech, and they don’t even have a category for us.”
Even if not every person who could make use of the MyEye 2.0 gets their hands on one, the development of advanced computer vision technology is important. It provides a foundation for more accessible tech to flourish and drives a sector where companies are looking to use technology to solve real-world problems.