Out of the Dark: Computer Vision Revolutionizes Robotics

Computer vision has leapt from the lab into everyday life, transforming robotics and causing us, at Lemnos, to take notice.

Historically, robots were blind. As a result, a robot’s environment had to be tightly controlled, and every movement had to be incredibly precise. Over the last five years, however, incredible advances in cameras and algorithms have made computer vision one of the five megatrends influencing robotics startups right now.

Cameras—As with many recent advances in sensors, the jump in camera technology is largely due to the rise of smartphones. The iPhone 7 comes with a 12-megapixel camera, and the iPhone 7 Plus comes with two 12-megapixel cameras, the second of which has a telephoto lens. Just a few years ago, cameras like these would have cost thousands of dollars and required expensive housings to mount them to a robot and stabilize the frame. Now the most expensive iPhone 7 Plus costs under $1,000, and pricey housings are no longer needed. And because camera costs are so much lower, we can add as many cameras to a robot as it needs.

For mobile robots, a camera in every direction the robot can travel is a must. I was at our portfolio company Marble yesterday, and they have four cameras on their robot that allow them to visually clear the direction of travel at all times. This also helps when they are debugging other sensors, as they can see what was actually in front of the robot.

But cameras only give a robot the ability to detect photons; they do not help it understand them. That’s why advances in algorithms are so important.

Algorithms—OpenCV (Open Source Computer Vision) was first released in 2000. At the time, it was mostly limited to academic environments and had fairly basic capabilities. Today, it is the most widely used computer vision library in the world, with tens of thousands of people actively contributing to it. During the early 2000s, Willow Garage made huge contributions to using OpenCV with robots. This type of technology enables engineers to take the photons the robot gathers and start to make sense of them.
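One of the most basic ways a library like OpenCV makes sense of raw pixels is edge detection. As a hedged sketch of the idea (not OpenCV itself), here is a hand-rolled Sobel edge detector in plain NumPy; OpenCV provides an optimized version of this operation (among many others) out of the box:

```python
import numpy as np

def sobel_edges(img):
    """Estimate edge strength with Sobel gradient kernels (a simplified
    version of what a library routine like cv2.Sobel computes)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    # Slide the 3x3 kernels over every interior pixel.
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    # Gradient magnitude: large where brightness changes sharply.
    return np.hypot(gx, gy)

# Synthetic "camera frame": dark left half, bright right half.
frame = np.zeros((32, 32))
frame[:, 16:] = 255.0
edges = sobel_edges(frame)
# Edge strength is zero in the flat regions and peaks at the
# brightness boundary between the two halves.
```

In a real pipeline this kind of low-level output feeds higher-level steps like contour finding, feature matching, and object detection.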

Google has been pushing vision technology that takes these images and tags them as known objects. As is often the case with the Internet, the critical first use case was tagging cats, and as the technology advanced, Google could tag cats with hats as well. In 2012, Google’s X lab trained a neural network to identify cat videos, and a few years later Google showcased the ability to tag concepts in scenes as well. Now many cutting-edge startups, like our portfolio companies Marble and Compology, use image classification in their designs.

[If you are interested in exploring this more, check out the ImageNet Large Scale Visual Recognition Challenge. Various groups working on these advanced algorithms compete in this each year.]

Together, low-cost, high-quality cameras and advanced algorithms give robotics startups robots that can see from day one. At Lemnos, we have witnessed this with many of our startups, including Dishcraft Robotics, which makes kitchen robots. With just an off-the-shelf robot arm and camera, they can immediately begin solving their unique problems rather than writing all the low-level software from scratch, as startups used to.

Computer vision is only going to improve from here. The hardware will continue to get cheaper and more capable, and algorithms will extract more and more information from what the cameras capture. This is one of the reasons we say that now is the time for robotics startups.

If you’re working on something computer vision-related, please reach out to me @nomadicnerd.