Creating a self-driving car should not be difficult, but it's taking a while. Autonomous vehicles have been making headlines for years, yet few of us have ever ridden in one or even seen one. We know that flying planes is harder than driving cars, yet pilots have enjoyed autopilot for decades. What gives?
The answer is clear, or more precisely, clear vision. Pilots have used autopilot for decades in clear, open skies. Roads are more complex.
The actual mechanics of operating a vehicle (accelerating, braking, steering, etc.) are well understood and programmable. Most of the rules and logic of driving are programmable, too. But teaching a machine to see and interpret the road is very complex. The good news is that incredible progress is being made, and the technology will have far-reaching implications.
The complex challenge of artificial vision
When Steve Jobs wanted the Mac to make a big impression, he arranged for it to say "hello" at its 1984 launch. Yet while computers have been "speaking" for more than 30 years, we still haven't mastered artificial vision.
Vision is a powerful sense on its own, and it complements our other senses, such as taste and hearing. It is also far more complex than the others because it is tightly coupled with the brain that interprets what we see. We reflexively pull away from something hot to avoid a burn, but understanding visual threats often requires cognitive processing.
Cars and computing devices frequently come equipped with temperature, speed and other sensors, but vision is a horse of a different color. Even though autonomous cars carry 20-plus cameras and a sophisticated LIDAR system, they still have trouble seeing what is obvious to humans. Several examples illustrate how complex this is.
In one unfortunate accident, a Tesla running on autopilot could not distinguish the side of a truck trailer from an overcast sky.
Deep learning to teach computers to see
Digital photography has come a long way, but understanding what's in an image is new. It is not practical to teach computers to see with programmatic definitions. Computer vision is best accomplished by deep learning rather than traditional machine learning. Deep learning is a form of artificial intelligence (AI) that mimics the way we learn: from example. Since computers don't have eyes, their visual "experience" comes from billions of photos.
Recent breakthroughs in computer vision leveraged the internet to create giant repositories of cataloged images. In 2015, ImageNet employed 50,000 workers in 167 countries to clean, sort and label nearly a billion images for computers to learn to see.
A billion images may seem like a lot, but it's nothing compared to what a human sees during childhood. Children learn to distinguish cats from dogs through visual experience, not definitions. To replicate that accumulated experience, ImageNet's photos carry detailed descriptions. The result? Computers using the database can consistently distinguish between images of cats and dogs.
This sounds trivial until you consider all the sizes and colors cats come in, the positions and situations they get into, not to mention decoys, such as dolls and pillows, that could be mistaken for a cat. For the first time in history, computer vision is becoming a reality.
Developing a more complex AI vision
An autonomous car not only needs to identify objects; it must also anticipate their actions.
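The example-driven learning described above, telling cats from dogs by experience rather than by rules, can be illustrated with a toy sketch. Everything here is invented for illustration: real systems learn features from raw pixels, not from two hand-picked numbers, but the principle of classifying a new example by comparing it to labeled experience is the same (shown as a simple nearest-neighbor lookup):

```python
# Toy illustration of learning from labeled examples rather than rules.
# The "features" (ear pointiness, snout length) are made up for this sketch;
# ImageNet-scale systems learn their own features from billions of photos.

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(example, labeled_examples):
    """Return the label of the nearest labeled example (1-nearest-neighbor)."""
    nearest = min(labeled_examples, key=lambda item: distance(example, item[0]))
    return nearest[1]

# Labeled "experience": (features, label) pairs standing in for captioned photos.
experience = [
    ((0.9, 0.2), "cat"),  # pointy ears, short snout
    ((0.8, 0.3), "cat"),
    ((0.2, 0.9), "dog"),  # floppy ears, long snout
    ((0.3, 0.8), "dog"),
]

print(classify((0.85, 0.25), experience))  # -> cat
print(classify((0.25, 0.85), experience))  # -> dog
```

No definition of "cat" ever appears in the code; the answer comes entirely from the labeled examples, which is the point of the deep-learning approach.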
Cars, motorcycles, bicycles and pedestrians each behave differently when the signal turns green.
We are so surrounded by computers and cameras that it's easy to forget how limited computer vision has been. We create entire workarounds to accommodate our vision-impaired machines. Our money carries invisible codes so we can pay machines. We take a ticket when entering a parking garage because the computer can't track our car's entry and exit any other way. We plaster barcodes, which make zero visual sense to humans, over all of our products so computers can "see" the difference between ketchup and mustard.
Computers can now identify different car makes and models. This is producing a treasure trove of big data that reveals relationships between car types (and values) and surrounding crime rates, property prices and even election outcomes.
Long ago when I parked at college, I hung a big (expensive) parking pass from my mirror for human parking enforcers to see. My son's school doesn't bother. Instead, it drives a camera-equipped vehicle around the lot to check license plates against a database of paid permits. Why bother with tags when computers can easily read license plates?
The future of AI vision
Computer vision is benefiting from several rapidly advancing technologies, including AI, big data, biometrics and digital imaging. As computers learn to see, we are going to give them a lot more to watch.
For example, in Tokyo I saw an NEC grocery checkout system that could accurately identify different types of produce when the cashier placed it on the scale. It could even distinguish between different types of red apples, probably better than many human cashiers.
The big advantage of vision is that it's passive. Consider the cat-and-mouse game of speed traps with radar and radar detectors.
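A camera pair needs no radio signals at all: timestamp the same plate at two known positions and the speed falls out of simple arithmetic. A minimal sketch, with the camera spacing and timestamps invented for illustration:

```python
# Sketch: estimating average speed from two timestamped camera sightings.
# The 500 m spacing and 18 s gap below are hypothetical illustration values.

def speed_kmh(distance_m, t_first_s, t_second_s):
    """Average speed between two sightings, converted from m/s to km/h."""
    elapsed = t_second_s - t_first_s
    if elapsed <= 0:
        raise ValueError("second sighting must come after the first")
    return (distance_m / elapsed) * 3.6  # 1 m/s = 3.6 km/h

# Cameras 500 m apart; the same plate is seen 18 s later at the second camera.
print(round(speed_kmh(500, 0.0, 18.0), 1))  # -> 100.0 (km/h)
```

Because this measures average speed over the whole gap, it is also immune to the brake-at-the-radar trick that defeats conventional speed traps.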
Future speed traps can simply use two camera locations to determine speed without radar. The cameras can also identify the driver and license plate, all with passive vision instead of active radio signals.
As computers learn to see, they will take on entirely new roles beyond driving. Consider:

Computer-assisted lifeguards that can tell whether a child is splashing or drowning in a pool
Surveillance computers that flag suspicious people on the premises
Video software that alerts management to a disturbance or a medical situation, or when lines are too long and an additional register needs to be opened
Forest-fire-spotting computers that monitor even remote areas for signs of wildfire

Beyond fingerprints, video biometrics include facial recognition. The obvious opportunity is public safety. Instead of law enforcement checking wants and warrants case by case, cameras around cities will soon do so automatically. Many countries already photograph people as they pass through passport control.
Computer vision is also poised to change retail research more than loyalty cards did. Cameras may replace frequent shopper cards entirely, visually tracking loyalty. A camera can determine who (age, gender, ethnicity) is buying which products, how long they took to decide, and even whether they read the label or considered competing products.
New opportunities are also emerging in anti-fraud verification. MasterCard already uses selfies to fight fraud. Uber recently added selfies to its driver login process as an additional verification measure.
Computers have penetrated nearly every part of our society despite being nearly blind. As their vision gains momentum, you (and they) haven't seen anything yet.