By Nathan Reiff
Computer vision is a subfield of artificial intelligence (AI) that involves capturing and processing information from images and videos. An additional key component of computer vision is that it enables computer and AI systems to take actions based on the information gathered. Like AI more broadly, computer vision aims to give computers a human-like ability: in this case, the ability to see and process images of all kinds.
Human beings may not realize how many different processes go into interpreting visual information. Human sight quickly determines where one object ends and another begins, whether objects are moving, how to tell different objects apart, and much more.
Computers can be trained to do the same thing using cameras, algorithms, and large pools of data. Computer vision typically relies on neural networks or other machine learning methods, in which an AI system evaluates a massive amount of data to “learn” how to interpret it and perform various tasks. Although teaching a computer to “see” like a human is complicated, a trained system can analyze thousands of different objects a minute, a rate much faster than any human could achieve.
Like other machine learning processes, computer vision systems train themselves by analyzing huge sets of data until they learn to identify and recognize images. An important feature of this process is that algorithms, not people, guide the learning: beyond setting the initial parameters, a programmer does not need to tell the computer how to recognize an image.
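To make the idea of learning from labeled data concrete, here is a minimal sketch using a perceptron, a classic learning rule that is far simpler than the systems described in this article. The four tiny "images" (lists of four pixel brightness values), their labels, and the function names are all hypothetical and chosen only for illustration. The programmer sets the initial parameters (learning rate, number of passes), and the loop adjusts the weights on its own.

```python
def predict(weights, bias, pixels):
    """Return 1 ("bright") if the weighted pixel sum is positive, else 0 ("dark")."""
    score = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1 if score > 0 else 0


def train(samples, epochs=10, learning_rate=0.1):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, label in samples:
            error = label - predict(weights, bias, pixels)  # -1, 0, or 1
            weights = [w + learning_rate * error * p
                       for w, p in zip(weights, pixels)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data: (pixel brightness values, label),
# where label 1 means "bright image" and 0 means "dark image".
samples = [
    ([0.9, 0.8, 0.7, 0.9], 1),
    ([0.1, 0.2, 0.0, 0.1], 0),
    ([0.8, 0.9, 1.0, 0.7], 1),
    ([0.2, 0.1, 0.3, 0.0], 0),
]

weights, bias = train(samples)
predictions = [predict(weights, bias, pixels) for pixels, _ in samples]
print(predictions)
```

Nothing in the code states what "bright" means; the rule emerges from repeated corrections against the labeled examples. Real computer vision models apply the same principle with millions of images and far more parameters.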
A type of neural network called a convolutional neural network (CNN) is widely used in computer vision. A CNN breaks an image down into pixels and slides small filters across them to pick out local patterns such as edges, then combines those patterns to predict what the full image shows. Training is a trial-and-error process: the network makes predictions, checks how accurate they are, and adjusts, over and over again, until its predictions improve.
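The core operation inside a CNN can be sketched in a few lines. The example below is an illustration under stated assumptions, not a full network: the 4x4 image and the filter values are hypothetical, and the filter is hand-picked, whereas a real CNN learns its filter weights during training. Strictly speaking the loop computes a cross-correlation, which is what deep learning libraries usually implement under the name "convolution".

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply the filter with the patch of pixels under it and sum.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical grayscale image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is largest in the middle column, exactly where the image changes from dark to bright. A CNN stacks many such filters so that later layers can respond to shapes and, eventually, whole objects.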
In many ways, modern technology provides an ideal laboratory for computer vision programs. Smartphones, traffic cameras, security systems, and many other common devices generate vast amounts of visual data. Processing this data by hand would take far more time and effort than any human team could dedicate to the task. Analyzing it automatically is one way in which computer vision could be applied.
For now, though, computer vision applications find their way into a host of products, features, and services. Google’s Translate service makes it possible to point a smartphone camera at printed words in one language and almost instantly create a real-time translation of the text into another language, for example. Autonomous vehicles utilize computer vision systems to process data generated by car cameras and sensors. Smartphones can learn what your face looks like to allow you to automatically unlock your phone, and filters can instantly enhance images and videos with added visual layers or effects. In the medical and manufacturing worlds, computer vision can help to identify symptoms of disease or defective products by processing images. Amazon makes use of computer vision in its cashier-less Go Grocery stores. And computer vision can enable algorithms to flag and remove inappropriate content from social media and other user-generated sites.
Some of the applications listed above provide a convenience or even add a bit of fun to a preexisting device. But some experts believe that the practical and transformational applications of computer vision are only just being discovered. A report by Shutterstock, for instance, estimates that the computer vision market grew nearly eightfold from 2015 through 2022, to $48.6 billion. McKinsey analysts estimate that AI-driven speech, written text, or computer vision programs will augment over half of all user touches by 2024.
With the tremendous proliferation of computer vision systems likely to continue, one crucial aspect of AI will probably become increasingly important: the ethical responsibilities of these systems and their creators. Computer vision systems are already used prominently in a variety of facial recognition tools, for example. Some of these tools could be used for surveillance, which would constitute an invasion of privacy, and could potentially be used to control human behavior. This is just one of many significant concerns that a powerful technology like computer vision raises.