Why giving machines the ability to ‘see’ could transform the way we live

By H. Young

Updated October 19, 2023

Vision is arguably the next frontier for AI companies. After all, the ability to see underpins almost every aspect of human life, and developing machines that can see, process, and comprehend the world just as a human would could transform the way we live and work.

In the American cartoon The Jetsons, the titular family lives in a futuristic world of technological convenience, with household chores attended to by Rosie, a robotic maid and housekeeper. Rosie is able to see and understand the world around her just as humans do, enabling her to carry out her tasks and respond to the Jetsons’ shenanigans.

While robots like Rosie are still far from becoming reality, giving machines the ability to ‘see’ like humans — also known as computer vision — is a key branch of AI research and development today.

“The main purpose [of computer vision] is to try to mimic human vision,” says Liu Fang, associate professor at the department of computer science at DigiPen Institute of Technology Singapore, a specialised university focused on the digital economy. “It aims to recognize visual inputs and process them as fast as humans can.”

The evolution of computer vision

One of the earliest and best-known uses of computer vision emerged in the 1970s with the development of optical character recognition (OCR), but the field made little progress until the 2000s, a period of significant growth for the AI industry.

According to Liu, the advent of deep learning and neural networks transformed how researchers understand and work with computer vision, especially in the process of feature extraction. This refers to the transformation of image data into numerical features that can be processed by a machine, which is how a computer is able to ‘see’ images.

“Instead of having to half-manually extract features from images, we could now train computers to automatically perform this feature extraction and identify objects,” she explains. “This has fundamentally changed the things we can do around computer vision.”

Previously, researchers had to code algorithms from scratch for feature extraction to occur. However, this meant that each feature was only suited for a specific use case and couldn’t be applied universally across different scenarios. With neural networks, researchers can now ‘teach’ models what to look out for, making it easier to develop programs that can be applied to the real world.
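
To make the idea of feature extraction concrete, the sketch below shows what a hand-coded approach might look like in Python. It is an illustration under stated assumptions, not a method described by Liu: the 4x4 image, the function names, and the three chosen features are all hypothetical.

```python
# A tiny grayscale "image": each number is a pixel brightness,
# from 0 (black) to 255 (white). Hypothetical data for illustration:
# the left half is dark and the right half is bright.
IMAGE = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]


def mean_brightness(image):
    """Average pixel value across the whole image."""
    pixels = [pixel for row in image for pixel in row]
    return sum(pixels) / len(pixels)


def vertical_edge_strength(image):
    """Mean absolute brightness change between side-by-side pixels."""
    total = 0
    count = 0
    for row in image:
        for left, right in zip(row, row[1:]):
            total += abs(right - left)
            count += 1
    return total / count


def horizontal_edge_strength(image):
    """Mean absolute brightness change between vertically stacked pixels."""
    total = 0
    count = 0
    for upper_row, lower_row in zip(image, image[1:]):
        for top, bottom in zip(upper_row, lower_row):
            total += abs(bottom - top)
            count += 1
    return total / count


def extract_features(image):
    """Turn an image into a short list of numbers a program can work with."""
    return [
        mean_brightness(image),
        vertical_edge_strength(image),
        horizontal_edge_strength(image),
    ]


print(extract_features(IMAGE))  # [127.5, 85.0, 0.0]
```

Each hand-written rule captures one narrow property of the image, which is the limitation described above. A neural network would instead learn its own features from example images, a step this sketch does not attempt to show.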

The power of computer vision

Thanks to the rise of deep learning, today’s applications of computer vision would have been unfathomable just 50 years ago.

In healthcare, for example, computer vision technology plays a key role in the analysis of medical images, helping doctors identify abnormalities in ultrasounds, MRIs and CT scans. The technology is also used in surveillance and security, where it can pinpoint threats or unusual behaviour.

For Liu, an interesting application lies in the realm of self-driving cars. She points to ‘Tesla Vision’, the electric vehicle firm’s autopilot system enabled by cameras, as an example of AI-based computer vision in action.

“Instead of just using radar, which is only good for gauging distance, Tesla is using computer vision to identify and recognise objects on the road,” Liu explains. “By understanding what exactly it is seeing, the AI behind the self-driving technology can better understand the situation and react accordingly.”

In the near future, Liu envisions computer vision being combined with other emerging technologies, such as virtual reality (VR) and augmented reality (AR), to enhance user experiences.

Such a combination has useful applications in areas such as medicine, where AR can help in training medical students and planning surgeries, as well as in engineering and manufacturing, where training and tests involving heavy machinery can be done in simulations.

The road ahead

While computer vision has come far, it still faces several limitations. Limited processing power is a key hurdle for many computer vision models. Computer vision systems also fall short of human vision in a key respect: while they can recognise individual objects, they cannot understand the scenes they are looking at.

Additionally, the sector faces a talent shortage. In Asia-Pacific alone, a shortage of 47 million people has been projected by 2030, and the AI talent gap is a barrier to the sector’s growth.

However, much is being done to address these hurdles. Liu says that researchers are looking at combining computer vision with other technologies, such as natural language processing, to close the gaps in what computer vision can do. Some scientists have combined language processing models with computer vision technology to allow machines to understand context, creating AI models that can parse both language and visuals.

“At present, we don’t need it to understand what it is seeing exactly as humans do, but we want it to be able to build a representation of the content in the scene, which will help the AI make better decisions,” Liu explains.

On the talent front, governments and institutions are working to build up the pool of AI talent. The Singapore government, for one, has rolled out a comprehensive series of programmes to develop talent in the AI space, while DigiPen Singapore has launched a master’s degree in computer vision to help address this talent gap.

“We need to have a broader base of AI talent in order to create that synergy for more development and ideas,” Liu says.

“However, as computational resources grow and more talent comes into the space, we’ll be seeing even more exciting applications and developments,” she concludes. “We’re just limited by our own imagination.”



This article was originally published on Tech in Asia’s website on 6 May 2022.

This article was originally published in October 2023.
