Expand the Possibilities of Computer Vision with AI


The benefits and possibilities of computer vision are amplified and expanded through a generous influx of AI and IoT technologies.

“The computer vision market will include a mix of consumer-facing applications like augmented reality and virtual reality, robots, drones, and self-driving cars, along with business applications like medical image analysis, video surveillance, real estate development optimization, ad insertions, and converting paperwork into digital data. While the consumer-facing applications often generate more buzz, the big shift is that enterprises are moving beyond data analytics to embrace AI-based business applications that utilize computer vision capabilities.”1

Aditya Kaul, Research Director, Tractica  

Challenge

Computer vision has unlocked a multitude of possibilities for both consumer and enterprise business applications, but traditional technologies have been hampered by numerous vexing implementation obstacles. 

Solution

With advanced computer vision technologies that embed intelligence at the network edge and solutions enhanced by artificial intelligence (AI), innovative new use cases are being developed that are generating increasing real-world value. 

Background and Project History

From an early career as a DJ, Adam Milton-Barker gained an interest in coding while building websites to promote his business, which spiraled into deeper interests, including AI and the Internet of Things (IoT). Over the course of several years, numerous jobs, and time spent managing his own company, Adam continued to accumulate AI expertise, at one stage leading a medical neural network project for a team based in São Paulo. In 2014, an ad caught his attention: a challenge from the Microsoft Development Program offering an Intel® Galileo development board to each of the winners. “At that point,” Adam said, “I was primarily involved in web, business administration applications, and natural linguistics programs, using AIML (Artificial Intelligence Markup Language). I also had a stint building games and apps as a member of the Facebook developer forum, as well as teaching coding. I had never come across IoT. I really liked the idea of the Internet of Things. And, because I had a lot of equipment in my home, an obvious project for me would be a security system.”

Adam developed a facial recognition solution that he dubbed TASS (TechBubble Assisted Security System) and was awarded the Intel Galileo from Microsoft* for the project idea. He then bought a standard Intel Galileo board to be able to include Linux* in his development efforts. TASS debuted at the Intel booth at Codemotion Amsterdam as part of the Intel Innovator program and the solution became the focus for a number of conference presentations and demos at worldwide venues. After considering launching TASS as a full-fledged product, Adam decided to release the specifications and project details to the open source community. “TASS,” he said, “is now open source, IoT-connected computer vision software that runs on the edge. There are several versions of TASS that have been created over the last few years, each using different techniques and technologies to provide facial recognition applications without the need for cloud services.”

The initial TASS project expanded in several productive directions. “During the next few years,” Adam said, “I was a semifinalist in the IBM Global Mobile Innovators Tournament with a project that included TASS and the IoT JumpWay*, which was then built on a Raspberry Pi*, but is now a free cloud service for people that want to learn about the IoT and AI. The project was one of the top five in Europe. I was also a first phase winner in the World’s Largest Arduino* Maker Challenge and I worked with a team at the AT&T Foundry Hackathon where we won first place for using a brain-computer interface to detect seizures; although, as a proof of concept the project never went beyond the demo stage. After a version of TASS won the Intel Experts Award at the Intel and Microsoft IoT Solutions World Congress Hackathon, I was invited to join the Intel Innovators program. This had been a goal of mine since the early days of my involvement in IoT. I joined the program in the areas of AI and IoT and have since added a new area—VR.”

The landmark accomplishments that Adam has achieved over several years were attained without his earning a technology degree or taking any formal college courses. “My work, project experiences, and self-learning path led me to the Intel® Software Innovators Program, which opened global opportunities. I’ve demonstrated my projects at some of the biggest tech conferences in the world. Ultimately, this led me to my dream job at Bigfinite as an IoT network engineer.”

“Moving to Barcelona and working at Bigfinite gave me a totally new life; I now work in an environment where I am not only surrounded by like-minded people, but people that know a lot more than me. It is an amazing place for me to continue learning, something that I have never experienced at other companies where I have worked. Bigfinite is also fully supportive of my personal projects and role in the Intel® Software Innovator program and promotes my projects on our social media. We also have an initiative called beTED where I can continue helping people learn about AI and IoT at work.”

Adam Milton-Barker, Intel Software Innovator and Bigfinite IoT Engineer

“The project is ongoing,” Adam said. “I originally began developing it in 2014 and since then there have been many different versions. All of these versions are available on the IoT JumpWay GitHub* repos. As new technologies emerge, I create new versions of TASS.” 

Refining Facial Recognition Technology

Through his development experience and ongoing research, Adam has identified key areas that could guide developers in productive directions when implementing facial recognition capabilities into their apps. Foremost among the concerns is the open set recognition issue. “The open set recognition issue is something that not many people talk about when they promote their computer vision projects or products,” Adam commented, “as it is an unmistakable flaw in almost all computer vision projects. What happens is this: Say that you had trained a network to know two people and then introduce an unknown person for classification. By default, the AI application will predict that it is one of the people it has been previously trained on. Because of this, any application that relies on detecting who is actually unknown will fail.”
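The failure mode Adam describes can be sketched with a toy softmax classifier in pure Python. The identities and logit values below are hypothetical, not taken from TASS; the point is that argmax always lands on a trained identity, and that rejecting low-confidence predictions is one common mitigation:

```python
import math

def softmax(scores):
    """Convert raw logits to probabilities."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# A network trained on only two identities (hypothetical logits for a
# face the network has never seen; both scores are weak).
classes = ["alice", "bob"]
logits = [0.3, 0.1]

probs = softmax(logits)
best = max(range(len(classes)), key=lambda i: probs[i])

# argmax always picks a trained identity, so the stranger is
# misclassified as one of the known people.
print(classes[best], round(probs[best], 3))

# One common mitigation: reject low-confidence predictions.
THRESHOLD = 0.8  # illustrative cutoff, tuned per deployment
label = classes[best] if probs[best] >= THRESHOLD else "unknown"
print(label)
```

Note that thresholding only softens the problem: a stranger who happens to resemble a known person can still score above the cutoff.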

Figure 1. Facial recognition is implemented through a polygonal grid linked to features.

According to Adam, the issue can be overcome in two ways. First, you can introduce an unknown class composed of, for example, 500 images. This approach works well in small environments, but in a larger environment there is a greater likelihood of encountering someone who resembles a face in the unknown dataset. This implementation doesn’t work in TensorFlow* Inception v3, but it does work within an OpenFace* implementation (which is available in the GitHub repository).
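The first approach can be sketched as a change to the label set (names and logit values below are hypothetical): once an explicit unknown class has been trained on assorted non-matching faces, the argmax itself can route a stranger away from the known identities, with no confidence threshold required.

```python
# With an explicit "unknown" class in the label set, argmax itself can
# reject strangers.
classes = ["alice", "bob", "unknown"]

# Hypothetical logits for a stranger's face: the unknown class,
# trained on ~500 assorted faces, scores highest.
logits = [0.3, 0.1, 1.2]

best = max(range(len(classes)), key=lambda i: logits[i])
print(classes[best])  # the stranger is routed to "unknown"
```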

Another way to contend with the issue involves using FaceNet, which calculates the distance between faces in frames and a known database of images. On its own, this approach will typically not work well in the real world. If your application relies on thousands of known people, the program must loop through every single person in the known dataset and compare it to the person or persons in a frame. If you have a very powerful server and abundant resources, this may not be a serious issue. But, on the network edge where compute resources are limited, it becomes more of a challenge.
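The linear scan Adam describes can be sketched in pure Python. The 4-dimensional embeddings below are hypothetical stand-ins for FaceNet's 128-dimensional vectors, and the match threshold is illustrative:

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two face embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 4-D embeddings (real FaceNet embeddings are 128-D).
known_db = {
    "alice": [0.1, 0.9, 0.3, 0.5],
    "bob":   [0.8, 0.2, 0.7, 0.1],
    # ...thousands more entries in a real deployment
}

frame_embedding = [0.12, 0.88, 0.31, 0.52]
MATCH_THRESHOLD = 0.5  # illustrative; tune on validation data

# The cost Adam describes: one distance computation per known person,
# for every face in every frame -- O(n) work per face.
best_name, best_dist = None, float("inf")
for name, emb in known_db.items():
    d = l2_distance(emb, frame_embedding)
    if d < best_dist:
        best_name, best_dist = name, d

match = best_name if best_dist <= MATCH_THRESHOLD else "unknown"
print(match, round(best_dist, 3))
```

On a well-provisioned server the loop is cheap; on an edge device with thousands of enrolled identities, this per-frame scan is exactly where the compute budget runs out.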

Along this line, Adam continued, “My next step will be to combine my earlier models with FaceNet and use FaceNet as a backup to check known faces, eliminating the need to loop through all of the known people. Because we know exactly what image to compare against—due to using the result from the first classification—if the second classification confirms, then it is more than likely that it is that person and not a false positive. The only requirement is to retrieve the classification from model 1 and use the ID to directly reference the corresponding image in the known database. Currently, I believe that this is the best way to solve the issue, but it is kind of a hacky way of doing things. This approach was suggested to me by a colleague, Stuart Gray, a fellow member of the Artificial Intelligence and Deep Learning group on Facebook.”
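The two-stage check can be sketched as follows (the identities, embeddings, and threshold are hypothetical): the classifier's predicted ID indexes the known database directly, so only a single distance is computed instead of a full scan.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two face embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Stage 1: the classifier's top prediction (hypothetical output of
# the first model).
predicted_id = "alice"

# Known database keyed by the same IDs the classifier emits, so the
# stage-1 result indexes stage 2 directly -- no loop over everyone.
known_db = {
    "alice": [0.1, 0.9, 0.3, 0.5],
    "bob":   [0.8, 0.2, 0.7, 0.1],
}

frame_embedding = [0.12, 0.88, 0.31, 0.52]
VERIFY_THRESHOLD = 0.5  # illustrative

# Stage 2: a single distance check against the predicted identity.
distance = l2_distance(known_db[predicted_id], frame_embedding)
confirmed = distance <= VERIFY_THRESHOLD

print(predicted_id if confirmed else "unconfirmed")
```

The design trade-off is that stage 2 can only confirm or reject stage 1's guess; it never recovers an identity the classifier missed, which is why Adam describes it as a backup check rather than a replacement.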

Two other issues bear consideration:

  • Lighting, whether too dark or too bright, presents a challenge. Intel® RealSense™ technology minimizes lighting issues significantly, but developers need to be aware of scenarios where a poor lighting situation completely shuts down the recognition process.
  • Photos designed to fool a computer vision system and foil either security protections or the facial recognition accuracy represent a current challenge that requires more attention as facial recognition moves into mainstream applications.

Enabling Technologies

Adam uses a range of different Intel solutions in his projects, building new iterations of TASS to take advantage of emerging technologies. “Different versions have used different technologies,” Adam said. “Initially it was built on a Raspberry Pi. At the IoT World Congress Hackathon, we built it on an Intel® Joule™ development kit (now discontinued). The server version was built on an Intel® NUC DE3815TYKE and also an Intel NUC I7 using the OpenVINO™ toolkit. I have used Intel® RealSense™ cameras in some versions that helped with issues such as lighting. The more current versions use the UP Squared* Grove* IoT Development Kit and Intel® Movidius™ technology and they are trainable using the Intel® AI DevCloud. I will soon be working on a virtual reality version using hardware provided by Intel.”

Among the specific benefits that Adam gained from the use of Intel technology:

  • Intel RealSense technology helped improve management of lighting issues.
  • Intel AI DevCloud was effective for training small models.
  • Intel Movidius technology has enhanced the capabilities of running AI on the edge.
  • Sample code and other resources available through Intel helped gain a better understanding of the hardware used in the solutions.
  • OpenVINO substantially improved the project results, adding speed and accuracy to the solution.

“Each time I have implemented Intel technologies it has drastically increased the functionality of the project. In addition to increasing the capabilities of the project, the support I have received from the Intel Innovators in the Intel® AI Academy program has been amazing. The hardware and opportunities to demonstrate at various events that were provided through the program have helped the project reach new heights.”

Adam Milton-Barker, Intel Software Innovator and IoT Engineer at Bigfinite, Inc. 

Bringing Vision to the Edge: The OpenVINO™ Toolkit

The release of the Open Visual Inference and Neural Network Optimization (OpenVINO) toolkit by Intel gives developers a rapid way to implement deep learning inference solutions using computer vision at the network edge. This addition to the current slate of Intel® Vision Products is based on convolutional neural network (CNN) principles, making it easier to design, develop, and deploy effective computer vision solutions that leverage IoT to support business operations.

The components in the toolkit include three APIs:

  • A deep learning inference toolkit supporting the full range of Intel Vision Products.
  • A deep learning deployment toolkit for streamlining distribution and use of AI-based computer vision solutions.
  • A set of optimized functions for OpenCV and OpenVX*.

Currently supported frameworks include TensorFlow, Caffe*, and MXNet. The toolkit helps boost solution performance with numerous Intel-based accelerators, including CPUs, integrated graphics processing units (GPUs), field-programmable gate arrays, video processing units, and image processing units.

“Processing high-quality video requires the ability to rapidly analyze vast streams of data near the edge and respond in real time, moving only relevant insights to the cloud asynchronously. The OpenVINO toolkit is designed to fast-track development of high-performance computer vision and deep learning inference applications at the edge.”2

Tom Lantzsch, Senior Vice President and General Manager, IoT Group, Intel

Substantial performance improvements are available through the OpenVINO toolkit (see the Increase Deep Learning Performance chart for full details). The components also enable a single-source approach to creating solutions, allowing developers to develop once and deploy anywhere, taking any model and optimizing it for a large number of Intel hardware platforms.

A free download of the OpenVINO toolkit is available today, putting developers on a path to produce optimized computer vision solutions that maximize performance on Intel acceleration platforms. Ecosystem partners in the Intel® IoT Solutions Alliance offer additional tools and technologies to help build innovative computer vision and IoT solutions. 

Forward-Looking Development Perspectives

Opportunities in the burgeoning fields of AI and IoT are abundant, and numerous resources and learning tools are available to anyone with the initiative to explore the various applications. International Data Corporation (IDC) projects that worldwide spending on IoT will reach USD 772 billion in 2018, up from USD 674 billion in 2017. IoT hardware represents the largest technology category in 2018; sales of modules, sensors, infrastructure, and security are projected to total USD 239 billion, with services listed as the next largest category.3

Figure 2. Aerial drone technology opens up numerous vision computing opportunities.

Industry reports project that strong growth will continue in the computer vision market:

  • By 2022, the video analytics market is projected to reach USD 11.17 billion.4
  • By 2023, the overall computer vision market should reach USD 17.38 billion.5
  • Deep learning revenue is projected to increase from USD 655 million in 2016 to USD 35 billion by 2025.6

Developers interested in taking advantage of these technology opportunities have a number of different channels for gaining knowledge and expertise in AI and deep learning.

“I would recommend the Coursera Deep Learning Specialization and Stanford Engineering Everywhere Machine Learning course for people wanting to know more about the inner workings of modern AI,” Adam said. “For those who just want to dive straight in head first as I did (and do), I have created a number of complete walkthroughs and provided source code that is freely available through the IoT JumpWay Developer Program.” 

AI is Expanding the Boundaries of Computer Vision

Through the design and development of specialized chips, sponsored research, educational outreach, and industry partnerships, Intel is firmly committed to advancing the state of AI to solve difficult challenges in medicine, manufacturing, agriculture, scientific research, and other industry sectors. Intel works closely with government organizations, non-governmental organizations, educational institutions, and corporations to uncover and advance solutions that address major challenges in the sciences.

For example, working with the engineering team at Honeywell, Intel is combining IoT technology and computer vision tools to help ensure safe and secure buildings.

“The Internet of Things is creating huge advancements in the way we use video to ensure safe and secure buildings. With new emerging technology like analytics, facial recognition, and deep learning, Honeywell and Intel are connecting buildings like never before. Intel is an important partner in establishing the vision of smarter video solutions for the industry, and we look forward to continued collaboration that benefits customers.”7

Jeremy Kimber, Marketing Director, Video Solutions, Honeywell

The Intel® AI technologies used in this implementation included:

  • Intel® Xeon® Scalable processors: Tackle AI challenges with a compute architecture optimized for a broad range of AI workloads, including deep learning.
  • Framework optimization: Achieve faster training of deep neural networks on a robust, scalable infrastructure.
  • Intel® Movidius™ Myriad™ Vision Processing Unit (VPU): Create and deploy on-device neural networks and computer vision applications.
  • Intel® AI DevCloud: Free cloud compute for machine learning and deep learning training, powered by Intel Xeon Scalable processors.
  • Internet of Things (IoT): A network of devices that exchange and analyze data across wired and wireless interconnections.


For a complete look at the Intel® AI portfolio, visit https://ai.intel.com/technology.

“With the OpenVINO toolkit, we are now able to optimize inferencing across silicon from Intel, exceeding our throughput goals by almost six times. We want to not only keep deployment costs down for our customers, but also offer a flexible, high-performance solution for a new era of smarter medical imaging. Our partnership with Intel allows us to bring the power of AI to clinical diagnostic scanning and other healthcare workflows in a cost-effective manner.”8

David Chevalier, Principal Engineer, General Electric (GE) Healthcare* 

References

  1. Computer Vision Hardware, Software, and Services Market to Reach $26.2 Billion by 2025, According to Tractica. Business Wire 2018.
  2. Wheatley, Mike. Intel’s OpenVINO toolkit enables computer vision at the network edge. SiliconANGLE 2018.
  3. Worldwide Semiannual Internet of Things Spending Guide. International Data Corporation (IDC) 2017.
  4. MarketsandMarkets, Video Analytics Market, 2017.
  5. MarketsandMarkets, Computer Vision Market, 2017.
  6. Tractica, 2Q 2017.
  7. Intel Customer Quote Sheet. Intel Newsroom 2018.
  8. OpenVINO Toolkit. What Customers Are Saying. Intel 2018.