Intelligence varies in kind and degree, and it occurs in humans, many animals, and some machines. Concerning machines, Artificial Intelligence (AI) is said to be the set of methods and procedures that provide machines with the ability to achieve goals in the world. It is present in many areas of study, such as deep learning, computer vision, reinforcement learning, natural language processing, semantics, learning theory, case-based reasoning, and robotics. During the 1990s, attention was on logic-based AI, mainly concerned with knowledge reasoning (KR), whereas the focus nowadays lies on machine learning (ML). This shift contributed to the field in a way knowledge reasoning never did. However, a new shift is coming. Knowledge reasoning is resurging in response to a demand for inference methods, while machine learning consolidates its achievements in statistical approaches. This new change occurs when knowledge reasoning and machine learning begin to cooperate with each other, a scenario for which computing is not yet defined.
Intelligent computing is pervasive, demands are monotonically increasing, and time for results is shortening. But while consumer products operate under those conditions, the process of building the complex mathematical models that support such applications relies on a computational infrastructure demanding large amounts of energy, time, and processing power. There is a race to develop specialized hardware to make modern AI methods significantly faster and cheaper.
Packing such specialized hardware together with elaborate software components into a successful architecture is a wise plan of action. Intel® added state-of-the-art machine learning technology to its expertise when it acquired the hardware and software startup Intel® Nervana™. Moreover, the well-known Altera®, which makes FPGA chips that can be reconfigured to accelerate specific algorithms, was also integrated into the company. Therefore, the power and energy efficiency of Intel® processors and architecture can help companies, software houses, cloud providers, and end-user devices upgrade their capability to use AI. The relevance of such chips for developing and training new AI algorithms cannot be overstated.
AI systems are usually perceived only as software, since this is the layer nearest to ordinary developers and end users. However, they also require substantial hardware capability to support their calculations. This is why choosing the Deep Neural Network (DNN) performance primitives within the Intel® Math Kernel Library (Intel® MKL) and the Intel® Data Analytics Acceleration Library (Intel® DAAL) is a clever decision, since such libraries make better use of Intel® processors and support AI development through hardware. Intelligent applications rely on CPUs that perform specific types of mathematical calculations, such as vector algebra, linear equations, eigenvalues and eigenvectors, statistics, matrix decomposition, and linear regression, and that handle large quantities of basic computations in parallel, to mention some. Concerning machine learning, many neural network solutions are implemented within hardware artifacts, and deep learning requires a huge amount of matrix multiplication. Considering knowledge representation, forward and backward chaining [1] demand many vector algebra computations, while the resolution principle [2] requires singular value decomposition. Therefore, AI benefits from specialized processors with speedy connections between parallel onboard computing cores, fast access to ample memory for storing complex models and data, and mathematical operations optimized for speed.
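To make the workload concrete, the dense matrix product that dominates deep learning can be sketched as the triple loop below. This is a minimal illustration, not the article's code; in a tuned build, such a loop would typically be handed to an optimized BLAS routine such as Intel® MKL's cblas_dgemm instead of being written by hand.

```cpp
#include <cstddef>
#include <vector>

// Naive dense matrix product C = A * B (row-major), the kind of kernel
// that dominates deep-learning workloads. A tuned build would delegate
// this to an optimized BLAS routine (e.g., Intel MKL's cblas_dgemm).
std::vector<double> matmul(const std::vector<double>& A,
                           const std::vector<double>& B,
                           std::size_t m, std::size_t k, std::size_t n) {
    std::vector<double> C(m * n, 0.0);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = 0; p < k; ++p) {
            const double a = A[i * k + p];  // reuse A element across a row of B
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[p * n + j];
        }
    return C;
}
```

The i-p-j loop order keeps the innermost accesses to B and C sequential in memory, which is friendlier to the cache than the textbook i-j-p order.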
There are many research and development reports describing the use of Intel® Architecture to support machine learning applications. However, the same infrastructure can also serve the symbolic AI approach, a market segment that has been overlooked by programmers and software architects. This paper aims to promote the use of Intel® architecture to speed up not only machine learning but also knowledge reasoning applications.
To illustrate that knowledge reasoning applications can also benefit from Intel® architecture, this test considers two tasks from real artificial intelligence problems: one as a baseline for comparison and the other as a knowledge reasoning sample. The first task (ML) represents the machine learning approach, using a Complement Naive Bayes [3] classifier (github.com/oiwah/classifier) to identify the encryption algorithm used to encode plain text messages [4]. The classification model is constructed by training over 600 text samples, each with more than 140,000 characters, ciphered with DES, Blowfish, ARC4, RSA, Rijndael, Serpent, and Twofish. The second task (KR) represents the knowledge reasoning approach, using the Resolution Principle [2] from an inference machine called Mentor [5] (not publicly available) to detect fraud in car insurance claims. The sample is composed of 1,000 claims, and the inference machine is loaded with 78 first-order logic rules.
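The idea behind Complement Naive Bayes can be sketched in a few lines. This is an illustrative simplification, not the cited implementation: for each class, term probabilities are estimated from the documents *not* in that class (the complement), and a document is labeled with the class whose complement explains it worst.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal Complement Naive Bayes sketch. docs[d] is a term-frequency
// vector, labels[d] is in [0, num_classes). For each class c, word
// probabilities are estimated from all documents NOT in c, smoothed by
// alpha; the query is assigned the class with the lowest complement score.
int cnb_classify(const std::vector<std::vector<double>>& docs,
                 const std::vector<int>& labels, int num_classes,
                 const std::vector<double>& query, double alpha = 1.0) {
    const std::size_t V = query.size();
    int best = 0;
    double best_score = 0.0;
    for (int c = 0; c < num_classes; ++c) {
        // Accumulate smoothed term counts over the complement of class c.
        std::vector<double> counts(V, alpha);
        double total = alpha * static_cast<double>(V);
        for (std::size_t d = 0; d < docs.size(); ++d) {
            if (labels[d] == c) continue;
            for (std::size_t i = 0; i < V; ++i) {
                counts[i] += docs[d][i];
                total += docs[d][i];
            }
        }
        // Score: query frequencies weighted by log complement probabilities.
        double score = 0.0;
        for (std::size_t i = 0; i < V; ++i)
            score += query[i] * std::log(counts[i] / total);
        if (c == 0 || score < best_score) { best_score = score; best = c; }
    }
    return best;
}
```

In the real cipher-identification task, the term-frequency vectors would be character or n-gram statistics of the ciphertext samples.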
Performance is measured by the number of clock cycles, read with the RDTSC (Read Time Stamp Counter) instruction [6], that a test takes to run. RDTSC was used rather than wall-clock time because it counts clock ticks and is thus invariant even if the processor core changes frequency. This does not happen with wall-clock time, so RDTSC is a more precise measuring method. Note, however, that performance is traditionally measured with wall-clock time, since it provides acceptable precision.
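On GCC, Clang, and the Intel compiler, the RDTSC instruction is exposed as the `__rdtsc()` intrinsic, so a cycle measurement can be sketched as below. This is a simplified illustration of the measurement technique, not the article's harness; stricter measurements would serialize around the reads (e.g., with `__rdtscp` or `cpuid`) to prevent instruction reordering.

```cpp
#include <cstdint>
#include <x86intrin.h>  // __rdtsc() intrinsic (x86 only)

// Count how many TSC ticks a piece of work takes. The delta is in clock
// ticks rather than wall-clock time, so it is not distorted when the
// core changes frequency.
template <typename Fn>
std::uint64_t ticks(Fn&& work) {
    const std::uint64_t start = __rdtsc();
    work();
    return __rdtsc() - start;
}
```

A call such as `ticks([&]{ run_test(); })` then yields the cycle count for one run.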
Tests were performed on a system equipped with an Intel® Core™ i7-4500U @ 1.8 GHz processor (64 bits), Ubuntu 16.04 LTS operating system, 8 GB RAM, Hyper-Threading turned on with 2 threads per core (this can be checked by typing
sudo dmidecode -t processor | grep -E '(Core Count|Thread Count)'), and with system power management disabled.
First, the C++ source code was compiled with the gcc 5.4.0 compiler and the test was performed. Then, the same source code was recompiled with Intel® C++ Compiler XE 17.0.4 and Intel® MKL 2017 (
-mkl=parallel) and a new test was performed. Note that many things happen within the operating system, invisible to the application programmer, which affect the cycle count, so measurement variations are expected. Hence, each test ran 300 times in a loop, and any result far higher than the others was discarded as an outlier.
Figure 1 shows the average clock cycles spent to build the Complement Naive Bayes classification model for the proposed task. Training the model exercises statistical and math routines. The combination of Intel® C++ Compiler XE and Intel® MKL demands fewer clock cycles than the configuration commonly used to compile C++ programs, so the tuned platform did a much better job. Notice that this evaluation compares source code that was not modified at all. Therefore, although the speedup obtained was 1.66, higher values are expected once developers exploit parallelism and specialized methods.
Figure 1: Test of machine learning approach using Complement Naive Bayes classifier.
Figure 2 shows the average clock cycles spent to produce deductions using the Resolution Principle as the core engine of an inference machine. It uses several math routines and many singular value decompositions to compute the first-order predicates. Here, the Intel® C++ Compiler XE and Intel® MKL (-mkl=parallel) combination outperformed the traditional compiling configuration, and thus also beat the ordinary development environment. The speedup obtained was 2.95, even though parallelism was not exploited and no specialized methods were called.
Figure 2: Test of knowledge reasoning approach using resolution principle to perform inference.
The former test shows a machine learning method being enhanced by a tuned environment. That result by itself is unsurprising; its relevance lies in serving as a reference for the latter test, in which the same environment was used. The inference machine, under the same conditions, also obtained a good speedup. This is evidence that applications based on this approach, such as expert systems, deduction machines, and theorem provers, can also be enhanced by Intel® architecture.
This article presented a performance test of a tuning platform composed of an Intel® processor, Intel® C++ Compiler XE, and Intel® MKL applied to typical AI problems. The two main approaches of artificial intelligence were probed: machine learning was represented by an automatic classification method, and knowledge reasoning by a computational inference method. The results suggest that employing such a platform can accelerate these AI computations when compared to the traditional software development environment. Both approaches are necessary to supply intelligent behavior to machines. The libraries and the processor improved the performance of those functions by taking advantage of special features in Intel® products, speeding up execution. Note that it was not necessary to modify the source code to take advantage of such features.
AI applications can run faster and consume less power when paired with processors designed to handle the set of mathematical operations these systems require. Intel® architecture provides specialized instruction sets, fast bus connections to parallel onboard computing cores, and computationally cheaper memory access. The environment composed of an Intel® processor, Intel® C++ Compiler XE, and Intel® MKL empowers developers to build tomorrow's intelligent machines.
1. Merritt, Dennis. Building Expert Systems in Prolog, Springer-Verlag, 1989.
2. Russell, Stuart; Norvig, Peter. Artificial Intelligence: A Modern Approach, Prentice Hall Series in Artificial Intelligence, Pearson Education Inc., 2nd edition, 2003.
3. Rennie, Jason D.; Shih, Lawrence; Teevan, Jaime; Karger, David R. Tackling the Poor Assumptions of Naive Bayes Text Classifiers. In: International Conference on Machine Learning, pp. 616-623, 2003.
4. Mello, Flávio L.; Xexéo, José A. M. Cryptographic Algorithm Identification Using Machine Learning and Massive Processing. IEEE Latin America Transactions, v. 14, pp. 4585-4590, 2016. doi: 10.1109/TLA.2016.7795833
5. Metadox Group, Mentor, 2017. http://www.metadox.com.br/mentor.html Accessed on June 12th, 2017.
6. Intel Corporation. Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2B: Instruction Set Reference, M-U, Order Number: 253667-060US, September, 2016. http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-2b-manual.pdf Accessed on May 30th, 2017.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804