The Vector Math Library (VML) is designed to compute elementary functions on vector arguments. VML is an integral part of the Intel® Math Kernel Library (Intel® MKL) and the VML terminology is used here for simplicity in discussing this group of functions.
VML includes a set of highly optimized implementations of certain computationally expensive core mathematical functions (power, trigonometric, exponential, hyperbolic, etc.) that operate on vectors. VML may improve performance for such applications as nonlinear programming software, computations of integrals, and many others.
Each VML vector function (for each data format) can work in three modes: High Accuracy (HA), Low Accuracy (LA), and Enhanced Performance (EP). Most VML functions have a separate implementation for each of these three modes; the exceptions are certain functions such as those that return correctly rounded results. For many functions, the LA mode improves performance compared to HA at the cost of a slight reduction in accuracy (1 or 2 least significant bits may be inaccurate). The EP mode improves performance further, at the cost of a significant reduction in accuracy: in both single and double precision, only about half of the significand bits are expected to be correct. Moreover, in EP mode some argument values (for example, large arguments of trigonometric functions) can lead to even less accurate results.
Although the default accuracy mode is HA, LA is more than sufficient in most cases. For applications with modest accuracy requirements (for example, media applications or some Monte Carlo simulations), you may find the EP mode adequate. You can use the vmlSetMode function to control the accuracy mode.
Please refer to the Intel® Math Kernel Library Reference Manual for further details.
Accuracy behavior is processor-specific, so results might differ slightly across processor families and even within a family, for example, between some processor models or between the 64-bit and 32-bit libraries. Results might also differ slightly from release to release. Nevertheless, these differences stay within the specified error bounds.
Error and special value behavior is identical for HA and LA functions and does not depend on the processor used to run the software. Correct error and special value behavior is not guaranteed for the EP mode.
Refer to the List of VML Functions for a more detailed description of the performance and accuracy properties of the VML functions.
Note on Performance: Performance numbers in the respective tables are shown for "working" argument intervals. Performance may differ for other intervals; for example, it is quite expensive to compute trigonometric functions accurately for huge arguments. Each function lists the working interval over which performance is measured. The same page contains graphs that show how performance depends on the vector length. There are two extreme cases, short and long vectors (a logarithmic scale is used to show both). For short vectors, functions incur certain overheads, which are amortized as the vector length increases. For vectors longer than a few dozen elements, performance remains quite flat until the vector no longer fits in the L2 cache.
Data prefetching greatly reduces the performance penalty for vectors that do not fit in the cache.
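The amortization effect described above can be pictured with a simple cost model. This is an illustrative assumption, not a formula from the VML documentation: $t_0$ stands for a fixed per-call overhead, $c$ for the asymptotic cost per element while the data stay in cache, and $T(n)$ for the time of one call on a vector of length $n$.

```latex
T(n) \approx t_0 + c\,n
\qquad\Longrightarrow\qquad
\frac{T(n)}{n} \approx c + \frac{t_0}{n}
```

Under this model the per-element time approaches $c$ once $n \gg t_0 / c$, which is consistent with the flat region of the graphs. The model does not capture the slowdown for vectors that exceed the cache.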
See a comprehensive table with performance data for all the VML functions.
Note on Accuracy: The design requirement for the HA functions is to have error less than 1.0 ulp (unit-in-the-last-place), and to have all special values processed correctly. For the LA functions, the error bound is 4.0 ulps. For the EP functions, approximately half of the bits in the significand of the floating-point result need to be correct. For details, see the accuracy table with ulp errors for all the functions. Any deviations from these error bounds are highlighted in the accuracy tables, and should be considered temporary.
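To relate these bounds to relative error, the sketch below converts an error in ulps to a worst-case relative error for a binary floating-point format. It assumes normal (non-denormal) values; the helper names are hypothetical, and the "half of the significand bits" figure for EP is treated as a rough estimate rather than a guaranteed bound.

```c
#include <float.h>
#include <math.h>

/* Worst-case relative error for an error of `ulps` units in the last place,
   in a binary format with `mant_dig` significand bits (24 for float, 53 for double).
   For normal numbers, one ulp is at most 2^(1 - mant_dig) relative to the value. */
double ulps_to_relative_error(double ulps, int mant_dig)
{
    return ulps * ldexp(1.0, 1 - mant_dig);
}

/* Rough relative error when only about half of the significand bits are correct (EP mode). */
double half_significand_relative_error(int mant_dig)
{
    return ldexp(1.0, -(mant_dig / 2));
}
```

With `FLT_MANT_DIG` (24) this gives roughly 1.2e-7 for 1 ulp, 4.8e-7 for 4 ulps, and 2.4e-4 for the EP estimate; with `DBL_MANT_DIG` (53) the corresponding values are roughly 2.2e-16, 8.9e-16, and 1.5e-8.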
For complex functions, the ulp error is the maximum of the two ulp errors calculated for the real and the imaginary parts of the result.
Special Value Processing: Special values are processed in conformance with the C99 standard (known as C9X while in draft). For the special value behavior of every function, see the Intel® Math Kernel Library Reference Manual.
List of VML Functions
Performance of All VML Functions
Measured Accuracy of All VML Functions
| Optimization Notice |
|---|
| Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804 |
Copyright © 2000-2013, Intel Corporation.