For more information about the BLAS, Sparse BLAS, LAPACK, ScaLAPACK, Sparse Solver, Extended Eigensolver, VM, VS, FFT, Non-Linear Optimization Solvers, and Data Fitting functionality, refer to the following publications:

  • BLAS Level 1

    C. Lawson, R. Hanson, D. Kincaid, and F. Krogh. Basic Linear Algebra Subprograms for Fortran Usage, ACM Transactions on Mathematical Software, Vol. 5, No. 3 (September 1979) 308-325.

  • BLAS Level 2

    J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson. An Extended Set of Fortran Basic Linear Algebra Subprograms, ACM Transactions on Mathematical Software, Vol. 14, No. 1 (March 1988) 1-32.

  • BLAS Level 3

    J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling. A Set of Level 3 Basic Linear Algebra Subprograms, ACM Transactions on Mathematical Software (December 1989).

  • Sparse BLAS

    D. Dodson, R. Grimes, and J. Lewis. Sparse Extensions to the FORTRAN Basic Linear Algebra Subprograms, ACM Transactions on Mathematical Software, Vol. 17, No. 2 (June 1991).

    D. Dodson, R. Grimes, and J. Lewis. Algorithm 692: Model Implementation and Test Package for the Sparse Basic Linear Algebra Subprograms, ACM Transactions on Mathematical Software, Vol. 17, No. 2 (June 1991).


    I. S. Duff, A. M. Erisman, and J. K. Reid. Direct Methods for Sparse Matrices. Clarendon Press, Oxford, UK, 1986.


    Compaq Extended Math Library. Reference Guide, Oct. 2001.


    K. Remington. A NIST FORTRAN Sparse BLAS User's Guide. (available on


    Y. Saad. SPARSKIT: A Basic Tool-kit for Sparse Matrix Computation. Version 2, 1994. (


    Y. Saad. Iterative Methods for Sparse Linear Systems. PWS Publishing, Boston, 1996.

  • LAPACK

    A. A. Anda and H. Park. Fast plane rotations with dynamic scaling, SIAM J. Matrix Anal. Appl., Vol. 15 (1994), pp. 162-174.


    M. Baudin and R. Smith. A Robust Complex Division in Scilab, available as arXiv:1210.4539v2 (2012).


    C. H. Bischof, B. Lang, and X. Sun. Algorithm 807: The SBR toolbox-software for successive band reduction, ACM Transactions on Mathematical Software, Vol. 26, No. 4, pages 602-616, December 2000.


    J. Demmel and K. Veselic. Jacobi's method is more accurate than QR, SIAM J. Matrix Anal. Appl. 13(1992):1204-1246.


    J. Demmel, L. Grigori, M. F. Hoemmen, and J. Langou. Communication-optimal parallel and sequential QR and LU factorizations, SIAM Journal on Scientific Computing, Vol. 34, No 1, 2012.


    P. P. M. De Rijk. A one-sided Jacobi algorithm for computing the singular value decomposition on a vector computer, SIAM J. Sci. Stat. Comp., Vol. 10 (1989), pp. 359-371.


    I. Dhillon, B. Parlett. Multiple representations to compute orthogonal eigenvectors of symmetric tridiagonal matrices, Linear Algebra and its Applications, 387(1), pp. 1-28, August 2004.


    I. Dhillon, B. Parlett. Orthogonal Eigenvectors and Relative Gaps, SIAM Journal on Matrix Analysis and Applications, Vol. 25, 2004. (Also LAPACK Working Note 154.)


    I. Dhillon. A new O(n^2) algorithm for the symmetric tridiagonal eigenvalue/eigenvector problem, Computer Science Division Technical Report No. UCB/CSD-97-971, UC Berkeley, May 1997.


    Z. Drmac and K. Veselic. New fast and accurate Jacobi SVD algorithm I, SIAM J. Matrix Anal. Appl. Vol. 29, No. 4 (2008), pp. 1322-1342. LAPACK Working Note 169.


    Z. Drmac and K. Veselic. New fast and accurate Jacobi SVD algorithm II, SIAM J. Matrix Anal. Appl. Vol. 29, No. 4 (2008), pp. 1343-1362. LAPACK Working Note 170.


    Z. Drmac and Z. Bujanovic. On the failure of rank-revealing QR factorization software - a case study, ACM Trans. Math. Softw. Vol. 35, No. 2 (2008), pp. 1-28. LAPACK Working Note 176.


    Z. Drmac. Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic, SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.


    E. Elmroth and F. Gustavson. Applying Recursion to Serial and Parallel QR Factorization Leads to Better Performance, IBM J. Research & Development, Vol. 44, No. 4, 2000, pp 605-624.


    G. Golub and C. Van Loan. Matrix Computations, Johns Hopkins University Press, Baltimore, third edition, 1996.


    E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide, Third Edition, Society for Industrial and Applied Mathematics (SIAM), 1999.


    W. Kahan. Accurate Eigenvalues of a Symmetric Tridiagonal Matrix, Report CS41, Computer Science Dept., Stanford University, July 21, 1966.


    O. Marques, E. J. Riedy, and Ch. Voemel. Benefits of IEEE-754 Features in Modern Symmetric Tridiagonal Eigensolvers, SIAM Journal on Scientific Computing, Vol. 28, No. 5, 2006. (Tech report version in LAPACK Working Note 172 with the same title.)


    Brian D. Sutton. Computing the complete CS decomposition, Numer. Algorithms, 50(1):33-65, 2009.

  • ScaLAPACK

    L. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. Whaley. ScaLAPACK Users' Guide, Society for Industrial and Applied Mathematics (SIAM), 1997.

  • Sparse Solver


    I. S. Duff and J. Koster. The Design and Use of Algorithms for Permuting Large Entries to the Diagonal of Sparse Matrices. SIAM J. Matrix Analysis and Applications, 20(4):889-901, 1999.


    J. Dongarra, V. Eijkhout, and A. Kalhan. Reverse Communication Interface for Linear Algebra Templates for Iterative Methods. UT-CS-95-291, May 1995.


    G. Karypis and V. Kumar. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs. SIAM Journal on Scientific Computing, 20(1):359-392, 1998.


    X.S. Li and J.W. Demmel. A Scalable Sparse Direct Solver Using Static Pivoting. In Proceedings of the 9th SIAM Conference on Parallel Processing for Scientific Computing, San Antonio, Texas, March 22-24, 1999.


    J.W.H. Liu. Modification of the Minimum-Degree algorithm by multiple elimination. ACM Transactions on Mathematical Software, 11(2):141-153, 1985.


    R. Menon and L. Dagum. OpenMP: An Industry-Standard API for Shared-Memory Programming. IEEE Computational Science & Engineering, 1:46-55, 1998.


    Y. Saad. Iterative Methods for Sparse Linear Systems. 2nd edition, SIAM, Philadelphia, PA, 2003.


    O. Schenk. Scalable Parallel Sparse LU Factorization Methods on Shared Memory Multiprocessors. PhD thesis, ETH Zurich, 2000.


    O. Schenk, K. Gartner, and W. Fichtner. Efficient Sparse LU Factorization with Left-right Looking Strategy on Shared Memory Multiprocessors. BIT, 40(1):158-176, 2000.


    O. Schenk and K. Gartner. Sparse Factorization with Two-Level Scheduling in PARDISO. In Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing, Portsmouth, Virginia, March 12-14, 2001.


    O. Schenk and K. Gartner. Two-level scheduling in PARDISO: Improved Scalability on Shared Memory Multiprocessing Systems. Parallel Computing, 28:187-197, 2002.


    O. Schenk and K. Gartner. Solving Unsymmetric Sparse Systems of Linear Equations with PARDISO. Journal of Future Generation Computer Systems, 20(3):475-487, 2004.


    O. Schenk and K. Gartner. On Fast Factorization Pivoting Methods for Sparse Symmetric Indefinite Systems. Technical Report, Department of Computer Science, University of Basel, 2004, submitted.


    P. Sonneveld. CGS, a Fast Lanczos-Type Solver for Nonsymmetric Linear Systems. SIAM Journal on Scientific and Statistical Computing, 10:36-52, 1989.


    D. M. Young. Iterative Solution of Large Linear Systems. New York, Academic Press, Inc., 1971.

  • Extended Eigensolver


    E. Polizzi, Density-Matrix-Based Algorithms for Solving Eigenvalue Problems, Phys. Rev. B. Vol. 79, 115112, 2009.


    E. Polizzi, A High-Performance Numerical Library for Solving Eigenvalue Problems: FEAST Solver v2.0 User's Guide, 2012.


    Z. Bai, J. Demmel, J. Dongarra, A. Ruhe and H. van der Vorst, editors, Templates for the solution of Algebraic Eigenvalue Problems: A Practical Guide. SIAM, Philadelphia, 2000.


    G. L. G. Sleijpen and H. A. van der Vorst. A Jacobi-Davidson iteration method for linear eigenvalue problems. SIAM J. Matrix Anal. Appl., 17:401-425, 1996.

  • VS


    Intel. Intel® Advanced Vector Extensions Programming Reference. (


    Nedret Billor, Ali S. Hadi, and Paul F. Velleman. BACON: blocked adaptive computationally efficient outlier nominators. Computational Statistics & Data Analysis, 34, 279-298, 2000.


    Bratley P., Fox B.L., and Schrage L.E. A Guide to Simulation. 2nd edition. Springer-Verlag, New York, 1987.


    Bratley P. and Fox B.L. Implementing Sobol's Quasirandom Sequence Generator, ACM Transactions on Mathematical Software, Vol. 14, No. 1, Pages 88-100, March 1988.


    Bratley P., Fox B.L., and Niederreiter H. Implementation and Tests of Low-Discrepancy Sequences, ACM Transactions on Modeling and Computer Simulation, Vol. 2, No. 3, Pages 195-213, July 1992.

    Intel. Bull Mountain Technology Software Implementation Guide. (

    Coddington, P. D. Analysis of Random Number Generators Using Monte Carlo Simulation. Int. J. Mod. Phys. C-5, 547, 1994.


    Fritsch, F. N. and Carlson, R. E. Monotone Piecewise Cubic Interpolation. SIAM Journal on Numerical Analysis (SIAM) 17 (2): 238-246, 1980.


    Gentle, James E. Random Number Generation and Monte Carlo Methods, Springer-Verlag New York, Inc., 1998.


    Hyman, J. M. Accurate monotonicity preserving cubic interpolation, SIAM J. Sci. Stat. Comput. 4, 645-654, 1983.

    Intel. Intel® 64 and IA-32 Architectures Software Developer’s Manual. 3 vols. (

    L'Ecuyer, Pierre. Uniform Random Number Generation. Annals of Operations Research, 53, 77-120, 1994.


    L'Ecuyer, Pierre. Tables of Linear Congruential Generators of Different Sizes and Good Lattice Structure. Mathematics of Computation, 68, 225, 249-260, 1999.


    L'Ecuyer, Pierre. Good Parameter Sets for Combined Multiple Recursive Random Number Generators. Operations Research, 47, 1, 159-164, 1999.


    L'Ecuyer, Pierre. Software for Uniform Random Number Generation: Distinguishing the Good and the Bad. Proceedings of the 2001 Winter Simulation Conference, IEEE Press, 95-105, Dec. 2001.


    Kirkpatrick, S., and Stoll, E. A Very Fast Shift-Register Sequence Random Number Generator. Journal of Computational Physics, V. 40. 517-526, 1981.


    Knuth, Donald E. The Art of Computer Programming, Volume 2, Seminumerical Algorithms. 2nd edition, Addison-Wesley Publishing Company, Reading, Massachusetts, 1981.


    Maronna, R.A., and Zamar, R.H., Robust Multivariate Estimates for High-Dimensional Datasets, Technometrics, 44, 307-317, 2002.


    Matsumoto, M., and Nishimura, T. Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator, ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, Pages 3-30, January 1998.


    Matsumoto, M., and Nishimura, T. Dynamic Creation of Pseudorandom Number Generators, 56-69, in: Monte Carlo and Quasi-Monte Carlo Methods 1998, Ed. Niederreiter, H. and Spanier, J., Springer 2000.


    NAG Numerical Libraries.


    David M. Rocke, Robustness properties of S-estimators of multivariate location and shape in high dimension. The Annals of Statistics, 24(3), 1327-1345, 1996.


    Saito, M., and Matsumoto, M. SIMD-oriented Fast Mersenne Twister: a 128-bit Pseudorandom Number Generator. Monte Carlo and Quasi-Monte Carlo Methods 2006, Springer, Pages 607 – 622, 2008.


    Salmon, John K., Morales, Mark A., Dror, Ron O., and Shaw, David E., Parallel Random Numbers: As Easy as 1, 2, 3. SC '11 Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, 2011.


    Schafer, J.L., Analysis of Incomplete Multivariate Data. Chapman & Hall, 1997.


    Sobol, I.M., and Levitan, Yu.L. The production of points uniformly distributed in a multidimensional cube. Preprint 40, Institute of Applied Mathematics, USSR Academy of Sciences, 1976 (In Russian).

    [SSL Notes]

    Intel® MKL Summary Statistics Application Notes, a document present on the Intel® MKL product at

    [VS Notes]

    Intel® MKL Vector Statistics Notes, a document present on the Intel® MKL product at

    [VS Data]

    Intel® MKL Vector Statistics Performance, a document present on the Intel® MKL product at

  • VM


    ISO/IEC 9899:1999/Cor 3:2007. Programming languages -- C.


    J. M. Muller. Elementary functions: algorithms and implementation, Birkhauser Boston, 1997.


    IEEE Standard for Binary Floating-Point Arithmetic. ANSI/IEEE Std 754-2008.

    [VM Data]

    Intel® MKL Vector Mathematics Performance and Accuracy, a document present on the Intel® MKL product at

  • FFT


    E. Oran Brigham, The Fast Fourier Transform and Its Applications, Prentice Hall, New Jersey, 1988.


    Athanasios Papoulis, The Fourier Integral and its Applications, 2nd edition, McGraw-Hill, New York, 1984.


    Ping Tak Peter Tang, DFTI - a new interface for Fast Fourier Transform libraries, ACM Transactions on Mathematical Software, Vol. 31, Issue 4, Pages 475 - 507, 2005.


    Charles Van Loan, Computational Frameworks for the Fast Fourier Transform, SIAM, Philadelphia, 1992.

  • Optimization Solvers


    A. R. Conn, N. I. M. Gould, and P. L. Toint. Trust-Region Methods. SIAM Society for Industrial & Applied Mathematics, Englewood Cliffs, New Jersey, MPS-SIAM Series on Optimization edition, 2000.

  • Data Fitting Functions


    Carl de Boor. A Practical Guide to Splines. Revised Edition. Springer-Verlag New York Berlin Heidelberg, 2001.


    Larry L Schumaker. Spline Functions: Basic Theory. 3rd Edition. Cambridge University Press, Cambridge, 2007.


    S. B. Stechkin and Yu. Subbotin. Splines in Numerical Mathematics. Izd. Nauka, Moscow, 1976.

For reference implementations of the BLAS, Sparse BLAS, LAPACK, and ScaLAPACK packages, visit
