By shuo-li (Intel)

Download: available under the Intel Sample Source Code License Agreement.

### Introduction

Financial derivative pricing is a cornerstone of quantitative finance. The most common form of financial derivative is the common stock option, a contract between two parties about buying or selling an asset at a certain time at an agreed price. There are two types of options: calls and puts. A *call option* gives the holder the right to buy the underlying asset by a certain date for a certain price. A *put option* gives the holder the right to sell the underlying asset by a certain date for a certain price. The price specified in the contract is called the *exercise price* or *strike price*. The date in the contract is known as the *expiration date* or *maturity*. American options can be exercised at any time before the expiration date. European options can be exercised only on the expiration date.

Typically, the value of an option, *f*, is determined by the following factors:

- S – the current price of the underlying asset
- X – the strike price of the option
- T – the time to the expiration
- σ – the volatility of the underlying asset
- r – the continuously compounded risk-free rate

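In their 1973 paper, “*The Pricing of Options and Corporate Liabilities*”, Fischer Black and Myron Scholes created a mathematical description of financial markets and stock options, working in frameworks built by researchers from Louis Bachelier to Paul Samuelson and Jack Treynor. They arrived at a partial differential equation (PDE), which Robert Merton was the first to refer to as the Black-Scholes model. In the notation above, the equation is:

$$\frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2} = rf$$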

This PDE has many solutions, corresponding to all the different derivatives that share the same underlying asset S. The specific derivative obtained depends on the boundary conditions used when solving the equation. In the case of European call options, the key boundary condition is

$$f_{\text{call}} = \max(S - X, 0) \quad \text{when } t = T$$

In the case of European put options, it is

$$f_{\text{put}} = \max(X - S, 0) \quad \text{when } t = T$$
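The two boundary conditions are simply the option payoffs at expiration. A minimal sketch follows, where the function names are hypothetical and `X` is the strike price from the list of factors above:

```cpp
#include <algorithm>
#include <cassert>

// Payoff of a European call at expiration: max(S - X, 0).
double call_payoff(double S, double X) {
    assert(S >= 0.0 && X >= 0.0);  // prices are assumed non-negative
    return std::max(S - X, 0.0);
}

// Payoff of a European put at expiration: max(X - S, 0).
double put_payoff(double S, double X) {
    assert(S >= 0.0 && X >= 0.0);
    return std::max(X - S, 0.0);
}
```

With a strike of 100, an asset price of 110 at expiration gives a call payoff of 10 and a put payoff of 0.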

### Black-Scholes-Merton Formula

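Shortly after Black and Scholes published their historic paper, Robert Merton was the first to publish a paper recognizing its significance, and he coined the term *Black-Scholes option pricing model*. Merton is also credited with a closed-form solution to the Black-Scholes equation for the European call option *c* and the European put option *p*, known as the Black-Scholes-Merton formula:

$$c = S\,N(d_1) - X e^{-rT} N(d_2)$$

$$p = X e^{-rT} N(-d_2) - S\,N(-d_1)$$

where

$$d_1 = \frac{\ln(S/X) + (r + \sigma^2/2)\,T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}$$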

The function N(x) is the cumulative normal distribution function. It gives the probability that a variable with a standard normal distribution, *Φ*(0,1), will be less than x. In most cases, N(x) is approximated using a polynomial function.
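As an illustration, here is a minimal scalar sketch of the closed-form prices, c = S·N(d1) − X·e^(−rT)·N(d2) and p = X·e^(−rT)·N(−d2) − S·N(−d1), with N(x) computed by a polynomial. The coefficients are those of the widely used Abramowitz and Stegun approximation (formula 26.2.17). Whether this is the exact polynomial used by the benchmark is an assumption, and the names `cnd_poly`, `OptionPrices`, and `black_scholes` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Polynomial approximation of the cumulative normal distribution N(x)
// (Abramowitz and Stegun 26.2.17, absolute error below about 1e-7).
double cnd_poly(double x) {
    const double a1 = 0.319381530, a2 = -0.356563782, a3 = 1.781477937,
                 a4 = -1.821255978, a5 = 1.330274429;
    const double inv_sqrt_2pi = 0.39894228040143267794;
    const double ax = std::fabs(x);
    const double k = 1.0 / (1.0 + 0.2316419 * ax);
    // Horner evaluation of a1*k + a2*k^2 + ... + a5*k^5.
    const double poly = k * (a1 + k * (a2 + k * (a3 + k * (a4 + k * a5))));
    // tail approximates 1 - N(|x|), which equals N(-|x|).
    const double tail = inv_sqrt_2pi * std::exp(-0.5 * ax * ax) * poly;
    return x >= 0.0 ? 1.0 - tail : tail;
}

struct OptionPrices {
    double call;
    double put;
};

// European call and put prices for spot S, strike X, time to expiration T,
// risk-free rate r, and volatility sigma.
OptionPrices black_scholes(double S, double X, double T, double r, double sigma) {
    assert(S > 0.0 && X > 0.0 && T > 0.0 && sigma > 0.0);
    const double sqrtT = std::sqrt(T);
    const double d1 = (std::log(S / X) + (r + 0.5 * sigma * sigma) * T) / (sigma * sqrtT);
    const double d2 = d1 - sigma * sqrtT;
    const double discounted_strike = X * std::exp(-r * T);
    OptionPrices out;
    out.call = S * cnd_poly(d1) - discounted_strike * cnd_poly(d2);
    out.put = discounted_strike * cnd_poly(-d2) - S * cnd_poly(-d1);
    return out;
}
```

For S = 100, X = 100, T = 1, r = 0.05, and σ = 0.2, this gives a call price of about 10.45 and a put price of about 5.57.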

### Code Access

The source code for the Black-Scholes-Merton formula is maintained by Shuo Li and is available under the BSD 3-Clause Licensing Agreement. The program runs natively on Intel® Xeon Phi™ coprocessors in a single-node environment.

To get access to the code and test workloads, go to the source location and download the BlackScholes.tar file.

### Build Directions

Here are the steps for rebuilding the program:

- Install Intel® Composer XE 2013 SP 2 on your system.
- Source the environment variable script file under
- Untar the BlackScholes.tar
- Type make to build the binary
- make

      icpc -DFPFLOAT -O3 -ipo -mmic -fno-alias -opt-threads-per-core=4 -openmp -restrict -vec-report2 -fimf-precision=low -fimf-domain-exclusion=31 -no-prec-div -no-prec-sqrt -DCOMPILER_VERSION=\""icpc-20140120"\" -ltbbmalloc -o MonteCarloSP.knc MonteCarlo.cpp
      icpc -O3 -ipo -mmic -fno-alias -opt-threads-per-core=4 -openmp -restrict -vec-report2 -fimf-precision=low -fimf-domain-exclusion=31 -no-prec-div -no-prec-sqrt -DCOMPILER_VERSION=\""icpc-20140120"\" -ltbbmalloc -o MonteCarloDP.knc MonteCarlo.cpp

- Executable files
  - For single-precision processing: BlackScholesSP.knc
  - For double-precision processing: BlackScholesDP.knc

### Run Directions

Copy the following files to the Intel® Xeon Phi™ coprocessor:

    [prompt]$ scp BlackScholesSP.knc yourhost-mic0:
    [prompt]$ scp BlackScholesDP.knc yourhost-mic0:
    [prompt]$ scp /opt/intel/composerxe/lib/mic/libiomp5.so yourhost-knc1-mic0:
    [prompt]$ scp /opt/intel/composerxe/tbb/lib/mic/libtbbmalloc.so yourhost-knc1-mic0:
    [prompt]$ scp /opt/intel/composerxe/tbb/lib/mic/libtbbmalloc.so.2 yourhost-knc1-mic0:

Enable turbo on the Intel Xeon Phi coprocessor:

    [prompt]$ sudo /opt/intel/mic/bin/micsmc --turbo status
    mic0 (Turbo Mode Status): Turbo Mode is DISABLED
    mic1 (Turbo Mode Status): Turbo Mode is DISABLED
    [prompt]$ sudo /opt/intel/mic/bin/micsmc --turbo enable
    Information: mic0: Turbo Mode Enable succeeded.
    Information: mic1: Turbo Mode Enable succeeded.

Make sure your Intel Xeon Phi coprocessor is C0-7120P/7120X/7120.

Set the environment variables and invoke the executable files from the host OS environment.

The program is built on the host and executes on the Intel Xeon Phi coprocessor. It processes close to 16M sets of option data, an average of 64K data sets for each of its 244 threads. The program loops 1000 times, reading the option input data and pricing the options. Each time the program processes a set of option input data, it calculates both the European call and the European put values. Calls and puts are counted separately when calculating options/sec. Besides options/sec, the program also outputs the total cycles spent on pricing, the cycles spent on each option pair, and the total time elapsed. Data validation is part of the program. During the validation phase, the input data goes through unoptimized scalar code to create reference results, which are compared against the optimized, vectorized, and parallelized results.
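The workload size and the options/sec metric described above can be sketched as follows. This is only an illustration of the arithmetic, not the benchmark's own code: the function names are hypothetical, and 64K is assumed to mean 64 × 1024, which gives 244 × 65,536 = 15,990,784 data sets, consistent with "close to 16M".

```cpp
#include <cassert>
#include <cstdint>

// Number of option values priced in one run. Each data set yields one call
// and one put, and the two are counted separately.
std::uint64_t option_values_priced(std::uint64_t threads,
                                   std::uint64_t sets_per_thread,
                                   std::uint64_t iterations) {
    const std::uint64_t values_per_set = 2;  // one call + one put
    return threads * sets_per_thread * iterations * values_per_set;
}

// Throughput in options/sec for a measured elapsed time.
double options_per_second(std::uint64_t values_priced, double elapsed_seconds) {
    assert(elapsed_seconds > 0.0);
    return static_cast<double>(values_priced) / elapsed_seconds;
}
```

Under these assumptions, 244 threads, 64 × 1024 data sets per thread, and 1000 iterations give 31,981,568,000 option values per run. Dividing by the measured elapsed time yields options/sec.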

This benchmark runs on a single node on the coprocessor. It can also be modified to run in a cluster environment.

### Implementation Notes

Our first attempt at optimizing the calculation of the Black-Scholes-Merton formula included using mathematical equivalences, taking advantage of the capabilities available in development tools, and using target-specific capabilities.

### Put-Call Parity

Using the *c* and *p* equations from the Black-Scholes-Merton Formula section, together with the identity N(−x) = 1 − N(x), notice that:

$$p = c - S + X e^{-rT}$$

This simply means that once you get the call option price *c*, you can get the put option price *p* with a simple addition and subtraction of intermediate results.
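A minimal sketch of this shortcut, assuming the standard put-call parity relation p = c − S + X·e^(−rT) for European options on the same underlying asset, strike, and expiration. The function name is hypothetical:

```cpp
#include <cassert>
#include <cmath>

// Put price derived from an already computed call price via put-call parity:
//   p = c - S + X * exp(-r * T)
double put_from_call(double call, double S, double X, double T, double r) {
    assert(T >= 0.0);
    return call - S + X * std::exp(-r * T);
}
```

In a pricing loop, X·e^(−rT) is already needed for the call price, so reusing it means the put costs only one extra subtraction and one extra addition, with no further N(x) evaluations.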

### N(x) and erf(x)

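N(x) is the cumulative normal distribution function. Mathematically, it is usually represented by the capital Greek letter Φ, so N(x) = Φ(x). It is related to the error function erf(x) by:

$$N(x) = \Phi(x) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\!\left(\frac{x}{\sqrt{2}}\right)$$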

With this relationship, if there is a fast implementation of erf(x), it can be used for N(x). The extra addition and multiplication do not penalize performance when erf(x) can take advantage of SIMD execution and the alternative implementation of N(x) cannot. As part of its vectorized runtime library, the Intel Compiler provides a vectorized erf(x) function that can be called on scalar or SIMD data.
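A minimal sketch of N(x) built on erf(x), using the identity N(x) = 1/2 + 1/2·erf(x/√2) and the standard `std::erf` from `<cmath>`. The function names are hypothetical. Whether a given compiler vectorizes the loop depends on the compiler and its flags, so treat that as an assumption rather than a guarantee.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Scalar N(x) expressed through the error function.
double cnd_erf(double x) {
    const double inv_sqrt2 = 0.70710678118654752440;  // 1 / sqrt(2)
    return 0.5 + 0.5 * std::erf(x * inv_sqrt2);
}

// Array form: a simple loop with no branches, which is the shape a
// vectorizing compiler can map onto a SIMD erf implementation.
void cnd_erf_array(const float* x, float* out, std::size_t n) {
    assert((x != nullptr && out != nullptr) || n == 0);
    const float inv_sqrt2 = 0.70710678f;
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = 0.5f + 0.5f * std::erf(x[i] * inv_sqrt2);
    }
}
```

Unlike a polynomial approximation that treats positive and negative arguments separately, this form has no branch on the sign of x, which keeps the loop body uniform across SIMD lanes.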

### Natural base vs. base 2

Natural-base logarithms and exponentials are used extensively in financial calculations because of the continuous compounding of the time value of money. In computer arithmetic, however, base 2 logarithms and exponentials can take advantage of table lookup implementations and usually have a performance advantage over their natural-base counterparts. In the extreme case of the Intel Xeon Phi coprocessor, log2(x) and exp2(x) are implemented as machine instructions with 1 and 2 cycles of throughput, while ln(x) and exp(x) are implemented as C runtime function calls.

Using the change of base formula, you can work out how to adjust the argument or the result when calling the base 2 versions of the logarithm and exponential instead of the natural-base versions:

$$e^{x} = 2^{x \log_2 e}, \qquad \ln x = \ln 2 \cdot \log_2 x$$

Both $\ln 2$ and $\log_2 e$ are constants defined in the C runtime library and included in the math.h file as M_LN2 and M_LOG2E, which makes it easy to replace an expensive exp(x) with exp2(M_LOG2E*x) and log(x) with M_LN2*log2(x).
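A minimal sketch of the two replacements follows. The wrapper names are hypothetical. `M_LN2` and `M_LOG2E` come from math.h on POSIX systems, and the `#ifndef` fallbacks are an added assumption for toolchains that do not define them by default.

```cpp
#include <cassert>
#include <math.h>  // M_LN2 and M_LOG2E on POSIX systems
#include <cmath>

#ifndef M_LOG2E
#define M_LOG2E 1.4426950408889634074  // log2(e), fallback if math.h lacks it
#endif
#ifndef M_LN2
#define M_LN2 0.69314718055994530942   // ln(2), fallback if math.h lacks it
#endif

// exp(x) rewritten as a base 2 exponential: e^x = 2^(x * log2(e)).
double exp_via_exp2(double x) {
    return std::exp2(M_LOG2E * x);
}

// log(x) rewritten as a base 2 logarithm: ln(x) = ln(2) * log2(x).
double log_via_log2(double x) {
    assert(x > 0.0);
    return M_LN2 * std::log2(x);
}
```

Each replacement adds one multiplication by a constant. The results can differ from exp(x) and log(x) in the last few bits, so it is worth rechecking the rewritten code against the reference results.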

### About the Author

Shuo Li works for the Intel Software and Service Group. His main interests are parallel programming and application software performance. In his current role as a staff software performance engineer covering the financial service industry, Shuo works closely with software developers and modelers and helps them achieve high performance with their software solutions. Shuo holds a Master's degree in Computer Science from the University of Oregon and an MBA degree from Duke University.


## Comments (1)

Wadud M. said:

Hi Shuo-li,

For the double precision version, the L1 norm is quite large, thus the test fails:

L1 norm: 1.058155E+01

TEST FAILED

The single precision version passes:

L1 norm: 6.969809E-07

TEST PASSED

Any help will be greatly appreciated.
