<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Tue, 24 Nov 2009 19:50:31 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network Comments feed</title>
    <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/feed/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>By Michael Stoner (Intel)</title>
      <description><![CDATA[ A 50x gain is amazing.  Can you elaborate on what the compiler did to achieve this?  What ICL switches did you use? ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-9761</link>
      <pubDate>Tue, 16 Dec 2008 13:25:19 -0800</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-9761</guid>
    </item>
    <item>
      <title>By bgkyer</title>
      <description><![CDATA[ I'm curious to see the code myself - I've done some work using Monte Carlo models with TBB and CUDA for American, European, European Spread and Asian (Geometric and Arithmetic) options. CUDA was by far the fastest approach on a 9800 but required quite a bit more work on the American algorithm than a TBB-style implementation.

Is your American implementation based on Haug's "The Complete Guide to Option Pricing Formulas" book?

I compiled the code using VS2008 but not with the Intel Compiler (I've recently gotten access to Parallel Studio, so I plan on giving that a whirl). ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-16440</link>
      <pubDate>Tue, 20 Jan 2009 19:08:12 -0800</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-16440</guid>
    </item>
    <item>
      <title>By bgkyer</title>
      <description><![CDATA[ Sorry, I meant your binomial algorithm, not American, in my question. ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-16441</link>
      <pubDate>Tue, 20 Jan 2009 19:10:02 -0800</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-16441</guid>
    </item>
    <item>
      <title>By jb</title>
      <description><![CDATA[ Where can I go to download the referenced source code? I am unable to find a link to it.

Thank you. ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-24094</link>
      <pubDate>Sat, 09 May 2009 19:43:39 -0700</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-24094</guid>
    </item>
    <item>
      <title>By Shuo Li (Intel)</title>
      <description><![CDATA[ Hi bgkyer and Mike,

It's not that hard to achieve a 50x performance improvement by applying a combination of the most popular optimization techniques. You can do that too.
1) Adopt an optimizing compiler. Here I used Intel Compiler 11.0. The simplest and easiest switch to use is /fast, which combines /QxT /O3 /Qipo /prec-div.
2) Look at the performance hotspot, which should lead you to the loops in binomialOptionsCPU in the code.
3) Inline function calls inside the loop. This gives the compiler's autovectorizer a chance to generate SIMD code that can accelerate execution.
4) Use OpenMP on the outer loop. Modern microprocessors have many cores, and each core may run multiple threads. OpenMP will use all the cores you have.
5) Ensure that the inner loops are vectorized. While OpenMP works on the outermost loop, vectorization can still work on the inner loops. These two techniques can be used together to achieve the highest possible performance.
6) Make a simple algorithmic improvement to reduce the amount of calculation. In populating the end nodes, the program loops over each leaf node and calculates its value as if the nodes had no connection to each other. Careful observation reveals that each node takes exactly one more upward path and one fewer downward path than the previous node. Taking advantage of this fact, we can calculate each node from its previous node with a single multiply operation. The end result is that the expfc() call is removed from the loop body.
7) When you run the final code in a production environment, you will need the OpenMP runtime. You may want to set the OpenMP thread affinity in the target execution environment for maximum performance.
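
As a rough illustration of steps 4 to 6, here is a minimal sketch. It is not the article's actual source: OptionData, binomialCall and priceAll are hypothetical names, and it assumes a European call on a standard recombining tree with nonzero volatility. It only shows how the leaf recurrence, the vectorizable inner loop and the OpenMP outer loop could fit together.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical option description; field names are illustrative only.
struct OptionData {
    double S;  // spot price
    double X;  // strike price
    double T;  // time to expiry, in years
    double R;  // risk-free rate
    double V;  // volatility (assumed > 0)
};

// Price one European call on a recombining binomial tree.
double binomialCall(const OptionData& opt, int numSteps) {
    assert(numSteps > 0);
    const double dt  = opt.T / numSteps;
    const double vDt = opt.V * std::sqrt(dt);
    const double rDt = opt.R * dt;
    const double u   = std::exp(vDt);
    const double d   = std::exp(-vDt);
    const double pu  = (std::exp(rDt) - d) / (u - d);  // risk-neutral up probability
    const double df  = std::exp(-rDt);                 // one-step discount factor
    const double puByDf = pu * df;
    const double pdByDf = (1.0 - pu) * df;

    std::vector<double> call(numSteps + 1);

    // Step 6: leaf i has one more up move and one fewer down move than
    // leaf i-1, so its price is the previous price times u/d = exp(2*vDt).
    // exp() is called twice before the loop instead of once per leaf.
    double price = opt.S * std::exp(-vDt * numSteps);
    const double ratio = std::exp(2.0 * vDt);
    for (int i = 0; i <= numSteps; ++i) {
        call[i] = std::max(price - opt.X, 0.0);
        price *= ratio;
    }

    // Step 5: backward induction. The inner loop reads call[j + 1] before
    // that element is overwritten, so it is a candidate for vectorization.
    for (int i = numSteps; i > 0; --i) {
        for (int j = 0; j < i; ++j) {
            call[j] = puByDf * call[j + 1] + pdByDf * call[j];
        }
    }
    return call[0];
}

// Step 4: the options are independent, so the outer loop runs in parallel.
std::vector<double> priceAll(const std::vector<OptionData>& opts, int numSteps) {
    std::vector<double> result(opts.size());
    const int n = static_cast<int>(opts.size());
    #pragma omp parallel for
    for (int k = 0; k < n; ++k) {
        result[k] = binomialCall(opts[k], numSteps);
    }
    return result;
}
```

If OpenMP is not enabled at compile time (for example with /Qopenmp on the Intel compiler), the pragma is ignored and the same code runs serially.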

I hope these steps help you as much as they helped me. If you run into difficulties, please let me know. Just email me at shuo.li@intel.com

Best,

Shuo ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-26323</link>
      <pubDate>Fri, 19 Jun 2009 14:55:19 -0700</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-26323</guid>
    </item>
    <item>
      <title>By vysakh</title>
      <description><![CDATA[ Where can I download this program?
I am not able to find a link.

Please help.
Thanks. ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-26756</link>
      <pubDate>Sun, 28 Jun 2009 19:15:30 -0700</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-26756</guid>
    </item>
    <item>
      <title>By hahaliao</title>
      <description><![CDATA[ Does anyone know the link to download the full source code?
Your help would be greatly appreciated.
Thanks. ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-29944</link>
      <pubDate>Fri, 21 Aug 2009 02:55:01 -0700</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-29944</guid>
    </item>
    <item>
      <title>By k_sarnath</title>
      <description><![CDATA[ 
A binomial European run with 1000 options and 2000 timesteps takes 5.xxx seconds using the MSVC compiler on an AMD Athlon 2.41GHz. This is from our test data.

But your article claims 28 seconds, which is a huge number. Is there anything that I am missing here?

We have a GPU version that does this in 52 milliseconds on an 8800 GTX (or Tesla C1060 - I can't remember now).

Kindly update! Thanks! ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-34948</link>
      <pubDate>Wed, 18 Nov 2009 01:47:47 -0800</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-34948</guid>
    </item>
    <item>
      <title>By k_sarnath</title>
      <description><![CDATA[ Helooooo... ]]></description>
      <link>http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-35094</link>
      <pubDate>Thu, 19 Nov 2009 23:28:51 -0800</pubDate>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/high-performance-computing-with-binomial-option-pricing-part-1/#comment-35094</guid>
    </item>
  </channel></rss>