<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blogs &#187; Gastón C. Hillar</title>
	<atom:link href="http://software.intel.com/en-us/blogs/author/gastn-c-hillar/feed/" rel="self" type="application/rss+xml" />
	<link>http://software.intel.com/en-us/blogs</link>
	<description></description>
	<lastBuildDate>Fri, 25 May 2012 22:49:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Intel Performance Libraries allow you to leverage both parallelism and SIMD instructions in C#</title>
		<link>http://software.intel.com/en-us/blogs/2010/12/15/intel-performance-libraries-allow-you-to-leverage-both-parallelism-and-simd-instructions-in-c/</link>
		<comments>http://software.intel.com/en-us/blogs/2010/12/15/intel-performance-libraries-allow-you-to-leverage-both-parallelism-and-simd-instructions-in-c/#comments</comments>
		<pubDate>Wed, 15 Dec 2010 17:25:04 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[.NET Framework 4]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[CSharp]]></category>
		<category><![CDATA[gaston hillar]]></category>
		<category><![CDATA[Intel Performance Libraries]]></category>
		<category><![CDATA[multicore programming]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[simd]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2010/12/15/intel-performance-libraries-allow-you-to-leverage-both-parallelism-and-simd-instructions-in-c/</guid>
		<description><![CDATA[Modern microprocessors can execute Single Instruction, Multiple Data (SIMD) instructions. Because the execution units for SIMD instructions usually belong to a physical core, it is possible to run as many SIMD instructions in parallel as available physical cores. The usage of these vector-processing capabilities in parallel can provide important speedups in certain algorithms. You can [...]]]></description>
			<content:encoded><![CDATA[<p>Modern microprocessors can execute Single Instruction, Multiple Data (SIMD) instructions. Because the execution units for SIMD instructions usually belong to a physical core, it is possible to run as many SIMD instructions in parallel as available physical cores. The usage of these vector-processing capabilities in parallel can provide important speedups in certain algorithms. You can use Intel Performance Libraries to leverage both parallelism and SIMD instructions in C# and .NET Framework 4.</p>
<p>Here’s a simple example that will help you understand the power of SIMD instructions. The next figure shows a diagram that represents the PABSD instruction. This instruction is part of the Supplemental Streaming SIMD Extensions 3 (SSSE3) introduced with the Intel Core 2 architecture.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/SIMD_01.png"><img class="alignnone size-full wp-image-22105" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/SIMD_01.png" alt="Representation of the PABSD instruction" width="516" height="224" /></a></p>
<p>The PABSD mnemonic means packed absolute value for double-word. This assembly instruction receives a 128-bit input parameter that contains four 32-bit signed integers. The instruction returns a 128-bit output that contains the absolute value for each of the four 32-bit signed integers, packed in the 128-bit output.</p>
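<p>The behavior described above can be modeled in a few lines. The following is only a sketch in Python, not real SIMD code: the function name pabsd and the sample values are made up for illustration, and the standard struct module stands in for a 128-bit register by packing four 32-bit integers into 16 bytes.</p>
<pre>
```python
import struct

def pabsd(packed):
    """Model PABSD: take 16 bytes holding four signed 32-bit integers,
    return 16 bytes holding their absolute values."""
    lanes = struct.unpack("=4i", packed)       # unpack four signed lanes
    # Keep only the low 32 bits of each result, as a 32-bit lane would.
    results = [abs(v) % 2**32 for v in lanes]
    return struct.pack("=4I", *results)        # repack as unsigned lanes

# Hypothetical sample data: four signed integers in one 128-bit value.
packed_input = struct.pack("=4i", -7, 12, -300000, 0)
packed_output = pabsd(packed_input)
print(struct.unpack("=4I", packed_output))     # (7, 12, 300000, 0)
```
</pre>
<p>The input and the output are both 16 bytes long, which mirrors the 128-bit input and output of the instruction.</p>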
<p>You can calculate the absolute values for four 32-bit signed integers with a single call to the PABSD instruction. If you have to calculate the absolute values for 1,000 32-bit signed integers, you can do it with 250 calls to this instruction instead of using one instruction for each 32-bit signed integer. You can achieve significant speedups. However, because it is necessary to pack the data before calling the SIMD instruction and then unpack the output, it is also important to measure the overhead that this additional code introduces.</p>
<p>If you have to calculate the absolute values for four 32-bit signed integers, the additional overhead will reduce the overall speedup. However, if you have to calculate the absolute values for 100 32-bit signed integers, you will usually benefit from the usage of this kind of SIMD instruction.</p>
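<p>To make the call count concrete, here is a minimal sketch in Python that processes 1,000 integers in groups of four and counts the groups. It only simulates the grouping: the function name abs_in_groups and the test data are assumptions for illustration, and each loop iteration stands in for one pack, PABSD call, and unpack.</p>
<pre>
```python
LANES = 4  # 32-bit integers per 128-bit register

def abs_in_groups(values, lanes=LANES):
    """Return (absolute values, number of simulated SIMD calls)."""
    output = []
    calls = 0
    for start in range(0, len(values), lanes):
        group = values[start:start + lanes]    # "pack" up to four integers
        output.extend(abs(v) for v in group)   # one simulated PABSD call
        calls += 1
    return output, calls

# Hypothetical input: 1,000 integers with alternating signs.
values = [(-1) ** i * i for i in range(1000)]
result, calls = abs_in_groups(values)
print(calls)  # 250
```
</pre>
<p>With only four integers, the loop body runs once, so the fixed cost of packing and unpacking is not spread over many calls. This is why the overhead matters most for small inputs.</p>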
<p>If you have to calculate the absolute values for 1,000 32-bit signed integers and you are running the code on a CPU with two physical cores that support the SSSE3 extended instruction set, you can run PABSD instructions in parallel to increase throughput. You can calculate these values with 125 calls to this instruction in each physical core and achieve a speedup through parallelism combined with the execution of SIMD instructions. The absolute value calculation would be as follows:</p>
<p><em>125 calls x 2 physical cores x 4 integers per PABSD instruction call = 1,000 32-bit signed integers<br />
</em></p>
<p>The next figure shows a diagram that represents the execution of the PABSD in parallel on two physical cores.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/SIMD_02.png"><img class="alignnone size-full wp-image-22106" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2010/12/SIMD_02.png" alt="Execution of the PABSD instruction in parallel on two physical cores" width="1149" height="372" /></a></p>
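<p>The partitioning shown in the figure can be sketched as follows. This is an illustration in Python under stated assumptions, not a performance example: the names abs_chunk, WORKERS and LANES are hypothetical, and because Python threads share a global interpreter lock, the code shows how the work is split into two halves of 125 simulated calls each rather than a real speedup.</p>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor

LANES = 4    # integers per simulated PABSD call
WORKERS = 2  # one worker per physical core in the example

def abs_chunk(chunk):
    """Process one worker's share; return (results, simulated calls)."""
    results = []
    calls = 0
    for start in range(0, len(chunk), LANES):
        results.extend(abs(v) for v in chunk[start:start + LANES])
        calls += 1
    return results, calls

# Hypothetical input: 1,000 signed integers, split into two halves.
values = list(range(-500, 500))
half = len(values) // WORKERS
chunks = [values[i * half:(i + 1) * half] for i in range(WORKERS)]

with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    partials = list(pool.map(abs_chunk, chunks))

calls_per_worker = [calls for _, calls in partials]
combined = [v for results, _ in partials for v in results]
print(calls_per_worker)  # [125, 125]
```
</pre>
<p>Because pool.map returns the partial results in the same order as the chunks, concatenating them reproduces the order of the original input.</p>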
<p>Each SIMD instruction can typically work with different packed data types. For example, the SSSE3 extended instruction set includes three assembly SIMD instructions that calculate packed absolute values for the following data types:</p>
<p>• <em>PABSB</em> — Calculates the absolute value for 16 signed bytes<br />
• <em>PABSW</em> — Calculates the absolute value for eight 16-bit signed integers (words)<br />
• <em>PABSD</em> — Calculates the absolute value for four 32-bit signed integers (double-words)</p>
<p>The aforementioned SIMD instructions can also calculate the absolute values for a smaller number of packed values when they operate on 64-bit MMX registers instead of 128-bit XMM registers. For example, PABSB can then calculate the absolute value for 8 signed bytes instead of 16.</p>
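<p>The element counts in the list above follow from dividing the register width by the element width. Here is a small sketch of that arithmetic in Python. The helper name lane_count is hypothetical, and the 64-bit column assumes that the smaller counts correspond to 64-bit operands.</p>
<pre>
```python
def lane_count(register_bits, element_bits):
    """Number of packed elements that fit in one register."""
    return register_bits // element_bits

# Element width in bits for each packed absolute value instruction.
ELEMENT_BITS = {"PABSB": 8, "PABSW": 16, "PABSD": 32}

for name, bits in ELEMENT_BITS.items():
    # Columns: instruction, elements in 128 bits, elements in 64 bits
    print(name, lane_count(128, bits), lane_count(64, bits))
```
</pre>
<p>For PABSB, this gives 16 elements for a 128-bit operand and 8 elements for a 64-bit operand.</p>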
<p>Many applications written in C and C++ take advantage of these instruction sets to work on vectors and matrices. They are very useful to improve performance in algorithms that need to perform multiple calculations on many data blocks. Most modern C and C++ compilers optimize loops to take advantage of SIMD instruction sets. They can auto-vectorize a loop when you follow certain guidelines for writing the loops that perform operations on arrays.</p>
<p>.NET Framework 4 does not provide direct support for SIMD or auto-vectorization. This means that your C# code cannot call SIMD instructions directly, and the C# compiler doesn’t provide an option to enable the usage of SIMD instructions when you perform operations on arrays. However, you can use libraries that are optimized to take advantage of the performance improvements offered by SIMD instructions. You can call the functions provided by these libraries, and you can combine them with the advantages of task-based programming. Intel develops two performance libraries that take advantage of both parallelism and SIMD instructions, targeting several application domains. These libraries are <a href="http://software.intel.com/en-us/articles/intel-mkl/">Intel Math Kernel Library</a> (MKL) and <a href="http://software.intel.com/en-us/articles/intel-ipp/">Intel Integrated Performance Primitives</a> (IPP).</p>
<p>Modern microprocessors can execute SIMD instructions. However, these instructions are part of different extended instruction sets. Because the need for greater computing performance continues to grow across industry segments, Intel has incorporated extended instruction sets in their new CPU models. At the time of this writing, the most advanced Intel CPU includes support for the following SIMD instruction sets:</p>
<p>• MMX — MultiMedia eXtensions<br />
• SSE — Streaming SIMD Extensions<br />
• SSE2 — Streaming SIMD Extensions 2<br />
• SSE3 — Streaming SIMD Extensions 3<br />
• SSSE3 — Supplemental Streaming SIMD Extensions 3<br />
• SSE4.1 — Streaming SIMD Extensions 4.1<br />
• SSE4.2 — Streaming SIMD Extensions 4.2<br />
• AES-NI — Advanced Encryption Standard New Instructions<br />
• AVX — Advanced Vector eXtensions</p>
<p>The previously mentioned MKL and IPP performance libraries detect the available extended instruction sets and optimize their execution according to the possibilities offered by the underlying hardware. Thus, if you run the same code on two similar dual-core microprocessors that support different extended instruction sets, you might achieve very different performance results. For example, if a CPU supports Advanced Vector eXtensions (AVX), it can perform certain operations on 256-bit packed types with a single instruction.</p>
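<p>The general idea of choosing a code path according to the detected instruction sets can be sketched as follows. This Python example is purely illustrative and does not describe how MKL or IPP work internally: the function pick_code_path, the table of register widths and the two sets of CPU features are all assumptions made for the example.</p>
<pre>
```python
# Register width in bits for a few of the instruction sets listed above.
REGISTER_BITS = {"MMX": 64, "SSE2": 128, "AVX": 256}

def pick_code_path(supported):
    """Return the widest instruction set in REGISTER_BITS that is supported,
    or None to fall back to scalar code."""
    candidates = [name for name in REGISTER_BITS if name in supported]
    if not candidates:
        return None
    return max(candidates, key=REGISTER_BITS.get)

# Two hypothetical CPUs that differ only in their newest extensions.
older_cpu = {"MMX", "SSE", "SSE2", "SSE3", "SSSE3"}
newer_cpu = older_cpu | {"SSE4.1", "SSE4.2", "AVX"}

for cpu in (older_cpu, newer_cpu):
    path = pick_code_path(cpu)
    print(path, REGISTER_BITS[path] // 32, "single-precision values per register")
```
</pre>
<p>In this sketch, the same call selects a 128-bit path on the first CPU and a 256-bit path on the second one, which is one reason why identical code can show different performance on similar microprocessors.</p>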
<p>Intel MKL is a library of highly optimized math routines for science, engineering, and financial applications. The math routines use multiple threads and SIMD instructions to achieve the best performance according to the underlying hardware. MKL 10.3 introduced a simple DLL named mkl_rt.dll, which you can call within your C# code.<br />
Intel IPP is a library of highly optimized math software functions for digital media and data-processing applications. The functions use multiple threads and SIMD instructions to achieve the best performance according to the underlying hardware. IPP includes multiple DLLs, and you can call IPP’s functions from your C# code.</p>
<p>You can combine the new task-based programming model introduced in .NET Framework 4 with high-performance libraries such as Intel Math Kernel Library and Intel Integrated Performance Primitives. In this post, I explained some of the benefits of using these libraries. In subsequent posts, I will show examples that combine calls to these libraries with task-based programming to leverage both parallelism and SIMD instructions in C#.</p>
<p>This post includes an excerpt from my book “<a href="http://www.wrox.com/WileyCDA/WroxTitle/Professional-Parallel-Programming-with-C-Master-Parallel-Extensions-with-NET-4.productCd-0470495995.html">Professional Parallel Programming with C#: Master Parallel Extensions with .NET 4</a>.” The book targets advanced C# developers who want to take full advantage of the new parallel programming features introduced by .NET Framework 4 and Visual Studio 2010. I’ve been using these highly optimized libraries for many years, and therefore, I’ve included a whole chapter that explains how to work with these libraries with C# and .NET Framework 4.</p>
<p>You can read “<a href="http://www.drdobbs.com/tools/227501006">Performance Library Basics</a>” by Lori Matassa and Max Domeika. In this article, Lori and Max provide an excellent explanation of the benefits of trusting complex algorithms to highly optimized performance libraries.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2010/12/15/intel-performance-libraries-allow-you-to-leverage-both-parallelism-and-simd-instructions-in-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tasks or Software Threads?</title>
		<link>http://software.intel.com/en-us/blogs/2010/02/12/tasks-or-software-threads/</link>
		<comments>http://software.intel.com/en-us/blogs/2010/02/12/tasks-or-software-threads/#comments</comments>
		<pubDate>Fri, 12 Feb 2010 21:33:59 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[multi-core]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[Task based programming]]></category>
		<category><![CDATA[Tasks]]></category>
		<category><![CDATA[What If]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2010/02/12/tasks-or-software-threads/</guid>
		<description><![CDATA[Most modern threading platforms are already offering task based programming models. Thus, they are allowing developers to follow one of the eight rules for multicore programming written by James Reinders a few years ago. I’m specifically talking about rule #3: "Program in tasks (chores), not threads (cores)." James suggests that you should leave the mapping [...]]]></description>
			<content:encoded><![CDATA[<p>Most modern threading platforms are already offering task based programming models. Thus, they are allowing developers to follow one of the <a href="http://www.drdobbs.com/architect/201804248">eight rules for multicore programming</a> written by James Reinders a few years ago. I’m specifically talking about rule #3: "Program in tasks (chores), not threads (cores)."</p>
<p>James suggests that you should leave the mapping of tasks to hardware threads as a distinctly separate operation in your code. When you create tasks using an efficient task based programming model, you can create as many as you need without worrying about oversubscription. Of course, you still have to pay attention to the introduced overheads. In fact, tasks also introduce an overhead and it is always important to measure speedups.</p>
<p>Tasks consume software threads using many different techniques to reduce the overhead needed to schedule work and they take advantage of the underlying hardware threads (logical cores). When you work with tasks, the code is easier to read than its pure thread version. One of the key advantages of tasks is that they usually require less overhead for their creation than threads. This way, some algorithms that are simple to implement using dozens of tasks reduce their overhead compared to their implementation using dozens of threads. Again, it is also important to consider that this doesn’t mean that you have to add tasks all the time. They have to be used in a smart way.</p>
<p>I’ve written a few posts about the new task based programming model in C# 4 with .NET 4. You can read about the specific implementation of tasks in Visual Studio 2010 in my post "<a href="http://www.drdobbs.com/go-parallel/blog/archives/2009/08/tasks_are_not_t.html">Tasks Are Not Threads</a>". I wrote it when Visual Studio 2010 was in Beta 1. Now, its <a href="http://software.intel.com/en-us/blogs/2010/02/08/msdn-subscriber-downloads-visual-studio-2010-release-candidate/">Release Candidate version</a> is available, but the concepts explained in that post are still valid.</p>
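<p>As a rough illustration of tasks being mapped onto a small number of software threads, here is a minimal sketch that uses the thread pool from the Python standard library. It is not the .NET 4 task model or any of the libraries mentioned in this post. The function name chore, the number of tasks (48) and the pool size (4) are arbitrary choices for the example.</p>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor
import threading

def chore(n):
    """A small unit of work; returns its result and the thread that ran it."""
    return n * n, threading.current_thread().name

# 48 tasks are submitted, but the pool maps them onto at most 4 software threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    outcomes = list(pool.map(chore, range(48)))

squares = [value for value, _ in outcomes]
threads_used = {name for _, name in outcomes}
print(len(squares), "tasks ran on", len(threads_used), "thread(s)")
```
</pre>
<p>The code that creates the work only describes the chores. The pool decides which thread runs each one, so the number of tasks can grow without creating one thread per task.</p>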
<p><a href="http://www.threadingbuildingblocks.org/">Intel® Threading Building Blocks (Intel® TBB)</a>, <a href="http://software.intel.com/en-us/articles/intel-cilk/">Intel® Cilk++</a>, <a href="http://openmp.org/wp/">OpenMP</a> and <a href="http://www.quickthreadprogramming.com/">QuickThread</a> include task based programming models. It is very important to learn their possibilities in order to express parallelism at a much finer granularity. Then, you can decide whether your algorithm would run better by using tasks or threads.</p>
<p>You can read the eight rules in the article published by James Reinders in Dr. Dobb’s: "<a href="http://www.drdobbs.com/architect/201804248">Rules for Parallel Programming for Multicore</a>".</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2010/02/12/tasks-or-software-threads/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Parallel Programming Possibilities with the Intel® Atom Family at 2010 Multicore Expo</title>
		<link>http://software.intel.com/en-us/blogs/2010/01/15/parallel-programming-possibilities-with-the-intel-atom-family-at-2010-multicore-expo/</link>
		<comments>http://software.intel.com/en-us/blogs/2010/01/15/parallel-programming-possibilities-with-the-intel-atom-family-at-2010-multicore-expo/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 23:26:10 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Intel® AppUp Developer Program]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Intel Atom]]></category>
		<category><![CDATA[Multicore Expo]]></category>
		<category><![CDATA[Parallel Programming & Multi-Core]]></category>
		<category><![CDATA[SIMD Programming]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2010/01/15/parallel-programming-possibilities-with-the-intel-atom-family-at-2010-multicore-expo/</guid>
		<description><![CDATA[Intel® Atom is one of the most attractive microprocessors. You can find an Intel® Atom in embedded systems, netbooks, MIDs (short for Mobile Internet Devices), tablet PCs, televisions and consumer electronic devices. I’ve been optimizing software to run with many different Intel® Atom models in many diverse devices and operating systems. Besides, I’ve worked hard [...]]]></description>
			<content:encoded><![CDATA[<p>Intel® Atom is one of the most attractive microprocessors. You can find an Intel® Atom in embedded systems, netbooks,  MIDs (short for Mobile Internet Devices), tablet PCs, televisions and consumer electronic devices. I’ve been optimizing software to run with many different Intel® Atom models in many diverse devices and operating systems. Besides, I’ve worked hard to downsize my parallel and multicore programming experience to take advantage of the exciting features found in these tiny microprocessors.</p>
<p>There are many tools, techniques and tips that make it possible to use your existing skills in the x86-family in order to take full advantage of the features offered by many of the Intel® Atom microprocessors. You can use these techniques to create faster and more responsive software. However, it is very important to understand the specific features offered by many of the different Intel® Atom models.</p>
<p>This year, I will make a presentation at the <a href="http://www.multicore-expo.com">5th Annual Multicore Expo (2010 Multicore Expo)</a>, April 26-29, Silicon Valley 2010, McEnery Convention Center, San Jose, California, USA.</p>
<p>The presentation’s title is “<a href="http://www.multicore-expo.com/common/session.php?expo_seq=10&amp;track_seq=127&amp;pres_seq=745">Parallel Programming Possibilities with the Intel Atom Family</a>” and it is scheduled on <a href="http://www.multicore-expo.com/common/agenda.php?expo_seq=10#day1">Day 1</a>, April 27th, 08:30-09:14 a.m., as part of the “Parallelization and Application Partitioning” track.</p>
<p>In this session, I will show real-life examples of the different possibilities offered by <a href="http://ark.intel.com/Product.aspx?id=36331">Atom N270</a> (2 logical cores, 32-bit) and <a href="http://ark.intel.com/Product.aspx?id=35641">Atom 330</a> (4 logical cores, 64-bit). Besides, I’ll also add the newest <a href="http://ark.intel.com/Product.aspx?id=43098">Atom D510</a> (4 logical cores, 64-bit).</p>
<p>I will explain Hyper-Threading’s capabilities in detail, with its advantages and its drawbacks. I will explain the usual power management system used with these microprocessors and the advantages of using parallel programming techniques to reduce both the microprocessor’s frequency and power consumption.</p>
<p>Besides, I will explain the SIMD instruction set found in these microprocessors and how to take advantage of it to maximize both parallelism and application performance.</p>
<p>The ideas shown in this session can be applied on any operating system.</p>
<p>I do believe year 2010 will be another exciting year for the Intel® Atom family. What do you think? I can’t wait for Multicore Expo 2010. If you’re interested in parallel programming, multicore and Intel® Atom, you can check the information about the session. Besides, comments and ideas about this topic are always welcome.</p>
<p>I’ve written a few posts related to this topic on Dr. Dobb’s Go Parallel and Dr. Dobb’s:</p>
<p>* <a href="http://www.ddj.com/go-parallel/blog/archives/2009/10/downsizing_mult.html">Downsizing Multicore Programming Skills To Take Advantage of Intel® Atom</a></p>
<p>* <a href="http://www.ddj.com/linux-open-source/220600088">Moblin 2.0 Is Multicore Ready</a></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2010/01/15/parallel-programming-possibilities-with-the-intel-atom-family-at-2010-multicore-expo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Installing Intel® Parallel Advisor Lite on Windows 7</title>
		<link>http://software.intel.com/en-us/blogs/2009/10/07/installing-intel-parallel-advisor-lite-on-windows-7/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/10/07/installing-intel-parallel-advisor-lite-on-windows-7/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 16:54:31 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Software Tools]]></category>
		<category><![CDATA[Intel Parallel Advisor]]></category>
		<category><![CDATA[Parallel Programming & Multi-Core]]></category>
		<category><![CDATA[What If]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/10/07/installing-intel-parallel-advisor-lite-on-windows-7/</guid>
		<description><![CDATA[Many Windows developers stayed on Windows XP instead of upgrading their developer workstations to Windows Vista. Windows Vista introduced some compatibility problems when installing certain applications. Now, Windows 7 is around the corner and many developer workstations are going to move to this new Windows version. Windows 7 has many improvements over Vista. Nonetheless, it [...]]]></description>
			<content:encoded><![CDATA[<p>Many Windows developers stayed on Windows XP instead of upgrading their developer workstations to Windows Vista. Windows Vista introduced some compatibility problems when installing certain applications. Now, Windows 7 is around the corner and many developer workstations are going to move to this new Windows version. Windows 7 has many improvements over Vista. Nonetheless, it is an improved Windows Vista. Therefore, you can face some incompatibilities whilst trying to install the necessary software for a professional development environment.</p>
<p>I’m working with Windows 7 RTM and I wanted to install <a href="http://software.intel.com/en-us/articles/intel-parallel-advisor-lite/">Intel® Parallel Advisor Lite</a>. Its latest version is a Windows Installer Package (.MSI), Advisor_Lite_update1_win.msi. The package has to be installed with Administrator rights. Therefore, if you just double click on the file, the installation is not going to work as expected.</p>
<p>As Windows 7 hasn’t been released yet, Intel® Parallel Advisor Lite doesn’t offer official support for this operating system. However, I’m already working with Windows 7 and I wanted to optimize some code for multicore microprocessors taking advantage of the new performance improvements offered by the new Windows 7 kernel.</p>
<p>There is a very simple way to successfully install Intel® Parallel Advisor Lite on Windows 7. Besides, the same steps work with Windows Vista:</p>
<p><strong>1.</strong> Run a Command Prompt as Administrator: Start Menu, All Programs, Accessories. Right-click on “Command Prompt” and select “Run as administrator” from the context menu that appears, as shown in the following picture.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/10/advisor_win7_01.png"><img class="alignnone size-medium wp-image-10472" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/10/advisor_win7_01-242x300.png" alt="" width="242" height="300" /></a></p>
<p><strong>2.</strong> A command line (terminal) window will appear displaying the title “Administrator: Command Prompt”. Go to the folder where you downloaded the Advisor_Lite_update1_win.msi file. Remember that you have to use the CD command. For example, if the file is in C:\downloads, you just have to type:</p>
<p>CD C:\downloads</p>
<p>Important note: As a developer, you should know the CD command. However, I wanted to keep things as simple as possible for those who don’t use command line commands in Windows.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/10/advisor_win7_02.png"><img class="alignnone size-medium wp-image-10473" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/10/advisor_win7_02-300x151.png" alt="" width="300" height="151" /></a></p>
<p><strong>3.</strong> Run msiexec with the /i option and the Windows Installer Package (.MSI) as a parameter:</p>
<p>msiexec /i Advisor_Lite_update1_win.msi</p>
<p>The installation will start because you’re running the Advisor_Lite_update1_win.msi package as an Administrator. You’ll be able to enjoy Intel® Parallel Advisor Lite. Follow the advice and tackle the multicore revolution!</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/10/advisor_win7_03.png"><img class="alignnone size-medium wp-image-10474" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/10/advisor_win7_03-300x198.png" alt="" width="300" height="198" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/10/07/installing-intel-parallel-advisor-lite-on-windows-7/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Changing partitions in Windows XP Home in order to install Moblin v2.0 Beta</title>
		<link>http://software.intel.com/en-us/blogs/2009/09/01/changing-partitions-in-windows-xp-home-in-order-to-install-moblin-v20-beta/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/09/01/changing-partitions-in-windows-xp-home-in-order-to-install-moblin-v20-beta/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 19:26:48 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Mobility]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[eeepc]]></category>
		<category><![CDATA[Hyper-Threading Technology]]></category>
		<category><![CDATA[Installation]]></category>
		<category><![CDATA[Intel® Atom™]]></category>
		<category><![CDATA[Moblin V2]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/09/01/changing-partitions-in-windows-xp-home-in-order-to-install-moblin-v20-beta/</guid>
		<description><![CDATA[A few weeks ago, I finished converting projects from Silverlight 3 Beta 1 to Silverlight 3 RTW. It took me more time than expected. Working with Beta versions is a difficult task. However, I’ve been working with Alpha and Beta versions for the last 20 years. I guess I cannot live without Betas. :) In [...]]]></description>
			<content:encoded><![CDATA[<p>A few weeks ago, I finished converting projects from Silverlight 3 Beta 1 to Silverlight 3 RTW. It took me more time than expected. Working with Beta versions is a difficult task. However, I’ve been working with Alpha and Beta versions for the last 20 years. I guess I cannot live without Betas. :)</p>
<p>In the last few months, I’ve begun working with <a href="http://moblin.org/">Moblin v2.0 beta</a> for Netbooks and Nettops. I am very interested in exploring this new platform and I’m working with parallel programming, taking advantage of Hyper-Threading technology and SSE instructions found in most Intel Atom microprocessors.</p>
<p>I consider myself an old Linux + Windows warrior. So far, I’ve installed hundreds of Linux distributions combined with hundreds of Windows versions. Therefore, I’m going to share some tips to install Moblin v2.0 beta in Netbooks.</p>
<p>Most ASUS EeePC netbooks have Windows XP Home Edition preinstalled. Besides, they have a very fast BIOS POST and hidden partitions in order to recover the operating system installation. What about installing Moblin v2.0 in an ASUS EeePC without killing the existing Windows XP Home Edition?</p>
<p><strong>DISCLAIMER: I’m trying to provide tips to help. However, do it at your own risk. Working with beta versions is not recommended for beginners. Back up all your data before following these steps.</strong></p>
<p>The first problem is the need to change partitions. You don’t want to delete any existing partition because the official operating system is Windows XP Home. I’m going to use an ASUS EeePC 1005HA Seashell as an example. You can create an empty partition in order to make it possible to install Moblin v2.0 beta following these steps:</p>
<p>1. Download and install <a href="http://www.partition-tool.com/personal.htm">EASEUS Partition Master Home Edition</a>. This software is a free edition (it runs on 32-bit Windows XP). There are also commercial editions. The free edition provides the necessary tools to change partitions in most netbooks running Windows XP Home Edition.</p>
<p>2. You will find the following partitions. Two of them have assigned letters (C: and D:).</p>
<div id="attachment_9415" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_1.png"><img class="size-medium wp-image-9415" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_1-300x175.png" alt="Existing partitions." width="300" height="175" /></a><p class="wp-caption-text">Existing partitions.</p></div>
<p>3. You can shrink the second NTFS partition (D:). Right-click on it and select Resize/Move from the context menu that appears. If you were working with this partition, it is very convenient to defragment its free space before trying to resize it.</p>
<div id="attachment_9416" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_2.png"><img class="size-medium wp-image-9416" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_2-300x175.png" alt="Context menu to shrink" width="300" height="175" /></a><p class="wp-caption-text">Context menu to shrink</p></div>
<p>4. Enter the desired space to use for Moblin v2.0 in “Unallocated Space After”. In this case, I’ve created an empty 30,960.4 MB partition.</p>
<div id="attachment_9417" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_3.png"><img class="size-medium wp-image-9417" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_3-300x224.png" alt="Desired space" width="300" height="224" /></a><p class="wp-caption-text">Desired space</p></div>
<p>5.	Click on OK.</p>
<p>6.	The new partition map will appear. The new partition will be shown with the Unallocated label (30.23 GB). If you apply the changes, it will lock drive D: and it will resize the existing NTFS partition.</p>
<div id="attachment_9418" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_4.png"><img class="size-medium wp-image-9418" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_4-300x175.png" alt="New partition map" width="300" height="175" /></a><p class="wp-caption-text">New partition map</p></div>
<p>7.	Check the new partition properties (click on the Properties button).</p>
<div id="attachment_9420" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_5.png"><img class="size-medium wp-image-9420" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_5-300x290.png" alt="Properties" width="300" height="290" /></a><p class="wp-caption-text">Properties</p></div>
<p>8.	Are you sure? Click on Apply.</p>
<div id="attachment_9421" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_6.png"><img class="size-medium wp-image-9421" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_6-300x141.png" alt="Are you sure?" width="300" height="141" /></a><p class="wp-caption-text">Are you sure?</p></div>
<p>9. Are you really sure? Did you back up your data? Click on Yes. You will see the progress.</p>
<div id="attachment_9422" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_7.png"><img class="size-medium wp-image-9422" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_7-300x215.png" alt="Progress" width="300" height="215" /></a><p class="wp-caption-text">Progress</p></div>
<p>10.	Now, you can check it using Disk Manager or running EASEUS Partition Manager Home Edition again. You will see the resized NTFS partition (D:) and the new Unallocated space (ready for Moblin v2.0).</p>
<div id="attachment_9423" class="wp-caption alignnone" style="width: 310px"><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_8.png"><img class="size-medium wp-image-9423" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/09/hillar_moblin_01_8-300x175.png" alt="A new partition ready for Moblin" width="300" height="175" /></a><p class="wp-caption-text">A new partition ready for Moblin</p></div>
<p>As you can see, it’s easy to create the necessary free space to install Moblin v2.0.<br />
Stay tuned. I’ll be adding detailed step-by-step information about installing and working with Moblin v2.0. Feedback and comments are always welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/09/01/changing-partitions-in-windows-xp-home-in-order-to-install-moblin-v20-beta/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using Intel Parallel Studio to teach the most difficult issues related to multi-core programming</title>
		<link>http://software.intel.com/en-us/blogs/2009/06/10/using-intel-parallel-studio-to-teach-the-most-difficult-issues-related-to-multi-core-programming/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/06/10/using-intel-parallel-studio-to-teach-the-most-difficult-issues-related-to-multi-core-programming/#comments</comments>
		<pubDate>Wed, 10 Jun 2009 17:34:23 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Intel Parallel Studio]]></category>
		<category><![CDATA[Multi-core programming]]></category>
		<category><![CDATA[parallel programming]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/06/10/using-intel-parallel-studio-to-teach-the-most-difficult-issues-related-to-multi-core-programming/</guid>
		<description><![CDATA[As my blogger profile says, I’m always researching new tools and technologies. Therefore, I couldn’t help downloading and testing Intel Parallel Studio’s Beta. I’m usually involved in projects using many different programming languages. I’m not a C++ guru. However, I often work with unmanaged C++ when I want the best performance for a parallelized [...]]]></description>
			<content:encoded><![CDATA[<p>As my blogger profile says, I’m always researching new tools and technologies. Therefore, I couldn’t help downloading and testing <a href="http://software.intel.com/en-us/intel-parallel-studio-home/">Intel Parallel Studio</a>’s Beta.<br />
I’m usually involved in projects using many different programming languages. I’m not a C++ guru. However, I often work with unmanaged C++ when I want the best performance for a parallelized algorithm.</p>
<p>As I’m used to working with managed code and garbage collectors (C#, Java and Groovy, among others), I usually face many difficulties when switching to unmanaged C++.<br />
When I finished installing Intel Parallel Studio and I saw the new toolbars integrated in Visual Studio 2008, I couldn’t help beginning to read the complete documentation for this exciting product.<br />
I really liked the capabilities offered by Intel Parallel Studio. It helped me to work faster in order to parallelize existing serial code. It also helped me to find the most difficult bugs and to profile my applications. I do believe it is a unique tool. You’ll find a lot of information about Intel Parallel Studio on its main Website: <a href="http://software.intel.com/en-us/intel-parallel-studio-home/">http://software.intel.com/en-us/intel-parallel-studio-home/</a></p>
<p>A well-known proverb says “A good workman is known by his tools”. Last week, I had to begin teaching the most difficult parts of a small multi-core programming course.<br />
I was talking about C# multi-core programming, using .Net 3.5. I’ll have a final session talking about future .Net 4.0 parallel extensions and the new features offered by Visual Studio 2010.<br />
I realized that Intel Parallel Studio’s evaluation version could help me to teach the most difficult topics and to emphasize the performance improvements offered by parallel programming.</p>
<p>So far, Intel Parallel Studio does not work with C#. However, it offers amazing tools that make it easy to understand difficult topics, like:</p>
<p>• Scalability problems.<br />
• Race conditions.<br />
• Deadlocks.<br />
• New bottlenecks.<br />
• New debugging techniques.<br />
• New parallel bugs.<br />
• New tuning challenges.</p>
<p>The results were really satisfying. Intel Parallel Studio helped me to teach difficult topics using an innovative approach. The feedback was really amazing. Developers attending the course understood these issues. They are going to work with C#, but they learned the necessary concepts to understand the big problems. However, they are going to use different tools to detect and to solve them.<br />
I’d love to see Intel Parallel Studio for C# (.Net) and for Java. However, I’ll go on taking advantage of its features to teach the multi-core programming issues that are hardest to learn.</p>
<p>Intel Parallel Studio has been recently launched (version 1.0). If you're interested in learning multi-core programming and the challenges associated with the parallelism age, you'll find it really useful to test it. If you don't work with C++, it doesn't matter because you'll still be able to learn really important techniques. Then, you'll be able to use these techniques in your programming language.</p>
<p>Besides, I encourage you to ask Intel to add new languages to this exciting tool. Don't leave me alone.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/06/10/using-intel-parallel-studio-to-teach-the-most-difficult-issues-related-to-multi-core-programming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Silverlight 3 Beta 1 Multi-core programming possibilities using C#</title>
		<link>http://software.intel.com/en-us/blogs/2009/05/21/silverlight-3-beta-1-multi-core-programming-possibilities-using-c/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/05/21/silverlight-3-beta-1-multi-core-programming-possibilities-using-c/#comments</comments>
		<pubDate>Thu, 21 May 2009 21:57:25 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[CSharp]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[RIA]]></category>
		<category><![CDATA[Silverlight 3]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/05/21/silverlight-3-beta-1-multi-core-programming-possibilities-using-c/</guid>
		<description><![CDATA[In my previous post “Use lambda expressions in C# to simplify the parallelized code II”, Eduardo Fernandez added a comment asking me whether Silverlight 3 Beta 1 had support for multithreading. I’m adding this post to let you know the possibilities offered by Silverlight 3 Beta 1 to create RIAs that take advantage of multi-core [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous post “<a href="http://software.intel.com/en-us/blogs/2009/05/19/use-lambda-expressions-in-c-to-simplify-the-parallelized-code-ii/">Use lambda expressions in C# to simplify the parallelized code II</a>”, Eduardo Fernandez added a comment asking me whether Silverlight 3 Beta 1 had support for multithreading. I’m adding this post to let you know the possibilities offered by Silverlight 3 Beta 1 to create RIAs that take advantage of multi-core microprocessors.</p>
<p><strong>Disclaimer: I’m talking about a Beta 1 release. Therefore, the features provided by the final release of Silverlight 3 could differ from the ones I describe here for Beta 1. I’m focusing on C#. However, you can also use Visual Basic and other languages like F# to program Silverlight apps.</strong></p>
<p>You can control the elements defined in XAML using C# 3.0 and a subset of .Net 3.5. Hence, you can use lambda expressions, as explained in my previous post. They are really useful to simplify parallelized code.</p>
<p>You have access to <code>BackgroundWorker</code> (<code>System.ComponentModel.BackgroundWorker</code>). Thus, you can use the <code>BackgroundWorker</code> component to run code in a background thread without the complexity of the <code>Thread</code> class and/or <code>ThreadPool</code>. The <code>BackgroundWorker</code> component gives you instant access to a background thread with a component-based scheme. Hence, it is simpler for beginners to start working with it. However, you must be very careful to understand what you’re doing when using it. Running code in additional threads lets your application take advantage of multiple cores.</p>
<p>You have access to the <code>Thread</code> class (<code>System.Threading.Thread</code>) in order to run many concurrent threads with more control than the <code>BackgroundWorker</code> component gives you. The <code>Thread</code> class offers many fine-tuning capabilities.</p>
<p>You have access to the <code>ThreadPool</code> class (<code>System.Threading.ThreadPool</code>) in order to work with pools of threads. You can queue work items using the <code>ThreadPool</code> class and you can also use its fine-tuning capabilities.</p>
<p>Thus, you have access to these three main elements:<br />
* BackgroundWorker<br />
* Thread class<br />
* ThreadPool class</p>
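<p>As a rough illustration of these three elements, here is a minimal sketch that pushes one small unit of work through each of them. It is written as a plain console program rather than a Silverlight page, and the <code>ThreadingTrio</code> class and its <code>RunAll</code> method are names made up for this example.</p>

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class ThreadingTrio
{
    // Runs one unit of work through each mechanism and returns
    // how many of them completed.
    public static int RunAll()
    {
        int completed = 0;

        // 1. Thread: a dedicated thread that we start and join ourselves.
        Thread thread = new Thread(() => Interlocked.Increment(ref completed));
        thread.Start();
        thread.Join();

        // 2. ThreadPool: queue a work item and wait until it signals.
        using (ManualResetEvent poolDone = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(state =>
            {
                Interlocked.Increment(ref completed);
                poolDone.Set();
            });
            poolDone.WaitOne();
        }

        // 3. BackgroundWorker: component-based, driven by events.
        using (ManualResetEvent workerDone = new ManualResetEvent(false))
        using (BackgroundWorker worker = new BackgroundWorker())
        {
            worker.DoWork += (sender, e) => Interlocked.Increment(ref completed);
            worker.RunWorkerCompleted += (sender, e) => workerDone.Set();
            worker.RunWorkerAsync();
            workerDone.WaitOne();
        }

        return completed;
    }

    public static void Main()
    {
        Console.WriteLine("Completed work items: " + RunAll());
    }
}
```

<p>In a real Silverlight page you would not block the UI thread waiting for the work to finish; the waits are here only to keep the console sketch deterministic.</p>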
<p>Everything that you learned working with multithreading and multi-core programming using C# 2.0 and C# 3.0 will be really useful when working with Silverlight 3 Beta 1.</p>
<p>So far, Silverlight 3 Beta 1 does not support the Task Parallel Library or the Parallel Extensions. Hence, I guess we’ll have to wait for Silverlight 4. Nevertheless, C# 3.0 and Silverlight 3 Beta 1 offer great opportunities for developers to tackle the multi-core revolution with exciting RIAs.</p>
<p>You can find my thoughts on the opportunities related to RIAs and parallelism in my post on Go Parallel: <a href="http://www.ddj.com/go-parallel/blog/archives/2009/05/designing_rich.html">http://www.ddj.com/go-parallel/blog/archives/2009/05/designing_rich.html</a></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/05/21/silverlight-3-beta-1-multi-core-programming-possibilities-using-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Use lambda expressions in C# to simplify the parallelized code II</title>
		<link>http://software.intel.com/en-us/blogs/2009/05/19/use-lambda-expressions-in-c-to-simplify-the-parallelized-code-ii/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/05/19/use-lambda-expressions-in-c-to-simplify-the-parallelized-code-ii/#comments</comments>
		<pubDate>Tue, 19 May 2009 17:55:41 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[CSharp]]></category>
		<category><![CDATA[lambda expressions]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[Silverlight]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/05/19/use-lambda-expressions-in-c-to-simplify-the-parallelized-code-ii/</guid>
		<description><![CDATA[In my previous post “Use lambda expressions in C# to simplify the parallelized code” I began talking about the advantages of using lambda expressions in parallelized code. I used a very simple example. Now, I’m going to use a more complex example to convince you that lambda expressions and parallelized code are good friends. In [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous post “<a href="http://software.intel.com/en-us/blogs/2009/05/13/use-lambda-expressions-in-c-to-simplify-the-parallelized-code/">Use lambda expressions in C# to simplify the parallelized code</a>” I began talking about the advantages of using lambda expressions in parallelized code.</p>
<p>I used a very simple example. Now, I’m going to use a more complex example to convince you that lambda expressions and parallelized code are good friends. In this post, I’m going to forget about the Task Parallel Library Beta 1 because I’m going to use a C# example for Silverlight.</p>
<p>You have a Silverlight application showing a simple Button through XAML code, btnStartGame. If you want to change its properties from a non-UI thread, you have to queue commands in the UI thread. You cannot change controls’ properties from other threads.</p>
<p>If you have to create a new thread, start it and invoke a delegate to update the UI inside it, the code could be as complex as the following lines:</p>
<p><code>// The event handler that starts the thread<br />
private void btnStartGame_Click(object sender, RoutedEventArgs e)<br />
{<br />
Thread thread = new Thread(new ThreadStart(InvokeThread));<br />
thread.Start();<br />
}</code></p>
<p><code>// The code that will run in a new thread<br />
void InvokeThread()<br />
{<br />
btnStartGame.Dispatcher.BeginInvoke(UpdateContent);<br />
}</code></p>
<p><code>// The code that will run in the UI thread to update the text shown in the Button<br />
void UpdateContent()<br />
{<br />
btnStartGame.Content = "New title.";<br />
}</code></p>
<p>The code is indeed complex. It contains too many lines and three methods.</p>
<p>Now, let’s take a look at the same code, but using lambda expressions:</p>
<p><code>private void btnStartGame_Click(object sender, RoutedEventArgs e)<br />
{<br />
new Thread(() =&gt;<br />
{<br />
btnStartGame.Dispatcher.BeginInvoke(() =&gt; btnStartGame.Content = "New title.");<br />
}).Start();<br />
}</code></p>
<p>The code is indeed simpler. It takes just a few lines and they all fit in the event handler. Lambda expressions let you replace many small helper methods. Hence, lambda expressions in C# are good friends for task-based programming.</p>
<p>Just a note, you have to add <code>using System.Threading;</code> in order to make the code run in a Silverlight 2 or Silverlight 3 Beta 1 solution.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/05/19/use-lambda-expressions-in-c-to-simplify-the-parallelized-code-ii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Use lambda expressions in C# to simplify the parallelized code</title>
		<link>http://software.intel.com/en-us/blogs/2009/05/13/use-lambda-expressions-in-c-to-simplify-the-parallelized-code/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/05/13/use-lambda-expressions-in-c-to-simplify-the-parallelized-code/#comments</comments>
		<pubDate>Wed, 13 May 2009 18:57:49 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[.Net 4.0]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[CSharp]]></category>
		<category><![CDATA[multi-core]]></category>
		<category><![CDATA[parallel extensions]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[Task Parallel Library]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/05/13/use-lambda-expressions-in-c-to-simplify-the-parallelized-code/</guid>
		<description><![CDATA[So, you want to start using Task Parallel Library Beta 1. You want to take advantage of the new features that will be available in .Net 4.0. Hold on! Are you familiar with lambda expressions? If you aren’t using lambda expressions in your current C# programs, you should begin learning about them before jumping into [...]]]></description>
			<content:encoded><![CDATA[<p>So, you want to start using Task Parallel Library Beta 1. You want to take advantage of the new features that will be available in .Net 4.0. Hold on! Are you familiar with lambda expressions?</p>
<p>If you aren’t using lambda expressions in your current C# programs, you should begin learning about them before jumping into the new features. C# 3.0 introduced lambda expressions and they are very useful to simplify the parallelized code. They will help you to create code that’s easier to read, understand and maintain.</p>
<p>To keep things simple, a lambda expression is an anonymous function that can contain expressions and statements. It can be used to create delegates or expression tree types.</p>
<p>They are really useful to simplify the code when we use delegates. Parallelized code uses many delegates. Hence, lambda expressions allow you to simplify the code, reduce the complexity and understand what you’re doing. Parallelized code is more complex than serial code. Lambda expressions will help you to reduce this additional complexity.</p>
<p>All lambda expressions use the lambda operator <code>=&gt;</code>. It is read as “goes to”.</p>
<p>For example, the following lines of code use the classic C# 2.0 syntax but they work with Parallel.ForEach (introduced in the Task Parallel Library):</p>
<p><code>System.Threading.Tasks.Task.Create(delegate(object myObject) {<br />
try<br />
{<br />
Parallel.ForEach(myMeshes, delegate(Mesh myNextMesh)<br />
{</code></p>
<p>Now, let’s take a look at the same code, but using lambda expressions:</p>
<p><code>System.Threading.Tasks.Task.Create(myObject =&gt;<br />
{<br />
try<br />
{<br />
Parallel.ForEach(myMeshes, myNextMesh =&gt;<br />
{</code></p>
<p>The code is simpler, easier to read, understand and maintain.</p>
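<p>The fragments above stop before the loop body. As a rough, self-contained sketch of the same pattern, the following version starts a task that runs a parallel loop and then waits for it. It uses <code>Task.Factory.StartNew</code>, which is the entry point in the released .Net 4.0, instead of the <code>Task.Create</code> call from the preview bits. The <code>Mesh</code> class, its <code>Vertices</code> field and the <code>CountVertices</code> method are hypothetical stand-ins invented for this example.</p>

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class MeshTaskDemo
{
    // Hypothetical stand-in for the Mesh type mentioned in the post.
    public class Mesh
    {
        public int Vertices;
    }

    // Starts a task that runs a parallel loop over the meshes, waits for it
    // and returns the total number of vertices.
    public static int CountVertices(Mesh[] myMeshes)
    {
        int total = 0;
        Task task = Task.Factory.StartNew(() =>
        {
            Parallel.ForEach(myMeshes, myNextMesh =>
            {
                // Many iterations update the same variable, so use Interlocked.
                Interlocked.Add(ref total, myNextMesh.Vertices);
            });
        });

        // The calling thread could do other work here before waiting.
        task.Wait();
        return total;
    }

    public static void Main()
    {
        Mesh[] myMeshes =
        {
            new Mesh { Vertices = 8 },
            new Mesh { Vertices = 12 },
            new Mesh { Vertices = 20 }
        };
        Console.WriteLine("Total vertices: " + CountVertices(myMeshes));
    }
}
```

<p>Both lambdas capture the local variables they need (<code>myMeshes</code> and <code>total</code>), which is part of what keeps the code short compared with named methods.</p>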
<p>Disclaimer: In the aforementioned lines of code, I’m using “System.Threading.Tasks.Task” because Task is new in the Task Parallel Library. A using statement could simplify the code even further.</p>
<p>The code creates a new task to run a parallelized loop asynchronously. It is only an example of an advanced option. <strong>It is recommended for developers who understand what they are doing. I’m not recommending this practice for all parallel loops.</strong> Some applications don’t need this kind of configuration.</p>
<p>Take into account that most developers will be using lambda expressions for parallelized code. By the way, most developers working with parallelized code using current features found in C# 3.0 and .Net 3.5 are also using lambda expressions.</p>
<p>Therefore, if you want to be able to understand parallelized code running in tasks and threads, you should begin working with lambda expressions.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/05/13/use-lambda-expressions-in-c-to-simplify-the-parallelized-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Specifying the desired degree of parallelism in .Net 4.0 TPL Beta 1</title>
		<link>http://software.intel.com/en-us/blogs/2009/05/06/specifying-the-desired-degree-of-parallelism-in-net-40-tpl-beta-1/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/05/06/specifying-the-desired-degree-of-parallelism-in-net-40-tpl-beta-1/#comments</comments>
		<pubDate>Wed, 06 May 2009 15:20:59 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[.Net 4.0]]></category>
		<category><![CDATA[C# 4.0]]></category>
		<category><![CDATA[CSharp]]></category>
		<category><![CDATA[parallel extensions]]></category>
		<category><![CDATA[Task Parallel Library]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/05/06/specifying-the-desired-degree-of-parallelism-in-net-40-tpl-beta-1/</guid>
		<description><![CDATA[Sometimes, you don’t want to use all the available cores in a parallel loop. Why? Because you have better plans for the remaining available cores. Thus, you want to specify the concurrency level of a parallel loop. Luckily, Task Parallel Library Beta 1 will allow you to do this using the new ParallelOptions class. This [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes, you don’t want to use all the available cores in a parallel loop. Why? Because you have better plans for the remaining available cores. Thus, you want to specify the concurrency level of a parallel loop. Luckily, Task Parallel Library Beta 1 will allow you to do this using the new <code>ParallelOptions</code> class.</p>
<p><strong>This is an advanced option.</strong> It is recommended for developers who understand what they are doing.<strong> I’m not recommending this practice for all parallel loops.</strong> Some applications don’t need this kind of configuration.</p>
<p>The previous <code>TaskManager</code> instance was not successful in some cases. Therefore, the TPL team added the new <code>ParallelOptions</code> class, which adds many interesting properties.</p>
<p>First, you have to create a new instance of <code>ParallelOptions</code>. Then, in order to limit the concurrency level of parallel loops, you just have to assign the desired number of logical cores to the <code>MaxDegreeOfParallelism</code> property. This property has a default value of -1, which means attempting to use all the available logical cores.</p>
<p>The following loop will use all the available logical cores:<br />
<code>var myOptions = new ParallelOptions { MaxDegreeOfParallelism = -1 };<br />
Parallel.For(0, 60000, myOptions, i=&gt;<br />
{<br />
// Code to run<br />
});</code></p>
<p>The following loop will use no more than two logical cores, no matter the total number of available logical cores:<br />
<code>var myOptions = new ParallelOptions { MaxDegreeOfParallelism = 2 };<br />
Parallel.For(0, 60000, myOptions, i=&gt;<br />
{<br />
// Code to run<br />
});</code></p>
<p>If you know the number of logical cores, you can use a relative value. For example, if myNumberOfCores stores the number of logical cores, the following loop will leave one core free:<br />
<code>var myOptions = new ParallelOptions { MaxDegreeOfParallelism = myNumberOfCores - 1 };<br />
Parallel.For(0, 60000, myOptions, i=&gt;<br />
{<br />
// Code to run<br />
});</code></p>
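<p>To see the limit in action, here is a small sketch that counts how many iterations run at the same time and reports the highest value observed. The <code>DegreeDemo</code> class and its <code>ObservePeak</code> method are names invented for this example. One detail worth noting: <code>MaxDegreeOfParallelism</code> rejects 0, so a relative value such as "all cores minus one" needs a lower bound of 1 to stay valid on a single-core machine.</p>

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DegreeDemo
{
    // Runs a parallel loop with the given limit and returns the highest
    // number of iterations that were observed running at the same time.
    public static int ObservePeak(int maxDegree)
    {
        int running = 0;
        int peak = 0;
        object gate = new object();
        var myOptions = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.For(0, 200, myOptions, i =>
        {
            int now = Interlocked.Increment(ref running);
            lock (gate)
            {
                if (now > peak) peak = now;
            }
            Thread.SpinWait(20000); // simulate a little work
            Interlocked.Decrement(ref running);
        });

        return peak;
    }

    public static void Main()
    {
        // Leave one logical core free, but never go below 1.
        int limit = Math.Max(1, Environment.ProcessorCount - 1);
        Console.WriteLine("Limit: " + limit + ", observed peak: " + ObservePeak(limit));
    }
}
```

<p>The observed peak can be lower than the limit, because the property sets an upper bound and the scheduler decides how many workers it really uses.</p>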
<p>Of course, you must take into account that the additional threads created at run-time and the operating system scheduler will influence the final results.</p>
<p>If you have 8 logical cores, you can run two parallel loops asynchronously, each using 4 cores. It depends on your needs.</p>
<p>Again, this is an advanced topic. You must know what you’re doing when using these scheduling options. However, it’s good news that the old <code>TaskManager</code> introduced in previous TPL CTPs (Community Technology Previews), which was pretty inaccurate, can be replaced by the new <code>ParallelOptions</code>.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/05/06/specifying-the-desired-degree-of-parallelism-in-net-40-tpl-beta-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Counting cores in .Net and Java</title>
		<link>http://software.intel.com/en-us/blogs/2009/05/04/counting-cores-in-net-and-java/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/05/04/counting-cores-in-net-and-java/#comments</comments>
		<pubDate>Mon, 04 May 2009 15:04:23 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[.Net 4.0]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Java 7]]></category>
		<category><![CDATA[parallel extensions]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/05/04/counting-cores-in-net-and-java/</guid>
		<description><![CDATA[As C# and Visual Basic (in the .Net world) and Java are high-level programming languages, most developers were not used to checking hardware information. With multicore microprocessors and a task-oriented programming model, trying to take full advantage of parallel processing capabilities offered by modern microprocessors, this is changing. I'm preparing a sequence [...]]]></description>
			<content:encoded><![CDATA[<p>As C# and Visual Basic (in the .Net world) and Java are high-level programming languages, most developers were not used to checking hardware information. With multicore microprocessors and a task-oriented programming model that tries to take full advantage of the parallel processing capabilities offered by modern microprocessors, this is changing.</p>
<p>I'm preparing a sequence of posts related to fine-grained parallelism with .Net Parallel Extensions. However, as I've received this question dozens of times in my e-mail, I thought it was a good idea to post the answer for these three programming languages.</p>
<p>Why do you need to know the number of logical cores? Because taking this information into account, you can decide at run-time, the number of parallel tasks that you're going to run concurrently.<br />
If you discover that you have four cores available, you can run four tasks concurrently. This changes if you run the same application on a dual-core microprocessor.</p>
<p>And what about eight logical cores, like the ones offered by a Core i7, a quad-core microprocessor that exposes eight logical cores via Intel's Hyper-Threading Technology?</p>
<p>.Net 4.0 with Parallel Extensions will offer many excellent features to decide the number of logical cores to dedicate to each parallelized task. Java 7 will also offer similar features with its fork-join framework. However, you must begin with the simplest task: counting the number of logical cores.<br />
Just a single line of code for each programming language. It is very easy.</p>
<p>C#:<br />
<code>int logicalCores = Environment.ProcessorCount;</code></p>
<p>Visual Basic:<br />
<code>Dim logicalCores As Integer = Environment.ProcessorCount</code></p>
<p>Java:<br />
<code>int logicalCores = Runtime.getRuntime().availableProcessors();</code></p>
<p>These lines return the number of logical cores on Windows and in other operating systems.</p>
<p>For example, if you run these lines on a computer with a quad-core Core i7 supporting Hyper-Threading, they will return 8.<br />
If you run these lines on a computer with a quad-core Q6700, they will return 4.</p>
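<p>To make the "decide at run-time" idea concrete, here is a minimal C# sketch that reads the number of logical cores and starts that many threads to split a simple sum. The <code>CoreCountDemo</code> class and its <code>ParallelSum</code> method are hypothetical names used only for this illustration.</p>

```csharp
using System;
using System.Threading;

public static class CoreCountDemo
{
    // Sums the values 1..n by splitting the work across workerCount threads.
    public static long ParallelSum(int n, int workerCount)
    {
        long[] partial = new long[workerCount];
        Thread[] workers = new Thread[workerCount];

        for (int w = 0; w < workerCount; w++)
        {
            int index = w; // private copy for the lambda
            workers[w] = new Thread(() =>
            {
                long sum = 0;
                // Each worker takes every workerCount-th value.
                for (int v = index + 1; v <= n; v += workerCount) sum += v;
                partial[index] = sum;
            });
            workers[w].Start();
        }

        long total = 0;
        for (int w = 0; w < workerCount; w++)
        {
            workers[w].Join();
            total += partial[w];
        }
        return total;
    }

    public static void Main()
    {
        int logicalCores = Environment.ProcessorCount;
        Console.WriteLine("Logical cores: " + logicalCores);
        Console.WriteLine("Sum 1..1000000 = " + ParallelSum(1000000, logicalCores));
    }
}
```

<p>The same program starts four workers on a quad-core machine without Hyper-Threading and eight on a quad-core Core i7, with no change to the code.</p>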
<p>I know it's a very simple post. However, you have to know these if you are creating applications in these programming languages.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/05/04/counting-cores-in-net-and-java/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Invoking parallel tasks</title>
		<link>http://software.intel.com/en-us/blogs/2009/04/28/invoking-parallel-tasks/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/04/28/invoking-parallel-tasks/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 15:50:10 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[.Net 4.0]]></category>
		<category><![CDATA[C# 4.0]]></category>
		<category><![CDATA[CSharp]]></category>
		<category><![CDATA[parallel extensions]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[Tasks]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/04/28/invoking-parallel-tasks/</guid>
		<description><![CDATA[In a recent post, Robert Chesebrough (Intel) talked about less focus on threads and more focus on tasks. I agree with him. I do believe that decomposing the job to be done into many tasks is the key to a successfully parallelized algorithm. Once you have the most important tasks, you can re-design the algorithm [...]]]></description>
			<content:encoded><![CDATA[<p>In a recent post, Robert Chesebrough (Intel) talked about <a href="http://software.intel.com/en-us/blogs/2009/04/24/less-focus-on-threads-more-focus-on-tasks/">less focus on threads and more focus on tasks</a>. I agree with him. I do believe that decomposing the job to be done into many tasks is the key to a successfully parallelized algorithm.</p>
<p>Once you have the most important tasks, you can re-design the algorithm taking into account that you must exploit parallel architectures. Of course, you must understand how threads work and how modern multi-core microprocessors work. Then, you can use tools like a Gantt chart (yes, a Gantt chart) to find the critical sections (those areas where parallelization is extremely difficult or nearly impossible).</p>
<p>This time, I’m going to show a simple example taking into account one of the new features offered by the future .Net 4.0 Parallel Extensions. If you have an application that must perform many tasks and you discover that you have more than 2 cores available, it would be very convenient to launch them in parallel.</p>
<p>However, what is the simplest way to launch many completely independent tasks in parallel? In C# 4.0, you’ll be able to use the new <code>Parallel.Invoke</code> method combined with lambda expressions:</p>
<p><code>Parallel.Invoke(<br />
() =&gt; ConvertMeshes(myMeshes),<br />
() =&gt; ConvertMaterials(myMaterials),<br />
() =&gt; ConvertLights(myLights),<br />
() =&gt; ConvertCameras(myCameras)<br />
);</code></p>
<p>Taking into account the default options (you can change many of them according to your needs), if you run this code on a quad-core CPU with four logical cores, this is what will typically happen (the scheduler makes the final decision):</p>
<p><code>ConvertMeshes(myMeshes)</code> will run in Core #0.<br />
<code>ConvertMaterials(myMaterials)</code> will run in Core #1.<br />
<code>ConvertLights(myLights)</code> will run in Core #2.<br />
<code>ConvertCameras(myCameras)</code> will run in Core #3.</p>
<p>The current thread will be blocked until the four methods return from their execution in parallel. However, you could also run these four tasks concurrently and asynchronously. I’m trying to keep things simple.</p>
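<p>Here is a runnable sketch of the same call, assuming a simple stand-in for the four conversion methods. The <code>InvokeDemo</code> class and its <code>ConvertItem</code> and <code>RunAll</code> methods are hypothetical. Each action prints the managed thread it ran on, which is a reminder that the runtime, not your code, picks the threads and cores.</p>

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InvokeDemo
{
    static int finished = 0;

    // Hypothetical stand-in for ConvertMeshes, ConvertMaterials and so on.
    static void ConvertItem(string what)
    {
        Console.WriteLine(what + " converted on thread " +
            Thread.CurrentThread.ManagedThreadId);
        Interlocked.Increment(ref finished);
    }

    // Launches four independent actions and returns how many had finished
    // by the time Parallel.Invoke returned.
    public static int RunAll()
    {
        finished = 0;
        Parallel.Invoke(
            () => ConvertItem("Meshes"),
            () => ConvertItem("Materials"),
            () => ConvertItem("Lights"),
            () => ConvertItem("Cameras")
        );
        // Parallel.Invoke returns only after all four actions have completed.
        return finished;
    }

    public static void Main()
    {
        Console.WriteLine("Finished actions: " + RunAll());
    }
}
```

<p>The order of the four "converted" lines can change from run to run, but the final count is always 4 because the call blocks until every action has returned.</p>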
<p>Is the code above easy to understand? Yes, it is. It is also easy to maintain and easy to optimize.</p>
<p>If you want to take full advantage of the future Parallel Extensions in .Net 4.0, it is highly recommended to learn about lambda expressions. They were introduced in C# 3.0 to shorten the code and to make it more functional.<br />
Combining lambda expressions with the new task oriented approach that is going to be introduced in Parallel Extensions in .Net 4.0, exploiting multi-core microprocessors will be indeed easier for most C# developers.</p>
<p>The aforementioned example is very simple. I tried to keep things simple; that was the idea of this post.</p>
<p>However, please, do not forget to learn about threads. You’ll need that knowledge in the new parallel age.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/04/28/invoking-parallel-tasks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Parallel Extensions offer backward compatibility</title>
		<link>http://software.intel.com/en-us/blogs/2009/04/22/parallel-extensions-offer-backward-compatibility/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/04/22/parallel-extensions-offer-backward-compatibility/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 20:34:15 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[.Net 4.0]]></category>
		<category><![CDATA[parallel extensions]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/04/22/parallel-extensions-offer-backward-compatibility/</guid>
		<description><![CDATA[Many developers are working with the features offered by C# 3.0 and .Net 3.5 to exploit multi-core CPUs. Parallel Extensions, which will be part of .Net 4.0 in Visual Studio 2010, are entering Beta 1. Luckily, they will offer backward compatibility. Parallel Extensions will offer a lot of interesting features related to multi-core programming. So, [...]]]></description>
			<content:encoded><![CDATA[<p>Many developers are working with the features offered by C# 3.0 and .Net 3.5 to exploit multi-core CPUs. Parallel Extensions, which will be part of .Net 4.0 in Visual Studio 2010, are entering Beta 1. Luckily, they will offer backward compatibility.</p>
<p>Parallel Extensions will offer a lot of interesting features related to multi-core programming. So, developers face many questions:</p>
<p>• Is .Net 4.0 going to be backward compatible with my existing threading code? Luckily, the answer is yes. You will be able to keep your existing code running while taking advantage of the new features. Besides, it should run faster because .Net 4.0 improves many of the features already offered by .Net 3.5.</p>
<p>• Will I be able to take advantage of my training in threaded and concurrent programming? Another yes. Parallel Extensions offer new features that simplify many multi-core and concurrent programming tasks. Your existing knowledge will be very useful when you begin working with these shortcuts. You will find that many tasks become much simpler. However, you have to understand what you’re doing in order to avoid problems.</p>
<p>• Can I combine previous threading mechanisms with Parallel Extensions? Yes. You have to be careful with some issues. However, if you understand what you’re doing, it will work fine. For example, you can use a ThreadPool combined with PLINQ. If you use the right parameters to specify the degree of parallelism for the ThreadPool and PLINQ, you will be able to exploit the right number of cores.</p>
<p>• Since they offer high-level concurrency structures, can I forget about the hardware? No, no and no. Understanding multi-core hardware is the key to a successful parallelized application. A paradigm shift is needed.</p>
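<p>The ThreadPool and PLINQ combination mentioned in the list above could look roughly like the following sketch. It assumes the API names as shipped in .NET Framework 4 (<code>AsParallel</code> and <code>WithDegreeOfParallelism</code>), which may differ from the preview builds. The <code>SumOfSquares</code> helper and the chosen degree of 2 are illustrative only.</p>
<pre><code>using System;
using System.Linq;
using System.Threading;

static class PoolWithPlinq
{
    // Sums the squares of 1..count with a PLINQ query that uses
    // at most 'degree' concurrently executing tasks.
    public static long SumOfSquares(int count, int degree)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .WithDegreeOfParallelism(degree)
            .Select(n =&gt; (long)n * n)
            .Sum();
    }

    static void Main()
    {
        using (var done = new ManualResetEvent(false))
        {
            long result = 0;

            // Classic ThreadPool work item that runs a PLINQ query inside.
            ThreadPool.QueueUserWorkItem(_ =&gt;
            {
                result = SumOfSquares(1000, 2);
                done.Set();
            });

            done.WaitOne();
            Console.WriteLine(result); // prints 333833500
        }
    }
}</code></pre>
<p>Here the work item occupies one pool thread while the query is limited to two concurrent tasks, so the total load stays within what a quad-core machine can handle.</p>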
<p>Developers will be able to convert their code as needed in order to take full advantage of .Net 4.0 Parallel Extensions. There is no need to convert all existing multithreaded code at once. However, it is worth taking advantage of the new features as soon as possible.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/04/22/parallel-extensions-offer-backward-compatibility/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Concurrent collections for C# using Parallel Extensions</title>
		<link>http://software.intel.com/en-us/blogs/2009/04/08/concurrent-collections-for-c-using-parallel-extensions/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/04/08/concurrent-collections-for-c-using-parallel-extensions/#comments</comments>
		<pubDate>Wed, 08 Apr 2009 23:17:53 +0000</pubDate>
		<dc:creator>Gastón C. Hillar</dc:creator>
				<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[.Net 4.0]]></category>
		<category><![CDATA[concurrency]]></category>
		<category><![CDATA[parallel extensions]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[TPL]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/04/08/concurrent-collections-for-c-using-parallel-extensions/</guid>
		<description><![CDATA[One of the most difficult tasks related to concurrent programming in C# and .Net 3.5 is sharing collections, arrays or lists between many tasks running at the same time. Besides, the complexity increases when these concurrent tasks need to add and/or remove items from them. Doing this safely involves a great control of coordination data [...]]]></description>
			<content:encoded><![CDATA[<p>One of the most difficult tasks in concurrent programming with C# and .Net 3.5 is sharing collections, arrays or lists between many tasks running at the same time. The complexity increases when these concurrent tasks need to add or remove items. Doing this safely requires careful use of coordination data structures and efficient, fine-grained locking.<br />
One of the best-known examples is the producer-consumer scheme. On one side, a producer, running in one thread, must add elements to a collection, array or list. On the other side, a consumer, running concurrently in another thread, must remove elements from the same collection, array or list. Handling these situations is complex even for the bravest developers and software engineers.<br />
Luckily, the June 2008 CTP (<em>Community Technology Preview</em>) of <a href="http://msdn.microsoft.com/en-us/concurrency/default.aspx">Parallel Extensions to the .Net Framework</a> added very interesting high-level coordination data structures and thread-safe collections. The result is amazing: the complexity of the producer-consumer scheme is reduced to a minimum.<br />
With these high-level thread-safe collections and a good design, you can add and remove items while Parallel Extensions manages the necessary locks and low-level coordination. The design is simpler and the code is easier to read. The concurrent collections handle the complex work, and developers can focus on what they need to do while using high levels of concurrency safely.<br />
Using these concurrent collections implies a coordination cost, but they are very easy to use and they avoid lots of headaches. Believe me.<br />
These are the three concurrent collections initially added by the June 2008 CTP, which are also going to be available as part of .Net 4.0 in Visual Studio 2010:<br />
• <code>System.Threading.Collections.BlockingCollection</code>: Provides blocking and bounding capabilities.<br />
• <code>System.Threading.Collections.ConcurrentQueue</code>: Represents a variable size first-in-first-out (FIFO) collection.<br />
• <code>System.Threading.Collections.ConcurrentStack</code>: Represents a variable size last-in-first-out (LIFO) collection.<br />
They are included in the <code>System.Threading.Collections</code> namespace and they implement the common <code>System.Threading.Collections.IConcurrentCollection</code> interface.<br />
Using them, you can easily have many concurrent tasks, each running in its own thread, adding elements to and removing elements from a thread-safe shared collection. Hence, it is easier than ever to create multiple chained producer-consumer schemes that take full advantage of 4 or 8 cores. Besides, multiple threads can add elements while multiple threads remove them, concurrently.<br />
For example, in a ConcurrentQueue, the items can be added safely using the Enqueue method and they can be safely removed using the TryDequeue method.<br />
Three new structures are to be added in .Net 4.0: <code>ConcurrentBag</code>, <code>ConcurrentLinkedList</code> and <code>ConcurrentLinkedListNode</code>. I haven't had the opportunity to work with them yet. I'll tell you about them in another post.<br />
It sounds easy. It is. However, there are many other considerations to take into account, related to concurrency, immutability and thread-safe code. Nevertheless, you’ll love these concurrent collections in .Net 4.0.<br />
Comments and feedback are always welcome.</p>
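<p>As a rough sketch of the producer-consumer scheme described above, the following example uses <code>BlockingCollection</code> and <code>ConcurrentQueue</code>. Note one assumption: it uses the <code>System.Collections.Concurrent</code> namespace, which is where these types live in the released .NET Framework 4, rather than the <code>System.Threading.Collections</code> namespace of the June 2008 CTP. The item count and the bound of 8 are arbitrary illustrative values.</p>
<pre><code>using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class ProducerConsumer
{
    // Produces the numbers 1..itemCount in one task, consumes them in
    // another, and returns the sum computed by the consumer.
    public static int Run(int itemCount)
    {
        // Bounded to 8 items; backed by a ConcurrentQueue (FIFO) by default.
        using (var buffer = new BlockingCollection&lt;int&gt;(8))
        {
            int sum = 0;

            Task producer = Task.Factory.StartNew(() =&gt;
            {
                for (int i = 1; i &lt;= itemCount; i++)
                {
                    buffer.Add(i); // blocks while the buffer is full
                }
                buffer.CompleteAdding(); // signals that no more items will come
            });

            Task consumer = Task.Factory.StartNew(() =&gt;
            {
                // Blocks while the buffer is empty; ends after CompleteAdding.
                foreach (int item in buffer.GetConsumingEnumerable())
                {
                    sum += item;
                }
            });

            Task.WaitAll(producer, consumer);
            return sum;
        }
    }

    static void Main()
    {
        Console.WriteLine(Run(100)); // prints 5050

        // Non-blocking use of ConcurrentQueue with Enqueue and TryDequeue.
        var queue = new ConcurrentQueue&lt;string&gt;();
        queue.Enqueue("mesh");
        string next;
        if (queue.TryDequeue(out next))
        {
            Console.WriteLine(next); // prints mesh
        }
    }
}</code></pre>
<p>No explicit lock appears in the code: the collection handles the coordination between the producer and the consumer.</p>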
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/04/08/concurrent-collections-for-c-using-parallel-extensions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

