<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blogs &#187; Orion Granatir (Intel)</title>
	<atom:link href="http://software.intel.com/en-us/blogs/author/orion-granatir/feed/" rel="self" type="application/rss+xml" />
	<link>http://software.intel.com/en-us/blogs</link>
	<description></description>
	<lastBuildDate>Fri, 25 May 2012 22:49:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Using #ifdef in OpenGL ES 2.0 shaders</title>
		<link>http://software.intel.com/en-us/blogs/2012/03/26/using-ifdef-in-opengl-es-20-shaders/</link>
		<comments>http://software.intel.com/en-us/blogs/2012/03/26/using-ifdef-in-opengl-es-20-shaders/#comments</comments>
		<pubDate>Mon, 26 Mar 2012 18:08:33 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2012/03/26/using-ifdef-in-opengl-es-20-shaders/</guid>
		<description><![CDATA[It’s nice to use #ifdef’s in an OpenGL shader. This allows a shader file to do things like contain a vertex shader and a pixel shader in the same file, or have different render paths selected at run time. Here is an example of a simple shader file: To use a shader in OpenGL ES, the program [...]]]></description>
			<content:encoded><![CDATA[<p>It’s nice to use #ifdef’s in an OpenGL shader. This allows a shader file to do things like contain a vertex shader and a pixel shader in the same file, or have different render paths selected at run time. Here is an example of a simple shader file:</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2012/03/OGLES2_shader_example.png"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2012/03/OGLES2_shader_example.png" alt="" title="OGLES2_shader_example" width="445" height="419" class="aligncenter size-full wp-image-46066" /></a></p>
<p>To use a shader in OpenGL ES, the program will typically:<br />
1.	Load a file<br />
2.	Compile the shader<br />
3.	Link the shader</p>
<p>Before compiling the shader, the program calls <em>glShaderSource</em> to set the shader source.  The real trick is that <em>glShaderSource</em> can take multiple strings and combine them for compilation, which makes it possible to put a #define in front of the shader text loaded from the file.</p>
<p>Here is a code sample that compiles a shader given a string (<em>ShaderData</em>) and the size of the string (<em>Size</em>). The function is passed a type (<em>Type</em>) which is defined as <em>GL_VERTEX_SHADER</em> or <em>GL_FRAGMENT_SHADER</em>. This function uses an #ifdef to compile the source as a vertex or pixel shader:</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2012/03/OGLES_ifdef_compile_shader_example.png"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2012/03/OGLES_ifdef_compile_shader_example.png" alt="" title="OGLES_ifdef_compile_shader_example" width="810" height="897" class="aligncenter size-full wp-image-46069" /></a></p>
<p>The documentation for <em>glShaderSource</em> says that explicit string lengths are not needed for NULL terminated strings (the length array can be NULL, or an individual length can be negative). However, on my implementation of OGLES 2 this did not work, so I had to specify the individual string lengths (as defined in <em>ShaderStringLengths</em> for this example).</p>
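<p>Since the listing above is an image, here is a minimal sketch of the same idea. The helper name, the struct, and the macro names (<em>VERTEX_SHADER</em>, <em>FRAGMENT_SHADER</em>) are assumptions for illustration, not the names used in the original code. It only builds the string and length arrays; the GL calls themselves are shown in a comment because they need a live context.</p>

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the original post): collects the strings
// that would be handed to glShaderSource, plus their explicit lengths.
struct ShaderSourceStrings
{
    std::vector<const char*> Strings;
    std::vector<int>         Lengths;   // GLint in real GL code
};

// Puts a #define in front of the shader text so that the #ifdef blocks in
// the shared shader file select the vertex or the fragment path.
// The macro names are assumptions chosen for this sketch.
ShaderSourceStrings BuildShaderSource( bool IsVertexShader,
                                       const char* ShaderData, int Size )
{
    assert( ShaderData != nullptr && Size >= 0 );

    static const char VertexDefine[]   = "#define VERTEX_SHADER\n";
    static const char FragmentDefine[] = "#define FRAGMENT_SHADER\n";
    const char* Define = IsVertexShader ? VertexDefine : FragmentDefine;

    ShaderSourceStrings Source;
    Source.Strings = { Define, ShaderData };
    Source.Lengths = { static_cast<int>( std::strlen( Define ) ), Size };
    return Source;
}

// With a GL context, the result would be passed along as:
//   glShaderSource( Shader, 2, Source.Strings.data(), Source.Lengths.data() );
//   glCompileShader( Shader );
```

<p>Passing explicit lengths for every string sidesteps the NULL termination problem described above. One caveat: if the shader file starts with a #version directive, that directive has to stay first, so the #define would need to be inserted after it.</p>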
<p>With #ifdef’s, an application can modify a shader at runtime. It’s possible to enable/disable features like post processing effects, dynamic lighting, and more based on #defines. Best of luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2012/03/26/using-ifdef-in-opengl-es-20-shaders/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Android Everywhere - GDC 2012 Presentation</title>
		<link>http://software.intel.com/en-us/blogs/2012/02/24/android-everywhere-gdc-2012-presentation/</link>
		<comments>http://software.intel.com/en-us/blogs/2012/02/24/android-everywhere-gdc-2012-presentation/#comments</comments>
		<pubDate>Fri, 24 Feb 2012 18:06:20 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[Game Development]]></category>
		<category><![CDATA[3d games]]></category>
		<category><![CDATA[Android on Atom]]></category>
		<category><![CDATA[GDC 2012]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2012/02/24/android-everywhere-gdc-2012-presentation/</guid>
		<description><![CDATA[At GDC 2012, I will be presenting with Ian Ni-Lewis (from Google) on best practices for writing cross-platform native Android games. This presentation will discuss some lessons we have learned creating native Android apps that run on the widest range of hardware possible. Since I work at Intel, I'll obviously be discussing some experiences porting [...]]]></description>
			<content:encoded><![CDATA[<p>At GDC 2012, I will be presenting with Ian Ni-Lewis (from Google) on best practices for writing cross-platform native Android games.  This presentation will discuss some lessons we have learned creating native Android apps that run on the widest range of hardware possible.</p>
<p>Since I work at Intel, I'll obviously be discussing some experiences porting ARM apps to x86 :)<br />
<strong>Session Description</strong><br />
By GDC 2012, there will be 3 major ABIs (armeabi, armeabi-v7a, and x86), several GPU architectures (PVR, Mali, Tegra), and a wide range of Android OS versions.  Without careful planning and some tricks, validating your game on a variety of hardware and software configurations will be time consuming and costly.<br />
This session presents best practices for implementing truly cross-platform Android applications and how to properly enable low-level hardware optimizations.  This goes beyond high level issues such as appropriate screen sizing and handling different form factors. Attendees will learn:<br />
- How to package multiple platform specific .so files in a .APK file (“fat binaries”)<br />
- Understand the memory alignment differences between ARM and x86, and how to exploit them<br />
- How to use processor specific SIMD intrinsics (NEON and SSE)<br />
- How to properly use OpenGL extensions<br />
- Understand the best compiler flags for different ABIs, and<br />
- How to abstract out hardware and when it makes sense to do so</p>
<p>At the end of this session, attendees will know how to create highly optimized native applications on Android that follow best practices for cross-platform development.  The audience should already understand Android development, including use of the NDK.</p>
<p>Please come join me at GDC 2012!  See you there.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2012/02/24/android-everywhere-gdc-2012-presentation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Understanding x86 vs ARM Memory Alignment on Android</title>
		<link>http://software.intel.com/en-us/blogs/2011/08/18/understanding-x86-vs-arm-memory-alignment-on-android/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/08/18/understanding-x86-vs-arm-memory-alignment-on-android/#comments</comments>
		<pubDate>Thu, 18 Aug 2011 20:10:04 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[Embedded Computing]]></category>
		<category><![CDATA[Game Development]]></category>
		<category><![CDATA[Performance and Optimization]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Intel® Atom™]]></category>
		<category><![CDATA[ndk]]></category>
		<category><![CDATA[x86]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/08/18/understanding-x86-vs-arm-memory-alignment-on-android/</guid>
		<description><![CDATA[With Google’s recent release of the NDK (r6), it is now possible to build Android applications for x86 processors in addition to ARM. In general, this only involves rebuilding native code to port applications from ARM to x86. However, there are a few pitfalls to avoid. One difference between x86 and ARM is the memory alignment [...]]]></description>
			<content:encoded><![CDATA[<p>With Google’s recent release of the NDK (r6), it is now possible to build Android applications for x86 processors in addition to ARM.  In general, this only involves rebuilding native code to port applications from ARM to x86.  However, there are a few pitfalls to avoid.</p>
<p>One difference between x86 and ARM is the memory alignment requirements for data.  Let’s look at a simple example:<br />
<a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/08/Capture.png"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/08/Capture.png" alt="" title="CodeExample0" width="445" height="209" class="alignnone size-full wp-image-35666" /></a></p>
<p>This example just logs the size and offset of variables in TestStruct.  The output for this program isn’t too surprising:</p>
<p>ARM<br />
I/libtestjni( 5025): TestStruct (size: 12)<br />
I/libtestjni( 5025): -- Var1 offset: 0<br />
I/libtestjni( 5025): -- Var2 offset: 4<br />
I/libtestjni( 5025): -- Var3 offset: 8</p>
<p>x86<br />
I/libtestjni( 4175): TestStruct (size: 12)<br />
I/libtestjni( 4175): -- Var1 offset: 0<br />
I/libtestjni( 4175): -- Var2 offset: 4<br />
I/libtestjni( 4175): -- Var3 offset: 8</p>
<p>But now, let’s change TestStruct to the following:<br />
<a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/08/Capture2.png"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/08/Capture2.png" alt="" title="CodeSample1" width="112" height="65" class="alignnone size-full wp-image-35673" /></a></p>
<p>The output is now:</p>
<p>ARM<br />
I/libtestjni( 4675): TestStruct (size: 24)<br />
I/libtestjni( 4675): -- Var1 offset: 0<br />
I/libtestjni( 4675): -- Var2 offset: 8<br />
I/libtestjni( 4675): -- Var3 offset: 16</p>
<p>x86<br />
I/libtestjni( 4079): TestStruct (size: 16)<br />
I/libtestjni( 4079): -- Var1 offset: 0<br />
I/libtestjni( 4079): -- Var2 offset: 4<br />
I/libtestjni( 4079): -- Var3 offset: 12</p>
<p>The 8-byte (64-bit) mVar2 results in a different layout for TestStruct.  This is because ARM requires 8-byte alignment for 64-bit variables like mVar2, while the 32-bit x86 ABI only aligns them to 4 bytes inside a struct.  In most cases, this won’t cause problems because building for x86 vs ARM requires a full rebuild.</p>
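<p>Because the listings above are images, here is a rough reconstruction of the second test. The field types are assumptions chosen to match the sizes in the logs, and <em>printf</em> stands in for whatever Android logging call the original used.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Field types are assumptions; only the name mVar2 and its 64-bit size
// come from the text.
struct TestStruct
{
    int32_t mVar1;
    int64_t mVar2;   // the 8-byte member whose alignment differs per ABI
    int32_t mVar3;
};

// Prints the same information as the log output: total size and offsets.
void LogTestStructLayout()
{
    std::printf( "TestStruct (size: %zu)\n", sizeof( TestStruct ) );
    std::printf( "-- Var1 offset: %zu\n", offsetof( TestStruct, mVar1 ) );
    std::printf( "-- Var2 offset: %zu\n", offsetof( TestStruct, mVar2 ) );
    std::printf( "-- Var3 offset: %zu\n", offsetof( TestStruct, mVar3 ) );
}
```

<p>The numbers printed depend on the ABI the code is built for, which is exactly the point: the same source produces a 24-byte layout where 64-bit members are 8-byte aligned and a 16-byte layout on 32-bit x86.</p>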
<p>However, if an application serializes classes or structures, this could cause a size mismatch.  For example, say an application running on ARM creates a save file by writing TestStruct directly to disk.  If you later load this file on an x86 platform, the size of the class in the application will differ from the size stored in the file.  As you can imagine, similar memory alignment issues can happen for network traffic that expects a specific memory layout.</p>
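<p>One way to avoid depending on the compiler’s layout at all is to write each field explicitly instead of dumping the struct as raw bytes. The following is only a sketch of that alternative, not something from the original test app; the struct field types, the 16-byte format, and all function names are assumptions for illustration.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same assumed field types as the struct discussed above.
struct TestStruct
{
    int32_t mVar1;
    int64_t mVar2;
    int32_t mVar3;
};

// Fixed on-disk size: 4 + 8 + 4 bytes, with no padding on any platform.
const size_t kSerializedSize = 16;

// Writes the low 'Bytes' bytes of Value in little-endian order.
static void WriteLE( uint8_t* Out, uint64_t Value, size_t Bytes )
{
    for( size_t i = 0; i < Bytes; i++ )
        Out[i] = static_cast<uint8_t>( Value >> (8 * i) );
}

// Reads 'Bytes' little-endian bytes back into an integer.
static uint64_t ReadLE( const uint8_t* In, size_t Bytes )
{
    uint64_t Value = 0;
    for( size_t i = 0; i < Bytes; i++ )
        Value |= static_cast<uint64_t>( In[i] ) << (8 * i);
    return Value;
}

// Out must point to at least kSerializedSize bytes.
void Serialize( const TestStruct& S, uint8_t* Out )
{
    WriteLE( Out + 0,  static_cast<uint32_t>( S.mVar1 ), 4 );
    WriteLE( Out + 4,  static_cast<uint64_t>( S.mVar2 ), 8 );
    WriteLE( Out + 12, static_cast<uint32_t>( S.mVar3 ), 4 );
}

// In must point to at least kSerializedSize bytes.
TestStruct Deserialize( const uint8_t* In )
{
    TestStruct S;
    S.mVar1 = static_cast<int32_t>( static_cast<uint32_t>( ReadLE( In + 0, 4 ) ) );
    S.mVar2 = static_cast<int64_t>( ReadLE( In + 4, 8 ) );
    S.mVar3 = static_cast<int32_t>( static_cast<uint32_t>( ReadLE( In + 12, 4 ) ) );
    return S;
}
```

<p>The file or packet format is then defined by the code rather than by the struct’s in-memory layout, so a save written on ARM reads back correctly on x86 even though sizeof(TestStruct) differs.</p>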
<p>The GCC compiler option “-malign-double” will generate the same memory alignment on x86 and ARM.  However, since the OS was not built with this flag, it will break some OS calls.</p>
<p>You can control the alignment of variables through compiler attributes.  So, if we tell GCC to use aligned(8) for mVar2, x86 and ARM will have the same alignment:<br />
<a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/08/Capture1.png"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/08/Capture1.png" alt="" title="CodeSample2" width="260" height="62" class="alignnone size-full wp-image-35676" /></a></p>
<p>The output is now:</p>
<p>ARM<br />
I/libtestjni( 4675): TestStruct (size: 24)<br />
I/libtestjni( 4675): -- Var1 offset: 0<br />
I/libtestjni( 4675): -- Var2 offset: 8<br />
I/libtestjni( 4675): -- Var3 offset: 16</p>
<p>x86<br />
I/libtestjni( 4678): TestStruct (size: 24)<br />
I/libtestjni( 4678): -- Var1 offset: 0<br />
I/libtestjni( 4678): -- Var2 offset: 8<br />
I/libtestjni( 4678): -- Var3 offset: 16</p>
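<p>In text form, the attribute version might look like the sketch below. The field types are again assumptions; the attribute itself is GCC’s <em>aligned</em>. The static_asserts are an addition for illustration: they make the build fail if a target ever produces a different layout.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed field types; only mVar2 carries the explicit alignment.
struct TestStruct
{
    int32_t mVar1;
    int64_t mVar2 __attribute__ ((aligned(8)));
    int32_t mVar3;
};

// Compile-time checks that the layout matches the logs on every target.
static_assert( offsetof( TestStruct, mVar2 ) == 8,  "mVar2 must sit at offset 8" );
static_assert( offsetof( TestStruct, mVar3 ) == 16, "mVar3 must sit at offset 16" );
static_assert( sizeof( TestStruct ) == 24,          "TestStruct must be 24 bytes" );
```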
<p>Once you understand the memory alignment difference between x86 and ARM, rebuilding your ARM Android NDK application for x86 should be pretty simple!  Go grab the latest NDK and give it a try.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/08/18/understanding-x86-vs-arm-memory-alignment-on-android/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A guide to optimizing graphics and games for Intel® Atom based platforms</title>
		<link>http://software.intel.com/en-us/blogs/2011/02/25/a-guide-to-optimizing-graphics-and-games-for-intel-atom-based-platforms/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/02/25/a-guide-to-optimizing-graphics-and-games-for-intel-atom-based-platforms/#comments</comments>
		<pubDate>Sat, 26 Feb 2011 01:12:35 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Game Development]]></category>
		<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[Intel® AppUp Developer Program]]></category>
		<category><![CDATA[Performance and Optimization]]></category>
		<category><![CDATA[3D graphics]]></category>
		<category><![CDATA[Atom Developer]]></category>
		<category><![CDATA[Atom Developer Guide]]></category>
		<category><![CDATA[DirectX optimizations]]></category>
		<category><![CDATA[game programming]]></category>
		<category><![CDATA[Intel® Atom™]]></category>
		<category><![CDATA[optimizations]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/02/25/a-guide-to-optimizing-graphics-and-games-for-intel-atom-based-platforms/</guid>
		<description><![CDATA[Ron and I just finished up the first revision of the Atom graphics developers guide. You can download it here: http://software.intel.com/en-us/articles/mobile-graphics-developers-guides/ Topics covered include: - Intel® Atom processor optimizations - Understanding graphics packaged with Intel® Atom processor based platforms - Tools to help optimize and profile game/graphic applications - Microsoft DirectX* optimizations (most of the [...]]]></description>
			<content:encoded><![CDATA[<p>Ron and I just finished up the first revision of the Atom graphics developers guide.  You can download it here: <a href="http://software.intel.com/en-us/articles/mobile-graphics-developers-guides/">http://software.intel.com/en-us/articles/mobile-graphics-developers-guides/</a></p>
<p>Topics covered include:<br />
  - Intel® Atom processor optimizations<br />
  - Understanding graphics packaged with Intel® Atom processor based platforms<br />
  - Tools to help optimize and profile game/graphic applications<br />
  - Microsoft DirectX* optimizations (most of the focus is on Microsoft Windows* for this revision of the guide)</p>
<p>Most of these tips and tricks are for game developers targeting Intel® Atom processors.  However, anyone working with graphics should find some useful information.  Please let us know what you think!  We will use your feedback to help improve the next revision.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/02/25/a-guide-to-optimizing-graphics-and-games-for-intel-atom-based-platforms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Building a highly scalable 3D particle system</title>
		<link>http://software.intel.com/en-us/blogs/2011/02/18/building-a-highly-scalable-3d-particle-system/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/02/18/building-a-highly-scalable-3d-particle-system/#comments</comments>
		<pubDate>Fri, 18 Feb 2011 21:16:41 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Game Development]]></category>
		<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Performance and Optimization]]></category>
		<category><![CDATA[3d game]]></category>
		<category><![CDATA[3d games]]></category>
		<category><![CDATA[c++ parallel programming]]></category>
		<category><![CDATA[Fork Particle]]></category>
		<category><![CDATA[Game Threading]]></category>
		<category><![CDATA[simd]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/02/18/building-a-highly-scalable-3d-particle-system/</guid>
		<description><![CDATA[Particle systems are an ideal candidate for multi-threading in games. Most games have particle systems and their general nature of independent entities lends itself well to parallelism. However, a naïve approach won’t load balance well on modern architectures. There are two complementary approaches, task-based threading and SSE, which are ideally suited for particle systems and will [...]]]></description>
			<content:encoded><![CDATA[<p>Particle systems are an ideal candidate for multi-threading in games.  Most games have particle systems and their general nature of independent entities lends itself well to parallelism.  However, a naïve approach won’t load balance well on modern architectures.  There are two complementary approaches, task-based threading and SSE, which are ideally suited for particle systems and will obtain maximum performance from multi-core processors.</p>
<p><strong>Task-based threading</strong><br />
A particle system is ideal for threading because it’s essentially a big loop that operates on a bunch of independent objects.  Because the objects don’t need to interact (they don’t write to shared data), they can easily be spread across multiple threads.</p>
<p>Here is an example of a loop that updates all particles:</p>
<blockquote><p>for( unsigned int i = 0; i &lt; NumParticles; i++ )<br />
{</p>
<ul>UpdateForces( g_Particle[i], DeltaTime );<br />
    UpdateCollision( g_Particle[i], DeltaTime );<br />
    UpdatePosition( g_Particle[i], DeltaTime );</ul>
<p>}</p></blockquote>
<p>Threading these loops is trivial.  OpenMP is supported by all major compilers and allows simple for-loops to be parallelized.  OpenMP will automatically divide the loop and run it across all available cores on the machine.  </p>
<blockquote><p>
  #pragma omp parallel for<br />
  for( unsigned int i = 0; i &lt; NumParticles; i++ )<br />
  {</p>
<ul>UpdateForces( g_Particle[i], DeltaTime );<br />
      UpdateCollision( g_Particle[i], DeltaTime );<br />
      UpdatePosition( g_Particle[i], DeltaTime );</ul>
<p>  }
</p></blockquote>
<p>However, this is not an ideal approach.  This method does not support good load balancing in a complex system such as a real game.  For example, if spawned threads generate more threads (nested threading), then it’s possible to oversubscribe the system.  Oversubscription causes a performance hit because there is an overhead associated with swapping the execution of the threads.  </p>
<p>There is a better way to thread particles than the simple fork-and-join approach of OpenMP’s parallel for.   It’s very simple to divide the work to run as tasks.  Using tasks provides several benefits.  Once you have a tasking system set up, it’s easier to add new tasks to increase parallelism throughout the code.  Also, it’s easier to load balance and be platform-agnostic.  If the task scheduler manages all parallel tasks, the program will avoid oversubscription.  In this example, we don’t have to wait for all the particles for a given emitter to finish before moving to the next emitter and scheduling more tasks.</p>
<p>To convert the loop above to use tasks, the code needs to divide the work into several tasks and submit them to a task scheduler.  Each task defines the range of particles to update and includes all required information.</p>
<blockquote><p>
unsigned int Start = 0;<br />
unsigned int End = 0;<br />
unsigned int ParticlesPerTask = NumParticles / NumTasks;</p>
<p>for( unsigned int i = 0; i &lt; NumTasks; i++ )<br />
{</p>
<ul>// Determine the range of particles to update<br />
    // (the last task might be bigger)<br />
    End = (i &lt; (NumTasks-1)) ? Start + ParticlesPerTask : NumParticles;</ul>
<ul>// Build and submit the task<br />
    ParticleTask* pTask = &amp;g_ParticleTasks[i];<br />
    pTask-&gt;m_DeltaTime = DeltaTime;<br />
    pTask-&gt;m_Start = Start;<br />
    pTask-&gt;m_End = End;</ul>
<ul>g_TaskScheduler-&gt;addTask(pTask);</ul>
<ul>// Move to the next set of particles<br />
    Start = End;</ul>
<p>}</p>
<p>...</p>
<p>void ParticleTask::Run()<br />
{</p>
<ul>for( unsigned int i = m_Start; i &lt; m_End; i++ )<br />
    {</p>
<ul>UpdateForces( g_Particle[i], m_DeltaTime );<br />
        UpdateCollision( g_Particle[i], m_DeltaTime );<br />
        UpdatePosition( g_Particle[i], m_DeltaTime );</ul>
<p>    }
</ul>
<p>}
</p></blockquote>
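<p>The listing above leaves the scheduler itself out. As a rough, self-contained sketch of the whole flow, the version below uses a stand-in scheduler in which a fixed set of worker threads pulls tasks from a shared counter. The <em>Particle</em> fields, <em>BuildTasks</em>, and <em>RunTasks</em> are hypothetical names for illustration, and the single position update stands in for the three Update calls; a real engine would use its own task system.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Simplified particle: a real one would carry forces, collision data, etc.
struct Particle { float Position; float Velocity; };

// A task covers the half-open particle range [m_Start, m_End).
struct ParticleTask
{
    unsigned int m_Start;
    unsigned int m_End;
    float        m_DeltaTime;

    void Run( std::vector<Particle>& Particles ) const
    {
        for( unsigned int i = m_Start; i < m_End; i++ )
            Particles[i].Position += Particles[i].Velocity * m_DeltaTime;
    }
};

// Splits NumParticles into NumTasks ranges; the last task takes the remainder.
std::vector<ParticleTask> BuildTasks( unsigned int NumParticles,
                                      unsigned int NumTasks, float DeltaTime )
{
    assert( NumTasks > 0 );
    std::vector<ParticleTask> Tasks;
    unsigned int ParticlesPerTask = NumParticles / NumTasks;
    unsigned int Start = 0;
    for( unsigned int i = 0; i < NumTasks; i++ )
    {
        unsigned int End = (i < NumTasks - 1) ? Start + ParticlesPerTask : NumParticles;
        Tasks.push_back( ParticleTask{ Start, End, DeltaTime } );
        Start = End;
    }
    return Tasks;
}

// Stand-in scheduler: each worker grabs the next unclaimed task until none
// are left. Tasks touch disjoint ranges, so no locking is needed.
void RunTasks( const std::vector<ParticleTask>& Tasks,
               std::vector<Particle>& Particles, unsigned int NumWorkers )
{
    std::atomic<size_t> NextTask( 0 );
    auto Worker = [&]()
    {
        for( size_t t = NextTask.fetch_add( 1 ); t < Tasks.size(); t = NextTask.fetch_add( 1 ) )
            Tasks[t].Run( Particles );
    };

    std::vector<std::thread> Workers;
    for( unsigned int w = 0; w < NumWorkers; w++ )
        Workers.emplace_back( Worker );
    for( std::thread& T : Workers )
        T.join();
}
```

<p>Creating more tasks than workers is deliberate: a worker that finishes early simply picks up the next task, which is where the load balancing comes from. The number of workers stays fixed no matter how many tasks are submitted, so the system is not oversubscribed.</p>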
<p><strong>Using SSE</strong><br />
Tasking is a great way to get scaling with a particle system.  However, it’s also important to make sure the code fully uses the CPU cores it is running on.  Developers should consider using SIMD (single instruction, multiple data) with SSE instructions.  For floating point, an SSE instruction operates on four single-precision floats at once.  This has the potential to increase throughput by up to 4x.</p>
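<p>As a small illustration of the four-at-a-time idea, here is a sketch of a position update written with SSE intrinsics. The function name and the flat float arrays are assumptions for illustration and are not taken from Ticker Tape; it builds only for x86 targets.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>   // SSE intrinsics (x86 / x86-64 only)

// Position[i] += Velocity[i] * DeltaTime, four floats per iteration.
// Count is assumed to be a multiple of 4. The unaligned load/store variants
// are used so the arrays need no special alignment.
void UpdatePositionsSSE( float* Position, const float* Velocity,
                         size_t Count, float DeltaTime )
{
    assert( Count % 4 == 0 );

    // Broadcast DeltaTime into all four lanes once, outside the loop.
    const __m128 Dt = _mm_set1_ps( DeltaTime );

    for( size_t i = 0; i < Count; i += 4 )
    {
        __m128 Pos = _mm_loadu_ps( Position + i );
        __m128 Vel = _mm_loadu_ps( Velocity + i );
        Pos = _mm_add_ps( Pos, _mm_mul_ps( Vel, Dt ) );
        _mm_storeu_ps( Position + i, Pos );
    }
}
```

<p>With 16-byte aligned arrays, the aligned variants (_mm_load_ps and _mm_store_ps) can be used instead. Storing each component in its own contiguous array, as assumed here, is what lets four particles be processed per instruction.</p>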
<p>There are multiple ways to use SSE.  For developers interested in maximum control, intrinsics are the best way to utilize SSE.  Intrinsics are compiler-specific functions that the compiler expands inline into highly efficient machine instructions.  For developers targeting DirectX on PC or Xbox, the <a href="http://msdn.microsoft.com/en-us/library/ee418725(v=VS.85).aspx">XNA Math library</a> wraps the use of intrinsics in a library that already supports vectors and matrices.</p>
<p>There are a few things to keep in mind when using the XNA Math library.  First, be careful accessing individual elements.  Getting and setting elements inside an SSE vector isn’t free.  It’s best to put data into XMVECTORs and keep it there as long as possible.  Also, make sure you are using properly aligned data.</p>
<p>For a more in-depth discussion on cross-platform SIMD, check out Gustavo Oliveira’s <a href="http://www.gamasutra.com/view/feature/4248/designing_fast_crossplatform_simd_.php">well-written article</a>.</p>
<p><strong>Sample application</strong><br />
A lot of this article is based on the learnings from my cube-mate, Quentin Froemke (lovingly referred to as Q-Ball).  Quentin created a tech sample called Ticker Tape to showcase some best known practices for creating a highly scalable particle system.  To learn more about Ticker Tape, check out the <a href="http://software.intel.com/sites/billboard/va-magazine/issue-06/articles/ticker-tape/">associated article</a> and download the <a href="http://www.intel.com/software/tickertape/">source code</a>.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/02/tickertape0.jpg"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/02/tickertape0.jpg" alt="" width="529" height="526" class="aligncenter size-full wp-image-24487" /></a> </p>
<p>Ticker Tape simulates the physics behavior of fluttering and tumbling.  This simulation showcases an interesting and complex particle behavior.  The performance benefits for utilizing multi-core and SSE are apparent in Ticker Tape:</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/02/tickertape1.jpg"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/02/tickertape1.jpg" alt="" width="500" height="104" class="aligncenter size-full wp-image-24488" /></a> </p>
<p>Both multi-core and SSE give significant benefits.  The benefits of Intel® Hyper-Threading Technology are much more pronounced with SSE.  Hyper-threading takes advantage of the fact that all execution units for a CPU core might not be fully utilized by a single thread.  Multiple execution units allow multiple instructions to be executed simultaneously and be pipelined.  Execution units perform operations such as loads, stores, integer operations, floating-point operations, and SSE operations.  With more SSE instructions, there is a better utilization of all the processor’s resources and therefore better use of hyper-threading.</p>
<p>If you are afraid that adding a highly parallel particle system will Duke Nuke your schedule, you can always investigate middleware options.  For example, the team at <a href="http://www.forkparticle.com/">Fork Particle</a> has a parallel particle system backed by solid content creation tools.</p>
<p><strong>Summary/ Conclusion</strong><br />
With higher core counts, it’s possible to scale with compute power and show a larger number of particles. This would give users a perceivable difference for high-end machines, without punishing players with lesser gaming hardware.  </p>
<p>Particle systems are ideal for threading and a good wading pool for people new to threading.  Future blogs will go off the deep end and explore more complex problems like crowd simulation AI.  In the meantime, check out Ticker Tape and let me know what you think!</p>
<p><strong>Update: Fixed source code</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/02/18/building-a-highly-scalable-3d-particle-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Finding the Next Challenge in Visual Computing</title>
		<link>http://software.intel.com/en-us/blogs/2011/02/07/finding-the-next-challenge-in-visual-computing/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/02/07/finding-the-next-challenge-in-visual-computing/#comments</comments>
		<pubDate>Mon, 07 Feb 2011 23:57:57 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Game Development]]></category>
		<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[3d game]]></category>
		<category><![CDATA[DirectX]]></category>
		<category><![CDATA[multi-core]]></category>
		<category><![CDATA[Sandy Bridge]]></category>
		<category><![CDATA[Visual Comnputing]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/02/07/finding-the-next-challenge-in-visual-computing/</guid>
		<description><![CDATA[Five years ago, a new wave of consoles brought the inception of High Definition (HD) content to the videogame industry. Some companies excelled in this era, others did not. Nevertheless, the industry is now HD. As evidenced by console manufacturers developing new input controls to extend the life of this generation of consoles, [...]]]></description>
			<content:encoded><![CDATA[<p>Five years ago, a new wave of consoles brought the inception of High Definition (HD) content to the videogame industry.  Some companies excelled in this era, others did not.  Nevertheless, the industry is now HD.  As evidenced by console manufacturers developing new input controls to extend the life of this generation of consoles, the industry is looking for the next challenge.  With the release of DirectX 11, the increasing capabilities of processor graphics, and a new surge of mobile devices, there certainly is plenty to explore.</p>
<p>Some think programming for multi-core is a punishment reserved for the eighth ring of hell (I recently beat EA's Dante’s Inferno, so it’s on my mind).  But you don’t need to fight your way through purgatory to reach multi-core heaven.  If you keep two concepts in mind, things get easier:</p>
<p>1. Use data decomposition – A game can’t scale just by dividing subsystems onto separate threads (sometimes referred to as "functional decomposition").  It has to divide data intelligently to run across multiple cores.  The ever-insightful Mike Acton has a <a href="http://www.insomniacgames.com/research_dev/articles/2009/1500943">great article</a> on this very subject over at Insomniac Games’ R&amp;D page. </p>
<p>2. Use tasks, not threads directly – To scale on an arbitrary number of cores and be truly cross-platform, work should be divided into tasks.  A task is a unit of work (for example a function pointer and data) that can run without (or with very limited) synchronization.  These units of work are processed by a thread pool which is scaled appropriately to the available parallelism in the hardware.</p>
<p>By utilizing tasking and data decomposition together, it is possible to take advantage of multi-core for all subsystems.  Prior to DX11, rendering was still the locked gateway to heaven.  With DX11, it’s now possible to divide the work of rendering into multiple tasks using deferred contexts.  Jérôme Muffat-Méridol recently delivered a presentation about DX11 multithreaded rendering at GDC Europe 2010 which detailed methods for doing this.  Jérôme lovingly refers to this project as Nulstein.  In previous versions of Nulstein, Jérôme also explored the requirements of building a task scheduler from scratch, but the focus for GDC Europe was multithreaded DX11.</p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/NulsteinDemo_GDCEurope.png"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/NulsteinDemo_GDCEurope.png" alt="" width="478" height="366" class="aligncenter size-full wp-image-23382" /></a> </p>
<p>It’s interesting to note that DX11’s multithreaded goodness can run on DX10 hardware.  DX11 supports “feature levels,” which allow the latest API to be used on a wide range of hardware (provided the application is running on Vista/Win7) by emulating any missing behavior in software.  To learn more, check out the <a href="http://software.intel.com/en-us/articles/nulstein/">articles and presentations</a> associated with Nulstein.</p>
<p>There is a growing uptake of task-based architectures in game engine design.  Mark Randel’s <a href="http://software.intel.com/en-us/blogs/2009/06/24/highlights-and-challenges-during-ghostbusters-development-part-2/">implementation</a> of physics and AI in Terminal Reality’s Infernal Engine is nothing short of amazing.  The fellows over at <a href="http://www.bitsquid.se/">BitSquid</a> are building a new game engine written with solid support for multi-core.  Furthermore, the Civ5 team at Firaxis is getting great performance out of a task-based approach (check out their <a href="http://www.gdcvault.com/free/gdc-10">GDC 2010 presentation</a>: “Firaxis’ Civilization V: A Case Study in Scalable Game Performance”). </p>
<p>Intel recently introduced its 2nd Generation Intel® Core™ processors (codenamed Sandy Bridge).  This generation of CPUs really ushers in the era of processor graphics.  There is a lot to explore with Sandy Bridge!  Check out <a href="http://software.intel.com/en-us/articles/onloaded-shadows/">Onloaded Shadows</a> if you want to see how the close combination of CPU and GPU can lead to new and interesting areas.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/02/07/finding-the-next-challenge-in-visual-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Look at Sandy Bridge: Integrating Graphics into the CPU</title>
		<link>http://software.intel.com/en-us/blogs/2011/01/13/a-look-at-sandy-bridge-integrating-graphics-into-the-cpu/</link>
		<comments>http://software.intel.com/en-us/blogs/2011/01/13/a-look-at-sandy-bridge-integrating-graphics-into-the-cpu/#comments</comments>
		<pubDate>Thu, 13 Jan 2011 23:45:33 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Game Development]]></category>
		<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[Performance and Optimization]]></category>
		<category><![CDATA[3D graphics]]></category>
		<category><![CDATA[Sandy Bridge]]></category>
		<category><![CDATA[Visual Computing]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2011/01/13/a-look-at-sandy-bridge-integrating-graphics-into-the-cpu/</guid>
		<description><![CDATA[Processor graphics will soon be found in computers everywhere. Intel calls this new capability Intel® HD Graphics; at AMD, it’s called AMD Fusion*. Both names refer to the integration of graphics functionality into the CPU. At CES this year, Intel introduced the 2nd Generation Intel® Core™ processors (codenamed “Sandy Bridge”). This processor is Intel’s first [...]]]></description>
			<content:encoded><![CDATA[<p>Processor graphics will soon be found in computers everywhere.  Intel calls this new capability Intel® HD Graphics; at AMD, it’s called AMD Fusion*.  Both names refer to the integration of graphics functionality into the CPU.  </p>
<p>At CES this year, Intel introduced the 2nd Generation Intel® Core™ processors (codenamed “Sandy Bridge”).  This processor is Intel’s first new microarchitecture utilizing 32nm technology.  Smaller transistors and new architectural design result in higher performance at lower power.</p>
<p>One advantage of working at Intel is having early access to leading-edge technology.  I’m writing this blog on my Sandy Bridge machine.  Right now it may be another big dev box on my desk, but this chip is now reaching numerous mainstream laptops.  The processor graphics are pretty good.  It runs one of my favorite games, Bioware’s Mass Effect* 2, at a solid fps in glorious detail (by the way, my mission was truly suicidal).</p>
<p>One key feature of this architecture is the tighter integration of graphics into the processor.   The ring architecture that connects the processor cores (x86 cores) together is now connected to the processor graphics.  This ring interconnect enables high-speed and low-latency communication between the processor cores, processor graphics, and other integrated components, such as the memory controller and the display.  Basically, the processor cores and graphics engine communicate through a shared cache, creating some interesting possibilities for tight CPU/GPU interaction.  We’ll explore some of these topics in future articles.</p>
<p>The integration of processor components also provides some new improvements to Intel® Turbo Boost Technology.  Intel® Turbo Boost Technology enables the processor to adjust the processor core and processor graphics frequencies to increase performance while staying within the allotted power/thermal budget.  This means the processor can increase individual core speed or graphics speed as the workload dictates.   </p>
<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/tb.png"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/01/tb.png" alt="" width="722" height="575" class="aligncenter size-full wp-image-22928" /></a></p>
<p>During performance analysis, it’s important to pay attention to dynamic frequencies.  A typical game loads both the CPU and GPU, and the processor finds a good balance between them.  However, if you are playing back a captured frame, the CPU workload might not be as high and the graphics dynamic frequency might affect your results.  The relative cost of work within the frame should scale consistently, but the total time to complete the frame might differ from what you see in the running game.  </p>
<p>Since Intel® Turbo Boost Technology is automatically controlled by the CPU, a developer cannot directly control it.  However, it’s important to understand how it works.  Most games I have investigated benefit noticeably from the graphics turbo scaling.</p>
<p>The addition of Intel® Advanced Vector Extensions (Intel® AVX) is another interesting Sandy Bridge feature.  AVX extends SIMD (single instruction multiple data) instructions from 128 bits to 256 bits.  For applications that are floating-point intensive, AVX enables a single instruction to work on eight single-precision floating-point values at a time instead of the four that 128-bit SSE provides.  </p>
<p>It’s important to note other hardware vendors have also <a href="http://blogs.amd.com/developer/2009/05/06/striking-a-balance/">announced support for AVX</a>.</p>
<p>Most developers will just use the latest Microsoft® Visual Studio compiler or Intel’s C/C++ Compiler to take advantage of AVX.  But, for the clock-counting, bit-shifting tech heads out there, you can learn more at <a href="http://software.intel.com/en-us/avx/">Intel’s AVX website</a>.</p>
<p>The best way to work with AVX is through intrinsics, which are supported by both the Microsoft and Intel compilers.  Anyone familiar with programming SSE or the PlayStation* 3’s SPUs will be good “frenemies” with intrinsics.  Intrinsics are compiler-specific functions that usually compile down to highly efficient inlined machine instructions.  Since the compiler has a strong understanding of intrinsics, it will often generate faster code than hand-written inline assembly.  Intrinsics are the best way to write high-throughput, compute-intensive code on the CPU (intrinsics are an acquired taste, like wine or Remedy Entertainment’s Alan Wake*). </p>
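<p>To make that concrete, here is a minimal sketch (not taken from any Intel sample) of adding two float arrays eight lanes at a time with the AVX intrinsics <code>_mm256_loadu_ps</code>, <code>_mm256_add_ps</code>, and <code>_mm256_storeu_ps</code> from <code>immintrin.h</code>.  The function name <code>add_floats</code> is made up for illustration, and the sketch assumes the element count is a multiple of eight.  When the compiler is not targeting AVX, it falls back to a plain scalar loop.</p>

```cpp
#include <cassert>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, count).
// Assumption for this sketch: count is a multiple of 8.
void add_floats(const float* a, const float* b, float* out, std::size_t count)
{
    assert(count % 8 == 0);
#if defined(__AVX__)
    // AVX path: each iteration loads, adds, and stores eight floats.
    for (std::size_t i = 0; i < count; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);   // unaligned load of 8 floats
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#else
    // Scalar fallback when the build does not target AVX.
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

<p>Build with AVX enabled (for example <code>/arch:AVX</code> on the Microsoft compiler or <code>-mavx</code> on GCC and Clang) to get the 256-bit path.</p>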
<p>Intel engineers and performance-hungry developers are already exploring ways AVX can benefit game and graphics applications.  For example, my coworker Stan Melax just wrote <a href="http://software.intel.com/en-us/articles/3d-vector-normalization-using-256-bit-intel-advanced-vector-extensions-intel-avx/">a great article</a> presenting a programming pattern to improve the performance of geometry computations by transposing packed 3D data on-the-fly.</p>
<p>AVX is interesting, but let’s shift our focus to the graphics engine.  </p>
<p>Remember, Sandy Bridge is a DirectX 10.1 part.  I like the DX11 multithreaded API, so most of my current code is DX11 with the proper DX10.1 “feature level” set for Sandy Bridge.  With full DX10.1 support, there are no real major surprises when programming for Intel graphics.  However, there are a few things to keep in mind.</p>
<p>The memory layout for processor graphics is different than it is for a discrete card.  Graphics applications often check for the amount of available free video memory early in execution.  As a result of the dynamic allocation of graphics memory performed by the Intel® HD Graphics devices (based on application requests), you need to know the total amount of memory that is truly available to the graphics device.  Memory checks that supply only the amount of “local” or “dedicated” graphics memory available do not supply an appropriate value for the Intel® HD Graphics devices.  </p>
<p>All video memory on Intel® HD Graphics and earlier generations, including Intel® Graphics Media Accelerator Series 3 and 4, uses Dynamic Video Memory Technology (DVMT).  DVMT memory is considered “local memory.”  “Non-local video memory” will show as zero (0).  This should not be used to determine compatibility with Accelerated Graphics Port (AGP) or PCI Express*.</p>
<p>To accurately detect the amount of memory available, you’ll need to check the total video memory availability.  The Microsoft DirectX* SDK (June 2010) includes the VideoMemory sample code and describes five commonly used methods to detect the total amount of video memory.  Applications targeting Microsoft Windows Vista* and Microsoft Windows* 7 should reference GetVideoMemoryViaDXGI. For Microsoft Windows* XP applications, GetVideoMemoryViaWMI is a good starting place.  For more information, see the <a href="http://msdn.microsoft.com/en-us/library/ee419018(v=VS.85).aspx">Microsoft sample code</a> site. </p>
<p>The best place to get started with Intel processor graphics is to check out the <a href="http://software.intel.com/en-us/articles/intel-graphics-developers-guides">Intel Graphics Developer’s Guides</a>.</p>
<p>With nearly a million PCs shipped each day, the available market of processor graphics is growing quickly.  It’s worthwhile to understand and validate on processor graphics.  Soon, processor graphics will be everywhere.  </p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2011/01/13/a-look-at-sandy-bridge-integrating-graphics-into-the-cpu/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Help: Writing a File Manager for Games?</title>
		<link>http://software.intel.com/en-us/blogs/2009/03/11/help-writing-a-file-manager-for-games/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/03/11/help-writing-a-file-manager-for-games/#comments</comments>
		<pubDate>Wed, 11 Mar 2009 17:16:05 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Game Development]]></category>
		<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[file system]]></category>
		<category><![CDATA[games]]></category>
		<category><![CDATA[Level Up 2009]]></category>
		<category><![CDATA[multithreaded file system]]></category>
		<category><![CDATA[vfs]]></category>
		<category><![CDATA[virtual file system]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/03/11/help-writing-a-file-manager-for-games/</guid>
		<description><![CDATA[Programming a game is fun. It’s not every day that you get to work on algorithms for things like zombies, mechs, or aliens. However, a zombie wouldn’t be very scary if you couldn’t load the mesh, textures, or sounds. There are some less glorious tasks required to make a game. No one really wants to [...]]]></description>
			<content:encoded><![CDATA[<p>Programming a game is fun. It’s not every day that you get to work on algorithms for things like zombies, mechs, or aliens. However, a zombie wouldn’t be very scary if you couldn’t load the mesh, textures, or sounds. There are some less glorious tasks required to make a game. No one really wants to write a file manager. But a good file manager will allow you to stream better content, load and manage bigger/more files, and generally have a better game.</p>
<p>I often ask game developers “what do you want?”<br />
In other words, I really want to know what technologies game developers are interested in. More to the point, I am interested in technologies that they wish someone else would help implement. One common answer is “give me a file/memory manager… no one wants to write a file/memory manager.” I’ll write another blog about the memory manager part ;)</p>
<p>I am researching the needs of a good generic file manager for games.<br />
Here is what I think is required:<br />
- Designed for games<br />
- Platform independent<br />
- Multithreaded<br />
- Open source<br />
- Support streaming content<br />
- Support archives</p>
<p>There are some good articles about file systems on the internet.<br />
Michael Walter wrote a good article on FlipCode about <a href="http://www.flipcode.com/archives/Programming_a_Virtual_File_System-Part_I.shtml">virtual file systems</a>.<br />
However, there is little about open source file systems and desired features for a file manager for games.</p>
<p>I need your help.<br />
What do you want?<br />
Is there a good open/closed source project already in the works?<br />
What features would make a great file system for games?</p>
<p>Thanks,<br />
Orion</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/03/11/help-writing-a-file-manager-for-games/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Threaded AI: FTW</title>
		<link>http://software.intel.com/en-us/blogs/2009/01/29/threaded-ai-ftw/</link>
		<comments>http://software.intel.com/en-us/blogs/2009/01/29/threaded-ai-ftw/#comments</comments>
		<pubDate>Fri, 30 Jan 2009 00:16:40 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Software Tools]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[GDC]]></category>
		<category><![CDATA[multi-core]]></category>
		<category><![CDATA[Smoke]]></category>
		<category><![CDATA[What If]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2009/01/29/threaded-ai-ftw/</guid>
		<description><![CDATA[Writing threaded AI (Artificial Intelligence) is epic? It’s easier than you would think! I am going to give a presentation about multithreaded AI at GDC this year. We will examine how AI can be threaded and live in a highly parallel environment. How can you thread AI? How can AI talk to physics running on [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/01/ai_attack.bmp"><img src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2009/01/ai_attack.bmp" alt="An AI monster attacks our hero" class="aligncenter size-medium wp-image-5320" /></a></p>
<p>Writing threaded AI (Artificial Intelligence) is epic?<br />
It’s easier than you would think! </p>
<p>I am going to give a presentation about multithreaded AI at GDC this year.  We will examine how AI can be threaded and live in a highly parallel environment.  How can you thread AI?  How can AI talk to physics running on another thread or device?  Is deferred processing worthwhile?  Do you have to thread your designers?  My presentation will hopefully answer these questions and more.  The presentation will end with a quick overview of Intel’s Smoke demo;  Smoke is an n-way threaded framework that includes source code for highly parallel AI (you can download it at Whatif.intel.com).</p>
<p>Here is my tentative outline:<br />
  • The challenge – Why thread AI<br />
  • Define a Simple AI – Let’s start by defining a simple game AI<br />
  • Threading AI – Let’s thread our simple AI<br />
  • Working in a Multithread Land – Let’s look at how AI can work and live in a highly parallel environment<br />
  • Smoke: An Example – A quick overview of Smoke and its multithreaded AI<br />
  • Summary – Threaded AI for the win!!</p>
<p>Please let me know what you think.<br />
Is there anything you want to hear about that I’m not covering?</p>
<p>If you are interested or working with AI/Smoke, I hope to see you at GDC!</p>
<p>EDIT: You can get the Smoke demo <a href="http://software.intel.com/en-us/articles/smoke-game-technology-demo/">here</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2009/01/29/threaded-ai-ftw/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lock or Be Free</title>
		<link>http://software.intel.com/en-us/blogs/2008/12/30/lock-or-be-free/</link>
		<comments>http://software.intel.com/en-us/blogs/2008/12/30/lock-or-be-free/#comments</comments>
		<pubDate>Wed, 31 Dec 2008 00:07:29 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Software Tools]]></category>
		<category><![CDATA[execution mode]]></category>
		<category><![CDATA[Game Development]]></category>
		<category><![CDATA[GDC]]></category>
		<category><![CDATA[multi-core]]></category>
		<category><![CDATA[Smoke]]></category>
		<category><![CDATA[threading]]></category>
		<category><![CDATA[What If]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2008/12/30/lock-or-be-free/</guid>
		<description><![CDATA[Jeff Andrews wrote a great article about multithreaded game engines over on Whatif.intel.com. These are the concepts that the Smoke (n-way threaded game framework) demo was built on. One of the readers, Josh, brought up a good comment (check out the comments at the bottom of the article). One of the questions in his comment [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2008/12/lock_or_be_free.bmp"><img class="alignnone size-full wp-image-4684" src="http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2008/12/lock_or_be_free.bmp" alt="" /></a></p>
<p>Jeff Andrews wrote a great <a href="http://software.intel.com/en-us/articles/designing-the-framework-of-a-parallel-game-engine">article</a> about multithreaded game engines over on Whatif.intel.com. These are the concepts that the Smoke (n-way threaded game framework) demo was built on. One of the readers, Josh, brought up a good comment (check out the comments at the bottom of the article). One of the questions in his comment really got me thinking about lock step versus free step. Which is better? Which is easier? Which does Smoke use?</p>
<p>The last question is the easiest: Smoke uses lock step. Jeff goes into some good detail in section 2.1 about these two execution modes. Lock step just means all systems update at the same rate. For each frame, all systems update and sync. If a system takes longer to run, the other systems have to wait for it to finish. This is the biggest complaint against lock step. In free step, all systems run at separate frequencies… they update and sync based on events.</p>
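<p>Here is a small back-of-the-envelope sketch of the waiting cost of lock step, assuming each system runs on its own thread.  The names <code>LockStepFrame</code> and <code>lock_step_frame</code> are hypothetical (they are not part of Smoke), and the update times are whatever you measure in your own engine.  In lock step the frame lasts as long as the slowest system, and every other system idles for the difference.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical result type: how long the frame takes and how long each
// system sits idle waiting for the sync point.
struct LockStepFrame {
    double frame_ms;
    std::vector<double> idle_ms;
};

// Models one lock-step frame with one thread per system.
// update_ms[i] is the time system i needs to update this frame.
LockStepFrame lock_step_frame(const std::vector<double>& update_ms)
{
    assert(!update_ms.empty());
    LockStepFrame frame{0.0, {}};
    // The frame cannot end until the slowest system has finished.
    for (double t : update_ms) {
        frame.frame_ms = std::max(frame.frame_ms, t);
    }
    // Every faster system waits for the remainder of the frame.
    for (double t : update_ms) {
        frame.idle_ms.push_back(frame.frame_ms - t);
    }
    return frame;
}
```

<p>For example, update times of 4, 10, and 6 ms give a 10 ms frame with 6, 0, and 4 ms of idle time.</p>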
<p>Lock step does have its advantages. It’s easier to understand and code. It’s definitely easier to debug (you can check for sanity at each sync). As Josh points out, lock step doesn’t suffer from the possibly increased frame latency of free step. But lock step can waste resources if other systems are waiting around… or does it?</p>
<p>First attempts at threading games mostly involved functional decomposition; just put each system (graphics, IO, physics, etc.) on a separate thread. If a system finishes updating and has to wait for another system, that system’s thread sleeps. This is wasting resources because that thread could be doing more work… this is the major fault of lock step. However, Smoke uses a job pool and worker threads to support functional and data decomposition. So… if a system finishes its work, that thread can work on jobs for other systems! Now we are not wasting resources on lock step ^_^ Score! There are a few exceptions: if a system doesn’t divide its work properly and takes a long period of time to finish… then the worker threads could end up unemployed.</p>
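<p>To see why a shared job pool helps, here is a rough single-threaded simulation (an illustration only, not Smoke’s actual scheduler).  The function <code>job_pool_frame_ms</code> is a hypothetical name: it hands each job, in order, to whichever worker frees up first and returns the time at which the last worker finishes.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates a job pool shared by all systems.
// job_ms lists the cost of each job in submission order; workers is the
// number of worker threads. Returns the simulated frame time in ms.
double job_pool_frame_ms(const std::vector<double>& job_ms, std::size_t workers)
{
    assert(workers > 0);
    // busy_until[w] is the time at which worker w finishes its current work.
    std::vector<double> busy_until(workers, 0.0);
    for (double cost : job_ms) {
        // The next job goes to the worker that becomes free first.
        auto next_free = std::min_element(busy_until.begin(), busy_until.end());
        *next_free += cost;
    }
    // The frame ends when the last worker runs out of work.
    return *std::max_element(busy_until.begin(), busy_until.end());
}
```

<p>Ten 2 ms jobs on three workers finish in 8 ms, while the same 20 ms of work packed as undivided 4, 10, and 6 ms jobs still takes 10 ms, because nothing can help with the 10 ms job.</p>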
<p>I have wanted to rework Smoke to run in free step mode. But I am comfortable with lock step. It’s easy for me to understand and explain… and I can easily map out the latency between the systems. I wonder if anyone will take up the challenge and get Smoke working in a free step mode… I’d love to hear if anyone out there gives it a try ^_^ I’d also like to hear more about people’s experiences with free or lock step. Which do you like? Is free step worth the possible headaches (especially with a large team of developers)?</p>
<p>While running free sounds nice, I think I’ll keep my projects locked down for the near future.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2008/12/30/lock-or-be-free/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing Smoke and Orion</title>
		<link>http://software.intel.com/en-us/blogs/2008/12/10/introducing-smoke-and-orion/</link>
		<comments>http://software.intel.com/en-us/blogs/2008/12/10/introducing-smoke-and-orion/#comments</comments>
		<pubDate>Wed, 10 Dec 2008 17:16:46 +0000</pubDate>
		<dc:creator>Orion Granatir (Intel)</dc:creator>
				<category><![CDATA[Graphics & Media]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Software Tools]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[games]]></category>
		<category><![CDATA[multi-core]]></category>
		<category><![CDATA[Smoke]]></category>
		<category><![CDATA[threading]]></category>

		<guid isPermaLink="false">http://software.intel.com/en-us/blogs/2008/12/10/introducing-smoke-and-orion/</guid>
		<description><![CDATA[Hi all ^_^ This is the first of a series of blogs I am going to post about Smoke. Smoke is a demo developed by Intel to show n-way threading in a game framework… that’s a mouth full! Basically, it shows one way games can maximize the CPU. To maximize CPU utilization, a game needs [...]]]></description>
			<content:encoded><![CDATA[<p>Hi all ^_^</p>
<p>This is the first of a series of blogs I am going to post about Smoke.<br />
Smoke is a demo developed by Intel to show n-way threading in a game framework… that’s a mouthful!  Basically, it shows one way games can maximize the CPU.  To maximize CPU utilization, a game needs to use all available cores.  A properly threaded game can have more accurate physics, smarter AI, more particles, and/or a faster frame-rate.  Smoke demonstrates one way to achieve better games.</p>
<p>Smoke has been in development for over a year and a half!  I’m glad it’s finally released.<br />
There is a ton of content about Smoke <a href="http://software.intel.com/en-us/articles/smoke-game-technology-demo">here</a> (including all the source code).</p>
<p>Since this is my first blog on ISN, let me introduce myself...<br />
I’m a senior engineer with Intel’s Visual Computing Software Division.  I am the tech lead on the Smoke project.  Prior to joining Intel in 2007, I worked on several PlayStation 3 titles as a senior programmer for Insomniac Games.  My most recently published titles are Resistance: Fall of Man and Ratchet and Clank Future: Tools of Destruction.</p>
<p>In my next post, I’ll talk about some of the work I did on Smoke (e.g. multithreaded AI).</p>
<p>- Orion</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/en-us/blogs/2008/12/10/introducing-smoke-and-orion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

