<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Chinese &#187; 邓辉</title>
	<atom:link href="http://software.intel.com/zh-cn/blogs/author/285737/feed/" rel="self" type="application/rss+xml" />
	<link>http://software.intel.com/zh-cn/blogs</link>
	<description></description>
	<lastBuildDate>Mon, 28 May 2012 13:40:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>2011 Intel® Threading Challenge: Parallelized Parser and Formula Interpreter</title>
		<link>http://software.intel.com/zh-cn/blogs/2011/08/09/2011-parallelized-parser-and-formula-interpreter/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2011/08/09/2011-parallelized-parser-and-formula-interpreter/#comments</comments>
		<pubDate>Tue, 09 Aug 2011 06:43:21 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[Blog Essay Column]]></category>
		<category><![CDATA[Parallel Computing]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Threading Challenge]]></category>
		<category><![CDATA[Interpreter]]></category>
		<category><![CDATA[Parsing]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2011/08/09/2011-parallelized-parser-and-formula-interpreter/</guid>
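		<description><![CDATA[  2011 Intel® Threading Challenge: Parallelized Parser and Formula Interpreter 邓辉   denghui0815@hotmail.com Because time was short, this write-up is not very detailed; my apologies.  Source code download Problem Description A formula parser and interpreter analyzes a continuous stream of input text, checks that the syntax is correct, derives the data structure from the input, and evaluates the formulas. Formula results can be stored in variables, and later formulas can use, as one of their values, any variable that has not been removed from memory. The parallel processing capability of recent multicore hardware can be used to increase the throughput of parsers and interpreters. Recent enhancements to the JavaScript engines of many popular web browsers show the need to speed up web-related parsers. Problem statement: Write multithreaded code that interprets and executes the input statements and generates the output. Text must be sent to the output in the same order as the corresponding statements appear in the program input. You may execute multiple statements in parallel to increase throughput. Note: keywords and variables are case sensitive. The line terminator is the semicolon (;). The following example defines two lines: var x = 50; var y = 10; Valid keywords and functions: Keyword Description Example var {variableName} Defines a new variable named variableName. variableName may consist of letters and digits [A-Za-z0-9] but must start with a letter. The program must infer the data type from the value given to the variable with the assignment operator (=). var x = 50;var double = 50.2353;var decimal = 345.212312321; free(variableName) Frees the memory occupied by the variable. By using var [...]]]></description>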
			<content:encoded><![CDATA[<p> </p>
<h1 style="text-align: center;">2011 Intel® Threading Challenge:</h1>
<h1 style="text-align: center;">Parallelized Parser and Formula Interpreter</h1>
<h4 style="text-align: center;">邓辉   <a href="mailto:denghui0815@hotmail.com">denghui0815@hotmail.com</a></h4>
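<h5 style="text-align: left;">Because time was short, this write-up is not very detailed; my apologies.</h5>
<p style="text-align: left;"> <a href="http://software.intel.com/file/37324">Source code download</a></p>
<h1>Problem Description</h1>
<p>A formula parser and interpreter analyzes a continuous stream of input text, checks that the syntax is correct, derives the data structure from the input, and evaluates the formulas. Formula results can be stored in variables, and later formulas can use, as one of their values, any variable that has not been removed from memory. The parallel processing capability of recent multicore hardware can be used to increase the throughput of parsers and interpreters. Recent enhancements to the JavaScript engines of many popular web browsers show the need to speed up web-related parsers.</p>
<p><strong>Problem statement: </strong></p>
<p>Write multithreaded code that interprets and executes the input statements and generates the output. Text must be sent to the output in the same order as the corresponding statements appear in the program input. You may execute multiple statements in parallel to increase throughput.</p>
<p>Note: keywords and variables are case sensitive. The line terminator is the semicolon (;).</p>
<p>The following example defines two lines:</p>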
<p>var x = 50;</p>
<p>var y = 10;</p>
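<p>Valid keywords and functions:</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top">Keyword</td>
<td valign="top">Description</td>
<td valign="top">Example</td>
</tr>
<tr>
<td valign="top">var {variableName}</td>
<td valign="top">Defines a new variable named variableName. variableName may consist of letters and digits [A-Za-z0-9] but must start with a letter. The program must infer the data type from the value given to the variable with the assignment operator (=).</td>
<td valign="top">var x = 50;<br />var double = 50.2353;<br />var decimal = 345.212312321;</td>
</tr>
<tr>
<td valign="top">free(variableName)</td>
<td valign="top">Frees the memory occupied by the variable. The variable name can be reused by defining it again with the var keyword. If a statement tries to use a freed variable, the program should stop executing and display an error message giving the line number and statement that caused the “variable undefined” error.</td>
<td valign="top">free(x);</td>
</tr>
<tr>
<td valign="top">string(variableName | value)</td>
<td valign="top">Converts the value or variable given as the argument to its string representation. The conversion should always use a fixed ten digits after the decimal point. For example, string(50.323467); should evaluate to 50.3234670000. No rounding is performed; the value is only truncated or padded with zeros as needed.</td>
<td valign="top">string(3450.212);<br />string(double);<br />string(result);</td>
</tr>
<tr>
<td valign="top">output(variableName | value)</td>
<td valign="top">Sends the value or variable given as the argument to the output buffer/file. If the argument is a number, the interpreter should convert it to a string using the same mechanism specified for the string function. For example, output(50.323467) should send 50.3234670000 to the output.</td>
<td valign="top">output(325.30);<br />output(25 + 32);<br />output(result);<br />output(result + 32);<br />output(string(20) + string(70));</td>
</tr>
</tbody>
</table>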
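<p>The fixed ten-digit rule of string() and output() can be sketched as follows. This is only an illustration, not the code from the download: the name FormatNumber is hypothetical, and the trick of printing two guard digits and cutting them off is an assumption made here to approximate “truncate, never round” while hiding binary floating-point noise. It also assumes finite values.</p>

```cpp
#include <cstdio>
#include <string>

// Format a number with exactly ten digits after the decimal point.
// Assumption: twelve digits are printed and the last two are cut off, so the
// visible ten digits are truncated rather than rounded in ordinary cases.
std::string FormatNumber(double value)
{
    char buffer[512];
    std::snprintf(buffer, sizeof(buffer), "%.12f", value);
    std::string text(buffer);
    text.resize(text.size() - 2);   // drop the two guard digits
    return text;
}
```

<p>With this sketch, FormatNumber(50.323467) yields 50.3234670000. A value that sits within 10^-12 below a ten-digit boundary would still be rounded up, so a strict implementation would have to work on the decimal text instead.</p>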
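<p>Note: a formula can be used as an argument, and the result of the formula is the value of that argument. For convenience, you may map the parser types to standard types defined in many high-level languages: numeric types such as signed integer and signed double, and string (with a maximum allowed length of 256 characters).</p>
<p>Examples:</p>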
<p>output(50 + 20);</p>
<p>string(35 + 32);</p>
<p>string(32 * 12 * multiplier);</p>
<p>Once a variable has been defined with the var keyword, new values can be assigned to it without the keyword. For example:</p>
<p>var a = 50;</p>
<p>a = a + 1;</p>
<p>a = 30;</p>
<p>free(a);</p>
<p>Valid arithmetic operators for numbers and numeric variables:</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top">Operator</td>
<td valign="top">Description</td>
<td valign="top">Example</td>
</tr>
<tr>
<td valign="top">=</td>
<td valign="top">Basic assignment</td>
<td valign="top">var x = 50;<br />var double = 50.2353;<br />var decimal = 345.212312321;</td>
</tr>
<tr>
<td valign="top">+</td>
<td valign="top">Addition</td>
<td valign="top">var z = x + double;<br />var q = y + z;<br />output((30 + 20) + 10);</td>
</tr>
<tr>
<td valign="top">-</td>
<td valign="top">Subtraction</td>
<td valign="top">var z = q - d;<br />var result = (30 - 16);</td>
</tr>
<tr>
<td valign="top">*</td>
<td valign="top">Multiplication</td>
<td valign="top">var m = 30 * 5;<br />var z = (30 + 5) * 2;</td>
</tr>
<tr>
<td valign="top">/</td>
<td valign="top">Division (real data)</td>
<td valign="top">var div = (30 / 2);<br />var d = 5 / 2;<br />var m = ((50 * d) / div) + 25;</td>
</tr>
</tbody>
</table>
<p>Note: the parser will support implicit conversion from integer to double data types.</p>
<p>Parentheses group expressions to control the order of evaluation.</p>
<p>For example:</p>
<p>var m = ((50 * d) / div) + 25;</p>
<p>should be resolved in the following steps:</p>
<p> (50 * d)</p>
<p>(result of the previous operation) / div</p>
<p>(result of the previous operation) + 25</p>
<p>Strings are enclosed in double quotes ("), ASCII code 34, written in HTML as &amp;#34; or &amp;quot;.</p>
<p>Valid operators for strings:</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top">Operator</td>
<td valign="top">Description</td>
<td valign="top">Example</td>
</tr>
<tr>
<td valign="top">=</td>
<td valign="top">Basic assignment</td>
<td valign="top">var str = "String Value";<br />var str2 = str + " and more";<br />var str3 = string(35 + 12);<br />var str4 = "result: " + string(50 / 2);<br />str4 = str4 + "a";<br />str = str + str4;</td>
</tr>
<tr>
<td valign="top">{string} + {string}</td>
<td valign="top">Concatenation</td>
<td valign="top">var strConcat = str1 + str2 + str3;<br />output("First result: " + string(result));</td>
</tr>
</tbody>
</table>
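<p>Note: {string} + {number} concatenation is not allowed; use string({number}) to convert the number to a string. Keywords and routine names are not valid variable names. If a syntax error occurs, the program should stop executing and display an error message giving the line number and statement that caused the syntax error.</p>
<p>Valid math routines/functions:</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top">Function</td>
<td valign="top">Description</td>
<td valign="top">Example</td>
</tr>
<tr>
<td valign="top">pow(variableName | value, variableName | value)</td>
<td valign="top">Returns the base (first argument) raised to the power of the exponent (second argument).</td>
<td valign="top">var result = pow(2, 4);<br />var powRes = pow(20 + 5, (6 / res));</td>
</tr>
<tr>
<td valign="top">sqrt(variableName | value)</td>
<td valign="top">Returns the square root of the argument. If the number is negative, an error message should be displayed and execution stopped, because this is treated as a syntax error.</td>
<td valign="top">var result = sqrt(1400);<br />output("srq: " + string(sqrt(30)));</td>
</tr>
</tbody>
</table>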
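<p><strong>Input description:</strong></p>
<p>The input to the program comes from a text file named on the command line (the file name is the first argument of the program). The end of the file marks the end of the statements to execute. As noted above, the interpreter may choose the best mechanism for interpreting all the statements in the input file and generating the output.</p>
<p><strong>Output description:</strong></p>
<p>The program interprets every output function in the input file and writes one line for each. The output should contain the result of the call, in the same order as in the input file. Correct output matters most, but high throughput and scalability also count toward the score. The program's output is stored in the text file named as the second argument on the command line. If a syntax error occurs, the corresponding message should be written to the output file as described earlier.</p>
<p>Command-line example: formulaint.exe input.txt output.txt</p>
<p>Sample input file, input.txt:</p>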
<p>var a = 150;</p>
<p>var b = 320;</p>
<p>var c = (a + b) / 20;</p>
<p>var d = ((320 / 2) * 50);</p>
<p>output("Value for a: " + string(a));</p>
<p>output("Value for b: " + string(b)); var message = "a / b = ";</p>
<p>message = message + string(a / b);</p>
<p>var unusedStr = "I will not use this string at all. Will you waste memory on me?";</p>
<p>output(message);</p>
<p>free(message);</p>
<p>free(unusedStr);</p>
<p>free(a);</p>
<p>free(b);</p>
<p>free(c);</p>
<p>free(d);</p>
<p>Sample output file, output.txt:</p>
<p>Value for a: 150.0000000000</p>
<p>Value for b: 320.0000000000</p>
<p>a / b = 0.4687500000</p>
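<p><strong>Timing:</strong> The total execution time of the program is used for scoring. For the most accurate timing, submitted code should include timing code and print the measured total execution time to standard output; otherwise an external stopwatch is used.</p>
<h1>Serial Algorithm</h1>
<p>The first task of the interpreter is tokenization. The token rules are:</p>
<p>1.     Keywords: var sqrt pow string free output</p>
<p>2.     Operators: = + - * / ( )</p>
<p>3.     Numbers</p>
<p>4.     Strings</p>
<p>5.     Variable names</p>
<p>6.     Statement terminator: ;</p>
<p>7.     Separators: space \t \r \n</p>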
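<p>The seven rules above can be sketched as a small tokenizer. The names TokenKind, Token and Tokenize are hypothetical and not taken from the downloadable source. As assumptions of this sketch, the comma that separates the arguments of pow is classified together with the operators, and malformed input is not diagnosed.</p>

```cpp
#include <cctype>
#include <string>
#include <vector>

enum class TokenKind { Keyword, Operator, Number, String, Identifier, Terminator };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split source text into tokens following the seven rules listed above.
// Assumption: errors such as an unterminated string are not reported here;
// the real interpreter has to report the line number and statement.
std::vector<Token> Tokenize(const std::string& source)
{
    static const char* const kKeywords[] = {"var", "sqrt", "pow", "string", "free", "output"};
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        const unsigned char c = static_cast<unsigned char>(source[i]);
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {      // rule 7: separators
            ++i;
        } else if (c == ';') {                                       // rule 6: terminator
            tokens.push_back({TokenKind::Terminator, ";"});
            ++i;
        } else if (c == '"') {                                       // rule 4: string literal
            std::size_t end = source.find('"', i + 1);
            if (end == std::string::npos) end = source.size();
            tokens.push_back({TokenKind::String, source.substr(i + 1, end - i - 1)});
            i = end + 1;
        } else if (std::isdigit(c)) {                                // rule 3: number
            std::size_t end = i;
            while (end < source.size() &&
                   (std::isdigit(static_cast<unsigned char>(source[end])) || source[end] == '.'))
                ++end;
            tokens.push_back({TokenKind::Number, source.substr(i, end - i)});
            i = end;
        } else if (std::isalpha(c)) {                                // rules 1 and 5
            std::size_t end = i;
            while (end < source.size() && std::isalnum(static_cast<unsigned char>(source[end])))
                ++end;
            const std::string word = source.substr(i, end - i);
            TokenKind kind = TokenKind::Identifier;
            for (const char* keyword : kKeywords)
                if (word == keyword) kind = TokenKind::Keyword;
            tokens.push_back({kind, word});
            i = end;
        } else {                                                     // rule 2, plus the comma
            tokens.push_back({TokenKind::Operator, std::string(1, source[i])});
            ++i;
        }
    }
    return tokens;
}
```

<p>For the statement var x = 50; this produces five tokens: the keyword var, the identifier x, the operator =, the number 50 and the terminator.</p>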
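<p>The parser has to support four kinds of statement:</p>
<p>1.     Variable definition:   var name = expression;</p>
<p>2.     Variable assignment:   name = expression;</p>
<p>3.     Variable release:      free(name);</p>
<p>4.     Output:                output(expression);</p>
<p>During parsing, each expression is converted to reverse Polish notation (RPN).</p>
<p>Conversion method</p>
<p>1.     First create an operator stack. Operators on the stack follow the rule that precedence increases toward the top of the stack.</p>
<p>2.     Read a simple arithmetic expression in infix form. For convenience, assume the special symbol “#”, which has the lowest precedence, has been appended to the right end of the expression.</p>
<p>3.     Scan the expression from left to right, starting at the first character. If the character is a digit, read to the end of the number and output the number directly.</p>
<p>4.     Otherwise the character is an operator, and its precedence must be compared.</p>
<p>Compare the character with the operator on top of the stack. If the character has higher precedence than the top operator, push it. If not, pop operators from the stack, appending each one to the output, until the top operator has lower precedence than the current one, then push the character.</p>
<p>5.     Repeat steps (3)-(4) until the whole expression has been scanned and every character has been handled; the infix expression has then been converted to reverse Polish notation.</p>
<p>For example:</p>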
<p>        a =  ( (b + c) * pow(d, e));</p>
<p>Variables       Operators               Expression</p>
<p>a                                       a</p>
<p>a               =                       a</p>
<p>a b             =                       a b</p>
<p>a b             = +                     a b</p>
<p>a b c           = +                     a b c</p>
<p>a b c           =                       a b c +</p>
<p>a b c           = *                     a b c +</p>
<p>a b c           = * pow                 a b c +</p>
<p>a b c d         = * pow                 a b c + d</p>
<p>a b c d e       = * pow                 a b c + d e</p>
<p>a b c d e       = *                     a b c + d e pow</p>
<p>a b c d e       =                       a b c + d e pow *</p>
<p>a b c d e       =                       a b c + d e pow * =</p>
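<p>The conversion traced above can be sketched with the classic shunting-yard method. This is a minimal sketch, not the author's implementation: the function name ToReversePolish is hypothetical, tokens are passed as plain strings, “=” is assumed to be right-associative with the lowest precedence, and unary minus and unbalanced parentheses are not handled.</p>

```cpp
#include <map>
#include <string>
#include <vector>

// Convert infix tokens to reverse Polish notation (shunting-yard method).
std::vector<std::string> ToReversePolish(const std::vector<std::string>& tokens)
{
    // A larger number binds tighter. "=" is lowest and right-associative.
    static const std::map<std::string, int> kPrecedence = {
        {"=", 1}, {"+", 2}, {"-", 2}, {"*", 3}, {"/", 3}};
    auto isFunction = [](const std::string& t) {
        return t == "pow" || t == "sqrt" || t == "string";
    };

    std::vector<std::string> output;
    std::vector<std::string> stack;   // operator stack, back() is the top
    for (const std::string& token : tokens) {
        if (isFunction(token) || token == "(") {
            stack.push_back(token);
        } else if (token == ",") {
            // Argument separator: flush the operators of the finished argument.
            while (!stack.empty() && stack.back() != "(") {
                output.push_back(stack.back());
                stack.pop_back();
            }
        } else if (token == ")") {
            while (!stack.empty() && stack.back() != "(") {
                output.push_back(stack.back());
                stack.pop_back();
            }
            if (!stack.empty()) stack.pop_back();            // discard "("
            if (!stack.empty() && isFunction(stack.back())) {
                output.push_back(stack.back());              // emit the function call
                stack.pop_back();
            }
        } else if (kPrecedence.count(token)) {
            const int current = kPrecedence.at(token);
            while (!stack.empty() && kPrecedence.count(stack.back())) {
                const int top = kPrecedence.at(stack.back());
                // Pop while the top binds tighter, or equally for left-associative operators.
                if (top > current || (top == current && token != "=")) {
                    output.push_back(stack.back());
                    stack.pop_back();
                } else {
                    break;
                }
            }
            stack.push_back(token);
        } else {
            output.push_back(token);                         // operand: number or variable
        }
    }
    while (!stack.empty()) {
        output.push_back(stack.back());
        stack.pop_back();
    }
    return output;
}
```

<p>Fed the tokens of a = ((b + c) * pow(d, e)), the sketch returns a b c + d e pow * =, the same expression as the last row of the trace.</p>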
<p>The final step is to evaluate the expression.</p>
<h1>Hotspot Analysis</h1>
<p>Hotspot analysis with Intel Amplifier gives the following result:</p>
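<p>Evaluating a reverse Polish expression needs only a value stack. The sketch below is an illustration under simplifying assumptions: the name EvaluateReversePolish is hypothetical, only numeric values are handled (strings and string() are left out), and errors are raised as exceptions without the line number that the problem statement requires.</p>

```cpp
#include <cctype>
#include <cmath>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Evaluate a numeric expression given in reverse Polish notation.
// Variables are looked up in, and assignments stored to, `variables`.
double EvaluateReversePolish(const std::vector<std::string>& rpn,
                             std::map<std::string, double>& variables)
{
    struct Operand { std::string name; double value; };   // name is empty for plain values
    std::vector<Operand> stack;

    // Pop the top operand and resolve it to a number.
    auto popValue = [&]() -> double {
        if (stack.empty()) throw std::runtime_error("syntax error");
        const Operand top = stack.back();
        stack.pop_back();
        if (top.name.empty()) return top.value;
        const auto it = variables.find(top.name);
        if (it == variables.end()) throw std::runtime_error("variable undefined: " + top.name);
        return it->second;
    };

    for (const std::string& token : rpn) {
        if (token == "+" || token == "-" || token == "*" || token == "/" || token == "pow") {
            const double right = popValue();
            const double left = popValue();
            double result = 0.0;
            if (token == "+") result = left + right;
            else if (token == "-") result = left - right;
            else if (token == "*") result = left * right;
            else if (token == "/") result = left / right;
            else result = std::pow(left, right);
            stack.push_back({"", result});
        } else if (token == "sqrt") {
            const double argument = popValue();
            if (argument < 0.0) throw std::runtime_error("sqrt of a negative number");
            stack.push_back({"", std::sqrt(argument)});
        } else if (token == "=") {
            const double value = popValue();
            if (stack.empty() || stack.back().name.empty()) throw std::runtime_error("syntax error");
            const std::string target = stack.back().name;
            stack.pop_back();
            variables[target] = value;
            stack.push_back({"", value});
        } else if (std::isdigit(static_cast<unsigned char>(token[0]))) {
            stack.push_back({"", std::stod(token)});          // numeric literal
        } else {
            stack.push_back({token, 0.0});                    // variable name, resolved when used
        }
    }
    return popValue();
}
```

<p>Variable names are resolved only when an operator consumes them, which lets the same stack entry serve as the assignment target of “=”.</p>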
<p><img src="http://hi.csdn.net/attachment/201108/6/0_1312592290J9Sj.gif" alt="" /></p>
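<p>The results show that the expression-evaluation function and the variable-lookup function account for most of the time.</p>
<h1>Parallel Algorithm</h1>
<p>Every expression has inputs and an output, and expressions whose inputs do not overlap can be evaluated in parallel. We therefore analyze the inputs of each expression and activate expressions according to their dependencies: the output of an evaluated expression updates the inputs of the expressions still waiting. Once all inputs of a waiting expression have been updated, it is added to the queue of expressions ready for evaluation. Multiple threads then evaluate the expressions in that queue in parallel.        </p>
<p>Concurrency analysis with Intel Amplifier gives the following result:</p>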
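<p>One way to picture the dependency analysis is to assign every statement the earliest “wave” in which it may run. The sketch below is a simplified, single-threaded illustration and not the scheduler from the download: Statement and ScheduleWaves are hypothetical names, and the read and write sets are assumed to have been extracted by the parser already.</p>

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

struct Statement {
    std::vector<std::string> reads;    // variables the statement uses
    std::vector<std::string> writes;   // variables it defines, assigns or frees
};

// Assign each statement the earliest wave in which it may run. Statements in
// the same wave have no conflicting access to a common variable, so a pool of
// threads could evaluate them concurrently.
std::vector<int> ScheduleWaves(const std::vector<Statement>& program)
{
    std::map<std::string, int> lastWrite;   // wave of the latest writer per variable
    std::map<std::string, int> lastRead;    // wave of the latest reader per variable
    auto waveOf = [](const std::map<std::string, int>& table, const std::string& name) {
        const auto it = table.find(name);
        return it == table.end() ? -1 : it->second;
    };

    std::vector<int> waves;
    for (const Statement& s : program) {
        int wave = 0;
        for (const std::string& r : s.reads)       // read after write
            wave = std::max(wave, waveOf(lastWrite, r) + 1);
        for (const std::string& w : s.writes) {    // write after write, write after read
            wave = std::max(wave, waveOf(lastWrite, w) + 1);
            wave = std::max(wave, waveOf(lastRead, w) + 1);
        }
        for (const std::string& r : s.reads)
            lastRead[r] = std::max(waveOf(lastRead, r), wave);
        for (const std::string& w : s.writes)
            lastWrite[w] = wave;
        waves.push_back(wave);
    }
    return waves;
}
```

<p>For the first statements of the sample input, var a and var b land in wave 0, var c = (a + b) / 20 in wave 1, and a later free(a) has to wait until every reader of a has finished. The output lines still have to be written in input order, whatever order the statements are evaluated in.</p>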
<p><img src="http://hi.csdn.net/attachment/201108/6/0_1312592298625f.gif" alt="" /></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2011/08/09/2011-parallelized-parser-and-formula-interpreter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2011 Intel® Threading Challenge: Tiling Rectangles</title>
		<link>http://software.intel.com/zh-cn/blogs/2011/06/17/2011-tiling-rectangles/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2011/06/17/2011-tiling-rectangles/#comments</comments>
		<pubDate>Fri, 17 Jun 2011 01:46:19 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[Parallel Computing]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Tiling Rectangles]]></category>
		<category><![CDATA[Parallel]]></category>
		<category><![CDATA[Threading Challenge]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2011/06/17/2011-tiling-rectangles/</guid>
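		<description><![CDATA[2011 Intel® Threading Challenge: Tiling Rectangles   邓辉   denghui0815@hotmail.com    Source code download Problem Description Given a rectangular area with integer dimensions, the area can be subdivided into square subregions, also with integer dimensions. This process is called tiling the rectangle. For such a square-tiled rectangle, the tiling can be encoded as a sequence of grouped integers. Starting from the upper horizontal side of the rectangle, the squares are “read” from left to right and from top to bottom. The side lengths of the squares whose top edges lie on the same horizontal level are grouped in parentheses and listed in order from left to right.   For example, a 4x7 rectangle tiled as follows…  _ _ _ _ _ _ _ &#124;         &#124;     &#124;_&#124; &#124;         &#124;_ _&#124;_&#124; &#124;         &#124;_&#124;     &#124; &#124; _ _ _ &#124;_&#124;_ _&#124;   …is encoded as (4 2 1) (1) (1 2) (1)   Problem statement: Write multithreaded code that takes as input an unknown number of ordered sets of integers and determines whether some grouping of the integers in each set forms a proper encoding of a square-tiled rectangle. The integer sets to test are stored in the first text file named on the program's command line. For every valid encoding found, the program outputs the height and width of the rectangle and the correctly formatted tiling encoding. The output is stored in the second text file named on the command line. [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-family: 仿宋_GB2312; font-size: 22pt;" lang="EN-US">2011 </span><span style="font-family: 仿宋_GB2312; font-size: 22pt;">Intel</span><span style="font-size: 22pt;" lang="EN-US"><span style="font-family: Calibri;">®</span></span><span style="font-family: 仿宋_GB2312; font-size: 22pt;" lang="EN-US"> </span><span style="font-family: 仿宋_GB2312; font-size: 22pt;">Threading Challenge: <span lang="EN-US">Tiling Rectangles</span></span></p>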
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><strong><span style="font-family: 仿宋_GB2312; font-size: 15pt;"> </span></strong></p>
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><strong><span style="font-family: 仿宋_GB2312; font-size: 15pt;">邓辉<span lang="EN-US"><span>   </span><a href="mailto:denghui0815@hotmail.com"><span style="color: #000000;">denghui0815@hotmail.com</span></a></span></span></strong></p>
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><strong></strong> </p>
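<h1 style="line-height: normal; margin: 10pt 0cm;"><span style="font-family: 宋体; font-size: 14pt;"> <a href="http://software.intel.com/file/36639">Source code download</a></span></h1>
<h1 style="line-height: normal; margin: 10pt 0cm;"><span style="font-size: x-large;"><span style="font-family: 宋体;">Problem Description</span></span></h1>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Given a rectangular area with integer dimensions, the area can be subdivided into square subregions, also with integer dimensions. This process is called tiling the rectangle. For such a square-tiled rectangle, the tiling can be encoded as a sequence of grouped integers. Starting from the upper horizontal side of the rectangle, the squares are “read” from left to right and from top to bottom. The side lengths of the squares whose top edges lie on the same horizontal level are grouped in parentheses and listed in order from left to right.</span></span></p>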
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">For example, a <span lang="EN-US">4x7 </span>rectangle tiled as follows…</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"><span> </span>_ _ _ _ _ _ _</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">|<span>        </span><span> </span>|<span>     </span>|_|</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">|<span>         </span>|_ _|_|</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">|<span>         </span>|_|<span>     </span>|</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">| _ _ _ |_|_ _|</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">…is encoded as<span lang="EN-US"> (4 2 1) (1) (1 2) (1)</span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
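<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-size: small;"><strong><span style="font-family: 宋体; color: black;">Problem statement: </span></strong><span style="font-family: 宋体; color: black;">Write multithreaded code that takes as input an unknown number of ordered sets of integers and determines whether some grouping of the integers in each set forms a proper encoding of a square-tiled rectangle. The integer sets to test are stored in the first text file named on the program's command line. For every valid encoding found, the program outputs the height and width of the rectangle and the correctly formatted tiling encoding. The output is stored in the second text file named on the command line.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-size: small;"><strong><span style="font-family: 宋体; color: black;">Input description: </span></strong><span style="font-family: 宋体; color: black;">The input to the program comes from a text file named on the command line. Consecutive lines correspond, in order, to the same potential tiling of a rectangle. A “<span lang="EN-US">0</span>” (zero) marks the end of a tiling integer set. Each line of text in the input file contains<span lang="EN-US"> 20 </span>integers; the last line of a tiling may contain fewer than<span lang="EN-US"> 20 </span>integers and ends with the zero. The end of the file marks the end of the tiling integer sets.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-size: small;"><strong><span style="font-family: 宋体; color: black;">Output description: </span></strong><span style="font-family: 宋体; color: black;">For each potential tiling integer set in the input file, the program outputs the dimensions of the rectangle and the tiling encoding of each rectangle, and stores the output in the second file named on the command line. If no feasible encoding can be obtained from an input integer set, a message saying so should be printed. The integer sets must be output in the same order as they were input.</span></span></p>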
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Command-line example:<span lang="EN-US"> tiling.exe setsin.txt rectout.txt</span></span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Sample input file, <span lang="EN-US">setsin.txt:</span></span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">4 2 1 1 1 2 1 0</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">2 1 1 0</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">17 10 9 5 4 1 2 1 8 1 5 0</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">36 33 5 28 25 9 2 7 16 0</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Sample output file, <span lang="EN-US">rectout.txt:</span></span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">Set 1</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">dimensions: 4 x 7</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">(4 2 1) (1) (1 2) (1)</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">Set 2</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">dimensions: 2 x 3</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">(2 1) (1)</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">dimensions: 3 x 2</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">(2) (1 1)</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">Set 3</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">Cannot encode a rectangle</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">Set 4</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">dimensions: 61 x 69</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;">(36 33) (5 28) (25 9 2) (7) (16)</span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-size: small;"><strong><span style="font-family: 宋体; color: black;">Timing: </span></strong><span style="font-family: 宋体; color: black;">The total execution time of the program is used for scoring. For the most accurate timing, submitted code should include timing code and print the measured total execution time to standard output; otherwise an external stopwatch is used.</span></span></p>
<h1 style="line-height: normal; margin: 10pt 0cm;"><span style="font-family: 宋体; font-size: 14pt;">Serial Algorithm</span></h1>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Following the problem statement, we place the squares in the order they are read, from left to right and from top to bottom. If the outline is still a rectangle once every square has been placed, the set is a valid encoding.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Let the total area of all squares be <span lang="EN-US">nArea</span> and the sum of the widths of the first <span lang="EN-US">n</span> squares be <span lang="EN-US">Wn</span>. Then <span lang="EN-US">nArea % Wn </span>must be zero before the squares can possibly tile a rectangle of width <span lang="EN-US">Wn</span> and area <span lang="EN-US">nArea</span>.</span></span></p>
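<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">The divisibility test can be sketched as below. The function name CandidateWidths is hypothetical, and the input is assumed to be one set of side lengths without its terminating zero. The test is only a necessary condition; each candidate width still has to be confirmed by actually placing the squares.</span></span></p>

```cpp
#include <vector>

// List every prefix-sum width Wn that divides the total area nArea.
// Assumption: `sides` holds one set of square side lengths without the
// terminating zero; 64-bit sums avoid overflow on large inputs.
std::vector<long long> CandidateWidths(const std::vector<int>& sides)
{
    long long area = 0;
    for (int side : sides) area += static_cast<long long>(side) * side;

    std::vector<long long> widths;
    long long prefix = 0;   // Wn: summed width of the first n squares
    for (int side : sides) {
        prefix += side;
        if (prefix > 0 && area % prefix == 0) widths.push_back(prefix);
    }
    return widths;
}
```

<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">For the sample set 4 2 1 1 1 2 1 the area is 28 and the candidate widths are 4 and 7. For the set 17 10 9 5 4 1 2 1 8 1 5 the area is 607, a prime, so no prefix sum divides it and the set cannot encode a rectangle.</span></span></p>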
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">While tiling, we record the bottom edge of every placed square as its height, starting point and length, written (<span lang="EN-US">y</span>, <span lang="EN-US">x</span>, <span lang="EN-US">len</span>) in the examples below. If the distance from an edge to the origin is defined as<span lang="EN-US"> y * W + x</span>, the next square is placed at the starting point of the edge with the smallest distance to the origin (marked in red).</span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Example data: <span lang="EN-US">100 150 50 50 100 50 50 50 50 50 0</span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">1.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span><span style="font-family: 宋体; color: black;"><span style="font-size: small;">With <span lang="EN-US">n = 3</span> the first row holds <span lang="EN-US">3</span> squares, so <span lang="EN-US">W = 100 + 150 + 50 = 300</span>; placing these three squares yields three edges.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"> </p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;"><img src="http://hi.csdn.net/attachment/201106/16/0_1308226111lTY9.gif" alt="" /></span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">The edge marked in red is the one closest to the origin.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"> </p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">2.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Place the <span lang="EN-US">4</span>th square at the red point.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"> </p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">   <img src="http://hi.csdn.net/attachment/201106/16/0_13082261171v12.gif" alt="" /></span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">3.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Place the <span lang="EN-US">5</span>th square at the red point.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"> </p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">   <img src="http://hi.csdn.net/attachment/201106/16/0_1308226121MsLs.gif" alt="" /></span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">4.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Place the <span lang="EN-US">6</span>th square at the red point.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 18pt; margin: 0cm 0cm 0pt 21pt;"> </p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 42pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;"><img src="http://hi.csdn.net/attachment/201106/16/0_1308226124JjGl.gif" alt="" /></span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 42pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">The current edge <span lang="EN-US">(150, 100, 150)</span> and the edge <span lang="EN-US">(150, 250, 50)</span> are adjacent and at the same height, so they are merged into the edge <span lang="EN-US">(150, 100, 200)</span>.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">5.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Place the <span lang="EN-US">7</span>th square at the red point.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"> </p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">    <img src="http://hi.csdn.net/attachment/201106/16/0_1308226128sf3w.gif" alt="" /></span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">6.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Place the remaining squares in turn.</span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;"> <img src="http://hi.csdn.net/attachment/201106/16/0_1308226137547V.gif" alt="" /> </span></span></span><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">   </span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;"> <img src="http://hi.csdn.net/attachment/201106/16/0_1308226143rFi0.gif" alt="" /></span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">    <img src="http://hi.csdn.net/attachment/201106/16/0_1308226147ob1H.gif" alt="" /></span></span></span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><span style="font-size: small;">7.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span><span style="font-family: 宋体; color: black;"><span style="font-size: small;">After the final merge a single edge <span lang="EN-US">(200, 0, 300)</span> remains, and the tiling is complete.</span></span></p>
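<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 21pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">The placement steps above can be checked with a much simpler, slower sketch. It is not the edge-list code from the download: the name TileHeight is hypothetical, and as an assumption made for clarity it keeps one filled depth per unit column (a skyline) instead of merged edges, so every placement costs a scan over the whole width.</span></span></p>

```cpp
#include <algorithm>
#include <vector>

// Check whether `sides`, read in order, tiles a rectangle of the given width.
// Returns the rectangle height on success and 0 on failure.
// Assumption: a per-column skyline replaces the merged edge list for clarity.
int TileHeight(const std::vector<int>& sides, int width)
{
    long long area = 0;
    for (int side : sides) area += static_cast<long long>(side) * side;
    if (width <= 0 || area % width != 0) return 0;
    const long long height = area / width;

    std::vector<long long> skyline(width, 0);   // filled depth of each unit column
    for (int side : sides) {
        // The next square goes at the top-most, then left-most free position,
        // which is the point with the smallest y * W + x.
        const auto lowest = std::min_element(skyline.begin(), skyline.end());
        const int x = static_cast<int>(lowest - skyline.begin());
        const long long y = *lowest;
        if (side <= 0 || side > width - x || y + side > height) return 0;
        for (int column = x; column < x + side; ++column) {
            if (skyline[column] != y) return 0;   // would overlap a placed square
            skyline[column] = y + side;
        }
    }
    // Every column must reach the bottom side for the outline to be a rectangle.
    for (long long depth : skyline)
        if (depth != height) return 0;
    return static_cast<int>(height);
}
```

<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 21pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">For the example data with W = 300 the sketch returns 200, matching the final edge (200, 0, 300). For the sample set 4 2 1 1 1 2 1 it accepts width 7 with height 4 and rejects width 4.</span></span></p>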
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"><span>    </span></span></span></p>
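<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 21pt;"><span style="font-family: 宋体; color: black;"><span style="font-size: small;">Because the algorithm repeatedly needs the edge closest to the origin, the edges can be kept in a binary heap. Extracting the minimum from the heap costs <span lang="EN-US">O(log n)</span>, compared with a linear scan over all edges, which speeds up the solution considerably.</span></span></p>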
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 21pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt 21pt;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span style="font-size: small;"> </span></span></p>
<p class="MsoNormal" style="line-height: 150%; margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span style="font-family: 宋体; color: black;">The flowchart of the algorithm is shown below:</span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体;"><span style="font-size: small;"><img src="http://hi.csdn.net/attachment/201106/16/0_1308226398RfV9.gif" alt="" /></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体;"><span style="font-size: small;">Source code of the core functions:</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">#define</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>  </span><span style="color: #010001;">XCOMPAREEDGE</span>(<span style="color: #010001;">ValA</span>, <span style="color: #010001;">ValB</span>) (<span style="color: #010001;">ValA</span>.<span style="color: #010001;">nVal64</span> &lt; <span style="color: #010001;">ValB</span>.<span style="color: #010001;">nVal64</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// </span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Binary heap <span lang="EN-US">FixDown</span> (sift element i down)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">__inline</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: blue;">void</span> <span style="color: #010001;">XFixDown</span>(<span style="color: blue;">int</span> <span style="color: #010001;">i</span>, <span style="color: blue;">int</span> <span style="color: #010001;">nHeap</span>, <span style="color: #010001;">XEdge</span>* <span style="color: #010001;">pHeap</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span> <span style="color: #010001;">j</span> = <span style="color: #010001;">i</span> * 2 + 1;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span> (<span style="color: #010001;">j</span> + 1 &lt; <span style="color: #010001;">nHeap</span> &amp;&amp; <span style="color: #010001;">XCOMPAREEDGE</span>(<span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span> + 1], <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>]) ) ++<span style="color: #010001;">j</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">while</span> (<span style="color: #010001;">j</span> &lt; <span style="color: #010001;">nHeap</span> &amp;&amp; <span style="color: #010001;">XCOMPAREEDGE</span>(<span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>], <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">i</span>]))</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">const</span> <span style="color: blue;">__m128i</span> <span style="color: #010001;">tmp</span> = <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">i</span>].<span style="color: #010001;">m128Val</span>; <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">i</span>].<span style="color: #010001;">m128Val</span> = <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>].<span style="color: #010001;">m128Val</span>; <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>].<span style="color: #010001;">m128Val</span> = <span style="color: #010001;">tmp</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">i</span> = <span style="color: #010001;">j</span>; <span style="color: #010001;">j</span> = <span style="color: #010001;">i</span> * 2 + 1;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">if</span>(<span style="color: #010001;">j</span> + 1 &lt; <span style="color: #010001;">nHeap</span> &amp;&amp; <span style="color: #010001;">XCOMPAREEDGE</span>(<span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span> + 1], <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>])) ++<span style="color: #010001;">j</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// </span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Binary heap <span lang="EN-US">FixUp</span> (sift element i up)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">__inline</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: blue;">void</span> <span style="color: #010001;">XFixUp</span>(<span style="color: blue;">int</span> <span style="color: #010001;">i</span>, <span style="color: #010001;">XEdge</span>* <span style="color: #010001;">pHeap</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{<span>    </span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span> <span style="color: #010001;">j</span> = (<span style="color: #010001;">i</span> - 1) &gt;&gt; 1;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">while</span>(<span style="color: #010001;">i</span> &gt; 0 &amp;&amp; <span style="color: #010001;">XCOMPAREEDGE</span>(<span style="color: #010001;">pHeap</span>[<span style="color: #010001;">i</span>], <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>]))</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">const</span> <span style="color: blue;">__m128i</span> <span style="color: #010001;">tmp</span> = <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">i</span>].<span style="color: #010001;">m128Val</span>; <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">i</span>].<span style="color: #010001;">m128Val</span> = <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>].<span style="color: #010001;">m128Val</span>; <span style="color: #010001;">pHeap</span>[<span style="color: #010001;">j</span>].<span style="color: #010001;">m128Val</span> = <span style="color: #010001;">tmp</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">i</span> = <span style="color: #010001;">j</span>; <span style="color: #010001;">j</span> = (<span style="color: #010001;">i</span> - 1) &gt;&gt; 1;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">void</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: #010001;">XTilingRect</span>(<span style="color: #010001;">XProblem</span>* <span style="color: #010001;">pArray</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span>*<span>   </span><span style="color: #010001;">pRect</span> = <span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">pRect</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span><span>    </span><span style="color: #010001;">nRect</span> = <span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">nRect</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span><span>    </span><span style="color: #010001;">i</span>,<span style="color: #010001;">j</span>,<span style="color: #010001;">xs</span>,<span style="color: #010001;">nHeap</span> = 0;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">uint64</span> <span style="color: #010001;">nArea</span> = 0,<span style="color: #010001;">nRetCnt</span>=0;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">uint32</span> <span style="color: #010001;">nW</span>,<span style="color: #010001;">nH</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XEdge</span>* <span style="color: #010001;">pHeap</span> = (<span style="color: #010001;">XEdge</span>*)<span style="color: #010001;">scalable_malloc</span>(<span style="color: #010001;">nRect</span> * <span style="color: blue;">sizeof</span>(<span style="color: #010001;">pHeap</span>[0]) + (<span style="color: #010001;">nRect</span> + 2) * <span style="color: blue;">sizeof</span>(<span style="color: blue;">int</span>));</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span>*<span>   </span><span style="color: #010001;">pCnt</span> = (<span style="color: blue;">int</span>*)(<span style="color: #010001;">pHeap</span> + <span style="color: #010001;">nRect</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span>(<span style="color: #010001;">pHeap</span> == <span style="color: #010001;">NULL</span>) <span style="color: #010001;">XError</span>(<span style="color: #a31515;">"XTilingRect </span></span><span style="font-family: 新宋体; color: #a31515; font-size: 9pt;">out of memory<span lang="EN-US">!"</span></span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XStringExpand</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, 32 + <span style="color: #010001;">nRect</span> * 8);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XStringAddStr</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, <span style="color: #a31515;">"Set "</span>, 4);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XStringAddInt</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, <span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">nIndex</span> + 1, 0);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XStringAddStr</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, <span style="color: #a31515;">"\n"</span>, 1);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Compute the total area of all rectangles (each is a square of side <span lang="EN-US">pRect[i]</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">for</span>(<span style="color: #010001;">i</span> = 0; <span style="color: #010001;">i</span> &lt; <span style="color: #010001;">nRect</span>; ++<span style="color: #010001;">i</span>) </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">nArea</span> += (<span style="color: #010001;">uint64</span>)<span style="color: #010001;">pRect</span>[<span style="color: #010001;">i</span>] * <span style="color: #010001;">pRect</span>[<span style="color: #010001;">i</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Loop over the number of rectangles placed in the first row</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">for</span>(<span style="color: #010001;">nW</span> = 0, <span style="color: #010001;">i</span> = 0; <span style="color: #010001;">i</span> &lt; <span style="color: #010001;">nRect</span>; ++<span style="color: #010001;">i</span>) </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">nW</span> += <span style="color: #010001;">pRect</span>[<span style="color: #010001;">i</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">nH</span> = (<span style="color: blue;">uint32</span>)(<span style="color: #010001;">nArea</span> / <span style="color: #010001;">nW</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">if</span>((<span style="color: blue;">uint64</span>)<span style="color: #010001;">nW</span> * <span style="color: #010001;">nH</span> == <span style="color: #010001;">nArea</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Push the top edges of the first-row rectangles onto the heap</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">for</span>(<span style="color: #010001;">xs</span> = 0, <span style="color: #010001;">nHeap</span> = 0; <span style="color: #010001;">nHeap</span> &lt;= <span style="color: #010001;">i</span>; ++<span style="color: #010001;">nHeap</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">pHeap</span>[<span style="color: #010001;">nHeap</span>].<span style="color: #010001;">ys</span> = <span style="color: #010001;">pRect</span>[<span style="color: #010001;">nHeap</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">pHeap</span>[<span style="color: #010001;">nHeap</span>].<span style="color: #010001;">xs</span> = <span style="color: #010001;">xs</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">pHeap</span>[<span style="color: #010001;">nHeap</span>].<span style="color: #010001;">len</span> = <span style="color: #010001;">pRect</span>[<span style="color: #010001;">nHeap</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">xs</span> += <span style="color: #010001;">pRect</span>[<span style="color: #010001;">nHeap</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">XFixUp</span>(<span style="color: #010001;">nHeap</span>, <span style="color: #010001;">pHeap</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">int</span> <span style="color: #010001;">nPreY</span> = 0, <span style="color: #010001;">nCnt</span> = 0, <span style="color: #010001;">nCur</span> = <span style="color: #010001;">nHeap</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">pCnt</span>[<span style="color: #010001;">nCnt</span>++] = 0;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">while</span>(<span style="color: #010001;">nHeap</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Take the edge closest to the origin</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">XEdge</span> <span style="color: #010001;">xEdgeCur</span> = <span style="color: #010001;">pHeap</span>[0];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Remove that edge from the heap</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">pHeap</span>[0] = <span style="color: #010001;">pHeap</span>[--<span style="color: #010001;">nHeap</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">XFixDown</span>(0, <span style="color: #010001;">nHeap</span>, <span style="color: #010001;">pHeap</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">while</span>(<span style="color: #010001;">pHeap</span>[0].<span style="color: #010001;">ys</span> == <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">ys</span> &amp;&amp; <span style="color: #010001;">pHeap</span>[0].<span style="color: #010001;">xs</span> == <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">xs</span> + <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">len</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>{<span>    </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Merge contiguous edges</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">len</span> += <span style="color: #010001;">pHeap</span>[0].<span style="color: #010001;">len</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Remove that edge from the binary heap</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">pHeap</span>[0] = <span style="color: #010001;">pHeap</span>[--<span style="color: #010001;">nHeap</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">XFixDown</span>(0, <span style="color: #010001;">nHeap</span>, <span style="color: #010001;">pHeap</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">if</span>(<span style="color: #010001;">nPreY</span> != <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">ys</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>{<span>    </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Detect the start of a new row</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">pCnt</span>[<span style="color: #010001;">nCnt</span>++] = <span style="color: #010001;">nCur</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">nPreY</span> = <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">ys</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Eliminate the current edge</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">len</span> += <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">xs</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">while</span>(<span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">xs</span> &lt; <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">len</span> &amp;&amp; <span style="color: #010001;">nCur</span> &lt; <span style="color: #010001;">nRect</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>{<span>    </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Push the new edges onto the heap</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">pHeap</span>[<span style="color: #010001;">nHeap</span>].<span style="color: #010001;">ys</span> = <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">ys</span> + <span style="color: #010001;">pRect</span>[<span style="color: #010001;">nCur</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">pHeap</span>[<span style="color: #010001;">nHeap</span>].<span style="color: #010001;">xs</span> = <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">xs</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">pHeap</span>[<span style="color: #010001;">nHeap</span>++].<span style="color: #010001;">len</span> = <span style="color: #010001;">pRect</span>[<span style="color: #010001;">nCur</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">xs</span> += <span style="color: #010001;">pRect</span>[<span style="color: #010001;">nCur</span>++];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">XFixUp</span>(<span style="color: #010001;">nHeap</span> - 1, <span style="color: #010001;">pHeap</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">if</span>(<span style="color: #010001;">nHeap</span> == 0 &amp;&amp; <span style="color: #010001;">nCur</span> == <span style="color: #010001;">nRect</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>{<span>    </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Tiling complete; output the result</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span>++<span style="color: #010001;">nRetCnt</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">XTilingRectOutPutXString</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, <span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">pRect</span>, <span style="color: #010001;">nW</span>, <span style="color: #010001;">nH</span>, <span style="color: #010001;">pCnt</span>, <span style="color: #010001;">nCnt</span> - 1);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: blue;">break</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">else</span> <span style="color: blue;">if</span>(<span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">xs</span> != <span style="color: #010001;">xEdgeCur</span>.<span style="color: #010001;">len</span>) </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>{<span>    </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">A tiling error occurred</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: blue;">break</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">scalable_free</span>(<span style="color: #010001;">pHeap</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span>(<span style="color: #010001;">nRetCnt</span> == 0)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">XStringAddStr</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, <span style="color: #a31515;">"\nCannot encode a rectangle\n\n"</span>, 28);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">else</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">XStringAddStr</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, <span style="color: #a31515;">"\n"</span>, 1);</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
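<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;">The loop above calls XFixDown(0, nHeap, pHeap) and XFixUp(nHeap - 1, pHeap) to keep pHeap ordered, but the bodies of these helpers do not appear in this part of the listing. The following is only a minimal sketch of how such helpers could work, assuming a binary min-heap ordered by ys and then by xs (an assumption; the original helpers may order edges differently). SketchEdge, SketchLess, SketchFixUp, SketchFixDown, SketchPush and SketchPop are hypothetical names introduced for illustration.</span></p>

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the post's edge record: an open horizontal edge
// that starts at (xs, ys) and extends len units to the right.
struct SketchEdge {
    int ys;
    int xs;
    int len;
};

// Assumed ordering: the lowest edge first, ties broken by the leftmost edge.
static bool SketchLess(const SketchEdge& a, const SketchEdge& b) {
    return a.ys != b.ys ? a.ys < b.ys : a.xs < b.xs;
}

// Sift the element at index i up until its parent is no longer larger.
void SketchFixUp(int i, SketchEdge* heap) {
    while (i > 0) {
        int parent = (i - 1) / 2;
        if (!SketchLess(heap[i], heap[parent])) break;
        std::swap(heap[i], heap[parent]);
        i = parent;
    }
}

// Sift the element at index i down within the first n elements.
void SketchFixDown(int i, int n, SketchEdge* heap) {
    for (;;) {
        int child = 2 * i + 1;
        if (child >= n) break;
        // Pick the smaller of the two children.
        if (child + 1 < n && SketchLess(heap[child + 1], heap[child])) ++child;
        if (!SketchLess(heap[child], heap[i])) break;
        std::swap(heap[i], heap[child]);
        i = child;
    }
}

// Push: append at index n, then restore the heap property upwards.
// Returns the new element count.
int SketchPush(SketchEdge* heap, int n, const SketchEdge& e) {
    assert(n >= 0);
    heap[n] = e;
    SketchFixUp(n, heap);
    return n + 1;
}

// Pop: copy out the root, move the last element to the root, sift it down.
// Returns the new element count.
int SketchPop(SketchEdge* heap, int n, SketchEdge* out) {
    assert(n > 0);
    *out = heap[0];
    heap[0] = heap[n - 1];
    SketchFixDown(0, n - 1, heap);
    return n - 1;
}
```

<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;">With this ordering, each pop yields the lowest, leftmost open edge, which is the edge the tiling loop tries to cover next.</span></p>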
<h1 style="line-height: normal; margin: 10pt 0cm;"><span style="font-family: 宋体;"><span style="font-size: x-large;">Hotspot Analysis</span></span></h1>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span style="font-family: 宋体;">Hotspots were profiled with </span><span lang="EN-US"><span style="font-family: Calibri;">Intel Amplifier</span></span><span style="font-family: 宋体;">; the results are shown below </span><span lang="EN-US"><span style="font-family: Calibri;">(</span></span><span style="font-family: 宋体;">input: </span><span lang="EN-US"><span style="font-family: Calibri;">17sisrs.txt </span></span><span style="font-family: 宋体;">replicated to </span><span lang="EN-US"><span style="font-family: Calibri;">36192</span></span><span style="font-family: 宋体;"> lines</span><span lang="EN-US"><span style="font-family: Calibri;">).</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"> <img src="http://hi.csdn.net/attachment/201106/16/0_1308226460veZy.gif" alt="" width="755" height="365" /></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体;"><span style="font-size: small;">The analysis shows that the hotspots are in:</span></span></p>
<p class="MsoNormal" style="text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><strong><span style="font-size: 15pt;" lang="EN-US"><span><span style="font-family: Calibri;">1.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span></strong><strong><span style="font-size: 15pt;" lang="EN-US"><span style="font-family: Calibri;"><span> </span>XLoadData_File</span></span></strong></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// </span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Load the test data: file (FILE*) version</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">void</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: #010001;">XLoadData_File</span>(<span style="color: blue;">const</span> <span style="color: blue;">char</span>* <span style="color: #010001;">szInput</span>, <span style="color: #010001;">XProblemAry</span>* <span style="color: #010001;">pProblemAry</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">char</span> <span style="color: #010001;">szLine</span>[2048] = {0};</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">FILE</span>* <span style="color: #010001;">fp</span> = <span style="color: #010001;">fopen</span>(<span style="color: #010001;">szInput</span>, <span style="color: #a31515;">"rb"</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span>(<span style="color: #010001;">fp</span> == <span style="color: #010001;">NULL</span>) <span style="color: #010001;">XError</span>(<span style="color: #a31515;">"</span></span><span style="font-family: 新宋体; color: #a31515; font-size: 9pt;">Failed to open the input file<span lang="EN-US">!"</span></span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span><span>   </span><span style="color: #010001;">nReadSize</span> = 1024;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblem</span> <span style="color: #010001;">xProblemRead</span> = {0};</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblemAryExpand</span>(<span style="color: #010001;">pProblemAry</span>, 512);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblemExpand</span>(&amp;<span style="color: #010001;">xProblemRead</span>, <span style="color: #010001;">nReadSize</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">while</span>(<span style="color: #010001;">fgets</span>(<span style="color: #010001;">szLine</span>, <span style="color: blue;">sizeof</span>(<span style="color: #010001;">szLine</span>), <span style="color: #010001;">fp</span>) != <span style="color: #010001;">NULL</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">if</span>(<span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> + 20 &gt; <span style="color: #010001;">nReadSize</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">nReadSize</span> += <span style="color: #010001;">nReadSize</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">XProblemExpand</span>(&amp;<span style="color: #010001;">xProblemRead</span>, <span style="color: #010001;">nReadSize</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">int</span>* <span style="color: #010001;">pRead</span> = <span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">pRect</span> + <span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: red;">int i,nRead = sscanf(szLine, "%d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d", </span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: red; font-size: 9pt;" lang="EN-US"><span>                                 </span>pRead, pRead + 1, pRead + 2, pRead + 3, pRead + 4, </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: red; font-size: 9pt;" lang="EN-US"><span>                                 </span>pRead + 5,pRead + 6, pRead + 7, pRead + 8, pRead + 9, </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: red; font-size: 9pt;" lang="EN-US"><span>                                 </span>pRead + 10, pRead + 11, pRead + 12, pRead + 13, pRead + 14, </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: red; font-size: 9pt;" lang="EN-US"><span>                                 </span>pRead + 15,pRead + 16, pRead + 17, pRead + 18, pRead + 19);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">if</span>(<span style="color: #010001;">nRead</span> &gt; 0 &amp;&amp; <span style="color: #010001;">pRead</span>[<span style="color: #010001;">nRead</span> - 1] == 0)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> += <span style="color: #010001;">nRead</span> - 1;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nIndex</span> = <span style="color: #010001;">pProblemAry</span>-&gt;<span style="color: #010001;">nProblem</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">XProblemAryExpand</span>(<span style="color: #010001;">pProblemAry</span>, 8);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">XProblemCopy</span>(&amp;<span style="color: #010001;">pProblemAry</span>-&gt;<span style="color: #010001;">pProblem</span>[<span style="color: #010001;">pProblemAry</span>-&gt;<span style="color: #010001;">nProblem</span>++], &amp;<span style="color: #010001;">xProblemRead</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> = 0;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">else</span> <span style="color: blue;">if</span>(<span style="color: #010001;">nRead</span> == 20)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> += <span style="color: #010001;">nRead</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">else</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">XError</span>(<span style="color: #a31515;">"</span></span><span style="font-family: 新宋体; color: #a31515; font-size: 9pt;">Invalid data: too few integers on the line and no terminator<span lang="EN-US">!"</span></span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblemFree</span>(&amp;<span style="color: #010001;">xProblemRead</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">fclose</span>(<span style="color: #010001;">fp</span>);</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span style="font-family: 宋体;">This function loads the data. It reads the file through </span><span lang="EN-US"><span style="font-family: Calibri;">FILE*</span></span><span style="font-family: 宋体;"> and parses the numbers with </span><span lang="EN-US"><span style="font-family: Calibri;">sscanf</span></span><span style="font-family: 宋体;">, which is slow. Reading the data through a memory-mapped file is faster. The optimized code is listed below as XLoadData_Serial.</span></span></p>
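<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;">Once the file is available as one contiguous block of memory, the per-line sscanf call can be replaced by a direct scan over the bytes. The following is a minimal sketch of that idea, not the code from this post: SketchParseInts is a hypothetical helper, and it assumes the input holds only non-negative decimal integers separated by blanks and line breaks, with values that fit in an int.</span></p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: parse whitespace-separated, non-negative decimal
// integers from the byte range [p, end) into out[]. Parsing stops after
// maxOut values, at the end of the buffer, or at the first unexpected byte.
// *next receives the position where parsing stopped.
// No overflow check: the sketch assumes every value fits in an int.
std::size_t SketchParseInts(const char* p, const char* end,
                            int* out, std::size_t maxOut,
                            const char** next) {
    assert(p != NULL && end != NULL && p <= end);
    std::size_t count = 0;
    while (count < maxOut) {
        // Skip blanks and line breaks between numbers.
        while (p < end &&
               (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')) {
            ++p;
        }
        if (p == end || *p < '0' || *p > '9') break;
        // Accumulate the digits of one number.
        int value = 0;
        while (p < end && *p >= '0' && *p <= '9') {
            value = value * 10 + (*p - '0');
            ++p;
        }
        out[count++] = value;
    }
    *next = p;
    return count;
}
```

<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;">With a mapped view, a caller would pass the start of the view and the start plus the file size as the range, then split the parsed values into problems at each 0 terminator, as the file version does. Because the scan never copies lines into a temporary buffer and never interprets a format string, it avoids most of the per-line overhead of sscanf.</span></p>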
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// </span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Load the test data: memory-mapped, serial version</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">void</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: #010001;">XLoadData_Serial</span>(<span style="color: blue;">const</span> <span style="color: blue;">char</span>* <span style="color: #010001;">szInput</span>, <span style="color: #010001;">XProblemAry</span>* <span style="color: #010001;">pProblemAry</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">File size</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">uint64</span> <span style="color: #010001;">nFileSize</span> = 0;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">File mapping</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XFILEMAPHANDLE</span> <span style="color: #010001;">hFileMap</span> = <span style="color: #010001;">XFileMapOpen</span>(<span style="color: #010001;">szInput</span>, <span style="color: #010001;">XFILEMAP_READONLY</span>, <span style="color: #010001;">nFileSize</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span>(<span style="color: #010001;">hFileMap</span> == <span style="color: #010001;">NULL</span>) <span style="color: #010001;">XError</span>(<span style="color: #a31515;">"XFileMapOpen Error!"</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">// </span></span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Map a read-only view of the test data</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">const</span> <span style="color: blue;">char</span> *<span style="color: #010001;">pInput</span> = (<span style="color: blue;">const</span> <span style="color: blue;">char</span>*)<span style="color: #010001;">XFileMapView</span>(<span style="color: #010001;">hFileMap</span>, <span style="color: #010001;">XFILEMAP_READONLY</span>, 0, (<span style="color: #010001;">uint32</span>)<span style="color: #010001;">nFileSize</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span>(<span style="color: #010001;">pInput</span> == <span style="color: #010001;">NULL</span>) <span style="color: #010001;">XError</span>(<span style="color: #a31515;">"XFileMapView Error!"</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">int</span><span>   </span><span style="color: #010001;">nReadSize</span> = 1024;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblem</span> <span style="color: #010001;">xProblemRead</span> = {0};</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblemAryExpand</span>(<span style="color: #010001;">pProblemAry</span>, 512);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblemExpand</span>(&amp;<span style="color: #010001;">xProblemRead</span>, <span style="color: #010001;">nReadSize</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">const</span> <span style="color: blue;">char</span> *<span style="color: #010001;">pReadTmp</span> = <span style="color: #010001;">pInput</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">const</span> <span style="color: blue;">char</span> *<span style="color: #010001;">pReadEnd</span> = <span style="color: #010001;">pInput</span> + <span style="color: #010001;">nFileSize</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">while</span>(<span style="color: #010001;">pReadTmp</span> &lt; <span style="color: #010001;">pReadEnd</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> = 0;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">while</span>(<span style="color: #010001;">pReadTmp</span> &lt; <span style="color: #010001;">pReadEnd</span> &amp;&amp; (*<span style="color: #010001;">pReadTmp</span> &lt; <span style="color: #a31515;">'0'</span> || *<span style="color: #010001;">pReadTmp</span> &gt; <span style="color: #a31515;">'9'</span>))<span>  </span>++<span style="color: #010001;">pReadTmp</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">while</span>(<span style="color: #010001;">pReadTmp</span> &lt; <span style="color: #010001;">pReadEnd</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">if</span>(<span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> + 20 &gt; <span style="color: #010001;">nReadSize</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">nReadSize</span> += <span style="color: #010001;">nReadSize</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">XProblemExpand</span>(&amp;<span style="color: #010001;">xProblemRead</span>, <span style="color: #010001;">nReadSize</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">int</span><span>  </span><span style="color: #010001;">j</span>,*<span style="color: #010001;">pReadRect</span> = <span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">pRect</span> + <span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">for</span>(<span style="color: #010001;">j</span> = 0; <span style="color: #010001;">j</span> &lt; 20; ++<span style="color: #010001;">j</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">XREAD_INT</span>(<span style="color: #010001;">pReadTmp</span>, <span style="color: #010001;">pReadEnd</span>, <span style="color: #010001;">pReadRect</span>[<span style="color: #010001;">j</span>]);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">if</span>(<span style="color: #010001;">pReadRect</span>[<span style="color: #010001;">j</span>] == 0 || <span style="color: #010001;">pReadRect</span>[<span style="color: #010001;">j</span>] == -1) <span style="color: blue;">break</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">if</span>(<span style="color: #010001;">j</span> &lt; 20)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">if</span>(<span style="color: #010001;">pReadRect</span>[<span style="color: #010001;">j</span>] == 0)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> += <span style="color: #010001;">j</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nIndex</span> = <span style="color: #010001;">pProblemAry</span>-&gt;<span style="color: #010001;">nProblem</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">XProblemAryExpand</span>(<span style="color: #010001;">pProblemAry</span>, 8);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">XProblemCopy</span>(&amp;<span style="color: #010001;">pProblemAry</span>-&gt;<span style="color: #010001;">pProblem</span>[<span style="color: #010001;">pProblemAry</span>-&gt;<span style="color: #010001;">nProblem</span>++], &amp;<span style="color: #010001;">xProblemRead</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: blue;">break</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: blue;">else</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                       </span><span style="color: #010001;">XError</span>(<span style="color: #a31515;">"</span></span><span style="font-family: 新宋体; color: #a31515; font-size: 9pt;">Invalid data: too few integers and no terminator<span lang="EN-US">!"</span></span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">else</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">xProblemRead</span>.<span style="color: #010001;">nRect</span> += <span style="color: #010001;">j</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XProblemFree</span>(&amp;<span style="color: #010001;">xProblemRead</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XFileMapUnView</span>((<span style="color: blue;">void</span>*)<span style="color: #010001;">pInput</span>, (<span style="color: blue;">unsigned</span> <span style="color: blue;">int</span>)<span style="color: #010001;">nFileSize</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XFileMapClose</span>(&amp;<span style="color: #010001;">hFileMap</span>);</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
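<p class="MsoNormal" style="margin: 0cm 0cm 0pt 21pt;"><span style="font-size: small;"><span style="font-family: Calibri;">The reading loop above can be reduced to a small, self-contained sketch. It is not the author's code: ReadInt is a hypothetical stand-in for the XREAD_INT macro (whose definition is not shown here), std::vector replaces XProblem and XProblemAry, only non-negative integers are handled, and an unterminated final group is dropped rather than reported as an error.</span></span></p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for XREAD_INT: skips every character that is not a
// decimal digit, then reads one non-negative integer from [p, end).
// Returns false when the buffer ends before a digit is found.
// Assumption: values fit in an int; no overflow check is performed.
static bool ReadInt(const char*& p, const char* end, int& value)
{
    while (p < end && (*p < '0' || *p > '9')) ++p;  // skip separators
    if (p == end) return false;

    int v = 0;
    while (p < end && *p >= '0' && *p <= '9') {
        v = v * 10 + (*p - '0');
        ++p;
    }
    value = v;
    return true;
}

// Splits an in-memory buffer (for example a mapped file view) into problems.
// Each problem is a run of integers terminated by a single 0.
std::vector<std::vector<int>> ParseProblems(const char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);

    std::vector<std::vector<int>> problems;
    std::vector<int> current;
    const char* p = data;
    const char* end = data + size;

    int v = 0;
    while (ReadInt(p, end, v)) {
        if (v == 0) {
            problems.push_back(current);  // terminator: close this problem
            current.clear();
        } else {
            current.push_back(v);
        }
    }
    // Integers after the last 0 have no terminator and are discarded here.
    return problems;
}
```

<p class="MsoNormal" style="margin: 0cm 0cm 0pt 21pt;"><span style="font-size: small;"><span style="font-family: Calibri;">For the input "3 4 5 0 7 8 0 9" this yields two problems, {3, 4, 5} and {7, 8}; the trailing 9 has no terminating 0 and is discarded. Parsing straight from the mapped buffer avoids a copy and any per-line stream overhead.</span></span></p>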
<p class="MsoNormal" style="text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><strong><span style="font-size: 15pt;" lang="EN-US"><span><span style="font-family: Calibri;">2.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span></strong><strong><span style="font-size: 15pt;" lang="EN-US"><span style="font-family: Calibri;">XStringAddInt</span></span></strong></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt 21pt;"><span style="font-size: small;"><span style="font-family: 宋体;">This function appends an integer to a string. The original version calls </span><span lang="EN-US"><span style="font-family: Calibri;">sprintf</span></span><span style="font-family: 宋体;">; a lookup table can make it faster.</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// </span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Append an integer</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">void</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: #010001;">XStringAddInt</span>(<span style="color: #010001;">XString</span>* <span style="color: #010001;">pString</span>, <span style="color: blue;">uint32 </span><span style="color: #010001;">nOut</span>, <span style="color: blue;">int</span> <span style="color: #010001;">nSpace</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">char</span> <span style="color: #010001;">szNum</span>[32];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">sprintf</span>(<span style="color: #010001;">szNum</span>, <span style="color: #010001;">nSpace</span> ? <span style="color: #a31515;">"%u "</span> : <span style="color: #a31515;">"%u"</span>, <span style="color: #010001;">nOut</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XStringAddStr</span>(<span style="color: #010001;">pString</span>, <span style="color: #010001;">szNum</span>, <span style="color: #010001;">strlen</span>(<span style="color: #010001;">szNum</span>));</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt 21pt;"><span style="font-family: 新宋体; font-size: 9pt;">Add a function that initializes the lookup tables, then modify <span style="color: #010001;" lang="EN-US">XStringAddInt</span><span style="color: #010001;"> to use them.</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">char</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>   </span><span style="color: #010001;">g_pNum2String</span>[100000][8];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">char</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>   </span><span style="color: #010001;">g_pNum2String_Zero</span>[100000][8];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: #010001; font-size: 9pt;" lang="EN-US">uint8<span style="color: windowtext;"><span>  </span></span>g_nNumeStringLen</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">[100000];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// </span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Initialize the number-to-string conversion tables</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">void</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: #010001;">XInitNum2StringTab</span>()</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">for</span>(<span style="color: blue;">int</span> <span style="color: #010001;">i</span> = 0; <span style="color: #010001;">i</span> &lt; 100000; ++<span style="color: #010001;">i</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">sprintf</span>(<span style="color: #010001;">g_pNum2String</span>[<span style="color: #010001;">i</span>], <span style="color: #a31515;">"%d "</span>, <span style="color: #010001;">i</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">sprintf</span>(<span style="color: #010001;">g_pNum2String_Zero</span>[<span style="color: #010001;">i</span>], <span style="color: #a31515;">"%05d "</span>, <span style="color: #010001;">i</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">g_nNumeStringLen</span>[<span style="color: #010001;">i</span>] = <span style="color: #010001;">strlen</span>(<span style="color: #010001;">g_pNum2String</span>[<span style="color: #010001;">i</span>]) - 1;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// </span><span style="font-family: 新宋体; color: green; font-size: 9pt;">Append an integer</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">void</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: #010001;">XStringAddInt</span>(<span style="color: #010001;">XString</span>* <span style="color: #010001;">pString</span>, <span style="color: blue;">uint32 </span><span style="color: #010001;">nOut</span>, <span style="color: blue;">int</span> <span style="color: #010001;">nSpace</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">const</span> <span style="color: blue;">uint32 </span><span style="color: #010001;">nHigh</span> = <span style="color: #010001;">nOut</span> / 100000;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span>(<span style="color: #010001;">nHigh</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">XStringAddStr</span>(<span style="color: #010001;">pString</span>, <span style="color: #010001;">g_pNum2String</span>[<span style="color: #010001;">nHigh</span>], <span style="color: #010001;">g_nNumeStringLen</span>[<span style="color: #010001;">nHigh</span>]);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">XStringAddStr</span>(<span style="color: #010001;">pString</span>, <span style="color: #010001;">g_pNum2String_Zero</span>[<span style="color: #010001;">nOut</span> % 100000], 5 + (<span style="color: #010001;">nSpace</span> ? 1 : 0));</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>} </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">else</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">XStringAddStr</span>(<span style="color: #010001;">pString</span>, <span style="color: #010001;">g_pNum2String</span>[<span style="color: #010001;">nOut</span>], <span style="color: #010001;">g_nNumeStringLen</span>[<span style="color: #010001;">nOut</span>] + (<span style="color: #010001;">nSpace</span> ? 1 : 0));</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">//char szNum[32];</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">//sprintf(szNum, nSpace ? "%d " : "%d", nOut);</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">//XStringAddStr(pString, szNum, strlen(szNum));</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="text-indent: -18pt; margin: 0cm 0cm 0pt 39pt;"><strong><span style="font-size: 15pt;" lang="EN-US"><span><span style="font-family: Calibri;">3.</span><span style="font: 7pt 'Times New Roman';">     </span></span></span></strong><strong><span style="font-size: 15pt;" lang="EN-US"><span style="font-family: Calibri;">XTilingRect</span></span></strong></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt 21pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">The rectangle tiling algorithm. It uses a binary heap and leaves little room for further optimization.</span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt 21pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;"> </span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span lang="EN-US"><span style="font-family: Calibri;">Hot spots were then analyzed again with Intel Amplifier, with the results below (the input was enlarged by replicating 17sisrs.txt to 108576 lines):</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;"> <img src="http://hi.csdn.net/attachment/201106/16/0_1308226516x7zr.gif" alt="" width="755" height="365" /></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span lang="EN-US"><span style="font-family: Calibri;">Going through the hot functions shows that XStringAddStr still has room for optimization. Every appended string is short, so memcpy is relatively inefficient here; replacing it with a plain assignment loop improves efficiency. The original code:</span></span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// Append a string</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">__inline</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: blue;">void</span> <span style="color: #010001;">XStringAddStr</span>(<span style="color: #010001;">XString</span>* <span style="color: #010001;">pString</span>, <span style="color: blue;">const</span> <span style="color: blue;">char</span>* <span style="color: #010001;">pData</span>, <span style="color: blue;">int</span> <span style="color: #010001;">nLen</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XStringExpand</span>(<span style="color: #010001;">pString</span>, <span style="color: #010001;">nLen</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">memcpy</span>(<span style="color: #010001;">pString</span>-&gt;<span style="color: #010001;">pData</span> + <span style="color: #010001;">pString</span>-&gt;<span style="color: #010001;">nUse</span>, <span style="color: #010001;">pData</span>, <span style="color: #010001;">nLen</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">pString</span>-&gt;<span style="color: #010001;">nUse</span> += <span style="color: #010001;">nLen</span>;</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">The modified code:</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// Append a string</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">__inline</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: blue;">void</span> <span style="color: #010001;">XStringAddStr</span>(<span style="color: #010001;">XString</span>* <span style="color: #010001;">pString</span>, <span style="color: blue;">const</span> <span style="color: blue;">char</span>* <span style="color: #010001;">pData</span>, <span style="color: blue;">int</span> <span style="color: #010001;">nLen</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XStringExpand</span>(<span style="color: #010001;">pString</span>, <span style="color: #010001;">nLen</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">char</span> * <span style="color: #010001;">pDst</span> = <span style="color: #010001;">pString</span>-&gt;<span style="color: #010001;">pData</span> + <span style="color: #010001;">pString</span>-&gt;<span style="color: #010001;">nUse</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">for</span>(<span style="color: blue;">int</span> <span style="color: #010001;">i</span> = 0; <span style="color: #010001;">i</span> &lt; <span style="color: #010001;">nLen</span>; ++<span style="color: #010001;">i</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">pDst</span>[<span style="color: #010001;">i</span>] = <span style="color: #010001;">pData</span>[<span style="color: #010001;">i</span>];</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; text-indent: 21pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: #010001; font-size: 9pt;" lang="EN-US">pString</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">-&gt;<span style="color: #010001;">nUse</span> += <span style="color: #010001;">nLen</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; text-indent: 21pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">//memcpy(pString-&gt;pData + pString-&gt;nUse, pData, nLen);</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
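<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">For readers who want to try the loop-based append in isolation, here is a minimal self-contained sketch. It is not the author's XString implementation: the fields pData and nUse come from the listing above, while the capacity field nCap, the doubling growth policy, and the names SketchString, SketchStringExpand and SketchStringAddStr are assumptions made only for illustration.</span></span></p>

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: pData and nUse appear in the article; nCap is hypothetical. */
typedef struct {
    char *pData;  /* heap buffer */
    int   nUse;   /* bytes currently used */
    int   nCap;   /* bytes allocated (assumption, not shown in the article) */
} SketchString;

/* Ensure room for nLen more bytes; doubles the capacity as needed.
   Returns 1 on success, 0 if the allocation fails. */
static int SketchStringExpand(SketchString *s, int nLen)
{
    if (s->nUse + nLen <= s->nCap) return 1;
    int nNew = s->nCap > 0 ? s->nCap : 16;
    while (nNew < s->nUse + nLen) nNew *= 2;
    char *p = (char *)realloc(s->pData, (size_t)nNew);
    if (p == NULL) return 0;
    s->pData = p;
    s->nCap = nNew;
    return 1;
}

/* Append nLen bytes with a plain assignment loop instead of memcpy. */
static int SketchStringAddStr(SketchString *s, const char *pData, int nLen)
{
    assert(nLen >= 0);
    if (!SketchStringExpand(s, nLen)) return 0;
    char *pDst = s->pData + s->nUse;
    for (int i = 0; i < nLen; ++i) {
        pDst[i] = pData[i];
    }
    s->nUse += nLen;
    return 1;
}
```

<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">Whether the loop actually beats memcpy depends on the compiler and on the typical string length, so it is worth re-measuring with a profiler before adopting it elsewhere.</span></span></p>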
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;"> </span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span lang="EN-US"><span style="font-family: Calibri;">When all rectangles are identical, a specialized algorithm makes the tiling far more efficient. XTilingRect was modified to detect this case and handle it with the optimized code:</span></span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: #010001; font-size: 9pt;" lang="EN-US">BOOL</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> <span style="color: #010001;">bEqual</span> = <span style="color: #010001;">TRUE</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: green; font-size: 9pt;" lang="EN-US">// Check whether all rectangles have the same size</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">for</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">(<span style="color: #010001;">i</span> = 1; <span style="color: #010001;">bEqual</span> &amp;&amp; <span style="color: #010001;">i</span> &lt; <span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">nRect</span>; ++<span style="color: #010001;">i</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">if</span>(<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">pRect</span>[<span style="color: #010001;">i</span>] != <span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">pRect</span>[<span style="color: #010001;">i</span> - 1]) <span style="color: #010001;">bEqual</span> = <span style="color: #010001;">FALSE</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">if</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">(<span style="color: #010001;">bEqual</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: green;">// Loop over nW, the number of rectangles per row</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: blue;">for</span>(<span style="color: #010001;">nW</span> = 1; <span style="color: #010001;">nW</span> &lt;= <span style="color: #010001;">nRect</span>; ++<span style="color: #010001;">nW</span>) </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: #010001;">nH</span> = <span style="color: #010001;">nRect</span> / <span style="color: #010001;">nW</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span><span style="color: blue;">if</span>(<span style="color: #010001;">nW</span> * <span style="color: #010001;">nH</span> == <span style="color: #010001;">nRect</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: blue;">for</span>(<span style="color: #010001;">j</span> = 0; <span style="color: #010001;">j</span> &lt;= <span style="color: #010001;">nH</span>; ++<span style="color: #010001;">j</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>                   </span><span style="color: #010001;">pCnt</span>[<span style="color: #010001;">j</span>] = <span style="color: #010001;">nW</span> * <span style="color: #010001;">j</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: green;">// Tiling complete; output the result</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span>++<span style="color: #010001;">nRetCnt</span>;</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>              </span><span style="color: #010001;">XTilingRectOutPutXString</span>(&amp;<span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">xRet</span>, <span style="color: #010001;">pArray</span>-&gt;<span style="color: #010001;">pRect</span>, <span style="color: #010001;">nW</span> * <span style="color: #010001;">pRect</span>[0], <span style="color: #010001;">nH</span>* <span style="color: #010001;">pRect</span>[0], <span style="color: #010001;">pCnt</span>, <span style="color: #010001;">nH</span>);</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>         </span>}</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span>}</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
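<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">The idea behind the special case can be tried on its own with the sketch below. It assumes, as the call with pRect[0] above suggests, that every piece is a square with the same side length; under that assumption each tiling is a grid of cols by rows pieces with cols * rows equal to the piece count, so enumerating the divisors of the count is enough. The function enumerate_grids and its parameters are hypothetical names, not part of the author's code.</span></span></p>

```c
#include <assert.h>

/* Hypothetical helper: lists every grid layout (cols x rows) that uses
   exactly n identical squares of the given side length.
   Writes at most max_out layouts and returns the total number found. */
static int enumerate_grids(int n, int side, int out_w[], int out_h[], int max_out)
{
    assert(n > 0 && side > 0);
    int count = 0;
    for (int cols = 1; cols <= n; ++cols) {
        int rows = n / cols;
        if (cols * rows != n) {
            continue;               /* cols does not divide n: no full grid */
        }
        if (count < max_out) {
            out_w[count] = cols * side;   /* width of the enclosing rectangle */
            out_h[count] = rows * side;   /* height of the enclosing rectangle */
        }
        ++count;
    }
    return count;
}
```

<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">For 6 squares of side 5, for example, the function reports four layouts: 5 x 30, 10 x 15, 15 x 10 and 30 x 5. No heap-based search is needed for such inputs.</span></span></p>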
<h1 style="line-height: normal; margin: 10pt 0cm;"><span style="font-size: x-large;"><span style="font-family: Calibri;" lang="EN-US">Parallel Algorithm</span></span></h1>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: Calibri; color: black;" lang="EN-US"><span style="font-size: small;">Each input file contains multiple test cases, and solving them in parallel raises no data conflicts, so cilk_for alone is enough to parallelize the solver. The serial loop is:</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: blue; font-size: 9pt;" lang="EN-US">for</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">(<span style="color: blue;">int</span> <span style="color: #010001;">i</span> = 0; <span style="color: #010001;">i</span> &lt; <span style="color: #010001;">xProblemAry</span>.<span style="color: #010001;">nProblem</span>; ++<span style="color: #010001;">i</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XTilingRect</span>(<span style="color: #010001;">xProblemAry</span>.<span style="color: #010001;">pProblem</span> + <span style="color: #010001;">i</span>);</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: Calibri; color: black;" lang="EN-US"><span style="font-size: small;">After parallelizing with Cilk, the code becomes:</span></span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; color: #010001; font-size: 9pt;" lang="EN-US">cilk_for</span><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">(<span style="color: blue;">int</span> <span style="color: #010001;">i</span> = 0; <span style="color: #010001;">i</span> &lt; <span style="color: #010001;">xProblemAry</span>.<span style="color: #010001;">nProblem</span>; ++<span style="color: #010001;">i</span>)</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="text-align: left; line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US"><span>     </span><span style="color: #010001;">XTilingRect</span>(<span style="color: #010001;">xProblemAry</span>.<span style="color: #010001;">pProblem</span> + <span style="color: #010001;">i</span>);</span></p>
<p class="MsoNormal" style="line-height: 10pt; margin: 0cm 0cm 0pt; background: #cccccc;"><span style="font-family: 新宋体; font-size: 9pt;" lang="EN-US">}</span></p>
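<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">Why the loop needs no locking can be illustrated with a small sketch in which every iteration writes only to its own result slot. Note the substitution: the sketch uses an OpenMP pragma instead of cilk_for so that it also builds with compilers lacking Cilk support (without OpenMP enabled the pragma is ignored and the loop runs serially). SketchProblem, SketchSolve and SketchSolveAll are hypothetical stand-ins, and the toy solver merely counts divisors; it is not the author's XTilingRect.</span></span></p>

```c
#include <assert.h>

/* Hypothetical stand-in for one test case and its private result slot. */
typedef struct {
    int nRect;    /* input: number of rectangles */
    int nRetCnt;  /* output: written only by the iteration that owns this slot */
} SketchProblem;

/* Toy solver: counts the divisors of nRect, touching nothing but *p. */
static void SketchSolve(SketchProblem *p)
{
    p->nRetCnt = 0;
    for (int w = 1; w <= p->nRect; ++w) {
        if (p->nRect % w == 0) {
            ++p->nRetCnt;
        }
    }
}

/* Iterations share no mutable state, so they may run in any order
   or concurrently, and results stay in input order by construction. */
static void SketchSolveAll(SketchProblem *pProblem, int nProblem)
{
    assert(nProblem >= 0);
    /* OpenMP used here in place of cilk_for (an assumption of this sketch). */
    #pragma omp parallel for
    for (int i = 0; i < nProblem; ++i) {
        SketchSolve(pProblem + i);
    }
}
```

<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">Because each result is stored in the slot of its own problem, the output can later be written in input order regardless of which iteration finished first.</span></span></p>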
<p class="MsoNormal" style="line-height: 150%; text-indent: 21pt; margin: 0cm 0cm 0pt;"><span style="font-family: Calibri; color: black;" lang="EN-US"><span style="font-size: small;">After compiling the cilk_for version, Amplifier's Concurrency analysis gives the following result:</span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><img src="http://hi.csdn.net/attachment/201106/16/0_1308226555UEfr.gif" alt="" width="755" height="400" /></span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span style="font-family: Calibri; color: black;" lang="EN-US"><span>   </span>The algorithm now shows good parallelism, but XLoadData_Serial and XStringFree still run serially (XStringFree is called from XSaveDataAndFree_Serial), so optimizing I/O becomes important as well. cilk_for is used again to implement parallel versions of the load and save functions: XLoadData_Parallel and XSaveDataAndFree_Parallel.</span></span></p>
<p class="MsoNormal" style="text-indent: 15.75pt; margin: 0cm 0cm 0pt;"><span style="font-family: Calibri; color: black;" lang="EN-US"><span style="font-size: small;">After recompiling, Amplifier's Concurrency analysis gives the following result:</span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span style="font-family: 宋体; color: black;" lang="EN-US"><span><img src="http://hi.csdn.net/attachment/201106/16/0_1308226657L8QR.gif" alt="" width="755" height="360" /></span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><span style="font-family: Calibri; color: black;" lang="EN-US"><span>   </span>Only one system function, __security_init_cookie, now remains serial; it is called by the system from mainCRTStartup.</span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;"> </span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;"> </span></span></p>
<h1 style="line-height: normal; margin: 10pt 0cm;"><span style="font-family: Calibri;" lang="EN-US"><span style="font-size: x-large;">Performance Testing</span></span></h1>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">Test environment:</span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>Operating System: Windows XP Professional (5.1, Build 2600) Service Pack 3 </span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>Language: Chinese (Regional Setting: Chinese)</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: Calibri;"><span style="font-size: small;"><span lang="EN-US">    </span><span lang="EN-US">System Manufacturer: Dell Inc.</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>System Model: Vostro 1088<span>     </span><span>                </span></span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>BIOS: Phoenix ROM BIOS PLUS Version 1.10 A02</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>Processor: Intel(R) Core(TM)2 Duo CPU<span>     </span>T6570<span>  </span>@ 2.10GHz (2 CPUs)</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>Memory: 2046MB RAM</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>   </span><span> </span>Page File: 908MB used, 3027MB available</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>Windows Dir: C:\WINDOWS</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri;"><span style="font-size: small;"><span>    </span>DirectX Version: DirectX 9.0c (4.09.0000.0904)</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: Calibri;"><span style="font-size: small;"><span lang="EN-US">    </span><span lang="EN-US">DX Setup Parameters: Not found</span></span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: Calibri;"><span style="font-size: small;"><span lang="EN-US">    </span></span></span><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">DxDiag Version: 5.03.2600.5512 32bit Unicode</span></span></p>
<p class="MsoNormal" style="text-indent: 21pt; margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;"> </span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span style="font-family: Calibri; font-size: small;">Test results:</span></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span lang="EN-US"><span><span style="font-family: Calibri; font-size: small;">   </span></span></span></p>
<table class="MsoNormalTable" style="border-collapse: collapse;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 38.75pt;">
<td style="padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 131.05pt; padding-right: 5.4pt; height: 38.75pt; padding-top: 0cm; border: windowtext 1pt solid;" width="175">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: small;"><strong><span lang="EN-US"><span><span style="font-family: Calibri;">        </span></span></span></strong><strong><span lang="EN-US"><span style="font-family: Calibri;">Test data</span></span></strong><strong><span lang="EN-US"><span style="font-family: Calibri;"><span>       </span></span></span></strong></span></p>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-size: small;"><strong><span lang="EN-US"><span style="font-family: Calibri;">Algorithm version</span></span></strong></span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 103pt; padding-right: 5.4pt; height: 38.75pt; border-top: windowtext 1pt solid; border-right: windowtext 1pt solid; padding-top: 0cm;" width="137">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><strong><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">Setsin.txt</span></span></strong></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 86.1pt; padding-right: 5.4pt; height: 38.75pt; border-top: windowtext 1pt solid; border-right: windowtext 1pt solid; padding-top: 0cm;" width="115">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><strong><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">19sisrs.txt</span></span></strong></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 109.8pt; padding-right: 5.4pt; height: 38.75pt; border-top: windowtext 1pt solid; border-right: windowtext 1pt solid; padding-top: 0cm;" width="146">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><strong><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">20sisrs.txt</span></span></strong></p>
</td>
</tr>
<tr style="height: 14.2pt;">
<td style="border-bottom: windowtext 1pt solid; border-left: windowtext 1pt solid; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 131.05pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="175">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: Calibri; font-size: 12pt;" lang="EN-US">Serial version</span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 103pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="137">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">0.070894 s</span></span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 86.1pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="115">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">0.146068 s</span></span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 109.8pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="146">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">0.357285 s</span></span></p>
</td>
</tr>
<tr style="height: 14.2pt;">
<td style="border-bottom: windowtext 1pt solid; border-left: windowtext 1pt solid; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 131.05pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="175">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; font-size: 12pt;">Parallel version</span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 103pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="137">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">0.041005</span></span><span style="font-family: 宋体; font-size: 12pt;"> s</span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 86.1pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="115">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">0.089352</span></span><span style="font-family: 宋体; font-size: 12pt;"> s</span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 109.8pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="146">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">0.207256</span></span><span style="font-family: 宋体; font-size: 12pt;"> s</span></p>
</td>
</tr>
<tr style="height: 14.2pt;">
<td style="border-bottom: windowtext 1pt solid; border-left: windowtext 1pt solid; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 131.05pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="175">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; font-size: 12pt;">Speedup</span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 103pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="137">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">1.729</span></span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 86.1pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="115">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">1.635</span></span></p>
</td>
<td style="border-bottom: windowtext 1pt solid; border-left: #c8c8c8; padding-bottom: 0cm; background-color: transparent; padding-left: 5.4pt; width: 109.8pt; padding-right: 5.4pt; height: 14.2pt; border-top: #c8c8c8; border-right: windowtext 1pt solid; padding-top: 0cm;" width="146">
<p class="MsoNormal" style="text-align: center; margin: 0cm 0cm 0pt;"><span style="font-size: 12pt;" lang="EN-US"><span style="font-family: Calibri;">1.723</span></span></p>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"> </p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2011/06/17/2011-tiling-rectangles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2009 Intel® Threading Challenge Entry Notes Series - Problem 6: Line Segment Intersection</title>
		<link>http://software.intel.com/zh-cn/blogs/2009/07/08/400002053/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2009/07/08/400002053/#comments</comments>
		<pubDate>Wed, 08 Jul 2009 03:33:14 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[Blog Essay Column]]></category>
		<category><![CDATA[Threading Challenge]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2009/07/08/400002053/</guid>
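		<description><![CDATA[  Problem Description Problem: Write multithreaded code that finds the pairs of intersecting line segments in three-dimensional space. The input file of segments and the output file for the results are specified on the command line. Input format: The input file is a text file whose first line contains an integer N, the number of segments in the file. Each of the following N lines contains a segment's 4-character (printable) name and 6 integers giving the segment's two (x,y,z) endpoints: the first 3 numbers are the coordinates of one endpoint and the last 3 are those of the other. A program only needs to run correctly on a 32-bit platform to meet the precision requirement. Output format: Use some human-readable format. List every pair of intersecting segments among the input segments. If there are no intersections, print a message saying so. For details on this problem, visit the Threading Challenge site for more information.   Analysis and Solution of the Segment Intersection Problem The contest source code for this problem can be downloaded here. Serial algorithm I searched online for material on segment intersection; most of it covers two-dimensional algorithms. From the article "Computational Geometry Experiment: Balaban's Segment Intersection Algorithm (Beta)" I learned that the best current algorithm for the two-dimensional case is the divide-and-conquer algorithm proposed by Balaban, while the simpler and more practical choice is the sweep-line algorithm of Bentley &#38; Ottmann. The sweep-line algorithm is relatively simple to implement: sort all segments by X coordinate, then run a straddle test on each pair of segments whose X ranges overlap. The algorithm extends unchanged to three dimensions. Before the straddle test, a partial filter checks whether the boxes determined by the two segments overlap. If the boxes do not overlap, the two segments cannot intersect. The XY plane is then used for the straddle test; if the segments straddle each other, the Z coordinate is brought in for the final check. x0 + x1 * t1 = x2 + x3 * t2 y0 + y1 * t1 = y2 + y3 * t2 z0 + z1 [...]]]></description>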
			<content:encoded><![CDATA[<p> </p>
<h1><span style="Calibri;">Problem Description</span></h1>
<p class="MsoNormal" style="auto;" align="left"><span style="0pt;">Problem: Write multithreaded code that finds the pairs of intersecting line segments in three-dimensional space. The input file of segments and the output file for the results are specified on the command line.</span></p>
<p class="MsoNormal" style="auto;" align="left"><span style="0pt;">Input format: The input file is a text file whose first line contains an integer N, the number of segments in the file. Each of the following N lines contains a segment's 4-character (printable) name and 6 integers giving the segment's two (x,y,z) endpoints: the first 3 numbers are the coordinates of one endpoint and the last 3 are those of the other. A program only needs to run correctly on a 32-bit platform to meet the precision requirement.</span></p>
<p class="MsoNormal" style="auto;" align="left"><span style="0pt;">Output format: Use some human-readable format. List every pair of intersecting segments among the input segments. If there are no intersections, print a message saying so.</span></p>
<p>For details on this problem, visit the Threading Challenge site for <a href="http://software.intel.com/zh-cn/contests/Threading-Challenge-2009/codecontest.php">more information</a>.</p>
<h1> </h1>
<h1>Analysis and Solution of the Segment Intersection Problem</h1>
<p>The contest source code for this problem can be downloaded <a href="http://software.intel.com/file/20976">here</a>.</p>
<p><strong>Serial algorithm</strong></p>
<p><span style="small;"><span style="Calibri;">I searched online for material on segment intersection; most of it covers two-dimensional algorithms. From the article "Computational Geometry Experiment: Balaban's Segment Intersection Algorithm (Beta)" I learned that the best current algorithm for the two-dimensional case is the divide-and-conquer algorithm proposed by Balaban, while the simpler and more practical choice is the sweep-line algorithm of Bentley &amp; Ottmann.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;">The sweep-line algorithm is relatively simple to implement: sort all segments by X coordinate, then run a straddle test on each pair of segments whose X ranges overlap. The algorithm extends unchanged to three dimensions.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;">Before the straddle test, a partial filter checks whether the boxes determined by the two segments overlap. If the boxes do not overlap, the two segments cannot intersect. The XY plane is then used for the straddle test; if the segments straddle each other, the Z coordinate is brought in for the final check.</span></span></p>
<p class="MsoNormal" style="1.0gd;"><span lang="EN-US"><span style="宋体, MS Song;">x0 + x1 * t1 = x2 + x3 * t2 </span></span></p>
<p class="MsoNormal" style="1.0gd;"><span lang="EN-US"><span style="宋体, MS Song;">y0 + y1 * t1 = y2 + y3 * t2 </span></span></p>
<p class="MsoNormal" style="1.0gd;"><span lang="EN-US"><span style="宋体, MS Song;">z0 + z1 * t1 = z2 + z3 * t2 </span></span></p>
<p class="MsoNormal" style="1.0gd;"><span style="small;"><span style="Calibri;">If two lines in three dimensions intersect, the system formed by the three equations above must have a solution; for two segments to intersect, the solution must also satisfy 0 &lt;= t1, t2 &lt;= 1. Because the straddle test has already been done in the XY plane, only the third equation remains to be checked.</span></span></p>
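<p>The check described above can be sketched as follows. This is a minimal illustration and not the contest code: it assumes (x0, y0, z0) and (x2, y2, z2) are the start points of the two segments and (x1, y1, z1) and (x3, y3, z3) are their end-minus-start direction vectors, solves the x and y equations by Cramer's rule, and then verifies the z equation. The names <code>Segment3</code>, <code>Hit</code> and <code>testSegments</code> are hypothetical. It also assumes coordinates small enough that the products fit in 64-bit integers, and it reports pairs whose XY projections are parallel as a separate case to be handled elsewhere.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical segment type (not from the original source).
struct Segment3 {
    std::int64_t x0, y0, z0;  // start point
    std::int64_t x1, y1, z1;  // end point
};

enum class Hit { Intersect, Disjoint, Degenerate };

// True when 0 <= num/det <= 1, evaluated without any division.
static bool inUnitRange(std::int64_t num, std::int64_t det) {
    assert(det != 0);
    return det > 0 ? (0 <= num && num <= det) : (det <= num && num <= 0);
}

Hit testSegments(const Segment3& a, const Segment3& b) {
    // Direction vectors of the two segments.
    const std::int64_t ax = a.x1 - a.x0, ay = a.y1 - a.y0, az = a.z1 - a.z0;
    const std::int64_t bx = b.x1 - b.x0, by = b.y1 - b.y0, bz = b.z1 - b.z0;
    // Offset between the two start points.
    const std::int64_t dx = b.x0 - a.x0, dy = b.y0 - a.y0, dz = b.z0 - a.z0;

    // x and y equations: ax*t1 - bx*t2 = dx and ay*t1 - by*t2 = dy.
    const std::int64_t det = bx * ay - ax * by;
    if (det == 0) return Hit::Degenerate;       // XY projections are parallel
    const std::int64_t n1 = bx * dy - by * dx;  // t1 = n1 / det
    const std::int64_t n2 = ax * dy - ay * dx;  // t2 = n2 / det
    if (!inUnitRange(n1, det) || !inUnitRange(n2, det)) return Hit::Disjoint;

    // z equation az*t1 - bz*t2 = dz, with both sides multiplied by det.
    return (az * n1 - bz * n2 == dz * det) ? Hit::Intersect : Hit::Disjoint;
}
```

<p>Keeping everything in integer arithmetic avoids rounding questions when comparing the two sides of the third equation; a floating-point version would need a tolerance instead.</p>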
<p class="MsoNormal" style="1.0gd;"><span style="small;"><span style="Calibri;">Balaban's algorithm is a divide-and-conquer algorithm. It sorts the segments by normalized endpoint coordinates, so the normalized endpoints take values in [0, 2N] (N being the number of segments, which gives 2N endpoints). After sorting, this range is divided into strips, and segments are handled according to how they intersect the strip boundaries. Extending it to three dimensions, however, is much more complicated than extending the sweep-line algorithm.</span></span></p>
<p class="MsoNormal" style="1.0gd;"><span style="Calibri;"><span style="small;">In the end I chose to implement the relatively simple sweep-line algorithm.</span></span></p>
<p><strong>Parallel algorithm</strong></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;">First, the segments are sorted by the X coordinate of their start point using TBB's <span lang="EN-US">parallel_sort</span>.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Then, for each segment L in the sorted order, only the segments that start to its right and no further than L's end point in X need to be tested for intersection. Each segment's tests are independent: they only read segment parameters and run the intersection test, so no synchronization is needed. An OpenMP <span lang="EN-US">for</span> loop therefore parallelizes the algorithm easily. Recording the results does require synchronization, which TBB's <span lang="EN-US">concurrent_vector</span> provides conveniently and efficiently.</span></span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">// </span><span style="yes;">Find intersecting segments in parallel</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">#pragma</span><span style="yes;" lang="EN-US"> <span style="#010001;">omp</span> <span style="#010001;">parallel</span> <span style="blue;">for</span> <span style="#010001;">schedule</span>(<span style="#010001;">guided</span>, 1)</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">for</span><span style="yes;" lang="EN-US">(<span style="blue;">int</span> <span style="#010001;">i</span> = 0; <span style="#010001;">i</span> &lt; <span style="#010001;">nLine</span>; ++<span style="#010001;">i</span>)</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span><span style="blue;">const</span> <span style="blue;">int</span> <span style="#010001;">nEndX</span> = <span style="#010001;">pLine</span>[<span style="#010001;">i</span>].<span style="#010001;">nX1</span>;</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"> </span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span><span style="green;">// </span></span><span style="yes;">Using the X-sorted order, only examine segments whose start point lies between <span lang="EN-US">pLine[i].nX0</span> and <span lang="EN-US">pLine[i].nX1</span></span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span><span style="blue;">for</span>(<span style="blue;">int</span> <span style="#010001;">j</span> = <span style="#010001;">i</span> + 1; <span style="#010001;">j</span> &lt; <span style="#010001;">nLine</span> &amp;&amp; <span style="#010001;">pLine</span>[<span style="#010001;">j</span>].<span style="#010001;">nX0</span> &lt;= <span style="#010001;">nEndX</span>; ++<span style="#010001;">j</span>)</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span>{</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="2;">         </span></span><span style="yes;">……</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span>}</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">}</span></p>
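<p>For readers who want something compilable, the loop above can be fleshed out roughly as follows. This is a sketch under stated assumptions, not the contest source: <code>std::sort</code> stands in for TBB's <code>parallel_sort</code>, per-iteration buffers merged under <code>omp critical</code> stand in for <code>concurrent_vector</code>, and the elided inner body is replaced by a simple bounding-box filter on Y and Z. The <code>Line</code> fields other than <code>nX0</code> and <code>nX1</code>, and the functions <code>boxesOverlapYZ</code> and <code>findCandidates</code>, are hypothetical names. Without OpenMP enabled the pragmas are ignored and the code runs serially.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// nX0/nX1 follow the article's loop; the remaining fields are assumed.
struct Line {
    int nX0, nY0, nZ0;  // endpoint with the smaller X
    int nX1, nY1, nZ1;  // endpoint with the larger X
};

// Bounding boxes overlap on Y and Z (X overlap is guaranteed by the sweep).
static bool boxesOverlapYZ(const Line& a, const Line& b) {
    auto overlap = [](int a0, int a1, int b0, int b1) {
        return std::max(std::min(a0, a1), std::min(b0, b1)) <=
               std::min(std::max(a0, a1), std::max(b0, b1));
    };
    return overlap(a.nY0, a.nY1, b.nY0, b.nY1) &&
           overlap(a.nZ0, a.nZ1, b.nZ0, b.nZ1);
}

// Sorts the segments by start X and returns the index pairs (into the
// sorted array) that survive the bounding-box filter.
std::vector<std::pair<int, int>> findCandidates(std::vector<Line>& lines) {
    for (const Line& l : lines) assert(l.nX0 <= l.nX1);
    std::sort(lines.begin(), lines.end(),
              [](const Line& a, const Line& b) { return a.nX0 < b.nX0; });
    const int nLine = static_cast<int>(lines.size());
    std::vector<std::pair<int, int>> result;

    #pragma omp parallel for schedule(guided, 1)
    for (int i = 0; i < nLine; ++i) {
        const int nEndX = lines[i].nX1;
        std::vector<std::pair<int, int>> local;  // private to this iteration
        for (int j = i + 1; j < nLine && lines[j].nX0 <= nEndX; ++j) {
            if (boxesOverlapYZ(lines[i], lines[j])) local.emplace_back(i, j);
        }
        if (!local.empty()) {
            #pragma omp critical
            result.insert(result.end(), local.begin(), local.end());
        }
    }
    std::sort(result.begin(), result.end());  // make the order deterministic
    return result;
}
```

<p>The pairs returned here are only candidates; each one would still need the straddle test and the Z check described earlier. Merging one buffer per iteration keeps the time spent inside the critical section small compared with locking on every single pair.</p>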
<h1> </h1>
<h1><span style="Calibri;">Optimization Tools</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Hotspots analysis</span></span></strong></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Intel Amplifier's Hotspots analysis was used to find the hotspot functions of the implementation. The results are as follows:</span></span></p>
<p class="MsoNormal" style="150%;">  <span style="small;"><span style="Calibri;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090707/Snap1.jpg" alt="" width="691" height="390" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090707/Snap2.jpg" alt="" width="695" height="332" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">The results show that the hotspot lies inside <span lang="EN-US">XLSI_Solve</span>. Drilling down further, about 8 seconds are spent determining whether a pair could possibly intersect; that is, most of the time actually goes to checking whether the boxes determined by two segments overlap. This check is not yet optimal: it only sorts by x coordinate and tests the y and z coordinates in a loop. More efficient methods could shorten this time considerably, for example using divide and conquer to partition the segments into many box-shaped regions and handling segments inside a single box separately from segments that span several boxes.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Concurrency analysis</span></span></strong></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Intel Amplifier's Concurrency analysis was used to find code that could be optimized for parallelism. The results are as follows:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090707/Snap3.jpg" alt="" width="695" height="337" /></span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The results show that the data-loading part is entirely serial, but because it accounts for a small share of the total run time this is acceptable. The <span lang="EN-US">XLSI_Solve</span> function, which performs the intersection tests, has good concurrency, and its execution time is also fairly long.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span style="Calibri;" lang="EN-US">Locks and Waits analysis</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="Calibri;"><span style="small;">Intel Amplifier's Locks and Waits analysis was used to measure the cost of locks and synchronization waits. The results are as follows:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090707/Snap4.jpg" alt="" width="691" height="327" /></span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The results show that using <span lang="EN-US">concurrent_vector</span> incurs some synchronization overhead, so performance could be improved further by optimizing this part.</span></span></p>
<h1> </h1>
<h1><span style="Calibri;">Other Optimizations</span></h1>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;">  </span></span></span><span style="Calibri;"><span style="small;">While loading the data, order each segment's two endpoints so that X0 &lt; X1, and precompute some values to reduce later computation.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;">  </span></span></span><span style="Calibri;"><span style="small;">Optimize the straddle test for segment intersection so that it uses only multiplication and addition.</span></span></p>
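<p>One common way to write a straddle test with only multiplications, additions and subtractions is through the sign of a 2D cross product. The sketch below illustrates that idea and is not taken from the contest source: <code>Seg2</code>, <code>side</code>, <code>straddles</code> and <code>segmentsTouch2D</code> are hypothetical names, the endpoints are assumed to be ordered so that x0 is not greater than x1 (as arranged at load time in item 1), and coordinates are assumed small enough that the products fit in 64 bits.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using i64 = std::int64_t;

// Hypothetical 2D segment; x0 <= x1 is assumed to be ensured at load time.
struct Seg2 { i64 x0, y0, x1, y1; };

// Sign of the cross product (b - a) x (c - a):
// +1 if c is left of a->b, -1 if right, 0 if collinear.
static int side(i64 ax, i64 ay, i64 bx, i64 by, i64 cx, i64 cy) {
    const i64 v = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
    return (v > 0) - (v < 0);
}

// True when the endpoints of t are not strictly on the same side of
// the line through s.
static bool straddles(const Seg2& s, const Seg2& t) {
    return side(s.x0, s.y0, s.x1, s.y1, t.x0, t.y0) *
           side(s.x0, s.y0, s.x1, s.y1, t.x1, t.y1) <= 0;
}

// Bounding-box rejection followed by a mutual straddle test.
bool segmentsTouch2D(const Seg2& a, const Seg2& b) {
    assert(a.x0 <= a.x1 && b.x0 <= b.x1);
    if (a.x1 < b.x0 || b.x1 < a.x0) return false;
    if (std::max(a.y0, a.y1) < std::min(b.y0, b.y1) ||
        std::max(b.y0, b.y1) < std::min(a.y0, a.y1)) return false;
    return straddles(a, b) && straddles(b, a);
}
```

<p>The bounding-box rejection matters for collinear segments: without it, two collinear segments that lie apart would pass the straddle test because every cross product is zero.</p>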
<h1>  </h1>
<h1><span style="Calibri;">Performance Tests</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="small;"><span style="Calibri;">Small data set test:</span></span></strong></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Operating system: the 32-bit tests were run on 32-bit Windows XP.</span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span lang="EN-US"><span style="宋体, MS Song;">CPU:<span style="2;">         </span>Intel(R) Core(TM)2 CPU<span style="yes;">  </span>5270<span style="yes;">  </span>@ 1.40GHz</span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Memory:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">         </span>1 GB</span></span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Time unit: seconds</span></span></p>
<div>
<table class="MsoNormalTable" style=".5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="yes;">
<td style="windowtext 1pt solid;" width="198">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Number of segments</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="99">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Serial</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="109">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Parallel</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="133">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Speedup</span></span></strong></p>
</td>
</tr>
<tr style="1;">
<td style="solid windowtext .5pt;" width="198">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">10000</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="99">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.770783</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="109">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.431424</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="133">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.79</span></span></p>
</td>
</tr>
<tr style="2;">
<td style="solid windowtext .5pt;" width="198">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">20000</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="99">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">3.042050</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="109">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.676659</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="133">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.81</span></span></p>
</td>
</tr>
<tr style="yes;">
<td style="solid windowtext .5pt;" width="198">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">50000</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="99">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">19.585916</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="109">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">10.880589</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="133">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.80</span></span></p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span></p>
<h1>  </h1>
<h1><span style="Calibri;">Build Instructions</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Windows platform:</span></span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Uses VS2008 and Intel Parallel Studio.</span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">1. </span><span style="Calibri;">Open the project in VS2008.</span></span></p>
<p class="MsoNormal" style="2.0;"><span style="small;"><span style="Calibri;" lang="EN-US">2. </span><span style="Calibri;">Select the X64 platform and build the Release configuration.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt 21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">3. </span><span style="Calibri;">Go to the Bin directory; the executable is XLSI.exe.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Linux platform:</span></span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Uses ICC and TBB.</span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Upload the Src and Linux directories from the archive to the server.</span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Go to the XLSI/Linux directory and run make.</span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">3.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Go to the XLSI/Bin directory; the executable is XLSI.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="small;"><span style="Calibri;">Other:</span></span></strong></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span style="Calibri;">Organizers: please test with the </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Win32</span></span></strong><strong><span style="Calibri;"> platform </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Release</span></span></strong><strong><span style="Calibri;"> build. Thank you!</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="red;" lang="EN-US"><span style="宋体, MS Song;"> </span></span></strong></p>
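<h1><span style="Calibri;">Conclusions</span></h1>
<p class="MsoNormal" style="2.0;"><span style="small;"><span style="Calibri;">The three-dimensional segment intersection algorithm implemented in this solution still leaves a lot of room for optimization. In particular, it does too little to narrow the set of pairs that must be tested, so its complexity is fairly high. Extending the divide-and-conquer idea of Balaban's algorithm to three dimensions could greatly reduce the complexity and improve performance.</span></span></p>
<p>If you would like to discuss the issues covered in this article with the author, please email Deng Hui (邓辉) at <a href="mailto:denghui0815@hotmail.com"><span style="#000000;">denghui0815@hotmail.com</span></a></p>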
<h1> </h1>
<h1><span style="Calibri;">Acknowledgements</span></h1>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Thanks to Clay Breshears for answering my questions, to Mu, Pryce for the work of publishing this article on ISN, and to Xia, JeffX P for the careful and thorough translation of my solution.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2009/07/08/400002053/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2009 Intel® Threading Challenge Entry Notes Series - Problem 5: Knapsack Problem</title>
		<link>http://software.intel.com/zh-cn/blogs/2009/07/01/400001970/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2009/07/01/400001970/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 02:48:39 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[Blog Essay Column]]></category>
		<category><![CDATA[Threading Challenge]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2009/07/01/400001970/</guid>
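		<description><![CDATA[2009 Intel® Threading Challenge: the knapsack problem. Dynamic programming and branch-and-bound algorithms were implemented, but the parallelization is not ideal.]]></description>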
			<content:encoded><![CDATA[<p> </p>
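<h1><span style="x-large;"><span style="Calibri;">Problem Description</span></span></h1>
<p style="21pt;"><span style="Calibri;"><span style="small;">When you pack for a trip, it is important not to put too much in your luggage, or you may pay a price for the extra weight. That price might be an excess baggage fee from the airline, or sore muscles from carrying the pack. Keeping the total weight down is not the only criterion, though: you also want to pack the items that are most valuable for your trip. Stuffing your bag with feathers for a holiday in Australia, for example, would make no sense. The goal of the knapsack problem is to find the best selection of items, one that maximizes the total value of the items while keeping the total weight within the limit.</span></span></p>
<p style="21pt;"><span style="Calibri;"><span style="small;">Problem: Write threaded code that finds the list of items giving the maximum total value while the total weight of those items fits within the given capacity of a knapsack (container). Each item has associated attributes, namely its size (expressed as a weight) and its value (price). The knapsack capacity and the list of available items are contained in an input file specified on the command line. The list of packed items, the quantity of each, the total weight and the final total value are written to an output file: <span lang="EN-US">packinglist.out</span>.</span></span></p>
<p style="21pt;"><span style="Calibri;"><span style="small;">Input format: The first line of the text file contains three numbers: one real number and two integers. The real number is the capacity of the knapsack. The first integer is the maximum number available of each item; that is, although each item is listed only once in the file, several copies of the same item may be packed. The second integer is the number of items in the file. Each subsequent line consists of three fields: a unique 13-character string naming the item; a real number giving the item's weight; and a real number giving the item's value. No two items in the input have the same weight and value.</span></span></p>
<p style="21pt;"><span style="Calibri;"><span style="small;">Output format: Use some easily understood format. List each item packed and the quantity of each. Also include the total weight and total value of all packed items.</span></span></p>
<p>For details on this problem, visit the Threading Challenge site for <a href="http://software.intel.com/zh-cn/contests/Threading-Challenge-2009/codecontest.php">more information</a>.</p>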
<h1> </h1>
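<h1><span style="Calibri;">Analysis and Solution of the Knapsack Problem</span></h1>
<p>The contest source code for this problem can be downloaded <a href="http://software.intel.com/file/20252">here</a>.</p>
<p><strong><span style="Calibri;">Serial algorithm</span></strong></p>
<p style="150%;"><span style="Calibri;"><span style="small;">This is a bounded (multiple) knapsack problem, which can be converted into a 0/1 knapsack problem. The <span lang="EN-US">nCount</span> copies of each item are split into groups of sizes <span lang="EN-US">1, 2, 4, 8, ..., 2^(k - 1), nCount - 2^k + 1</span>, where k is the largest integer with <span lang="EN-US">2^k - 1 &lt;= nCount</span>. Any count in <span lang="EN-US">[0, nCount]</span> can be formed as a sum of some of these groups, so the conversion produces far fewer items than listing every copy separately.</span></span></p>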
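<p>The grouping can be sketched as below. The names <code>Item</code>, <code>splitCount</code> and <code>toZeroOneItems</code> are hypothetical and only illustrate the binary-splitting idea; they are not the functions used in the contest source.</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical item record: weight and value of a single copy.
struct Item { double weight, value; };

// Split nCount copies into group sizes 1, 2, 4, ..., 2^(k-1) plus a
// remainder, so that some subset of the groups sums to any value in
// [0, nCount].
std::vector<int> splitCount(int nCount) {
    assert(nCount >= 0);
    std::vector<int> groups;
    int remaining = nCount;
    for (long long size = 1; size <= remaining; size *= 2) {
        groups.push_back(static_cast<int>(size));
        remaining -= static_cast<int>(size);
    }
    if (remaining > 0) groups.push_back(remaining);
    return groups;
}

// Turn one item available nCount times into a few 0/1 items, one per group.
std::vector<Item> toZeroOneItems(const Item& item, int nCount) {
    std::vector<Item> out;
    for (int g : splitCount(nCount)) {
        out.push_back({item.weight * g, item.value * g});
    }
    return out;
}
```

<p>For example, 13 copies become the groups 1, 2, 4 and 6, so four 0/1 items replace thirteen.</p>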
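<p style="150%;"><span style="small;"><span style="Calibri;">The 0/1 knapsack problem can be solved with either dynamic programming or branch and bound. Dynamic programming is unsuitable when the knapsack is large, and it accesses memory too heavily. Branch and bound prunes by comparing the maximum value a branch could possibly reach with the best value found so far; when the items' value-to-weight ratios differ only slightly, branch and bound can take a long time to run.</span></span></p>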
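<p>A minimal best-first branch-and-bound sketch of this pruning rule follows. It is an illustration under assumptions, not the contest implementation: the upper bound used here is the usual fractional (greedy by value density) relaxation, the open nodes are kept in a <code>std::priority_queue</code> (a binary heap), and the names <code>Item</code>, <code>Node</code>, <code>upperBound</code> and <code>knapsackBestValue</code> are hypothetical. Item weights are assumed to be positive.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct Item { double weight, value; };  // hypothetical 0/1 item, weight > 0

struct Node {
    std::size_t level;             // index of the next item to decide
    double weight, value, bound;   // state so far and its upper bound
    bool operator<(const Node& o) const { return bound < o.bound; }
};

// Upper bound: fill the remaining room greedily in density order,
// allowing a fraction of the first item that does not fit.
static double upperBound(const std::vector<Item>& items, std::size_t level,
                         double weight, double value, double capacity) {
    double room = capacity - weight, bound = value;
    for (std::size_t i = level; i < items.size() && room > 0; ++i) {
        const double take = std::min(room, items[i].weight);
        bound += items[i].value * (take / items[i].weight);
        room -= take;
    }
    return bound;
}

double knapsackBestValue(std::vector<Item> items, double capacity) {
    for (const Item& it : items) assert(it.weight > 0);
    // Densest items first, so the greedy bound is a valid upper bound.
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.value * b.weight > b.value * a.weight;
    });
    double best = 0.0;
    std::priority_queue<Node> open;  // node with the largest bound on top
    open.push({0, 0.0, 0.0, upperBound(items, 0, 0.0, 0.0, capacity)});
    while (!open.empty()) {
        const Node n = open.top();
        open.pop();
        if (n.bound <= best || n.level == items.size()) continue;  // prune
        const Item& it = items[n.level];
        if (n.weight + it.weight <= capacity) {  // branch: take the item
            const double w = n.weight + it.weight, v = n.value + it.value;
            best = std::max(best, v);
            const double b = upperBound(items, n.level + 1, w, v, capacity);
            if (b > best) open.push({n.level + 1, w, v, b});
        }
        // Branch: skip the item.
        const double b = upperBound(items, n.level + 1, n.weight, n.value, capacity);
        if (b > best) open.push({n.level + 1, n.weight, n.value, b});
    }
    return best;
}
```

<p>When value-to-weight ratios are nearly equal, the bounds of most nodes stay close to the best value found, few branches are pruned, and the queue grows large, which is the slow case mentioned above.</p>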
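<p><strong><span style="Calibri;">Parallel algorithm</span></strong></p>
<p style="none;" align="left"><span style="small;"><span style="Calibri;">Dynamic programming has heavy data dependencies, so it is quite hard to parallelize; in my tests the thread synchronization overhead even made the parallel version slower than the serial one.</span></span></p>
<p style="150%;"><span style="small;"><span style="Calibri;">Parallelizing branch and bound runs into similar problems. In the end, several threads each explore different branches, which provides the parallel speedup.</span></span></p>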
<h1> </h1>
<h1><span style="Calibri;">Optimization Tools</span></h1>
<p style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Based on the test results, the compressed version of the WM algorithm was finally chosen for submission; the optimization results for the ZWM algorithm follow.</span></span></p>
<p style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Hotspots analysis</span></span></strong></span></p>
<p style="150%;"><span style="Calibri;"><span style="small;">Intel Amplifier's Hotspots analysis was used to find the hotspot functions. The results are as follows:</span></span></p>
<p style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090625/Snap1.jpg" alt="" width="741" height="323" /></span></span></p>
<p style="150%;"><span style="Calibri;"><span style="small;">The results show that most of the time is spent in <span lang="EN-US">XKnapsack_BB_Bound</span>, the function that estimates the upper bound on value. This function was optimized based on the results: it originally estimated the bound one pile of identical items at a time, and after optimization it estimates per item, which roughly doubled the running speed.</span></span></p>
<p style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Concurrency analysis</span></span></strong></span></p>
<p style="150%;"><span style="Calibri;"><span style="small;">Intel Amplifier's Concurrency analysis was used to find code that could be optimized for parallelism. The results are as follows:</span></span></p>
<p style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090625/Snap2.jpg" alt="" width="745" height="325" /></span></span></p>
<p style="150%;"><span style="Calibri;"><span style="small;">The results show that the value upper-bound estimation function </span></span><span style="yes;" lang="EN-US">XKnapsack_BB_Bound</span><span style="Calibri;"><span style="small;"> has good concurrency.</span></span></p>
<p style="150%;"><span style="Calibri;" lang="EN-US"><span style="small;"> </span></span><span style="small;"><strong><span style="Calibri;" lang="EN-US">Locks and Waits analysis</span></strong></span></p>
<p style="0cm 0cm 0pt;"><span style="Calibri;"><span style="small;">Intel Amplifier's Locks and Waits analysis was used to measure the cost of locks and synchronization waits. The results are as follows:</span></span></p>
<p style="0cm 0cm 0pt;"><strong></strong></p>
<p style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090625/Snap3.jpg" alt="" width="740" height="323" /></span></span></p>
<p style="150%;"><span style="Calibri;"><span style="small;">The results show that the algorithm has almost no synchronization or lock overhead.</span></span></p>
<h1> </h1>
<h1><span style="Calibri;">Other Optimizations</span></h1>
<p style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Load the data through a memory-mapped file.</span></span></p>
<p style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Use a binary heap to retrieve the node with the largest value upper bound.</span></span></p>
<p style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">3.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Use a memory buffer to reduce the time spent on memory allocation.</span></span></p>
<h1> </h1>
<h1><span style="Calibri;">Performance Tests</span></h1>
<p style="21pt;"><span style="small;"><span style="Calibri;">Operating system:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="1;">    </span>the 32-bit tests were run on 32-bit Windows XP.</span></span></span></p>
<p style="21pt;"><span style="small;"><span lang="EN-US"><span style="宋体, MS Song;">CPU</span></span><span style="Calibri;">:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">           </span>Intel(R) Core(TM)2 CPU<span style="yes;">  </span>5270<span style="yes;">  </span>@ 1.40GHz</span></span></span></p>
<p style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Memory:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">           </span>1 GB<span style="yes;">   </span></span></span></span></p>
<p style="21pt;"><span style="small;"><span style="Calibri;">Time unit:</span><span lang="EN-US"><span style="yes;"><span style="宋体, MS Song;">  </span></span></span><span style="Calibri;">seconds</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="yes;">     </span><span style="yes;"> </span></span></span><span style="Calibri;">Shortest </span><span lang="EN-US"><span style="宋体, MS Song;">DNA</span></span><span style="Calibri;"> sequence length: </span><span lang="EN-US"><span style="宋体, MS Song;">32</span></span><span style="Calibri;"> bytes</span></span></p>
<div>
<table class="MsoNormalTable" style=".5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="yes;">
<td style="windowtext 1pt solid;" width="100">
<p style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Knapsack capacity</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="76">
<p style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Number of items</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="82">
<p style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Item types</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Serial (s)</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Parallel (s)</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Speedup</span></span></strong></p>
</td>
</tr>
<tr style="1;">
<td style="solid windowtext .5pt;" width="100">
<p style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">75.00</span></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="76">
<p style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">99</span></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="82">
<p style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">10028</span></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.017726</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.020503</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.86</span></span></p>
</td>
</tr>
<tr style="yes;">
<td style="solid windowtext .5pt;" width="100">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1000000.00<span style="yes;">  </span></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="76">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">100</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="82">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">100000</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">0.31599</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.330507</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.96</span></span></p>
</td>
</tr>
</tbody>
</table>
</div>
<h1> </h1>
<h1> </h1>
<h1><span style="Calibri;">Build Instructions</span></h1>
<p style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Windows</span></span></strong><strong><span style="Calibri;"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Built with <span lang="EN-US">VS2008</span> and <span lang="EN-US">Intel Parallel Studio</span>.</span></span></p>
<p style="21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">1. </span><span style="Calibri;">Open this project in <span lang="EN-US">VS2008</span><span lang="EN-US">.</span></span></span></p>
<p style="2.0;"><span style="small;"><span style="Calibri;" lang="EN-US">2. </span><span style="Calibri;">Select the <span lang="EN-US">Win32</span> platform and build the <span lang="EN-US">Release</span> configuration<span lang="EN-US">.</span></span></span></p>
<p style="0cm 0cm 0pt 21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">3. </span><span style="Calibri;">Go to the <span lang="EN-US">Bin</span> directory; the executable is <span lang="EN-US">XKnapsack.exe.</span></span></span></p>
<p style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Linux</span></span></strong><strong><span style="Calibri;"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Built with <span lang="EN-US">ICC</span> and <span lang="EN-US">TBB</span>.</span></span></p>
<p style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Upload the <span lang="EN-US">Src</span> and <span lang="EN-US">Linux</span> directories from the archive to the server<span lang="EN-US">.</span></span></span></p>
<p style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Go to the <span lang="EN-US">XKnapsack/Linux</span> directory and run <span lang="EN-US">make</span>.</span></span></p>
<p style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">3.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Go to the <span lang="EN-US">XKnapsack/Bin</span> directory; the executable is <span lang="EN-US">XKnapsack.</span></span></span></p>
<p style="0cm 0cm 0pt;"><strong><span style="small;"><span style="Calibri;">Other:</span></span></strong></p>
<p style="0cm 0cm 0pt;"><span style="small;"><strong><span style="Calibri;">Organizers, please test with the </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Win32</span></span></strong><strong><span style="Calibri;"> platform </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Release</span></span></strong><strong><span style="Calibri;"> build. Thank you!</span></strong></span></p>
<h1> </h1>
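<h1><span style="Calibri;">Optimization Conclusions</span></h1>
<p style="2.0;"><span style="Calibri;"><span style="small;">Parallelizing an algorithm with heavy data dependencies is very difficult. How can the work be parallelized when the data set is small, so that the time saved by running in parallel far exceeds the synchronization overhead? I need more time to explore this.</span></span></p>
<p>If you would like to discuss the issues covered in this article with the author, please email 邓辉 at <a href="mailto:denghui0815@hotmail.com"><span style="#000000;">denghui0815@hotmail.com</span></a></p>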
<h1> </h1>
<h1><span style="Calibri;">Acknowledgments</span></h1>
<p style="21pt;"><span style="small;"><span style="Calibri;">Thanks to </span><span lang="EN-US"><span style="宋体, MS Song;">Clay Breshears</span></span><span style="Calibri;"> for the answers provided on the forum, to </span><span lang="EN-US"><span style="宋体, MS Song;">Mu</span></span><span style="Calibri;">, </span><span lang="EN-US"><span style="宋体, MS Song;">Pryce</span></span><span style="Calibri;"> for the work of publishing this article on </span><span lang="EN-US"><span style="宋体, MS Song;">ISN</span></span><span style="Calibri;">, and to </span><span lang="EN-US"><span style="宋体, MS Song;">Xia, JeffX P</span></span><span style="Calibri;"> for the careful and thorough translation of this solution.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2009/07/01/400001970/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2009 Intel® Threading Challenge Contest Notes Series - Problem 4: String Matching</title>
		<link>http://software.intel.com/zh-cn/blogs/2009/06/10/400001953/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2009/06/10/400001953/#comments</comments>
		<pubDate>Wed, 10 Jun 2009 03:59:23 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[博客征文专栏]]></category>
		<category><![CDATA[线程挑战赛]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2009/06/10/400001953/</guid>
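		<description><![CDATA[  Problem Description Problem description: Write a threaded program that searches a database of DNA sequences, represented as character strings, to find other DNA sequences that match. The database and query strings consist of four characters: 'A', 'C', 'G', and 'T'. For each input query string, the output must report the location of every exact match within the input DNA sequences. If a query string matches several sequences in the database, every result must be reported; if a query string matches several positions within the same database sequence, the earliest position of an exact match must be reported. The file names for this problem (database file, query file, output file) are given on the command line. File format: The input database and query files have the same format. Each sequence begins with a greater-than character ('&#62;') followed by a description of the sequence's origin of no more than 131 characters. The sequence starts on the next line and continues for several lines. Each line contains exactly 80 characters from the set 'A', 'C', 'G', and 'T', except the last line, which may contain fewer than 80 characters. After the last line comes the descriptor of the next sequence, and so on until the end of the file, which is marked by the descriptor ("&#62;EOF"). For each query string in the second input file, the output file should print the descriptor of the query sequence and the descriptor of each database sequence that contains a match, together with the position of the match within that database sequence. If the query string is not found in any database sequence, a message to that effect should be printed after the query descriptor. Timing: Total execution time will be used for scoring. This allows the input database sequences to be encoded or compressed during input (if your algorithm uses such a transformation).   Analysis and Solution of the String Matching Problem The contest source code for this problem can be downloaded here. Serial Algorithms String matching algorithms fall into single-pattern and multi-pattern algorithms. Each string to be searched for is one pattern: a single-pattern algorithm matches one pattern at a time, whereas a multi-pattern algorithm matches several patterns in a single pass. Common single-pattern algorithms include Karp-Rabin (KR), Knuth-Morris-Pratt (KMP), and Boyer-Moore (BM); multi-pattern algorithms include Aho-Corasick (AC) and Wu-Manber (WM). While studying string matching algorithms step by step, I implemented, in order, the KMP algorithm, the BM algorithm, the WM algorithm, and a WM algorithm that works on compressed DNA. Both KMP and BM use information contained in the pattern itself to avoid comparing the same character more than once. KMP uses prefix information to speed up the window shift, while BM uses suffix information. The WM algorithm groups the patterns by the hash value of their suffixes, which reduces the number of candidate patterns to compare. Its window shift resembles BM's, except that the Shift table is computed from hash values. Because DNA contains only the four characters (A, C, G, T) and WM computes its hash from two characters, there are too many collisions and efficiency drops, so packing 4 characters into 1 byte reduces collisions and improves efficiency. The compressed WM algorithm only supports patterns of length 8 or more, so patterns shorter than 8 are matched directly with a sequential search. Parallel Algorithms Because the problem matches multiple patterns against multiple DNA sequences, it is fairly easy to parallelize: multiple threads can each match a different DNA sequence. Single-pattern matching algorithm:      #pragma omp parallel for schedule(guided, 1)      for(i = 0; i &#60; [...]]]></description>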
			<content:encoded><![CDATA[<p> </p>
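<h1><span style="x-large;"><span style="Calibri;">Problem Description</span></span></h1>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Problem description: Write a threaded program that searches a database of<span lang="EN-US"> DNA </span>sequences, represented as character strings, to find other<span lang="EN-US"> DNA </span>sequences that match. The database and query strings consist of four characters: <span lang="EN-US">'A', 'C', 'G', </span>and<span lang="EN-US"> 'T'</span>. For each input query string, the output must report the location of every exact match within the input<span lang="EN-US"> DNA </span>sequences. If a query string matches several sequences in the database, every result must be reported<span lang="EN-US">; </span>if a query string matches several positions within the same database sequence, the earliest position of an exact match must be reported. The file names for this problem (database file, query file, output file) are given on the command line.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">File format: The input database and query files have the same format. Each sequence begins with a greater-than character <span lang="EN-US">('&gt;')</span> followed by a description of the sequence's origin of no more than<span lang="EN-US"> 131 </span>characters. The sequence starts on the next line and continues for several lines. Each line contains exactly<span lang="EN-US"> 80 </span>characters from the set <span lang="EN-US">'A', 'C', 'G', </span>and <span lang="EN-US">'T'</span>, except the last line, which may contain fewer than <span lang="EN-US">80</span> characters. After the last line comes the descriptor of the next sequence, and so on until the end of the file, which is marked by the descriptor (<span lang="EN-US">"&gt;EOF"</span>).</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">For each query string in the second input file, the output file should print the descriptor of the query sequence and the descriptor of each database sequence that contains a match, together with the position of the match within that database sequence. If the query string is not found in any database sequence, a message to that effect should be printed after the query descriptor.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Timing: Total execution time will be used for scoring. This allows the input database sequences to be encoded or compressed during input (if your algorithm uses such a transformation).</span></span></p>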
<h1> </h1>
<h1><span style="Calibri;">Analysis and Solution of the String Matching Problem</span></h1>
<p>The contest source code for this problem can be downloaded <a href="http://software.intel.com/file/19413">here</a>.</p>
<p><strong><span style="Calibri;">Serial Algorithms</span></strong></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;">String matching algorithms fall into single-pattern and multi-pattern algorithms. Each string to be searched for is one pattern: a single-pattern algorithm matches one pattern at a time, whereas a multi-pattern algorithm matches several patterns in a single pass. Common single-pattern algorithms include </span><span lang="EN-US"><span style="宋体, MS Song;">Karp-Rabin (KR), Knuth-Morris-Pratt (KMP), and Boyer-Moore (BM)</span></span><span style="Calibri;">; multi-pattern algorithms include </span><span lang="EN-US"><span style="宋体, MS Song;">Aho-Corasick (AC) and Wu-Manber (WM)</span></span><span style="Calibri;">.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">While studying string matching algorithms step by step, I implemented, in order, the <span lang="EN-US">KMP</span> algorithm, the <span lang="EN-US">BM</span> algorithm, the <span lang="EN-US">WM</span> algorithm, and a <span lang="EN-US">WM</span> algorithm that works on compressed <span lang="EN-US">DNA</span>.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Both <span lang="EN-US">KMP</span> and <span lang="EN-US">BM</span> use information contained in the pattern itself to avoid comparing the same character more than once. <span lang="EN-US">KMP</span> uses prefix information to speed up the window shift, while <span lang="EN-US">BM</span> uses suffix information.</span></span></p>
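<p>The prefix idea behind KMP can be sketched as follows. This is a textbook formulation written for illustration, not the contest implementation; BuildPrefixTable and KmpFindFirst are hypothetical names. The table records, for each pattern position, the longest proper prefix that is also a suffix, so after a mismatch the search resumes from that prefix and never re-reads a text character. Returning at the first full match yields the earliest position, which is what the problem asks for.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// fail[i] is the length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
std::vector<int> BuildPrefixTable(const std::string& pattern) {
    std::vector<int> fail(pattern.size(), 0);
    int k = 0;  // length of the prefix matched so far
    for (std::size_t i = 1; i < pattern.size(); ++i) {
        while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        fail[i] = k;
    }
    return fail;
}

// Returns the position of the earliest match, or -1 if there is none.
int KmpFindFirst(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());
    const std::vector<int> fail = BuildPrefixTable(pattern);
    int k = 0;  // number of pattern characters currently matched
    for (std::size_t i = 0; i < text.size(); ++i) {
        // On a mismatch, fall back to the longest reusable prefix.
        while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == static_cast<int>(pattern.size())) {
            return static_cast<int>(i) - k + 1;
        }
    }
    return -1;
}
```

<p>For example, KmpFindFirst("GATTACAGATTACA", "TTAC") reports position 2, the first of the two occurrences.</p>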
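<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;" lang="EN-US">The WM</span><span style="Calibri;"> algorithm groups the patterns by the <span lang="EN-US">hash</span> value of their suffixes, which reduces the number of candidate patterns to compare. Its window shift resembles <span lang="EN-US">BM</span>'s, except that the <span lang="EN-US">Shift</span> table is computed from <span lang="EN-US">hash</span> values.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Because <span lang="EN-US">DNA</span> contains only the <span lang="EN-US">4</span> characters <span lang="EN-US">(A,C,G,T)</span> and <span lang="EN-US">WM</span> computes its <span lang="EN-US">hash</span> from two characters, there are at most 16 distinct hash values, which causes too many collisions and lowers efficiency. Packing <span lang="EN-US">4</span> characters into <span lang="EN-US">1</span> byte reduces collisions and improves efficiency.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The compressed <span lang="EN-US">WM</span> algorithm only supports patterns of length <span lang="EN-US">8</span> or more, so patterns shorter than <span lang="EN-US">8</span> are matched directly with a sequential search.</span></span></p>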
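<p>Packing four bases into one byte can be sketched as below. The 2-bit codes (A=0, C=1, G=2, T=3), the bit order, and the names BaseCode and PackDna are assumptions made for illustration; the contest source may encode the bases differently. If the hash is then taken over two packed bytes (also an assumption), it covers eight bases and can take up to 65536 values instead of 16.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Maps a base to a 2-bit code. The concrete assignment is an assumption.
inline std::uint8_t BaseCode(char base) {
    switch (base) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
    }
    assert(false && "input must contain only A, C, G, T");
    return 0;
}

// Packs four bases per byte, first base in the two highest bits.
// A trailing partial group is padded with zero bits.
std::vector<std::uint8_t> PackDna(const std::string& dna) {
    std::vector<std::uint8_t> packed((dna.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < dna.size(); ++i) {
        const unsigned shift = static_cast<unsigned>(6 - 2 * (i % 4));
        packed[i / 4] |= static_cast<std::uint8_t>(BaseCode(dna[i]) << shift);
    }
    return packed;
}
```

<p>With this layout "ACGT" becomes the single byte 0x1B (binary 00 01 10 11), so the packed database is a quarter of the size of the text form.</p>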
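<p><strong><span style="Calibri;">Parallel Algorithms</span></strong></p>
<p class="MsoNormal" style="150%;"><span style="Calibri">Because the problem matches multiple patterns against multiple <span lang="EN-US">DNA</span> sequences, it is fairly easy to parallelize: multiple threads can each match a different <span lang="EN-US">DNA</span> sequence.</span></p>
<p class="MsoNormal" style="150%"><span style="Calibri">Single-pattern matching algorithm:</span></p>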
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="blue">#pragma</span> <span style="#010001">omp</span> <span style="#010001">parallel</span> <span style="blue">for</span> <span style="#010001">schedule</span>(<span style="#010001">guided</span>, 1)</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="blue">for</span>(<span style="#010001">i</span> = 0; <span style="#010001">i</span> &lt; <span style="#010001">nDatabaseCnt</span>; ++<span style="#010001">i</span>)</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span>{</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="2">         </span><span style="#010001">XDNASequence</span>&amp; <span style="#010001">xDNADB</span> = <span style="#010001">xDatabaseVec</span>.<span style="#010001">at</span>(<span style="#010001">i</span>);</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span></span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="2">         </span><span style="blue">for</span>(<span style="blue">int</span> <span style="#010001">j</span> = 0; <span style="#010001">j</span> &lt; <span style="#010001">nPatternCnt</span>; ++<span style="#010001">j</span>)</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="2">         </span>{</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="3">              </span><span style="#010001">XDNASequence</span>&amp; <span style="#010001">xDNAPattern</span> = <span style="#010001">xPatternVec</span>.<span style="#010001">at</span>(<span style="#010001">j</span>);</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="3">              </span><span style="#010001">pOutput</span>[<span style="#010001">i</span> * <span style="#010001">nPatternCnt</span> + <span style="#010001">j</span>] = </span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="4">                   </span><span style="#010001">XStrMatch_BM</span>( <span style="#010001">xDNADB</span>.<span style="#010001">pSequence</span>, <span style="#010001">xDNADB</span>.<span style="#010001">nSequenceLen</span>,</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="4">                   </span><span style="#010001">xDNAPattern</span>.<span style="#010001">pSequence</span>, <span style="#010001">xDNAPattern</span>.<span style="#010001">nSequenceLen</span>,</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="4">                   </span><span style="#010001">xDNAPattern</span>.<span style="#010001">pExt</span>);</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="2">         </span>}</span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="yes" lang="EN-US"><span style="1">     </span>}</span></p>
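<p class="MsoNormal" style="2.0"><span style="Calibri">Based on the <span lang="EN-US">Amplifier</span> results, I optimized this loop by reducing the task granularity from one database sequence to one (sequence, pattern) pair, which increases parallelism.</span></p>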
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="blue">int</span> <span style="#010001">nCount</span> = <span style="#010001">nDatabaseCnt</span> * <span style="#010001">nPatternCnt</span>;</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="blue">#pragma</span> <span style="#010001">omp</span> <span style="#010001">parallel</span> <span style="blue">for</span> <span style="#010001">schedule</span>(<span style="#010001">guided</span>, 1)</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="blue">for</span>(<span style="#010001">i</span> = 0; <span style="#010001">i</span> &lt; <span style="#010001">nCount</span>; ++<span style="#010001">i</span>)</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span>{</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="yes">    </span><span style="blue">const</span> <span style="blue">int</span> <span style="#010001">nIndexD</span> = <span style="#010001">i</span> / <span style="#010001">nPatternCnt</span>;</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="yes">    </span><span style="blue">const</span> <span style="blue">int</span> <span style="#010001">nIndexM</span> = <span style="#010001">i</span> % <span style="#010001">nPatternCnt</span>;</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span></span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="yes">    </span><span style="#010001">XDNASequence</span>&amp; <span style="#010001">xDNADB</span> = <span style="#010001">xDatabaseVec</span>.<span style="#010001">at</span>(<span style="#010001">nIndexD</span>);</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="yes">    </span><span style="#010001">XDNASequence</span>&amp; <span style="#010001">xDNAPattern</span> = <span style="#010001">xPatternVec</span>.<span style="#010001">at</span>(<span style="#010001">nIndexM</span>);</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span></span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="yes">    </span><span style="#010001">pOutput</span>[<span style="#010001">i</span>] = <span style="#010001">XStrMatch_BM</span>( <span style="#010001">xDNADB</span>.<span style="#010001">pSequence</span>, <span style="#010001">xDNADB</span>.<span style="#010001">nSequenceLen</span>,</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="yes">                           </span><span style="#010001">xDNAPattern</span>.<span style="#010001">pSequence</span>, <span style="#010001">xDNAPattern</span>.<span style="#010001">nSequenceLen</span>,</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span><span style="yes">      </span><span style="yes">                     </span><span style="#010001">xDNAPattern</span>.<span style="#010001">pExt</span>);</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="1">     </span>}</span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="Calibri" lang="EN-US"> </span></p>
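<p>A self-contained version of the flattened loop might look like the sketch below. It follows the same index arithmetic (i / nPatternCnt selects the database sequence, i % nPatternCnt selects the pattern), but MatchAll is a hypothetical name and std::string::find stands in for XStrMatch_BM, whose implementation is not shown here. Compiled without OpenMP support the pragma is ignored and the loop simply runs serially.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Fills out[d * patterns.size() + p] with the earliest position of
// patterns[p] in database[d], or -1 when there is no match.
// std::string::find is a stand-in for the author's XStrMatch_BM.
std::vector<int> MatchAll(const std::vector<std::string>& database,
                          const std::vector<std::string>& patterns) {
    const int nPatternCnt = static_cast<int>(patterns.size());
    const int nCount = static_cast<int>(database.size()) * nPatternCnt;
    std::vector<int> out(static_cast<std::size_t>(nCount), -1);

    // Each iteration writes a distinct element of out, so no locking is needed.
    #pragma omp parallel for schedule(guided, 1)
    for (int i = 0; i < nCount; ++i) {
        const int d = i / nPatternCnt;  // database sequence index
        const int p = i % nPatternCnt;  // pattern index
        const std::string::size_type pos = database[d].find(patterns[p]);
        if (pos != std::string::npos) out[i] = static_cast<int>(pos);
    }
    return out;
}
```

<p>With 2 database sequences and 3 patterns the loop has 6 independent iterations to schedule instead of 2, which is the finer granularity described above.</p>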
<p class="MsoNormal" style="150%"><span style="Calibri">Multi-pattern matching algorithm:</span></p>
<p class="MsoNormal" style="2.0" align="left"><span style="yes" lang="EN-US">#pragma</span><span style="yes" lang="EN-US"> <span style="#010001">omp</span> <span style="#010001">parallel</span> <span style="blue">for</span> <span style="#010001">schedule</span>(<span style="#010001">guided</span>, 1)</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="yes">    </span><span style="blue">for</span>(<span style="#010001">i</span> = 0; <span style="#010001">i</span> &lt; <span style="#010001">nDatabaseCnt</span>; ++<span style="#010001">i</span>)</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="yes">    </span>{</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="yes">        </span><span style="#010001">XDNASequence</span>&amp; <span style="#010001">xDNADB</span> = <span style="#010001">xDatabaseVec</span>.<span style="#010001">at</span>(<span style="#010001">i</span>);</span></p>
<p class="MsoNormal" style="none" align="left"><span style="yes" lang="EN-US"><span style="yes">        </span><span style="#010001">XZWMSearch</span> (<span style="#010001">pXWM</span>, <span style="#010001">xDNADB</span>.<span style="#010001">pSequence</span>, <span style="#010001">xDNADB</span>.<span style="#010001">nSequenceLen</span>, <span style="#010001">pOutput</span> + <span style="#010001">i</span> * <span style="#010001">nPatternCnt</span>);</span></p>
<p class="MsoNormal" style="150%;"><span style="yes" lang="EN-US"><span style="yes">    </span>}</span></p>
<h1> </h1>
<h1><span style="Calibri;">Optimization Tools</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Based on the test results, I finally chose to submit the compressed version of the </span><span lang="EN-US"><span style="宋体, MS Song;">WM</span></span><span style="Calibri;"> algorithm. The optimization results for this </span><span lang="EN-US"><span style="宋体, MS Song;">ZWM</span></span><span style="Calibri;"> algorithm follow.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Hotspots</span></span></strong><strong><span style="Calibri;"> analysis</span></strong></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><span lang="EN-US">Intel Amplifier</span>'s <span lang="EN-US">Hotspots</span> analysis was used to find the hotspot functions. The results are as follows:</span></span> </p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090609/1.JPG" alt="" width="656" height="424" /></span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The results show that most of the time is spent reading data: the ratio of time spent reading and compressing the data to time spent matching is <span lang="EN-US">8</span>:<span lang="EN-US">1</span>.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Concurrency</span></span></strong><strong><span style="Calibri;"> analysis</span></strong></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><span lang="EN-US">Intel Amplifier</span>'s <span lang="EN-US">Concurrency</span> analysis was used to find code that could benefit from parallel optimization. The results are as follows:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090609/2.JPG" alt="" width="664" height="402" /></span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The results show that the algorithm parallelizes well. The <span lang="EN-US">Read</span> function reads the data in blocks, and each block must be larger than the maximum <span lang="EN-US">DNA</span> sequence length of <span lang="EN-US">10^6</span>, so when the data set is small there are too few blocks and some <span lang="EN-US">CPU</span> cores sit idle.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;" lang="EN-US"><span style="small;"> </span></span><span style="small;"><strong><span style="Calibri;" lang="EN-US">Locks and Waits</span></strong><strong><span style="Calibri;"> analysis</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="Calibri;"><span style="small;"><span lang="EN-US">Intel Amplifier</span>'s <span lang="EN-US">Locks and Waits</span> analysis was used to measure the cost of locks and synchronization waits. The results are as follows:</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong></strong></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090609/3.JPG" alt="" width="661" height="432" /></span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The results show that the algorithm incurs almost no synchronization or locking overhead.</span></span></p>
<h1> </h1>
<h1><span style="Calibri;">Other Optimizations</span></h1>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Load the <span lang="EN-US">DNA</span> sequences block by block in parallel, using memory mapping combined with <span lang="EN-US">OpenMP</span>.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Store the <span lang="EN-US">DNA</span> data in a <span lang="EN-US">concurrent_vector</span>.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">3.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Sort the patterns with <span lang="EN-US">parallel_sort</span>.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;" lang="EN-US"><span style="small;"> </span></span></p>
<h1><span style="Calibri;">Performance Tests</span></h1>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;"> Operating system:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="1;">    </span>the 32-bit tests were run on 32-bit Windows XP.</span></span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span lang="EN-US"><span style="宋体, MS Song;"> CPU</span></span><span style="Calibri;">:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">         </span>Intel(R) Core(TM)2 CPU<span style="yes;">  </span>5270<span style="yes;">  </span>@ 1.40GHz</span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span lang="EN-US"><span style="1;"><span style="宋体, MS Song;"> </span></span></span><span style="Calibri;">Memory:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">         </span>1 GB</span></span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;"> Time unit:</span><span lang="EN-US"><span style="yes;"><span style="宋体, MS Song;">    </span></span></span><span style="Calibri;">seconds</span><span lang="EN-US"><span style="yes;"><span style="宋体, MS Song;">      </span></span></span><span style="Calibri;">Shortest </span><span lang="EN-US"><span style="宋体, MS Song;">DNA</span></span><span style="Calibri;"> sequence length: </span><span lang="EN-US"><span style="宋体, MS Song;">32</span></span><span style="Calibri;"> bytes</span></span></p>
<div>
<table class="MsoNormalTable" style=".5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="yes;">
<td style="windowtext 1pt solid;" width="125">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Test data</span></span></strong></p>
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">file size</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="119">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Test data</span></span></strong></p>
<p class="MsoNormal" style="center;" align="center"><span style="small;"><strong><span style="Calibri;" lang="EN-US">DNA</span></strong><strong><span style="Calibri;"> count</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Serial (s)</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Parallel (s)</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Speedup</span></span></strong></p>
</td>
</tr>
<tr style="1;">
<td style="solid windowtext .5pt;" width="125">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">6M</span><span style="Calibri;" lang="EN-US">_257k</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="119">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">10_500</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.041158</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.030744</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.34</span></span></p>
</td>
</tr>
<tr style="2;">
<td style="solid windowtext .5pt;" width="125">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">49.5M</span><span style="Calibri;" lang="EN-US">_257k</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="119">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">100_500</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.330507</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.213908</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.54</span></span></p>
</td>
</tr>
<tr style="3;">
<td style="solid windowtext .5pt;" width="125">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">49.5M</span><span style="Calibri;" lang="EN-US">_2.5M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="119">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">100_5000</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.565867</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.363544</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.56</span></span></p>
</td>
</tr>
<tr style="yes;">
<td style="solid windowtext .5pt;" width="125">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">250M</span><span style="Calibri;" lang="EN-US">_2.5M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="119">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">500_5000</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">2.814094</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.711416</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.64</span></span></p>
</td>
</tr>
</tbody>
</table>
</div>
<h1> </h1>
<h1> </h1>
<h1><span style="Calibri;">Build Instructions</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Windows</span></span></strong><strong><span style="Calibri;"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Using <span lang="EN-US">VS2008</span> and <span lang="EN-US">Intel Parallel Studio</span></span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">1. </span><span style="Calibri;">Open this project in <span lang="EN-US">VS2008</span><span lang="EN-US">.</span></span></span></p>
<p class="MsoNormal" style="2.0;"><span style="small;"><span style="Calibri;" lang="EN-US">2. </span><span style="Calibri;">Select the <span lang="EN-US">Win32</span> platform and build the <span lang="EN-US">Release</span> configuration<span lang="EN-US">.</span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt 21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">3. </span><span style="Calibri;">The executable, <span lang="EN-US">XStringMatch.exe</span>, is in the <span lang="EN-US">Bin</span> directory<span lang="EN-US">.</span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Linux</span></span></strong><strong><span style="Calibri;"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Using <span lang="EN-US">ICC</span> and <span lang="EN-US">TBB</span></span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Upload the <span lang="EN-US">Src</span> and <span lang="EN-US">Linux</span> directories from the archive to the server<span lang="EN-US">.</span></span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Change to the <span lang="EN-US">XStringMatch/Linux</span> directory and run <span lang="EN-US">make</span>.</span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">3.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">The executable, <span lang="EN-US">XStringMatch</span>, is in the <span lang="EN-US">XStringMatch/Bin</span> directory.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="small;"><span style="Calibri;">Other:</span></span></strong></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span style="Calibri;">Organizers, please test with the </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Win32</span></span></strong><strong><span style="Calibri;"> platform </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Release</span></span></strong><strong><span style="Calibri;"> build. Thank you!</span></strong></span></p>
<h1> </h1>
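<h1><span style="Calibri;">Optimization Conclusions</span></h1>
<p class="MsoNormal" style="2.0;"><span style="Calibri;"><span style="small;">Solving this problem deepened my understanding of string-matching algorithms. There are dozens of them, but most rest on the same principle: use the results of comparisons already made, together with information about the pattern itself, to slide the comparison window as far as possible.</span></span></p>
<p class="MsoNormal" style="2.0;"><span style="small;"><span style="Calibri;" lang="EN-US">TBB</span><span style="Calibri;"> provides <span lang="EN-US">concurrent_vector</span>, a high-performance, thread-safe parallel <span lang="EN-US">vector</span> class. When <span lang="EN-US">DNA</span> sequences are read in parallel, each thread must push the <span lang="EN-US">DNA</span> sequences it has read into a <span lang="EN-US">vector</span>, so the container is accessed concurrently. With <span lang="EN-US">std::vector</span> this requires explicit synchronization, whereas <span lang="EN-US">concurrent_vector</span> needs none, which is very convenient.</span></span></p>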
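<p>To make the contrast concrete, here is a minimal sketch of the <code>std::vector</code> alternative, where every push has to be guarded by a lock. It is not taken from the contest source: the function name <code>CollectSequences</code>, the placeholder strings standing in for DNA sequences, and the use of <code>std::thread</code> are all assumptions made for illustration.</p>

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: numThreads reader threads each push perThread
// placeholder "sequences" into one shared std::vector.
std::vector<std::string> CollectSequences(unsigned numThreads, unsigned perThread) {
    std::vector<std::string> sequences;
    std::mutex guard;  // std::vector::push_back is not thread-safe, so we lock
    std::vector<std::thread> readers;
    for (unsigned t = 0; t < numThreads; ++t) {
        readers.emplace_back([&sequences, &guard, t, perThread] {
            for (unsigned i = 0; i < perThread; ++i) {
                // Stand-in for "read one DNA sequence from the input".
                std::string seq = "seq-" + std::to_string(t) + "-" + std::to_string(i);
                std::lock_guard<std::mutex> lock(guard);
                sequences.push_back(std::move(seq));
            }
        });
    }
    for (auto& r : readers) r.join();
    return sequences;
}
```

<p>With <code>tbb::concurrent_vector</code> the mutex and the <code>lock_guard</code> would disappear, because its <code>push_back</code> may be called from several threads at once. In both variants the order of the elements depends on thread scheduling.</p>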
<p>To discuss the issues covered in this article with the author, please email 邓辉 at <a href="mailto:denghui0815@hotmail.com"><span style="#000000;">denghui0815@hotmail.com</span></a>.</p>
<h1> </h1>
<h1><span style="Calibri;">Acknowledgments</span></h1>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Thanks to </span><span lang="EN-US"><span style="宋体, MS Song;">Clay Breshears</span></span><span style="Calibri;"> for his answers on the forum, to </span><span lang="EN-US"><span style="宋体, MS Song;">Mu</span></span><span style="Calibri;">, </span><span lang="EN-US"><span style="宋体, MS Song;">Pryce</span></span><span style="Calibri;"> for the work of publishing this article on </span><span lang="EN-US"><span style="宋体, MS Song;">ISN</span></span><span style="Calibri;">, and to </span><span lang="EN-US"><span style="宋体, MS Song;">Xia, JeffX P</span></span><span style="Calibri;"> for the careful and thorough translation of this solution.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2009/06/10/400001953/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2009 Intel® Threading Challenge Entry Notes Series - Problem 3: Search</title>
		<link>http://software.intel.com/zh-cn/blogs/2009/05/27/400001855/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2009/05/27/400001855/#comments</comments>
		<pubDate>Wed, 27 May 2009 03:48:32 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[Blog Essay Column]]></category>
		<category><![CDATA[Threading Challenge Search TBB OpenMP]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2009/05/27/400001855/</guid>
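		<description><![CDATA[  Problem Description Problem description: write a threaded program that finds the positions of a given set of keys within a linearly stored, sorted set of unique keys. Each key is a string of 15 characters. The first input text file contains the sorted key set. The second input text file contains an arbitrary number of keys to be looked up in the key set from the first file. For every key in the second input file there should be a corresponding single line in the output file, containing the key and its index in the first file; indices start at 0. If a searched key is not in the first file, a not-found message is printed to the output file instead. The file names involved (the key set file, the file of keys to search for, and the output file) are given on the command line. File format: the first line of the first input file is an integer giving the number of keys in the sorted key set (denoted N here). It is followed by N lines of sorted keys, each a 15-character string. The second file contains the set of keys to look up, of arbitrary size; these keys are also 15-character strings, one per line. The output file should have the same number of lines as the second input file. For each key listed in the second file, the output file contains a corresponding line giving the searched key and either its index in the first file or a not-found message. For details of this problem, visit the Threading Challenge site for more information.   Analysis and Solution of the Search Problem Serial Algorithms This is a search problem, and there are many search algorithms to choose from, including sequential search, binary search, and HashMap. Sequential search compares the value being sought with each value in the array in turn; the search succeeds on the first equal value and fails if none is found. It is an O(N) algorithm and needs no extra space. Binary search works only on sorted data, so the data must be sorted first. It then compares the middle value with the value being sought; if they are equal the search succeeds, otherwise it recurses into one half of the data according to the comparison result. Binary search is an O(logN) algorithm and needs no extra space. The HashMap search algorithm uses a Hash function to compute Hash values that distribute the data across buckets; all data in one bucket shares the same Hash value. Different values with the same Hash value are called a collision, and collisions increase the amount of data in a bucket. A lookup applies the Hash function to the value being sought and performs a sequential search within the bucket selected by that Hash value, so in theory the algorithm is O(1). If the data set grows or the Hash function is poor, collisions increase, buckets fill up, and efficiency drops accordingly. If every value has the same Hash value, the algorithm degenerates into sequential search. Because the input to this problem is already-sorted strings, binary search is the natural first choice. It is O(logN), however, which in theory is slower than HashMap search; on the other hand a HashMap takes time to build, and its performance also depends on the data volume. The choice of algorithm therefore has to be adjusted to the data volume. Each key is 15 bytes and fits in an __m128i, so the XSearchItem structure is defined to represent a key: // Key data typedef struct tagXSearchItem {      union      {          char cVal[16];          uint16   nVal16[8];          uint32   nVal32[4];          uint64   nVal64[2];          __m128i nVal128;      };      tagXSearchItem(__m128i nV128)  { nVal128 = nV128; } }XSearchItem,*PXSEARCHITEM; Three versions of the algorithm were implemented. Binary search: see the XSearch_Binary_Serial function in XSearch_Binary.cpp. HashMap:  [...]]]></description>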
			<content:encoded><![CDATA[<p> </p>
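<h1><span style="Calibri;">Problem Description</span></h1>
<p class="MsoNormal" style="21pt;"><span style="Calibri;"><span style="small;">Problem description: write a threaded program that finds the positions of a given set of keys within a linearly stored, sorted set of unique keys. Each key is a string of <span lang="EN-US">15</span> characters. The first input text file contains the sorted key set. The second input text file contains an arbitrary number of keys to be looked up in the key set from the first file. For every key in the second input file there should be a corresponding single line in the output file, containing the key and its index in the first file; indices start at <span lang="EN-US">0</span>. If a searched key is not in the first file, a not-found message is printed to the output file instead. The file names involved (the key set file, the file of keys to search for, and the output file) are given on the command line.</span></span></p>
<p class="MsoNormal" style="21pt;"><span style="Calibri;"><span style="small;">File format: the first line of the first input file is an integer giving the number of keys in the sorted key set (denoted <span lang="EN-US">N</span> here). It is followed by <span lang="EN-US">N</span> lines of sorted keys, each a <span lang="EN-US">15</span>-character string. The second file contains the set of keys to look up, of arbitrary size; these keys are also <span lang="EN-US">15</span>-character strings, one per line.</span></span></p>
<p class="MsoNormal" style="21pt;"><span style="Calibri;"><span style="small;">The output file should have the same number of lines as the second input file. For each key listed in the second file, the output file contains a corresponding line giving the searched key and either its index in the first file or a not-found message.</span></span></p>
<p>For details of this problem, visit the Threading Challenge site for <a href="http://software.intel.com/zh-cn/contests/Threading-Challenge-2009/codecontest.php">more information</a>.</p>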
<h1> </h1>
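<h1><span style="Calibri;">Analysis and Solution of the Search Problem</span></h1>
<p><strong><span style="Calibri;">Serial Algorithms</span></strong></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">This is a search problem, and there are many search algorithms to choose from, including sequential search, binary search, and <span lang="EN-US">HashMap</span>.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Sequential search compares the value being sought with each value in the array in turn; the search succeeds on the first equal value and fails if none is found. It is an <span lang="EN-US">O(N)</span> algorithm and needs no extra space.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Binary search works only on sorted data, so the data must be sorted first. It then compares the middle value with the value being sought; if they are equal the search succeeds, otherwise it recurses into one half of the data according to the comparison result. Binary search is an <span lang="EN-US">O(logN)</span> algorithm and needs no extra space.</span></span></p>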
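<p>The binary search just described can be sketched as follows. This is a generic, iterative version over <code>std::string</code> keys, not the <code>XSearch_Binary_Serial</code> function from the contest source; the name <code>FindKey</code> and the convention of returning -1 for a missing key are assumptions for illustration.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the 0-based index of key in a sorted vector,
// or -1 if the key is not present.
long FindKey(const std::vector<std::string>& sorted, const std::string& key) {
    std::size_t lo = 0, hi = sorted.size();  // search the half-open range [lo, hi)
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        const int c = sorted[mid].compare(key);
        if (c == 0) return static_cast<long>(mid);  // found
        if (c < 0) lo = mid + 1;  // middle key is smaller: keep the upper half
        else hi = mid;            // middle key is larger: keep the lower half
    }
    return -1;
}
```

<p>Each iteration halves the range, which is where the O(logN) bound comes from. The loop replaces the recursion mentioned above without changing the algorithm.</p>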
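<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;">The <span lang="EN-US">HashMap</span> search algorithm uses a <span lang="EN-US">Hash</span> function to compute <span lang="EN-US">Hash</span> values that distribute the data across buckets; all data in one bucket shares the same <span lang="EN-US">Hash</span> value. Different values with the same <span lang="EN-US">Hash</span> value are called a collision, and collisions increase the amount of data in a bucket. A lookup applies the <span lang="EN-US">Hash</span> function to the value being sought and performs a sequential search within the bucket selected by that <span lang="EN-US">Hash</span> value, so in theory the algorithm is <span lang="EN-US">O(1)</span>. If the data set grows or the <span lang="EN-US">Hash</span> function is poor, collisions increase, buckets fill up, and efficiency drops accordingly. If every value has the same <span lang="EN-US">Hash</span> value, the algorithm degenerates into sequential search.</span></span></p>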
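<p>As a rough illustration of the hash-based lookup, the sketch below uses the standard <code>std::unordered_map</code> rather than the hand-written HashMap of the contest source, so the bucket handling is hidden inside the library. The names <code>KeyIndex</code>, <code>BuildIndex</code> and <code>Lookup</code> are assumptions for illustration.</p>

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Maps each key to its 0-based position in the input.
using KeyIndex = std::unordered_map<std::string, std::size_t>;

// Build phase: one insertion per key. This is the construction cost that
// binary search on already-sorted input does not have to pay.
KeyIndex BuildIndex(const std::vector<std::string>& keys) {
    KeyIndex index;
    index.reserve(keys.size());  // size the bucket array up front
    for (std::size_t i = 0; i < keys.size(); ++i) index.emplace(keys[i], i);
    return index;
}

// Lookup phase: hash the key, then scan only the bucket it falls into.
long Lookup(const KeyIndex& index, const std::string& key) {
    const auto it = index.find(key);
    return it == index.end() ? -1 : static_cast<long>(it->second);
}
```

<p>The average lookup cost is constant, but only after the whole table has been built, which is why the build time matters when the number of queries is small.</p>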
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Because the input to this problem is already-sorted strings, binary search is the natural first choice. It is <span lang="EN-US">O(logN)</span>, however, which in theory is slower than <span lang="EN-US">HashMap</span> search; on the other hand a <span lang="EN-US">HashMap</span> takes time to build, and its performance also depends on the data volume. The choice of algorithm therefore has to be adjusted to the data volume.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Each key is <span lang="EN-US">15</span> bytes and fits in an <span lang="EN-US">__m128i</span>, so the </span></span><span style="yes;" lang="EN-US">XSearchItem</span><span style="Calibri;"><span style="small;"> structure is defined to represent a key:</span></span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">// </span><span style="yes;">Key data</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">typedef</span><span style="yes;" lang="EN-US"> <span style="blue;">struct</span> <span style="#010001;">tagXSearchItem</span></span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">{</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span><span style="blue;">union</span></span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span>{</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="2;">         </span><span style="blue;">char</span><span style="1;"> </span><span style="#010001;">cVal</span>[16];</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="2;">         </span><span style="#010001;">uint16</span><span style="1;">   </span><span style="#010001;">nVal16</span>[8];</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="2;">         </span><span style="#010001;">uint32</span><span style="1;">   </span><span style="#010001;">nVal32</span>[4];</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="2;">         </span><span style="#010001;">uint64</span><span style="1;">   </span><span style="#010001;">nVal64</span>[2];</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="2;">         </span><span style="blue;">__m128i</span> <span style="#010001;">nVal128</span>;</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span>};</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US"><span style="1;">     </span><span style="#010001;">tagXSearchItem</span>(<span style="blue;">__m128i</span> <span style="#010001;">nV128</span>)<span style="yes;">  </span>{ <span style="#010001;">nVal128</span> = <span style="#010001;">nV128</span>; }</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="yes;" lang="EN-US">}<span style="#010001;">XSearchItem</span>,*<span style="#010001;">PXSEARCHITEM</span>;</span></p>
<p class="MsoNormal" style="none;" align="left"><span style="small;"><span style="Calibri;">Three versions of the algorithm were implemented.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span>Binary search: see the <span lang="EN-US">XSearch_Binary_Serial</span> function in <span lang="EN-US">XSearch_Binary.cpp</span>.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span>HashMap:</span><span style="Calibri;" lang="EN-US"><span style="yes;">  </span></span><span style="Calibri;">see the <span lang="EN-US">XSearch_HashMap_Serial</span> function in <span lang="EN-US">XSearch_HashMap.cpp</span>.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span>concurrent_hash_map: </span><span style="Calibri;">see the <span lang="EN-US">XSearch_HashMap_TBB_Serial</span> function in <span lang="EN-US">XSearch_HashMap_TBB.cpp</span>.</span></span></p>
<p><strong><span style="Calibri;">Parallel Algorithms</span></strong></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;">A parallel version of each of the three algorithms was implemented as well.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Binary search:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">Binary search is parallelized simply by using multiple threads, each handling a portion of the keys to look up.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">See the </span></span><span style="yes;" lang="EN-US">XSearch_Binary_Parallel</span><span style="Calibri;"><span style="small;"> function in <span lang="EN-US">XSearch_Binary.cpp</span>.</span></span></p>
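<p>A minimal sketch of this partitioning, assuming plain <code>std::thread</code> workers instead of whatever threading mechanism <code>XSearch_Binary_Parallel</code> actually uses. The name <code>ParallelSearch</code>, the <code>numThreads</code> parameter and the -1 "not found" value are hypothetical.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: looks up every query in the sorted key set and returns
// one index per query (-1 when the key is absent), in query order.
std::vector<long> ParallelSearch(const std::vector<std::string>& sorted,
                                 const std::vector<std::string>& queries,
                                 unsigned numThreads) {
    std::vector<long> result(queries.size(), -1);
    if (numThreads == 0) numThreads = 1;
    // Each thread gets one contiguous slice of the queries.
    const std::size_t chunk = (queries.size() + numThreads - 1) / numThreads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin = std::min(queries.size(), t * chunk);
        const std::size_t end = std::min(queries.size(), begin + chunk);
        if (begin == end) break;  // no work left for further threads
        workers.emplace_back([&sorted, &queries, &result, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                const auto it = std::lower_bound(sorted.begin(), sorted.end(), queries[i]);
                if (it != sorted.end() && *it == queries[i])
                    result[i] = static_cast<long>(it - sorted.begin());
            }
        });
    }
    for (auto& w : workers) w.join();
    return result;
}
```

<p>The key set is only read and every thread writes to its own slots of <code>result</code>, so no locking is needed. Keeping results in query order also makes it easy to emit the output lines in the order the problem requires.</p>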
<p class="MsoNormal" style="150%;"><span style="small;"><span>HashMap:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">In the build phase, each thread builds its own <span lang="EN-US">HashMap</span> with the same number of buckets; once they are built, multiple threads merge the buckets. Multiple threads then each handle a portion of the keys to look up, which gives the parallel <span lang="EN-US">HashMap</span> search.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">See the <span lang="EN-US">XSearch_HashMap_</span></span></span><span style="yes;" lang="EN-US">Parallel</span><span style="Calibri;"><span style="small;"> function in <span lang="EN-US">XSearch_HashMap.cpp</span>.</span></span></p>
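<p>The two-phase build can be sketched like this. It is an illustration under assumptions, not the code of <code>XSearch_HashMap_Parallel</code>: buckets are plain vectors, <code>std::hash</code> stands in for the contest's hash function, and the names <code>Entry</code>, <code>Buckets</code>, <code>BuildParallel</code> and <code>Find</code> are hypothetical. <code>bucketCount</code> must be greater than zero.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, std::size_t>;  // key and its input index
using Buckets = std::vector<std::vector<Entry>>;     // one vector per bucket

Buckets BuildParallel(const std::vector<std::string>& keys,
                      std::size_t bucketCount, unsigned numThreads) {
    if (numThreads == 0) numThreads = 1;
    const std::hash<std::string> hasher;
    // Phase 1: each thread fills a private table from its own slice of keys.
    std::vector<Buckets> partial(numThreads, Buckets(bucketCount));
    const std::size_t chunk = (keys.size() + numThreads - 1) / numThreads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t begin = std::min(keys.size(), t * chunk);
            const std::size_t end = std::min(keys.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i)
                partial[t][hasher(keys[i]) % bucketCount].emplace_back(keys[i], i);
        });
    }
    for (auto& w : workers) w.join();
    workers.clear();
    // Phase 2: thread t merges buckets t, t + numThreads, ... of all tables.
    Buckets merged(bucketCount);
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t b = t; b < bucketCount; b += numThreads)
                for (const Buckets& p : partial)
                    merged[b].insert(merged[b].end(), p[b].begin(), p[b].end());
        });
    }
    for (auto& w : workers) w.join();
    return merged;
}

// Sequential scan of the single bucket selected by the key's hash value.
long Find(const Buckets& table, const std::string& key) {
    const auto& bucket = table[std::hash<std::string>{}(key) % table.size()];
    for (const Entry& e : bucket)
        if (e.first == key) return static_cast<long>(e.second);
    return -1;
}
```

<p>Because every private table uses the same bucket count, bucket b of the merged table is just the concatenation of bucket b from each thread, and different buckets can be merged independently. The price is the extra memory for the private tables.</p>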
<p class="MsoNormal" style="150%;"><span style="small;"><span>concurrent_hash_map:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="small;"><span style="Calibri;" lang="EN-US">concurrent_hash_map</span><span style="Calibri;"> is itself a data structure that supports concurrent use, so we only need to start multiple threads that insert data into the <span lang="EN-US">concurrent_hash_map</span> simultaneously. Once the <span lang="EN-US">concurrent_hash_map</span> is built, multiple threads each handle a portion of the keys to look up, which gives the parallel <span lang="EN-US">concurrent_hash_map</span> search.</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">See the <span lang="EN-US">XSearch_HashMap_TBB_</span></span></span><span style="yes;" lang="EN-US">Parallel</span><span style="Calibri;"><span style="small;"> function in <span lang="EN-US">XSearch_HashMap_TBB.cpp</span>.</span></span></p>
<h1> </h1>
<h1><span style="Calibri;">Optimization Tools</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Hotspots</span></span></strong><strong><span style="Calibri;"> Analysis</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"> <span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile Hotspots_1.JPG" alt="" width="303" height="44" /></span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The <span lang="EN-US">Hotspots</span> analysis in <span lang="EN-US">Intel Amplifier</span> was used to find the hotspot functions of the binary search algorithm, with the following result:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><span lang="EN-US"><img src="http://software.intel.com/file/19158" alt="" width="727" height="400" /></span></span></span></p>
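<p class="MsoNormal" style="150%;"> <span style="Calibri;"><span style="small;">Nearly all of the time in binary search is spent in the key comparison function </span></span><span style="yes;" lang="EN-US">XCompareItem</span><span style="Calibri;"><span style="small;">. The string comparison function used initially was replaced with an assembly version that converts <span lang="EN-US">4</span> bytes of data at a time into an <span lang="EN-US">int</span> value via <span lang="EN-US">bswap</span> before comparing, which improved program performance substantially.</span></span></p>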
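<p>The idea behind the bswap comparison can be shown in portable C++. This sketch is not the assembly routine from the contest source: <code>LoadBigEndian32</code> and <code>CompareKey16</code> are hypothetical names, and the shifts stand in for a 32-bit load followed by <code>bswap</code> on a little-endian x86 CPU.</p>

```cpp
#include <cstdint>

// Assembles 4 bytes into a 32-bit value with the first byte most significant.
// On little-endian x86 a 32-bit load followed by bswap yields the same value.
inline std::uint32_t LoadBigEndian32(const unsigned char* p) {
    return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
           (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Compares two 16-byte keys four bytes at a time.
// Returns a negative value, zero, or a positive value, like memcmp.
int CompareKey16(const char* a, const char* b) {
    const unsigned char* ua = reinterpret_cast<const unsigned char*>(a);
    const unsigned char* ub = reinterpret_cast<const unsigned char*>(b);
    for (int i = 0; i < 16; i += 4) {
        const std::uint32_t x = LoadBigEndian32(ua + i);
        const std::uint32_t y = LoadBigEndian32(ub + i);
        if (x != y) return x < y ? -1 : 1;  // first differing word decides
    }
    return 0;
}
```

<p>Putting the first byte in the most significant position is what makes an unsigned integer comparison agree with byte-wise lexicographic order, so four comparisons replace up to sixteen.</p>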
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The <span lang="EN-US">Hotspots</span> analysis in <span lang="EN-US">Intel Amplifier</span> was used to find the hotspot functions of the <span lang="EN-US">HashMap</span> algorithm, with the following result:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><span lang="EN-US"><img src="http://software.intel.com/file/19159" alt="" width="730" height="376" /></span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"> <span style="Calibri;"><span style="small;">Most of the time is spent inside the function <span lang="EN-US">XSearch_HashMap_</span></span></span><span style="yes;" lang="EN-US">Parallel</span><span style="Calibri;"><span style="small;">.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Concurrency</span></span></strong><strong><span style="Calibri;"> Analysis</span></strong></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile Concurrency_1.JPG" alt="" width="306" height="40" /></span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;">The <span lang="EN-US">Concurrency</span> analysis in <span lang="EN-US">Intel Amplifier</span> was used to find code that could be optimized through parallelization, with the following result:</span></span></p>
<p class="MsoNormal" style="150%;"><span style="Calibri;"><span style="small;"><span lang="EN-US"><img src="http://software.intel.com/file/19160" alt="" width="721" height="397" /></span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"> <span style="Calibri;"><span style="small;">The results show that the binary search algorithm runs almost fully in parallel.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"> <span style="small;"><span style="Calibri;">The results also show a fair amount of serial code inside </span><span style="Calibri;" lang="EN-US">XSearch_HashMap_</span><span style="yes;" lang="EN-US">Parallel</span><span style="yes;">. Drilling into the function reveals the following serially executed lines:</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="yes;"><img src="http://software.intel.com/file/19161" alt="" width="722" height="380" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="yes;"><img src="http://software.intel.com/file/19162" alt="" width="724" height="130" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"> </p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="yes;"><img src="http://software.intel.com/file/19163" alt="" width="725" height="160" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="Calibri;"><span style="small;">All of the serial execution time is the cost of allocating and freeing memory. Because many <span lang="EN-US">HashMap</span> instances have to be built, memory demand is high and the time cost is significant, about <span lang="EN-US">150 ms</span>.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="Calibri;" lang="EN-US"><span style="small;"> </span></span><span style="small;"><strong><span style="Calibri;" lang="EN-US">Locks and Waits</span></strong><strong><span style="Calibri;"> Analysis</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="Calibri;"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile Locks and Waits_1.JPG" alt="" width="308" height="41" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="Calibri;"><span style="small;">The <span lang="EN-US">Locks and Waits</span> analysis in <span lang="EN-US">Intel Amplifier</span> was used to measure the lock and synchronization wait overhead of the two algorithms, with the following results:<img src="http://software.intel.com/file/19164" alt="" width="733" height="433" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="Calibri;"><span style="small;"><span lang="EN-US"><img src="http://software.intel.com/file/19165" alt="" width="730" height="442" /></span></span></span></p>
<p class="MsoNormal" style="150%;"> <span style="Calibri;"><span style="small;">The results show that neither algorithm suffers from significant synchronization or lock overhead.</span></span></p>
<h1> </h1>
<h1><span style="Calibri;">Other Optimizations</span></h1>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Read the data in parallel using memory mapping combined with <span lang="EN-US">OpenMP</span>.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Load each key with <span lang="EN-US">_mm_loadu_si128</span>.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">3.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Compute the space needed for the output of each data block, then write the data in parallel using memory mapping combined with <span lang="EN-US">OpenMP</span>.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">4.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Implement the binary search comparison function in assembly.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">5.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Use the <span lang="EN-US">nVal64[2]</span> member of the <span lang="EN-US">XSearchItem</span> structure to test whether two keys are equal.</span></span></p>
<p class="MsoNormal" style="list 18.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">6.</span><span style="7pt &quot;Times New Roman&quot;;">     </span></span></span><span style="Calibri;"><span style="small;">Automatically choose between binary search and <span lang="EN-US">HashMap</span> search according to the sizes of the key set and of the set of keys to look up.</span></span></p>
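<p>Item 3 in the list above depends on knowing where each block's output starts before any thread writes. One common way to get those positions is a running sum over the per-block sizes. The sketch below shows only that step, under the assumption that the sizes have already been counted; <code>BlockOffsets</code> is a hypothetical name and the memory-mapped writing itself is not shown.</p>

```cpp
#include <cstddef>
#include <vector>

// Given the number of output bytes each block will produce, returns the
// starting offset of every block in the output file. The extra final element
// is the total size, i.e. the length the output file must have.
std::vector<std::size_t> BlockOffsets(const std::vector<std::size_t>& blockBytes) {
    std::vector<std::size_t> offsets(blockBytes.size() + 1, 0);
    for (std::size_t i = 0; i < blockBytes.size(); ++i)
        offsets[i + 1] = offsets[i] + blockBytes[i];  // running (prefix) sum
    return offsets;
}
```

<p>Once the offsets are known, block i owns the byte range from <code>offsets[i]</code> up to but not including <code>offsets[i + 1]</code>, so the blocks can be written concurrently without overlapping.</p>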
<h1> </h1>
<h1><span style="Calibri;">Performance Tests</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="small;"><span style="Calibri;">Small data set tests:</span></span></strong></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Operating system:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="1;">    </span>32bit</span></span><span style="Calibri;"> tests were run on </span><span lang="EN-US"><span style="宋体, MS Song;">32</span></span><span style="Calibri;">-bit </span><span lang="EN-US"><span style="宋体, MS Song;">XP</span></span><span style="Calibri;">.</span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span lang="EN-US"><span style="宋体, MS Song;">CPU</span></span><span style="Calibri;">:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">         </span>Intel(R) Core(TM)2 CPU<span style="yes;">  </span>5270<span style="yes;">  </span>@ 1.40GHz</span></span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Memory:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">         </span>1G</span></span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Time unit:</span><span lang="EN-US"><span style="yes;"><span style="宋体, MS Song;">    </span></span></span><span style="Calibri;">seconds</span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Test data</span><span lang="EN-US"><span style="宋体, MS Song;">:<span style="1;">     </span>DataCount_KeyCount</span></span></span></p>
<div>
<table class="MsoNormalTable" style=".5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="yes;">
<td style="windowtext 1pt solid;" width="120">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Test data</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Binary search</span></span></strong></p>
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">serial</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Binary search</span></span></strong></p>
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">parallel</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Speedup</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;" lang="EN-US"><span style="small;">HashMap</span></span></strong></p>
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">serial</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;" lang="EN-US"><span style="small;">HashMap</span></span></strong></p>
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">parallel</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Speedup</span></span></strong></p>
</td>
</tr>
<tr style="1;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">100k_100k</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.038339</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.024409</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.57</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.031137</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.032812</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.95</span></span></p>
</td>
</tr>
<tr style="2;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">100k_200k</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.075259</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.045982</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.64</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.055593</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.050704</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.10</span></span></p>
</td>
</tr>
<tr style="3;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">100k_1M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.362107</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.218493</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.66</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.259891</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.167116</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.56</span></span></p>
</td>
</tr>
<tr style="4;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">1M</span><span style="Calibri;" lang="EN-US">_1M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.700737</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.427367</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.64</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.500109</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.403246</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.24</span></span></p>
</td>
</tr>
<tr style="5;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">1M</span><span style="Calibri;" lang="EN-US">_2M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.497653</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.853737</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.75</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.857886</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.603892</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.42</span></span></p>
</td>
</tr>
<tr style="6;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">1M</span><span style="Calibri;" lang="EN-US">_5M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">3.437704</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">2.069611</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.66</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.867009</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.189546</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.57</span></span></p>
</td>
</tr>
<tr style="7;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">5M</span><span style="Calibri;" lang="EN-US">_1M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.007964</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">0.616225</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.62</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">2.003100</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.257592</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.59</span></span></p>
</td>
</tr>
<tr style="8;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">5M</span><span style="Calibri;" lang="EN-US">_2M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">2.005481</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.207397</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.66</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">2.989992</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.721885</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.73</span></span></p>
</td>
</tr>
<tr style="yes;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;" lang="EN-US">5M</span><span style="Calibri;" lang="EN-US">_5M</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="79">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">4.876103</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="78">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">2.981140</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="65">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.64</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="80">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">5.893172</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="83">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">3.201393</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="74">
<p class="MsoNormal" style="center;" align="center"><span style="Calibri;" lang="EN-US"><span style="small;">1.84</span></span></p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="small;"><span style="Calibri;">Large data set test:</span></span></strong></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Operating system:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="1;">    </span>Red Hat Enterprise Linux AS release 4 (Nahant Update 2)</span></span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span lang="EN-US"><span style="宋体, MS Song;">CPU</span></span><span style="Calibri;">:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">         </span>Intel(R) Core(TM)2 CPU<span style="yes;">  </span>6320<span style="yes;">  </span>@ 1.86GHz</span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Memory:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2;">         </span>4 GB</span></span></span></p>
<div>
<table class="MsoNormalTable" style=".5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="yes;">
<td style="windowtext 1pt solid;" width="120">
<p class="MsoNormal" style="center;" align="center"><strong><span style="Calibri;"><span style="small;">Test data</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">10M</span><span lang="EN-US">_10M</span><strong></strong></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="100">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">10M</span><span lang="EN-US">_20M</span><strong></strong></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="98">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">20M</span><span lang="EN-US">_10M</span><strong></strong></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="107">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">20M</span><span lang="EN-US">_20M</span><strong></strong></span></span></p>
</td>
</tr>
<tr style="yes;">
<td style="solid windowtext .5pt;" width="120">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="Calibri;">Test result</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">6.812783</span></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="100">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">13.501797</span></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="98">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">7.963913</span></span></span></p>
</td>
<td style="solid windowtext .5pt;" width="107">
<p class="MsoNormal" style="center;" align="center"><span style="small;"><span style="宋体, MS Song;"><span lang="EN-US">18.479586</span></span></span></p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span></p>
<h1> </h1>
<h1><span style="Calibri;">Build Instructions</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Windows</span></span></strong><strong><span style="Calibri;"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Uses <span lang="EN-US">VS2008</span> and <span lang="EN-US">Intel Parallel Studio</span>.</span></span></p>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">1. </span><span style="Calibri;">Open the project in <span lang="EN-US">VS2008</span>.</span></span></p>
<p class="MsoNormal" style="2.0;"><span style="small;"><span style="Calibri;" lang="EN-US">2. </span><span style="Calibri;">Select the <span lang="EN-US">X64</span> platform and build the <span lang="EN-US">Release</span> configuration.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt 21pt;"><span style="small;"><span style="Calibri;" lang="EN-US">3. </span><span style="Calibri;">Go to the <span lang="EN-US">Bin</span> directory; the executable is <span lang="EN-US">XSearch.exe</span>.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Linux</span></span></strong><strong><span style="Calibri;"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><span style="Calibri;">Uses <span lang="EN-US">ICC</span> and <span lang="EN-US">TBB</span>.</span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">1.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Upload the <span lang="EN-US">Src</span> and <span lang="EN-US">Linux</span> directories from the archive to the server.</span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">2.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Go to the <span lang="EN-US">XSearch/Linux</span> directory and run <span lang="EN-US">make</span>.</span></span></p>
<p class="MsoNormal" style="list 39.0pt;"><span style="Calibri;" lang="EN-US"><span style="Ignore;"><span style="small;">3.</span><span style="7pt &quot;Times New Roman&quot;;"> </span></span></span><span style="Calibri;"><span style="small;">Go to the <span lang="EN-US">XSearch/Bin</span> directory; the executable is <span lang="EN-US">XSearch</span>.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="small;"><span style="Calibri;">Other:</span></span></strong></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><span style="small;"><strong><span style="Calibri;">Organizers, please test with the </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Win32</span></span></strong><strong><span style="Calibri;"> platform </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Release</span></span></strong><strong><span style="Calibri;"> build, which requires input files in </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Linux</span></span></strong><strong><span style="Calibri;"> format. Please run in the default </span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">Auto Parallel Mode</span></span></strong><strong><span style="Calibri;"> [</span><span style="red;" lang="EN-US"><span style="宋体, MS Song;">ap</span></span></strong><strong><span style="Calibri;">] mode. Thank you!</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt;"><strong><span style="red;" lang="EN-US"><span style="宋体, MS Song;"> </span></span></strong></p>
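<h1><span style="Calibri;">Optimization Conclusions</span></h1>
<p class="MsoNormal" style="2.0;"><span style="Calibri;"><span style="small;">Solving this problem let me study search algorithms further and gave me a deeper understanding of the strengths, weaknesses, and performance differences of binary search and <span lang="EN-US">HashMap</span> on multi-core systems.</span></span></p>
<p class="MsoNormal" style="2.0;"><span style="Calibri;"><span style="small;">I also learned to use the <span lang="EN-US">concurrent_hash_map</span> provided by <span lang="EN-US">TBB</span>, a high-performance, thread-safe concurrent <span lang="EN-US">HashMap</span> class. The data in this problem is already sorted, and queries start only after the <span lang="EN-US">HashMap</span> has been fully built, so there is no concurrent, repeated mix of insertions and lookups. Using it for this problem is therefore convenient but brings no efficiency advantage. For real-world software development, I strongly recommend it.</span></span></p>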
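<p>The trade-off described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names <code>ContainsSorted</code> and <code>BuildIndex</code> are hypothetical, the key type is assumed to be a 64-bit integer, and <code>std::unordered_set</code> stands in for a hash map (it is not TBB's <code>concurrent_hash_map</code>).</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper: look up a key in data that is already sorted.
// No extra structure has to be built; each query costs O(log n).
bool ContainsSorted(const std::vector<std::uint64_t>& sorted_keys, std::uint64_t key) {
    assert(std::is_sorted(sorted_keys.begin(), sorted_keys.end()));  // precondition
    return std::binary_search(sorted_keys.begin(), sorted_keys.end(), key);
}

// Hypothetical helper: build a hash set once, then answer queries in O(1) on average.
// The build pass is extra work that binary search on sorted input does not need.
std::unordered_set<std::uint64_t> BuildIndex(const std::vector<std::uint64_t>& keys) {
    return std::unordered_set<std::uint64_t>(keys.begin(), keys.end());
}
```

<p>Both structures are read-only once the queries start, so concurrent lookups need no locking in either case, which is consistent with the observation that a concurrent container is not required here.</p>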
<p class="MsoNormal" style="2.0;"><span style="Calibri;"><span style="small;">In addition, compiling <span lang="EN-US">XSearch_HashMap_TBB.cpp</span> on <span lang="EN-US">Linux</span> with <span lang="EN-US">l_cc_p_10.1.015</span> produced a compile error. For lack of time I did not investigate the cause, so I do not know whether it is a <span lang="EN-US">bug</span>. The error is:</span></span></p>
<p class="MsoNormal" style="2.0;"><span style="Calibri;" lang="EN-US"><span style="small;">error: unknown opcode "pause;" -- __asm</span></span></p>
<p>If you would like to discuss the issues covered in this article with the author, please email 邓辉 at <a href="mailto:denghui0815@hotmail.com"><span style="#000000;">denghui0815@hotmail.com</span></a>.</p>
<h1> </h1>
<h1><span style="Calibri;">Acknowledgments</span></h1>
<p class="MsoNormal" style="21pt;"><span style="small;"><span style="Calibri;">Thanks to </span><span lang="EN-US"><span style="宋体, MS Song;">Clay Breshears</span></span><span style="Calibri;"> for his answers, to </span><span lang="EN-US"><span style="宋体, MS Song;">Mu</span></span><span style="Calibri;">, </span><span lang="EN-US"><span style="宋体, MS Song;">Pryce</span></span><span style="Calibri;"> for the work of publishing this article on </span><span lang="EN-US"><span style="宋体, MS Song;">ISN</span></span><span style="Calibri;">, and to </span><span lang="EN-US"><span style="宋体, MS Song;">Xia, JeffX P</span></span><span style="Calibri;"> for the careful and thorough translation of my solution.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2009/05/27/400001855/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2009 Intel® Threading Challenge Entry Notes Series - Problem 2: 3SAT</title>
		<link>http://software.intel.com/zh-cn/blogs/2009/05/20/2009-3sat/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2009/05/20/2009-3sat/#comments</comments>
		<pubDate>Wed, 20 May 2009 06:17:20 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[博客征文专栏]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2009/05/20/2009-3sat/</guid>
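		<description><![CDATA[  Problem Description The satisfiability problem asks whether an expression in conjunctive normal form, a conjunction of disjunctive clauses over Boolean variables, can evaluate to TRUE. To solve it, that is, to make the whole expression TRUE, you must first determine whether a suitable assignment of TRUE or FALSE to the Boolean variables exists. Three-satisfiability (3SAT) is a variant of the satisfiability problem that restricts each disjunctive clause to 3 variables. The following is an example of a 3SAT expression: (X_1 &#124; !X_2 &#124; X_3) &#38; (X_3 &#124; X_2 &#124; !X_1) &#38; (X_2 &#124; X_1 &#124; !X_3) &#38; (!X_1 &#124; !X_2 &#124; !X_3) Here “&#124;” denotes logical OR, “&#38;” denotes logical AND, and “!” denotes logical NOT. When X_1 and X_2 are TRUE and X_3 is FALSE, the expression evaluates to TRUE. Write a threaded program that first reads a 3SAT problem instance from a text file and then determines whether there is an assignment of Boolean values that makes the given expression TRUE. If such an assignment exists, the result should be printed to standard output and the variable assignments written to a text file. If no assignment exists, the output file is omitted. The input and output file names are the first and second command-line arguments of the application. File format: the first line of the input file contains two integers: the maximum number of Boolean variables in the expression (denoted N) and the number of disjunctive clauses in the file (denoted K). Each of the next K lines contains three integers whose absolute values lie in [1, N], separated by single spaces. These integers are the indices of the Boolean variables; a negative value means the variable is negated in the clause. If an assignment exists that makes the entire input expression TRUE, the output file contains N lines. Each line gives the index of one of the N variables and the Boolean value assigned to it. Line format: an integer in [1, N], a space, and the character “T” or “F” for TRUE or FALSE. For details on this problem, see the Threading Challenge site for more information.   Analysis and Solution of 3SAT The contest source code for this problem can be downloaded here. Serial Algorithms The satisfiability problem is NP-complete. It can be tackled with local search algorithms (GWSAT, WALKSAT) or global search algorithms (DPLL). For satisfiable instances, local search usually finds a solution quickly, but it cannot determine that an instance is unsatisfiable. Global search can prove that an instance is unsatisfiable, but it is far slower than local search. The WALKSAT local search algorithm: 1.  Randomly assign a value to every variable, then count the unsatisfied clauses and, for each satisfied clause, the number of its literals that are true. [...]]]></description>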
			<content:encoded><![CDATA[<p> </p>
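<h1><span style="Calibri">Problem Description</span></h1>
<p class="MsoNormal" style="21pt"><span style="Calibri"><span style="small;">The satisfiability problem asks whether an expression in conjunctive normal form, a conjunction of disjunctive clauses over Boolean variables, can evaluate to <span lang="EN-US">TRUE</span>. To solve it, that is, to make the whole expression <span lang="EN-US">TRUE</span>, you must first determine whether a suitable assignment of <span lang="EN-US">TRUE</span> or <span lang="EN-US">FALSE</span> to the Boolean variables exists. Three-satisfiability (<span lang="EN-US">3SAT</span>) is a variant of the satisfiability problem that restricts each disjunctive clause to <span lang="EN-US">3</span> variables. The following is an example of a <span lang="EN-US">3SAT</span> expression:</span></span></p>
<p class="MsoNormal" style="21pt"><span style="small;"><span style="Calibri" lang="EN-US">(X_1 | !X_2 | X_3) &amp; (X_3 | X_2 | !X_1) &amp; (X_2 | X_1 | !X_3) &amp; (!X_1 | !X_2 | !X_3) </span><span style="Calibri">Here “<span lang="EN-US">|</span>” denotes logical <span lang="EN-US">OR</span>, “<span lang="EN-US">&amp;</span>” denotes logical <span lang="EN-US">AND</span>, and “<span lang="EN-US">!</span>” denotes logical <span lang="EN-US">NOT</span>. When <span lang="EN-US">X_1</span> and <span lang="EN-US">X_2</span> are <span lang="EN-US">TRUE</span> and <span lang="EN-US">X_3</span> is <span lang="EN-US">FALSE</span>, the expression evaluates to <span lang="EN-US">TRUE</span>.</span></span></p>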
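<p>The claim about the example expression can be checked mechanically. The sketch below is an illustration, not part of the contest code: the names <code>Clause</code>, <code>IsSatisfied</code>, and <code>ExampleFormula</code> are hypothetical, and literals are assumed to be encoded as signed integers (+i for X_i, -i for !X_i).</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// A clause is three literals; literal +i means X_i and -i means !X_i (i >= 1).
using Clause = std::array<int, 3>;

// Returns true if every clause has at least one true literal.
// assignment[i] holds the value of X_i; index 0 is unused.
bool IsSatisfied(const std::vector<Clause>& clauses, const std::vector<bool>& assignment) {
    for (const Clause& clause : clauses) {
        bool clause_true = false;
        for (int literal : clause) {
            assert(literal != 0);
            const std::size_t index = static_cast<std::size_t>(std::abs(literal));
            assert(index < assignment.size());
            if (assignment[index] == (literal > 0)) clause_true = true;
        }
        if (!clause_true) return false;
    }
    return true;
}

// The example expression from the text, with clauses in the same order.
std::vector<Clause> ExampleFormula() {
    return {Clause{1, -2, 3}, Clause{3, 2, -1}, Clause{2, 1, -3}, Clause{-1, -2, -3}};
}
```

<p>Calling <code>IsSatisfied(ExampleFormula(), {false, true, true, false})</code> evaluates the assignment X_1 = TRUE, X_2 = TRUE, X_3 = FALSE; setting all three variables to TRUE instead violates the last clause.</p>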
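<p class="MsoNormal" style="21pt"><span style="Calibri"><span style="small;">Write a threaded program that first reads a <span lang="EN-US">3SAT</span> problem instance from a text file and then determines whether there is an assignment of Boolean values that makes the given expression <span lang="EN-US">TRUE</span>.</span></span></p>
<p class="MsoNormal" style="21pt"><span style="Calibri"><span style="small;">If such an assignment exists, the result should be printed to standard output and the variable assignments written to a text file. If no assignment exists, the output file is omitted. The input and output file names are the first and second command-line arguments of the application.</span></span></p>
<p class="MsoNormal" style="21pt"><span style="Calibri"><span style="small;">File format: the first line of the input file contains two integers: the maximum number of Boolean variables in the expression (denoted <span lang="EN-US">N</span>) and the number of disjunctive clauses in the file (denoted <span lang="EN-US">K</span>). Each of the next <span lang="EN-US">K</span> lines contains three integers whose absolute values lie in <span lang="EN-US">[1, N]</span>, separated by single spaces. These integers are the indices of the Boolean variables; a negative value means the variable is negated in the clause.</span></span></p>
<p class="MsoNormal" style="21pt"><span style="Calibri"><span style="small;">If an assignment exists that makes the entire input expression <span lang="EN-US">TRUE</span>, the output file contains <span lang="EN-US">N</span> lines. Each line gives the index of one of the <span lang="EN-US">N</span> variables and the Boolean value assigned to it. Line format: an integer in <span lang="EN-US">[1, N]</span>, a space, and the character “<span lang="EN-US">T</span>” or “<span lang="EN-US">F</span>” for <span lang="EN-US">TRUE</span> or <span lang="EN-US">FALSE</span>.</span></span></p>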
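<p>A minimal sketch of reading and writing these two formats follows. It is an assumption-laden illustration rather than the contest implementation: <code>Problem</code>, <code>ReadProblem</code>, and <code>WriteAssignment</code> are hypothetical names, and the functions take generic streams so that either a file stream or a string stream can be passed in.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

using Clause = std::array<int, 3>;

struct Problem {
    int variable_count = 0;       // N
    std::vector<Clause> clauses;  // K clauses of three signed indices
};

// Reads "N K" followed by K lines of three signed indices.
// Returns false if the input is malformed or an index lies outside [1, N].
bool ReadProblem(std::istream& in, Problem& problem) {
    int n = 0, k = 0;
    if (!(in >> n >> k) || n < 1 || k < 0) return false;
    problem.variable_count = n;
    problem.clauses.clear();
    for (int i = 0; i < k; ++i) {
        Clause clause{};
        for (int& literal : clause) {
            if (!(in >> literal)) return false;
            if (literal == 0 || std::abs(literal) > n) return false;
        }
        problem.clauses.push_back(clause);
    }
    return true;
}

// Writes one line per variable: the index, a space, then 'T' or 'F'.
// assignment[0] is unused so that assignment[i] is the value of X_i.
void WriteAssignment(std::ostream& out, const std::vector<bool>& assignment) {
    assert(!assignment.empty());
    for (std::size_t i = 1; i < assignment.size(); ++i) {
        out << i << ' ' << (assignment[i] ? 'T' : 'F') << '\n';
    }
}
```

<p>In a real program the streams would be a <code>std::ifstream</code> and <code>std::ofstream</code> opened on the first and second command-line arguments, and the output file would be created only when a satisfying assignment exists.</p>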
<p>For details on this problem, see the Threading Challenge site for <a href="http://software.intel.com/zh-cn/contests/Threading-Challenge-2009/codecontest.php">more information</a>.</p>
<h1> </h1>
<h1><span style="Calibri">Analysis and Solution of 3SAT</span></h1>
<p>The contest source code for this problem can be downloaded <a href="http://d.download.csdn.net/down/1314273/denghui0815"><span style="medium;">here</span></a>.</p>
<p><strong><span style="Calibri">Serial Algorithms</span></strong></p>
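<p class="MsoNormal" style="21pt"><span style="Calibri"><span style="small;">The satisfiability problem is NP-complete. It can be tackled with local search algorithms (<span lang="EN-US">GWSAT</span>, <span lang="EN-US">WALKSAT</span>) or global search algorithms (<span lang="EN-US">DPLL</span>). For satisfiable instances, local search usually finds a solution quickly, but it cannot determine that an instance is unsatisfiable. Global search can prove that an instance is unsatisfiable, but it is far slower than local search.</span></span></p>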
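<p class="MsoNormal" style="21pt"><span style="small;"><span style="Calibri">The </span><span style="Calibri" lang="EN-US">WALKSAT</span><span style="Calibri"> local search algorithm:</span></span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">1.<span style="7pt 'Times New Roman'">  </span></span>Randomly assign a value to every variable, then count the unsatisfied clauses and, for each satisfied clause, the number of its literals that are true.</span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">2.<span style="7pt 'Times New Roman'">  </span></span>Select a variable.</span></p>
<p class="MsoNormal" style="3.72gd"><span>There are many ways to make the selection. For example, pick a random clause from the unsatisfied clauses, then choose from it the variable whose flip turns the largest number of unsatisfied clauses into satisfied ones. (The optimized algorithm adds a randomized strategy: with some probability it returns an arbitrary variable of that clause instead.) For other selection heuristics, see <span lang="EN-US">walksat + tabu</span>, <span lang="EN-US">novelty</span>, and <span lang="EN-US">rnovelty</span><span style="small;">.</span></span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">3.<span style="7pt 'Times New Roman'">  </span></span>Flip that variable and update the set of unsatisfied clauses and the true-literal counts of the satisfied clauses.</span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">4.<span style="7pt 'Times New Roman'">  </span></span>If all clauses are satisfied, output the solution; otherwise go back to step <span lang="EN-US">2</span>.</span></p>
<p class="MsoNormal" style="list 39.0pt"><span style="Calibri"><span style="small;">The local search is given a maximum number of variable flips <span lang="EN-US">N</span> (a separate limit, not the variable count from the file format). If no solution has been found after <span lang="EN-US">N</span> flips, the problem is treated as unsatisfiable. The larger <span lang="EN-US">N</span> is, the lower the probability of misjudging a satisfiable problem as unsatisfiable, but that probability never reaches zero.</span></span></p>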
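<p>The four steps and the flip limit above can be sketched as follows. This is a simplified illustration and not the author's adaptation of <code>walksat</code>: it recounts the unsatisfied clauses from scratch instead of maintaining incremental counters, it picks the variable that leaves the fewest clauses unsatisfied, and the names <code>WalkSat</code>, <code>Unsatisfied</code>, and the <code>noise</code> parameter are assumptions made for this example.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <random>
#include <vector>

using Clause = std::array<int, 3>;  // +i means X_i, -i means !X_i

static bool LiteralTrue(int literal, const std::vector<bool>& a) {
    return a[static_cast<std::size_t>(std::abs(literal))] == (literal > 0);
}

// Indices of the clauses that the assignment leaves unsatisfied.
static std::vector<std::size_t> Unsatisfied(const std::vector<Clause>& f,
                                            const std::vector<bool>& a) {
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < f.size(); ++i)
        if (!LiteralTrue(f[i][0], a) && !LiteralTrue(f[i][1], a) && !LiteralTrue(f[i][2], a))
            result.push_back(i);
    return result;
}

// Returns true and fills a (index 0 unused) if a model is found within max_flips.
// A false result does NOT prove that the formula is unsatisfiable.
bool WalkSat(const std::vector<Clause>& formula, int variable_count, int max_flips,
             double noise, unsigned seed, std::vector<bool>& a) {
    assert(variable_count >= 1 && max_flips >= 0 && noise >= 0.0 && noise <= 1.0);
    std::mt19937 rng(seed);
    std::bernoulli_distribution coin(0.5), random_walk(noise);
    std::uniform_int_distribution<std::size_t> pick_literal(0, 2);
    a.assign(static_cast<std::size_t>(variable_count) + 1, false);
    for (std::size_t i = 1; i < a.size(); ++i) a[i] = coin(rng);       // step 1

    for (int flip = 0;; ++flip) {
        const std::vector<std::size_t> unsat = Unsatisfied(formula, a);
        if (unsat.empty()) return true;                                  // step 4
        if (flip >= max_flips) return false;                             // give up
        // Step 2: take a random unsatisfied clause, then one of its variables.
        std::uniform_int_distribution<std::size_t> pick_clause(0, unsat.size() - 1);
        const Clause& clause = formula[unsat[pick_clause(rng)]];
        std::size_t chosen = static_cast<std::size_t>(std::abs(clause[pick_literal(rng)]));
        if (!random_walk(rng)) {
            // Greedy choice: the flip that leaves the fewest unsatisfied clauses.
            std::size_t best = formula.size() + 1;
            for (int literal : clause) {
                const std::size_t v = static_cast<std::size_t>(std::abs(literal));
                a[v] = !a[v];                                            // trial flip
                const std::size_t remaining = Unsatisfied(formula, a).size();
                a[v] = !a[v];                                            // undo
                if (remaining < best) { best = remaining; chosen = v; }
            }
        }
        a[chosen] = !a[chosen];                                          // step 3
    }
}
```

<p>The fixed seed makes a run repeatable; a production solver would keep incremental per-clause counters, as step 3 describes, rather than rescanning every clause after each flip.</p>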
<p class="MsoNormal" style="list 39.0pt"><span style="Calibri"><span style="small;">A global search algorithm:</span></span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">1.<span style="7pt 'Times New Roman'">  </span></span>Select the unassigned variable that appears in the most unsatisfied clauses.</span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">2.<span style="7pt 'Times New Roman'">  </span></span>Assign <span lang="EN-US">T</span> to that variable.</span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">3.<span style="7pt 'Times New Roman'">  </span></span>If, with the variable set to <span lang="EN-US">T</span>, every unsatisfied clause still has an unassigned variable:</span></p>
<p class="MsoNormal" style="150%"><span>A. If all clauses are satisfied, output the solution; otherwise recurse from step <span lang="EN-US">1</span>.</span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">4.<span style="7pt 'Times New Roman'">  </span></span>Assign <span lang="EN-US">F</span> to that variable.</span></p>
<p class="MsoNormal" style="list 39.0pt"><span><span style="Ignore">5.<span style="7pt 'Times New Roman'">  </span></span>If, with the variable set to <span lang="EN-US">F</span>, every unsatisfied clause still has an unassigned variable:</span></p>
<p class="MsoNormal" style="150%"><span>B. If all clauses are satisfied, output the solution; otherwise recurse from step <span lang="EN-US">1</span>.</span></p>
<p class="MsoNormal" style="150%"><span style="Calibri"><span style="small;">Because I was not sure whether local search would be allowed, I implemented both algorithms. The local search is adapted from the <span lang="EN-US">walksat</span> program by <span lang="EN-US">Henry Kautz&lt;kautz@cs.washington.edu&gt;</span>. It is very efficient and solves satisfiable problems with <span lang="EN-US">250</span> variables almost instantly. The global search is an algorithm of my own design; it falls well short of a mature <span lang="EN-US">DPLL</span> implementation and currently takes about <span lang="EN-US">10</span> seconds on a problem with <span lang="EN-US">75</span> variables.</span></span></p>
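<p>The global search steps listed above can be sketched as a recursive backtracking routine. This is a reconstruction from the prose, not the author's code: the names <code>Value</code>, <code>Status</code>, <code>Check</code>, and <code>Search</code> are hypothetical, and no further pruning (such as the unit propagation used by DPLL) is attempted.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

using Clause = std::array<int, 3>;  // +i means X_i, -i means !X_i
enum class Value { Unset, True, False };
enum class Status { Satisfied, Open, Conflict };

// Status of one clause under a partial assignment (index 0 of a is unused).
static Status Check(const Clause& c, const std::vector<Value>& a) {
    bool has_unset = false;
    for (int literal : c) {
        const Value v = a[static_cast<std::size_t>(std::abs(literal))];
        if (v == Value::Unset) has_unset = true;
        else if ((v == Value::True) == (literal > 0)) return Status::Satisfied;
    }
    return has_unset ? Status::Open : Status::Conflict;
}

// Depth-first search; when it returns true, a holds a (possibly partial) model
// in which every clause is satisfied. Variables left Unset may take either value.
bool Search(const std::vector<Clause>& formula, std::vector<Value>& a) {
    std::vector<int> occurrences(a.size(), 0);
    bool all_satisfied = true;
    for (const Clause& c : formula) {
        const Status s = Check(c, a);
        if (s == Status::Conflict) return false;  // a clause has no free variable left
        if (s == Status::Satisfied) continue;
        all_satisfied = false;
        for (int literal : c) {
            const std::size_t v = static_cast<std::size_t>(std::abs(literal));
            if (a[v] == Value::Unset) ++occurrences[v];
        }
    }
    if (all_satisfied) return true;
    // Step 1: the unassigned variable that appears in the most unsatisfied clauses.
    std::size_t best = 0;
    for (std::size_t v = 1; v < a.size(); ++v)
        if (occurrences[v] > occurrences[best]) best = v;
    assert(best != 0);
    // Steps 2 to 5: try T, then F, recursing after each choice.
    for (Value choice : {Value::True, Value::False}) {
        a[best] = choice;
        if (Search(formula, a)) return true;
    }
    a[best] = Value::Unset;  // undo before backtracking
    return false;
}
```

<p>Unlike the local search, a <code>false</code> result here is conclusive, because every branch has been explored; the price is exponential running time in the worst case.</p>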
<p><strong>Parallel Algorithms</strong></p>
<p class="MsoNormal" style="none" align="left"><span style="Calibri"><span style="small;">For the local search, several threads can each search independently; as soon as one of them finds a solution, all threads exit. This is easy to implement with <span lang="EN-US">tbb::parallel_for</span>; see the <span lang="EN-US">CX3SATLocalSearchOperator</span> class.</span></span></p>
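<p>The idea of independent searchers that stop as soon as one succeeds can be sketched as follows. To stay free of external libraries, this illustration uses <code>std::thread</code> and an atomic flag instead of <code>tbb::parallel_for</code>, and each worker runs a deliberately trivial random walk as a placeholder for WALKSAT; <code>ParallelLocalSearch</code> and <code>Satisfies</code> are hypothetical names.</p>

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

using Clause = std::array<int, 3>;  // +i means X_i, -i means !X_i

static bool Satisfies(const std::vector<Clause>& f, const std::vector<bool>& a) {
    for (const Clause& c : f) {
        bool ok = false;
        for (int literal : c)
            ok = ok || a[static_cast<std::size_t>(std::abs(literal))] == (literal > 0);
        if (!ok) return false;
    }
    return true;
}

// Starts thread_count workers with different seeds. The first worker to find a
// model publishes it and raises the flag; the others see the flag and stop.
bool ParallelLocalSearch(const std::vector<Clause>& formula, int variable_count,
                         unsigned thread_count, long max_steps, std::vector<bool>& model) {
    assert(variable_count >= 1 && thread_count >= 1);
    std::atomic<bool> found(false);
    std::mutex model_mutex;
    auto worker = [&](unsigned seed) {
        std::mt19937 rng(seed);
        std::bernoulli_distribution coin(0.5);
        std::uniform_int_distribution<int> pick(1, variable_count);
        std::vector<bool> a(static_cast<std::size_t>(variable_count) + 1, false);
        for (std::size_t i = 1; i < a.size(); ++i) a[i] = coin(rng);
        for (long step = 0; step < max_steps && !found.load(); ++step) {
            if (Satisfies(formula, a)) {
                std::lock_guard<std::mutex> lock(model_mutex);
                if (!found.exchange(true)) model = a;  // only the first finder writes
                return;
            }
            const std::size_t v = static_cast<std::size_t>(pick(rng));
            a[v] = !a[v];  // placeholder move; a real worker would do a WALKSAT flip
        }
    };
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < thread_count; ++t) threads.emplace_back(worker, 1000u + t);
    for (std::thread& th : threads) th.join();
    return found.load();
}
```

<p>Because the workers share nothing except the flag and the final result, synchronization stays negligible, and different seeds make the threads explore different parts of the search space.</p>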
<p class="MsoNormal" style="none" align="left"><span style="Calibri"><span style="small;">For the global search, the work can be split at the recursive call, with the <span lang="EN-US">T</span> assignment and the <span lang="EN-US">F</span> assignment handled as separate tasks. This is implemented with <span lang="EN-US">tbb::task</span>; see the <span lang="EN-US">CX3SATWholeSearchTask</span> class.</span></span></p>
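<p>Splitting the recursion into a T task and an F task can be sketched as follows. This is an illustration only: it uses <code>std::async</code> in place of <code>tbb::task</code>, picks any unassigned variable from an open clause rather than the most frequent one, and waits for both branches instead of cancelling the slower one. The names <code>SearchTask</code>, <code>NextVariable</code>, and <code>fork_depth</code> are assumptions.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <future>
#include <vector>

using Clause = std::array<int, 3>;            // +i means X_i, -i means !X_i
using Assignment = std::vector<signed char>;  // 0 = unset, +1 = T, -1 = F; index 0 unused

// Returns -1 on conflict, 0 if every clause is satisfied, otherwise the index
// of an unassigned variable taken from a clause that is not yet satisfied.
static int NextVariable(const std::vector<Clause>& f, const Assignment& a) {
    int open_variable = 0;
    for (const Clause& c : f) {
        bool satisfied = false;
        int unset = 0;
        for (int literal : c) {
            const int v = std::abs(literal);
            const signed char value = a[static_cast<std::size_t>(v)];
            if (value == 0) unset = v;
            else if ((value > 0) == (literal > 0)) satisfied = true;
        }
        if (satisfied) continue;
        if (unset == 0) return -1;
        open_variable = unset;
    }
    return open_variable;
}

// Tries T and F for one variable. While fork_depth > 0 the T branch runs as a
// separate task on its own copy of the assignment; deeper levels run serially.
bool SearchTask(const std::vector<Clause>& f, Assignment a, int fork_depth, Assignment& model) {
    assert(!a.empty());
    const int v = NextVariable(f, a);
    if (v < 0) return false;
    if (v == 0) { model = a; return true; }
    Assignment with_true = a, with_false = a;
    with_true[static_cast<std::size_t>(v)] = 1;
    with_false[static_cast<std::size_t>(v)] = -1;
    if (fork_depth <= 0)
        return SearchTask(f, with_true, 0, model) || SearchTask(f, with_false, 0, model);
    Assignment true_model;
    std::future<bool> true_branch = std::async(std::launch::async, [&] {
        return SearchTask(f, with_true, fork_depth - 1, true_model);
    });
    const bool false_ok = SearchTask(f, with_false, fork_depth - 1, model);
    const bool true_ok = true_branch.get();  // join before touching true_model
    if (true_ok) model = true_model;
    return true_ok || false_ok;
}
```

<p>Limiting the fork depth keeps the number of tasks bounded (at most 2 to the power of <code>fork_depth</code>); a work-stealing scheduler such as the one in TBB makes finer-grained splitting practical.</p>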
<h1> </h1>
<h1><span style="Calibri">Optimization Tools</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="Calibri"><span style="small;">The performance analysis results for the parallel local search algorithm are given below.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Hotspots</span></span></strong><strong><span style="Calibri"> Analysis</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"> <span style="Calibri"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile Hotspots_1.JPG" alt="" width="303" height="44" />Using the <span lang="EN-US">Hotspots</span> analysis in <span lang="EN-US">Intel Amplifier</span> to find the hot functions gives the following result:</span></span></p>
<p class="MsoNormal" style="150%"><span style="Calibri"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090511/1.JPG" alt="" width="676" height="374" /></span></span></p>
<p class="MsoNormal" style="150%"><span style="Calibri"><span style="small;">Most of the time is spent in the variable-flipping function </span></span><span style="yes" lang="EN-US">X3SATAtomFlip_Local</span><span style="Calibri"><span style="small;"> and the variable-selection function </span></span><span style="yes" lang="EN-US">X3SATPickBest_Local</span><span style="Calibri"><span style="small;">, so optimizing these two functions offers the largest performance gain.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Concurrency</span></span></strong><strong><span style="Calibri"> Analysis</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><strong></strong></p>
<p class="MsoNormal" style="150%"><span style="Calibri"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile Concurrency_1.JPG" alt="" width="306" height="40" /></span></span></p>
<p class="MsoNormal" style="150%"><span style="Calibri"><span style="small;">Using the <span lang="EN-US">Concurrency</span> analysis in <span lang="EN-US">Intel Amplifier</span> to find code that could benefit from parallel optimization gives the following result:</span></span></p>
<p class="MsoNormal" style="150%"><span style="Calibri"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090511/2.JPG" alt="" width="612" height="403" />The main functions, </span></span><span style="yes" lang="EN-US">X3SATAtomFlip_Local</span><span style="yes"> and <span lang="EN-US">X3SATPickBest_Local</span>, both show a high degree of concurrency</span><span style="Calibri"><span style="small;">.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="Calibri" lang="EN-US"><span style="small;"> </span></span><span style="small;"><strong><span style="Calibri" lang="EN-US">Locks and Waits</span></strong><strong><span style="Calibri"> Analysis</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><strong></strong></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="Calibri"><span style="small;"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile Locks and Waits_1.JPG" alt="" width="308" height="41" /></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="Calibri"><span style="small;">Using the <span lang="EN-US">Locks and Waits</span> analysis in <span lang="EN-US">Intel Amplifier</span> to measure the cost of locks and synchronization waits gives the following result:</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><strong><span style="Calibri" lang="EN-US"><img src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090511/3.JPG" alt="" width="611" height="396" /></span></strong></p>
<p class="MsoNormal" style="2.0"><span style="Calibri"><span style="small;">The result shows <span lang="EN-US">23 ms</span> of synchronization overhead and no serious synchronization or lock contention.</span></span><span style="Calibri" lang="EN-US"><span style="small;"> </span></span></p>
<p class="MsoNormal" style="2.0"><span style="Calibri" lang="EN-US"><span style="small;"> </span></span></p>
<h1><span style="Calibri">Performance Tests</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="small;"><span style="Calibri">Operating system:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="1">    </span>32-bit</span></span><span style="Calibri"> tests were run on </span><span lang="EN-US"><span style="宋体, MS Song;">32-bit XP</span></span><span style="Calibri">.</span></span></p>
<p class="MsoNormal" style="21pt"><span style="small;"><span lang="EN-US"><span style="宋体, MS Song;">64-bit</span></span><span style="Calibri"> tests were run on </span><span lang="EN-US"><span style="宋体, MS Song;">64-bit XP</span></span><span style="Calibri">.</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="small;"><span lang="EN-US"><span style="宋体, MS Song;">CPU</span></span><span style="Calibri">:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2">           </span>Intel(R) Core(TM)2 CPU<span style="yes">  </span>5270<span style="yes">  </span>@ 1.40GHz</span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="small;"><span style="Calibri">Memory:</span><span lang="EN-US"><span style="宋体, MS Song;"><span style="2">           </span>1 GB</span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="small;"><span style="Calibri">Time unit:</span><span lang="EN-US"><span style="yes"><span style="宋体, MS Song;">  </span></span></span><span style="Calibri">seconds</span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span><span style="Calibri"><span style="small;">Local search results:</span></span></p>
<table class="MsoNormalTable" style=".5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="yes;">
<td style="windowtext 1pt solid;" width="91">
<p class="MsoNormal" style="center" align="center"><strong><span style="Calibri"><span style="small;">Test data</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">32-bit</span></strong><strong><span style="Calibri"> serial</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">32-bit</span></strong><strong><span style="Calibri"> parallel</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><strong><span style="Calibri"><span style="small;">Speedup</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">64-bit</span></strong><strong><span style="Calibri"> serial</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">64-bit</span></strong><strong><span style="Calibri"> parallel</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="62">
<p class="MsoNormal" style="center" align="center"><strong><span style="Calibri"><span style="small;">Speedup</span></span></strong></p>
</td>
</tr>
<tr style="1;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uf50-218</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.008363</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.007737</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.09</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.010860</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.011945</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="62">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.91</span></span></p>
</td>
</tr>
<tr style="2;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uf75-325</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.007995</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.007788</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.02</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.010906</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.010992</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="62">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.00</span></span></p>
</td>
</tr>
<tr style="3;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uf250-1065</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.009309</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.008380</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.12</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.013284</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.015172</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="62">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.87</span></span></p>
</td>
</tr>
<tr style="4;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uuf50-218</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.319345</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.195871</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.63</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.377701</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.252893</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="62">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.49</span></span></p>
</td>
</tr>
<tr style="5;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uuf75-325</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.528795</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.319043</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.66</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.566366</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.380140</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="62">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.49</span></span></p>
</td>
</tr>
<tr style="yes;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uuf250-1065</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">8.406969</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">4.981784</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.68</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">8.097578</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="94">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">4.849143</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="62">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.67</span></span></p>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal" style="0cm 0cm 0pt"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span><span style="Calibri"><span style="small;">Global search algorithm results:</span></span></p>
<table class="MsoNormalTable" style=".5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="yes;">
<td style="windowtext 1pt solid;" width="91">
<p class="MsoNormal" style="center" align="center"><strong><span style="Calibri"><span style="small;">Test data</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">32-bit</span></strong><strong><span style="Calibri"> serial</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">32-bit</span></strong><strong><span style="Calibri"> parallel</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><strong><span style="Calibri"><span style="small;">Speedup</span></span></strong></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">64-bit</span></strong><strong><span style="Calibri"> serial</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="93">
<p class="MsoNormal" style="center" align="center"><span style="small;"><strong><span style="Calibri" lang="EN-US">64-bit</span></strong><strong><span style="Calibri"> parallel</span></strong></span></p>
</td>
<td style="solid windowtext .5pt;" width="61">
<p class="MsoNormal" style="center" align="center"><strong><span style="Calibri"><span style="small;">Speedup</span></span></strong></p>
</td>
</tr>
<tr style="1;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uf50-218</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.018955</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.021946</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.86</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.037525</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="93">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.039605</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="61">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.95</span></span></p>
</td>
</tr>
<tr style="2;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uf75-325</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">10.975312</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">6.074871</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.80</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">11.240384</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="93">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">6.231125</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="61">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.80</span></span></p>
</td>
</tr>
<tr style="3;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uuf50-218</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.079563</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.048530</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.63</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.112661</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="93">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">0.103112</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="61">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.09</span></span></p>
</td>
</tr>
<tr style="yes;">
<td style="solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">uuf75-325</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="90">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">11.109457</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="88">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">6.467592</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="60">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.72</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="101">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">11.205552</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="93">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">6.554428</span></span></p>
</td>
<td style="solid windowtext .5pt;" width="61">
<p class="MsoNormal" style="center" align="center"><span style="Calibri" lang="EN-US"><span style="small;">1.71</span></span></p>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal" style="0cm 0cm 0pt"><span lang="EN-US"><span style="宋体, MS Song;"> </span></span></p>
<h1><span style="Calibri">Build Instructions</span></h1>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Windows</span></span></strong><strong><span style="Calibri"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p>Uses VS2008 and Intel Parallel Studio.</p>
<p>1. Open this project in VS2008.</p>
<p>2. Select the x64 platform and build the Release configuration.</p>
<p>3. Go to the Bin directory; the executable is X3SAT.exe.</p>
<p class="MsoNormal" style="0cm 0cm 0pt"><span style="small;"><strong><span lang="EN-US"><span style="宋体, MS Song;">Linux</span></span></strong><strong><span style="Calibri"> platform</span><span lang="EN-US"><span style="宋体, MS Song;">:</span></span></strong></span></p>
<p class="MsoNormal" style="Calibri">Uses <span lang="EN-US">ICC</span> and <span lang="EN-US">TBB</span>.</p>
<p class="MsoNormal" style="list 39.0pt"><span style="Calibri" lang="EN-US"><span style="Ignore"><span style="small;">1. Upload the <span lang="EN-US">Src</span> and <span lang="EN-US">Linux</span> directories from the archive to the server<span lang="EN-US">.</span></span></span></span></p>
<p class="MsoNormal" style="list 39.0pt"><span style="Calibri" lang="EN-US"><span style="Ignore"><span style="small;">2. Go to the <span lang="EN-US">X3SAT/Linux</span> directory and run <span lang="EN-US">make</span>.</span></span></span></p>
<p class="MsoNormal" style="list 39.0pt"><span style="Calibri" lang="EN-US"><span style="Ignore"><span style="small;">3. Go to the <span lang="EN-US">X3SAT/Bin</span> directory; the executable is <span lang="EN-US">X3SAT.</span></span></span></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><strong><span style="small;"><span style="Calibri">Other:</span></span></strong></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><strong></strong><span style="small;"><strong><span style="Calibri">Please test with the </span><span style="red" lang="EN-US"><span style="宋体, MS Song;">64-bit Release</span></span></strong><strong><span style="Calibri"> build on </span><span style="red" lang="EN-US"><span style="宋体, MS Song;">Windows</span></span></strong><strong><span style="Calibri">.</span></strong></span></p>
<p class="MsoNormal" style="0cm 0cm 0pt"><strong><span style="red" lang="EN-US"><span style="宋体, MS Song;"> </span></span></strong></p>
<h1><span style="Calibri">Optimization Conclusions</span></h1>
<p class="MsoNormal" style="21pt"><span style="small;"><span style="Calibri">Through this contest problem I became familiar with complete search and local search algorithms for the </span><span lang="EN-US"><span style="宋体, MS Song;">3SAT</span></span><span style="Calibri"> problem. From the article by </span><span lang="EN-US"><a href="http://bbs.ustc.edu.cn/cgi/bbsqry?userid=blackcoffee"><span style="windowtext"><span style="宋体, MS Song;">blackcoffee</span></span></a><span style="宋体, MS Song;"> (</span></span><span style="Calibri">最后一支烟</span><span lang="EN-US"><span style="宋体, MS Song;">)</span></span><span style="Calibri"> I also learned how approaches to it evolved: from complete search, to local search borrowed from algorithms for the </span><span lang="EN-US"><span style="宋体, MS Song;">N</span></span><span style="Calibri">-queens problem, and then step by step through </span><span lang="EN-US"><span style="宋体, MS Song;">GSAT -&gt; GSAT+TABU -&gt; GWSAT -&gt; WalkSAT -&gt; WalkSAT+TABU -&gt; NOVELTY</span></span><span style="Calibri"> and </span><span lang="EN-US"><span style="宋体, MS Song;">R-NOVELTY</span></span><span style="Calibri">.</span></span></p>
<p class="MsoNormal" style="21pt"><span style="small;"><span style="Calibri">Because of time constraints I implemented only </span><span lang="EN-US"><span style="宋体, MS Song;">WalkSAT</span></span><span style="Calibri">. I also tried </span><span lang="EN-US"><span style="宋体, MS Song;">NOVELTY</span></span><span style="Calibri">, but the improvement was not as pronounced as I had expected, so in the end I went back to </span><span lang="EN-US"><span style="宋体, MS Song;">WalkSAT</span></span><span style="Calibri">.</span></span></p>
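<p class="MsoNormal" style="21pt"><span style="small;"><span style="Calibri">For the complete search algorithm I did not study existing work much; the algorithm is one I worked out myself. At first I assigned values to the variables in order and then checked whether any clause could no longer be satisfied, pruning the branch if so. Going a step further, giving priority to variables that appear in more clauses lets pruning happen earlier, which shrinks the search space. So at load time I count the number of clauses each variable appears in and assign the variables with the highest counts first, which improved efficiency considerably. Finally, borrowing from </span><span lang="EN-US"><span style="宋体, MS Song;">walksat</span></span><span style="Calibri">, at each step I choose the variable that appears in the most currently unsatisfied clauses; that is the version I submitted.</span></span></p>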
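<p>The static ordering step (count clause occurrences at load time, then branch on the most frequent variables first) can be sketched as follows. This is only an illustration, not the submitted code: the <code>Clause</code> type, the DIMACS-style signed literals and the function name <code>orderVariablesByOccurrence</code> are assumptions made for the example.</p>

```cpp
#include <algorithm>
#include <array>
#include <cstdlib>
#include <numeric>
#include <vector>

// A 3SAT clause as three DIMACS-style literals (assumed representation):
// +v means variable v, -v means its negation. Variables are numbered from 1.
using Clause = std::array<int, 3>;

// Returns the variables 1..numVars ordered by how many clauses mention them,
// most frequently used first.
std::vector<int> orderVariablesByOccurrence(int numVars,
                                            const std::vector<Clause>& clauses) {
    std::vector<int> count(numVars + 1, 0);
    for (const Clause& clause : clauses)
        for (int literal : clause)
            ++count[std::abs(literal)];

    std::vector<int> order(numVars);
    std::iota(order.begin(), order.end(), 1);
    // stable_sort keeps index order among variables with equal counts.
    std::stable_sort(order.begin(), order.end(),
                     [&count](int a, int b) { return count[a] > count[b]; });
    return order;
}
```

<p>A complete search would then assign variables in the returned order. The dynamic variant described above would recompute the counts over the currently unsatisfied clauses at every step instead of once at load time.</p>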
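<p class="MsoNormal" style="21pt"><span style="small;"><span style="Calibri">The test results are worth noting. For satisfiable instances the local search algorithm shows very little speedup and is sometimes slower than the serial code. The reason is that the problem only asks for one solution, and local search is randomized, so when the data size and the number of </span><span lang="EN-US"><span style="宋体, MS Song;">CPU</span></span><span style="Calibri"> cores are small the serial code can be faster than the parallel algorithm. The more </span><span lang="EN-US"><span style="宋体, MS Song;">CPU</span></span><span style="Calibri"> cores there are and the more variables the problem has, the more likely the parallel algorithm is to beat the serial one.</span></span></p>
<p>If you would like to discuss the issues covered in this article with the author, please email 邓辉 at <a href="mailto:denghui0815@hotmail.com"><span style="#000000;">denghui0815@hotmail.com</span></a>.</p>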
<h1><span style="Calibri">Acknowledgments</span></h1>
<p class="MsoNormal" style="21pt"><span style="small;"><span style="Calibri">Thanks to </span><span lang="EN-US"><span style="宋体, MS Song;">Clay Breshears</span></span><span style="Calibri"> for his answers; thanks to </span><span lang="EN-US"><span style="宋体, MS Song;">Henry Kautz</span></span><span style="Calibri">, whose open-source </span><span lang="EN-US"><span style="宋体, MS Song;">walksat-dist</span></span><span style="Calibri"> helped me complete the coding for the </span><span lang="EN-US"><span style="宋体, MS Song;">3SAT</span></span><span style="Calibri"> problem; and thanks to </span><span lang="EN-US"><a href="http://bbs.ustc.edu.cn/cgi/bbsqry?userid=blackcoffee"><span style="windowtext"><span style="宋体, MS Song;">blackcoffee</span></span></a><span style="宋体, MS Song;"> (</span></span><span style="Calibri">最后一支烟</span><span lang="EN-US"><span style="宋体, MS Song;">)</span></span><span style="Calibri">, whose article 《一个经典问题的算法发展史》 (roughly, "The Algorithmic History of a Classic Problem"), posted on 瀚海星云, helped me understand the history of the </span><span lang="EN-US"><span style="宋体, MS Song;">3SAT</span></span><span style="Calibri"> problem and ultimately choose the </span><span lang="EN-US"><span style="宋体, MS Song;">walksat</span></span><span style="Calibri"> algorithm as the basis for parallelization.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2009/05/20/2009-3sat/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2009 Intel® Threading Challenge Contest Notes Series - Problem 1: Radix Sort</title>
		<link>http://software.intel.com/zh-cn/blogs/2009/05/07/400001638/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2009/05/07/400001638/#comments</comments>
		<pubDate>Thu, 07 May 2009 01:18:23 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[博客征文专栏]]></category>
		<category><![CDATA[线程挑战赛]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2009/05/07/400001638/</guid>
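		<description><![CDATA[  Problem Description Given a set of unsorted strings containing keys, the keys can be viewed as binary representations of integers, and the individual bits of a key can be used to sort the strings. This sorting method is called radix sort. Write a program that implements radix sort with multiple threads: sort the keys read from an input file, then write the sorted keys to another file. The input and output file names are the first and second command-line arguments of the program. The first line of the input file is the total number of keys to sort (N), followed by N keys, one per line. Each key is a string of 7 printable characters with no spaces (ASCII 0x20). The number of keys in the file is less than 2^31 – 1. The sorted output must be saved in a text file, one key per line. Timing: If you add timing code to your program that measures the time spent sorting and reports the elapsed time, that time will be used for scoring; if you do not add timing code, the total execution time (including input and output time) will be used for scoring. For details about this problem, see the Threading Challenge page for more information.   Analysis and Solution of the Radix Sort Problem Serial algorithm Radix sort is a distribution sort, also known as bucket sort or bin sort. As the name suggests, it uses part of the information in each key to distribute the elements to be sorted into "buckets", and in this way sorts them. Radix sort is a stable sort with time complexity O(nlog(r)m), where r is the radix used and m is the number of heaps. In some cases radix sort is more efficient than comparison-based sorts. Radix sort can proceed LSD (least significant digit) first or MSD (most significant digit) first: LSD starts from the rightmost position of the key, while MSD does the opposite and starts from the leftmost. Clearly MSD is better suited to parallel processing. As in quicksort, we only need to partition on the specified bit, placing the elements whose bit is 0 in front and those whose bit is 1 behind, and then recurse over the bits from high to low. The keys to be sorted in this problem are 7 characters long, 7 * 8 = 56 bits in total; with the newline character that makes 64 bits, which fits exactly in one int64 variable. Below is the source code without the small-data unrolling optimization; the unrolled code is long, so it is not included in this document. // Radix sort partition: returns the split point (no unrolling optimization) __inline uint64* XPartition64(uint64* pBeg, uint64* pEnd, uint64 nBitMask) { // Fewer than two elements need no partitioning if(pBeg + 1 &#60; pEnd) [...]]]></description>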
			<content:encoded><![CDATA[<p> </p>
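<h1>Problem Description</h1>
<p>Given a set of unsorted strings containing keys, the keys can be viewed as binary representations of integers, and the individual bits of a key can be used to sort the strings. This sorting method is called radix sort.<br />
Write a program that implements radix sort with multiple threads: sort the keys read from an input file, then write the sorted keys to another file. The input and output file names are the first and second command-line arguments of the program.<br />
The first line of the input file is the total number of keys to sort (N), followed by N keys, one per line. Each key is a string of 7 printable characters with no spaces (ASCII 0x20). The number of keys in the file is less than 2^31 – 1. The sorted output must be saved in a text file, one key per line.</p>
<p><strong>Timing</strong>: If you add timing code to your program that measures the time spent sorting and reports the elapsed time, that time will be used for scoring; if you do not add timing code, the total execution time (including input and output time) will be used for scoring.</p>
<p>For details about this problem, see the Threading Challenge page for <a href="http://software.intel.com/zh-cn/contests/Threading-Challenge-2009/codecontest.php">more information</a>.</p>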
<p><!--Sample input file:<br />
8<br />
<a href="mailto:H@skell" mce_href="mailto:H@skell">H@skell</a><br />
surVEYs<br />
sysTEMS<br />
HASKELL<br />
Surveys<br />
1234567<br />
SURveys<br />
systEMS</div>
<div mce_tmp="1">Sample output file:<br />
1234567<br />
<a href="mailto:H@skell" mce_href="mailto:H@skell">H@skell</a><br />
HASKELL<br />
SURveys<br />
Surveys<br />
surVEYs<br />
sysTEMS<br />
systEMS<br />
--></p>
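<h1>Analysis and Solution of the Radix Sort Problem</h1>
<p><strong>Serial algorithm</strong></p>
<p>Radix sort is a distribution sort, also known as bucket sort or bin sort. As the name suggests, it uses part of the information in each key to distribute the elements to be sorted into "buckets", and in this way sorts them. Radix sort is a stable sort with time complexity O(nlog(r)m), where r is the radix used and m is the number of heaps. In some cases radix sort is more efficient than comparison-based sorts.</p>
<p>Radix sort can proceed LSD (least significant digit) first or MSD (most significant digit) first: LSD starts from the rightmost position of the key, while MSD does the opposite and starts from the leftmost.</p>
<p>Clearly MSD is better suited to parallel processing. As in quicksort, we only need to partition on the specified bit, placing the elements whose bit is 0 in front and those whose bit is 1 behind, and then recurse over the bits from high to low.</p>
<p>The keys to be sorted in this problem are 7 characters long, 7 * 8 = 56 bits in total; with the newline character that makes 64 bits, which fits exactly in one int64 variable.</p>
<p>Below is the source code without the small-data unrolling optimization; the unrolled code is long, so it is not included in this document.</p>
<p>// Radix sort partition: returns the split point (no unrolling optimization)</p>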
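<p>As a rough illustration of how a 7-character key plus its newline fits into one 64-bit integer, the sketch below packs the first character into the most significant byte so that unsigned integer order matches string order. The names <code>packKey</code> and <code>unpackKey</code> and the byte order are assumptions for this example; the post does not show how the actual loader lays out the bytes.</p>

```cpp
#include <cstdint>
#include <string>

// Packs a 7-character key followed by '\n' into one 64-bit value.
// The first character lands in the most significant byte (assumed layout),
// so comparing the integers compares the keys lexicographically.
std::uint64_t packKey(const std::string& key7) {
    std::uint64_t value = 0;
    for (int i = 0; i < 7; ++i)
        value = (value << 8) | static_cast<unsigned char>(key7[i]);
    return (value << 8) | static_cast<unsigned char>('\n');
}

// Recovers the 7 characters from a packed value.
std::string unpackKey(std::uint64_t value) {
    std::string key(7, ' ');
    for (int i = 6; i >= 0; --i) {
        value >>= 8;  // first iteration drops the '\n', later ones the byte just read
        key[i] = static_cast<char>(value & 0xFF);
    }
    return key;
}
```

<p>With this layout, sorting the packed integers in ascending order yields the keys in the order required by the problem statement.</p>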
<p>__inline uint64* XPartition64(uint64* pBeg, uint64* pEnd, uint64 nBitMask)</p>
<p>{</p>
<p>// Fewer than two elements need no partitioning</p>
<p>if(pBeg + 1 &lt; pEnd)</p>
<p>{</p>
<p>uint64 *i = pBeg - 1, *j = pEnd;</p>
<p>while ((++i) &lt; pEnd &amp;&amp; (*i &amp; nBitMask) == 0);</p>
<p>while ((--j) &gt; pBeg &amp;&amp; (*j &amp; nBitMask));</p>
<p>while (i &lt; j)</p>
<p>{</p>
<p>const uint64 t = *i; *i = *j; *j = t;</p>
<p>do ++i; while ((*i &amp; nBitMask) == 0);</p>
<p>do --j; while ((*j &amp; nBitMask));</p>
<p>}</p>
<p>return i;</p>
<p>}</p>
<p>else</p>
<p>{</p>
<p>return NULL;</p>
<p>}</p>
<p>};</p>
<p>// Radix sort, serial version</p>
<p>void XRadixSort64Serial(uint64* pBeg, uint64* pEnd, uint64 nBitMask)</p>
<p>{</p>
<p>uint64* pSplit = XPartition64(pBeg, pEnd, nBitMask);</p>
<p>if(pSplit != NULL &amp;&amp; nBitMask != XRADIX_MASK_END)</p>
<p>{</p>
<p>XRadixSort64Serial(pBeg, pSplit, XRADIX_MASK_SHIFT_64(nBitMask));</p>
<p>XRadixSort64Serial(pSplit, pEnd, XRADIX_MASK_SHIFT_64(nBitMask));</p>
<p>}</p>
<p>}</p>
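<p>The code is almost identical to quicksort; only the comparison and the handling of the split position differ slightly (marked in red).</p>
<p><strong>Parallel algorithm</strong></p>
<p>For a divide-and-conquer algorithm like this one, TBB Tasks are a good choice for parallelization. Each split produces two Tasks, which can be processed in parallel; they share no data with each other, so they can run completely in parallel.</p>
<p>// Radix sort Task</p>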
<p>class CXRadixSort64Task: public tbb::task</p>
<p>{</p>
<p>uint64 *m_pBeg,*m_pEnd,m_nBitMask;</p>
<p>BOOL m_bIsContinuation;</p>
<p>static uint32 ms_nCutOff32;</p>
<p>public:</p>
<p>CXRadixSort64Task( uint64* pBeg, uint64* pEnd, uint64 nBitMask) :</p>
<p>m_pBeg(pBeg), m_pEnd(pEnd), m_nBitMask(nBitMask), m_bIsContinuation(FALSE) {<br />
}</p>
<p>tbb::task* execute()</p>
<p>{</p>
<p>tbb::task *pNextA = NULL, *pNextB = NULL;</p>
<p>if(m_pEnd - m_pBeg &lt; ms_nCutOff32)</p>
<p>{ // Radix sort, serial version</p>
<p>XRadixSort64Serial(m_pBeg, m_pEnd, m_nBitMask);</p>
<p>}</p>
<p>else</p>
<p>{</p>
<p>if( !m_bIsContinuation )</p>
<p>{ // Move the data and get the split point</p>
<p>uint64* pSplit = XPartition64(m_pBeg, m_pEnd, m_nBitMask);</p>
<p>if(pSplit != NULL)</p>
<p>{ // Split off new Tasks</p>
<p>pNextA = new( allocate_child() ) CXRadixSort64Task(m_pBeg, pSplit,</p>
<p>XRADIX_MASK_SHIFT_64(m_nBitMask));</p>
<p>pNextB = new( allocate_child() ) CXRadixSort64Task(pSplit, m_pEnd,</p>
<p>XRADIX_MASK_SHIFT_64(m_nBitMask));</p>
<p>m_bIsContinuation = TRUE;</p>
<p>recycle_as_continuation();</p>
<p>set_ref_count(2);</p>
<p>spawn(*pNextB);</p>
<p>}</p>
<p>}</p>
<p>}</p>
<p>return pNextA;</p>
<p>}</p>
<p>static void SetCutOff(uint32 nCutOff)</p>
<p>{</p>
<p>if(nCutOff &lt; 1024)</p>
<p>ms_nCutOff32 = 1024;</p>
<p>else if(nCutOff &gt; XRADIXSORT_CUTOFF)</p>
<p>ms_nCutOff32 = XRADIXSORT_CUTOFF;</p>
<p>else</p>
<p>ms_nCutOff32 = nCutOff &gt;&gt; 10 &lt;&lt; 10;</p>
<p>}</p>
<p>};</p>
<p>// Radix sort, parallel version</p>
<p>void XRadixSort64Parallel(uint64* pBeg, uint64* pEnd, uint64 nBitMask)</p>
<p>{</p>
<p>// Compute the CUTOFF value</p>
<p>uint32 nSize = (uint32)(pEnd - pBeg) / task_scheduler_init::default_num_threads()<br />
/ 4;</p>
<p>CXRadixSort64Task::SetCutOff(nSize);</p>
<p>CXRadixSort64Task&amp; xtask = *new(tbb::task::allocate_root())</p>
<p>CXRadixSort64Task(pBeg, pEnd, nBitMask);</p>
<p>tbb::task::spawn_root_and_wait(xtask);</p>
<p>}</p>
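<p>Before a Task does its work it checks the amount of data. If there are fewer than ms_nCutOff32 elements it calls the serial algorithm directly, which reduces the number of Tasks and lowers the overhead TBB incurs in managing them. The value of ms_nCutOff32 is adjusted dynamically according to the size of the array being sorted, up to a maximum of 64k.</p>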
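<p>The cutoff selection can be restated in isolation as below. This is a sketch that mirrors <code>SetCutOff</code> and the size computation in <code>XRadixSort64Parallel</code>; the constant and function names are made up for the example, and the upper bound assumes that "64k" means 65536, since the value of <code>XRADIXSORT_CUTOFF</code> is not shown.</p>

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kMinCutoff = 1024;       // lower bound used in SetCutOff
constexpr std::uint32_t kMaxCutoff = 64 * 1024;  // assumed value of XRADIXSORT_CUTOFF

// Clamps the requested cutoff to [kMinCutoff, kMaxCutoff] and rounds it
// down to a multiple of 1024, as SetCutOff does with ">> 10 << 10".
std::uint32_t clampCutoff(std::uint32_t requested) {
    if (requested < kMinCutoff) return kMinCutoff;
    if (requested > kMaxCutoff) return kMaxCutoff;
    return requested >> 10 << 10;
}

// Aims for roughly four serial chunks per worker thread.
std::uint32_t cutoffFor(std::size_t elementCount, unsigned threadCount) {
    if (threadCount == 0) threadCount = 1;  // guard added for this sketch only
    return clampCutoff(static_cast<std::uint32_t>(elementCount / threadCount / 4));
}
```

<p>For example, 100000 elements on 4 threads give 100000 / 4 / 4 = 6250, which rounds down to 6144. Very small inputs fall back to 1024 and very large ones are capped at the maximum.</p>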
<h1>Optimization Tools</h1>
<p><strong>Hotspots analysis</strong></p>
<p><a href="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile%20Hotspots_1.JPG"><img class="alignnone" src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile%20Hotspots_1.JPG" alt="" width="303" height="44" /></a></p>
<p>The Hotspots analysis in Intel Amplifier was used to find hot functions. The results are as follows:</p>
<p><a href="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/3.jpg"><img class="alignnone size-medium wp-image-400001732" title="3" src="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/3-300x148.jpg" alt="" width="600" height="297" /></a></p>
<p>Most of the time is spent in the function XPartition64, so optimizing that function yields the largest performance gain.</p>
<p><strong>Concurrency analysis</strong></p>
<p><a href="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile%20Concurrency_1.JPG"><img class="alignnone" src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile%20Concurrency_1.JPG" alt="" width="306" height="40" /></a></p>
<p>The Concurrency analysis in Intel Amplifier was used to find code that could be parallelized. The results are as follows:</p>
<p><a href="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/4.jpg"><img class="alignnone size-medium wp-image-400001733" title="4" src="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/4-300x148.jpg" alt="" width="600" height="297" /></a></p>
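<p>XLoadData and XSaveData use memory-mapped files and are parallelized with OpenMP. Because the contest does not count input/output time, this part of the code was not optimized much further.</p>
<p>In addition, XPartition64 runs partly serially at the very start of the sort. The first level of calls has only 1 Task available to run, the second level only 2, the third level only 4, and level n only 2^(n-1). The test platform is a 16-core system, so all 16 cores can only be busy from the fifth level onward. The time spent in this under-parallelized phase therefore grows with the number of cores and still needs to be optimized, which would require a parallel version of XPartition64 for the early levels. Because time was limited, this optimization was not done.</p>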
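<p>The level arithmetic can be checked with a short sketch. FirstFullyParallelLevel is a hypothetical helper, and it assumes an ideal binary task tree in which every partition produces exactly two child Tasks.</p>

```cpp
// Depth at which an ideal binary task tree first has at least `cores` tasks.
// The root counts as level 1, so level n holds 2^(n-1) tasks.
unsigned FirstFullyParallelLevel(unsigned cores) {
    unsigned level = 1;
    unsigned long long tasks = 1;   // number of tasks at the current level
    while (tasks < cores) {
        tasks *= 2;                 // each task splits into two children
        ++level;
    }
    return level;
}
```

<p>For 16 cores the helper returns 5, which matches the fifth level mentioned in the text.</p>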
<p><strong>Locks and Waits Analysis</strong></p>
<p><a href="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile%20Locks%20and%20Waits_1.JPG"><img class="alignnone" src="http://p.blog.csdn.net/images/p_blog_csdn_net/denghui0815/EntryImages/20090426/Profile%20Locks%20and%20Waits_1.JPG" alt="" width="306" height="41" /></a></p>
<p>The Locks and Waits analysis in Intel Amplifier was used to measure the cost of locks and synchronization waits. The results are shown below:</p>
<p><a href="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/51.jpg"><img class="alignnone size-medium wp-image-400001735" title="51" src="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/51-300x148.jpg" alt="" width="600" height="297" /></a><br />
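The results show 136 ms of synchronization overhead, so there is no serious lock or synchronization cost.</p>
<p><strong>Other Optimizations</strong></p>
<p>1. On 32-bit platforms int64 operations are expensive, so the high 32 bits and the low 32 bits can be processed separately to improve efficiency. See the functions XPartition32H and XPartition32L.<br />
2. The high bit of every printable character is zero, which could be exploited to compress the data to 49 bits at load time. After reading haojn's question and Clay Breshears's answer, I dropped the compression scheme because compressing the data is relatively expensive. A different approach still speeds things up: when the BitMask is shifted right, the high bit of each byte is skipped directly. See the macros XRADIX_MASK_SHIFT_32 and XRADIX_MASK_SHIFT_64 for details.<br />
3. Sorting of very small inputs (2, 3, or 4 elements) is unrolled as XPartition64_2, XPartition64_3, and XPartition64_4, which reduces the number of comparisons and memory moves.<br />
4. OpenMP is used to parallelize data loading and saving.</p>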
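<p>The idea in item 2 can be illustrated as follows. The definition of XRADIX_MASK_SHIFT_64 is not shown in this post, so NextBitMask is a hypothetical stand-in that only assumes a single-bit mask moving from the most significant bit toward the least significant bit.</p>

```cpp
#include <cstdint>

// Hypothetical stand-in for XRADIX_MASK_SHIFT_64 (the real macro is not shown here).
// Moves a single-bit mask one position toward the least significant bit and
// skips bit 7 of every byte, which is always 0 for printable ASCII data.
std::uint64_t NextBitMask(std::uint64_t mask) {
    constexpr std::uint64_t kHighBits = 0x8080808080808080ULL;
    mask >>= 1;
    if (mask & kHighBits) mask >>= 1;   // landed on a byte's top bit: skip it
    return mask;
}
```

<p>Starting from bit 62, the mask visits 7 bits in each of the 8 bytes, so a 64-bit key needs 56 partition passes instead of 64.</p>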
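<h1>Performance Tests</h1>
<p>Small data set tests:</p>
<p>Operating system: the 32-bit tests were run on 32-bit Windows XP.<br />
The 64-bit tests were run on 64-bit Windows XP.<br />
CPU: Intel(R) Core(TM)2 CPU 5270 @ 1.40GHz<br />
Memory: 1 GB<br />
Time unit: seconds</p>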
<table class="MsoNormalTable" style="margin: auto auto auto 5.4pt; width: 698px; border-collapse: collapse; height: 92px; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0cm 5.4pt 0cm 5.4pt; mso-border-insideh: .5pt solid windowtext; mso-border-insidev: .5pt solid windowtext;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 7.5pt; mso-yfti-irow: 0; mso-yfti-firstrow: yes;">
<td style="padding-right: 5.4pt; padding-left: 5.4pt; background: silver; padding-bottom: 0cm; width: 54.6pt; padding-top: 0cm; height: 7.5pt; mso-border-alt: solid windowtext .5pt; border: windowtext 1pt solid;" width="73">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">Test data</span></span></span></strong></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; border-top: windowtext 1pt solid; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 69.3pt; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 7.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;" width="92">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-size: small;"><span style="font-size: x-small;"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;">32bit</span></strong><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"> serial</span></strong></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; border-top: windowtext 1pt solid; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 67.9pt; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 7.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-size: small;"><span style="font-size: x-small;"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;">32bit</span></strong><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"> parallel</span></strong></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; border-top: windowtext 1pt solid; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 46.9pt; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 7.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;" width="63">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">Speedup</span></span></span></strong></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; border-top: windowtext 1pt solid; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 79.05pt; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 7.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;" width="105">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-size: small;"><span style="font-size: x-small;"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;">64bit</span></strong><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"> serial</span></strong></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; border-top: windowtext 1pt solid; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 72.8pt; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 7.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;" width="97">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-size: small;"><span style="font-size: x-small;"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;">64bit</span></strong><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"> parallel</span></strong></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; border-top: windowtext 1pt solid; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 48.3pt; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 7.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;" width="64">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><strong style="mso-bidi-font-weight: normal;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">Speedup</span></span></span></strong></p>
</td>
</tr>
<tr style="height: 9.4pt; mso-yfti-irow: 1;">
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; background: silver; padding-bottom: 0cm; border-left: windowtext 1pt solid; width: 54.6pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="73">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">100k</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 69.3pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="92">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.021771</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 67.9pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.011811</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 46.9pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="63">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.84</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 79.05pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="105">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.016750</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 72.8pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="97">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.009149</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 48.3pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="64">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.83</span></span></span></p>
</td>
</tr>
<tr style="height: 9.4pt; mso-yfti-irow: 2;">
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; background: silver; padding-bottom: 0cm; border-left: windowtext 1pt solid; width: 54.6pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="73">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-size: small;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: x-small;">1M</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 69.3pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="92">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.232440</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 67.9pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.120195</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 46.9pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="63">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.93</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 79.05pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="105">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.195054</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 72.8pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="97">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">0.105122</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 48.3pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="64">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.86</span></span></span></p>
</td>
</tr>
<tr style="height: 9.4pt; mso-yfti-irow: 3; mso-yfti-lastrow: yes;">
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; background: silver; padding-bottom: 0cm; border-left: windowtext 1pt solid; width: 54.6pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="73">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-size: small;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: x-small;">10M</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 69.3pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="92">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">2.446142</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 67.9pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="91">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.355823</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 46.9pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="63">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.80</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ffcccc; padding-bottom: 0cm; width: 79.05pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="105">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">2.124676</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #ccffff; padding-bottom: 0cm; width: 72.8pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="97">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt; text-align: center;" align="center"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.203946</span></span></span></p>
</td>
<td style="border-right: windowtext 1pt solid; padding-right: 5.4pt; padding-left: 5.4pt; border-left-color: #b4b4b4; background: #99ccff; padding-bottom: 0cm; width: 48.3pt; border-top-color: #b4b4b4; padding-top: 0cm; border-bottom: windowtext 1pt solid; height: 9.4pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;" width="64">
<p class="MsoNormal" style="margin: 0cm 0cm 0pt;"><span style="font-family: 宋体; mso-hansi-font-family: Calibri;"><span style="font-size: small;"><span style="font-size: x-small;">1.76</span></span></span></p>
</td>
</tr>
</tbody>
</table>
<p>Large data set test:</p>
<p>Operating system: Red Hat Enterprise Linux AS release 4 (Nahant Update 2)<br />
CPU: Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz<br />
Memory: 4 GB<br />
Test data: 250M records<br />
Result: 36.790519 seconds</p>
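<h1>Build Instructions</h1>
<p><strong>Windows:</strong></p>
<p>Uses VS2008 and Intel Parallel Studio.<br />
1. Open the project in VS2008.<br />
2. Select the x64 platform and build the Release configuration.<br />
3. Go to the Bin directory; the executable is XRadixSort.exe.</p>
<p><strong>Linux:</strong><br />
Uses ICC and TBB.<br />
1. Upload the Src and Linux directories from the archive to the server.<br />
2. Go to the XRadixSort/Linux directory and run make.<br />
3. Go to the XRadixSort/Bin directory; the executable is XRadixSort.</p>
<p><strong>Other:</strong></p>
<p><span style="color: #ff0000;">If changing the byte order while loading data is not allowed, comment out the X_SWAP_ON_READ macro definition on line 53 of XRadixSortPubDef.h.</span></p>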
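<h1>Conclusions</h1>
<p>Intel Parallel Studio bundles the compiler, TBB, IPP, OpenMP, Amplifier, and Inspector. Using these tools I completed the parallel optimization of radix sort. The optimization has not been pushed to its limit, but it already achieves the result I expected.<br />
Amplifier detects parallel performance defects in multithreaded programs quickly and easily and pinpoints the code that needs optimizing, which makes it an indispensable tool for parallel optimization.<br />
Inspector detects memory errors and threading errors, and makes it easy to catch common mistakes in multithreaded programs, especially synchronization errors. It is an excellent choice for debugging multithreaded programs.</p>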
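<h1>Acknowledgments</h1>
<p>Thanks to Clay Breshears for his answers, to Mu, Pryce and Xia, JeffX P for kindly notifying me about the contest, and to haojn for the replies in the forum discussion, which led me to change my original bit-compression scheme; otherwise my solution might not have been accepted.</p>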
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2009/05/07/400001638/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Intel® Threading Challenge Series: 2008 Problem 5, a Sudoku Solution</title>
		<link>http://software.intel.com/zh-cn/blogs/2009/05/04/400001525/</link>
		<comments>http://software.intel.com/zh-cn/blogs/2009/05/04/400001525/#comments</comments>
		<pubDate>Mon, 04 May 2009 00:50:15 +0000</pubDate>
		<dc:creator>邓辉</dc:creator>
				<category><![CDATA[博客征文专栏]]></category>

		<guid isPermaLink="false">http://software.intel.com/zh-cn/blogs/2009/05/04/400001525/</guid>
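		<description><![CDATA[Sudoku is a logic puzzle in which digits are placed in a grid so that no row, column, or sub-block contains the same digit twice. The grid is usually 9x9, in which case every row, every column, and each of the nine non-overlapping 3x3 sub-blocks contains exactly one instance of each integer from 1 to 9. Besides 9x9 grids, 16x16 or 6x6 grids may also be used, and some variants do not use square sub-blocks. Problem: write a program that determines whether a 6x6 Sudoku puzzle has a unique solution. A correct solution places the digits 1-6 in every row and every column without repetition. Likewise, each non-overlapping 2x3 sub-block (2 rows, 3 columns) must contain exactly one instance of each of these six integers. For example: 2 3 1 &#124; 4 5 6 6 5 4 &#124; 3 2 1 ------------- 1 4 3 &#124; 5 6 2 5 6 2 &#124; 1 3 4 ------------- 3 1 6 &#124; 2 [...]]]></description>
			<content:encoded><![CDATA[<p>Sudoku is a logic puzzle in which digits are placed in a grid so that no row, column, or sub-block contains the same digit twice. The grid is usually 9x9, in which case every row, every column, and each of the nine non-overlapping 3x3 sub-blocks contains exactly one instance of each integer from 1 to 9. Besides 9x9 grids, 16x16 or 6x6 grids may also be used, and some variants do not use square sub-blocks.</p>
<p><strong>Problem</strong>: Write a program that determines whether a 6x6 Sudoku puzzle has a unique solution. A correct solution places the digits 1-6 in every row and every column without repetition. Likewise, each non-overlapping 2x3 sub-block (2 rows, 3 columns) must contain exactly one instance of each of these six integers. For example:</p>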
<p>2 3 1 | 4 5 6</p>
<p>6 5 4 | 3 2 1</p>
<p>-------------</p>
<p>1 4 3 | 5 6 2</p>
<p>5 6 2 | 1 3 4</p>
<p>-------------</p>
<p>3 1 6 | 2 4 5</p>
<p>4 2 5 | 6 1 3</p>
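<p><strong>Input file</strong>: The name of the input file is passed to the program when it starts, normally as a command-line argument. The input file contains some number of lines, each with 36 non-blank characters. Each line lists, row by row, all the characters of the initial state of one candidate 6x6 Sudoku puzzle. Empty cells in the initial state are represented by asterisks. The end of the file marks the end of the input.</p>
<p><strong>Output</strong>: Output should be written to standard output. For each input puzzle it must state whether the puzzle has a unique solution, no solution, or multiple solutions.</p>
<p>Sample input with 3 puzzles:</p>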
<p>*314*******1**356**621**3*******561*</p>
<p>***4****41*2*4321**6534*5*16****6***</p>
<p>5******5******5******5******5******5</p>
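<p>(The first line is the initial state of the solved puzzle shown above.)</p>
<p>Sample output:</p>
<p>Puzzle # 1 has a unique solution</p>
<p>Puzzle # 2 has no solution</p>
<p>Puzzle # 3 has multiple solutions</p>
<p><strong>Timing</strong>: Wall-clock time is used, including input/output (I/O) time.</p>
<p><strong>Algorithm analysis</strong>:</p>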
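<p>A Sudoku consists of m x n sub-blocks, each of size m x n, forming a grid of m x n x m x n = 36 cells (here m is 2 and n is 3). By the rules, each cell may hold a digit from 1 to m x n, and every cell Cell[i,j] must obey the following three rules:</p>
<p>1. The value of Cell[i,j] is unique in row i.</p>
<p>2. The value of Cell[i,j] is unique in column j.</p>
<p>3. The value of Cell[i,j] is unique in the sub-block it belongs to.</p>
<p>When every Cell[i,j] has a value that satisfies all three rules, the Sudoku is considered solved.</p>
<p>In other words, two cells are related, and cannot hold the same value, if any of the following conditions holds:</p>
<p>1. The cells have the same horizontal coordinate (same column).</p>
<p>2. The cells have the same vertical coordinate (same row).</p>
<p>3. The cells are in the same sub-block.</p>
<p>Each Cell[i,j] has m x n x 3 - m - n - 1 such mutually exclusive cells in total, which is 12 for the 6x6 case.</p>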
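<p>The count of related cells can be cross-checked by brute force. The sketch below uses hypothetical names (AreRelated, CountRelated, kExpectedRelated) and assumes cells are numbered row by row from 0 to 35.</p>

```cpp
constexpr int kBlockW = 3;                    // n: columns per sub-block
constexpr int kBlockH = 2;                    // m: rows per sub-block
constexpr int kSide = kBlockW * kBlockH;      // cells per row and per column
constexpr int kCells = kSide * kSide;         // 36 cells in total
constexpr int kExpectedRelated = 3 * kSide - kBlockW - kBlockH - 1;

// True if cells a and b share a row, a column, or a sub-block.
bool AreRelated(int a, int b) {
    if (a == b) return false;
    const int ra = a / kSide, ca = a % kSide;
    const int rb = b / kSide, cb = b % kSide;
    const bool sameBlock = (ra / kBlockH == rb / kBlockH) && (ca / kBlockW == cb / kBlockW);
    return ra == rb || ca == cb || sameBlock;
}

// Number of cells related to `cell`, counted by scanning the whole grid.
int CountRelated(int cell) {
    int count = 0;
    for (int other = 0; other < kCells; ++other)
        if (AreRelated(cell, other)) ++count;
    return count;
}
```

<p>Every cell has 5 row neighbours, 5 column neighbours, and 5 block neighbours, of which 2 are shared with the row and 1 with the column, which gives 12.</p>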
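<p>Each cell of the grid is represented by an int, initialized to (1 &lt;&lt; XGRID_BSIZE) - 1 [the low XGRID_BSIZE bits set to 1], meaning that the cell may take any value in [1, XGRID_BSIZE]. When a value t is filled into a cell, the t-th bit (bit t - 1 counting from zero) of each related cell is cleared to 0, meaning that those related cells can no longer take the value t. The high bit of the filled cell is then set to 1 [by OR-ing in XFILLEDMASK]. For example:</p>
<p>A cell value of 0x0003 means that 1 and 2 are legal values for the cell.</p>
<p>A cell value of 0x0005 means that 1 and 3 are legal values for the cell.</p>
<p>A cell value of 0x8008 means that the cell has been filled with 4.</p>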
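<p>A minimal sketch of this encoding is shown here. The helper names (MaskForValue, IsFilled, DecodeValues) are hypothetical; only the bit layout and the 0x8000 flag come from the text.</p>

```cpp
#include <vector>

constexpr int kSide = 6;                            // values run from 1 to 6
constexpr int kFilledFlag = 0x8000;                 // same value as XFILLEDMASK
constexpr int kAllCandidates = (1 << kSide) - 1;    // 0x3F: every value still possible

// Bit that represents `value` (value 1 maps to bit 0).
inline int MaskForValue(int value) { return 1 << (value - 1); }

// True if the cell carries the "filled" flag.
inline bool IsFilled(int cell) { return (cell & kFilledFlag) != 0; }

// Lists the values whose bits are set in the cell, ignoring the filled flag.
std::vector<int> DecodeValues(int cell) {
    std::vector<int> values;
    for (int v = 1; v <= kSide; ++v)
        if (cell & MaskForValue(v)) values.push_back(v);
    return values;
}
```

<p>For an unfilled cell DecodeValues returns the remaining candidates. For a filled cell it returns the single value that was placed.</p>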
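<p>When a puzzle is loaded, cells with initial values are filled in directly. Each value is checked for legality before it is filled in, and if it is illegal the puzzle is immediately reported as having no solution.</p>
<p>After loading, all unfilled cells are examined to find the one with the fewest possibilities [the fewest bits set to 1], Cell[i,j], and a legal value is tried in it [this reduces the stack depth and the number of branches]. If, while the related cells of Cell[i,j] are being processed, the value of a related cell becomes 0, that cell has no legal value left, so another legal value is tried for Cell[i,j]. When all cells have been filled, one solution of the puzzle has been found. A simple recursive call is enough to solve the puzzle; for efficiency, the recursion can be unrolled into an explicit stack.</p>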
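<p>The search strategy described above can be sketched as plain recursion. This is not the author's implementation: all names are hypothetical, candidates are recomputed on every step instead of being maintained incrementally as bitmasks, and the search stops as soon as two solutions are found, which is enough to tell "unique" from "multiple".</p>

```cpp
#include <array>
#include <string>

constexpr int kBlockW = 3, kBlockH = 2;
constexpr int kSide = kBlockW * kBlockH;
constexpr int kArea = kSide * kSide;
constexpr int kAll = (1 << kSide) - 1;

using Board = std::array<int, kArea>;   // 0 = empty, otherwise 1..kSide

// Candidate bitmask of an empty cell: bit (v - 1) is set if v is still allowed.
int Candidates(const Board& b, int pos) {
    const int r = pos / kSide, c = pos % kSide;
    const int br = r / kBlockH * kBlockH, bc = c / kBlockW * kBlockW;
    int mask = kAll;
    for (int k = 0; k < kSide; ++k) {
        const int seen[3] = { b[r * kSide + k], b[k * kSide + c],
                              b[(br + k / kBlockW) * kSide + bc + k % kBlockW] };
        for (int v : seen)
            if (v) mask &= ~(1 << (v - 1));
    }
    return mask;
}

int PopCount(int m) { int n = 0; for (; m; m &= m - 1) ++n; return n; }

// Counts solutions, giving up once `limit` solutions have been found.
int CountSolutions(Board& b, int limit) {
    int best = -1, bestMask = 0, bestCnt = kSide + 1;
    for (int p = 0; p < kArea; ++p) {
        if (b[p]) continue;
        const int m = Candidates(b, p), n = PopCount(m);
        if (n == 0) return 0;                        // dead end: a cell has no legal value
        if (n < bestCnt) { best = p; bestMask = m; bestCnt = n; }
    }
    if (best < 0) return 1;                          // every cell is filled: one solution
    int found = 0;
    for (int v = 1; v <= kSide && found < limit; ++v) {
        if (!(bestMask & (1 << (v - 1)))) continue;
        b[best] = v;                                 // try the value
        found += CountSolutions(b, limit - found);
        b[best] = 0;                                 // undo it
    }
    return found;
}

// Returns 0 (no solution), 1 (unique solution) or 2 (multiple solutions).
// Assumption: a line of the wrong length is treated as having no solution.
int Classify(const std::string& line) {
    if (line.size() != static_cast<std::size_t>(kArea)) return 0;
    Board b{};
    for (int p = 0; p < kArea; ++p) {
        const char ch = line[p];
        if (ch < '1' || ch > '0' + kSide) continue;  // '*' marks an empty cell
        const int v = ch - '0';
        if (!(Candidates(b, p) & (1 << (v - 1)))) return 0;  // illegal given value
        b[p] = v;
    }
    return CountSolutions(b, 2);
}
```

<p>Calling Classify on one input line yields the three-way answer the problem asks for. Picking the cell with the fewest candidates first keeps the branching factor low, which is the same motivation the text gives for the optimized version.</p>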
<p>First, some macros, types, and structures are defined.</p>
<p>// Fill failed</p>
<p>#define XFILL_INVALID 0x7FFFFFFF</p>
<p>// Flag marking a filled cell</p>
<p>#define XFILLEDMASK 0x8000</p>
<p>// Width of each block</p>
<p>#define XGRID_BW 3</p>
<p>// Height of each block</p>
<p>#define XGRID_BH 2</p>
<p>// Size of each block</p>
<p>#define XGRID_BSIZE (XGRID_BW * XGRID_BH)</p>
<p>// Size of the whole grid</p>
<p>#define XGRID_AREA (XGRID_BSIZE * XGRID_BSIZE)</p>
<p>// Size of the whole grid, padded for SSE alignment</p>
<p>#define XGRID_AREASSE ((XGRID_AREA + 7) &gt;&gt; 3 &lt;&lt; 3)</p>
<p>// Number of cells related to each cell</p>
<p>#define XGRID_MUTUALITY (XGRID_BSIZE * 3 - XGRID_BW - XGRID_BH - 1)</p>
<p>// Number of cells related to each cell, padded for SSE alignment</p>
<p>#define XGRID_MUTUALITYSSE ((XGRID_MUTUALITY + 7) &gt;&gt; 3 &lt;&lt; 3)</p>
<p>// Threshold for modifying the related cells</p>
<p>#define XGRID_CHANGE_MUTUALITY_SIZE (XGRID_AREA * 30 / 100)</p>
<p>// Maximum depth of the stack</p>
<p>#define XGRID_STACK_SIZE (XGRID_AREA - XGRID_CHANGE_MUTUALITY_SIZE)</p>
<p>// When the block size is 9 or more, relations can be cleared dynamically to improve efficiency</p>
<p>#if (XGRID_BSIZE &gt;= 9)</p>
<p>#define XGRID_MUTUALITYSSE_CLEAR</p>
<p>#endif</p>
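<p>The two "SSE alignment" macros above round a count up to the next multiple of 8 with a shift pair. A small sketch of that arithmetic, using the hypothetical name RoundUpTo8:</p>

```cpp
// Rounds n up to the next multiple of 8: add 7, then clear the low three bits.
// Same arithmetic as ((X + 7) >> 3 << 3) in the macros.
constexpr int RoundUpTo8(int n) { return ((n + 7) >> 3) << 3; }

// Values for the 6x6 grid: 36 cells and 12 related cells per cell.
static_assert(RoundUpTo8(36) == 40, "XGRID_AREASSE would be 40");
static_assert(RoundUpTo8(12) == 16, "XGRID_MUTUALITYSSE would be 16");
```

<p>For the 6x6 puzzle this means the cell array holds 40 ints and each relation list holds 16 entries, with the extra slots used as padding.</p>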
<p>// Holds the cell values</p>
<p>typedef int XCells[XGRID_AREASSE];</p>
<p>// Data structure for the grid</p>
<p>typedef struct tagXGrid</p>
<p>{</p>
<p>__declspec(align(16)) XCells xCells; // Cell values</p>
<p>#ifdef XGRID_MUTUALITYSSE_CLEAR</p>
<p>// Index arrays of each cell's related cells</p>
<p>__declspec(align(16)) uint8 pMutuality[XGRID_AREA][XGRID_MUTUALITYSSE];</p>
<p>// Number of related cells of each cell</p>
<p>__declspec(align(16)) uint8 nMutuality[XGRID_AREA];</p>
<p>#endif</p>
<p>// Positions of the cells filled so far</p>
<p>__declspec(align(16)) int pFilledPos[XGRID_AREA];</p>
<p>// Number of filled cells</p>
<p>int nFilled;</p>
<p>// Run mode and number of fill solutions</p>
<p>int *pRunMode,*pTotal;</p>
<p>}XGrid;</p>
<p>The XGrid structure represents the whole grid.</p>
<p>Initialization:</p>
<p>1. Initialize the PopCnt array g_xPopCnt, a lookup table giving the number of 1 bits, which is used to find the cell with the fewest candidate values.</p>
<p>g_xPopCnt[0] = 0;</p>
<p>g_xPopCnt[1] = 1;</p>
<p>g_xPopCnt[2] = 1;</p>
<p>g_xPopCnt[3] = 2;</p>
<p>for(i = 0x04; i &lt; 0x10; ++i) g_xPopCnt[i] = g_xPopCnt[i &gt;&gt; 2] + g_xPopCnt[i &amp; 0x03];</p>
<p>for(i = 0x10; i &lt; 0x100; ++i) g_xPopCnt[i] = g_xPopCnt[i &gt;&gt; 4] + g_xPopCnt[i &amp; 0x0F];</p>
<p>#pragma omp parallel for</p>
<p>for(i = 0x100; i &lt; XFILLEDMASK; ++i) g_xPopCnt[i] = g_xPopCnt[i &gt;&gt; 8] + g_xPopCnt[i &amp; 0xFF];</p>
<p>#pragma omp parallel for</p>
<p>for(i = XFILLEDMASK; i &lt; 0x10000; ++i) g_xPopCnt[i] = 0xFF;</p>
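<p>The same table can be built with a simpler single-pass recurrence. This is a serial sketch under assumed names (BuildPopCountTable, kFilledFlag), not the author's staged and OpenMP-parallel version. It keeps the convention that entries for filled cells hold 0xFF so that they never win a "fewest candidates" search.</p>

```cpp
#include <cstdint>
#include <vector>

constexpr int kFilledFlag = 0x8000;   // same value as XFILLEDMASK

// Builds a 65536-entry lookup table of bit counts for 16-bit cell values.
// Values with the filled flag set are mapped to 0xFF.
std::vector<std::uint8_t> BuildPopCountTable() {
    std::vector<std::uint8_t> table(0x10000);
    table[0] = 0;
    for (int i = 1; i < kFilledFlag; ++i)
        table[i] = static_cast<std::uint8_t>(table[i >> 1] + (i & 1));  // drop the lowest bit, add it back
    for (int i = kFilledFlag; i < 0x10000; ++i)
        table[i] = 0xFF;                                                // filled cells are never "minimal"
    return table;
}
```

<p>The original splits the work into ranges so that each loop only reads entries written by an earlier loop, which is what makes the OpenMP loops safe. The recurrence here reads table[i >> 1] and is therefore inherently sequential.</p>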
<p>2. Initialize the array of related cells for each cell.</p>
<p>// Compute the array of related cells for each cell</p>
<p>for(i = 0; i &lt; XGRID_AREA; ++i)</p>
<p>{</p>
<p>g_nMutuality[i] = 0;</p>
<p>for(j = 0; j &lt; XGRID_AREA; ++j)</p>
<p>{</p>
<p>if(i == j) continue;</p>
<p>// Two cells are related if any of the following holds</p>
<p>// 1. The cells have the same horizontal coordinate</p>
<p>// 2. The cells have the same vertical coordinate</p>
<p>// 3. The cells' blocks have the same horizontal and vertical coordinates [both cells are in the same block]</p>
<p>if( ( i % XGRID_BSIZE == j % XGRID_BSIZE ) || ( i / XGRID_BSIZE == j / XGRID_BSIZE ) ||</p>
<p>( ( i / XGRID_BSIZE / XGRID_BH == j / XGRID_BSIZE / XGRID_BH ) &amp;&amp;</p>
<p>( ( i % XGRID_BSIZE ) / XGRID_BW ) == ( ( j % XGRID_BSIZE ) / XGRID_BW ) ) )</p>
<p>{</p>
<p>g_xMutuality[i][g_nMutuality[i]++] = j;</p>
<p>}</p>
<p>}</p>
<p>for(j = XGRID_MUTUALITY; j &lt; XGRID_MUTUALITYSSE; ++j) g_xMutuality[i][j] = XGRID_AREA;</p>
<p>}</p>
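<p><strong>Filling cells</strong>:</p>
<p>1. Fill a cell</p>
<p>The precondition is that placing the value represented by nMask in the cell at nPos is legal. First, pCells[nPos] is set to nMask + XFILLEDMASK, meaning the cell now holds the value represented by nMask. Then the cell's related cells are processed to remove nMask from their candidates. When a related cell is left with only one possibility after the removal, XFillCell goes on to fill that cell as well. Finally, the positions of the filled cells are saved in pGrid-&gt;pFilledPos.</p>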
<p>__inline int XFillCell(XGrid* pGrid, XCells pCells, int nPos, int nMask)</p>
<p>{</p>
<p>int nFilled = pGrid-&gt;nFilled;</p>
<p>for(;;)</p>
<p>{</p>
<p>#ifdef XGRID_MUTUALITYSSE_CLEAR</p>
<p>uint8* const pMutuality = pGrid-&gt;pMutuality[nPos];</p>
<p>const uint8 nMutuality = pGrid-&gt;nMutuality[nPos];</p>
<p>#else</p>
<p>uint8* const pMutuality = g_xMutuality[nPos];</p>
<p>const uint8 nMutuality = g_nMutuality[nPos];</p>
<p>#endif</p>
<p>pCells[nPos] = nMask + XFILLEDMASK;</p>
<p>pGrid-&gt;pFilledPos[nFilled++] = nPos;</p>
<p>nPos = -1;</p>
<p>#pragma unroll(16)</p>
<p>for(int i = 0; i &lt; nMutuality; ++i)</p>
<p>{</p>
<p>const uint8 nTmp = pMutuality[i];</p>
<p>// Remove nMask from this related cell's candidates</p>
<p>pCells[nTmp] &amp;= ~nMask;</p>
<p>// If the cell has no candidate left, return failure</p>
<p>if(pCells[nTmp] == 0) return XFILL_INVALID;</p>
<p>// If the cell has exactly one candidate left, remember it so that it is filled next</p>
<p>if(g_xPopCnt[pCells[nTmp]] == 1) nPos = nTmp;</p>
<p>}</p>
<p>if(nPos &lt; 0) break;</p>
<p>nMask = pCells[nPos];</p>
<p>}</p>
<p>pGrid-&gt;nFilled = nFilled;</p>
<p>return pGrid-&gt;nFilled;</p>
<p>}</p>
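<p>The fill-and-propagate step can also be written as a self-contained sketch. It is a simplified variant under assumed names (Cells, AreRelated, FillAndPropagate): related cells are found by scanning the grid instead of using a precomputed table, and every cell that drops to a single candidate is queued, whereas XFillCell above follows only the last such cell it sees.</p>

```cpp
#include <array>
#include <vector>

constexpr int kW = 3, kH = 2, kN = kW * kH, kCells = kN * kN;
constexpr int kFilledFlag = 0x8000;             // same value as XFILLEDMASK
constexpr int kAllCandidates = (1 << kN) - 1;

using Cells = std::array<int, kCells>;

// True if cells a and b share a row, a column, or a sub-block.
bool AreRelated(int a, int b) {
    if (a == b) return false;
    const int ra = a / kN, ca = a % kN, rb = b / kN, cb = b % kN;
    return ra == rb || ca == cb || (ra / kH == rb / kH && ca / kW == cb / kW);
}

// True if exactly one bit is set.
bool IsSingle(int m) { return m != 0 && (m & (m - 1)) == 0; }

// Fills `pos` with the value whose single bit is `mask`, then keeps filling cells
// that are left with one candidate. Returns false if some cell runs out of candidates.
bool FillAndPropagate(Cells& cells, int pos, int mask) {
    if ((cells[pos] & kFilledFlag) || !(cells[pos] & mask)) return false;
    cells[pos] = mask;
    std::vector<int> pending{pos};
    while (!pending.empty()) {
        const int p = pending.back();
        pending.pop_back();
        const int value = cells[p];             // single-bit candidate mask
        cells[p] = value | kFilledFlag;         // mark the cell as filled
        for (int q = 0; q < kCells; ++q) {
            if (!AreRelated(p, q) || (cells[q] & kFilledFlag)) continue;
            const int before = cells[q];
            const int after = before & ~value;  // remove the value from the candidates
            if (after == 0) return false;       // contradiction
            cells[q] = after;
            if (after != before && IsSingle(after)) pending.push_back(q);
        }
    }
    return true;
}
```

<p>On failure the grid is left partially modified, so a caller that wants to backtrack needs to keep a copy of the cells, as the stack-based solver does with xCellsStack.</p>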
<p><strong>Loading puzzles</strong>:</p>
<p>1. Fill a cell during loading. Unlike XFillCell, loading must check that the value being filled is legal.</p>
<p>__inline int XFillCellLoad(XGrid* pGrid, XCells pCells, int nPos, int nMask)</p>
<p>{</p>
<p>int nNewPos = -1;</p>
<p>#ifdef XGRID_MUTUALITYSSE_CLEAR</p>
<p>uint8* const pMutuality = pGrid-&gt;pMutuality[nPos];</p>
<p>const uint8 nMutuality = pGrid-&gt;nMutuality[nPos];</p>
<p>#else</p>
<p>uint8* const pMutuality = g_xMutuality[nPos];</p>
<p>const uint8 nMutuality = g_nMutuality[nPos];</p>
<p>#endif</p>
<p>// If the cell is already filled with nMask, return immediately</p>
<p>if(pCells[nPos] == nMask + XFILLEDMASK) return pGrid-&gt;nFilled;</p>
<p>// Check whether the current cell can take the value represented by nMask</p>
<p>if((pCells[nPos] &amp; nMask) == 0) return XFILL_INVALID;</p>
<p>// Fill cell nPos with the value represented by nMask</p>
<p>pCells[nPos] = nMask + XFILLEDMASK;</p>
<p>for(int i = 0; i &lt; nMutuality; ++i)</p>
<p>{</p>
<p>const uint8 nTmp = pMutuality[i];</p>
<p>// Remove nMask from this related cell's candidates</p>
<p>pCells[nTmp] &amp;= ~nMask;</p>
<p>// If the cell has no candidate left, return failure</p>
<p>if(pCells[nTmp] == 0) return XFILL_INVALID;</p>
<p>// If the cell has exactly one candidate left, XFillCell will be called to fill it</p>
<p>if(g_xPopCnt[pCells[nTmp]] == 1) nNewPos = nTmp;</p>
<p>}</p>
<p>// Record the cell that was just filled</p>
<p>pGrid-&gt;pFilledPos[pGrid-&gt;nFilled++] = nPos;</p>
<p>if(nNewPos &gt;= 0)</p>
<p>{</p>
<p>if(XFillCell(pGrid, pCells, nNewPos, pCells[nNewPos]) == XFILL_INVALID)</p>
<p>{</p>
<p>// Undo the cell that was just filled</p>
<p>--pGrid-&gt;nFilled;</p>
<p>return XFILL_INVALID;</p>
<p>}</p>
<p>}</p>
<p>return pGrid-&gt;nFilled;</p>
<p>}</p>
<p>2. Load a puzzle: load the puzzle represented by pInput into pGrid.</p>
<p>// Load the test data</p>
<p>__inline int XLoadGrid(XGrid* pGrid, const char* pInput)</p>
<p>{</p>
<p>int nPos = 0;</p>
<p>int nMask = (1 &amp;lt;&amp;lt; XGRID_BSIZE) - 1;</p>
<p>#ifdef XGRID_MUTUALITYSSE_CLEAR</p>
<p>XMEMCOPY(pGrid-&amp;gt;pMutuality, g_xMutuality, sizeof(g_xMutuality));</p>
<p>XMEMCOPY(pGrid-&amp;gt;nMutuality, g_nMutuality, sizeof(g_nMutuality));</p>
<p>#endif</p>
<p>for(nPos = 0; nPos &amp;lt; XGRID_AREA; ++nPos) pGrid-&amp;gt;xCells[nPos]<br />
= nMask;</p>
<p>for(nPos = 0; nPos &amp;lt; XGRID_AREA; ++nPos)</p>
<p>{</p>
<p>if(pInput[nPos] &amp;gt;= '1' &amp;amp;&amp;amp; pInput[nPos] &amp;lt;= '0'<br />
+ XGRID_BSIZE)</p>
<p>{ // 填充nPos的值为1 &amp;lt;&amp;lt; ( pInput[nPos] - '1' )</p>
<p>if(XFillCellLoad(pGrid, pGrid-&amp;gt;xCells, nPos, 1 &amp;lt;&amp;lt; ( pInput[nPos]<br />
- '1' )) == XFILL_INVALID)</p>
<p>{</p>
<p>return XFILL_INVALID;</p>
<p>}</p>
<p>}</p>
<p>}</p>
<p>#ifdef XGRID_MUTUALITYSSE_CLEAR</p>
<p>// [XGRID_BSIZE&amp;gt;=9且需要查找所有情况时清除关联可加速]</p>
<p>for(int i = 0; i &amp;lt; pGrid-&amp;gt;nFilled; ++i)</p>
<p>{</p>
<p>XRemoveMutuality(pGrid, pGrid-&amp;gt;pFilledPos[i]);</p>
<p>}</p>
<p>#endif</p>
<p>return XGRID_AREA - pGrid-&amp;gt;nFilled;</p>
<p>}</p>
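<p>The character-to-mask encoding used by XLoadGrid is sketched below on its own, without any propagation. The names kBoxSize, kAllCandidates, CharToMask and ParseLine are assumptions made for this illustration, and kBoxSize = 9 is assumed to correspond to XGRID_BSIZE for a standard 9x9 puzzle.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

constexpr int kBoxSize = 9;                           // assumed value of XGRID_BSIZE
constexpr int kAllCandidates = (1 << kBoxSize) - 1;   // every digit still possible

// Maps an input character to its one-bit mask, or 0 for a blank cell.
inline int CharToMask(char c)
{
    if (c >= '1' && c <= '0' + kBoxSize) return 1 << (c - '1');
    return 0;
}

// Converts a puzzle line to candidate masks without any propagation:
// a given digit becomes a single bit, a blank keeps every candidate.
inline std::vector<int> ParseLine(const std::string& line)
{
    std::vector<int> cells;
    cells.reserve(line.size());
    for (char c : line)
    {
        const int mask = CharToMask(c);
        cells.push_back(mask != 0 ? mask : kAllCandidates);
    }
    return cells;
}
```

<p>Any character outside '1' to '9' counts as blank, which matches the range test in XLoadGrid. The real loader additionally propagates every given digit through XFillCellLoad.</p>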
<p><strong>Puzzle solving:</strong></p>
<p>1. Find the cell with the fewest candidate values.</p>
<p>__inline int XFindMinFeasibilityPos(XCells pCells)</p>
<p>{</p>
<p>int nMinCnt = 0xFF;</p>
<p>int nMinPos = -1;</p>
<p>// Among all unfilled cells, find the one with the fewest candidates; stop early once a cell with a single candidate is found</p>
<p>for(int i = 0; i &lt; XGRID_AREA &amp;&amp; nMinCnt &gt; 1; ++i)</p>
<p>{</p>
<p>if(g_xPopCnt[pCells[i]] &lt; nMinCnt)</p>
<p>{</p>
<p>nMinPos = i;</p>
<p>nMinCnt = g_xPopCnt[pCells[i]];</p>
<p>}</p>
<p>}</p>
<p>return nMinPos;</p>
<p>}</p>
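<p>A self-contained sketch of this "fewest candidates first" selection follows. The way g_xPopCnt is built is not shown in this part of the article, so BuildPopCountTable is only one plausible construction. kFilledBit and FindMinCandidateCell are hypothetical names, and filled cells are skipped explicitly here instead of relying on the table contents.</p>

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint16_t kFilledBit = 0x8000;   // hypothetical "cell is filled" flag

// Builds a 512-entry popcount table for 9-bit candidate masks.
inline std::array<std::uint8_t, 512> BuildPopCountTable()
{
    std::array<std::uint8_t, 512> table{};
    for (int v = 1; v < 512; ++v)
        table[v] = static_cast<std::uint8_t>(table[v >> 1] + (v & 1));
    return table;
}

// Returns the index of the unfilled cell with the fewest candidates,
// or -1 if every cell is filled. Unfilled masks are assumed to fit in 9 bits.
inline int FindMinCandidateCell(const std::vector<std::uint16_t>& cells)
{
    static const std::array<std::uint8_t, 512> table = BuildPopCountTable();
    int minCount = 0xFF;
    int minPos = -1;
    for (int i = 0; i < static_cast<int>(cells.size()) && minCount > 1; ++i)
    {
        const std::uint16_t cell = cells[i];
        if (cell & kFilledBit) continue;           // filled cells are not candidates
        const int count = table[cell];
        if (count < minCount) { minCount = count; minPos = i; }
    }
    return minPos;
}
```

<p>Branching on the cell with the fewest candidates keeps the search tree narrow, and the early exit mirrors the nMinCnt &gt; 1 loop condition in XFindMinFeasibilityPos.</p>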
<p>2. Solve the puzzle, with the recursion unrolled into an explicit stack.</p>
<p>__inline void XFillCellStack(XGrid* pGrid)</p>
<p>{</p>
<p>__declspec(align(16)) XCells xCellsStack[XGRID_AREA + 1];</p>
<p>__declspec(align(16)) int nMinPosStack[XGRID_AREA];</p>
<p>__declspec(align(16)) int nCurValStack[XGRID_AREA];</p>
<p>__declspec(align(16)) int nFilledStack[XGRID_AREA];</p>
<p>int nIndex = 0;</p>
<p>XMEMCOPY(xCellsStack[0], pGrid-&gt;xCells, sizeof(xCellsStack[0]));</p>
<p>// Find the cell with the fewest candidates</p>
<p>nMinPosStack[nIndex] = XFindMinFeasibilityPos(xCellsStack[nIndex]);</p>
<p>nCurValStack[nIndex] = xCellsStack[nIndex][nMinPosStack[nIndex]];</p>
<p>nFilledStack[nIndex] = pGrid-&gt;nFilled;</p>
<p>while( (*pGrid-&gt;pRunMode &amp; XRUN_FIND_FLAG) &amp;&amp; nIndex &gt;= 0 )</p>
<p>{</p>
<p>nFilledStack[nIndex + 1] = XFILL_INVALID;</p>
<p>// Try the remaining candidates of the current cell, one at a time</p>
<p>while(nFilledStack[nIndex + 1] == XFILL_INVALID &amp;&amp; nCurValStack[nIndex])</p>
<p>{</p>
<p>XMEMCOPY(xCellsStack[nIndex + 1], xCellsStack[nIndex], sizeof(xCellsStack[0]));</p>
<p>// Isolate the lowest set bit, i.e. the next candidate to try</p>
<p>unsigned int nMask = nCurValStack[nIndex] &amp; -nCurValStack[nIndex];</p>
<p>nCurValStack[nIndex] -= nMask;</p>
<p>nFilledStack[nIndex + 1] = XFillCell(pGrid, xCellsStack[nIndex + 1], nMinPosStack[nIndex], nMask);</p>
<p>}</p>
<p>if(nFilledStack[nIndex + 1] &gt; XGRID_AREA)</p>
<p>{ // No legal value remains: backtrack (the guard avoids reading nFilledStack[-1] at the root)</p>
<p>if(--nIndex &gt;= 0) pGrid-&gt;nFilled = nFilledStack[nIndex];</p>
<p>}</p>
<p>else if(nFilledStack[nIndex + 1] != XGRID_AREA)</p>
<p>{ // The value is legal and unfilled cells remain: go one level deeper</p>
<p>++nIndex;</p>
<p>// Find the cell with the fewest candidates</p>
<p>nMinPosStack[nIndex] = XFindMinFeasibilityPos(xCellsStack[nIndex]);</p>
<p>nCurValStack[nIndex] = xCellsStack[nIndex][nMinPosStack[nIndex]];</p>
<p>}</p>
<p>else</p>
<p>{ // Every cell is filled: record the solution</p>
<p>XAddSolution(pGrid-&gt;pRunMode, pGrid-&gt;pTotal, xCellsStack[nIndex + 1]);</p>
<p>pGrid-&gt;nFilled = nFilledStack[nIndex];</p>
<p>}</p>
<p>}</p>
<p>}</p>
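<p>The inner loop of XFillCellStack walks through a cell's candidates by repeatedly isolating the lowest set bit of the mask (x &amp; -x) and subtracting it. The sketch below shows only that idiom; CandidateDigits is a hypothetical helper, and it uses unsigned arithmetic so that the negation is well defined.</p>

```cpp
#include <cassert>
#include <vector>

// Lists the digits encoded in a candidate mask, lowest first, by repeatedly
// isolating the lowest set bit and removing it from the mask.
inline std::vector<int> CandidateDigits(unsigned int candidates)
{
    std::vector<int> digits;
    while (candidates != 0)
    {
        const unsigned int lowest = candidates & (0u - candidates);  // lowest set bit
        candidates -= lowest;                                        // drop it from the set
        int digit = 1;
        for (unsigned int b = lowest; b > 1; b >>= 1) ++digit;       // bit index + 1
        digits.push_back(digit);
    }
    return digits;
}
```

<p>For example, the mask 0x125 (bits 0, 2, 5 and 8 set) yields the digits 1, 3, 6 and 9. Each level of the explicit stack only needs to store the mask of candidates not yet tried, which is what nCurValStack holds.</p>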
<p>3. Solve one puzzle (entry point).</p>
<p>// Solve the puzzle described by pInput</p>
<p>int XFindGrid(int nRunMode, const char* pInput)</p>
<p>{</p>
<p>int nTotal = 0;</p>
<p>XGrid xGrid = {0};</p>
<p>xGrid.pRunMode = &amp;nRunMode;</p>
<p>xGrid.pTotal = &amp;nTotal;</p>
<p>// Load the grid data</p>
<p>switch(XLoadGrid(&amp;xGrid, pInput))</p>
<p>{</p>
<p>case 0: // Loading already filled every cell</p>
<p>nTotal = 1;</p>
<p>break;</p>
<p>case XFILL_INVALID: // The givens contradict each other</p>
<p>break;</p>
<p>default:</p>
<p>// Fill the remaining cells</p>
<p>XFillCellStack(&amp;xGrid);</p>
<p>}</p>
<p>return nTotal;</p>
<p>}</p>
<p><strong>Parallel optimization</strong>:</p>
<p><strong>Serial mode</strong></p>
<p>for(int i = 0; i &lt; nCount; ++i)</p>
<p>{</p>
<p>pTotal[i] = XFindGrid(nRunMode, pInput + i * nLineSize);</p>
<p>}</p>
<p><strong>Optimizing with OpenMP</strong></p>
<p>#pragma omp parallel for schedule(guided, 1)</p>
<p>for(int i = 0; i &lt; nCount; ++i)</p>
<p>{</p>
<p>pTotal[i] = XFindGrid(nRunMode, pInput + i * nLineSize);</p>
<p>}</p>
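<p>The OpenMP loop parallelizes cleanly because the puzzles are independent and iteration i writes only to its own slot pTotal[i]. The sketch below shows the same pattern in a form that compiles on its own. CountGivens is a hypothetical stand-in for XFindGrid and ProcessAll is an invented wrapper; when the file is built without OpenMP support the pragma is ignored and the loop simply runs serially.</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for XFindGrid: counts the given digits in one line.
inline int CountGivens(const char* line, int lineSize)
{
    int n = 0;
    for (int i = 0; i < lineSize; ++i)
        if (line[i] >= '1' && line[i] <= '9') ++n;
    return n;
}

// Processes every line in parallel. Iteration i writes only out[i], so no
// locking is needed and the results stay in input order.
inline std::vector<int> ProcessAll(const char* input, int count, int lineSize)
{
    std::vector<int> results(count, 0);
    int* out = results.data();
    #pragma omp parallel for schedule(guided, 1)
    for (int i = 0; i < count; ++i)
    {
        out[i] = CountGivens(input + i * lineSize, lineSize);
    }
    return results;
}
```

<p>With schedule(guided, 1) the runtime hands out large chunks first and progressively smaller ones, down to a single iteration, which helps balance the load when some puzzles take much longer than others.</p>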
<p><strong>Optimizing with TBB</strong></p>
<p>class CXSudokuTask</p>
<p>{</p>
<p>int m_nRunMode;</p>
<p>int m_nLineSize;</p>
<p>const char* m_pInput;</p>
<p>int* m_pTotal;</p>
<p>public:</p>
<p>CXSudokuTask(int nRunMode, int nLineSize, int* pTotal, const char* pInput) :</p>
<p>m_nRunMode(nRunMode), m_nLineSize(nLineSize),</p>
<p>m_pInput(pInput), m_pTotal(pTotal) { }</p>
<p>void operator () (const blocked_range&lt;size_t&gt; &amp; r) const</p>
<p>{</p>
<p>for (size_t i = r.begin(); i != r.end(); ++ i)</p>
<p>{</p>
<p>m_pTotal[i] = XFindGrid(m_nRunMode, m_pInput + i * m_nLineSize);</p>
<p>}</p>
<p>}</p>
<p>};</p>
<p>parallel_for(blocked_range&lt;size_t&gt;(0, nCount, max(1, nCount/32)),</p>
<p>CXSudokuTask(nRunMode, nLineSize, pTotal, pInput));</p>
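<p><strong>Build notes</strong>:</p>
<p>Windows:</p>
<p>Compile with ICC, without PGO (enabling PGO was found to reduce performance).</p>
<p>Linux:</p>
<p>Compile with icpc.</p>
<p>1. Upload the Src and Linux directories from the archive to the server.</p>
<p>2. Change into the Linux directory and run make.</p>
<p>3. Change into the Bin directory; the executable is XSudoku.</p>
<p>Other:</p>
<p>To get the output in English, enable the XUSE_ENGLISH macro on line 86 of XSudoku.h; the default is Chinese.</p>
<p>The Intel compiler and TBB are required; download and install them first.</p>
<p><a href="http://www.pudn.com/downloads158/sourcecode/mpi/111186773sudoku.rar" class="broken_link">Download the source code</a></p>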
<p><strong>Test results</strong>:</p>
<p>Test platform:</p>
<p>Time of this report: 4/9/2009, 14:16:02</p>
<p>Machine name: X-DELL</p>
<p>Operating System: Windows XP Professional (5.1, Build 2600) Service Pack 3</p>
<p>(2600.xpsp_sp3_gdr.080814-1236)</p>
<p>Language: Chinese (Regional Setting: Chinese)</p>
<p>System Manufacturer: Dell Inc.</p>
<p>System Model: Vostro 1400</p>
<p>BIOS: Phoenix ROM BIOS PLUS Version 1.10 A03</p>
<p>Processor: Intel(R) Core(TM)2 Duo CPU T5270 @ 1.40GHz (2 CPUs), ~550MHz</p>
<p>Memory: 1014MB RAM</p>
<p>Page File: 812MB used, 1628MB available</p>
<p>Windows Dir: C:\WINDOWS</p>
<p>DirectX Version: DirectX 9.0c (4.09.0000.0904)</p>
<p>DX Setup Parameters: Not found</p>
<p>DxDiag Version: 5.03.0001.0904 32bit Unicode</p>
<p>The test data consists of 100k puzzles.</p>
<p>Serial mode</p>
<p>XSudoku.exe d100k.in os &gt; ret.txt</p>
<p>Compute time: 0.411790 s</p>
<p>Output time: 0.032550 s</p>
<p>Total time: 0.444407 s</p>
<p>OpenMP parallel mode</p>
<p>XSudoku.exe d100k.in opo &gt; ret.txt</p>
<p>Compute time: 0.209383 s</p>
<p>Output time: 0.034386 s</p>
<p>Total time: 0.243842 s</p>
<p>TBB parallel mode</p>
<p>XSudoku.exe d100k.in opt &gt; ret.txt</p>
<p>Compute time: 0.216393 s</p>
<p>Output time: 0.035294 s</p>
<p>Total time: 0.251751 s</p>
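<p>The timings above can be turned into speedup and parallel efficiency figures with a couple of one-line helpers. This is an illustrative sketch only; Speedup and Efficiency are names chosen for the example, and the core count of 2 comes from the dual-core test machine listed above.</p>

```cpp
#include <cassert>
#include <cmath>

// Speedup of a parallel run relative to the serial run.
inline double Speedup(double serialSeconds, double parallelSeconds)
{
    return serialSeconds / parallelSeconds;
}

// Parallel efficiency: speedup divided by the number of cores used.
inline double Efficiency(double serialSeconds, double parallelSeconds, int cores)
{
    return Speedup(serialSeconds, parallelSeconds) / cores;
}
```

<p>With the compute times measured above, Speedup(0.411790, 0.209383) is about 1.97 for OpenMP and Speedup(0.411790, 0.216393) is about 1.90 for TBB, which is roughly 98% and 95% efficiency on two cores. The output phase is not parallelized, so the speedup of the total time is lower, at about 1.82 and 1.77.</p>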
<p>Function-level profiling:</p>
<p><a href="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/2.jpg"><img class="alignnone size-medium wp-image-400001717" title="2" src="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/2-300x187.jpg" alt="" width="600" height="375" /></a><br />
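Analysis of the data collected with VTune:<br />
The most expensive function is XFillCell, which fills a value into a cell. Its CPI in the Debug build is 0.869, which is fairly good.<br />
Next is XFillCellLoad, which fills a cell from the input data. Its CPI in the Debug build is 0.70, which is also fairly good.</p>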
<p><a href="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/1.jpg"><img class="alignnone size-medium wp-image-400001716" title="1" src="http://software.intel.com/zh-cn/blogs/wordpress/wp-content/uploads/2009/05/1-300x187.jpg" alt="" width="600" height="375" /></a></p>
<p>Analysis of the data collected with Thread Profiler: the algorithm runs in parallel 94% of the time, which is close to ideal.</p>
]]></content:encoded>
			<wfw:commentRss>http://software.intel.com/zh-cn/blogs/2009/05/04/400001525/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

