# Graph Algorithms: Shortest Path

Dijkstra algorithm is a graph search algorithm that solves the single-source shortest path problem for a graph with non-negative edge path costs, producing a shortest path tree. The algorithm requires repeated searching for the vertex having the smallest distance and accumulating shortest distance from the source vertex. This example calculates the shortest path between each pair of vertexes in a complete graph having 2000 vertexes using Dijkstra algorithm. It run serially, with Intel® Cilk™ Plus Array Notation (AN) for vectorization, with Intel Cilk Plus cilk_for for parallelization, and with both vectorization and cilk_for.

#### Code Change Highlights:

• `cilk_for`
• linear version:
```221 int i;
222 // Temporary array storing intermedia path length result to each vertex
223 unsigned int vtemp[VNUM];
224 // Flag array:
225 // "1" means the shortest path hasn't been finished
226 // "0" menas the shortest path has been finished
227 unsigned char vflag[VNUM];
228
229 // Main loop calculate the shortest path to all other vertexes from vertex "i" in each iteration
230 for (i = 0;i < VNUM;++i) {

...

270 }```
`cilk_for` version:
```331 cilk_for (int i = 0;i < VNUM;++i) {
332 // Declare temporary arrays inside "cilk_for" loop body to make them private
333 // Temporary array storing intermedia path length result to each vertex
334 unsigned int vtemp[VNUM];
335 // Flag array:
336 // "1" means the shortest path hasn't been finished
337 // "0" menas the shortest path has been finished
338 unsigned char vflag[VNUM];

...

377 }```
• Array Notation
• Searching for the vertex having the smallest distance
Scalar version:
```248 minval = minpos = INFINITE;
249 // Loop scan vtemp to find the index of vertex having the shortest path in vtemp and its length
250 for (k = 0; k < VNUM;k++)
251   if (vtemp[k] < minval) {
252       minpos = k;
253       minval = vtemp[k];
254 }```
Array notation version:
```306 minpos = __sec_reduce_min_ind(vtemp[:]);
307 minval = vtemp[minpos];```
Accumulating shortest distance from the source vertex
Scalar version:
```264 for (k = 0; k < VNUM;k++)
265     if (vflag[k] && ((m_graph[minpos][k] + minval) < vtemp[k])) {
266        vtemp[k] = (m_graph[minpos][k] + minval);
267        m_pvertex[i][k] = minpos;
268    }```
Array notation version:
```317 if (vflag[:] && ((m_graph[minpos][:] + minval) < vtemp[:])) {
318      vtemp[:] = (m_graph[minpos][:] + minval);
319      m_pvertex[i][:] = minpos;
320 }```
• `cilk_for` + Array Notation
• linear version:
```221 int i;
222 // Temporary array storing intermedia path length result to each vertex
223 unsigned int vtemp[VNUM];
224 // Flag array:
225 // "1" means the shortest path hasn't been finished
226 // "0" menas the shortest path has been finished
227 unsigned char vflag[VNUM];
228
229 // Main loop calculate the shortest path to all other vertexes from vertex "i" in each iteration
230 for (i = 0;i < VNUM;++i) {

...

248         minval = minpos = INFINITE;
249         // Loop scan vtemp to find the index of vertex having the shortest path in vtemp and its length
250         for (k = 0; k < VNUM;k++)
251             if (vtemp[k] < minval) {
252                 minpos = k;
253                 minval = vtemp[k];
254             }

...

264         for (k = 0; k < VNUM;k++)
265             if (vflag[k] && ((m_graph[minpos][k] + minval) < vtemp[k])) {
266                 vtemp[k] = (m_graph[minpos][k] + minval);
267                 m_pvertex[i][k] = minpos;
268             }

...

270 }```
`cilk_for` + Array Notation version:
```386 cilk_for (int i = 0;i < VNUM;++i) {
387 // Declare temporary arrays inside "cilk_for" loop body to make them private
388 // Temporary array storing intermedia path length result to each vertex
389 unsigned int vtemp[VNUM];
390 // Flag array:
391 // "1" means the shortest path hasn't been finished
392 // "0" menas the shortest path has been finished
393 unsigned char vflag[VNUM];

...

412      vtemp[:] = (m_graph[minpos][:] + minval);
413      m_pvertex[i][:] = minpos;

...

423      if (vflag[:] && ((m_graph[minpos][:] + minval) < vtemp[:])) {
424          vtemp[:] = (m_graph[minpos][:] + minval);
425          m_pvertex[i][:] = minpos;
426      }

...

428 }```
This simple change creates code that ran about 5x faster on our machine.

#### Performance Data:

Note: Modified Speedup shows performance speedup with respect to serial implementation.

Modified Speedup Compiler (Intel® 64) Compiler options System specifications
AN: 2.0x
cilk_for: 4.4x
Both: 8.3x
Intel C++ Compiler 15.0 for Windows /O2 /QxAVX /Qipo Microsoft Windows Server 2012* (x64)
2rd generation Intel® Xeon® E3 1280 CPU @ 3.50GHz
8GB memory
AN: 2.1x
cilk_for: 4.6x
Both: 8.2x
Intel C++ Compiler 15.0 for Linux -O2 -xAVX -ipo RHEL* 7 (x64)
4rd Generation Intel Core™ i7-4790 CPU @ 3.60GHz
32GB memory

### Build Instructions:

• For Microsoft Visual Studio* 2010 and 2012 users:
• Open the solution `.sln` file
[Optional] To collect performance numbers (will run example 5 times and take average time):
• Project Properties -> C/C++ -> Preprocessor -> Preprocessor Definitions: add `PERF_NUM`
Choose a configuration (for best performance, choose a release configuration):
• Intel-debug and Intel-release: uses Intel® C++ compiler
• VSC-debug and VSC-release: uses Visual C++ compiler (only linear/scalar will run)
• For Windows* Command Line users:
• Enable your particular compiler environment
For Intel® C++ Compiler:
• Open the appropriate Intel C++ compiler command prompt
• Navigate to project folder
• Compile with `Build.bat [perf_num]`
• `perf_num`: collect performance numbers (will run example 5 times and take average time)
• To run: `Build.bat run [help|0|1|2|3|4]`
For Visual C++ Compiler (only linear/scalar will run):
• Open the appropriate MicrosoftVisual Studio* 2010 or 2012 command prompt
• Navigate to project folder
• To compile: `Build.bat [perf_num]`
• `perf_num`: collect performance numbers (will run example 5 times and take average time)
• To run: `Build.bat run`
• For Linux* or OS X* users:
• Set the icc environment: `source <icc-install-dir>/bin/compilervars.sh {ia32|intel64}`
Navigate to project folder
For Intel® C++ compiler:
• To compile: `make [icpc] [perf_num=1]`
• `perf_num=1:` collect performance numbers (will run example 5 times and take average time)
• To run: `make run [option=help|0|1|2|3|4]`
For gcc (only linear/scalar will run):
• Compile with `make gcc [perf_num=1]`
• `perf_num=1:` collect performance numbers (will run example 5 times and take average time)
• To run: `make run`
Pour de plus amples informations sur les optimisations de compilation, consultez notre Avertissement concernant les optimisations.