Progress report on using MTL during the second half of June 2010


Access to MTL in this period was part of a two-phase project:
-- Phase 1: validating a design pattern for parallelizing single-threaded
in-place sorting algorithms (and beyond)
-- Phase 2: getting students of San Jose State University involved in
Fall 2010, in cooperation with Prof. Robert Chun

Executive Summary

The goals of Phase 1 were achieved partly through serendipity, and not
according to the initial plan of testing C programs with POSIX
threads. Instead, an equivalent multi-threaded Java program
saved the day. Clay Breshears's assistance was crucial throughout.
Mike Pearce's forum reply helped to reduce confusion.


The learning curve is steep because access requires a VPN (which
cuts off other web access) and ssh. I lost at least a day because
the password I received was wrong (and was also hard to decipher
in the JPG image in which it arrived).

Compiling a C program with POSIX threads required, to my surprise,
the linker option '-lpthread'.
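
For readers who hit the same surprise, here is a minimal sketch of a POSIX threads program with its build command. The file name hello_threads.c, the thread count, and the toy workload (each thread squares its own index) are assumptions for illustration only, not the sorting code discussed in this report. With gcc, the option '-pthread' is a common alternative to '-lpthread'.

```c
/* Build (assumed file name hello_threads.c):
 *   gcc -O2 hello_threads.c -o hello_threads -lpthread
 */
#include <pthread.h>

#define NUM_THREADS 4   /* hypothetical thread count */

/* One result slot per thread, so no locking is needed. */
static int results[NUM_THREADS];

/* Toy workload: each worker stores the square of its own index. */
static void *worker(void *arg)
{
    int id = *(int *)arg;
    results[id] = id * id;
    return NULL;
}

/* Starts the workers, waits for them, and returns the sum of their
 * results, or -1 if not all threads could be created. */
int run_workers(void)
{
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];
    int started = 0;
    int sum = 0;

    while (started < NUM_THREADS) {
        ids[started] = started;
        if (pthread_create(&threads[started], NULL, worker, &ids[started]) != 0)
            break;
        started++;
    }
    /* Join every thread that was actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        sum += results[i];
    }
    return started == NUM_THREADS ? sum : -1;
}
```

Calling run_workers() from main() is enough to exercise the build. On toolchains where the thread library is separate from libc, leaving the option off typically produces an undefined reference to pthread_create at link time.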

Mastering the batch system was a hit-and-miss affair, even after
downloading the full manual.

Only much later did I discover that the login node 'acano01' itself
has multiple CPUs/cores, so my tiny jobs could have run there
without going through the batch system.
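
A quick way to check how many cores a node offers before deciding where to run is to ask the C library. This is a small sketch, assuming a Linux or Mac OS X system where sysconf() supports _SC_NPROCESSORS_ONLN (a widely available extension, not part of the base POSIX specification). The function name online_cpu_count is made up for this example.

```c
#include <unistd.h>

/* Returns the number of processors currently online as reported by the
 * C library, or -1 if the value cannot be determined. */
long online_cpu_count(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

Printing the result from a short main() on the login node would show at a glance whether a multi-threaded test can run there.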


It remains a mystery to this day why my POSIX threads C program does
not show any speedup, no matter how many threads are created to
share a common workload. Clay now claims that the C clock() function
is defective and misrepresents the actual running time. But why is
there no speedup on a 2-CPU Mac? The same defective clock() function?
I still wonder whether the cc/gcc compiler generates the proper code
to achieve proper thread-CPU association.

Success after all

It so happens that I also have a multi-threaded Java program that
uses the same design pattern and has the same functionality as the
POSIX C program. Running this program on acano01 confirmed excellent
speedups:

# N cpus/cores    N/1 time ratio
  2               0.53
  3               0.36
  4               0.295
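
To put the ratios in more familiar terms, the sketch below converts them to speedup and parallel efficiency. It assumes the "N/1 time ratio" column means the running time on N cores divided by the running time on 1 core, so that speedup = 1 / ratio and efficiency = speedup / N. The function names are made up for this example. Under that reading, the measured ratios correspond to speedups of roughly 1.89, 2.78 and 3.39.

```c
#include <stdio.h>

/* Speedup relative to one core, given (time on N cores) / (time on 1 core). */
double speedup(double time_ratio)
{
    return 1.0 / time_ratio;
}

/* Parallel efficiency: the fraction of ideal linear speedup achieved. */
double efficiency(double time_ratio, int cores)
{
    return speedup(time_ratio) / cores;
}

/* Prints speedup and efficiency for the three measurements reported above. */
void print_scaling_table(void)
{
    const int cores[] = { 2, 3, 4 };
    const double ratios[] = { 0.53, 0.36, 0.295 };

    printf("cores  ratio  speedup  efficiency\n");
    for (int i = 0; i < 3; i++)
        printf("%5d  %5.3f  %7.2f  %10.2f\n",
               cores[i], ratios[i],
               speedup(ratios[i]), efficiency(ratios[i], cores[i]));
}
```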

Next steps

- Testing the parallel version of SixSort, a LARGE POSIX C program that
uses the same design pattern

- This requires somehow finding a fix for the C clock() function or for
the thread-cpu association problem, if that is what is going on
