I have come to an interesting subject, so follow along with me.
I have tried to make a worst-case scalability prediction with an HDD hard disk
for my parallel archiver (you will find my parallel archiver here:
http://pages.videotron.com/aminer/) with Parallel LZMA, and I think it's worse than what I had thought.
There are four steps in my Parallel LZMA algorithm:
First, we have to copy a stream serially from the hard disk to memory, and this takes on average 0.9 second.
Second, in the compression method we have to copy a stream to memory, and this takes on average 0.01 second.
Third, in the compression method we have to compress a stream to another stream in memory, and this takes on average 3.1 seconds.
Fourth, in the compression method we have to copy the compressed stream to a hard disk file, and this takes on average 0.01 second.
So we have the serial part, that is: 0.9 second + 0.01 second + 0.01 second = 0.92 second
and the parallel part, that is: 3.1 seconds, for a total of 4.02 seconds.
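So the worst-case scalability scenario using an HDD and using the Amdahl equation will
give us: 1 / (0.22 + 0.77/N) (N is the number of cores, 0.22 is roughly the serial fraction 0.92/4.02 and 0.77 is the parallel fraction 3.1/4.02)
So this will scale up to at most 1/0.22 = 4.54X as N grows (about 4.37X with the unrounded serial fraction 0.92/4.02), so as you have noticed, with an HDD hard disk this is not good.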
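To make the Amdahl calculation above easy to check, here is a minimal sketch in Python. The four timings are the ones given in this post; the function and constant names are my own and are only for illustration.

```python
# Timings in seconds, taken from the four steps described in this post.
DISK_READ = 0.9     # copy a stream from the hard disk to memory (serial)
MEM_COPY = 0.01     # copy a stream to memory (serial)
COMPRESS = 3.1      # compress a stream to another stream in memory (parallel)
DISK_WRITE = 0.01   # copy the compressed stream to a hard disk file (serial)


def amdahl_speedup(serial_time, parallel_time, cores):
    """Speedup over one core when only parallel_time is divided by cores."""
    total = serial_time + parallel_time
    return total / (serial_time + parallel_time / cores)


def amdahl_limit(serial_time, parallel_time):
    """Upper bound on the speedup as the number of cores grows without limit."""
    return (serial_time + parallel_time) / serial_time


serial = DISK_READ + MEM_COPY + DISK_WRITE  # 0.92 second in total

for cores in (1, 2, 4, 8, 16, 64):
    print(f"{cores:3d} cores: {amdahl_speedup(serial, COMPRESS, cores):.2f}X")
print(f"limit: {amdahl_limit(serial, COMPRESS):.2f}X")
```

With the unrounded fractions the limit comes out at about 4.37X, a little below 4.54X, because 0.92/4.02 is closer to 0.229 than to 0.22.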
So what can we do to scale the parallel archiver further using parallel LZMA?
You can, for example, use a RAID 10 with a base configuration of 4 HDD hard drives. Assuming this cuts in 4 the 0.9 second and the 0.01 second of disk time, it will give a scalability of 4.02 / (0.9/4 + 0.01/4 + 0.01) = 16.9X relative to the single-HDD time of 4.02 seconds, and this is better. But
to speed things up more we can use SSD drives that are 2X faster than HDD hard drives, in a RAID 10
configuration, and this will give roughly 35X worst-case scalability.
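The RAID 10 and SSD figures can be checked with a short sketch. It rests on one reading of the numbers that is an assumption on my part: the speedup is measured against the single-HDD, single-core total of 4.02 seconds, with an unlimited number of cores, and the storage setup divides the disk times by a factor (4 for RAID 10 with 4 HDDs, 8 for SSDs twice as fast). The names `limit_vs_baseline`, `disk_factor` and `scale_mem_copy` are hypothetical.

```python
# Timings in seconds, taken from this post.
DISK_READ, MEM_COPY, COMPRESS, DISK_WRITE = 0.9, 0.01, 3.1, 0.01
BASELINE_TOTAL = DISK_READ + MEM_COPY + COMPRESS + DISK_WRITE  # 4.02 seconds


def limit_vs_baseline(disk_factor, scale_mem_copy=False):
    """Best-case speedup over the single-HDD, single-core run.

    Assumes unlimited cores (the parallel part shrinks to zero) and that the
    storage setup divides the disk read and write times by disk_factor.
    scale_mem_copy=True also divides the 0.01 second memory copy.
    """
    mem_copy = MEM_COPY / disk_factor if scale_mem_copy else MEM_COPY
    serial = (DISK_READ + DISK_WRITE) / disk_factor + mem_copy
    return BASELINE_TOTAL / serial


print(f"single HDD:           {limit_vs_baseline(1):.1f}X")
print(f"RAID 10, 4 HDDs:      {limit_vs_baseline(4):.1f}X")
print(f"RAID 10, SSDs:        {limit_vs_baseline(8):.1f}X")
print(f"RAID 10, SSDs (all):  {limit_vs_baseline(8, scale_mem_copy=True):.1f}X")
```

Under this reading the 4-HDD RAID 10 case gives about 16.9X. The SSD case gives about 32.5X if the memory copy stays at 0.01 second, and about 35X only if all three serial steps are divided by 8.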
So as you have noticed, if you are using only an HDD
with a multicore system you will not get more than 4.54X with my parallel archiver using parallel LZMA, so you have to use a RAID 10 with SSD drives to scale it up to 35X.
And this is why I have talked about RAID 10 etc.
Amine Moulay Ramdane.