Haskell: forkOS vs forkIO

Haskell: forkOS vs forkIO

Hi,
First of all I want to boast I've successfully compiled and used cisco vpnclient version for Linux for the first time since few years ;)I know I'm not haskell guru or something but I wrote simple Monte-Carlo pi computing multi-threaded implementation. I wrote one version using forkIO function and the second one using forkOS. I did simple benchmark too:
number of points | number of threads | pi_forkIO | pi_forkOS-----------------------+--------------------------+-------------+---------------1000000 | 2 | 8.576s | 8.707s 1000000 | 4 | 6.430s | 8.711s 1000000 | 8 | 4.789s | 8.691s 1000000 | 16 | 3.964s | 8.661s 1000000 | 32 | 3.753s | 8.699s 1000000 | 64 | 3.626s | 8.731s 1000000 | 512 | 3.541s | 8.774s
Why there is such a big difference between using these two functions?Below one of my implementations (the second one is using forkIO instead of forkOS)

module Main where

import Control.Monad
import Control.Concurrent
import System.Random
import System.IO.Unsafe

------------------------------------------------------------------------------
-- The computing part of program
------------------------------------------------------------------------------
rand :: IO Int
rand = getStdRandom ( randomR (0, maxBound) )

shooting :: Int -> Int -> Int -> IO Int
shooting n throws hits  
    | throws == n  = return hits
    | otherwise    = do
                     throw <- compute hits
                     shooting n (throws + 1) $! throw

compute :: Int -> IO Int
compute hits = do
    x <- rand
    y <- rand
    return $ (inCircle (toDouble x) (toDouble y) ) + hits
  where 
    toDouble a = (fromIntegral a) / (fromIntegral (maxBound :: Int) ) * 2.0 - 1.0

inCircle :: Double -> Double -> Int
inCircle x y = 
    if circle <= 1.0
    then 1
    else 0
  where
    circle = x*x + y*y


------------------------------------------------------------------------------
-- The concurrent part 
------------------------------------------------------------------------------
threadsNumber :: Int
threadsNumber = 512

threads :: MVar [MVar Int]
threads = unsafePerformIO (newMVar [])

waitForThreads :: Int -> IO Int
waitForThreads nk = do
    children <- takeMVar threads
    case children of 
      []     -> return nk
      c:ch   -> do
                putMVar threads ch
                k <- takeMVar c
                waitForThreads $ nk + k

doFork :: Int -> Int -> IO ThreadId
doFork nthreads num 
  | nthreads == 1   = fork num
  | otherwise       = do 
                      fork num
                      doFork (nthreads - 1) num 
  where
    fork num = do 
        child    <- newEmptyMVar
        children <- takeMVar threads
        putMVar threads (child:children)
        forkOS $ ( (shooting num 0 0) >>= (putMVar child) )

main = do
    putStrLn "Give me a number of points: "
    num <- getLine
    let tnum = floor $ (fromIntegral $ read num) / (fromIntegral (threadsNumber + 1) )
    doFork threadsNumber tnum 
    nk2 <- shooting ( (read num) - threadsNumber * tnum) 0 0
    nk  <- waitForThreads 0
    let nks = nk + nk2
    putStrLn $ "Number of hits: " ++ (show $ nks)
    putStrLn $ "pi: " ++ (show $ 4.0 * (fromIntegral nks) / (fromIntegral $ read num) )

Best regards,Karol Samborski

2 posts / 0 nouveau(x)
Dernière contribution
Reportez-vous à notre Notice d'optimisation pour plus d'informations sur les choix et l'optimisation des performances dans les produits logiciels Intel.
Hi. 

I cannot answer about the difference between forkOS and forkIO
but I have rewritten your program to be shorter and clearer.

And faster - in particular, I think that in your version, randomRIO might be a bottleneck,
since this is a global state update (of the system random generator), which cannot be good thing. 

(Either it forces synchronization of threads, thus it is costly, or it is not thread-safe. 
Anyway that's exactly what the various split-functions in System.Random are meant for.)
Note that you don't need any IO inside one thread, but you need to make sure
that the value you're writing is fully evaluated (hence, the seq before writeChan).

(PS: I don't see how to include verbatim source text with this forum editor.
It seems to ignore my lambdas (after the two dollar signs).
Is there a way to switch off all wysiwyg and HTML and whatnots,
and have plain ASCII with fixed width fonts?)

-- | compile: ghc --make pi -threaded -O2 -rtsopts -- | run: time ./pi 1000000 1000 +RTS -N8 import Control.Monad ( forM ) import Control.Concurrent ( forkIO, newChan, writeChan, readChan ) import System.Random ( newStdGen, randomRs ) import System.Environment ( getArgs ) main = do [ points, threads ] <- getArgs run ( read points :: Int ) ( read threads :: Int ) run points threads = do putStrLn $ unwords [ "points", show points, "threads:", show threads ] let radius = 1 :: Double collector <- newChan forM [ 1 .. threads ] $ \ t -> forkIO $ do genx <- newStdGen geny <- newStdGen let hits = length $ filter ( \ (x,y) -> x ^ 2 + y ^ 2 < radius ^ 2 ) $ take ( div points threads ) $ zip ( randomRs ( 0, radius ) genx :: [ Double ] ) ( randomRs ( 0, radius ) geny :: [ Double ] ) seq hits $ writeChan collector hits counts <- forM [ 1 .. threads ] $ \ t -> readChan collector let approx_pi = 4 * fromIntegral ( sum counts ) / fromIntegral points putStrLn $ unwords [ "approx. pi:" , show approx_pi ]

Laisser un commentaire

Veuillez ouvrir une session pour ajouter un commentaire. Pas encore membre ? Rejoignez-nous dès aujourd’hui