  • balahindustani@gmail.com, January 2, 2006 8:25 AM PST
    Intel optimised Linpack scaling comparison

    The Intel optimised Linpack (n=1000), run on a 16-way server, has given the following results, which are causing some concern.

    With:
    1 thread and 1 CPU : 5.8731 Gflops
    4 threads and 4 CPUs : 15.101 Gflops
    8 threads and 8 CPUs : 15.107 Gflops
    16 threads and 16 CPUs : 12.471 Gflops

    I am not able to trace what the problem could be.
    Please help.

    Bala

    Clay Breshears (Intel), January 3, 2006 5:34 PM PST
    Re: Intel optimised Linpack scaling comparison

    Bala -

    At only 1000 rows and columns, the workload is likely too small to sustain good speedup for 16 threads (just over 60 rows per thread).  System and threading overheads are likely taking a larger fraction of the run time relative to the work being done, which reduces the Gflops rate.
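
    To put rough numbers on this, here is a minimal Python sketch that takes the Gflops figures reported above and derives the speedup, the parallel efficiency, the rows available per thread, and an approximate solve time. It assumes the benchmark reports its rate using the conventional LINPACK operation count of 2/3 n^3 + 2 n^2; the names linpack_flops and scaling_table are made up for illustration and are not part of any Intel tool.

```python
# Gflops figures as reported in the original post (n = 1000).
reported_gflops = {1: 5.8731, 4: 15.101, 8: 15.107, 16: 12.471}
n = 1000


def linpack_flops(n):
    """Conventional LINPACK operation count for an n x n solve.

    Assumption: the benchmark uses 2/3 n^3 + 2 n^2 when reporting Gflops.
    """
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def scaling_table(gflops, n):
    """Return one tuple per thread count:
    (threads, rows_per_thread, speedup, efficiency, solve_ms)."""
    base = gflops[1]  # single-thread rate is the reference point
    table = []
    for threads in sorted(gflops):
        rate = gflops[threads]
        speedup = rate / base              # relative to 1 thread
        efficiency = speedup / threads     # 1.0 would be ideal scaling
        rows_per_thread = n / threads      # share of the matrix per thread
        solve_ms = linpack_flops(n) / (rate * 1e9) * 1e3
        table.append((threads, rows_per_thread, speedup, efficiency, solve_ms))
    return table


if __name__ == "__main__":
    print("threads  rows/thread  speedup  efficiency  solve (ms)")
    for threads, rows, speedup, eff, ms in scaling_table(reported_gflops, n):
        print(f"{threads:7d}  {rows:11.1f}  {speedup:7.2f}  {eff:10.2f}  {ms:10.1f}")
```

    Under that assumption the whole n=1000 solve is well under a second of work even on one thread, so fixed costs such as thread start-up and synchronisation have very little computation to be amortised over.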

    What kind of speeds do you get for larger values of n (e.g., 5000, 10000, 20000)?

    --clay


