At the HPC site where I worked before joining Intel, I used a program called pmake to build applications in parallel. Pmake worked on distributed-memory parallel systems. One of the proprietary UNIX systems also had a regular make program with a -P option for parallel builds.
The Intel Itanium Architecture Software Developer's Manual, Volume 1: Application Architecture, Rev. 2.1, dated October 2002, states in section 5.1 (Data Types and Formats) of the Floating-point Programming Model chapter:
A seventh data type, IEEE-style quad precision, is supported by software routines. A future architecture extension may include additional support for the quad-precision real type.
I was wondering whether anyone has experience with the new Intel Cluster Math Kernel Library and how it compares with the regular MKL in a clustered environment.
I guess there are pros and cons of each?
Any practical feedback or experience would be appreciated.
Can anyone tell me which Intel products support NUMA?
As I understand it, NUMA means memory that is local to its own CPU (but can also be accessed by other CPUs), or shared memory that can be partitioned.
I am new to HPC. If I want to learn more about it, what kinds of books should I read, and what background knowledge should I have?
And what is the relationship between HPC and parallel computing?
We've had some problems in understanding performance counter results
on P4/Xeon (collected with VTune 2.0 on Linux). Maybe this is just a
series of misunderstandings with the documentation, but anyway:
1) In the IA32 Architecture Optimization document it is said that P4's
hardware counter "2nd Level Cache Read Misses" has bugs that can cause
miscounting by a factor of two. Since the measurements for the same code with
the same data size deliver reproducible counting results with VTune, this
As far as I know, a single IPF2 CPU can reach 92% efficiency, so I am amazed that Thunder achieved 86% on 4096 CPUs!
Does QsNet really give such good performance? I have used QsNet1, and its performance was not that good.
Another factor is that Tiger4 is not as good as Tiger2 in terms of shared memory.
Which expert could answer these questions?
Thank you very much.
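For readers comparing the efficiency percentages in the post above: the post does not say how they were computed, but such figures are commonly the ratio of achieved floating-point rate to theoretical peak. Here is a minimal sketch under that assumption. The function name, the clock rate, the flops-per-cycle value, and the achieved rates are all illustrative placeholders, not measured data.

```python
def parallel_efficiency(achieved_gflops: float, cpus: int,
                        clock_ghz: float, flops_per_cycle: int) -> float:
    """Return achieved performance as a fraction of theoretical peak.

    Theoretical peak (GFLOPS) is assumed to be
    cpus * clock_ghz * flops_per_cycle.
    """
    peak_gflops = cpus * clock_ghz * flops_per_cycle
    return achieved_gflops / peak_gflops


# Illustrative inputs only (assumed, not taken from any benchmark report).
single_cpu = parallel_efficiency(achieved_gflops=5.152, cpus=1,
                                 clock_ghz=1.4, flops_per_cycle=4)
cluster = parallel_efficiency(achieved_gflops=19726.336, cpus=4096,
                              clock_ghz=1.4, flops_per_cycle=4)

print(f"single CPU: {single_cpu:.0%}, cluster: {cluster:.0%}")
```

Under this definition, a drop from the single-CPU figure to the cluster figure reflects everything that scaling adds, such as interconnect and shared-memory overhead, which is what the questions above are asking about.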
We are interested in knowing your HPC usage model - is it a production cluster or a development cluster? (That is, are you doing research and able to afford a little downtime, or is this a production environment that cannot afford any? If you can afford some, how much?)