I_MPI_FABRICS: ofa or dapl?

Guillaume De Nayer

Hi,

Is ofa faster than dapl, is dapl faster than ofa, or does it depend on the hardware?

Best regards,
Guillaume

James Tullos (Intel)

Hi Guillaume,

To my knowledge, there is no general rule of which is better. If you need particular support from one or the other, then use that one. Otherwise, whichever one is better for your system and application is the one you should use.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
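
One way to find out which fabric is better for a particular system is to launch the same benchmark once per setting. The sketch below only builds the two `mpirun` command lines. It assumes the `shm:ofa` and `shm:dapl` values of `I_MPI_FABRICS`, the `-genv` option, and `I_MPI_FALLBACK` as documented for Intel MPI 4.x; the benchmark path `./IMB-MPI1`, the rank count, and the helper name `fabric_command` are hypothetical placeholders.

```python
def fabric_command(fabric, ranks, benchmark):
    """Build an mpirun command line that pins Intel MPI to one fabric.

    Assumption: I_MPI_FABRICS accepts "<intra-node>:<inter-node>" pairs
    such as "shm:ofa" or "shm:dapl" (Intel MPI 4.x syntax).
    """
    return [
        "mpirun",
        "-genv", "I_MPI_FABRICS", fabric,
        # Assumption: disabling fallback makes the run fail instead of
        # silently switching to a different fabric.
        "-genv", "I_MPI_FALLBACK", "disable",
        "-n", str(ranks),
    ] + list(benchmark)


if __name__ == "__main__":
    # Hypothetical benchmark invocation; adjust to the local install.
    for fabric in ("shm:ofa", "shm:dapl"):
        print(" ".join(fabric_command(fabric, 16, ["./IMB-MPI1", "PingPong"])))
```

Running the printed commands on the same nodes, back to back, gives a like-for-like comparison for that system and application.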

drMikeT

Hello James,

I have some recollection of seeing, either in the Intel forums or in one of the articles published on the Intel developer site, a statement mentioning that OFED verbs is the direction for future Intel MPI, as opposed to DAPL, which was "emphasized" earlier. Are OFED verbs going to be the main thrust for transport in Intel MPI?

The reason I am asking is that having to benchmark both dapl and verbs is double the effort, and it would be nice to focus benchmarking on the "better" transport.

thanks
-Michael

R/D High-Performance Computing and Engineering
James Tullos (Intel)

Hi Michael,

If you can find where you saw that statement, it would be helpful. I'll ask around internally to see if there is anything official regarding a future focus on OFED, but there is nothing at this time.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Gergana Slavova (Intel)

Hey Michael,

Unfortunately, that answer is not as straightforward as we'd like to think. Certainly, we're investing time and effort in directly supporting OFED verbs via the ofa fabric because we feel it's worth it; it gives us the ability to optimize directly for the OFED software stack and has some nice fringe bandwidth benefits via the multi-rail support.

On the flip-side, if you don't have OFED installed, your other option (when taking advantage of DAPL-enabled fabrics) is dapl. That in itself gives us good scalability via DAPL UD, as well as support for things like iWarp*, XPMEM*, etc.

The one thing we can recommend is: if you do have OFED installed, take advantage of it through ofa. As James says, we'll see if there are any other opinions internally and let you know.

Regards,
~Gergana

Gergana Slavova
Technical Consulting Engineer
Intel® Cluster Tools
E-mail: gergana.s.slavova_at_intel.com
drMikeT

Hey Gergana,

(long time no see ... :)

So if I use OFED verbs I would be getting at least as good performance as using the DAPL transport?

take care
_michael_

R/D High-Performance Computing and Engineering
drMikeT

Hi James,

I will try to look around to locate the document....

In your opinion, is there any tangible performance difference between OFED verbs and DAPL? Is daplv2 a 'better' transport than daplv1?

take care
Michael

R/D High-Performance Computing and Engineering
Gergana Slavova (Intel)

Hey Michael :)

Actually, if you're using OFED verbs with ofa, we'd like to see better performance but, as a baseline, it should be at least as good as dapl. And, certainly, if you find that's not the case, we'd like to know that as well.

Regards,
~Gergana

Gergana Slavova
Technical Consulting Engineer
Intel® Cluster Tools
E-mail: gergana.s.slavova_at_intel.com
Gergana Slavova (Intel)
Quoting drMikeT: In your opinion, is there any tangible performance difference between OFED verbs and DAPL? Is daplv2 a 'better' transport than daplv1?

That might be a better question for one of the guys on the OFED team. I'll ping him to ask if he has any advice. In general, though, DAPL 2.x certainly has more functionality than DAPL 1.x (for example, the multi-rail support); it's certainly better optimized in terms of scalability, performance, etc.

Regards,
~Gergana

Gergana Slavova
Technical Consulting Engineer
Intel® Cluster Tools
E-mail: gergana.s.slavova_at_intel.com
drMikeT

OK, I will get more comprehensive benchmarking results using the IMB 3.2.3 suite to compare DAPL2 vs OFED verbs. Incidentally, we are still at Intel MPI 4.0.0.28 (we will upgrade if we can extend the license). Do you think we may get performance improvements if we move to a more recent release?

BTW, I have been benchmarking 3 MPI stacks and various versions of each one of them (Intel, OpenMPI and MVAPICH2), and Intel so far is holding its ground well.

regards, -michael

R/D High-Performance Computing and Engineering
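
A comparison like the one Michael describes can be scripted by running the same command once per fabric. The following is a rough sketch, assuming the launcher forwards the caller's environment (including `I_MPI_FABRICS`) to all ranks; if it does not, pass the variable with `-genv` instead. The function names and the example command are hypothetical. The wall-clock time it records includes job startup, so the benchmark's own output remains the real measurement.

```python
import os
import subprocess
import time


def run_with_fabric(command, fabric):
    """Run `command` with I_MPI_FABRICS set; return (seconds, stdout)."""
    env = dict(os.environ, I_MPI_FABRICS=fabric)
    start = time.perf_counter()
    done = subprocess.run(command, env=env, capture_output=True,
                          text=True, check=True)
    return time.perf_counter() - start, done.stdout


def compare_fabrics(command, fabrics=("shm:dapl", "shm:ofa")):
    """Run the same command once per fabric and collect wall-clock times."""
    return {fabric: run_with_fabric(command, fabric)[0] for fabric in fabrics}
```

A hypothetical call would be `compare_fabrics(["mpirun", "-n", "16", "./IMB-MPI1", "Sendrecv"])`, which returns a dictionary mapping each fabric string to elapsed seconds.
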
Gergana Slavova (Intel)

Excellent! We actually have a new process manager in our latest 4.0 Update 3 release (Hydra PM vs. the old MPDs), so I'd definitely upgrade. It'll be interesting to see the difference in performance from your side (make sure to use mpirun instead of mpiexec).

Also, if you need help with licensing, go ahead and send me an e-mail offline. You can also grab an evaluation copy from www.intel.com/go/mpi.

Quoting drMikeT: BTW, I have been benchmarking 3 MPI stacks and various versions of each one of them (Intel, OpenMPI and MVAPICH2), and Intel so far is holding its ground well.

That's pretty nice to hear. You might have just made my Thursday :)

Regards,
~Gergana

Gergana Slavova
Technical Consulting Engineer
Intel® Cluster Tools
E-mail: gergana.s.slavova_at_intel.com
Gergana Slavova (Intel)
Best Reply

Hey Michael,

Just to close the loop on your original question, this is directly from the main OFED developer regarding the performance of the different fabrics:

ofa or daplv2 is fine; performance is pretty much a wash. daplv2 gives you RC plus UD and hardware-based collectives from Mellanox. ofa supports only reliable connections (RC) and doesn't support UD or offloaded collectives. However, ofa supports multi-rail and daplv2 doesn't.

Basically, if you don't care about all the extra bells and whistles, you're ok with either. As I mentioned before, if you have OFED installed already, we recommend using the ofa option for Intel MPI Library.

As another aside, I know a lot of customers have moved off of daplv1 so I'm not sure how long that'll be around. The Open Fabrics guys will announce any changes on their website.

I hope this helps.

Regards,
~Gergana

Gergana Slavova
Technical Consulting Engineer
Intel® Cluster Tools
E-mail: gergana.s.slavova_at_intel.com
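
The developer's summary above amounts to a small feature table, which can be written down as a lookup. This is only an illustration: the table transcribes nothing beyond the quoted statement (with "dapl" standing for daplv2), and the feature keys and function names are made up for the example.

```python
# Feature table transcribed from the developer's summary in this thread.
# "dapl" here means daplv2; the key names are hypothetical labels.
FEATURES = {
    "ofa":  {"rc": True, "ud": False,
             "offloaded_collectives": False, "multi_rail": True},
    "dapl": {"rc": True, "ud": True,
             "offloaded_collectives": True, "multi_rail": False},
}


def candidate_fabrics(required):
    """Return the fabrics that provide every feature in `required`."""
    unknown = set(required) - set(FEATURES["ofa"])
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    return [name for name, feats in FEATURES.items()
            if all(feats[r] for r in required)]


def pick_fabric(required=()):
    """Pick one fabric, preferring ofa when both qualify.

    The ofa preference follows the recommendation in this thread for
    systems that already have OFED installed.
    """
    names = candidate_fabrics(required)
    if not names:
        return None
    return "ofa" if "ofa" in names else names[0]
```

With this table, `pick_fabric(["ud"])` selects dapl, `pick_fabric(["multi_rail"])` selects ofa, and asking for both UD and multi-rail returns `None` because neither fabric is described as offering both.
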
drMikeT

Gergana, this is good, concise information. So the key point is: if you care about UD and/or offloaded collectives, try uDAPL; otherwise, stay with OFED verbs.

I will keep these as guidelines for the current Intel MPI.

thanks again
Michael

R/D High-Performance Computing and Engineering
Paul C.

Hi Gergana,

Apologies for resurrecting an old thread, but could you clarify something by putting this question to the main OFED developer you quoted in your last post in this thread?

In my understanding, if a system only has underlying InfiniBand connectivity (i.e. not iWarp or anything else which DAPL could use as an alternative), then DAPL exists solely as an abstraction layer on top of OFA/InfiniBand.

Further discussion here for example: http://thegeekinthecorner.wordpress.com/2010/08/14/on-dapl/

If this is the case, then how can DAPL offer more functionality than OFA does if DAPL relies solely on OFA as the only layer beneath it? Or was your OFED developer stating that when there are more options under DAPL in the stack (e.g. iWarp, anything else?) then DAPL can choose to use either option in order to provide more functionality to the application compared to using OFA directly?

Thanks!
