Significance/meaning of zero-byte MPI messages (APS message profiling data)

TCE Open Date: 

Tuesday, January 14, 2020 - 15:58

Hi,
I recently used the APS tool to capture message statistics (sizes and counts) for the WRF application on Intel 8280 nodes (OPA).
We launch one process per core, and here is the data:
nodes,Message_size(B),Volume(MB),Volume(%),Transfers,Time(sec),Time(%)
1,0,0,0,58099903,3988.14,97.05
2,0,0,0,219491539,7554.45,96.19
4,0,0,0,850730419,15073.44,96.02
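
As a rough aid to reading the table above, the following sketch derives the average time per 0-byte transfer and the growth in transfer count between runs. It uses only the figures posted here; the helper name `per_transfer_stats` is made up for illustration, and the post does not say whether Time(sec) is summed over all ranks, so the per-transfer value should be read as a ratio of the two reported columns and nothing more.

```python
import csv
import io

# The three rows exactly as posted in this thread.
APS_ROWS = """nodes,Message_size(B),Volume(MB),Volume(%),Transfers,Time(sec),Time(%)
1,0,0,0,58099903,3988.14,97.05
2,0,0,0,219491539,7554.45,96.19
4,0,0,0,850730419,15073.44,96.02
"""


def per_transfer_stats(text):
    """Return time per transfer (microseconds) and transfer growth per run.

    Hypothetical helper: it only divides the reported columns and makes no
    claim about how APS aggregates time across ranks.
    """
    stats = []
    previous_transfers = None
    for row in csv.DictReader(io.StringIO(text)):
        transfers = int(row["Transfers"])
        seconds = float(row["Time(sec)"])
        growth = transfers / previous_transfers if previous_transfers else None
        stats.append({
            "nodes": int(row["nodes"]),
            "us_per_transfer": seconds / transfers * 1e6,
            "growth_vs_previous": growth,
        })
        previous_transfers = transfers
    return stats


stats = per_transfer_stats(APS_ROWS)
for s in stats:
    growth = s["growth_vs_previous"]
    growth_text = "n/a" if growth is None else f"{growth:.2f}x"
    print(f'{s["nodes"]} node(s): {s["us_per_transfer"]:.1f} us per transfer, '
          f"transfers vs previous run: {growth_text}")
```

On these numbers the transfer count grows by a little under 4x each time the node count doubles, while the reported time per transfer falls.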

A significant amount of time appears to be spent transferring these 0-byte messages, and the number of messages grows as nodes are added. Could you please help me understand the following?

Q: What is the significance of these 0-byte messages, and how are they related to the MPI communication protocol?

I assume APS collects messages transferred between all processes (inter-node and intra-node), so:
Q: Is there a way to check from APS how many of these messages were sent over the network (inter-node messages, for the 2- and 4-node runs)?

Hi Puneet,

Thanks for reaching out to us. We are working on your issue and will get back to you soon.

-Prasanth


Thank you for the reply. 
The WRF version is 3.9.1.1 and the dataset was CONUS 2.5 km.


Messages to a target include a tag in addition to the data (if any), so status information can be passed via the tag rather than in the data blob. My guess is that this is the cause of the 0-byte messages. An example might be a sync or a watchdog tickle, but the user application can send such messages as well.

Jim Dempsey
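
The tag-only signalling described in the reply above can be illustrated with a toy model. This is not MPI code and not what WRF or the MPI library actually does: it is a plain-Python sketch in which a queue stands in for the transport, and the tag values `TAG_SYNC` and `TAG_DATA` and the helpers `send` and `drain` are invented for the example. The point is only that a message with an empty payload still carries information through its envelope (source and tag). In real MPI, the analogue would be a send with a count of zero, which the standard permits.

```python
from collections import namedtuple
from queue import Queue

# Hypothetical tag values, chosen only for this illustration.
TAG_DATA = 1
TAG_SYNC = 2

# Toy stand-in for a message: envelope fields (source, tag) plus a payload.
Message = namedtuple("Message", ["source", "tag", "payload"])


def send(mailbox, source, tag, payload=b""):
    """Post a message; the payload defaults to zero bytes."""
    mailbox.put(Message(source, tag, payload))


def drain(mailbox):
    """Receive everything queued and classify each message by its tag."""
    events = []
    while not mailbox.empty():
        msg = mailbox.get()
        kind = "sync" if msg.tag == TAG_SYNC else "data"
        # The payload length is recorded so zero-byte messages stand out.
        events.append((kind, msg.source, len(msg.payload)))
    return events


mailbox = Queue()
send(mailbox, source=0, tag=TAG_SYNC)                       # tag only, 0 bytes
send(mailbox, source=1, tag=TAG_DATA, payload=b"\x00" * 8)  # 8-byte payload
events = drain(mailbox)
print(events)
```

A profiler that bins messages by payload size would count the first message in its 0-byte bucket even though the receiver acted on it.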