Intel® Trace Collector intercepts all calls related to buffered sends and simulates the worst-case scenario that the application has to be prepared for according to the MPI standard. By default (GLOBAL:DEADLOCK:POTENTIAL enabled) it also ensures that the sends do not complete before there is a matching receive.
By doing both, it detects several different error scenarios. All of them can lead to insufficient-buffer errors that may or may not show up in a given run, depending on timing and/or MPI implementation aspects:
Buffer Size: The most obvious error is that the application did not reserve enough buffer to store the message(s), perhaps because it did not actually calculate the size with MPI_Pack_size() or forgot to add the MPI_BSEND_OVERHEAD. This might not show up if the MPI implementation bypasses the buffer, for example, for large messages. See the local_buffered_send_size example at the online samples resource.
Race Condition: Memory becomes available again only when the oldest messages are transmitted. It is the responsibility of the application to ensure that this happens in time before the buffer is required again; without suitable synchronization, an application might run only because it is lucky and the recipients enter their receives early enough. See the local_buffered_send_policy examples at the online samples resource.
Deadlock: MPI_Buffer_detach() will block until all messages inside the buffer have been sent. This can lead to the same (potential) deadlocks as normal sends. See the local_buffered_send_deadlock example at the online samples resource.
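The buffer-size and race-condition scenarios above come down to simple accounting, which the following C sketch tries to make concrete. It is only an illustrative model and not how Intel® Trace Collector or any MPI implementation is written: the names bsend_buffer_bytes, bsend_model, model_bsend, and model_deliver_oldest are hypothetical, the overhead is passed in as a plain integer standing in for MPI_BSEND_OVERHEAD, and fragmentation inside the buffer is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16  /* arbitrary limit chosen for this model */

/* Bytes to reserve for n outstanding buffered sends. packed_sizes[i] is the
 * value MPI_Pack_size() reported for message i; overhead stands in for
 * MPI_BSEND_OVERHEAD, which every message pays, even an empty one. */
size_t bsend_buffer_bytes(const int *packed_sizes, size_t n, int overhead)
{
    size_t total = 0;
    assert(overhead >= 0);
    for (size_t i = 0; i < n; ++i) {
        assert(packed_sizes[i] >= 0);
        total += (size_t)packed_sizes[i] + (size_t)overhead;
    }
    return total;
}

/* Hypothetical worst-case model of an attached buffer (not an MPI type). */
struct bsend_model {
    size_t capacity;             /* size given to MPI_Buffer_attach() */
    size_t used;                 /* bytes held by undelivered messages */
    size_t pending[MAX_PENDING]; /* undelivered message sizes, oldest first */
    size_t count;                /* number of undelivered messages */
};

/* Queue a message of `bytes` (payload plus overhead). Returns 1 on success
 * and 0 where a worst-case run would report an insufficient buffer. */
int model_bsend(struct bsend_model *m, size_t bytes)
{
    if (m->count == MAX_PENDING || bytes > m->capacity - m->used)
        return 0;
    m->pending[m->count++] = bytes;
    m->used += bytes;
    return 1;
}

/* A matching receive consumed the oldest message: release its space. */
void model_deliver_oldest(struct bsend_model *m)
{
    assert(m->count > 0);
    m->used -= m->pending[0];
    for (size_t i = 1; i < m->count; ++i)
        m->pending[i - 1] = m->pending[i];
    m->count--;
}
```

In a real program, each packed size would come from MPI_Pack_size() and the overhead from MPI_BSEND_OVERHEAD. In the model, a second send succeeds only after model_deliver_oldest() has run, which mirrors the synchronization the application has to guarantee rather than hope for.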
Since it is critical to understand how the buffer is currently being used when a new buffered send does not find enough free space to proceed, the LOCAL:BUFFER:INSUFFICIENT_BUFFER error message contains all information about free space, active and completed messages, and the corresponding memory ranges. Memory ranges are given using the [<start address>, <end address>] notation, where the <end address> is not part of the memory range. For convenience, the number of bytes in each range is also printed. For messages this includes the MPI_BSEND_OVERHEAD, so even empty messages have a non-zero size.
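Because the end address is excluded from a range, the printed byte count should equal the end address minus the first address of the range. The C sketch below illustrates that reading of the notation; struct mem_range, range_bytes, and range_contains are hypothetical names introduced here, not part of any Intel® Trace Collector interface, and the sketch assumes each range is described by its first address together with the exclusive end address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical representation of one reported range; `end` is exclusive. */
struct mem_range {
    uintptr_t start; /* first byte that belongs to the range */
    uintptr_t end;   /* first byte that no longer belongs to the range */
};

/* Number of bytes in the range: no "+ 1", because `end` is excluded. */
size_t range_bytes(struct mem_range r)
{
    assert(r.end >= r.start);
    return (size_t)(r.end - r.start);
}

/* True if addr lies inside the range; the end address itself does not. */
int range_contains(struct mem_range r, uintptr_t addr)
{
    return addr >= r.start && addr < r.end;
}
```

With this convention, two adjacent ranges share a boundary value without overlapping: the end address of one range can be the first address of the next.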