• 2019 Update 3
  • 03/07/2019

(LOCAL:BUFFER:INSUFFICIENT_BUFFER)
Intel® Trace Collector intercepts all calls related to buffered sends and simulates the worst-case scenario that the application has to be prepared for according to the MPI standard. By default (with GLOBAL:DEADLOCK:POTENTIAL enabled), it also ensures that the sends do not complete before there is a matching receive.
By doing both, it detects several different error scenarios, all of which can lead to insufficient-buffer errors. In an unchecked run these errors may stay hidden, depending on timing and on details of the MPI implementation:
Buffer Size:
The most obvious error is that the application did not reserve enough buffer space to store the message(s), perhaps because it did not actually calculate the size with MPI_Pack_size() or forgot to add MPI_BSEND_OVERHEAD. This might not show up if the MPI implementation bypasses the buffer, for example, for large messages. See the local_buffered_send_size example at the online samples resource.
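The sizing rule can be sketched in plain C. This is a minimal illustration, not code from Intel® Trace Collector or from the samples: the helper name bsend_buffer_bytes is hypothetical, each entry of packed_sizes is assumed to be a value returned by MPI_Pack_size() for one pending message, and per_message_overhead is assumed to be MPI_BSEND_OVERHEAD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes to reserve for MPI_Buffer_attach()
 * so that all listed messages can be pending at the same time.
 * packed_sizes[i]      - assumed to come from MPI_Pack_size() for message i
 * per_message_overhead - assumed to be MPI_BSEND_OVERHEAD */
size_t bsend_buffer_bytes(const int *packed_sizes, size_t count,
                          int per_message_overhead)
{
    size_t total = 0;

    assert(per_message_overhead >= 0);
    for (size_t i = 0; i < count; ++i) {
        assert(packed_sizes[i] >= 0);
        /* The overhead is charged once per message, not once per buffer. */
        total += (size_t)packed_sizes[i] + (size_t)per_message_overhead;
    }
    return total;
}
```

Adding the overhead per message rather than once for the whole buffer is the detail that is easiest to miss: under this rule, even a message with no payload consumes buffer space.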
Race Condition:
Memory becomes available again only when the oldest messages are transmitted. It is the responsibility of the application to ensure that this happens in time before the buffer is required again; without suitable synchronization, an application might run only because it is lucky and the recipients enter their receives early enough. See the local_buffered_send_race and local_buffered_send_policy examples at the online samples resource.
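The following plain C model illustrates why timing matters. It is a simplified accounting sketch under stated assumptions, not the algorithm used by Intel® Trace Collector or by any MPI implementation: the type bsend_model and the functions model_init, model_bsend, and model_complete_oldest are hypothetical, message sizes are assumed to already include MPI_BSEND_OVERHEAD, and space is assumed to be released strictly oldest-first.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16  /* arbitrary limit chosen for this sketch */

/* Hypothetical model of an attached buffer with oldest-first release. */
typedef struct {
    size_t capacity;            /* bytes passed to MPI_Buffer_attach() */
    size_t used;                /* bytes held by pending messages */
    size_t sizes[MAX_PENDING];  /* FIFO of pending message sizes */
    size_t head;                /* index of the oldest pending message */
    size_t count;               /* number of pending messages */
} bsend_model;

void model_init(bsend_model *m, size_t capacity)
{
    m->capacity = capacity;
    m->used = 0;
    m->head = 0;
    m->count = 0;
}

/* Returns 1 if the message fits, 0 if the buffer is insufficient. */
int model_bsend(bsend_model *m, size_t bytes)
{
    if (m->count == MAX_PENDING || bytes > m->capacity - m->used)
        return 0;
    m->sizes[(m->head + m->count) % MAX_PENDING] = bytes;
    m->count++;
    m->used += bytes;
    return 1;
}

/* Models the oldest message being transmitted, which frees its space. */
void model_complete_oldest(bsend_model *m)
{
    if (m->count == 0)
        return;
    m->used -= m->sizes[m->head];
    m->head = (m->head + 1) % MAX_PENDING;
    m->count--;
}
```

In this model, a second send of the same size into a buffer sized for one message succeeds only if model_complete_oldest() ran first. In a real application, that ordering has to be enforced by synchronization rather than left to chance.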
Deadlock:
MPI_Buffer_detach() blocks until all messages inside the buffer have been sent. This can lead to the same (potential) deadlocks as normal sends. See the local_buffered_send_deadlock example at the online samples resource.
Since it is critical to understand how the buffer is currently being used when a new buffered send does not find enough free space to proceed, the LOCAL:BUFFER:INSUFFICIENT_BUFFER error message contains all information about free space, active and completed messages, and the corresponding memory ranges. Memory ranges are given using the [<start address>, <end address>] notation, where <end address> is not part of the memory range. For convenience, the number of bytes in each range is also printed. For messages this includes MPI_BSEND_OVERHEAD, so even empty messages have a non-zero size.
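The range notation can be read as a half-open interval. The sketch below shows the arithmetic in plain C; the type mem_range and the functions range_bytes and range_contains are hypothetical names introduced for illustration and are not part of Intel® Trace Collector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical representation of one reported memory range. */
typedef struct {
    uintptr_t start;  /* first address inside the range */
    uintptr_t end;    /* first address past the range (not included) */
} mem_range;

/* Number of bytes in the range; no +1 because end is excluded. */
size_t range_bytes(mem_range r)
{
    assert(r.end >= r.start);
    return (size_t)(r.end - r.start);
}

/* Returns 1 if addr lies inside the range, 0 otherwise. */
int range_contains(mem_range r, uintptr_t addr)
{
    return addr >= r.start && addr < r.end;
}
```

Because the end address is excluded, two adjacent ranges share a boundary value without overlapping: the end address of one range is the start address of the next.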
