Record/replay overhead depends on the number of memory accesses in the test program and on how much memory its threads share.
Source: CGO 2014 paper on DrDebug
Source: Measured with PinPlay kit 2.0. (We are continuously looking to improve these numbers.)
|Benchmark/Input||How recorded/replayed (pin -t pinplay-driver.so ...)||Record slowdown||Replay slowdown|
|SPEC2006/'ref'||-log:mt 0 / -replay:addr_trans 0||98x||11x|
|PARSEC/'native' >=4T||-log:mt 1 / -replay:addr_trans 0||197x||37x|
The design goals of PinPlay were:
As a result, PinPlay works on multiple operating systems 'out of the box' and provides the guarantee that a bug once captured will not escape. However, that comes with a high overhead, especially during recording.
There are two major sources of slow-down in PinPlay (we are continuously looking to improve these):
The first is shadow-memory tracking. During recording, PinPlay maintains a shadow memory, and every real memory write observed in the program is replicated to it. On each memory read, the 'real' memory value is compared with the 'shadow' value; a mismatch or a missing shadow value causes an injection record to be emitted to the *.sel file. At replay time, all memory reads are monitored and the recorded memory values are injected where present. The details are described in our SIGMETRICS 2006 paper "Automatic Logging of Operating System Effects to Guide Application-Level Architecture Simulation".
The overhead of this technique is proportional to the number of memory accesses in the program.
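The shadow-memory scheme can be illustrated with a toy model. This is only a sketch of the idea, not PinPlay's implementation: the class names, the dictionary-based memory, and the `(read_index, address, value)` injection format are all assumptions made for illustration, and the real *.sel format is not shown here.

```python
_MISSING = object()  # sentinel: address not yet present in the shadow memory


class ShadowRecorder:
    """Toy recorder (hypothetical, not PinPlay code)."""

    def __init__(self):
        self.shadow = {}      # address -> value the program is known to have seen
        self.injections = []  # stand-in for the *.sel file
        self.reads = 0        # index of the next read

    def on_write(self, addr, value):
        # Every write observed in the program is replicated to the shadow.
        self.shadow[addr] = value

    def on_read(self, addr, real_value):
        # A mismatch or a missing shadow value means memory changed outside
        # the program's observed writes, so log an injection.
        if self.shadow.get(addr, _MISSING) != real_value:
            self.injections.append((self.reads, addr, real_value))
            self.shadow[addr] = real_value
        self.reads += 1
        return real_value


class ShadowReplayer:
    """Toy replayer: injects logged values just before the matching read."""

    def __init__(self, injections):
        self.pending = {(i, a): v for i, a, v in injections}
        self.memory = {}
        self.reads = 0

    def on_write(self, addr, value):
        self.memory[addr] = value

    def on_read(self, addr):
        key = (self.reads, addr)
        if key in self.pending:
            self.memory[addr] = self.pending[key]
        self.reads += 1
        return self.memory[addr]


def record_demo():
    real = {0x20: 5}                # 0x20 holds a value the program never wrote
    rec = ShadowRecorder()
    real[0x10] = 1
    rec.on_write(0x10, 1)
    rec.on_read(0x10, real[0x10])   # matches the shadow: nothing logged
    real[0x10] = 7                  # unobserved write, e.g. by the OS
    rec.on_read(0x10, real[0x10])   # mismatch: injection logged
    rec.on_read(0x20, real[0x20])   # missing: injection logged
    return rec.injections


def replay_demo(injections):
    rep = ShadowReplayer(injections)
    rep.on_write(0x10, 1)
    return [rep.on_read(0x10), rep.on_read(0x10), rep.on_read(0x20)]
```

Both `on_write` and `on_read` run on every access, which is why the cost of this approach scales with the total number of memory accesses.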
The second is shared-memory dependence tracking. During recording, all memory accesses are monitored and a cache coherency protocol is simulated, including tracking the last reader and writer of each shared memory location. A subset of the detected read-after-write, write-after-read, and write-after-write dependences is recorded in the *.race file. During replay, all memory accesses are monitored, and a thread is delayed if it tries to access a shared memory location out of order.
The overhead of this technique is proportional to the number of shared memory accesses in the program.
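The dependence logging and delayed replay can be sketched as follows. This is a simplified illustration under stated assumptions, not PinPlay's algorithm: it does not simulate a cache coherency protocol, it logs every cross-thread dependence rather than a subset, and the names `RaceRecorder` and `replay_order` and the edge format are hypothetical stand-ins for the *.race contents.

```python
class RaceRecorder:
    """Toy recorder tracking the last writer and readers per address."""

    def __init__(self):
        self.last_writer = {}   # addr -> (tid, seq)
        self.last_readers = {}  # addr -> {tid: seq} since the last write
        self.seq = {}           # tid -> number of accesses made so far
        self.edges = []         # (kind, source access, dependent access)

    def access(self, tid, addr, is_write):
        seq = self.seq.get(tid, 0)
        self.seq[tid] = seq + 1
        me = (tid, seq)
        writer = self.last_writer.get(addr)
        if is_write:
            if writer is not None and writer[0] != tid:
                self.edges.append(("WAW", writer, me))
            for rtid, rseq in self.last_readers.get(addr, {}).items():
                if rtid != tid:
                    self.edges.append(("WAR", (rtid, rseq), me))
            self.last_writer[addr] = me
            self.last_readers[addr] = {}
        else:
            if writer is not None and writer[0] != tid:
                self.edges.append(("RAW", writer, me))
            self.last_readers.setdefault(addr, {})[tid] = seq


def replay_order(edges, access_counts):
    """Return an access order that respects every recorded edge."""
    waits = {}
    for _, src, dst in edges:
        waits.setdefault(dst, set()).add(src)
    done, order = set(), []
    next_seq = {tid: 0 for tid in access_counts}
    total = sum(access_counts.values())
    while len(order) < total:
        progressed = False
        for tid in sorted(access_counts):
            seq = next_seq[tid]
            if seq >= access_counts[tid]:
                continue
            me = (tid, seq)
            if waits.get(me, set()) <= done:
                done.add(me)
                order.append(me)
                next_seq[tid] += 1
                progressed = True
            # Otherwise the thread is delayed until its predecessors finish.
        if not progressed:
            raise RuntimeError("recorded dependences are cyclic")
    return order


def race_demo():
    rec = RaceRecorder()
    x = 0x40
    rec.access(1, x, True)    # thread 1 writes x
    rec.access(0, x, False)   # thread 0 reads x   (RAW on thread 1)
    rec.access(0, x, True)    # thread 0 writes x  (WAW on thread 1)
    rec.access(1, x, False)   # thread 1 reads x   (RAW on thread 0)
    return rec.edges
```

In this example the replay scheduler tries thread 0 first, finds that its read depends on thread 1's write, and delays it, so the replayed order matches the recorded one. Accesses to locations touched by a single thread never produce edges, which matches the observation that the cost is driven by shared accesses.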