For me, it fails consistently when compiled with debug symbols and no optimization, as recommended for Thread Checker. It seems the array constructor may be using a shared temporary.
If there is a race in the non-optimized implementation of the array constructor, then relying on optimization to keep the values in registers and so avoid the race may give results that depend on your compiler version.
With a current compiler and optimization enabled, Thread Checker no longer reports a race condition; instead, it complains about the closing of a synchronization object at the final deallocate. This doesn't make much sense to me, particularly as that deallocation would be performed implicitly at the end of the subroutine in any case.