VTune Amplifier XE causes system crash/restart

We have an SGI UV1000 with 8-core Xeon E7s and 64GB of RAM. The operating system is Redhat Linux 6.0 SE.

We created a simple project to generate a Nehalem general hardware profiling report on the /bin/ls executable (a pretty basic test).

When the project is run, our system freezes and reboots. This is not a hanging thread nor a specific process that is not responding - the entire system actually experiences a freeze. After reboot, we checked the results of the VTune run and found no results.

Is there some kernel configuration that we must modify in order to let the architecture-specific hardware profiling work?

Some background questions:

- What version of Amplifier XE are you using?
- How many total cores (including HT if on)?
- Are there any error messages in the /var/log/messages log starting with 'SEP3_1' or 'amplxe-runsa'?
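
One way to check the log for those two markers is sketched below. It is only a sketch: the function names are made up for illustration, and the markers are matched as substrings rather than line prefixes, on the assumption that syslog lines begin with a timestamp and host name.

```python
from typing import Iterable, List

# Markers quoted in the question above; matched as substrings because
# syslog lines normally begin with a timestamp and host name (assumption).
MARKERS = ("SEP3_1", "amplxe-runsa")


def find_driver_messages(lines: Iterable[str], markers=MARKERS) -> List[str]:
    """Return the log lines that mention any of the given markers."""
    return [line.rstrip("\n") for line in lines
            if any(marker in line for marker in markers)]


def scan_log(path: str = "/var/log/messages") -> List[str]:
    """Scan a log file; undecodable bytes are replaced rather than fatal."""
    with open(path, errors="replace") as log:
        return find_driver_messages(log)
```

Calling `scan_log()` as a user who can read /var/log/messages (usually root) returns the matching lines; an empty list means neither marker was found.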

(An FYI for the likely cause of the problem: The hardware sampling driver allocates memory and the size of the allocation is determined by the number of cores. It is these memory allocations that can fail on machines with extremely large core counts and cause the OS to freeze.)
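
As a rough illustration of that scaling, the sketch below multiplies a per-CPU buffer size by the number of logical CPUs. The per-CPU size used here is hypothetical and chosen only for the example; the value the sampling driver really uses is not stated in this thread.

```python
import os


def estimated_allocation_bytes(n_cpus: int, per_cpu_bytes: int) -> int:
    """Total buffer size if a fixed amount is reserved per logical CPU."""
    if n_cpus < 1 or per_cpu_bytes < 0:
        raise ValueError("n_cpus must be >= 1 and per_cpu_bytes >= 0")
    return n_cpus * per_cpu_bytes


# HYPOTHETICAL per-CPU buffer size (4 MiB); the real value used by the
# sampling driver is not given in this thread.
PER_CPU_BYTES = 4 * 1024 * 1024

if __name__ == "__main__":
    n_cpus = os.cpu_count() or 1  # cpu_count() may return None
    total = estimated_allocation_bytes(n_cpus, PER_CPU_BYTES)
    print(f"{n_cpus} logical CPUs -> {total / 2**20:.0f} MiB in total")
```

The point is only that the total grows linearly with the core count, so a request that is modest on a desktop can become large on a machine with hundreds of cores.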

Mark

Mark, thanks for the response:

- What version of Amplifier XE are you using?
We are using VTune Amplifier XE for Linux, version 2011 (Update 7) on active support.

- How many total cores (including HT if on)?
We have 32 physical CPUs, 8 cores per CPU, and hyperthreading is disabled.

- Are there any error messages in the /var/log/messages log starting with 'SEP3_1' or 'amplxe-runsa'?
We are checking on this now.

Mark,

There are no error messages with those terms in /var/log/messages.

However, the system console shows that just before the system freezeup, VTune tried to allocate more memory than was available.

Is there any way to (interactively or by configuration) restrict the amount of memory VTune allocates for each core?

>>...Operating system is Redhat Linux 6.0 SE

Is it a 32-bit or 64-bit edition? Is it a Server Edition?

>>...However, the system console shows that just before the system freezeup, VTune tried to allocate more
>>memory than was available...

How much memory was allocated before VTune crashed?
Did you try to increase a virtual file size?

I can only assume that some incorrect processing is happening in VTune's C/C++ code, like:

...
*p = ( * )malloc( ... ); // or new(...), or calloc(...), etc

// malloc returns NULL because it failed to allocate some amount of memory
// and processing continues because there is no verification that p is equal to NULL

// and of course VTune crashes...
...

64-bit. Redhat Linux SE stands for Security Enabled. The system is running in permissive mode, non-virtual.

I will check on memory allocation size before the crash.

I think you misunderstand the issue. The entire system is crashing, not just VTune.

I was hoping a simple configuration change to the memory allocation could be applied as a temporary fix.

This issue is also being worked via case 657396.

- Rob

Quoting cachecoherent: 64-bit. Redhat Linux SE stands for Security Enabled. The system is running in permissive mode, non-virtual.

I will check on memory allocation size before the crash.

[SergeyK] Any details?

I think you misunderstand the issue. The entire system is crashing, not just VTune.

[SergeyK] I understood the problem completely and I explained why it happens. Another possible reason is that VTune corrupts an operating system stack after its memory request failed. After that, the OS crashes.

I was hoping a simple configuration change to the memory allocation could be applied as a temporary fix.

[SergeyK] If you install more RAM that could help.

Best regards,
Sergey

Memory allocation size before the crash was shown as something small, around 2GB, much smaller than system RAM. That is the last message in the log, however, and probably not the last operation that occurred.

We have run the latest patch of VTUNE (Dec 20 2011) and the crash still occurs.

Now when we experience a crash, the system does not log a memory allocation. Instead it goes straight into kernel panic.

The system has 2TB RAM in total (64GB per processor, 32 processors). We will not be installing any more RAM. We would expect that 2 Terabytes of RAM is sufficient for the execution of VTUNE Amplifier on /bin/ls.
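
As a quick sanity check of the figures quoted in this thread, the sketch below compares the reported allocation with total RAM. It treats "GB" as GiB for simplicity, and it says nothing about how the kernel has to satisfy the request (for example per node or as contiguous memory), which this thread does not establish.

```python
GIB = 2**30

processors = 32                 # processor count stated above
ram_per_processor = 64 * GIB    # RAM per processor stated above
reported_allocation = 2 * GIB   # "around 2GB" per the console message

total_ram = processors * ram_per_processor
fraction = reported_allocation / total_ram

print(f"total RAM: {total_ram / 2**40:.0f} TiB")
print(f"reported allocation: {fraction:.2%} of total RAM")
```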

As per Intel's recommendation, we ran the same Nehalem General Exploration test on the Tachyon sample application that ships with VTUNE Amplifier XE 2011 and found the same result.

I have attached the uvcon log from our machine showing the contents of the kernel panic. Hope that helps.

Attachment: uvcon.log (254.83 KB)

Attached is the result from the Amplifier Feedback reporting tool:

amplxe-feedback.exe --create-bug-report=report.txt

Attachment: sep_report.zip (128.77 KB)
