We encountered a problem when migrating an application from Intel MPI 4.1.3.049 to 5.0.3.048. The code in question is a complex simulation that first reads global input state from disk into several parts in memory and then accesses this memory in a hard-to-predict fashion to create a new decomposition. We use active-target RMA for this (on machines that support it well, such as BG/Q, we also use passive target), since a rank may need data from a part held by another rank in order to form its halo.
I am seeing poor MPI barrier performance in a microbenchmark on this system configuration:
Hello, I would like to run an asynchronous calculation, but I am having a hard time understanding what the Intel user and reference guides say about this. I have code that looks like the following.
I have a system with 4 MIC cards.
When I start a process in offload mode on mic0, one core on each of the other MIC cards is occupied by a coi_daemon process. Why is that?
Unfortunately, I get high variance in my timings when the other MIC cards are being used by other users.
This post covers two questions. I really only need a (positive) answer to one of them, as that would be enough to solve my problem, but it would be nice to get answers to both.
1. Is it possible to write to disk from the offload region?
2. How can I use memory allocated inside the offload region in the host?
I've got a problem with my program. Here is simple code that shows the problem.
Is it possible to set up an external network bridge for the Phi when using a Windows-based operating system?
This information is not provided in the User's Guide.
So far I have simply selected the Phi and external network adapters in Windows and created a bridge between them.
What are the next steps for this to work?
Do I absolutely need to use Linux for this feature?
I would like to ask whether there is a way to use sssd instead of ldap for user authentication on the MIC?
Thank you very much.