I have a system with 32 processors (cores), but performance decreases on large messages. Is there a set of variables I should change when going from a 16-core system to a 32-core system?
Have you tried using the tuning capability? It is described in the Intel MPI Library Reference Manual (Chapter 4 for Windows* or Chapter 3 for Linux*). We provide an automatic tuning utility that tests different values for the tuning parameters, based on results from either the Intel MPI Benchmarks or a user-specified application.
This should give you a starting point for improving your application's performance. If you can provide more details about the application (resource and communication usage in particular) and the systems in question, I can try to give some additional information.
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools