Using PBLAS for large matrix-vector products

I have a very large symmetric NxN matrix (N ~ 4-5x10^4) and need to compute y = A*x. I have been using dspmv from BLAS level 2 for the matrix-vector multiplication, since packed storage lets the matrix fit in memory. How do I parallelize this on a multi-core machine without greatly increasing my memory footprint? The memory requirement for this problem size is currently 6 GB, and I want to stay under 8 GB, so unpacking into a full matrix is not an option.

Please allow me to ask some questions for clarification:

1) Are you doing multi-threading (shared-memory) parallelization?
2) Would you like to parallelize matrix-vector multiplication by doing multiple dspmv calls from different threads?

I cannot think of an easy way to achieve (2). Also, matrix-vector multiplication at your problem sizes may scale poorly on single-socket systems.
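For anyone who wants to experiment with a hand-written shared-memory version instead of calling dspmv from several threads, here is a minimal sketch using OpenMP. It is not an MKL or PBLAS routine: the function name packed_symv_omp is made up for illustration, and it assumes the upper-triangle, column-major packed layout that dspmv uses with uplo='U', where A(i,j) for i <= j sits at ap[i + j*(j+1)/2] (0-based). Each thread accumulates into its own length-N buffer, so the extra memory is one vector per thread (about 400 KB each at N = 5x10^4) and the packed matrix is never copied.

```c
#include <stddef.h>
#include <stdlib.h>

/* Sketch only: y = A*x for a symmetric A in packed upper, column-major
 * storage. The name and interface are hypothetical, not part of any BLAS.
 * Compiled without OpenMP, the pragmas are ignored and it runs serially. */
void packed_symv_omp(int n, const double *ap, const double *x, double *y)
{
    for (int i = 0; i < n; ++i)
        y[i] = 0.0;

    #pragma omp parallel
    {
        /* Private accumulator: avoids races on y without locking per element. */
        double *ylocal = calloc((size_t)n, sizeof *ylocal);
        if (ylocal == NULL)
            abort();

        /* Columns get longer as j grows, so hand them out dynamically. */
        #pragma omp for schedule(dynamic, 64)
        for (int j = 0; j < n; ++j) {
            /* size_t index: j*(j+1)/2 approaches the 32-bit int limit at N ~ 5x10^4. */
            const double *col = ap + (size_t)j * ((size_t)j + 1) / 2;
            const double xj = x[j];
            double dot = 0.0;
            for (int i = 0; i < j; ++i) {
                ylocal[i] += col[i] * xj;  /* A(i,j) * x[j] goes to y[i]           */
                dot += col[i] * x[i];      /* A(j,i) = A(i,j), times x[i], to y[j] */
            }
            ylocal[j] += dot + col[j] * xj; /* diagonal term */
        }

        /* Merge the per-thread partial results into y, one thread at a time. */
        #pragma omp critical
        for (int i = 0; i < n; ++i)
            y[i] += ylocal[i];

        free(ylocal);
    }
}
```

Calling packed_symv_omp(n, ap, x, y) on the same packed array passed to dspmv should give y = A*x, assuming that array holds the upper triangle. Each packed element is read once per product, so any speedup is likely to be capped by memory bandwidth rather than by the number of cores, which fits the scaling caveat above.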

Thank you,