Simple question on threading

We are a member of the Intel Software Partner program. Our product is in the field of semantic-based search and natural language processing.

The core technology will likely involve massively parallel computations as part of an artificial intelligence algorithm. The form of parallelism we envision is very simple and does not involve graphics computations.

My question: how can I get very simple, limited input on the best use of the Intel architecture to do this?

Issue: we have a fundamental question on approach: parallelism via the CPU (i.e., Intel) versus alternative hardware like FPGAs or GPGPUs (aka Nvidia). I know, boo/hiss, I said the Nvidia word.

My preference is to stick with a commodity platform, like Intel, rather than consider alternatives like FPGA or GPGPU hardware. Very high-level guidance, or pointers to literature that provide such an answer, would be greatly appreciated. I don't think it should take long.

Caveat: we need the ability to scale to web scale with the least cost/hardware footprint.

Thank you in advance for the assistance.

George J. Shannon
President, Raphael Analytics, Inc.

Hello George,

I've moved this thread to the Threading on Intel Parallel Architectures forum because we have phased out the Intel Software Partner Program forum. Your question is certainly on topic here as well.

Thanks for your question.

Best regards,

==
Aubrey W.
Intel Software Network Support

George,

Without seeing your full functional specification (which you might not be able to disclose) it is rather difficult to offer advice....

My experience in programming GPGPU's goes back a few years, and much has happened since then so my information may be out of date.

GPGPU programming was relatively easy for simple problems. However, as the complexity of the problem increased, GPGPU programming became harder, especially when you had to do debugging.

Programming complex problems on an Intel64 platform is much easier, especially when using an IDE such as Visual Studio and/or Eclipse.

So you have to look at the programming considerations.

For GPGPU you have two principal vendors: nVidia (Tesla) and ATI (FireStream). The work you do for one might not port to the other (although this may have changed).

Selecting Intel64 and/or AMD64 gives you a compatible programming and debugging environment (though the environments differ between Windows and Linux).

>>Caveat: we need to have the ability to scale to a web-scale with the least cost/hardware footprint.

This is difficult to say without the particulars.

Are we talking a single workstation, a large server, a farm of servers, a cloud of servers, ... other?
Does power and facilities cost factor in?
How does time factor in?
...

Jim Dempsey

www.quickthreadprogramming.com

I think your actual questions may be obscured by buzzwords, but I'll take a stab.
Parallel programming models continue to be developed which are intended to cover the range of architectures you mention.
Intel compilers have supported OpenMP as well as Windows or pthreads parallelism for several years. The TBB threading model for C++ has also become well established. These work with multiple vendors' products. Current Intel C++ has added Cilk Plus and ArBB for threaded parallelism. They are being extended to the Intel many-core MIC architecture.
For a degree of portability across GPGPU-like architectures, the available alternatives include extended OpenMP-style models with traditional programming languages, such as the PGI CUDA compilers and the similar Intel MIC compilers, and OpenCL. This area is evolving rapidly, with most vendors aiming to converge in some fashion with "big CPUs" over the next 5 years.
When you speak of web scale, one possibility that comes to mind is Hadoop clustering on a Java-based model. Big strides in expanding deployment and efficiency have been made this year. It seems likely that serious efforts to incorporate massive parallelism will be undertaken over the next 2 or 3 years, but the software and hardware components probably don't exist yet, so this clearly falls in the research-topic area.
