How does it work without threads?!?

Okay, I'm baffled. All of the literature I've been able to access says it's "better than threads", "18x faster than threads", etc., but I haven't found anything yet that explains the actual mechanism by which this is all implemented. Templates are nice, but it all has to be code (maybe even machine code) eventually.

How is this accomplished without calls to clone(2) under Linux? Or, if it does use clone(2), why would startup be so much faster?

Also, is it possible to use TBB in a .so file where the loading program has no knowledge of it? If clone is involved, the parent program might be surprised to find it has children, to start receiving signals, and so on.

Is there a TR somewhere that describes the implementation?



Intel TBB has threads. In fact, it maintains a thread pool that by default is created with one thread per processing element. What you're seeing with the "better than threads" comments is a desire to shift the focus off those threads and onto the actual work the developer is trying to accomplish. Threads are a construct tied to the particular hardware your program is running on. "Tasks" are units of work that can be described without reference to the underlying threads. There's an intended philosophy shift towards thinking about the work, not the workers, and leaving worker management to the library. And now that Threading Building Blocks has been provided to the open source community, you can download the code and see how it works for yourself.
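To make the pool-and-tasks idea concrete, here is a minimal sketch in standard C++. It is not TBB's code: the names `TaskPool` and `parallel_sum` are made up for illustration, and a single shared queue stands in for TBB's work-stealing scheduler. What it does show is that the threads are created once, and after that each unit of work is just a small function object handed to whichever worker is free. That is probably where the startup savings come from, since no new thread is created per task.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical illustration only: a fixed pool of worker threads draining a
// shared task queue. TBB's real scheduler is more sophisticated than this.
class TaskPool {
public:
    explicit TaskPool(unsigned workers = std::thread::hardware_concurrency()) {
        if (workers == 0) workers = 1;  // hardware_concurrency() may report 0
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Lets the workers finish any queued tasks, then joins them.
    ~TaskPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_all();
        for (auto& t : threads_) t.join();
    }

    // Submit a unit of work; the caller never names a thread.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // shutting down, nothing left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run the task outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> threads_;
    bool done_ = false;
};

// Sum 0..n-1 by describing the work as chunks, one task per chunk.
long long parallel_sum(std::size_t n, std::size_t chunk) {
    assert(chunk > 0);
    std::atomic<long long> total{0};
    {
        TaskPool pool;
        for (std::size_t begin = 0; begin < n; begin += chunk) {
            std::size_t end = std::min(begin + chunk, n);
            pool.submit([&total, begin, end] {
                long long local = 0;
                for (std::size_t i = begin; i < end; ++i)
                    local += static_cast<long long>(i);
                total += local;
            });
        }
    }  // the pool's destructor waits for every task to finish
    return total.load();
}
```

Calling `parallel_sum(1000, 64)` splits the range into 16 tasks regardless of how many workers the machine happens to have, which is the "think about the work, not the workers" point in miniature.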

Thanks for your reply. I'm very happy to see Intel open source this code.

That said, it would be useful if there were also a concise technical summary available of the mechanism and compatibility with other techniques. For my own project, for example, I'm curious whether TBB-based code could be linked into a Python module. This would be very useful, but it's not obvious whether TBB code would be compatible with the Python interpreter.

I don't know much about linking to Python, but a quick scan of the Python/C API reference manual suggests to me that the relationship between Python and its C extensions is fine-grained and tightly coupled. My first concern about integrating a TBB-managed module into this environment would be the thread safety of that environment. TBB has its own concurrent containers because the STL versions are not thread-safe. Are there similar issues with Python data structures? Consider reference counts. TBB offers an atomic template class to provide thread safety for bumping them in a multi-processor environment. Are Py_INCREF() and Py_DECREF() thread safe?
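As a rough illustration of the reference-count concern, the sketch below bumps a shared counter from several threads. It uses `std::atomic` from standard C++ as a stand-in for TBB's atomic template, and `RefCounted`, `incref`, `decref`, and `hammer` are hypothetical names, not part of TBB or the Python/C API. For comparison, in traditional CPython builds Py_INCREF() and Py_DECREF() are plain, non-atomic updates that rely on the caller holding the global interpreter lock, so calling them from worker threads that do not hold it would not be safe.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical object with a reference count, loosely analogous to the
// counter that Py_INCREF()/Py_DECREF() adjust. Starts with one owner.
struct RefCounted {
    std::atomic<long> refs{1};
};

// Atomic increment: safe even when several threads call it at once.
void incref(RefCounted& obj) {
    obj.refs.fetch_add(1, std::memory_order_relaxed);
}

// Atomic decrement; returns true when the last reference was dropped.
bool decref(RefCounted& obj) {
    long previous = obj.refs.fetch_sub(1, std::memory_order_acq_rel);
    assert(previous > 0);  // the count must never go negative
    return previous == 1;
}

// Take and release a reference `iterations` times on each of `threads`
// threads, then report the final count.
long hammer(unsigned threads, int iterations) {
    RefCounted obj;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&obj, iterations] {
            for (int i = 0; i < iterations; ++i) {
                incref(obj);
                (void)decref(obj);  // never the last reference here
            }
        });
    }
    for (auto& w : workers) w.join();
    return obj.refs.load();
}
```

With the atomic counter, the count always returns to its starting value once every thread has released what it took. Replacing `std::atomic<long>` with a plain `long` would make the same loop a data race, and updates could be lost.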

Given these sorts of questions, I would be very careful about attempting such an integration. Though it may not match the philosophy of embedded C in Python, I'd start with a self-contained plug-in that exposes parallel programming through its use of TBB to complete some specific computationally intense task and doesn't rely on Python object structures intervening in that computation.
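One possible shape for such a self-contained plug-in is sketched below, assuming the caller can hand over a raw buffer (for instance through Python's ctypes module). The entry point `sum_of_squares` is a made-up example, and plain `std::thread` is used so the sketch builds without TBB installed. A TBB version would replace the hand-rolled chunking with one of the library's parallel algorithms. The point is the boundary: only a pointer and a length cross it, and no Python object is touched by the worker threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical C entry point for a shared library. Name and signature are
// illustrative. Exceptions (for example from thread creation) are not
// handled here and must not be allowed to escape in production code.
extern "C" double sum_of_squares(const double* data, std::size_t n) {
    assert(data != nullptr || n == 0);

    unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = (n + workers - 1) / workers;  // ceiling division

    std::vector<double> partial(workers, 0.0);  // one slot per worker
    std::vector<std::thread> threads;
    for (unsigned w = 0; w < workers; ++w) {
        std::size_t begin = std::min(n, w * chunk);
        std::size_t end = std::min(n, begin + chunk);
        threads.emplace_back([data, begin, end, w, &partial] {
            double acc = 0.0;
            for (std::size_t i = begin; i < end; ++i)
                acc += data[i] * data[i];
            partial[w] = acc;  // each worker writes only its own slot
        });
    }
    for (auto& t : threads) t.join();  // all workers done before returning

    double total = 0.0;
    for (double p : partial) total += p;
    return total;
}
```

Because every worker is joined before the function returns, the loading program sees an ordinary blocking call, and nothing is left running behind its back.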
