Lock or Be Free

Jeff Andrews wrote a great article about multithreaded game engines over on Whatif.intel.com. These are the concepts that the Smoke (n-way threaded game framework) demo was built on. One of the readers, Josh, brought up a good comment (check out the comments at the bottom of the article). One of the questions in his comment really got me thinking about lock step versus free step. Which is better? Which is easier? Which does Smoke use?

The last question is the easiest: Smoke uses lock step. Jeff goes into some good detail in section 2.1 about these two execution modes. Lock step just means all systems update at the same rate. For each frame, all systems update and sync. If a system takes longer to run, the other systems have to wait for it to finish. This is the biggest complaint against lock step. In free step, all systems run at separate frequencies… they update and sync based on events.
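To make the difference concrete, here is a small back-of-the-envelope sketch. It is not Smoke code: the system names and per-update costs are made-up numbers, and it assumes the simplest setup where each system has its own thread. It counts how many updates each system completes in a fixed time window under each mode.

```python
# Hypothetical per-update costs in milliseconds; illustrative numbers, not measurements.
COSTS_MS = {"graphics": 4, "input": 1, "physics": 10, "ai": 6}


def lock_step(costs_ms, duration_ms):
    """Every system updates once per frame; a frame lasts as long as the slowest system."""
    frame_ms = max(costs_ms.values())
    frames = duration_ms // frame_ms
    updates = {name: frames for name in costs_ms}
    # Time each system spends waiting at the sync point over the whole window.
    idle_ms = {name: frames * (frame_ms - cost) for name, cost in costs_ms.items()}
    return updates, idle_ms


def free_step(costs_ms, duration_ms):
    """Each system starts its next update as soon as its previous one finishes."""
    return {name: duration_ms // cost for name, cost in costs_ms.items()}


lock_updates, lock_idle_ms = lock_step(COSTS_MS, 60)
free_updates = free_step(COSTS_MS, 60)
```

With these assumed numbers, a 60 ms window gives every system 6 updates in lock step, and the graphics thread spends 36 of those milliseconds waiting. In free step, graphics gets 15 updates while physics still gets 6.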

Lock step does have its advantages. It’s easier to understand and code. It’s definitely easier to debug (you can check for sanity at each sync). As Josh points out, lock step doesn't suffer from the possibly increased frame latency of free step. But lock step can waste resources if other systems are waiting around… or does it?

First attempts at threading games mostly involved functional decomposition: just put each system (graphics, IO, physics, etc.) on a separate thread. If a system finishes updating and has to wait for another system, that system’s thread sleeps. This wastes resources because that thread could be doing more work… this is the major fault of lock step. However, Smoke uses a job pool and worker threads to support functional and data decomposition. So… if a system finishes its work, that thread can work on jobs for other systems! Now we are not wasting resources on lock step ^_^ Score! There are a few exceptions: if a system doesn’t divide its work properly and takes a long time to finish, the worker threads could end up unemployed.
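Here is a minimal sketch of the job pool idea, assuming a generic thread pool rather than Smoke's actual task manager. The system names, `make_jobs`, `run_job`, and `run_frame` are all hypothetical, and the "work" is just a sum of squares. Each system splits its update into small jobs, every job goes into one shared pool, and the frame syncs once all jobs are done. (Python threads only illustrate the structure here; they won't speed up CPU-bound work in CPython.)

```python
from concurrent.futures import ThreadPoolExecutor, wait


def make_jobs(system_name, items, chunk_size):
    """Split one system's update into independent jobs (data decomposition)."""
    return [(system_name, items[i:i + chunk_size])
            for i in range(0, len(items), chunk_size)]


def run_job(job):
    """Stand-in for real work: sum the squares of one chunk."""
    system_name, chunk = job
    return system_name, sum(x * x for x in chunk)


def run_frame(pool, systems, chunk_size=4):
    """One lock-step frame: submit every system's jobs, then sync on all of them."""
    futures = [pool.submit(run_job, job)
               for name, items in systems.items()
               for job in make_jobs(name, items, chunk_size)]
    wait(futures)  # the per-frame sync point
    totals = {name: 0 for name in systems}
    for future in futures:
        name, partial = future.result()
        totals[name] += partial
    return totals


# Hypothetical workloads: physics has four jobs, particles two, ai only one.
systems = {"physics": list(range(16)), "ai": list(range(4)), "particles": list(range(8))}
with ThreadPoolExecutor(max_workers=4) as pool:
    totals = run_frame(pool, systems)
```

Because the workers pull from one shared queue, a worker that finishes the single `ai` job simply picks up a `physics` or `particles` job next. If `physics` were submitted as one giant job instead, the other workers would run dry and wait, which is the exception described above.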

I have wanted to rework Smoke to run in free step mode. But I am comfortable with lock step. It’s easy for me to understand and explain… and I can easily map out the latency between the systems. I wonder if anyone will take up the challenge and get Smoke working in a free step mode… I’d love to hear if anyone out there gives it a try ^_^ I’d also like to hear more about people’s experiences with free or lock step. Which do you like? Is free step worth the possible headaches (especially with a large team of developers)?

While running free sounds nice, I think I’ll keep my projects locked down for the near future.


anonymous's picture

Yep, Destroy the Castle is free step in both the original and the Threading Building Blocks version: each frame is triggered as soon as rendering and input are processed. In practice, this yields around 5 frames per physics step on a 4-core system. But you can sure launch a lot of cannonballs very very rapidly! Destroy the Castle (including the TBB version) is available at http://software.intel.com/en-us/articles/code-demo-destroy-the-castle.

Making Smoke into a free step system requires two steps:

1) TaskManagerTP and TaskManagerTBB need to be structured to allow waiting on only some of the Systems. Presently they both wait on everything. For TaskManagerTP, events will need to be added for each System, and then the main thread should add a loop to test the events, and if they are not signaled, should pull something off of the work queue and execute it. TaskManagerTBB needs to be modified to put each system into its own root, then the roots can be waited on, one after another. This is something we had designed early in Smoke's development, but pulled after we realized that Smoke would be lock step.

2) The Scheduler needs to have a policy to call TaskManager::WaitForSystemTask on only the Graphics and Input Systems every frame, and the other Systems at some more extended interval. The TaskManagers could be modified to provide a quick check on whether a given System has completed. This would allow the Scheduler to run each System as soon as its previous execution has finished.
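The two steps above can be sketched roughly as follows. This is not Smoke's TaskManager or Scheduler code: it uses Python's `concurrent.futures` as a stand-in, `update` and `run_free_step` are hypothetical names, and `Future.done()` plays the role of the proposed quick check on whether a System has completed.

```python
from concurrent.futures import ThreadPoolExecutor, wait

FRAME_SYSTEMS = ("graphics", "input")  # waited on every frame
FREE_SYSTEMS = ("physics", "ai")       # polled; they never block the frame


def update(name, counts):
    """Stand-in for a System update. Each name has at most one task in flight."""
    counts[name] += 1


def run_free_step(frames, pool):
    counts = {name: 0 for name in FRAME_SYSTEMS + FREE_SYSTEMS}
    pending = {}  # most recent task for each free-running system
    for _ in range(frames):
        frame_tasks = [pool.submit(update, name, counts) for name in FRAME_SYSTEMS]
        for name in FREE_SYSTEMS:
            task = pending.get(name)
            # Non-blocking completion check: relaunch only if the last run finished.
            if task is None or task.done():
                pending[name] = pool.submit(update, name, counts)
        wait(frame_tasks)  # block only on the per-frame systems
    wait(list(pending.values()))  # drain outstanding work before returning
    return counts


with ThreadPoolExecutor(max_workers=4) as pool:
    counts = run_free_step(20, pool)
```

Graphics and input run exactly once per frame. Physics and AI run somewhere between once and once per frame, depending on how quickly their previous update finished, so their counts vary from run to run.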

Orion Granatir (Intel)'s picture

Destroy the Castle is a good example (thanks Alexey)... you can find it here: http://software.intel.com/en-us/articles/code-demo-destroy-the-castle

Alexey Kukanov (Intel)'s picture

I believe that the Destroy the Castle demo, or at least its parallel variant that uses TBB, is written as free step. Physics, AI, and particles each run at their own pace, independent of the frame rate. Its code should exist somewhere on ISN, I believe.

Orion Granatir (Intel)'s picture

By the way, you can download all the source code over on http://software.intel.com/en-us/whatif
