Is DOS the ideal parallel environment - Part I

By Asaf Shelly (16 posts) on August 25, 2008 at 3:55 pm

Previous article described the evolution of computer languages from early days Assembly to Object Oriented languages favored today. The emphasis was that early languages had clear execution flow that could be abused into "spaghetti code" (so many jumps / GoTos in the code that it becomes unmanageable), and on the other hand computer languages evolved to adopt the Object Oriented paradigm which can be abused into "spaghetti flow" (single flow passes so many objects that execution flow is untraceable).

 

These questions are raised only today because until now we had only a single core CPU and we used a few threads very carefully. Today we are facing a revolution that could bring 128 CPU cores very soon. In order to utilize the CPU we must use multiple threads. To fully utilize a 128 core CPU we must use at least 128 threads (worker threads, not waiting threads). The bad news is that most applications do not have 128 different processes at the same time. Usually it is between one to a few CPU consuming processes simultaneously which means that we will end up having multiple threads doing the same job. Using locks is out of the question because a lock is a way to pin down a resource to a single core which is the opposite of using multiple cores.

 

Before we go any further to talk about advanced solutions and new models, we should take a look at existing technologies that are available today. The technology that most affects us is the environments in which out code runs. We must familiarize ourselves with these environments – the operating systems and its services.

 

Going from the most commonly used to the least, I will start with Windows (NT based) OS. It is important to mention that the Windows OS is made out of two completely different operating systems. One is User Mode With its Win32 API and the other is Kernel Mode that has a completely different programming model, API, and execution environment. We start with Windows User Mode, then UNIX, continue to Windows Kernel, and last explore DOS as an ideal parallel system.

This article is broken into four parts because of it's length.

 

 

Windows User Mode

Originally User Mode Windows started as an expansion of DOS. Later this operating system was ported over the Windows NT Kernel instead. The Windows OS was based on graphical windows and each window is owned by a thread. This is a very good design because instead of sharing ownership over the resource (window) only a single owner exists, thus no need for locks.

Windows OS uses messages to initiate operations on a window, so each window responds to a predefined set of operations such as 'mouse move', 'window resize', and 'plug & play events'. We can say that the window supports predefined sets of operations and the system can initiate these operations. In other words every window supports a predefined set of Tasks. When we want our window to support new types of tasks, other than the predefined, we register the new type of Task with the system or we silently define it internally between our windows.

 

The concept of events sent to an object is very good. The implementation defined a huge switch-case that detects the type of event (or Task) sent and made it almost impossible to manage the code. Then came along MFC which was supposedly an Object Oriented wrapper. This was bad in so many ways. First of all Object Oriented code only works if the design was Object Oriented. MFC tried to wrap the flat design. On the other hand MFC now hid the huge 'switch-case' implementation and made it even harder to add event handlers on the fly. Eventually the event-driven model of windows was almost forgotten.

 

User Mode Windows had all these window messages sent through a queue. The queue belongs to the thread that owned the window. This model is very good and still exists. Every thread has a message queue and every thread can post messages to a queue that belongs to another thread or to its own queue. Using message queue communication means that threads do not share resources and thus no locks are used. No locks means no deadlocks, no race conditions, and means an increase in overall system performance (because a lock is a way to pin a resource down to a single core). This is a very good model that was also long forgotten and most times when a queue is required developers use Message-Queues engines, databases, or network sockets. The model is very good but the implementation was somewhat problematic, having some very limited queue length, API that uses the windows messages format, and almost no marshaling support which means that allot of work is required for a very simple operation.

 

Windows User Mode uses processes and threads. Every DLL loaded to the process receives a callback when the process is created and when it exits. Also every dll is called for every thread created and every thread closed. This allows implementations of extremely good technologies such as TLS (see this link: http://AsyncOp.com/Link.aspx?TLS ). However most developers do not really know TLS even though it exists for over 10 years, they do however know how to create a thread and the intuition says that when you want a task performed you create a thread and when you want it closed you close that thread. On the Window User Mode OS this means calling all the initialization callbacks in all the DLLs attached to the process. So here again we have an exceptionally good mechanism that almost nobody is using and instead we get bad code implementations. UNIX can live with multiple processes and threads created and destroyed. Under Windows User Mode OS if the actual work to do is about 10 lines of code long, then creating a thread and closing it should multiply work that would slow the process approximately 100 times in relation to a process that recycles threads (thread-pool).

 

 

 

Next part of this article analyses the built it in parallelism of the UNIX OS.

 

 

 

 

Categories: Parallel Programming

Comments (8)

August 25, 2008 5:58 PM PDT


Quin
"No locks means no deadlocks...."

Wrong. See Herb Sutter's "Many Faces of Deadlock":

http://www.ddj.com/hpc-high-performance-computing/209900973
August 25, 2008 10:35 PM PDT

Asaf Shelly
Total Points:
1,430
Status Points:
930
Brown Belt
Hi Quin,

That makes an interesting question. When two threads are Deadlocked on two lock objects the system is still working properly in terms of system state and each thread is in the designed state / phase. I am still trying to figure out the scenarios for messages deadlocking, but it seems to me that one of the threads has to be in the wrong state / phase. For example lost a message and therefore is not working on the task, or had an exception during a message handling which means that it lost a task. In any event that I can see here the thread was not working correctly and you can find the bug by going over the code. In a mutual lock deadlock two threads are working correctly but the system design is flawed, which is by far harder to track down. Wouldn't you agree?

I accept the fact that there is generally a problem that needs to be addressed: What do we do to make sure that our system does not lose tasks. This adds up to another good question: What do we do if one of the system elements is flooded. You can't just lose the message, but you can also not keep queuing it. If several threads handle a task in a chain then we need to implement to management mechanism or the threads will eventually have to deal with these extreme conditions locally.

Asaf
August 26, 2008 11:40 AM PDT


Quin
ISTM that it is only semantics to say that "When two threads are Deadlocked on two lock objects the system is still working properly in terms of system state and each thread is in the designed state / phase", but to say that two message-based threads are in a non-working state when mutually waiting on messages from one another.

The example Sutter gives on the first page of that article intends to show how a message-based system can still be deadlocked because both threads are unable to continue until they receive a message from the other thread, but neither can send that message because neither can continue. Replace "send a message" and "receive a message" with "acquire a lock" and "release a lock" and you've got deadlock in terms of locks.
August 26, 2008 3:01 PM PDT

Asaf Shelly
Total Points:
1,430
Status Points:
930
Brown Belt
I can see why, and yet I still see a difference. You can have two perfectly designed threads that were also debugged separately but when combined might block each other because the order of acquisition is different. This bug can only be detected on integration.
If thread A is supposed to post a message to thread B and does not then it must have lost the message or had broken flow. Otherwise why would it not send the message?
Also threads wait for any message so other threads can still work with the two threads and be serviced.

I'm still looking for a use scenario that cannot be detected by a simple code review.
August 28, 2008 9:58 AM PDT


Quin
Why could one order inversion be detected only at integration while the other can be detected earlier in code reviews? It seems to me that both could be detected in code reviews or, failing that, at integration (or worse -- in the field). Of course the more complex the code is, the harder it will be for manual code reviews to catch such things.

As for losing a message or broken flow, if thread A is waiting on thread B to send a message to A, and B is waiting on A's message before it sends this message, you have deadlock. You can call that "broken flow" if you want, but I don't see a substantive difference between this scenario and one where the sending and receiving of messages is replaced by the releasing and acquiring of locks.

I'm not sure what you're getting at with: "Also threads wait for any message so other threads can still work with the two threads and be serviced." Are you specifically referring to a Windows GUI's message pump only? If so, fine, but that's not the only use of message-passing for parallelism.
August 28, 2008 10:53 AM PDT

Asaf Shelly
Total Points:
1,430
Status Points:
930
Brown Belt
Here is thread A:
ThreadProc()
{
Lock L_File;
Lock L_Database;
read from file and write to database
Unlock L_Database;
Unlock L_File;
}

Here is thread B:
ThreadProc()
{
Lock L_Database;
Lock L_File;
read from database and write to file
Unlock L_File;
Unlock L_Database;
}

The first was created by the developers responsible for the document editor, the second by the team responsible for version control and backup. Code review to see that it is all good but integration will fail.

I can't find the scenario for two threads waiting for events of each other. The question is: what is the process / task / job that brought in the initial event, and how did that get lost.

If the two threads are servicing each other then who is receiving input from external systems and from the user? One of the two has to get some other external input somehow.

Asaf
September 8, 2008 1:03 PM PDT


Quin
// Thread A
mq1.receive(); // blocks
// ...
mq2.send( x );

// Thread B
mq2.receive(); // blocks
// ...
mq1.send( y );

Am I just thick, or is this deadlock with messaging?
September 9, 2008 6:40 AM PDT

Asaf Shelly
Total Points:
1,430
Status Points:
930
Brown Belt
The question is: What is the operation that we perform?
Is it "Do A,B" or "Do B,A"?
In a real-world scenario one of the two has to receive an external input to start the operation.

Asaf

Trackbacks (1)


Leave a comment  

To obtain technical support, please go to Software Support.
Name (required)*

Email (required; will not be displayed on this page)*

Your URL (optional)


Comment*