Hello,

An M/M/n queuing model simulation with Object Pascal and my Thread Pool Engine - version 1.02

You can download it from:

http://pages.videotron.com/aminer/

Read more bellow...

Author: Amine Moulay Ramdane

Description:

It's harder and sometimes impossible to get analytical results about waiting times and queue length for general interarrival and service distributions; so, it's important to be able to estimate these quantities by observing the results of simulation.

It's very easy in Object Pascal to simulate a sequence of arrival times with a given interarrival distribution.

Look at the examples MM1.pas( M/M/1 queuing model) and MMn.pas(M/M/n - n: number of servers -) inside the zip file:

---------------------------

InterArrivals:=TExponentialDistribution.Create(420623,1.0/3.0);

ServiceTimes:=TExponentialDistribution.Create(220623,1.0/4.0);

currtime:=0.0;

for i:=1 to simnumber

do

begin

obj:=TJob.create;

obj.simnumber:=simnumber;

obj.number:=i;

currtime:=currtime+InterArrivals.Sample;

obj.ArrivalTime:=currtime;

obj.Servicetime:= ServiceTimes.sample;

TP.execute(myobj.myproc1,pointer(obj),NORMAL_PRIORITY);

end;

-------------------------------------------

Here we have the InterArrivals object and ServiceTimes object and we are calling InterArrivals.Sample to get our samples from the Exponential Distribution.

After that we are calling myobj.myproc1 to simulate our M/M/1 queuing model...

If you look at MM1.pas , you will see that the arrival rate is: 3 and the service rate is 4 , so, this will give us a theoretical value of 1/(4-3) = 1 for one server, and the Object Pascal simulation gave me 1.02 for one server.

Now i want to talk about themathematical modeling of

queuing networks , if you have noticed in a webserver

that uses a distributed database , the database

have to be modeled as a queuing system with an

hyperexponential service, but this will complicate the

mathematical modeling of the webserver since the

database have to be modeled as an M/G/1 queuing system,

so we have to simplify the mathematical modeling , so if

for exemple the database's write transactions takes in average

much more time than a database's read transaction so we have

to choose the worst case scenario , that means we have to choose

the rate of the database's write transactions as the rate

of the arrival to the queuing system of the database so

that the database can be modeled as an M/M/1 queuing system

(M: means markovian), so that the webserver can finally be

modeled as two M/M/1 queuing system in series, one for

the database and one for the internet network.

But there is still a problem, if we want to also resolve the

worst case scenario when for exemple thousands of custumers

arrive at the same time to the webserver.. so the webserver

have to be modeled taking into acount this worst case scenario...

Here is the matematical modeling of an M/M/1 queuing system:

We already know that to satisfy a Poisson process we must

have that N(t1)- N(t0), N(t2)- N(t1) etc. must be independant

that means the counting increments must be independant.

We have the following relation between the Poisson law

and Exponential law:

the expected value E(X exponential) = 1 / E(X poisson)

so if the arrival is poissonian then the interarrivals are

exponential..

Now i will calculate the expected mean waiting time and

mean number of custumers in the queuing system:

The Little law says that the waiting time in the queuing system:

Ws = Ls/lambda

And the waiting time in the queue of the queuing system is:

Wq = Ws - 1/Mu

= Ls/lambda - 1/Mu

And the Little law gives us the expected mean number of custumers in the

queue:

Lq = Wq*lambda = Ls - Phi et Phi = lamda/Mu

That implies:

Ls - Lq = Phi

When the system is in a stationary state , the balance equations gives:

State rate in = rate out

state 0: Mu * P1 = lambda*P0

state 1: lambda*P0 + Mu*P2 = lambda*P1 + Mu*P1

state 2: lambda*P1 + Mu*P3 = lambda*P2 + Mu*P2

...

state n: lambda*Pn-1 + Mu*Pn+1 = lambda*Pn + Mu*Pn

And that gives us the following balance equations:

lambda * P0 = Mu*P1 <=> P1 = (lambda/Mu)*P0

lambda * P1 = Mu*P2 <=> P1 = (lambda/Mu)^2*P0

lambda * P2 = Mu*P3 <=> P1 = (lambda/Mu)^3*P0

...

lambda * Pn-1 = Mu*Pn <=> P_n = (lambda/Mu)^k*P0 [1]

Note: P0 means the probality of having zero custumer in the queuing system.

and ^ means power.

And we have also that:

Sum(n=0 -> infinite) (Pn) = 1 , that means the sum of probabilities equal 1

And [1] gives us:

Sum(n=0 -> infinite) (Phi^n*P0) =1

And by resolving the sum of a geometric series that gives us:

P0 * 1/(1-Phi) = 1 => P0 = 1 - Phi and Phi < 1

And [1] gives us:

Pn = (1-Phi) * Phi^n

And we have the mean expected number iof custumer in the queuing system is:

Ls = Sum(n=0 -> infinite) (n*Pn)

That implies:

Ls = Sum(n=0 -> infinite) (n*(1-Phi)*Phi^n) = Phi/1-Phi and Phi<1

and we have the mean expected number of custumers in the queue of the queing system is :

Lq = Ls -Phi = Phi^2/ (1-Phi) et Phi<1

Little law gives us the mean waiting time in the system and in the queue:

Ws = 1/lambda * Phi/(1-Phi) and Phi<1

et

Wq= 1/lambda * Phi^2/(1-Phi) and Phi<1

I have just read the following page, look at the the USL

(Universal Law of Computational Scalability) of Dr. Gunther,

he wrote this: (See: http://en.wikipedia.org/wiki/Neil_J._Gunther )

--------------

The relative capacity C(N) of a computational platform is given by:

C(N) = N

------------------------------------------

1 + α (N - 1) + ß N (N - 1)

where N represents either the number of physical processors

in the hardware configuration or the number of users driving the

software application. The parameters α and ß represent respectively

the levels of contention (e.g., queueing for shared resources) and

coherency delay (i.e., latency for data to become consistent) in the

system. The ß parameter also quantifies the retrograde throughput

seen in many stress tests but not accounted for in either Amdahl's law

or event-based simulations.

----------

His website: http://www.perfdynamics.com/

If you read carefully , you will see that Dr. Gunther is using this

model to predict scalability after he simulates a relatively small

number of vusers in LoadRunner ( because of licensing costs, it's

cost-effective) and after that he finds the coefficients of the

2nd-degree polynomial (quadratic equation) and then transform

those coefficients back to the USL parameters using the α = b - a

and ß = a.

And then he is extrapolating with the USL model to higher loads

to predict scalability.

He is also applying the model to webservers with heterogeneous

workloads. like in the following page:

http://perfdynamics.blogspot.com/2009/04/assessing-usl-scalability-wi...

Now my question follows:

Suppose we have obtained a small number of measured load-points

with Loadrunner or others tools, and we calculated the USL equation

to predict scalability of a webserver , how the USL model can predict

if the scalability/performance is limited by the network bandwidth

and not the server ? I think USL can not predict this.

When we are modeling webservers , we have to include

the network node in our network queuig model (that includes

the computer server queue) , and since the service in the

computer server is comprised of multiple services (when we

are using htmls , databases etc.) the network queue will not

be markovian in its service , and we have to model the network

queue as an M/G/1 and this will complicate the mathematical

analytical modeling...

So, i think the best way is to use a webserver stress tool

like http://fwptt.sourceforge.net/

You can even test the webserver with an heterogeneous

workloads by starting multiple fwtpp processes, and

you should increase the number of threads to 5 and after

that to 10 threads, 15 ... and so on until the webserver

applications stops responding propperly(and this will inform

on the maximum number of users that you can have in the same time...)

and as you are stress testing you can even see (by testing/measuring

it) if the network bandwidth is not the bottleneck... and this can

not be done by the USL model.

We already know that to satisfy a Poisson process we must

have that N(t1)- N(t0), N(t2)- N(t1) etc. must be independant

that means the counting increments must be independant.

We have the following relation between the Poisson law

and Exponential law:

the expected value E(X exponential) = 1 / E(X poisson)

and if the arrival is poissonian then the interarrivals are

exponential..

Now what about a webserver ?

I have read the following paper:

http://docs.google.com/viewer?a=v&q=cache:JFYCp_dSPP4J:citeseerx.ist....

And it says that a simple model with M/M/1/k with FCFS discipline

can predict webserver performance quite well..

Hence, i think we can model our webserver over internet

with 3 queues connected as a Jackson Network like this

A -> M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A

A: is the arrival rate

and we have the following:

Ni: number of jobs in each queue

Ui: utilization of each queue

Ni = Ui / (1-Ui)

Adding all the Ni in each individual queue will give the

average number of jobs in the entire queuing network.

After that we apply the Little formula:

A: network arrival rate

T: average response time

N = A*T => T = N / A

And after that from the mathematical analytical equation

we can simulate the jackson queuing network that model our

webservers...

Now there is still an important question that i have:

Our analytical jackson network model can give us insight

on the webservers behavivior.. but the difficulty that i find in

practice is that: suppose we have found the right parametters

that we want to choose, like for example the service rate of

the M/M/1 Server Queue , how , from this service rate, can

we buy the right computer that satisfies the service rate

that we want ?

We say in OR that:

"Understanding the behavior of a system is what Queueing Theory

and Little’s Law is all about. But, managing a Queue requires not

just understanding the behavior of a system, but also in-depth

knowledge of how to improve a system - improving both aspects

of Queueing will mean a better, more efficient and cost-effective

system and, more importantly, a much better customer experience."

I wrote before that:

---

"It says that a simple model with M/M/1/k with FCFS discipline

can predict webserver performance quite well..

Hence, i think we can model our webserver over internet

with 3 queues connected as a Jackson Network like this

A -> M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A

A: the arrival rate

and we have the following:

Ni: number of jobs in each queue

Ui: utilization of each queue

Ni = Ui / (1-Ui)

Adding all the Ni in each individual queue will give the

average number of jobs in the entire queuing network.

After that we apply the Little formula:

A: network arrival rate

T: average response time

N = A*T => T = N / A

And after that from the mathematical analytical equation

we can simulate the jackson queuing"

--

As you have noticed , this mathematical model of

this jackson network does in fact take into account

the M/M/1 Network queue node , the USL model can not

do this... and with this performance data from the mathematical

analytical model simulation we can for example validate

the performance data of the fwptt stress webserver simulation..

But you have to take into account worst cases and the

peak traffic loads...

Let for example we have a a webserver hosting html pages

and it is receiving 1000000 HTTP operations

per day with an average file size of 10 KB.

What would be the network bandwidth required for this website

considering peak traffic if the peak traffic load from past

observations was four times greater than average loads?

Required bandwidth is solved by the following equation:

HTTP op/sec x average file size or

1000000 HTTP ops per day =1000000/24 = 41,667 op/hour =

41,667/3600= 11.6 HTTP ops/sec

The needed bandwidth is

11.6 HTTP ops/sec X 10 KB/HTTP op = 116 KB/sec = 928 Kbps.

If we assume a protocol overhead of 20% then the actual throughput

required is 928 Kbps X 1.2 = 1,114 Kbps.

However if peak loads, as i said before, is as much as

4 times greater, the bandwidth required to handle spikes

would be 4 X 1,114 Kbps = 4.456 Mbps.

So you have to think also about the cost of this line...

I will add the following:

As you have noticed i said that:

"As you have noticed , this mathematical model of

this jackson network does in fact take into account

the M/M/1 Network queue node , the USL model can not

do this... and with this performance data from the mathematical

analytical model simulation we can for example validate

the performance data of the fwptt stress webserver simulation.."

and i said that:

"Hence, i think we can model our webserver over internet

with 3 queues connected as a Jackson Network like this

An M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client queue"

And of course on Capacity Planning for Enterprise Datacenters

and Websites , you can mirror many computer servers and load

balance between them with a software... to make the system much

FASTER, and this will be modeled as a jackson network like this:

A -> M/M/n Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A

A: the arrival rate to the system

But there is still an important thing , as i have showed before

on my calculations:

"However if peak loads, as i said before, is as much as

4 times greater, the bandwidth required to handle spikes

it would be 4 X 1,114 Kbps = 4.456 Mbps.

So you have to think also about the cost of this line..."

I think that you have also to take into account the knee utilisation

of your M/M/n Servers Queues, if for example the number of computer

servers is 8 and the Knee is 74% that means that in our previous

example the bandwidth must equal to:

74% X 4.456 Mbps = 3.297 Mbps.

Cause as you know, above this Knee of 74% for 8 servers

the curve of the waiting time does grow quickly ..

And we have to take into account the cost of the line ...

Let us take another example:

In the network node of the Jackson network, there is three

main parameters that are responsable for its performance:

latency, bandwidth and utilization.

Now, if for example you take a look at my provider

http://www.videotron.com/service/internet-services/internet-access/hi...

My upload network speed is clearly 820 Kbs => 102.5 Kbytes/sec

Hence, if the average file size on my web site is for

example 25 Kbyte and the protocol overhead is 20%, that

means that the file size will be in average:

25 * 120% = 30 Kbyte

And if the Knee of the M/M/1 queuing node in our Jackson network

is 50%, then the bandwidth available will be reduced to

102.5 Kbytes/sec * 50% = 51.25 Kbytes/sec

If the download speed of the client is for example 7.5 Mbps,

that means that the file can be retrieved in no more than

30 / 51.25 = 0.585 second = 585 milliseconds.

But if we assume for example that a typical web visitor

to my web site is spending 585 milliseconds retrieving the

page and 60 seconds reading the HTML page THEN each client

on average only requires:

0.585 / (0.585 + 60) = 9.6% of the bandwidth that is allocated to

it..

but in practice i think we have to consider the worst case

scenarios..

So, since we can support a constraint of no more than 5 seconds

in the T (average response time), that means:

585 ms * X = 5000 milliseconds => X = 8.54

Hence, in worst case , my network node can not support more

than 8.54 users in the same time...

So, on *average* my network node can support:

(8.54 * 86400) / 5 = 147571 users per day

I wrote before that:

--

And of course on Capacity Planning for Enterprise Datacenters

and Websites , you can mirror many computer servers and load

balance between them with a software... to make the system much

FASTER, and this will be modeled as a Jackson network like this:

(1)

A -> M/M/n Server Queue -> M/M/1 Network queue ->

-> M/M/1 Client queue -> A

A: the arrival rate to the system"

--

We know also from an operational law of queuing theory that:

(2) the rate of job leaving any stable node must equal

its arrival rate.

(2) is from the Equivalence property.

Equivalence property: Assume that a service facility with

s servers and infinite queue has a Poisson input with

parameter lambda and the same exponential service-time

distribution with parameter Mu for each server (the M/M/s model),

where s * Mu > lambda. Then the steady-state output of

this service facility is also poisson process with parameter

lambda.

We know for an M/M/c queue that:

Note: try to add servers with almost the same hardware

configuration...

C(c, U) = Erlang formula = P(c) / (1 - Utilization)

note: c the number of servers..

P(c): means the probability that all the servers are busy

P(0): means the probability that there is no waiting time in the

queue, that means also: AT LEAST one server among the C servers

are not busy...

The average waiting time in the 'queue' =

C(c,U) / (service rate x c x (1 - Utilization)) (3)

It's approximatly equal to:

Utilization^C/(service rate x (1 - Utilization^C)

Note: ^ means power

This approximation is exact for the M/M/1 and M/M/2 models,

but 'slightly' lower than the result in (3) if c > 2

and

Utilization = Density of circulation / C (number of servers)

Note: ^ means power()

and C means the number of servers

Response time = The average waiting time in the 'queue' +

(1 / service rate)

average numbers of users in the system = service rate x response time

average number of users in queue = service rate x average waiting

time

in the 'queue'

Now as i said before:

--

So the equation for

Ni: number of jobs in each queue

Ui: utilization of each queue

Ni = Ui / (1-Ui)

Adding all the Ni in each individual queue will give the

average number of jobs in the entire queuing network.

After that we apply the Little formula:

A: network arrival rate

T: average response time

N = A*T => T = N / A

And after that from the mathematical analytical equation

we can simulate the jackson queuing network that model our

webservers...

---

If we try to calculate the Ni = Ui / (1-Ui) in

the Jackson network, and from the operational law (2) above

this will give us:

Ns for the M/M/n Server queue is:

(DC / n) / (1 - (DC/n))

and DC = Ss / A => Ns = ((Ss/A)/n) / (1 -((Ss/A)/n))

Ss: service rate at the queuing server.

A: Arrival rate to the jackson network

DC: is the Density of circulation

n: number of servers

And Nn for the M/M/1 Network queue is:

and Utilization in the M/M/1 network queue = Sn / A

this imply that => Nn = (Sn/A) / (1 -(Sn/A))

Nn: number of jobs in the M/M/1 network queue node

Un: Utilization in the network queue node

Sn: service rate at the queuiNg server.

A: Arrival rate to the jackson network

And Nc for the M/M/1 Client queue node is:

and Uc= Sc / (F/5)

this imply that => Nc = (Sc/(F/5)) / (1 -(Sc/(F/5)))

Note: F/5, if for example the rate is one file every 5 seconds.

Nc: number of jobs in the M/M/1 client queue node

Uc: Utilization in the M/M/1 client queue node

Sc: service rate at the queuiNg server.

A: Arrival rate to the Jackson network

F: Average file size

Adding all the Ni in each individual queue will

give the average number of jobs in the entire

queuing network that is equal to:

Ni = Nn + Ns + Nc

= (((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))

+ (Sc/(F/5)) / (1 -(Sc/(F/5)))

After that we apply the Little formula:

A: network arrival rate

T: average response time

N = A*T => T = N / A

this imply that the T(the average response time in the Jackson

network)

is:

Ni = Nn + Ns + Nc

= (((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))

+ (Sc/(F/5)) / (1 -(Sc/(F/5)))

Now finally we have our mathematical analytic equation

and we can begin to simulate our Enteprise webserver

and also validate the performance data with the fwptt stress

webserver simulation tool...

And don't forget what i have said about USL:

"Suppose we have obtained a small number of measured

load-points with Loadrunner or others tools, and we

calculated the USL equation to predict scalability of

a webserver , how the USL model can predict if the

scalability/performance is limited by the network bandwidth

and not the server ? I think USL can not predict this."

Also i wrote about USL that:

"As you have noticed , this mathematical model of

this jackson network does in fact take into account

the M/M/1 Network queue node , the USL model can not

do this... and with this performance data from the mathematical

analytical model simulation we can for example validate

the performance data of the fwptt stress webserver simulation.."

If you have noticed i have tried to show also that one

of the principal objectives is to control and manage

the *Quality* of the system or process, i wrote in my

post the following:

---

this imply that the T(the average response time in the Jackson

network)

is:

Ni = Nn + Ns + Nc

= (((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))

+ (Sc/(F/5)) / (1 -(Sc/(F/5)))

Now finally we have our mathematical analytic equation

and we can begin to simulate our Enteprise webserver

and also validate the performance data with the fwptt stress

webserver simulation tool...

---

So, as you have noticed you can simulate our enterprise

webserver with this mathematical analytical equation and

validate the data collected by fwptt stress webserrer tool

and try to keep T( the average response time of your HTTP requests)

bellow or equal to 5 seconds, and see also if your webserver

is scaling, and we call this also Quality control using OR

(operational research).

In quality management, quality can also be maintained and

improved through the detection and reduction of process

variation. Where statistical techniques as ANOVA and others

can be used..

The ANOVA method , for example, analyses the variation in

the data set to discover the amount of variation that can

be attributed to each factor.. etc.

There is still an important thing..

As you have noticed, i have finally arrived to the following

mathematical model of our Jackson network that model an Enterprise

website:

= (((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))

+ (Sc/(F/5)) / (1 -(Sc/(F/5)))

T: is the average response time..

So read it carefully an you will notice that this mathematical

model also can simulate and predict an intranet website behavior

or an internet website..

First look at the first factor (((Ss/A)/n) / (1 -((Ss/A)/n))

in our mathematical model, as you have noticed as

n (number of servers) grows this will make the denominator

grow rapidly and this will make (((Ss/A)/n) / (1 -((Ss/A)/n))

smaller, that's good for the response time...

Now let's look at the other factors in this mathematical

model:

IF the system is an Intranet, and for example the local

area network Sn is faster than the multiserver, that means:

We have Sn >> Ss, and that means => Ss /A >> Ss /A => that the

Knee in the server side can be reached more quickly than the

Knee in the network node of our Jackson network, so that

the bottleneck becomes the server node.

In the other hand if:

Ss >> Sn => Ss /A >> Sn / A => the Knee in the M/M/1 network

node can be reached more quickly, so that the network node in

the Jackson network becomes the bottleneck.

Availability:

The demand for high availability and reliability of computer and

software systems has led to a formal verification of such systems.

There are two principal approaches to formal verification:

model checking and theorem proving.

And i wrote about Petri Nets and how to model parallel

programs and how to exploit verification tools as Tina to verify

the properties such as liveleness of the system.

Please take a look at my other article about Petri Nets:

http://pages.videotron.com/aminer/PetriNet/formal.htm

And by using Petri Net and Tina you can make your system

more reliable , and if it's more reliable then the system

will have higher availability.

There are important factors to take into consideration when

building systems, factors such us availability and

efficiency(scalability etc....).

System availability is also calculated by modeling the system

as an interconnection of parts in series and parallel.

The availability A of a system is calculated like this

A = MTBF / (MTBF + MTTR)

MTBF: Mean time between failure

MTTR :Mean time to repair

and the MTTR is composed of:

time to respond to the failure + time to find

the problem + time to correct the problem + time

to verify that the system is working corretly.

Now if the interconnected parts are parallel , the equation

that we use to calculate the availability is:

A = 1- (U1 * U2 * U3 * ... * Un) (1)

Ui: is the non-availability of the component part of the system...

The availability of a system composed of interconnected parts in

series is:

A = A1 * A2 * ... * An (2)

Now as you have noticed i have modeled an enterprise website

as a Jackson network, and i have said before that:

--

And of course on Capacity Planning for Enterprise Datacenters

and Websites , you can mirror many computer servers and load

balance between them with a software... to make the system much

FASTER, and this will be modeled as a jackson network like this:

A -> M/M/n Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A

A: the arrival rate to the system

--

As you have noticed i wrote "FASTER" but i didn't spook

about the AVAILABILITY, you will notice that you can not just

make your system more EFFICIENT and FASTER , but also

you can higher the AVAILABILITY of your system by mirroring

many servers...

I have also wrote before that:

---

"If you have noticed i have tried to show also that one

of the principal objectives is to control and manage

the *Quality* of the system or process, i wrote in my

previous post the following:

this imply that the T(the average response time in the Jackson

network)

is:

T = Ni /A = (Nn + Ns + Nc) /A

= (((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))

+ (Sc/A) / (1 -(Sn/A))) / A

Now finally we have our mathematical analytic equation

and we can begin to simulate our Enteprise webserver

and also validate the performance data with the fwptt stress

webserver simulation tool...

So, as you have noticed you can simulate our enterprise

webserver with this mathematical analytical equation and

validate the data collected by fwptt stress webserrer tool

and try to keep T( the average response time of your HTTP requests)

bellow or equal to 5 seconds, and see also if your webserver

is scaling, and we call this also Quality control using OR

(operational research).

In quality management, quality can also be maintained and

improved through the detection and reduction of process

variation. Where statistical techniques as ANOVA and others

can be used.. "

---

We have many mathematical tools that we can use in Quality control,

example: Correlation, Regression , ANOVA etc.

ANOVA is a powerfull and important mathematic tool in

Quality control , by calculating the p-value in Excel or other

statistical tools , this will give us an important information like

if one or more variables have a significant effect on other

variables. If for example the p-value <= 0.05 (or if F > critical F value)

this imply that the null hypothesis is rejected , and the factor (or factors)

has in fact an effect..

Regards,

Amine Moulay Ramdane.