This is a living document that will expand as needs and requests dictate. Its purpose is to establish a baseline understanding of the terms, concepts, and capabilities used in Platform Monitoring.
Performance Monitoring Terminology
Utilization
- The percentage of time that a device is busy servicing requests
- The remaining percentage of time is idle time
Efficiency
- The fraction of busy time that is actually spent doing useful work
- A system may incur a lot of overhead and other inefficiencies while servicing requests
- A high utilization therefore does not necessarily imply good efficiency
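To make the distinction concrete, here is a minimal sketch in Python, using entirely made-up busy-time and useful-work figures for a hypothetical device:

```python
# Hypothetical one-second observation window for a disk device.
interval_s = 1.0
busy_s = 0.9      # time the device was busy servicing requests
useful_s = 0.45   # busy time spent on useful work (rest was overhead, e.g. seeks)

utilization = busy_s / interval_s   # fraction of the interval the device was busy
efficiency = useful_s / busy_s      # fraction of busy time doing useful work

print(f"Utilization: {utilization:.0%}")  # high utilization...
print(f"Efficiency:  {efficiency:.0%}")   # ...but only half of it is useful work
```

The device is 90% utilized yet only 50% efficient, illustrating the last bullet above.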
Latency (may also be referred to as Response Time)
- Total time required to complete an action
- Can be viewed as a cumulative sum of several latencies of subtasks
Throughput
- The amount of work that a system can complete per unit of time
- There is usually an upper bound on how much work a system can complete per unit of time
- Observed bandwidth is one measure of throughput
Concurrency
- The number of work items (threads, executables, etc.) that can be in progress simultaneously
- Concurrency can be used to reduce effective latency and/or increase total observed throughput
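A small Python sketch of the latency-hiding effect, using `time.sleep` as a stand-in for a high-latency request (the 0.2 s delay and worker count are arbitrary):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(i):
    # Stand-in for a high-latency request (e.g., disk or network I/O).
    time.sleep(0.2)
    return i

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch, range(4)))
elapsed = time.perf_counter() - start

# Four 0.2 s requests overlap, so wall time is close to 0.2 s rather than
# 0.8 s: effective latency per item drops, and observed throughput rises.
print(f"{len(results)} requests in {elapsed:.2f} s")
```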
Bottleneck
- A point of serialization that exists when work must wait for other work to be finished
- The bottleneck is the slowest part of the system
- The bottleneck ultimately determines how much work a system can do per unit of time
Multi-processing
- The ability to execute multiple processes or programs on a single system
- On systems with multiple processors, multi-processing will improve throughput
Multi-threading
- The number of execution paths in a program that can execute simultaneously
- Systems that support multi-threading can improve individual application performance by overlapping high-latency requests with execution
Speedup
- The relative improvement in performance obtained through a performance enhancement
- Typically measured as a ratio between a baseline measurement and an optimized measurement
Scalability
- The ability to increase performance by increasing the number of resources (e.g., CPUs)
- Commonly used with SMP systems to indicate the ability to take advantage of the multiple CPUs
- Two possible metrics:
- On a fixed workload, the relative improvement in performance with increasing CPUs
- With a scalable workload, the ability to maintain performance as the workload and the number of CPUs increase together
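A sketch of the two metrics, assuming entirely made-up measurements (the run times and response times below are hypothetical, not from any real system):

```python
# Metric 1: fixed workload -- relative speedup as CPUs are added.
# Run time (seconds) to complete the same workload at each CPU count.
fixed_runtime_s = {1: 100.0, 2: 55.0, 4: 30.0}
for cpus, runtime in sorted(fixed_runtime_s.items()):
    speedup = fixed_runtime_s[1] / runtime
    print(f"{cpus} CPU(s): {speedup:.2f}x speedup")

# Metric 2: scalable workload -- can performance hold as the workload
# and the CPU count grow together? Mean response time (seconds) at
# N CPUs driving N times the baseline load.
scaled_response_s = {1: 0.20, 2: 0.21, 4: 0.23}
for cpus, resp in sorted(scaled_response_s.items()):
    print(f"{cpus} CPU(s) at {cpus}x load: {resp:.2f} s mean response time")
```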
Amdahl's Law
- The performance that can be gained by using a faster mode of execution is limited by the fraction of time that the faster mode cannot be used
- Amdahl's Law is used to determine the maximum potential speedup:

Overall Speedup = 1 / ((1 - Fraction Enhanced) + (Fraction Enhanced / Speedup of Fraction Enhanced))
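A worked example of the formula in Python (the 60% fraction and 3x speedup are arbitrary illustration values):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup per Amdahl's Law."""
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)

# Suppose 60% of execution time can be made 3x faster:
print(round(amdahl_speedup(0.6, 3.0), 2))   # 1 / (0.4 + 0.2) -> 1.67

# Even an infinite speedup of that 60% caps out at 1 / 0.4 = 2.5x,
# because the remaining 40% does not execute faster.
print(round(amdahl_speedup(0.6, float("inf")), 2))
```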
Little's Law
- For a system in equilibrium, there is a relationship between the number of tasks, the arrival rate, and the response time:

# of Tasks in System = Arrival Rate * Response Time

- # of Tasks is the number of tasks in the system, both waiting for service and being serviced
- Arrival Rate is the rate at which requests arrive at the system
- When the rate at which requests leave a system equals (on average) the rate at which they arrive, the system is said to be stable, at equilibrium, or at steady state
- Response Time is the mean time to complete a task. It includes the time requests spend waiting for service, as well as time spent receiving service
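A worked example of the relationship in Python (the request rate and response time below are made-up figures for a hypothetical server):

```python
def tasks_in_system(arrival_rate, response_time):
    # Little's Law: N = arrival rate * response time,
    # valid for a system in steady state.
    return arrival_rate * response_time

# Hypothetical web server: 50 requests/s arriving, 0.2 s mean response time.
print(tasks_in_system(50.0, 0.2))   # 10.0 requests in the system on average
```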