... as you may have guessed is systems integration (or rather, lack thereof). If we take a data center environment as an example, the processors inside servers carry special instructions and registers to assist in monitoring, logging and controlling their internal state.
Likewise, most present-day servers have sensors and even a special processor separate from the main CPU, the baseboard management controller (BMC), which can control almost any function in a server in response to a contingency, even turning the machine off or on as necessary.
Moving up the stack, almost any enterprise application or database program comes with extensive self-diagnostic or management capabilities, such as [http://www.opengroup.org/public/member/proceedings/q106/23PL.htm] Oracle Enterprise Manager.
With today's technology, a CPU will slow down defensively if it starts overheating. However, if we have 42 2U servers in a cabinet, it is very difficult to implement a policy that says that the collective power consumption in that cabinet shall not exceed 4.5 kW and expect the servers to self-manage to that limit.
An even harder capability to implement is to vary the set point as a function of the ambient temperature, for instance, to lower the consumption limit to 3.5 kW if the ambient temperature goes over 75 degrees Fahrenheit.
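To make the policy concrete, here is a minimal Python sketch of what a cabinet-level coordinator would have to compute. It is only an illustration under stated assumptions: the `Server` record, its `demand_w` and `floor_w` fields, and the proportional sharing rule are all hypothetical, and the sketch says nothing about how the caps would actually be delivered to each server. Only the 4.5 kW and 3.5 kW limits and the 75 degree threshold come from the text above.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    demand_w: float  # hypothetical: power the server would draw if uncapped
    floor_w: float   # hypothetical: minimum power needed to stay operational


def cabinet_limit_w(ambient_f: float) -> float:
    """Set point from the text: 4.5 kW normally, 3.5 kW above 75 degrees F."""
    return 3500.0 if ambient_f > 75.0 else 4500.0


def allocate_caps(servers, ambient_f):
    """Return a per-server power cap in watts that keeps the cabinet within its limit."""
    limit = cabinet_limit_w(ambient_f)
    total_demand = sum(s.demand_w for s in servers)
    if total_demand <= limit:
        # Nothing to do: every server may draw what it asks for.
        return {s.name: s.demand_w for s in servers}

    total_floor = sum(s.floor_w for s in servers)
    if total_floor > limit:
        raise ValueError("limit cannot be met without powering servers off")

    # Assumed sharing rule: give each server its floor, then split the
    # remaining headroom in proportion to its demand above the floor.
    headroom = limit - total_floor
    excess = total_demand - total_floor
    return {
        s.name: s.floor_w + headroom * (s.demand_w - s.floor_w) / excess
        for s in servers
    }


if __name__ == "__main__":
    cabinet = [Server(f"srv{i:02d}", demand_w=150.0, floor_w=80.0) for i in range(42)]
    for ambient in (70.0, 80.0):
        caps = allocate_caps(cabinet, ambient)
        print(ambient, round(sum(caps.values())), round(caps["srv00"], 1))
```

The arithmetic is trivial. The hard part, and the point of the paragraph above, is that something must know the demand of all 42 servers at once and be able to impose the resulting caps on each of them.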
With today's technology it is possible to design fairly "intelligent" devices that will exhibit appropriate individual responses to environmental stimuli. It is still difficult to make collections of devices behave in a coordinated manner.
There are at least two issues that need to be addressed to manage collections: aggregation and semantics. The dynamic here is similar to the way large organizations work.
Aggregation refers to the information retained from one level to the next. Some information needs to be culled to keep complexity in check. For instance, the information used to manage a regional data center should say very little about the individual processors in the constituent servers, or the operator would be overwhelmed by having to deal with 10,000 servers at the same time in a crisis.
The semantic problem is that of retaining the correct mix of information, i.e., the information that is meaningful at the next level up. Deciding what information to cull is hard to do in an automated way. This problem is often referred to as the "semantic gap."
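As a rough illustration of aggregation, the Python sketch below rolls per-server readings up into one small summary per rack. Everything in it is an assumption made for the example: the `ServerReading` fields, the function names, and above all the choice of which figures survive the roll-up. That choice is made by hand here, which is exactly the part the semantic gap makes hard to automate.

```python
from dataclasses import dataclass


@dataclass
class ServerReading:
    # Hypothetical per-server telemetry; a real server exposes far more.
    rack: str
    cpu_temp_c: float
    power_w: float
    healthy: bool


def summarize_rack(readings):
    """Collapse per-server readings into the few figures passed up one level."""
    return {
        "servers": len(readings),
        "unhealthy": sum(1 for r in readings if not r.healthy),
        "total_power_w": sum(r.power_w for r in readings),
        "max_cpu_temp_c": max(r.cpu_temp_c for r in readings),
    }


def summarize_data_center(readings):
    """Group readings by rack and keep only the per-rack summaries."""
    by_rack = {}
    for r in readings:
        by_rack.setdefault(r.rack, []).append(r)
    return {rack: summarize_rack(rs) for rack, rs in by_rack.items()}


if __name__ == "__main__":
    readings = [
        ServerReading("rack-a", 61.0, 210.0, True),
        ServerReading("rack-a", 78.5, 240.0, False),
        ServerReading("rack-b", 55.0, 190.0, True),
    ]
    for rack, summary in summarize_data_center(readings).items():
        print(rack, summary)
```

An operator looking at the output sees one line per rack instead of one per server. Whether maximum CPU temperature is the right figure to keep, rather than, say, fan speed or memory errors, depends on what the next level up needs to decide.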