Intel® Active Management Technology Use Case #5: Remote Diagnosis, Local Repair (Heal)

Intel® Active Management Technology (Intel® AMT) can help to reduce the support overhead associated with repairing system-boot failures, even when the issues that underlie those failures cannot be repaired remotely. By enabling problem diagnosis on a down-the-wire basis, Intel® AMT platforms can reduce the need for time-consuming technician visits to diagnose the platform, which otherwise increase user downtime and consume IT resources.

In this use-case example, end-user platforms cannot boot due to hardware issues such as hard drive corruption and memory errors.

Conventional Limitations to Remote Diagnosis and Local Repair

In the typical scenario where an end-user's system will not boot, the user calls the help desk for assistance, and the help-desk technician attempts to diagnose the problem. Because the system will not boot, however, the help desk is typically unable to resolve the issue. After the issue is escalated to a support technician who goes to the end-user's work location, the technician diagnoses the issue as hardware-related and identifies a field-replaceable unit (FRU) that needs to be replaced to repair the problem. The technician must then obtain the correct part from inventory and return to the end-user's work location to repair the platform.

In this conventional scenario, two or more desk-side visits are required to repair the system, which reduces user productivity and ties up IT resources.

Using Intel® AMT to Overcome Limitations

In the corresponding scenario in an environment where Intel® AMT is in use, the management console operated by the support organization may receive an event from the user's machine indicating inoperable or malfunctioning hardware. Policies configured on the console evaluate the event to determine whether an alert to the help desk is needed. The user may also contact the help desk directly.
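As a rough illustration of the policy evaluation described above, the following Python sketch decides whether an incoming event should raise a help-desk alert. The event fields, severity scale, and component names are hypothetical; they are not taken from the Intel® AMT SDK or from any particular management console.

```python
from dataclasses import dataclass

# Hypothetical severity scale; a real console defines its own.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


@dataclass(frozen=True)
class PlatformEvent:
    """Illustrative event record; the field names are assumptions."""
    host: str
    component: str  # e.g. "memory", "hdd"
    severity: str   # one of the keys in SEVERITY_ORDER


@dataclass(frozen=True)
class AlertPolicy:
    """Alert when a watched component reports at or above a threshold."""
    watched_components: frozenset
    min_severity: str = "warning"

    def should_alert(self, event: PlatformEvent) -> bool:
        if event.component not in self.watched_components:
            return False
        return SEVERITY_ORDER[event.severity] >= SEVERITY_ORDER[self.min_severity]


policy = AlertPolicy(watched_components=frozenset({"memory", "hdd"}))
event = PlatformEvent(host="client-042", component="hdd", severity="critical")
print(policy.should_alert(event))  # True
```

Keeping the policy separate from the event record lets the support organization change thresholds without touching whatever code receives the events.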

The help desk diagnoses the problem down-the-wire using Intel® AMT's Serial-over-LAN (SoL) or KVM along with IDE-R remote boot capability and third-party diagnostics. While the help desk is unable to repair the system remotely, it is able to remotely identify the correct FRU to perform the repair, so that the field technician has the part with them when they are first dispatched to the end-user's location, and they are able to perform the repair at desk-side on their first visit.

In this Intel® AMT-enhanced scenario, only one desk-side visit is required to repair the system, eliminating at least one visit compared with the conventional scenario.

Key Functionality Enabled by Intel® AMT that Underlies this Use Case

The following table summarizes the features and functionality utilized in this use case that are provided by Intel® AMT or enabled by Intel® AMT in third-party software:

Feature | Functionality
Out-of-band (OOB) access | Platform is diagnosed and/or repaired in a crashed state via OOB access to Intel® AMT, KVM or SoL/IDE-R, and third-party diagnostics
Remote field-replaceable unit (FRU) inventory | FRU inventory list in firmware is used to identify the platform's FRU makes and models
Remote troubleshooting and recovery | Third-party management application's capabilities are used down-the-wire to diagnose the crashed platform
Alerting | Event may be generated by Intel® AMT (depending on OEM implementations) and sent to the third-party management console to notify the help desk†
Intel® AMT flash | Allows BIOS to store/update hardware list in dedicated flash memory; technicians remotely access this list to identify what hardware make/model to bring to the platform
Tamper-resistant agent | Allows for access to the platform and its inventory information, with little risk of agent tampering by a user


† The event is logged locally in a standard format in the NVStore and is available to third-party management applications, which determine whether the event should raise an alert on their consoles.

The Advantage of Intel® AMT

Intel® AMT enables support organizations to reduce technician desk-side visits by remotely diagnosing the issue and determining failed FRU make and model information out-of-band. Thus, fewer troubleshooting hours are required, and user downtime is reduced.

Business Value of the Intel® AMT Solution

This use case enables IT organizations to save on support and productivity costs:

  • Savings from Reducing Desk-side Visits: Intel® AMT reduces the need for desk-side visits for help-desk calls related to FRU failures. These trouble tickets typically require two desk-side visits to address using conventional means, and Intel® AMT potentially eliminates one of these desk-side visits.
  • Savings in End-user Productivity: By improving average time to repair, organizations can realize savings in terms of avoided end-user downtime.
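The two savings categories above reduce to simple arithmetic. The Python sketch below shows one way to estimate them; the function names and every input value are placeholders chosen to make the calculation easy to follow, not figures from this use case or from any study.

```python
def annual_visit_savings(fru_tickets_per_year: int,
                         visits_avoided_per_ticket: float,
                         cost_per_visit: float) -> float:
    """Support cost avoided by eliminating desk-side visits."""
    return fru_tickets_per_year * visits_avoided_per_ticket * cost_per_visit


def annual_downtime_savings(fru_tickets_per_year: int,
                            hours_saved_per_ticket: float,
                            user_cost_per_hour: float) -> float:
    """Productivity cost avoided by shortening time to repair."""
    return fru_tickets_per_year * hours_saved_per_ticket * user_cost_per_hour


# Placeholder inputs only; substitute your organization's own numbers.
tickets = 100
total = (annual_visit_savings(tickets, 1, 50.0)
         + annual_downtime_savings(tickets, 2.0, 30.0))
print(total)  # 100*1*50 + 100*2*30 = 11000.0
```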

Non-KVM High Level Flow
[Figure: non-KVM high-level flow diagram]

KVM High Level Flow
[Figure: KVM high-level flow diagram]

Remote Diagnosis, Local Repair Usage Case Implementation

A typical remote diagnosis, local repair scenario may consist of using IDE-R (IDE Redirect) to boot a client that has a corrupt operating system or a hardware failure. Implementation of this Use Case depends on the following preconditions:

  1. AMT clients are provisioned.
  2. All clients are connected to the network.
  3. AMT-enabled clients are powered and in one of the following states: S5, S4, S3, S1, S0.
  4. An AMT-enabled Management Console is running on the network.
  5. All systems have been discovered (see Use Case 1).
  6. For KVM, the AMT client is version 6.x and the system is utilizing integrated graphics.
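
A management application could verify the client-side preconditions before attempting a session. The Python sketch below is one hedged way to do that; the `ClientStatus` record and its field names are hypothetical. Precondition 4 (a running management console) is not a client property and is left out. The text names AMT 6.x for KVM; the sketch assumes later versions qualify as well.

```python
from dataclasses import dataclass

# Power states the preconditions list as acceptable.
SUPPORTED_POWER_STATES = {"S0", "S1", "S3", "S4", "S5"}


@dataclass(frozen=True)
class ClientStatus:
    """Hypothetical snapshot of what the console knows about a client."""
    provisioned: bool
    on_network: bool
    power_state: str
    discovered: bool
    amt_major_version: int
    integrated_graphics: bool


def unmet_preconditions(client: ClientStatus, want_kvm: bool = False) -> list:
    """Return a description of each precondition the client fails."""
    problems = []
    if not client.provisioned:
        problems.append("AMT client is not provisioned")
    if not client.on_network:
        problems.append("client is not connected to the network")
    if client.power_state not in SUPPORTED_POWER_STATES:
        problems.append(f"unsupported power state: {client.power_state}")
    if not client.discovered:
        problems.append("client has not been discovered")
    if want_kvm:
        # Assumption: 6.x or later; the text only names 6.x.
        if client.amt_major_version < 6:
            problems.append("KVM needs AMT 6.x")
        if not client.integrated_graphics:
            problems.append("KVM needs integrated graphics")
    return problems


client = ClientStatus(True, True, "S5", True, 5, True)
print(unmet_preconditions(client))                 # []
print(unmet_preconditions(client, want_kvm=True))  # ['KVM needs AMT 6.x']
```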

To implement the Remote Diagnosis, Local Repair Usage Case, the following actions would be taken:

Step | Workflow (basic course of events for replacing/removing missing/added hardware components)
1 | Knowledge worker attempts to boot the system.
2 | During POST, the BIOS stops the boot and reports that it cannot find a particular piece of hardware (e.g., memory, CD-ROM, HDD).
3 | Knowledge worker calls IT and reports the problem with the system.
4 | IT professional gathers context and background information about the issue from the knowledge worker.
5 | IT professional uses SOL or KVM and IDE-R to remotely boot the system to a diagnostic OS.
6 | IT professional uses diagnostic tools on the remote diagnostic OS to identify the cause of the issue.
7 | IT professional identifies the failed FRU and dispatches a technician to the system to replace the hardware with a like model.
8 | System boots back into the user's OS, and the knowledge worker can use the system as well as before the initial problem.

Alternate Path 1 - OS Boots with Faulty Hardware:
2 | Once the system is in the OS, the knowledge worker notices that a piece of hardware (sound, CD-ROM, HDD, floppy, etc.) is no longer accessible. Continue with steps 3-8 from the Basic Course of Events.

Alternate Path 2 - Agent Detects Missing Device:
2 | Once the system is in the OS, a software agent detects that a particular device is no longer available and sends an alert to the IT console. Continue with steps 3-8 from the Basic Course of Events.

Alternate Path 3 - Device Reports Impending Failure:
2 | During POST, a S.M.A.R.T. device detects that it is having an issue and that a failure is impending.
3 | An alert is sent to the IT console.
4 | IT professional receives the alert and contacts the knowledge worker to inform them of the issue and to schedule a time for diagnostics and repair. Continue with steps 5-8 from the Basic Course of Events.
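
The way each alternate path replaces the early steps and then rejoins the basic course of events can be sketched in Python as follows. The step texts are abbreviated from the table above; the dictionary keys and the `steps_for` helper are hypothetical names introduced only for this illustration.

```python
# Basic course of events, abbreviated (steps 1 through 8).
BASIC = [
    "User attempts to boot the system",
    "BIOS halts during POST and reports missing hardware",
    "User calls IT and reports the problem",
    "IT gathers context from the user",
    "IT remote-boots a diagnostic OS via SOL or KVM and IDE-R",
    "IT runs diagnostic tools to find the cause",
    "IT identifies the failed FRU and dispatches a technician",
    "System boots the user's OS again",
]

# Each alternate path supplies replacement steps starting at step 2,
# plus the 1-based basic step number at which it rejoins.
ALTERNATES = {
    "os_boots_with_faulty_hardware": (
        ["User notices in the OS that a device is unavailable"], 3),
    "agent_detects_missing_device": (
        ["Software agent detects a missing device and alerts the IT console"], 3),
    "device_reports_impending_failure": (
        ["S.M.A.R.T. device reports an impending failure during POST",
         "Alert is sent to the IT console",
         "IT contacts the user and schedules diagnostics and repair"], 5),
}


def steps_for(path=None):
    """Return the full step sequence for the basic or an alternate path."""
    if path is None:
        return list(BASIC)
    replacement, resume_at = ALTERNATES[path]
    # Step 1 is shared by every path; rejoin BASIC at step `resume_at`.
    return BASIC[:1] + replacement + BASIC[resume_at - 1:]


for number, text in enumerate(steps_for("device_reports_impending_failure"), 1):
    print(number, text)
```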

The following table lists the relevant Software Development Flows and Realms that would be applied with the Remote Diagnosis/Local Repair Usage Case. Note that this Usage Case is primarily based on implementing a SOL/IDER session, which uses a protocol different from SOAP and WSMan. Implementing a SOL/IDER session requires not only calling the necessary APIs referenced in the Redirection Design Library Guide, but also setting up the Remote Power interface to direct the boot path appropriately.

No. | Relevant Software Development Flows | WSMan Interface Realm
1 | Redirection Administration Flow (AMT 3.0, 3.2, 4.0, 5.0, 5.1) | Redirection Administration Realm
2 | Remote Power Control Flow | Remote Control
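
To show how the two flows fit together, the Python sketch below plans a redirected diagnostic boot: it enables redirection and then picks a power action so that the client actually boots from the redirected media. All names here (`RedirectedBootPlan`, `plan_diagnostic_boot`, the field names, and the action strings) are hypothetical and do not correspond to Intel® AMT SDK identifiers. Treating every non-S0 state as needing a power-up is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedirectedBootPlan:
    """Hypothetical summary of what the console would request."""
    use_sol: bool          # open a Serial-over-LAN text console
    use_ider: bool         # boot from redirected media
    ider_boot_device: str  # "cd" or "floppy" image
    power_action: str      # "power_up" or "reset"


def plan_diagnostic_boot(power_state: str,
                         text_console: bool = True,
                         image_kind: str = "cd") -> RedirectedBootPlan:
    """Choose redirection settings and the power action that applies them."""
    if image_kind not in ("cd", "floppy"):
        raise ValueError(f"unsupported IDE-R image kind: {image_kind}")
    # A running client (S0) must be reset to pick up the new boot path;
    # this sketch simply powers up a client in any other state.
    power_action = "reset" if power_state == "S0" else "power_up"
    return RedirectedBootPlan(
        use_sol=text_console,
        use_ider=True,
        ider_boot_device=image_kind,
        power_action=power_action,
    )


print(plan_diagnostic_boot("S5"))
print(plan_diagnostic_boot("S0", image_kind="floppy"))
```

The point of the sketch is the ordering: redirection settings are decided first, and the power action is what makes them take effect on the next boot.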

The following assumptions underlie the analysis in this use case:

  1. The third-party remote-management application in this use case supports Intel® AMT.
  2. Alert Standard Format (ASF) for client platforms is not implemented, because it does not have application support.
  3. The failed FRU is a platform FRU, not the chipset or motherboard.
  4. All research data is gathered from global, US-based IT organizations.
  5. Platforms being managed using Intel® AMT are connected to a power source (Desktop mode) or to an AC or DC power source (Mobile mode); the platform does not have to be powered on. Notebooks with Intel® AMT 2.6 or older must be in the S0 state (powered on).
  6. Platforms are connected through a working Ethernet connection to the corporate LAN (Desktop mode) or wirelessly connected to the corporate network (remote mode) and not over VPN for out-of-band (OOB) access.
  7. For Intel® AMT 2.6 and below, this analysis assumes a mostly wired environment or one where laptops are often wired. Notebooks that are AMT 4.0 and above have wireless OOB access in Sx states.
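
Assumptions 5 through 7 amount to a set of rules about when a platform can be reached out-of-band. The Python sketch below encodes them in a hedged way; the function and parameter names are hypothetical, and combinations the assumptions do not mention are treated as reachable, which may be optimistic for some AMT versions.

```python
def oob_reachable(amt_version: tuple,
                  is_notebook: bool,
                  power_state: str,
                  link: str,
                  has_power_source: bool,
                  over_vpn: bool) -> bool:
    """Apply the stated assumptions; `link` is "wired" or "wireless"."""
    if link not in ("wired", "wireless"):
        raise ValueError(f"unknown link type: {link}")
    if not has_power_source:
        return False  # assumption 5: a power source is required
    if over_vpn:
        return False  # assumption 6: no OOB access over VPN
    if is_notebook and amt_version <= (2, 6) and power_state != "S0":
        return False  # assumption 5: older notebooks must be in S0
    if link == "wireless" and power_state != "S0":
        # assumption 7: wireless OOB in Sx states needs AMT 4.0 or above
        return amt_version >= (4, 0)
    return True


print(oob_reachable((2, 6), True, "S3", "wired", True, False))     # False
print(oob_reachable((4, 0), True, "S5", "wireless", True, False))  # True
```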

The following Intel® AMT SDK resources provide examples of the components involved in implementing the Remote Diagnosis/Local Repair Usage Case.

  • KVM (Sample Source Code)
  • AMTRedirection (Sample Source Code)
  • RemoteControl (Sample Source Code)
  • IMRGUI (Utility exe)

Additional information on the features associated with this Use Case can be found in the Intel® AMT SDK HTML-based documentation. Download and install the SDK, then open the file default.htm found under ...\ DOCS\Implementation and Reference Guide\. Under the "Contents" tab, select "Intel® AMT Features".
