Using Health Explorer to identify and resolve problems

You can use the Health Explorer to identify and resolve error states that show up when monitoring IBM® systems and hardware components. For a quick checkup, look at one of the following views: Active Alerts, Windows® Computer on IBM System x™ and BladeCenter® x86 Blade Systems, or All IBM System x and BladeCenter x86 Blade Systems. These views show any existing alerts on your IBM hardware.

You can use the Health Explorer to view, learn about, and take action on alerts, state changes, and other issues raised by a monitored object. The Health Explorer helps you troubleshoot your alerts.

Suppose that you see a critical error when you are monitoring your systems and hardware components, such as the one shown in the following graphic.
Figure 1. An example of a critical error showing up in a managed system

Use the following procedure to identify and resolve the error.

  1. To open the Health Explorer when there is a Warning or a Critical alert, click All IBM System x and BladeCenter x86 Blade Systems; then double-click the state. By default, the Health Explorer opens with all failed monitors in expanded view.
    The following graphic shows how such an error might be displayed in the Health Explorer:
    Figure 2. Example of hardware components causing a system to be in the error state

    If there is no warning or critical alert, highlight an IBM system in the All IBM System x and BladeCenter x86 Blade Systems view; then right-click it to show its context menu. Click Open; then click Health Explorer for system_name.

  2. Use the Health Explorer to identify the basal-level health monitor that is indicating an error, that is, the lowest failed monitor in the expanded tree. The indication should refer to a particular component instance.

    In this case, the cause of the error is a faulty fan.

  3. Click State Change Events in the right-hand pane for details about the latest state change event.

    You can see the date and the time that the fan went into the error state. You can also read details about the nature of the error.
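
The drill-down in step 2 can be pictured as a walk over the monitor tree: start at the system, follow only the unhealthy branches, and stop at a failed monitor that has no failed children. The following is a minimal sketch of that idea, not code from the product. The Monitor class, the basal_failures function, and the monitor names are hypothetical and exist only for illustration; they do not correspond to any Operations Manager or IBM API.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical model of a health monitor tree (illustrative only;
# not part of any Operations Manager or IBM API).
@dataclass
class Monitor:
    name: str
    state: str = "Healthy"  # assumed values: "Healthy", "Warning", "Critical"
    children: List["Monitor"] = field(default_factory=list)


def basal_failures(monitor: Monitor) -> List[Monitor]:
    """Return the lowest-level monitors that are not healthy."""
    if monitor.state == "Healthy":
        return []
    # Collect failures found further down the unhealthy branches.
    failing = [m for child in monitor.children for m in basal_failures(child)]
    # A failed monitor with no failed children is itself the basal-level one.
    return failing or [monitor]


# Made-up example tree that mirrors the faulty-fan scenario in step 2.
system = Monitor("Entity Health", "Critical", [
    Monitor("Availability", "Healthy"),
    Monitor("Hardware Components", "Critical", [
        Monitor("Fan 1", "Critical"),
        Monitor("Fan 2", "Healthy"),
    ]),
])

print([m.name for m in basal_failures(system)])  # prints ['Fan 1']
```

In this made-up tree, the system and its hardware group are both in the Critical state, but only the fan is reported, because it is the lowest monitor on the failed branch. This matches what you look for by eye in the expanded Health Explorer view.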

Go to Using knowledge pages to resolve problems to learn how to use the knowledge pages to get help for resolving an error state and to learn about hardware components.