• 4.0.0
  • 04/10/2020
  • Public Content

Intel® Cluster Checker is extensible: functionality can be added and configured through extensions. This document uses a single end-to-end example to illustrate how to extend Intel® Cluster Checker. In the example, the fictional Waterfowl Industries has developed the Duck Diagnostic Tool. This program comprehensively evaluates nodes using its trade secret methodology and rates them on its patented quack scale. A node is rated between 1 and 5 quacks, with 5 being the best. If there is an error during the evaluation, the node prints honk.
Collect Extensions
Intel® Cluster Checker provides two methods of collecting data: pdsh and Intel® MPI Library. By default, the tool uses pdsh for data collection. For more information about using collect extensions, see the Data Collection chapter.
Data Providers
Intel® Cluster Checker uses data providers to collect data from the system. For more information about data providers, see the Data Providers chapter.
Analyzer Extensions
Intel® Cluster Checker analyzer extensions bridge the gap between the database and the knowledge base. Conceptually, an analyzer extension functions as follows:
  1. Read the data from the database.
  2. Transform the data. For example, extract the relevant information from the raw, unstructured database content via regular expressions.
  3. Create CLIPS instances using the transformed data.
Extensions, in the form of shared libraries, plug into the analyzer framework to perform these functions. Typically, there is one analyzer extension per data provider/CLIPS class, but this may not always be the case.
The interfaces described in this chapter are located in /opt/intel/clck/20.x.y/include/analyzer.
Analyzer Extension Format
Extensions are implemented through the Extension and Transform classes. Conceptually, the purpose of the Transform class is to read in data from a source, process the data, and then send the data to an output. The Extension class is specialized to use the Intel® Cluster Checker SQLite or ODBC database sources as the input and create CLIPS instances as the output.
Analyzer performs the following actions on each extension:
  1. Load the extension shared library using dlopen().
  2. Call the constructor of the extension.
  3. Run the data input method parse().
  4. Call the destructor of the extension.
  5. Unload the shared library using dlclose().
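The order of these five steps can be pictured with a small stand-alone sketch. It is only an illustration under stated assumptions: `MockExtension` and `run_extension_lifecycle()` are hypothetical names, no real shared library is loaded, and the `dlopen()`/`dlclose()` steps appear only as log entries.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an analyzer extension; the real base classes
// are declared in the Intel Cluster Checker headers and are not used here.
class MockExtension {
public:
    explicit MockExtension(std::vector<std::string>& log) : log_(log) {
        log_.push_back("constructor");
    }
    ~MockExtension() { log_.push_back("destructor"); }
    void parse() { log_.push_back("parse"); }

private:
    std::vector<std::string>& log_;
};

// Records the order of the five steps for one extension. Steps 1 and 5
// are represented by log entries only, because this sketch does not load
// a real shared library.
std::vector<std::string> run_extension_lifecycle() {
    std::vector<std::string> log;
    log.push_back("dlopen");      // step 1 (simulated)
    {
        MockExtension ext(log);   // step 2: constructor
        ext.parse();              // step 3: data input method
    }                             // step 4: destructor runs at end of scope
    log.push_back("dlclose");     // step 5 (simulated)
    return log;
}
```

Because the destructor runs before the library is unloaded, any resources the extension acquires in its constructor or in parse() can be released in the destructor.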
Transform Class Member Methods
Custom analyzer extensions require a custom parse function. All other functions in this section are provided within Intel® Cluster Checker for use within a custom parse function.
parse()
Pulls data from the input source (that is, the database), transforms it, and calls route(). parse() executes data manipulation and extrapolation specific to each extension. parse() is a virtual function that serves as a placeholder for parsing raw data from the database into a format that can then be routed. This is the key function to write when creating custom analyzer extensions.
set_header()
Defines the CLIPS slots to be populated. The order should match the order used in route().
    set_header({"node_id",
                "timestamp",
                "count",
                "sound"});
set_name()
Sets the internal name of the extension. This name should match the name of the shared library and is also used in the framework definition analyzer_extension tag to configure the extension to be loaded.
    set_name("duck");
route()
Sends the data to the output sink; that is, creates a CLIPS instance. The order should match the order used in set_header().
    route({rows[i].hostname,
           rows[i].timestamp,
           variable1,
           "quack"});
Transform Class Member Variables
void* clips_env
Pointer to the CLIPS knowledge base environment.
Extension Class Summary
The Extension class inherits from the Transform class and provides additional functionality and class variables. For example, it provides class member variables to store database data and functions to format parsed data for routing.
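The relationship between the two classes, and the ordering contract between set_header() and route(), can be sketched with mock classes. This is not the real API: `MockTransform`, `MockDuckExtension`, and `DuckRow` are hypothetical names, and the mock stores each routed instance in a map instead of creating a CLIPS instance. Calling set_header() in the constructor and rejecting a size mismatch in route() are choices of this sketch, not documented behavior.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> MockInstance;

// Hypothetical stand-in for the Transform class. Only the members described
// in this chapter are modeled; the real class may differ.
class MockTransform {
public:
    virtual ~MockTransform() {}
    virtual void parse() = 0;  // written by each custom extension

    void set_name(const std::string& name) { name_ = name; }
    void set_header(const std::vector<std::string>& slots) { header_ = slots; }

    // Pairs each value with the slot declared at the same position.
    void route(const std::vector<std::string>& values) {
        if (values.size() != header_.size()) {
            throw std::invalid_argument("route() does not match set_header()");
        }
        MockInstance instance;
        for (std::size_t i = 0; i < header_.size(); ++i) {
            instance[header_[i]] = values[i];
        }
        instances_.push_back(instance);
    }

    const std::string& name() const { return name_; }
    const std::vector<MockInstance>& instances() const { return instances_; }

private:
    std::string name_;
    std::vector<std::string> header_;
    std::vector<MockInstance> instances_;
};

// Hypothetical row type holding the database data for one node.
struct DuckRow {
    std::string hostname;
    std::string timestamp;
    std::string count;
};

// Plays the role of an Extension: stores the rows as a member variable
// and formats them for routing inside parse().
class MockDuckExtension : public MockTransform {
public:
    explicit MockDuckExtension(const std::vector<DuckRow>& rows) : rows_(rows) {
        set_name("duck");
        set_header({"node_id", "timestamp", "count", "sound"});
    }

    void parse() override {
        for (std::size_t i = 0; i < rows_.size(); ++i) {
            // Same order as the set_header() call above.
            route({rows_[i].hostname, rows_[i].timestamp, rows_[i].count, "quack"});
        }
    }

private:
    std::vector<DuckRow> rows_;
};
```

Keeping the set_header() and route() calls close together in the source makes it easier to see that the slot order and the value order agree.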
Custom Extensions for Framework Definitions
Framework Definitions accept native or custom extensions as long as they are specified as follows:
    <configuration>
      <framework_definition>
        <analyzer_extension>
          <group>
            <entry>all_to_all</entry>
            <entry>cpu</entry>
            <entry>duck</entry>
          </group>
        </analyzer_extension>
      </framework_definition>
    </configuration>
In the previous example, all_to_all and cpu are native extensions, while duck is a user-defined extension. All of the above extensions must be located in the same folder, because only one extension path can be specified per Framework Definition. If no path is specified, the default location, /opt/intel/clck/20.x.y/analyzer/intel64/cpp, is assumed.
Database Interface
The database base class is a general interface for reading data from the database and currently supports SQLite and ODBC. The database class also allows multiple database sources to be configured for analysis through the configuration file. Queries run over the configured data sources, and the data is selected from the first available database. The following wrapper methods are provided for accessing the database and are defined in /opt/intel/clck/20.x.y/include/datastore/datastore.h. These methods query the database view clck_1. If the database is provided by the user, the clck_1 view must be created manually (see the Database Schema section in the Reference for the database view).
    bool select_provider_data(std::vector<std::string> providers,
                              Rows& rows,
                              std::string where_clause,
                              bool mark_as_baseline);
The database rows resulting from the query are appended to the vector of rows provided by the caller in the second argument. When this function is called, an SQL query of the following form is constructed and executed once for each provider name in the providers vector. (Note: The argument mark_as_baseline in all the database methods is experimental and should not be used.)
    SELECT * FROM clck_1 WHERE provider=<provider_name> AND <where_clause>
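To make the per-provider loop concrete, the sketch below builds one query string for each provider in the form shown above. It illustrates the shape of the queries only and is not the library's implementation: `build_provider_queries()` is a hypothetical helper, the single-quoting of the provider name is an assumption, and so is dropping the AND clause when the where clause is empty.

```cpp
#include <string>
#include <vector>

// Builds one query string per provider, following the documented form.
// Illustrative only: how the library quotes, escapes, or binds values
// is not described in this chapter and may differ.
std::vector<std::string> build_provider_queries(
        const std::vector<std::string>& providers,
        const std::string& where_clause) {
    std::vector<std::string> queries;
    for (const std::string& provider : providers) {
        std::string query =
            "SELECT * FROM clck_1 WHERE provider='" + provider + "'";
        if (!where_clause.empty()) {
            query += " AND " + where_clause;
        }
        queries.push_back(query);
    }
    return queries;
}
```

For example, calling the helper with the providers duck and cpu and the where clause Hostname='node01' produces two queries, one per provider, each restricted to that host. Plain string concatenation is used here for readability; it is not safe for untrusted input.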
A more general select method is also available.
    bool select_data(const std::string query,
                     Rows& rows,
                     const std::map<std::string, int>& columns,
                     bool mark_as_baseline);
As before, the database rows resulting from the query are appended to the vector of rows provided by the caller in the second argument. The difference is that the first argument may be any valid SQL SELECT query. Since not all database columns may be returned by the query, the third argument is a map of column names and their order in the SELECT query. The wrapper method selects the data from the first available database.
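One way to keep the columns map consistent with the query is to generate both from the same list of column names. The helpers below are hypothetical and are not part of the library. Whether positions in the map are expected to start at 0 or 1 is not stated in this chapter; starting at 0 is an assumption of this sketch.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Maps each selected column name to its position in the SELECT list.
// Assumption: positions start at 0.
std::map<std::string, int> make_column_map(
        const std::vector<std::string>& select_columns) {
    std::map<std::string, int> columns;
    for (std::size_t i = 0; i < select_columns.size(); ++i) {
        columns[select_columns[i]] = static_cast<int>(i);
    }
    return columns;
}

// Builds the matching SELECT statement from the same column list, so the
// query and the map cannot drift apart.
std::string make_select_query(const std::vector<std::string>& select_columns,
                              const std::string& provider) {
    std::string query = "SELECT ";
    for (std::size_t i = 0; i < select_columns.size(); ++i) {
        if (i > 0) {
            query += ", ";
        }
        query += select_columns[i];
    }
    query += " FROM clck_1 WHERE Provider='" + provider + "'";
    return query;
}
```

The resulting query string and map would then be passed as the first and third arguments of select_data().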
For example, the following selects the latest rows for each node corresponding to the duck provider (see the Database Schema section in the Reference for the database view).
    clck::database::select_data(
        "SELECT * FROM clck_1 a INNER JOIN "
        "(SELECT Hostname, MAX(Unique_timestamp) AS Unique_timestamp, Provider "
        "FROM clck_1 WHERE Provider = 'duck' GROUP BY Hostname, Provider) b "
        "ON a.Provider = b.Provider "
        "AND a.Unique_timestamp = b.Unique_timestamp "
        "AND a.Hostname = b.Hostname",
        rows);
A nearly equivalent set of data can be obtained using the following function.
    bool get_latest_rows_provider_data(const std::vector<std::string>& providers,
                                       Rows& rows,
                                       std::string hostname,
                                       bool mark_as_baseline);
Unlike the general select_data() function, get_latest_rows_provider_data() populates all the database columns rather than just the specified subset.
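The "latest rows" selection used in this section, both in the SQL join and in get_latest_rows_provider_data(), amounts to keeping the row with the largest timestamp for each hostname. The in-memory sketch below shows that logic. `SimpleRow` is a hypothetical, simplified row type and is not the library's Rows type.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified row; the library's row type has more columns.
struct SimpleRow {
    std::string hostname;
    long unique_timestamp;
    std::string output;
};

// Keeps, for each hostname, only the row with the largest timestamp,
// mirroring MAX(Unique_timestamp) ... GROUP BY Hostname.
std::vector<SimpleRow> latest_row_per_host(const std::vector<SimpleRow>& rows) {
    std::map<std::string, SimpleRow> latest;
    for (const SimpleRow& row : rows) {
        std::map<std::string, SimpleRow>::iterator it = latest.find(row.hostname);
        if (it == latest.end()) {
            latest.insert(std::make_pair(row.hostname, row));
        } else if (row.unique_timestamp > it->second.unique_timestamp) {
            it->second = row;
        }
    }
    std::vector<SimpleRow> result;
    for (const std::pair<const std::string, SimpleRow>& entry : latest) {
        result.push_back(entry.second);
    }
    return result;  // ordered by hostname, because std::map is sorted
}
```

One difference from the SQL version: if two rows for the same host share the maximum timestamp, the join returns both, while this sketch keeps only the first one it sees.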
Knowledge Base and CLIPS Interface
Analyzer uses the CLIPS C API to interact with the knowledge base (http://clipsrules.sourceforge.net/documentation/v630/apg.pdf).
Creating CLIPS Class Instances
Analyzer extensions populate the knowledge base by creating CLIPS instances. The format of the data expected by the knowledge base (that is, the CLIPS slots) is defined by the corresponding knowledge base class.
Parsing Database Output
Once the data is read from the database, it is available for processing. Any method available to C++, such as regular expressions, can be used to filter and transform the data into the format expected by the knowledge base.
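As an illustration of regular-expression parsing, the sketch below extracts a quack rating from one line of raw output. The output format of the fictional Duck Diagnostic Tool is not specified in this chapter, so the format used here (an integer from 1 to 5 followed by "quack" or "quacks") is an assumption, and `DuckRating` and `parse_duck_output()` are hypothetical names.

```cpp
#include <regex>
#include <string>

// Result of parsing one line of (assumed) Duck Diagnostic Tool output.
struct DuckRating {
    bool valid;  // false when the line is "honk" or is not recognized
    int quacks;  // 1..5 when valid, 0 otherwise
};

// Assumed format: an integer 1-5 followed by "quack" or "quacks",
// for example "4 quacks". The real provider output may differ.
DuckRating parse_duck_output(const std::string& raw) {
    static const std::regex pattern("\\s*([1-5])\\s+quacks?\\s*");
    std::smatch match;
    DuckRating rating = {false, 0};
    // regex_match requires the whole string to match the pattern.
    if (std::regex_match(raw, match, pattern)) {
        rating.valid = true;
        rating.quacks = std::stoi(match[1].str());
    }
    return rating;
}
```

With this pattern, a line such as "4 quacks" yields a valid rating of 4, while "honk" or an out-of-range value such as "9 quacks" is reported as invalid and left for the caller to handle.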
Handling Parse Errors
Parse errors can occur when an analyzer extension reads unexpected or invalid data from the database. If the error is critical to the operation of the entire extension, it is appropriate to log an error and throw an exception. For non-critical errors, the parser should log a warning message, ignore the offending row in the database, and continue processing the rest of the rows.
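The two cases can be sketched as follows. This is a minimal example under stated assumptions: `parse_counts()` is a hypothetical function, std::cerr stands in for whatever logging facility the extension actually uses, and treating an empty input as the critical error is a choice made only for this illustration.

```cpp
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Parses row outputs that are expected to be integers between 1 and 5.
// Critical error (assumption of this sketch): no rows at all, so the
// function logs an error and throws.
// Non-critical error: a malformed row is logged as a warning and skipped.
std::vector<int> parse_counts(const std::vector<std::string>& raw_rows) {
    if (raw_rows.empty()) {
        std::cerr << "error: no rows to parse" << std::endl;
        throw std::runtime_error("no rows to parse");
    }
    std::vector<int> counts;
    for (const std::string& raw : raw_rows) {
        char* end = nullptr;
        long value = std::strtol(raw.c_str(), &end, 10);
        // Valid only if at least one digit was read and nothing follows it.
        bool is_number = (end != raw.c_str()) && (*end == '\0');
        if (!is_number || value < 1 || value > 5) {
            std::cerr << "warning: ignoring invalid row '" << raw << "'"
                      << std::endl;
            continue;  // skip the offending row, keep processing the rest
        }
        counts.push_back(static_cast<int>(value));
    }
    return counts;
}
```

Skipping bad rows instead of aborting means a single corrupted entry does not prevent the remaining nodes from being analyzed.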
Building Analyzer Extensions
Extensions are shared libraries and need to be built as such.
Sample extensions and a sample Makefile are available in the SDK Duck Sample* at https://software.intel.com/en-us/product-code-samples?topic=20903 .
GCC* 4.9 or later is required to build extensions. The Intel® C++ Compiler 15.0 or later may also be used, but GCC* 4.9 or later is still required.
Intel® Cluster Checker uses features from C++11; therefore, the command line option -std=c++11 is required to build analyzer extensions.
Loading Extensions
To load an analyzer extension, add it to a custom framework definition using the following XML tags:
    <configuration>
      <framework_definition name="customFWD">
        <analyzer_extension>
          <group>
            <entry>custom_extension</entry>
          </group>
        </analyzer_extension>
      </framework_definition>
    </configuration>
The basename of the extension should match the internal extension name assigned by set_name(). This name is the value that should be added to the list of analyzer extensions.
Example
A complete, fully functional analyzer extension that transforms the output of the duck provider into instances of the DUCK CLIPS class is located in the SDK Duck Sample* at https://software.intel.com/en-us/product-code-samples?topic=20903 .
Knowledge Base
The knowledge base uses CLIPS rules to produce signs and diagnoses based on collected data. It is the framework through which Intel® Cluster Checker comes to conclusions about a system and thereby produces analysis. The knowledge base is documented in full in the Knowledge Base chapter.
Postprocessor Extensions
Postprocessor extensions format the results of analysis in a readable format. By default, Intel® Cluster Checker runs the summary postprocessor extension followed by the CLCK output log postprocessor extension. Postprocessor extensions can be specified in the config file using the following format:
    <configuration>
      <postprocessor>
        <postproc_extensions>
          <group>
            <entry>summary</entry>
            <entry>clck_output_log</entry>
          </group>
        </postproc_extensions>
      </postprocessor>
    </configuration>
They can also be included in an individual framework definition in a similar manner. For more information about customizing Framework Definitions, see the Framework Definitions chapter. The following postprocessor extensions are available:
Summary
The summary postprocessor extension displays a brief summary of the analysis results to the screen. This extension runs by default and can be specified with the entry tag using the string “summary”, as shown above.
CLCK Output Log
The CLCK output log postprocessor extension writes full analysis details to a log file. This extension runs by default and can be specified with the entry tag using the string “clck_output_log”, as shown above.
*Note: the duck sample is not available for Intel® Cluster Checker 2019 but will be available in the future.
