Stress Settings

As a comprehensive real-time monitoring solution for multi-user environments, ControlUp can display a flexible measure of system health, called “Stress Level”, for every monitored resource, be it a folder, computer, user session or process. Stress Level is a numeric column whose value is displayed in ControlUp’s grid as one of the following ranges: None, Low, Medium, High and Critical. Using this column, you can quickly determine the health of your resources, for example by sorting the grid so that highly stressed resources are on top. In this chapter, you will learn how to configure the Stress Level column so that it best represents the current health status of the resources in your environment.

Stress Settings tab layout

All Stress Level-related settings are configured using the Stress Settings tab of the Settings Window. As seen in the image above, this tab contains a folder tree (1), which is identical to the tree displayed in the “My Organization” pane, a navigation bar (2) for switching between resource types, a counter selection area (3), an “Applies To” area for configuring filters (4), a Settings area for configuring the computation of the Stress Level value (5) and a “Stress Levels” area (6) for configuring the boundaries between different levels.

About Stress Level Inheritance

Default Settings

By default, every ControlUp organization contains a single set of Stress Level settings, configured on the root folder of the organization. Unless configured otherwise, these settings are inherited by all child folders and the computers within them.

Subfolder Inheritance

ControlUp’s folder tree is designed to allow the user to arrange computers in folders according to their type. For example, you might want to separate your workstations from your servers, and further segment the servers folder into subfolders containing different types of machines. This type of arrangement is generally convenient, and is especially useful for configuring different Stress Level settings for different types of computers.
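The inheritance behavior described above can be pictured as a walk up the folder tree. The following is a minimal sketch only, not ControlUp’s actual implementation: the Folder class, the effective_settings function and the setting names are hypothetical, and it assumes a folder either defines a complete set of settings or inherits the whole set from its nearest configured ancestor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Folder:
    """Hypothetical model of a folder in the organization tree."""
    name: str
    parent: Optional["Folder"] = None
    settings: Optional[dict] = None  # None means "inherit from the parent"


def effective_settings(folder: Folder) -> dict:
    """Return the settings of the nearest folder (self or ancestor) that defines them."""
    node = folder
    while node is not None:
        if node.settings is not None:
            return node.settings
        node = node.parent
    raise ValueError("no folder in the chain defines Stress Level settings")


# The root folder carries the single default set of settings.
root = Folder("Organization", settings={"cpu_yellow": 80, "cpu_red": 90})
servers = Folder("Servers", parent=root)
rds_hosts = Folder("RDS Hosts", parent=servers,
                   settings={"cpu_yellow": 70, "cpu_red": 85})
workstations = Folder("Workstations", parent=root)

print(effective_settings(workstations))  # inherited from the root folder
print(effective_settings(rds_hosts))     # customized on the subfolder
```

In this sketch, only the “RDS Hosts” subfolder overrides the defaults; every other folder resolves to the settings configured on the root.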

Filter Inheritance

Besides segmenting resources into subfolders, ControlUp also distinguishes between resource types automatically, allowing you to configure the performance counter thresholds that are optimal for each monitored resource. This is done by using filters, which are predefined criteria available in the counter area.

COMPUTER FILTERS

You may configure different thresholds for each computer type using the filter area of each counter configuration. ControlUp distinguishes between the following computer types:

General Purpose Server – a computer running a server-class Operating System, with no Terminal (Remote Desktop) Services installed. This could be a file server, an Exchange server, a Web server or an SQL server. These computers typically host a limited number of user sessions for administrative purposes, and have most of their resources consumed by background services.

RDS – a computer running a server-class Operating System with Terminal (Remote Desktop) Services role installed. These computers typically host multiple end-user sessions, running virtualized applications or full-desktop environments.

Workstation – a computer running a client Operating System, such as Windows 7, 8 or XP. A computer of this type typically hosts a single user session with foreground processes (applications) consuming most of the computer’s resources.

By default, all filters within a counter inherit that counter’s default thresholds. By clicking on the filter name on the left, you can customize the thresholds for each filter, as described below.
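The three computer types and the per-filter thresholds can be sketched as follows. This is an illustration under stated assumptions, not ControlUp’s detection logic: the classify and thresholds_for functions, their parameters and the threshold numbers are hypothetical, and the classification rule is reduced to the two properties named in the definitions above.

```python
from enum import Enum


class ComputerType(Enum):
    GENERAL_PURPOSE_SERVER = "General Purpose Server"
    RDS = "RDS"
    WORKSTATION = "Workstation"


def classify(server_os: bool, rds_role_installed: bool) -> ComputerType:
    """Simplified rule: client OS -> Workstation; server OS splits on the RDS role."""
    if not server_os:
        return ComputerType.WORKSTATION
    if rds_role_installed:
        return ComputerType.RDS
    return ComputerType.GENERAL_PURPOSE_SERVER


def thresholds_for(counter_default: dict, overrides: dict,
                   computer_type: ComputerType) -> dict:
    """A filter with no customized thresholds falls back to the counter default."""
    return overrides.get(computer_type, counter_default)


cpu_default = {"yellow": 80, "red": 90}
# Hypothetical customization: only the RDS filter has its own thresholds.
cpu_overrides = {ComputerType.RDS: {"yellow": 70, "red": 85}}

for server_os, rds_role in [(True, False), (True, True), (False, False)]:
    kind = classify(server_os, rds_role)
    print(kind.value, thresholds_for(cpu_default, cpu_overrides, kind))
```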

Configuring the Stress Level Computation

The counter area of the Stress Settings tab allows you to configure which metrics contribute to the computation of the Stress Level column for each resource in ControlUp.

Per-Counter Configurations

The counter area includes a row for every column included in each of ControlUp’s views (Folders, Computers, Sessions, Processes, Accounts and Executables). Please note that each view supports a different set of columns. You can switch between views using the navigation buttons on top of the grid.

Each counter row includes several settings which configure the contribution of that counter to the total Stress Level of the record.

YELLOW AND RED

Every counter has a Yellow and a Red zone, with configurable numeric boundaries. In the example above, a computer “CPU” column’s default settings are 80% for Yellow and 90% for Red. Once a computer’s CPU usage climbs to 85%, the cell in its CPU column will become yellow. If the CPU usage drops below 80% again, this cell will go back to green. These changes in the grid should be instantaneous.

Note: Some counters (such as Free Disk Space) have reverse zone boundaries, i.e. Red values will be lower than Yellow values, since in these cases a lower value indicates a more severe condition.
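The zone logic, including the reversed case, can be expressed in a few lines. This is a minimal sketch: the zone function is hypothetical, the free disk space thresholds are made-up example values, and it assumes that a value exactly on a boundary already belongs to the more severe zone.

```python
def zone(value: float, yellow: float, red: float) -> str:
    """Return the color zone of a counter value.

    If red >= yellow, higher values are worse (e.g. CPU usage).
    If red < yellow, the boundaries are reversed and lower values
    are worse (e.g. Free Disk Space).
    """
    if red >= yellow:
        if value >= red:
            return "red"
        if value >= yellow:
            return "yellow"
    else:
        if value <= red:
            return "red"
        if value <= yellow:
            return "yellow"
    return "green"


# CPU with the default boundaries mentioned above: Yellow at 80%, Red at 90%.
print(zone(85, yellow=80, red=90))
# Free disk space (%) with hypothetical reversed boundaries: Yellow at 20, Red at 10.
print(zone(15, yellow=20, red=10))
```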

DURATION

Once a Yellow or a Red boundary is crossed, ControlUp tracks how long the value of the counter stays beyond that boundary. You can configure ControlUp to increase the resource’s Stress Level when this happens, and specify how long the value must stay beyond the threshold before it does. For example, you may decide that if a computer’s Disk Queue Length value stays over 1 for 1 minute, this may indicate an I/O bottleneck and should affect the computer’s Stress Level, and that if the value stays over 2 for a minute, this may indicate a severe I/O issue that you want displayed in red.

LOAD

The “Load” value determines how many points should be added to the value of the record’s Stress Level column when a threshold is crossed for the time duration described above. For example, in the Disk Queue example above, if the value stays between 1 and 2 for a minute, the Stress Level will be incremented by 1 point. If the value is above 2, the Stress Level will be incremented by 2 points.
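The interplay of threshold, duration and load in the Disk Queue example can be sketched as follows. This is an illustration only: the ZoneRule class and both functions are hypothetical names, samples are assumed to be (timestamp in seconds, value) pairs in ascending time order, and only an unbroken run of samples ending at the latest sample is counted as sustained.

```python
from dataclasses import dataclass


@dataclass
class ZoneRule:
    threshold: float   # boundary the value must exceed
    duration_s: float  # how long it must stay above the boundary
    load: int          # points added to the Stress Level


def sustained(samples, threshold: float, duration_s: float) -> bool:
    """True if the most recent unbroken run above threshold spans duration_s."""
    start = None
    for timestamp, value in reversed(samples):
        if value <= threshold:
            break
        start = timestamp
    return start is not None and samples[-1][0] - start >= duration_s


def load_points(samples, yellow: ZoneRule, red: ZoneRule) -> int:
    """Points this counter contributes; the Red rule takes precedence."""
    if sustained(samples, red.threshold, red.duration_s):
        return red.load
    if sustained(samples, yellow.threshold, yellow.duration_s):
        return yellow.load
    return 0


# Disk Queue Length rules from the example: over 1 for a minute adds 1 point,
# over 2 for a minute adds 2 points.
yellow = ZoneRule(threshold=1, duration_s=60, load=1)
red = ZoneRule(threshold=2, duration_s=60, load=2)

between_1_and_2 = [(0, 1.5), (30, 1.4), (60, 1.8)]
above_2 = [(0, 2.5), (30, 3.0), (60, 2.2)]
too_short = [(0, 0.5), (30, 1.5), (60, 1.6)]

print(load_points(between_1_and_2, yellow, red))  # 1
print(load_points(above_2, yellow, red))          # 2
print(load_points(too_short, yellow, red))        # 0
```

In the last series the value has been over 1 for only 30 seconds, so no points are added yet.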

SEVERITY BY

To change the value used by the information grid to display the performance data of a column and modify the cell color accordingly, select a computation method from the “Severity by” drop-down list. The following values are available:

  • Current Value – the present value of the counter. This is recommended for counters such as “Memory Utilization” or free disk space, for which the most recent value is the most useful.
  • Max – the maximum value recorded in the counter since its sampling started. Valuable mainly for peak analysis and capacity planning.
  • Min – the minimum value recorded in the counter since its sampling started.
  • Max In History – the maximum value of the counter’s current buffer.
  • Average – the average value computed on all values recorded by the counter since its sampling started. Valuable mainly for long-term analysis and establishing baselines.
  • Average In History – the average value of the counter’s current buffer. Valuable mainly for rapidly fluctuating counters, such as Page Faults/sec and CPU usage.

Note: ControlUp’s performance counters maintain a buffer of samples that were significantly different from previous samples. The number of stored values depends on the variance of the samples. The computation formulas are beyond the scope of this document, but when changing the default computation method for a column, keep in mind that “In History” values are computed from more recently received data.

To illustrate the usage of the above values, consider a computer’s “CPU” column. If you select “Average In History” in the “Severity by” drop-down list, you may see the cell colored red while its displayed value is in the “green” range. This happens because the displayed value is the current value (e.g. 5%), while the severity color is based on the “Average In History” value, which may be high (e.g. 90%). This type of configuration makes sense in most environments, since a momentary peak of CPU usage is usually no cause for alarm, while a prolonged CPU load detected by the “Average In History” value may indicate a performance issue and justifies a color-coded severity alert. Take care when customizing the counter thresholds and their calculation sources, and consider the variance and fluctuation rate of each counter when planning a change to these values.
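The difference between the “Severity by” methods can be reproduced with a small model. This is a simplified sketch, not ControlUp’s formulas: the CounterStats class is hypothetical, and the “In History” buffer is approximated by a fixed-size window of the most recent samples, whereas the real buffer stores only significantly different samples and varies in size.

```python
from collections import deque


class CounterStats:
    """Hypothetical counter model supporting the six 'Severity by' methods."""

    def __init__(self, buffer_size: int = 10) -> None:
        self.history = deque(maxlen=buffer_size)  # simplified "In History" buffer
        self.all_values = []                      # everything since sampling started

    def add(self, value: float) -> None:
        self.history.append(value)
        self.all_values.append(value)

    def severity_value(self, method: str) -> float:
        methods = {
            "Current Value": lambda: self.all_values[-1],
            "Max": lambda: max(self.all_values),
            "Min": lambda: min(self.all_values),
            "Max In History": lambda: max(self.history),
            "Average": lambda: sum(self.all_values) / len(self.all_values),
            "Average In History": lambda: sum(self.history) / len(self.history),
        }
        return methods[method]()


cpu = CounterStats(buffer_size=10)
# Two early samples, a prolonged period at 95%, then a sudden drop to 5%.
for sample in [100, 10] + [95] * 9 + [5]:
    cpu.add(sample)

print(cpu.severity_value("Current Value"))       # 5
print(cpu.severity_value("Average In History"))  # 86.0
print(cpu.severity_value("Max"))                 # 100
print(cpu.severity_value("Max In History"))      # 95
```

Here the displayed current value is 5%, while the “Average In History” value is still 86%, which is why a cell can show a low number and a warning color at the same time.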

N/A COLOR

Some counters have a complex computation mechanism, which may fail under certain conditions, for example when the value of a performance counter cannot be retrieved. For each of the metrics collected by ControlUp, you may decide that a failure to collect a counter’s value represents an issue in itself and should change the color of the column to yellow or red. The XenApp Load counter is one example.
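The N/A color can be thought of as a fallback applied before the normal zone check. This is a minimal sketch with hypothetical names; it models a missing value as None, covers only counters for which higher values are worse, and the default of "green" for na_color is an assumption made for the example.

```python
from typing import Optional


def cell_color(value: Optional[float], yellow: float, red: float,
               na_color: str = "green") -> str:
    """Color of a cell; a value that could not be collected gets na_color."""
    if value is None:  # the counter value could not be retrieved
        return na_color
    if value >= red:
        return "red"
    if value >= yellow:
        return "yellow"
    return "green"


# A counter whose collection failure should itself be flagged in red.
print(cell_color(None, yellow=80, red=90, na_color="red"))
# The same counter when a value is available.
print(cell_color(85, yellow=80, red=90, na_color="red"))
```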

Configuring boundaries between Stress Levels

Using the Stress Levels panel on the left side of the Stress Settings pane, you can customize the numeric boundaries between ControlUp’s stress levels.

By default, all resources in your organization inherit the following default Stress Level boundaries:

  • (No Stress) < 1
  • 1 <= Low < 2
  • 2 <= Medium < 3
  • 3 <= High < 6
  • 6 <= Critical

Just like the stress level computation settings, these boundaries are configurable on a folder basis, which means that a resource (computer, session or process) with a stress level of 7 may be considered Critical in one folder and Medium in another, according to the needs of your environment.
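The mapping from a numeric Stress Level to a named level can be sketched as a lookup against the lower bounds listed above. The stress_label function is a hypothetical name, and the customized boundaries in the second example are made-up values for illustration.

```python
from bisect import bisect_right

LEVELS = ["None", "Low", "Medium", "High", "Critical"]

# Lower bounds of Low, Medium, High and Critical in the default configuration.
DEFAULT_BOUNDARIES = [1, 2, 3, 6]


def stress_label(points: float, boundaries=DEFAULT_BOUNDARIES) -> str:
    """Return the named level for a numeric Stress Level value."""
    return LEVELS[bisect_right(boundaries, points)]


print(stress_label(0))  # None
print(stress_label(5))  # High
print(stress_label(7))  # Critical

# Hypothetical customized boundaries for a more tolerant folder.
relaxed = [2, 5, 8, 12]
print(stress_label(7, relaxed))  # Medium
```

With the default boundaries a value of 7 is Critical, while with the relaxed boundaries the same value is only Medium.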

In order to customize the Stress Level boundaries for a subfolder:

  1. Switch to the Stress Settings tab of the Settings Window.
  2. Click on the desired subfolder in the organization tree.
  3. Uncheck the “Default Configurations” checkbox in the “Stress Levels” panel just below the tree.
  4. Adjust the numeric boundaries using the sliders or by typing the numbers into the fields corresponding to each level.
  5. Click “Apply Settings” on the Home ribbon.

In order to reset default Stress Level boundaries for a subfolder:

  1. Switch to the Stress Settings tab of the Settings Window.
  2. Click on the desired subfolder in the organization tree.
  3. Check the “Default Configurations” checkbox in the “Stress Levels” panel just below the tree.
  4. Click “Apply Settings” on the Home ribbon.

Receiving Stress Level Alerts

You can configure ControlUp to alert you when resources in your environment reach a configured stress level. For more information, please refer to the Trigger Settings section.
