GPU Monitoring with ControlUp 8.7

Introduction

With ControlUp, you can monitor GPU data in real time. You can track the performance of your GPUs at the machine, session, or process level.

Before version 8.7, ControlUp collected GPU metrics only for NVIDIA GPUs. Version 8.7 collects data for all GPU models by using the GPU metrics built into Windows. This means that the Real-Time Console and Solve can display the same GPU metrics that you would see in Task Manager on the machine where the GPU is installed.

System Prerequisites

Version 8.7 supports a broad range of graphics card manufacturers. The monitored machine running the GPU must meet one of the following requirements:

  • Non-NVIDIA GPU models require the Windows Display Driver Model (WDDM). To support non-NVIDIA graphics cards, the machine must run one of the following operating systems:
    • Windows Server 2019 or higher
    • Windows 10 1709 or higher
  • Physical Windows machine or virtual Windows machine with passthrough devices.
  • Any other Windows / VM configuration where GPU data is visible in Task Manager.
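
The operating system requirement for non-NVIDIA GPUs can be checked with a short script. The following is a minimal sketch and not part of ControlUp: the function name `supports_wddm_gpu_metrics` is hypothetical, and the script assumes that the releases named above can be identified by their public build numbers (16299 for Windows 10 1709, 17763 for Windows Server 2019).

```python
import sys

# Assumed minimum build numbers for the releases named in the prerequisites.
WIN10_1709_BUILD = 16299    # Windows 10, version 1709
SERVER_2019_BUILD = 17763   # Windows Server 2019


def supports_wddm_gpu_metrics(build: int, is_server: bool) -> bool:
    """Return True if the OS build meets the minimum for non-NVIDIA GPUs.

    Hypothetical helper for illustration; ControlUp does not ship it.
    """
    minimum = SERVER_2019_BUILD if is_server else WIN10_1709_BUILD
    return build >= minimum


if __name__ == "__main__" and sys.platform == "win32":
    version = sys.getwindowsversion()
    # product_type 1 is a workstation; 2 and 3 are server roles.
    is_server = version.product_type != 1
    print(supports_wddm_gpu_metrics(version.build, is_server))
```

The check only covers the operating system bullet. A machine can still qualify through one of the other requirements in the list, such as a passthrough device.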

ControlUp Prerequisites

To use the new GPU feature, you need to:

  1. Deploy the ControlUp Agent on each machine that runs the GPU.
  2. On the same machines, set the following registry values:

Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: IsGPUDisabled
Type: REG_DWORD
Value: 0

Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: EnableNvidiaGPUCollection
Type: REG_DWORD
Value: 0
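
If you prefer to script these settings, the two values can be written with the Windows `reg add` command. The sketch below only builds the command lines as text, so you can review them before running them in an elevated prompt on the agent machine. The function name `build_reg_add_commands` is hypothetical and not part of ControlUp.

```python
# Sketch only: builds the commands as text; it does not modify the registry.
GPU_KEY = r"HKLM\SOFTWARE\Smart-X\ControlUp\Agent\GPU"

# Value names and data taken from the registry settings listed above.
PREREQUISITE_VALUES = {
    "IsGPUDisabled": 0,
    "EnableNvidiaGPUCollection": 0,
}


def build_reg_add_commands(values: dict, key: str = GPU_KEY) -> list:
    """Return one `reg add` command line per REG_DWORD value."""
    return [
        # /v = value name, /t = type, /d = data, /f = overwrite without prompt
        f'reg add "{key}" /v {name} /t REG_DWORD /d {data} /f'
        for name, data in values.items()
    ]


if __name__ == "__main__":
    for command in build_reg_add_commands(PREREQUISITE_VALUES):
        print(command)
```

Generating text instead of writing to the registry directly keeps the script safe to run on any machine, including one that is not a ControlUp agent.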

Tip for adding registry values on multiple machines

By using Controllers, you can set the registry keys on multiple machines simultaneously.

GPU Metrics

We provide a number of useful GPU metrics that help you monitor the GPUs of your machines. The table below provides an overview of all GPU-related metrics that are implemented in our products:

Metric Name | Description | Tab
Average GPU Frame Buffer Usage | Average frame buffer usage for all GPUs | Folders
Average GPU Usage | Average usage of all GPUs | Folders
GPU Architecture | GPU architecture | Machine
GPU Available Memory | GPU available memory in megabytes (MB) | Machine
GPU CPU Utilization | GPU CPU utilization | Session
GPU Decoder Utilization | GPU decoder utilization | Machine
GPU Driver version | Current version of the installed GPU driver | Machine
GPU Encoder Utilization | GPU encoder utilization | Machine
GPU Frame Buffer Memory Utilization | GPU frame buffer memory utilization | Session
GPU Frame Buffer Size | Size of memory assigned to the GPU | Machine
GPU Frame Buffer Usage | Used size of the physical graphics card frame buffer memory, in percent | Machine
GPU License Port | Primary license server port | Machine
GPU License Server | Primary license server name | Machine
GPU Memory Usage | GPU memory usage, in percent | Machine
GPU Model | GPU name or GRID GPU profile type | Machine
GPU Number of Cores | Number of CUDA cores | Machine
GPU Usage | GPU usage, in percent | Machine
GPU Utilization | GPU utilization | Process
GPU Video Decoder Usage | GPU video decoder usage, in percent | Session
GPU Video Encoder Usage | GPU video encoder usage, in percent | Session
Machines with GPU | Number of machines with GPUs | Folders

Collecting Data from NVIDIA API (Optional)

Windows is the default data source for collecting GPU metrics in 8.7. To switch back to collecting data from the NVIDIA API, set the following registry key on the machine that has the NVIDIA GPU installed:

Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: EnableNvidiaGPUCollection
Type: REG_DWORD
Value: 1
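
As an alternative to editing the registry by hand, the same value can be packaged as a Registry Editor (.reg) file and imported on the machine. The following is a minimal sketch that only renders the file text. The function name `render_reg_file` is hypothetical and not part of ControlUp.

```python
# Sketch only: renders .reg file text; it does not touch the registry.
GPU_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU"


def render_reg_file(values: dict, key: str = GPU_KEY) -> str:
    """Render REG_DWORD values in Registry Editor (.reg) syntax."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, data in values.items():
        # DWORD data is written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{data:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Switch GPU data collection back to the NVIDIA API.
    print(render_reg_file({"EnableNvidiaGPUCollection": 1}), end="")
```

Save the output to a file with a `.reg` extension and review it before importing it on the machine that has the NVIDIA GPU installed.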

Known Issue

In version 8.7, GPU metrics are displayed even when no GPU is enabled. To work around this, set the following registry key on each agent machine that has no GPU enabled:

Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: EnableNvidiaGPUCollection
Type: REG_DWORD
Value: 0

If you need to apply the registry changes on multiple machines, use the Controllers pane.

Note

This issue is planned to be fixed in the next release (8.8).

