GPU Monitoring with ControlUp 8.7
Introduction
With ControlUp, you can monitor your GPU data in real time. You can track GPU performance at the machine, session, or process level.
Before version 8.7, ControlUp collected GPU metrics only for NVIDIA GPUs. Version 8.7 collects data for all GPU models by using the GPU metrics built into Windows. This means that the Real-Time Console and Solve display the same GPU metrics that you would see in Task Manager on the machine where the GPU is installed.
System Prerequisites
Version 8.7 supports a broad range of graphics card manufacturers. The monitored machine running the GPU must meet one of the following requirements:
- Non-NVIDIA GPU models require the Windows Display Driver Model (WDDM). To support non-NVIDIA graphics cards, the machine must run one of the following operating systems:
  - Windows Server 2019 or higher
  - Windows 10 1709 or higher
- Physical Windows machine or virtual Windows machine with passthrough devices.
- Any other Windows / VM configuration where data is visible in Task Manager.
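As a rough illustration of the operating system requirement for non-NVIDIA GPUs, the sketch below compares a Windows build number against the minimum builds for the two operating systems listed. The build numbers (16299 for Windows 10 1709, 17763 for Windows Server 2019) are the publicly documented ones; the function names are hypothetical and not part of ControlUp.

```python
import sys

# Minimum build numbers for the operating systems listed above.
WIN10_1709_BUILD = 16299    # Windows 10, version 1709
SERVER_2019_BUILD = 17763   # Windows Server 2019


def meets_wddm_os_requirement(build: int, is_server: bool) -> bool:
    """Return True if the build satisfies the non-NVIDIA OS prerequisite."""
    minimum = SERVER_2019_BUILD if is_server else WIN10_1709_BUILD
    return build >= minimum


def current_machine_qualifies() -> bool:
    """Check the local machine; always False when not running on Windows."""
    if sys.platform != "win32":
        return False
    version = sys.getwindowsversion()
    # product_type 1 is a workstation; 2 and 3 are server editions.
    return meets_wddm_os_requirement(version.build, version.product_type != 1)
```

This only covers the operating system check. It does not verify that a WDDM driver is installed or that GPU data is actually visible in Task Manager.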
ControlUp Prerequisites
To use the new GPU feature, you need to:
- Deploy the ControlUp Agent on each machine that runs the GPU.
- On the same machines, set the following registry values:
Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: IsGPUDisabled
Type: REG_DWORD
Value: 0
Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: EnableNvidiaGPUCollection
Type: REG_DWORD
Value: 0
To set the registry values on multiple machines simultaneously, use the Controllers pane.
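If you prefer to prepare the two values as a file, they can be expressed in the standard Registry Editor (.reg) format. The following is a minimal sketch that generates such a file's text; the function name `build_reg_file` is hypothetical, and nothing here is a ControlUp tool.

```python
from typing import Mapping

# Registry path taken from the prerequisites above.
GPU_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU"


def build_reg_file(values: Mapping[str, int]) -> str:
    """Return .reg file text that sets each name to a REG_DWORD value."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{GPU_KEY}]"]
    for name, data in values.items():
        if not 0 <= data <= 0xFFFFFFFF:
            raise ValueError(f"{name}: {data} does not fit in a DWORD")
        # DWORD data is written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{data:08x}')
    return "\r\n".join(lines) + "\r\n"


reg_text = build_reg_file({
    "IsGPUDisabled": 0,
    "EnableNvidiaGPUCollection": 0,
})
```

The resulting text can be saved with a `.reg` extension and imported on the target machine. Registry Editor files of this version are conventionally saved as UTF-16, so writing the file with `encoding="utf-16"` is a reasonable choice.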
GPU Metrics
ControlUp provides a number of GPU metrics that help you monitor the GPUs of your machines. The table below lists all GPU-related metrics implemented in our products, with a description of each metric and the tab where it appears:
Metric Name | Description | Tab |
---|---|---|
Average GPU Frame Buffer Usage | Average frame buffer usage for all GPUs | Folders |
Average GPU Usage | Average usage of all GPUs | Folders |
GPU Architecture | GPU architecture | Machine |
GPU Available Memory | GPU available memory in megabytes (MB) | Machine |
GPU CPU Utilization | GPU CPU Utilization | Session |
GPU Decoder Utilization | GPU decoder utilization | Machine |
GPU Driver version | Current version of the installed GPU driver | Machine |
GPU Encoder Utilization | GPU encoder utilization | Machine |
GPU Frame Buffer Memory Utilization | GPU Frame Buffer Memory Utilization | Session |
GPU Frame Buffer Size | Size of memory assigned to the GPU | Machine |
GPU Frame Buffer Usage | Used size of the physical graphics card frame buffer memory in percent | Machine |
GPU License Port | Primary license server port | Machine |
GPU License Server | Primary license server name | Machine |
GPU Memory Usage | GPU memory usage in percent | Machine |
GPU Model | GPU name or GRID GPU profile type | Machine |
GPU Number of Cores | Number of CUDA cores | Machine |
GPU Usage | GPU usage in percent | Machine |
GPU Utilization | GPU Utilization | Process |
GPU Video Decoder Usage | GPU video decoder usage in percent | Session |
GPU Video Encoder Usage | GPU video encoder usage in percent | Session |
Machines with GPU | Number of machines with GPUs | Folders |
Collecting Data from NVIDIA API (Optional)
Windows is the default data source for collecting GPU metrics in version 8.7. To switch back to collecting data from the NVIDIA API, set the following registry value on the machine that has the NVIDIA GPU installed:
Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: EnableNvidiaGPUCollection
Type: REG_DWORD
Value: 1
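For a single machine, the same change can be made from an elevated command prompt with the built-in Windows `reg add` command. The sketch below only assembles the command line; the helper name `reg_add_dword` is hypothetical, and running the command is left to you.

```python
from typing import List

# Same key as above, using the HKLM abbreviation that reg.exe accepts.
GPU_KEY = r"HKLM\SOFTWARE\Smart-X\ControlUp\Agent\GPU"


def reg_add_dword(value_name: str, data: int) -> List[str]:
    """Build a 'reg add' argument list that sets a REG_DWORD value."""
    if data < 0:
        raise ValueError("DWORD data must not be negative")
    return [
        "reg", "add", GPU_KEY,
        "/v", value_name,      # value name
        "/t", "REG_DWORD",     # value type
        "/d", str(data),       # value data
        "/f",                  # overwrite without prompting
    ]


command = reg_add_dword("EnableNvidiaGPUCollection", 1)
print(" ".join(command))
```

On the target machine, the list can be passed to `subprocess.run(command, check=True)`, or the printed line can be pasted into an elevated prompt.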
Known Issue
In version 8.7, GPU metrics are displayed even when no GPU is enabled. To work around this, set the following registry value on each agent machine with no GPU enabled:
Path: HKEY_LOCAL_MACHINE\SOFTWARE\Smart-X\ControlUp\Agent\GPU\
RegKey: EnableNvidiaGPUCollection
Type: REG_DWORD
Value: 0
If you need to apply the registry changes on multiple machines, use the Controllers pane.
This issue is scheduled to be fixed in the next release (8.8).