NVTOP/en
NVTOP stands for Neat Videocard TOP, a (h)top-like task monitor for GPUs and accelerators. It can handle multiple GPUs and prints information about them in a layout familiar to htop users.
Monitor GPU usage
NVTOP can monitor a single GPU or several at once. It shows GPU utilization and GPU memory usage. You can also select a specific device from the menu (F2 -> GPU Select).
GPU Usage Efficiency
NVTOP is useful for verifying that your job is using the GPU as efficiently as possible.
Monitor batch job
To see the current GPU usage of a non-interactive job you have already submitted:
- From a login node, list your jobs and note the ID of the one to monitor.
- Attach to the running job.
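
The two steps above might look like the following sketch. It assumes a Slurm cluster where `squeue` and `srun` are available on the login node. The function names `list_my_jobs` and `attach_nvtop` are hypothetical helpers introduced here for illustration, and `1234567` is a placeholder job ID.

```bash
# Sketch only: helper names are hypothetical, not part of Slurm or NVTOP.

# Step 1: list your own jobs; the job ID is shown in the JOBID column.
list_my_jobs() {
    squeue -u "$USER"
}

# Step 2: start nvtop as an extra step inside the running job.
# --overlap lets the new step share resources already allocated to the job;
# --pty connects your terminal to nvtop's interactive display.
attach_nvtop() {
    local jobid="$1"
    srun --pty --overlap --jobid "$jobid" nvtop
}

# Example (1234567 is a placeholder job ID):
#   list_my_jobs
#   attach_nvtop 1234567
```

Quitting NVTOP (with `q` or F10) ends only the monitoring step; the batch job itself keeps running.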
Monitor interactive job
- Start your interactive job with minimal resources.
- In a second terminal, connect to the login node and find the job ID.
- Attach to the running job.
You'll be able to see the usage in real time as you run your commands in the first terminal.
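
As a rough sketch of this workflow, assuming Slurm's `salloc`, `squeue` and `srun` are available: the resource values passed to `salloc` are illustrative assumptions rather than site recommendations (your cluster may also require an `--account` option), and the function names are hypothetical.

```bash
# Sketch only: helper names and resource values are assumptions.

# Terminal 1: request a small interactive allocation with one GPU.
start_small_gpu_session() {
    salloc --time=1:00:00 --gpus-per-node=1 --cpus-per-task=1 --mem=4G
}

# Terminal 2 (login node): print the IDs of your running jobs, one per line.
# -h suppresses the header, -t RUNNING filters by state, %i is the job ID.
running_job_ids() {
    squeue -u "$USER" -h -t RUNNING -o "%i"
}

# Terminal 2: attach nvtop to the chosen job without disturbing terminal 1.
watch_job_gpu() {
    local jobid="$1"
    srun --pty --overlap --jobid "$jobid" nvtop
}

# Example:
#   start_small_gpu_session      # in terminal 1
#   running_job_ids              # in terminal 2
#   watch_job_gpu 1234567        # in terminal 2; 1234567 is a placeholder
```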
Monitor a GPU on a specific node
When running multi-node jobs, it can be useful to verify that the GPUs on one node, or on every node, are actually being used.
- From a login node, find the job ID and identify the node names.
- Attach to the running job on the specific node:

  ```bash
  srun --pty --overlap --jobid JOBID --nodelist NODENAME nvtop
  ```
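
One possible way to identify the node names is to ask `squeue` for the job's node list and expand it with `scontrol show hostnames`. This is a sketch that assumes both Slurm commands are available on the login node and that the job is already running. The helper name `job_nodes` is hypothetical.

```bash
# Sketch only: job_nodes is a hypothetical helper, not a Slurm command.

# Print the hostnames allocated to a running job, one per line.
job_nodes() {
    local jobid="$1"
    local nodelist
    # %N is the job's node list in compressed form, e.g. node[01-02];
    # -h suppresses the header line.
    nodelist=$(squeue -j "$jobid" -h -o "%N")
    # Expand the compressed list into individual hostnames.
    scontrol show hostnames "$nodelist"
}

# Example (1234567 is a placeholder job ID):
#   job_nodes 1234567
```

Any hostname it prints can be used as `NODENAME` in the `srun` command above. To check every node, repeat that command once per hostname.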