
GPU Linux Shell Temp: Get Nvidia GPU Temperatures Via Linux CLI

As everyone knows, we just rebuilt our massive cracking server. On the old server I was doing temperature monitoring with lm_sensors, which is very good for CPU temps, but I was not too sure about its accuracy on GPU temperatures. So I decided to look into another solution.

If you are running an X server, you can simply open the nvidia-settings GUI and read your temps there. Our server, on the other hand, does not run X; it is a pure CLI environment. Monitoring GPU temps has never been of paramount importance on Linux, since the main people who cared were gamers and, as we all know, most games suck on Linux, so the GPU temperature applications are primarily Windows-based.

Now that GPU-based tools on Linux are becoming more popular, it has become increasingly important to monitor our GPU temps. So back to my dilemma: I knew that nvidia-settings could be used from the CLI, but it still required an X server, which I did not want to install on this machine.

I could almost have kicked myself when I actually found the solution. It turns out there is another utility installed with the drivers, called nvidia-smi. The "smi" stands for “System Management Interface”.

Here is the output of nvidia-smi --help:


[root@tools ~]# nvidia-smi --help
nvidia-smi [OPTION1] [OPTION2 ARG] ...
NVIDIA System Management Interface program for Tesla S870
        -h, --help                                  Show usage and exit
        -x, --xml-format                            Produce XML log (to stdout by default, unless
                                                    a file is specified with -f or --filename=FILE
        -l, --loop-continuously                     Probe continuously, clobbers old logfile if not printing to stdout
        -t NUM, --toggle-led=NUM                    Toggle LED state for Unit <NUM>
        -i SEC, --interval=SEC                      Probe once every <SEC> seconds if the -l option
                                                    is selected (default and minimum: 1 second)
        -f FILE, --filename=FILE                    Specify log file name
        --gpu=GPUID --compute-mode-rules=RULESET    Set rules for compute programs
                                                    where GPUID is the number of the GPU (starting at zero) in the system
                                                    and RULESET is one of:
                                                    0: Normal mode
                                                    1: Compute-exclusive mode (only one compute program per GPU allowed)
                                                    2: Compute-prohibited mode (no compute programs may run on this GPU)
        -g GPUID -c RULESET                         (short form of the previous command)
        --gpu=GPUID --show-compute-mode-rules
        -g GPUID -s                                 (short form of the previous command)
        -L, --list-gpus
        -lsa, --list-standalone-gpus-also           Also list standalone GPUs in the system along with their temperatures.
                                                    Can be used with the -l, --loop-continuously option
        -lso, --list-standalone-gpus-only           Only list standalone GPUs in the system along with their temperatures.
                                                    Can be used with the -l, --loop-continuously option

The help output lists all the available options, but for now all I am concerned with is getting the temps of our cards. Since we have four GTX 295 cards in a 4U rack case, I am guessing they run pretty hot. I have been told that 100 °C is about the maximum temperature we want to run this card at, so I am hoping we are not above that; otherwise I will have to figure out some more cooling options.

I run the command “nvidia-smi -lso”:


[root@tools ~]# nvidia-smi -lso
GPU 0:
        Product Name            : GeForce GTX 295
        Serial                  : 2074402432753
        PCI ID                  : 5eb10de
        Temperature             : 95 C
GPU 1:
        Product Name            : GeForce GTX 295
        Serial                  : 562607522042
        PCI ID                  : 5eb10de
        Temperature             : 99 C
GPU 2:
        Product Name            : GeForce GTX 295
        Serial                  : 3045216059627
        PCI ID                  : 5e010de
        Temperature             : 100 C
GPU 3:
        Product Name            : GeForce GTX 295
        Serial                  : 1249785487812
        PCI ID                  : 5e010de
        Temperature             : 99 C
GPU 4:
        Product Name            : GeForce GTX 295
        Serial                  : 2761580421585
        PCI ID                  : 5e010de
        Temperature             : 100 C
GPU 5:
        Product Name            : GeForce GTX 295
        Serial                  : 418726093573
        PCI ID                  : 5e010de
        Temperature             : 98 C
GPU 6:
        Product Name            : GeForce GTX 295
        Serial                  : 420974240487
        PCI ID                  : 5e010de
        Temperature             : 96 C
GPU 7:
        Product Name            : GeForce GTX 295
        Serial                  : 1243494032209
        PCI ID                  : 5e010de
        Temperature             : 97 C
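
The listing is regular enough that a few lines of Python can turn it into numbers. This is only a rough sketch: the function names parse_temps and read_temps are made up for illustration, the sample text is copied from the first two GPUs above, and it assumes the output keeps exactly this layout (other driver versions may print something different).

```python
import re
import subprocess

# Sample copied from the first two GPUs in the nvidia-smi -lso listing.
SAMPLE = """\
GPU 0:
        Product Name            : GeForce GTX 295
        Serial                  : 2074402432753
        PCI ID                  : 5eb10de
        Temperature             : 95 C
GPU 1:
        Product Name            : GeForce GTX 295
        Serial                  : 562607522042
        PCI ID                  : 5eb10de
        Temperature             : 99 C
"""

GPU_HEADER = re.compile(r"^GPU\s+(\d+):")
TEMP_LINE = re.compile(r"^\s*Temperature\s*:\s*(\d+)\s*C")


def parse_temps(text):
    """Return {gpu_index: temperature_in_C} from text laid out like the listing."""
    temps = {}
    current = None
    for line in text.splitlines():
        header = GPU_HEADER.match(line)
        if header:
            # Remember which GPU the following indented lines belong to.
            current = int(header.group(1))
            continue
        temp = TEMP_LINE.match(line)
        if temp and current is not None:
            temps[current] = int(temp.group(1))
    return temps


def read_temps():
    """Run nvidia-smi -lso (hypothetical helper; needs the driver installed)."""
    result = subprocess.run(
        ["nvidia-smi", "-lso"], capture_output=True, text=True, check=True
    )
    return parse_temps(result.stdout)


print(parse_temps(SAMPLE))  # prints {0: 95, 1: 99}
```

On the server itself, read_temps() would replace the hard-coded sample, assuming nvidia-smi is on the PATH.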

As you can see, I can now grab the temps of all 8 GPUs (each GTX 295 carries two) over an SSH session, which is what I wanted to do all along. There are other options in this tool that I have not yet explored, but we plan to maybe write a custom Cacti plugin or something similar to monitor and graph GPU temps. I know the solution here may seem simple, but I had literally been looking for days for a way to do this and get the data written to stdout so that I can use it in another script or program.
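
As a starting point for that kind of script, here is a small sketch of what could be done with the readings once they are in hand. The readings are the eight values from the output above and the 100 C limit is the figure I was told; the names hot_gpus and as_fields are invented for this example, and the space-separated name:value line is my assumption about what a graphing tool such as Cacti would want, not something I have tested against it.

```python
# Readings copied from the nvidia-smi -lso output in this post (GPU index -> C).
READINGS = {0: 95, 1: 99, 2: 100, 3: 99, 4: 100, 5: 98, 6: 96, 7: 97}
LIMIT_C = 100  # the ceiling quoted earlier in the post


def hot_gpus(readings, limit=LIMIT_C):
    """Return the sorted GPU indexes whose temperature is at or above limit."""
    return sorted(gpu for gpu, temp in readings.items() if temp >= limit)


def as_fields(readings):
    """Format readings as space-separated name:value pairs, ordered by GPU index."""
    return " ".join(f"gpu{gpu}:{temp}" for gpu, temp in sorted(readings.items()))


print(as_fields(READINGS))
# prints gpu0:95 gpu1:99 gpu2:100 gpu3:99 gpu4:100 gpu5:98 gpu6:96 gpu7:97
print("at or above limit:", hot_gpus(READINGS))
# prints at or above limit: [2, 4]
```

With these numbers, GPUs 2 and 4 are already sitting on the limit, so a check like this would be the first thing to wire into an alert.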