CentOS IRQ Errors, Server: kernel: Disabling IRQ #177

Recently we have been doing a lot of testing of oclHashcat on CentOS Linux servers. oclHashcat takes advantage of the GPUs (Graphics Processing Units) on Nvidia or ATI graphics cards. One of the servers we have been testing with has four Nvidia GeForce GTX 295 cards, and at times it reported an error stating that the kernel was disabling an IRQ. Below we describe the error in more detail, along with the kernel parameter that was added to resolve it. Although we hit this issue with oclHashcat specifically, the same error could occur with other applications or other Linux distributions, where the same resolution may apply.

Kernel IRQ Disable Error On CentOS:

bash

server kernel: Disabling IRQ #177

In the real message the server's hostname appears before “kernel”, and the IRQ number will not necessarily be 177, since any IRQ can experience the issue. After the error was printed to the shell I was working in, I investigated and found the following output in the messages log.

CentOS Messages Log File With IRQ Error Details:

bash

Oct 16 12:20:29 server kernel: irq 177: nobody cared (try booting with the "irqpoll" option)
Oct 16 12:20:29 server kernel:
Oct 16 12:20:29 server kernel: Call Trace:
Oct 16 12:20:29 server kernel:  <IRQ>  [<ffffffff800babaf>] __report_bad_irq+0x30/0x7d
Oct 16 12:20:29 server kernel:  [<ffffffff800bade2>] note_interrupt+0x1e6/0x227
Oct 16 12:20:29 server kernel:  [<ffffffff800ba2de>] __do_IRQ+0xbd/0x103
Oct 16 12:20:29 server kernel:  [<ffffffff8001231e>] __do_softirq+0x89/0x133
Oct 16 12:20:29 server kernel:  [<ffffffff8006c9bf>] do_IRQ+0xe7/0xf5
Oct 16 12:20:29 server kernel:  [<ffffffff8005726a>] mwait_idle+0x0/0x4a
Oct 16 12:20:29 server kernel:  [<ffffffff8005d615>] ret_from_intr+0x0/0xa
Oct 16 12:20:29 server kernel:  <EOI>  [<ffffffff88d7828b>] :acpi_cpufreq:acpi_cpufreq_target+0x0/0x3f8
Oct 16 12:20:29 server kernel:  [<ffffffff800572a0>] mwait_idle+0x36/0x4a
Oct 16 12:20:29 server kernel:  [<ffffffff8004947b>] cpu_idle+0x95/0xb8
Oct 16 12:20:29 server kernel:  [<ffffffff80077474>] start_secondary+0x498/0x4a7
Oct 16 12:20:29 server kernel:
Oct 16 12:20:29 server kernel: handlers:
Oct 16 12:20:29 server kernel: [<ffffffff801f1eda>] (usb_hcd_irq+0x0/0x55)
Oct 16 12:20:29 server kernel: [<ffffffff801f1eda>] (usb_hcd_irq+0x0/0x55)
Oct 16 12:20:29 server kernel: [<ffffffff886fa92c>] (nv_kern_isr+0x0/0x54 [nvidia])
Oct 16 12:20:29 server last message repeated 2 times
Oct 16 12:20:29 server kernel: Disabling IRQ #177

As seen above in the messages log file output, there is a conflict on IRQ #177: the kernel reports that “nobody cared”, meaning none of the handlers registered on that IRQ (usb_hcd_irq, listed twice, and nv_kern_isr from the nvidia module) claimed the interrupt, so the kernel disables the IRQ. Notice that the very first line of the output recommends passing irqpoll to the kernel during boot, which is easy to do by modifying the grub.conf file on your server. You might also be curious what irqpoll actually is, so below is a brief description of irqpoll followed by an example of a modified grub.conf file that passes irqpoll to the kernel.
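The handlers section of a trace like the one above lists every driver registered on the affected IRQ. As a rough sketch, a small shell function can pull those details out of a log file. The function name irq_report is hypothetical, and the patterns assume the exact message layout shown in this article, so they may need adjusting for other kernels or syslog formats.

```shell
# Hypothetical helper: summarize "Disabling IRQ" reports found in a log file.
# Usage: irq_report /var/log/messages
irq_report() {
    log="$1"

    # IRQ numbers the kernel gave up on,
    # e.g. "kernel: Disabling IRQ #177" becomes "disabled 177".
    sed -n 's/.*kernel: Disabling IRQ #\([0-9][0-9]*\).*/disabled \1/p' "$log" | sort -u

    # Handler symbols from the "handlers:" list,
    # e.g. "[<ffffffff801f1eda>] (usb_hcd_irq+0x0/0x55)" becomes "handler usb_hcd_irq".
    # Call Trace lines have no parenthesized symbol, so they are skipped.
    sed -n 's/.*kernel: \[<[0-9a-f]*>\] (\([A-Za-z0-9_]*\)+.*/handler \1/p' "$log" | sort -u
}
```

On a running system, the devices currently sharing the line can also be checked with something like grep '^ *177:' /proc/interrupts, substituting the IRQ number from your own error.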

CentOS Kernel Option irqpoll:

When an interrupt is not handled, search all known interrupt handlers for it and also check all handlers on each timer interrupt. This is intended to get systems with badly broken firmware running.

Example CentOS Grub Configuration File With irqpoll Option:

bash

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-194.17.1.el5.centos.plus)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.17.1.el5.centos.plus ro root=/dev/VolGroup00/LogVol00 irqpoll
        initrd /initrd-2.6.18-194.17.1.el5.centos.plus.img
title CentOS (2.6.18-164.11.1.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-164.11.1.el5 ro root=/dev/VolGroup00/LogVol00 irqpoll
        initrd /initrd-2.6.18-164.11.1.el5.img
title CentOS (2.6.18-164.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-164.el5 ro root=/dev/VolGroup00/LogVol00
        initrd /initrd-2.6.18-164.el5.img

Note that there are three kernels the server can boot from, but the “default=0” line tells GRUB to boot the first entry, which in this case is “CentOS (2.6.18-194.17.1.el5.centos.plus)”. In each entry, the line that starts with kernel, below the title line, is where irqpoll is added at the end. In this example the first two entries carry irqpoll and the third does not. The grub.conf file is located in /boot/grub/, and there is also a symbolic link to it in the /etc directory. Use your favorite editor, such as vi, to open /boot/grub/grub.conf and add irqpoll to the end of the kernel line, then reboot so the new parameter takes effect. Do not change anything else in grub.conf, since an error in this file could prevent the server from booting.
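If you would rather not edit the file by hand, the change can be scripted. The following is only a sketch under a few assumptions: the function name add_irqpoll and the /tmp/grub.conf.new path are made up for illustration, the function writes the result to standard output instead of touching grub.conf, and it appends irqpoll to every kernel line, whereas the example above modifies only the first two entries.

```shell
# Hypothetical helper: print a copy of a grub.conf with irqpoll appended to
# each "kernel" line that does not already have it. The input file is not
# modified, so the result can be reviewed before it is installed.
#
# Possible usage (run as root, and check the diff before copying anything):
#   cp -p /boot/grub/grub.conf /boot/grub/grub.conf.bak
#   add_irqpoll /boot/grub/grub.conf > /tmp/grub.conf.new
#   diff /boot/grub/grub.conf /tmp/grub.conf.new
add_irqpoll() {
    sed -e '/^[[:space:]]*kernel[[:space:]]/{' \
        -e '/[[:space:]]irqpoll$/b' \
        -e '/[[:space:]]irqpoll[[:space:]]/b' \
        -e 's/[[:space:]]*$/ irqpoll/' \
        -e '}' "$1"
}
```

Commented-out lines such as "#          kernel /vmlinuz-version ..." start with # and are left alone. After the reboot, grep -w irqpoll /proc/cmdline should print the boot command line if the option was picked up.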
