All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Bezdeka, Florian" <florian.bezdeka@siemens.com>
To: "xenomai@xenomai.org" <xenomai@xenomai.org>
Subject: irq pipeline: Netdev watchdog triggered
Date: Mon, 6 Sep 2021 12:58:12 +0000	[thread overview]
Message-ID: <e28cb7b73db9bfe4faa08f3c6468a59ed578a366.camel@siemens.com> (raw)

Hi,

here is another IRQ_PIPELINE problem we're facing on on of our quite
big Xenomai machines: 

Platform:
Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
2 Sockets, 6 cores per socket, 2 threads per core
(2 NUMA nodes)

As soon as we enable CONFIG_IRQ_PIPELINE the netdev watchdog triggers.
There is no specific load on the system. In most cases the watchdog
triggers right after booting up. There is no xenomai task running when
the crash happens.

The problem was detected during reboot tests. Disabling
CONFIG_IRQ_PIPELINE helped. We didn't run into a problem for >400
reboots.

Any advise how to debug that? Dovetail torture tests have already been
run and there was no error reported. CONFIG_DEBUG_IRQ_PIPELINE is
enabled as well and did not help so far.

The stacktrace follows but I guess it's not really helpful:

------------[ cut here ]------------
[   42.114151] NETDEV WATCHDOG: enp7s0f0 (igb): transmit queue 5 timed out
[   42.114173] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:467 dev_watchdog+0x205/0x210
[   42.163593] Modules linked in: overlay x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul mei_me iTCO_wdt iTCO_vendor_support mei hed ipmi_si ipmi_devintf ipmi_msghandler msr fuse configfs ip_tables x_tables ipv6 autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx hid_generic usbhid hid raid6_pq libcrc32c raid1 raid0 multipath linear lpc_ich ahci i2c_i801 xhci_pci i2c_smbus mfd_core libahci mpt3sas igb i2c_algo_bit raid_class xhci_hcd scsi_transport_sas
[   42.435191] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.61-dovetaildbg-noxeno-pipedbg-0 #1
[   42.486198] Hardware name: Siemens <...snip...>, BIOS s2.0a_M1 08/02/2016
[   42.546050] IRQ stage: Linux
[   42.563263] RIP: 0010:dev_watchdog+0x205/0x210
[   42.589837] Code: 91 26 fe ff eb b1 4c 89 ef c6 05 9e 1d b7 00 01 e8 90 86 fb ff 44 89 e1 4c 89 ee 48 c7 c7 48 41 de b7 48 89 c2 e8 8a c1 0a 00 <0f> 0b eb 92 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f0 80 a7 d8 00 00
[   42.702205] RSP: 0018:ffff9ff4c0003ec0 EFLAGS: 00010282
[   42.733459] RAX: 0000000000000000 RBX: ffff930cc42438c0 RCX: 4000000000000002
[   42.776148] RDX: 4000000000000004 RSI: 0000000000000000 RDI: ffffffffb70cb824
[   42.818841] RBP: ffff930cc7e903b8 R08: 000000000007daee R09: ffff9ff4c0003cc8
[   42.861531] R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000006
[   42.904224] R13: ffff930cc7e90000 R14: ffffffffb7686860 R15: ffff9ff4c0003f18
[   42.946916] FS:  0000000000000000(0000) GS:ffff931bff800000(0000) knlGS:0000000000000000
[   42.995328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   43.029700] CR2: 000000c000e50000 CR3: 00000001057ca001 CR4: 00000000001706f0
[   43.072391] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   43.115082] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   43.157774] Call Trace:
[   43.172388]  <IRQ>
[   43.184400]  ? pfifo_fast_dequeue+0x1e0/0x1e0
[   43.210453]  call_timer_fn+0x2b/0x100
[   43.232344]  __run_timers.part.0+0x1d9/0x250
[   43.257879]  ? ktime_get+0x35/0x90
[   43.278205]  run_timer_softirq+0x26/0x50
[   43.301661]  __do_softirq+0x105/0x30b
[   43.323550]  asm_call_irq_on_stack+0xf/0x20
[   43.348562]  </IRQ>
[   43.361099]  do_softirq_own_stack+0x37/0x40
[   43.386108]  irq_exit_rcu+0xaa/0xe0
[   43.406960]  arch_do_IRQ_pipelined+0x78/0x130
[   43.433012]  sync_current_irq_stage+0x177/0x250
[   43.460100]  __inband_irq_enable+0x4c/0x50
[   43.484593]  cpuidle_enter_state+0x100/0x3a0
[   43.510125]  cpuidle_enter+0x29/0x40
[   43.531498]  do_idle+0x1eb/0x230
[   43.550787]  cpu_startup_entry+0x19/0x20
[   43.574241]  start_kernel+0x56d/0x590
[   43.596134]  secondary_startup_64_no_verify+0xb0/0xbb
[   43.626343] ---[ end trace b219258fce8dc761 ]---
[   43.654095] igb 0000:07:00.0 enp7s0f0: Reset adapter
[   46.090948] igb 0000:07:00.0: exceed max 2 second

Best regards,
Florian

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux

             reply	other threads:[~2021-09-06 12:58 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-06 12:58 Bezdeka, Florian [this message]
2021-09-06 13:31 ` irq pipeline: Netdev watchdog triggered Jan Kiszka
2021-09-06 14:56 ` Philippe Gerum
2021-09-06 15:01   ` Bezdeka, Florian
2021-09-06 15:38   ` Bezdeka, Florian
2021-09-06 16:16     ` Philippe Gerum
2021-10-07 14:50       ` Bezdeka, Florian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e28cb7b73db9bfe4faa08f3c6468a59ed578a366.camel@siemens.com \
    --to=florian.bezdeka@siemens.com \
    --cc=xenomai@xenomai.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.