netdev.vger.kernel.org archive mirror
* igb transmit queue timed out, rcu_sched_state detected stall
@ 2011-08-12 16:42 Peter Neal
  2011-08-15 14:33 ` Peter Neal
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Neal @ 2011-08-12 16:42 UTC
  To: netdev

Hi,

I have a machine with 25 interfaces, a mixture of igb and e1000e dual-
and quad-port NICs. It is used to PXE install and test 24 network
appliances at the same time. During a test run, the interfaces often
go up and down and are frequently reconfigured (IPv4 and IPv6, all
done through iproute2). I recently changed some of the scripting that
controls the network setup, and the box now hangs about 4 hours into
the test. I set up a serial console, and the text below was printed
when the box hung. My shell on the serial terminal still responds when
I press return, but any command hangs.

The issue is reproducible on two machines, a Dell R900 (below) and an
R910 in a very similar setup, both running a 64-bit kernel. I initially
found the issue on the Debian-packaged 2.6.32-5-amd64 and then upgraded
to 3.0.1, which shows the same behaviour and produced the information
below.

Please copy me on any replies as I'm not subscribed.

Thanks,


Pete Neal


[10179.824031] ------------[ cut here ]------------
[10179.879205] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xea/0x17e()
[10179.964506] Hardware name: PowerEdge R900
[10180.012375] NETDEV WATCHDOG: eth0 (igb): transmit queue 0 timed out
[10180.087284] Modules linked in: ipmi_si ipmi_devintf ipmi_msghandler
8021q garp stp loop tpm_tis snd_pcm snd_timer snd shpchp soundcore
rng_core dcdbas snd_page_alloc i7300_idle psmouse pcspkr processor tpm
tpm_bios ioatdma evdev thermal_sys serio_raw pci_hotplug button ext3
jbd mbcache sg sr_mod cdrom sd_mod ses crc_t10dif enclosure
ata_generic usbhid hid uhci_hcd ata_piix ehci_hcd libata e1000e
usbcore bnx2 megaraid_sas scsi_mod igb dca [last unloaded:
scsi_wait_scan]
[10180.585411] Pid: 0, comm: swapper Not tainted 3.0.1 #1
[10180.646793] Call Trace:
[10180.675937]  <IRQ>  [<ffffffff810450af>] ? warn_slowpath_common+0x78/0x8c
[10180.757115]  [<ffffffff81045162>] ? warn_slowpath_fmt+0x45/0x4a
[10180.827869]  [<ffffffff81281560>] ? netif_tx_lock+0x43/0x74
[10180.894465]  [<ffffffff810625a9>] ? hrtimer_interrupt+0x114/0x1a6
[10180.967301]  [<ffffffff8128167b>] ? dev_watchdog+0xea/0x17e
[10181.033890]  [<ffffffff810662aa>] ? ktime_get+0x50/0x88
[10181.096319]  [<ffffffff81051dbb>] ? run_timer_softirq+0x1c3/0x290
[10181.169149]  [<ffffffff81281591>] ? netif_tx_lock+0x74/0x74
[10181.235734]  [<ffffffff8104a737>] ? __do_softirq+0xc4/0x1a0
[10181.302324]  [<ffffffff81090afa>] ? handle_irq_event_percpu+0x166/0x184
[10181.381402]  [<ffffffff8132af5c>] ? call_softirq+0x1c/0x30
[10181.446950]  [<ffffffff8100aa3f>] ? do_softirq+0x3f/0x79
[10181.510420]  [<ffffffff8104a507>] ? irq_exit+0x44/0xb5
[10181.571809]  [<ffffffff8100a38a>] ? do_IRQ+0x94/0xaa
[10181.631121]  [<ffffffff81323cd3>] ? common_interrupt+0x13/0x13
[10181.700827]  <EOI>  [<ffffffffa031816e>] ? acpi_idle_enter_simple+0xc7/0xfc [processor]
[10181.796569]  [<ffffffffa031816a>] ? acpi_idle_enter_simple+0xc3/0xfc [processor]
[10181.885006]  [<ffffffff81252d9f>] ? cpuidle_idle_call+0x123/0x1d4
[10181.957836]  [<ffffffff81008dc7>] ? cpu_idle+0xab/0xe1
[10182.019225]  [<ffffffff81692c17>] ? start_kernel+0x3b4/0x3bf
[10182.086857]  [<ffffffff816923c8>] ? x86_64_start_kernel+0x102/0x10f
[10182.161766] ---[ end trace 74b13b200a4ea2df ]---
[10182.217111] igb 0000:10:00.0: eth0: Reset adapter
[10202.300005] INFO: rcu_sched_state detected stall on CPU 10 (t=15000 jiffies)
[10202.304005] INFO: rcu_sched_state detected stall on CPU 4 (t=15000 jiffies)
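
The t= values in these stall reports can be converted to seconds once
the kernel tick rate is known. The following is a minimal sketch, assuming
the t= field counts jiffies since the stalled grace period began and that
printk timestamps track elapsed time closely. The helper name and the
sample list are illustrative; the values are copied from the CPU 10
reports in this log.

```python
# (printk timestamp in seconds, t= value in jiffies) for CPU 10,
# copied from the three rcu_sched_state stall reports in this log.
reports = [
    (10202.300005, 15000),
    (10382.420003, 60030),
    (10562.540002, 105060),
]


def estimate_hz(samples):
    """Estimate ticks per second from the first and last stall report."""
    (t0, j0), (t1, j1) = samples[0], samples[-1]
    return (j1 - j0) / (t1 - t0)


hz = estimate_hz(reports)

# Age of the stall at the first report, in seconds.
stall_age = reports[0][1] / hz

# Approximate printk timestamp at which the stall began.
stall_start = reports[0][0] - stall_age

print(f"estimated HZ: {hz:.1f}")
print(f"stall age at first report: {stall_age:.1f} s")
print(f"stall began near: {stall_start:.1f} s")
```

Under those assumptions the tick rate comes out at about 250 Hz, so
t=15000 corresponds to roughly 60 seconds. That would place the start of
the stall near 10142 s, before the NETDEV WATCHDOG warning at 10179.8 s.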

root@worker-1b.bigrig.dom:~#
[10321.876088] INFO: task kworker/3:1:62 blocked for more than 120 seconds.
[10321.956203] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10322.049830] kworker/3:1     D ffff88042ee03d50     0    62      2 0x00000000
[10322.134133]  ffff88042ee03d50 0000000000000046 0000000000000000
ffff88042ece8000
[10322.222595]  0000000000012680 ffff88042e87bfd8 ffff88042e87bfd8
0000000000012680
[10322.311059]  ffff88042ee03d50 ffff88042e87a010 ffff88042ee03d50
000000013f272680
[10322.399536] Call Trace:
[10322.428684]  [<ffffffff813229aa>] ? schedule_timeout+0x2d/0xd7
[10322.498392]  [<ffffffff81037e3b>] ? update_rq_clock+0x15/0x2f
[10322.567055]  [<ffffffff81094019>] ? rcu_batches_completed+0x8/0x8
[10322.639874]  [<ffffffff8132281e>] ? wait_for_common+0xd1/0x14e
[10322.709583]  [<ffffffff810419a7>] ? try_to_wake_up+0x18d/0x18d
[10322.779288]  [<ffffffff81094019>] ? rcu_batches_completed+0x8/0x8
[10322.852115]  [<ffffffff81086eb6>] ? __stop_cpus+0xc4/0xe1
[10322.916616]  [<ffffffff81086f0c>] ? try_stop_cpus+0x39/0x51
[10322.983199]  [<ffffffff81094019>] ? rcu_batches_completed+0x8/0x8
[10323.056025]  [<ffffffff81095561>] ? synchronize_sched_expedited+0x99/0xc8
[10323.137181]  [<ffffffff812818a9>] ? dev_deactivate_many+0xf4/0x186
[10323.211046]  [<ffffffff81277b4d>] ? __linkwatch_run_queue+0x1a5/0x1a5
[10323.288030]  [<ffffffff81281968>] ? dev_deactivate+0x2d/0x42
[10323.355658]  [<ffffffff812777c8>] ? linkwatch_do_dev+0x9c/0xb2
[10323.425361]  [<ffffffff81277b06>] ? __linkwatch_run_queue+0x15e/0x1a5
[10323.502346]  [<ffffffff81277b6d>] ? linkwatch_event+0x20/0x26
[10323.571015]  [<ffffffff8105b7c8>] ? process_one_work+0x1cc/0x2ea
[10323.642795]  [<ffffffff8105ba13>] ? worker_thread+0x12d/0x247
[10323.711456]  [<ffffffff8105b8e6>] ? process_one_work+0x2ea/0x2ea
[10323.783243]  [<ffffffff8105b8e6>] ? process_one_work+0x2ea/0x2ea
[10323.855034]  [<ffffffff8105ec89>] ? kthread+0x7a/0x82
[10323.915382]  [<ffffffff8132ae64>] ? kernel_thread_helper+0x4/0x10
[10323.988199]  [<ffffffff8105ec0f>] ? kthread_worker_fn+0x147/0x147
[10324.061025]  [<ffffffff8132ae60>] ? gs_change+0x13/0x13
[10324.123455] INFO: task irqbalance:1793 blocked for more than 120 seconds.
[10324.204596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10324.298223] irqbalance      D ffff88042d99a8e0     0  1793      1 0x00000000
[10324.382521]  ffff88042d99a8e0 0000000000000082 ffff880400000000
ffff88042ec6df60
[10324.470993]  0000000000012680 ffff88042a8f5fd8 ffff88042a8f5fd8
0000000000012680
[10324.559457]  ffff88042d99a8e0 ffff88042a8f4010 ffff88042a8f5cc8
000000012d91d000
[10324.647926] Call Trace:
[10324.677066]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[10324.751974]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10324.815444]  [<ffffffff81269ca6>] ? dev_load+0x9/0x70
[10324.875792]  [<ffffffff8126b017>] ? dev_ioctl+0x4ad/0x62e
[10324.940302]  [<ffffffff810eb735>] ? get_partial_node+0x15/0x7b
[10325.010005]  [<ffffffff8125844c>] ? sock_do_ioctl+0x2f/0x36
[10325.076582]  [<ffffffff81258853>] ? sock_ioctl+0x205/0x212
[10325.142126]  [<ffffffff810f3d2d>] ? get_empty_filp+0x9c/0x12b
[10325.210790]  [<ffffffff810ff9bb>] ? do_vfs_ioctl+0x467/0x4b4
[10325.278414]  [<ffffffff81259ed4>] ? sock_alloc_file+0xae/0x10c
[10325.348117]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[10325.411574]  [<ffffffff810ffa53>] ? sys_ioctl+0x4b/0x70
[10325.473999]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10325.547865] INFO: task snmpd:1820 blocked for more than 120 seconds.
[10325.623803] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10325.717436] snmpd           D ffff88042b412fb0     0  1820      1 0x00000000
[10325.801728]  ffff88042b412fb0 0000000000000086 ffffffff00000000
ffff88042edb0000
[10325.890194]  0000000000012680 ffff88042e961fd8 ffff88042e961fd8
0000000000012680
[10325.978657]  ffff88042b412fb0 ffff88042e960010 ffff8804127346c0
000000018113e49b
[10326.067122] Call Trace:
[10326.096260]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[10326.171167]  [<ffffffff810fa4b3>] ? dget+0x12/0x1e
[10326.228388]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10326.291853]  [<ffffffff8126aba4>] ? dev_ioctl+0x3a/0x62e
[10326.355315]  [<ffffffff81103734>] ? dput+0x29/0xe9
[10326.412538]  [<ffffffff810eb735>] ? get_partial_node+0x15/0x7b
[10326.482242]  [<ffffffff810ec082>] ? kmem_cache_alloc+0x2a/0xe1
[10326.551952]  [<ffffffff8125844c>] ? sock_do_ioctl+0x2f/0x36
[10326.618530]  [<ffffffff81258853>] ? sock_ioctl+0x205/0x212
[10326.684084]  [<ffffffff810f3d2d>] ? get_empty_filp+0x9c/0x12b
[10326.752746]  [<ffffffff810ff9bb>] ? do_vfs_ioctl+0x467/0x4b4
[10326.820372]  [<ffffffff81259ed4>] ? sock_alloc_file+0xae/0x10c
[10326.890072]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[10326.953533]  [<ffffffff810ffa53>] ? sys_ioctl+0x4b/0x70
[10327.015958]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10327.089837] INFO: task tcpdump:12446 blocked for more than 120 seconds.
[10327.168898] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10327.262522] tcpdump         D ffff88042e9ae630     0 12446  10117 0x00000000
[10327.346820]  ffff88042e9ae630 0000000000000086 0000003100000000
ffff88042ed44420
[10327.435286]  0000000000012680 ffff8803e6533fd8 ffff8803e6533fd8
0000000000012680
[10327.523751]  ffff88042e9ae630 ffff8803e6532010 ffff8803e6533d38
000000013fffbe00
[10327.612231] Call Trace:
[10327.641369]  [<ffffffff813229aa>] ? schedule_timeout+0x2d/0xd7
[10327.711072]  [<ffffffff810e5028>] ? alloc_pages_vma+0x101/0x11d
[10327.781816]  [<ffffffff8132281e>] ? wait_for_common+0xd1/0x14e
[10327.851522]  [<ffffffff810419a7>] ? try_to_wake_up+0x18d/0x18d
[10327.921231]  [<ffffffff8119d912>] ? hweight_long+0x5/0x6
[10327.984692]  [<ffffffff8119d94b>] ? __bitmap_weight+0x38/0x78
[10328.053357]  [<ffffffff810954c2>] ? synchronize_sched+0x4c/0x52
[10328.124101]  [<ffffffff8105c988>] ? alloc_pid+0x368/0x368
[10328.188599]  [<ffffffff813078f9>] ? packet_set_ring+0x226/0x3ac
[10328.259341]  [<ffffffff81307d3c>] ? packet_setsockopt+0x2bd/0x518
[10328.332166]  [<ffffffff813076a8>] ? packet_getsockopt+0x1c5/0x1f0
[10328.404990]  [<ffffffff81259b6c>] ? sys_setsockopt+0x7d/0x9c
[10328.472613]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10328.546481] INFO: task sshd:12447 blocked for more than 120 seconds.
[10328.622420] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10328.716055] sshd            D ffff88042e9ac420     0 12447   1828 0x00000004
[10328.800362]  ffff88042e9ac420 0000000000000086 0000093a00000000
ffff88042edb0000
[10328.888838]  0000000000012680 ffff8803f37b3fd8 ffff8803f37b3fd8
0000000000012680
[10328.977309]  ffff88042e9ac420 ffff8803f37b2010 0000000000000001
00000001f37b3b88
[10329.065787] Call Trace:
[10329.094923]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[10329.169827]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10329.233289]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[10329.298836]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[10329.368538]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[10329.439282]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[10329.513145]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[10329.578695]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[10329.641113]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[10329.708736]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[10329.781560]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[10329.846073]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[10329.909533]  [<ffffffff81259f56>] ? sock_map_fd+0x24/0x2d
[10329.974037]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10330.047905] INFO: task sshd:12450 blocked for more than 120 seconds.
[10330.123851] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10330.217469] sshd            D ffff88042a845890     0 12450   1828 0x00000000
[10330.301759]  ffff88042a845890 0000000000000086 000280da00000000
ffff88042edb51c0
[10330.390234]  0000000000012680 ffff8803e417ffd8 ffff8803e417ffd8
0000000000012680
[10330.478698]  ffff88042a845890 ffff8803e417e010 ffff8803e417fb88
000000018110028a
[10330.567162] Call Trace:
[10330.596301]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[10330.671207]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10330.734668]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[10330.800222]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[10330.869929]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[10330.940668]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[10331.014530]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[10331.080084]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[10331.142504]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[10331.210128]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[10331.282952]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[10331.347455]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[10331.410918]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10382.420003] INFO: rcu_sched_state detected stall on CPU 10 (t=60030 jiffies)
[10382.423999] INFO: rcu_sched_state detected stall on CPU 4 (t=60030 jiffies)
[10451.484093] INFO: task kworker/3:1:62 blocked for more than 120 seconds.
[10451.564192] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10451.657811] kworker/3:1     D ffff88042ee03d50     0    62      2 0x00000000
[10451.742095]  ffff88042ee03d50 0000000000000046 0000000000000000
ffff88042ece8000
[10451.830553]  0000000000012680 ffff88042e87bfd8 ffff88042e87bfd8
0000000000012680
[10451.919018]  ffff88042ee03d50 ffff88042e87a010 ffff88042ee03d50
000000013f272680
[10452.007482] Call Trace:
[10452.036621]  [<ffffffff813229aa>] ? schedule_timeout+0x2d/0xd7
[10452.106321]  [<ffffffff81037e3b>] ? update_rq_clock+0x15/0x2f
[10452.174984]  [<ffffffff81094019>] ? rcu_batches_completed+0x8/0x8
[10452.247808]  [<ffffffff8132281e>] ? wait_for_common+0xd1/0x14e
[10452.317506]  [<ffffffff810419a7>] ? try_to_wake_up+0x18d/0x18d
[10452.387207]  [<ffffffff81094019>] ? rcu_batches_completed+0x8/0x8
[10452.460049]  [<ffffffff81086eb6>] ? __stop_cpus+0xc4/0xe1
[10452.524548]  [<ffffffff81086f0c>] ? try_stop_cpus+0x39/0x51
[10452.591126]  [<ffffffff81094019>] ? rcu_batches_completed+0x8/0x8
[10452.663944]  [<ffffffff81095561>] ? synchronize_sched_expedited+0x99/0xc8
[10452.745082]  [<ffffffff812818a9>] ? dev_deactivate_many+0xf4/0x186
[10452.818945]  [<ffffffff81277b4d>] ? __linkwatch_run_queue+0x1a5/0x1a5
[10452.895927]  [<ffffffff81281968>] ? dev_deactivate+0x2d/0x42
[10452.963546]  [<ffffffff812777c8>] ? linkwatch_do_dev+0x9c/0xb2
[10453.033244]  [<ffffffff81277b06>] ? __linkwatch_run_queue+0x15e/0x1a5
[10453.110223]  [<ffffffff81277b6d>] ? linkwatch_event+0x20/0x26
[10453.178887]  [<ffffffff8105b7c8>] ? process_one_work+0x1cc/0x2ea
[10453.250669]  [<ffffffff8105ba13>] ? worker_thread+0x12d/0x247
[10453.319326]  [<ffffffff8105b8e6>] ? process_one_work+0x2ea/0x2ea
[10453.391105]  [<ffffffff8105b8e6>] ? process_one_work+0x2ea/0x2ea
[10453.462889]  [<ffffffff8105ec89>] ? kthread+0x7a/0x82
[10453.523231]  [<ffffffff8132ae64>] ? kernel_thread_helper+0x4/0x10
[10453.596054]  [<ffffffff8105ec0f>] ? kthread_worker_fn+0x147/0x147
[10453.668880]  [<ffffffff8132ae60>] ? gs_change+0x13/0x13
[10453.731306] INFO: task irqbalance:1793 blocked for more than 120 seconds.
[10453.812448] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10453.906070] irqbalance      D ffff88042d99a8e0     0  1793      1 0x00000000
[10453.990357]  ffff88042d99a8e0 0000000000000082 ffff880400000000
ffff88042ec6df60
[10454.078814]  0000000000012680 ffff88042a8f5fd8 ffff88042a8f5fd8
0000000000012680
[10454.167280]  ffff88042d99a8e0 ffff88042a8f4010 ffff88042a8f5cc8
000000012d91d000
[10454.255744] Call Trace:
[10454.284883]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[10454.359780]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10454.423237]  [<ffffffff81269ca6>] ? dev_load+0x9/0x70
[10454.483576]  [<ffffffff8126b017>] ? dev_ioctl+0x4ad/0x62e
[10454.548088]  [<ffffffff810eb735>] ? get_partial_node+0x15/0x7b
[10454.617789]  [<ffffffff8125844c>] ? sock_do_ioctl+0x2f/0x36
[10454.684365]  [<ffffffff81258853>] ? sock_ioctl+0x205/0x212
[10454.749906]  [<ffffffff810f3d2d>] ? get_empty_filp+0x9c/0x12b
[10454.818572]  [<ffffffff810ff9bb>] ? do_vfs_ioctl+0x467/0x4b4
[10454.886192]  [<ffffffff81259ed4>] ? sock_alloc_file+0xae/0x10c
[10454.955894]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[10455.019351]  [<ffffffff810ffa53>] ? sys_ioctl+0x4b/0x70
[10455.081769]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10455.155632] INFO: task ntpd:1808 blocked for more than 120 seconds.
[10455.230529] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10455.324148] ntpd            D ffff88042bf96630     0  1808      1 0x00000004
[10455.408430]  ffff88042bf96630 0000000000000082 ffff880400000000
ffff88042eceed00
[10455.496902]  0000000000012680 ffff88042a8affd8 ffff88042a8affd8
0000000000012680
[10455.585370]  ffff88042bf96630 ffff88042a8ae010 0000000700200200
000000010000ea00
[10455.673841] Call Trace:
[10455.702978]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[10455.777876]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10455.841336]  [<ffffffff8126aba4>] ? dev_ioctl+0x3a/0x62e
[10455.904797]  [<ffffffff81053d92>] ? __set_task_blocked+0x5a/0x61
[10455.976575]  [<ffffffff810ec082>] ? kmem_cache_alloc+0x2a/0xe1
[10456.046279]  [<ffffffff8125844c>] ? sock_do_ioctl+0x2f/0x36
[10456.112861]  [<ffffffff81258853>] ? sock_ioctl+0x205/0x212
[10456.178400]  [<ffffffff810f3d2d>] ? get_empty_filp+0x9c/0x12b
[10456.247062]  [<ffffffff810ff9bb>] ? do_vfs_ioctl+0x467/0x4b4
[10456.314679]  [<ffffffff81259ed4>] ? sock_alloc_file+0xae/0x10c
[10456.384377]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[10456.447833]  [<ffffffff810ffa53>] ? sys_ioctl+0x4b/0x70
[10456.510251]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10456.584106] INFO: task zebra:1815 blocked for more than 120 seconds.
[10456.660054] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10456.753680] zebra           D ffff88042bfecaf0     0  1815      1 0x00000000
[10456.837969]  ffff88042bfecaf0 0000000000000082 ffff880400000000
ffff88042edb1b40
[10456.926427]  0000000000012680 ffff88042ba69fd8 ffff88042ba69fd8
0000000000012680
[10457.014891]  ffff88042bfecaf0 ffff88042ba68010 ffff88042ba69d48
0000000181289426
[10457.103356] Call Trace:
[10457.132494]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[10457.207392]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10457.270850]  [<ffffffff81100218>] ? __pollwait+0xd0/0xd0
[10457.334306]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[10457.399842]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[10457.469540]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[10457.540280]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[10457.614142]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[10457.679679]  [<ffffffff812580fb>] ? __sock_recvmsg_nosec+0x29/0x69
[10457.753538]  [<ffffffff81262a9c>] ? copy_from_user+0x18/0x30
[10457.821166]  [<ffffffff81262de0>] ? verify_iovec+0x46/0x98
[10457.886703]  [<ffffffff81259811>] ? __sys_sendmsg+0x1c4/0x23b
[10457.955361]  [<ffffffff810f2687>] ? do_sync_read+0xb1/0xea
[10458.020900]  [<ffffffff8103cbdb>] ? thread_group_times+0x32/0x8e
[10458.092683]  [<ffffffff81053c57>] ? __lock_task_sighand+0x4a/0x72
[10458.165506]  [<ffffffff810431f2>] ? mmput+0x9/0xde
[10458.222724]  [<ffffffff81057e16>] ? getrusage+0x34b/0x365
[10458.287227]  [<ffffffff810ec082>] ? kmem_cache_alloc+0x2a/0xe1
[10458.356924]  [<ffffffff812599e6>] ? sys_sendmsg+0x39/0x58
[10458.421420]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[10562.540002] INFO: rcu_sched_state detected stall on CPU 10 (t=105060 jiffies)
[10562.543993] INFO: rcu_sched_state detected stall on CPU 4 (t=105060 jiffies)
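
The hung-task reports above are long and repetitive, so a summary can
make them easier to compare. The following is a minimal sketch, assuming
the console output has been saved as text, that groups each blocked task
by the first recognisable function in its trace. The function name, the
MARKERS list and the SAMPLE excerpt are illustrative choices, not part of
any kernel tool. Frames printed with "?" are not guaranteed to be on the
real call path, so the grouping is only a reading aid.

```python
import re
from collections import defaultdict

# "\s+" also matches a newline, so reports wrapped by the mail client
# or the serial console still parse.
TASK_RE = re.compile(r"INFO: task (\S+):(\d+)\s+blocked for")
FRAME_RE = re.compile(r"\]\s+\?\s+(\w+)\+0x")

# Functions of interest, checked in this order (an illustrative choice).
MARKERS = (
    "rtnetlink_rcv",
    "dev_ioctl",
    "synchronize_sched_expedited",
    "synchronize_sched",
)


def summarize(log_text):
    """Map each marker function to the tasks whose trace contains it."""
    groups = defaultdict(set)
    # Split the log so that every chunk starts at an "INFO: " message.
    for report in re.split(r"(?=INFO: )", log_text):
        match = TASK_RE.match(report)
        if not match:
            continue  # not a hung-task report (e.g. an RCU stall line)
        symbols = FRAME_RE.findall(report)
        marker = next((m for m in MARKERS if m in symbols), "other")
        groups[marker].add(f"{match.group(1)}:{match.group(2)}")
    return {marker: sorted(tasks) for marker, tasks in groups.items()}


# Trimmed excerpt of the log above, used only to exercise the parser.
SAMPLE = r"""
[10321.876088] INFO: task kworker/3:1:62
blocked for more than 120 seconds.
[10323.056025]  [<ffffffff81095561>] ? synchronize_sched_expedited+0x99/0xc8
[10323.137181]  [<ffffffff812818a9>] ? dev_deactivate_many+0xf4/0x186
[10323.502346]  [<ffffffff81277b6d>] ? linkwatch_event+0x20/0x26
[10324.123455] INFO: task irqbalance:1793 blocked for more than 120 seconds.
[10324.751974]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10324.875792]  [<ffffffff8126b017>] ? dev_ioctl+0x4ad/0x62e
[10328.546481] INFO: task sshd:12447 blocked for more than 120 seconds.
[10329.169827]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[10329.233289]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[10382.420003] INFO: rcu_sched_state detected stall on CPU 10 (t=60030 jiffies)
"""

summary = summarize(SAMPLE)
for marker, tasks in summary.items():
    print(f"{marker}: {', '.join(tasks)}")
```

Applied to the full log in this message, the same grouping would list
kworker/3:1 under synchronize_sched_expedited, tcpdump under
synchronize_sched, irqbalance, snmpd and ntpd under dev_ioctl, and the
sshd and zebra tasks under rtnetlink_rcv.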


* Re: igb transmit queue timed out, rcu_sched_state detected stall
  2011-08-12 16:42 igb transmit queue timed out, rcu_sched_state detected stall Peter Neal
@ 2011-08-15 14:33 ` Peter Neal
  2011-08-15 18:07   ` Ben Hutchings
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Neal @ 2011-08-15 14:33 UTC
  To: netdev

I have updated the BIOS, iproute2, and the e1000e and igb drivers, but
I am still seeing issues. Any thoughts?

Thanks,


Pete

[ 7765.881893] bnx2 0000:0b:00.0: eth25: NIC Copper Link is Up, 1000
Mbps full duplex, receive & transmit flow control ON
[ 7767.395912] bnx2 0000:0b:00.0: eth25: NIC Copper Link is Down
[ 7769.832448] bnx2 0000:0b:00.0: eth25: NIC Copper Link is Up, 1000
Mbps full duplex, receive & transmit flow control ON
[ 7778.124580] igb: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7783.001120] igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7783.900216] igb: eth4 NIC Link is Down
[ 7786.204560] igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7788.168523] igb: eth18 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7789.414458] igb: eth18 NIC Link is Down
[ 7791.702958] igb: eth18 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7794.188210] igb: eth12 NIC Link is Down
[ 7796.432599] igb: eth12 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7826.864941] e1000e: eth21 NIC Link is Up 1000 Mbps Full Duplex,
Flow Control: Rx/Tx
[ 7836.544159] igb: eth17 NIC Link is Down
[ 7864.112307] igb: eth6 NIC Link is Down
[ 7917.072196] igb: eth16 NIC Link is Down
[ 7919.356618] igb: eth16 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7920.848574] igb: eth10 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7926.272173] igb: eth4 NIC Link is Down
[ 7965.212587] igb: eth6 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7966.200164] igb: eth6 NIC Link is Down
[ 7968.742002] igb: eth6 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7971.112151] e1000e: eth20 NIC Link is Down
[ 7973.084709] igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 7973.452998] e1000e: eth20 NIC Link is Up 1000 Mbps Full Duplex,
Flow Control: Rx/Tx
[ 7974.300193] igb: eth4 NIC Link is Down
[ 7976.616567] igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX/TX
[ 8041.200005] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
[ 8041.835999] INFO: rcu_bh_state detected stall on CPU 2 (t=15000 jiffies)
[ 8060.251724] bnx2 0000:0b:00.0: eth25: NIC Copper Link is Down
[ 8096.268889] bnx2 0000:0b:00.0: eth25: NIC Copper Link is Up, 1000
Mbps full duplex, receive & transmit flow control ON
[ 8161.920070] INFO: task irqbalance:1777 blocked for more than 120 seconds.
[ 8162.001213] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 8162.094833] irqbalance      D ffff88042e9abd50     0  1777      1 0x00000000
[ 8162.179119]  ffff88042e9abd50 0000000000000086 ffff88042e9abd50
ffff88042ec6a8e0
[ 8162.267569]  0000000000012680 ffff88042b5fdfd8 ffff88042b5fdfd8
0000000000012680
[ 8162.356047]  ffff88042e9abd50 ffff88042b5fc010 0000000100000000
ffff88042b995f60
[ 8162.444498] Call Trace:
[ 8162.473641]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8162.548540]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8162.611996]  [<ffffffff81269ca6>] ? dev_load+0x9/0x70
[ 8162.672333]  [<ffffffff8126b017>] ? dev_ioctl+0x4ad/0x62e
[ 8162.736839]  [<ffffffff8125844c>] ? sock_do_ioctl+0x2f/0x36
[ 8162.803415]  [<ffffffff81258853>] ? sock_ioctl+0x205/0x212
[ 8162.868959]  [<ffffffff810f3d2d>] ? get_empty_filp+0x9c/0x12b
[ 8162.937616]  [<ffffffff810ff9bb>] ? do_vfs_ioctl+0x467/0x4b4
[ 8163.005235]  [<ffffffff81259ed4>] ? sock_alloc_file+0xae/0x10c
[ 8163.074938]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8163.138393]  [<ffffffff810ffa53>] ? sys_ioctl+0x4b/0x70
[ 8163.200810]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8163.274672] INFO: task snmpd:1797 blocked for more than 120 seconds.
[ 8163.350609] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 8163.444228] snmpd           D ffff88042aa4d890     0  1797      1 0x00000000
[ 8163.528516]  ffff88042aa4d890 0000000000000086 ffff88042aa4d890
ffff88042ec6a8e0
[ 8163.616962]  0000000000012680 ffff88042d7b7fd8 ffff88042d7b7fd8
0000000000012680
[ 8163.705415]  ffff88042aa4d890 ffff88042d7b6010 0000000100000000
ffff88042b995f60
[ 8163.793870] Call Trace:
[ 8163.823001]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8163.897900]  [<ffffffff810fa4b3>] ? dget+0x12/0x1e
[ 8163.955115]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8164.018572]  [<ffffffff8126aba4>] ? dev_ioctl+0x3a/0x62e
[ 8164.082034]  [<ffffffff81103734>] ? dput+0x29/0xe9
[ 8164.139253]  [<ffffffff810eb735>] ? get_partial_node+0x15/0x7b
[ 8164.208949]  [<ffffffff810f3cfa>] ? get_empty_filp+0x69/0x12b
[ 8164.277606]  [<ffffffff8125844c>] ? sock_do_ioctl+0x2f/0x36
[ 8164.344186]  [<ffffffff81258853>] ? sock_ioctl+0x205/0x212
[ 8164.409721]  [<ffffffff810f3d2d>] ? get_empty_filp+0x9c/0x12b
[ 8164.478377]  [<ffffffff810ff9bb>] ? do_vfs_ioctl+0x467/0x4b4
[ 8164.545996]  [<ffffffff81259ed4>] ? sock_alloc_file+0xae/0x10c
[ 8164.615696]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8164.679156]  [<ffffffff810ffa53>] ? sys_ioctl+0x4b/0x70
[ 8164.741574]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8164.815438] INFO: task sshd:806 blocked for more than 120 seconds.
[ 8164.889300] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 8164.982918] sshd            D ffff88042d62e630     0   806   1814 0x00000000
[ 8165.067195]  ffff88042d62e630 0000000000000086 ffff88042d62e630
ffff88042edb5890
[ 8165.155645]  0000000000012680 ffff8803fa8fffd8 ffff8803fa8fffd8
0000000000012680
[ 8165.244109]  ffff88042d62e630 ffff8803fa8fe010 0000000100000000
ffff88042b995f60
[ 8165.332561] Call Trace:
[ 8165.361699]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8165.436600]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8165.500064]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8165.565602]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8165.635298]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8165.706044]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8165.779901]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8165.845439]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8165.907854]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8165.975475]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8166.048289]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8166.112785]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8166.176242]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8166.250105] INFO: task sshd:807 blocked for more than 120 seconds.
[ 8166.323964] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 8166.417581] sshd            D ffff88042a491470     0   807   1814 0x00000000
[ 8166.501865]  ffff88042a491470 0000000000000086 ffff88042a491470
ffff88042edb73d0
[ 8166.590321]  0000000000012680 ffff8803e7e37fd8 ffff8803e7e37fd8
0000000000012680
[ 8166.678785]  ffff88042a491470 ffff8803e7e36010 0000000100000000
ffff88042b995f60
[ 8166.767243] Call Trace:
[ 8166.796385]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8166.871289]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8166.934748]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8167.000284]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8167.069987]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8167.140727]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8167.214584]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8167.280121]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8167.342543]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8167.410162]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8167.482978]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8167.547477]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8167.610937]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8167.684797] INFO: task sshd:808 blocked for more than 120 seconds.
[ 8167.758654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 8167.852272] sshd            D ffff88042a494420     0   808   1814 0x00000000
[ 8167.936556]  ffff88042a494420 0000000000000082 ffff88042a494420
ffff88042e9a8000
[ 8168.025020]  0000000000012680 ffff88042daa9fd8 ffff88042daa9fd8
0000000000012680
[ 8168.113482]  ffff88042a494420 ffff88042daa8010 0000000100000000
ffff88042b995f60
[ 8168.201947] Call Trace:
[ 8168.231085]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8168.305981]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8168.369438]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8168.434979]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8168.504672]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8168.575409]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8168.649267]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8168.714811]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8168.777228]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8168.844848]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8168.917663]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8168.982164]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8169.045621]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8169.119480] INFO: task sshd:809 blocked for more than 120 seconds.
[ 8169.193337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 8169.286959] sshd            D ffff88042b8186d0     0   809   1814 0x00000000
[ 8169.371232]  ffff88042b8186d0 0000000000000086 ffff88042b8186d0
ffff88042e9a8da0
[ 8169.459683]  0000000000012680 ffff8803ef68bfd8 ffff8803ef68bfd8
0000000000012680
[ 8169.548141]  ffff88042b8186d0 ffff8803ef68a010 0000000100000000
ffff88042b995f60
[ 8169.636603] Call Trace:
[ 8169.665736]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8169.740634]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8169.804095]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8169.869632]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8169.939330]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8170.010066]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8170.083929]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8170.149469]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8170.211886]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8170.279502]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8170.352323]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8170.416818]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8170.480274]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8170.554131] INFO: task smtpserver.pl:815 blocked for more than 120 seconds.
[ 8170.637354] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 8170.730974] smtpserver.pl   D ffff88042b81caf0     0   815  31144 0x00000004
[ 8170.815252]  ffff88042b81caf0 0000000000000082 ffff88042b81caf0
ffff88042ee03d50
[ 8170.903706]  0000000000012680 ffff88042e929fd8 ffff88042e929fd8
0000000000012680
[ 8170.992174]  ffff88042b81caf0 ffff88042e928010 0000000100000000
ffff88042b995f60
[ 8171.080630] Call Trace:
[ 8171.109767]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8171.184668]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8171.248126]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8171.313665]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8171.383361]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8171.454104]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8171.527962]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8171.593500]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8171.655915]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8171.723538]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8171.796356]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8171.860854]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8171.924311]  [<ffffffff81259f56>] ? sock_map_fd+0x24/0x2d
[ 8171.988812]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8172.062672] INFO: task sshd:823 blocked for more than 120 seconds.
[ 8172.136527] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8172.230147] sshd            D ffff88042a496d00     0   823   1814 0x00000000
[ 8172.314431]  ffff88042a496d00 0000000000000082 ffff88042a496d00 ffff88042edb73d0
[ 8172.402881]  0000000000012680 ffff88042a9cdfd8 ffff88042a9cdfd8 0000000000012680
[ 8172.491319]  ffff88042a496d00 ffff88042a9cc010 0000000100000000 ffff88042b995f60
[ 8172.579783] Call Trace:
[ 8172.608922]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8172.683819]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8172.747277]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8172.812815]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8172.882512]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8172.953247]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8173.027105]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8173.092645]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8173.155065]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8173.222680]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8173.295499]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8173.360015]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8173.423471]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8173.497329] INFO: task sshd:872 blocked for more than 120 seconds.
[ 8173.571185] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8173.664810] sshd            D ffff88042d6286d0     0   872   1814 0x00000000
[ 8173.749088]  ffff88042d6286d0 0000000000000082 ffff88042d6286d0 ffff88042edb5890
[ 8173.837545]  0000000000012680 ffff8803eabf3fd8 ffff8803eabf3fd8 0000000000012680
[ 8173.926005]  ffff88042d6286d0 ffff8803eabf2010 0000000100000000 ffff88042b995f60
[ 8174.014447] Call Trace:
[ 8174.043585]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8174.118483]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8174.181943]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8174.247482]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8174.317178]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8174.387918]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8174.461779]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8174.527317]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8174.589734]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8174.657352]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8174.730176]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8174.794674]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8174.858132]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8174.931986] INFO: task sshd:873 blocked for more than 120 seconds.
[ 8175.005847] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8175.099461] sshd            D ffff88042a494af0     0   873   1814 0x00000000
[ 8175.183739]  ffff88042a494af0 0000000000000082 ffff88042a494af0 ffff88042edb73d0
[ 8175.272197]  0000000000012680 ffff8804084f7fd8 ffff8804084f7fd8 0000000000012680
[ 8175.360649]  ffff88042a494af0 ffff8804084f6010 0000000100000000 ffff88042b995f60
[ 8175.449106] Call Trace:
[ 8175.478243]  [<ffffffff81322df5>] ? __mutex_lock_common+0x10c/0x172
[ 8175.553145]  [<ffffffff81322f21>] ? mutex_lock+0x1a/0x2c
[ 8175.616602]  [<ffffffff81275f2e>] ? rtnetlink_rcv+0xe/0x28
[ 8175.682139]  [<ffffffff8128951f>] ? netlink_unicast+0xea/0x152
[ 8175.751838]  [<ffffffff81289c74>] ? netlink_sendmsg+0x246/0x266
[ 8175.822579]  [<ffffffff8125809a>] ? __sock_sendmsg_nosec+0x25/0x5d
[ 8175.896438]  [<ffffffff812590dc>] ? sock_sendmsg+0x83/0x9b
[ 8175.961976]  [<ffffffff810378ce>] ? __wake_up+0x35/0x46
[ 8176.024393]  [<ffffffff8125838d>] ? copy_from_user+0x18/0x30
[ 8176.092023]  [<ffffffff81258e23>] ? move_addr_to_kernel+0x2c/0x4c
[ 8176.164846]  [<ffffffff812595fc>] ? sys_sendto+0xf7/0x137
[ 8176.229343]  [<ffffffff810f0dc5>] ? fd_install+0x27/0x4e
[ 8176.292800]  [<ffffffff81329d52>] ? system_call_fastpath+0x16/0x1b
[ 8221.320008] INFO: rcu_sched_state detected stall on CPU 2 (t=60030 jiffies)
[ 8221.955999] INFO: rcu_bh_state detected stall on CPU 2 (t=60030 jiffies)
[ 8401.440001] INFO: rcu_sched_state detected stall on CPU 2 (t=105060 jiffies)
[ 8402.076002] INFO: rcu_bh_state detected stall on CPU 2 (t=105060 jiffies)
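
The RCU stall lines above carry enough information for a rough cross-check of when the stall began. The following is only a sketch, not part of the original report: it assumes that the `t=` value counts jiffies since the stalled grace period started, and it infers the tick rate (HZ) from two consecutive reports rather than from the kernel config, which the thread does not give. The names `STALL_RE`, `parse_stalls` and `estimate_hz` are hypothetical helpers introduced for illustration.

```python
import re

# Matches lines such as:
# [ 8221.320008] INFO: rcu_sched_state detected stall on CPU 2 (t=60030 jiffies)
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: (?P<flavor>rcu_\w+) detected stall "
    r"on CPU (?P<cpu>\d+) \(t=(?P<jiffies>\d+) jiffies\)"
)


def parse_stalls(log_text):
    """Return (flavor, cpu, timestamp_seconds, jiffies) for each stall report."""
    return [
        (m["flavor"], int(m["cpu"]), float(m["ts"]), int(m["jiffies"]))
        for m in STALL_RE.finditer(log_text)
    ]


def estimate_hz(first, second):
    """Estimate ticks per second from two reports of the same RCU flavor."""
    (_, _, ts1, j1), (_, _, ts2, j2) = first, second
    return (j2 - j1) / (ts2 - ts1)


# The four stall lines exactly as they appear in the log above.
LOG = """\
[ 8221.320008] INFO: rcu_sched_state detected stall on CPU 2 (t=60030 jiffies)
[ 8221.955999] INFO: rcu_bh_state detected stall on CPU 2 (t=60030 jiffies)
[ 8401.440001] INFO: rcu_sched_state detected stall on CPU 2 (t=105060 jiffies)
[ 8402.076002] INFO: rcu_bh_state detected stall on CPU 2 (t=105060 jiffies)
"""

sched = [s for s in parse_stalls(LOG) if s[0] == "rcu_sched_state"]
hz = estimate_hz(sched[0], sched[1])

# Assumption: t counts jiffies since the grace period began, so subtracting
# t / HZ from the report timestamp approximates when the stall started.
onset = sched[0][2] - sched[0][3] / hz

print(f"estimated HZ: {hz:.1f}")
print(f"estimated stall onset: {onset:.1f} s after boot")
```

Under those assumptions the two rcu_sched_state reports are 45030 jiffies and about 180.12 seconds apart, which suggests a tick rate of roughly 250 and puts the start of the stall near 7981 seconds after boot. That would be consistent with the hung-task messages around 8170 seconds reporting tasks blocked for more than 120 seconds, though it does not by itself identify the cause.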


* Re: igb transmit queue timed out, rcu_sched_state detected stall
  2011-08-15 14:33 ` Peter Neal
@ 2011-08-15 18:07   ` Ben Hutchings
  0 siblings, 0 replies; 3+ messages in thread
From: Ben Hutchings @ 2011-08-15 18:07 UTC (permalink / raw)
  To: Peter Neal; +Cc: netdev

On Mon, 2011-08-15 at 15:33 +0100, Peter Neal wrote:
> I have updated the BIOS, iproute2, e1000e and igb drivers, but am
> still seeing issues, any thoughts?
[...]

This issue doesn't seem network-related.  However, it might be related to
this regression, which I saw discussed just a few minutes ago:

Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=40092
Subject       : RCU stall in linux-3.0.0
Submitter     : Philip Armstrong <phil@kantaka.co.uk>
Date          : 2011-07-25 21:44 (21 days old)

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

