* [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
@ 2011-07-02 21:33 Ronny Meeus
  2011-07-04  8:06 ` Ronny Meeus
  0 siblings, 1 reply; 9+ messages in thread
From: Ronny Meeus @ 2011-07-02 21:33 UTC (permalink / raw)
  To: adeos-main

Hello

We have a Freescale P4040 (PowerPC) based board running Linux+Xenomai.
Here is some relevant information copied from the boot log:

[    0.000000] Using P4080 DS machine description
[    0.000000] Memory CAM mapping: 256/256/256 Mb, residual: 1248Mb
[    0.000000] Linux version 2.6.35.7-hg98224f47aa52-dirty (xxxxx@domain.hid) (gcc version 4.4.6 (Buildroot 2011.05-hg98224f47aa52) ) #1 SMP Fri Jul 1 08:42:30 CEST 2011

[    0.000000] clocksource: timebase mult[6aaaf09] shift[22] registered
[    0.000000] I-pipe 2.12-01: pipeline enabled.
[    0.000000] Console: colour dummy device 80x25
[    0.181150] pid_max: default: 32768 minimum: 301

[    2.093842] I-pipe: Domain Xenomai registered.
[    2.146016] Xenomai: hal/powerpc started.
[    2.193904] Xenomai: scheduling class idle registered.
[    2.255328] Xenomai: scheduling class rt registered.
[    2.319092] Xenomai: real-time nucleus v2.5.5 (Ghosts) loaded.
[    2.388207] Xenomai: starting native API services.
[    2.445249] Xenomai: starting pSOS+ services.
[    2.497478] highmem bounce pool size: 64 pages
[    2.550932] fuse init (API version 7.14)

Although the P4040 has 4 cores, we currently use only one of them.
This is specified in the device tree we are using.
The kernel is built with SMP enabled.

I start 2 test applications on this board.
The first application sends raw Ethernet packets on a link that is
physically looped back. As a result, all packets we send are received
(unmodified) on the same interface.
The second application listens on the same Ethernet interface, also
via a raw Ethernet socket.
Both applications are plain Linux applications, so no Xenomai code is used.
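
For readers who want to reproduce a similar setup, the sender side can be
sketched as follows. This is a minimal illustration, not the actual test
program from this report: the EtherType 0x88B5, the source MAC address, the
payload layout and all function names are assumptions made for the example.
Opening an AF_PACKET socket works on Linux only and needs CAP_NET_RAW.

```python
import socket
import struct

ETH_P_EXPERIMENTAL = 0x88B5  # "local experimental" EtherType, assumed here
ETH_MIN_FRAME = 60           # minimum frame length, excluding the 4-byte FCS


def build_frame(dst: bytes, src: bytes, payload: bytes,
                ethertype: int = ETH_P_EXPERIMENTAL) -> bytes:
    """Return a raw Ethernet frame, zero-padded to the minimum length."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    frame = dst + src + struct.pack("!H", ethertype) + payload
    return frame.ljust(ETH_MIN_FRAME, b"\x00")


def open_raw_socket(ifname: str,
                    ethertype: int = ETH_P_EXPERIMENTAL) -> socket.socket:
    """Open an AF_PACKET raw socket bound to ifname (needs CAP_NET_RAW)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ethertype))
    sock.bind((ifname, 0))  # protocol 0 keeps the one given to socket()
    return sock


def send_frames(ifname: str, count: int) -> None:
    """Hypothetical sender loop: push `count` numbered frames onto ifname."""
    broadcast = b"\xff" * 6
    source = b"\x02\x00\x00\x00\x00\x01"  # locally administered MAC, made up
    with open_raw_socket(ifname) as sock:
        for seq in range(count):
            sock.send(build_frame(broadcast, source, struct.pack("!I", seq)))
```

Calling send_frames("eth0", 30000) as root would roughly correspond to the
load described here; the interface name is again only an example.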

One side effect of using raw Ethernet sockets is that all packets sent
on one socket are also delivered to all other raw Ethernet sockets.
This means that the listening application receives each packet twice:
once when it is sent and a second time when it comes back via the
loop. (A side question: can this behavior be disabled somehow? We
basically do not want to receive the packets we send ourselves ...)
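
Regarding the side question, one possible approach is to filter in the
receiver: on AF_PACKET sockets, recvfrom() reports a packet type for every
frame, and frames that originate from the local host are tagged
PACKET_OUTGOING (see packet(7)). The sketch below assumes this mechanism; the
function names are made up for the example. Frames that come back over the
physical loop are not tagged as outgoing, so they would still be delivered.

```python
import socket

# Value from <linux/if_packet.h>; Python on Linux normally exports it too.
PACKET_OUTGOING = getattr(socket, "PACKET_OUTGOING", 4)


def is_own_transmission(address: tuple) -> bool:
    """True if recvfrom() tagged the frame as sent by the local host.

    On AF_PACKET sockets, recvfrom() returns the address as
    (ifname, proto, pkttype, hatype, hwaddr).
    """
    return address[2] == PACKET_OUTGOING


def receive_incoming(sock: socket.socket, bufsize: int = 2048) -> bytes:
    """Block until a frame arrives that was not transmitted locally."""
    while True:
        frame, address = sock.recvfrom(bufsize)
        if not is_own_transmission(address):
            return frame
```

This only hides the copies from the application; the kernel still delivers
them to the socket. As far as I know, a socket option to suppress them
entirely (PACKET_IGNORE_OUTGOING) exists only in much newer kernels, not in
2.6.35.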

After a very short time (roughly 30000 packets sent), both
applications block completely, and 60 seconds later the console
reports that the kernel is locked up:

[  805.307213] BUG: soft lockup - CPU#0 stuck for 61s! [send_eth_socket:1907]
[  805.389519] Modules linked in: reboot_helper dpll_si53xx crave ndps_a_cpld
[  805.471880] NIP: c000cc4c LR: 00000000 CTR: 00000000
[  805.531274] REGS: c1f87040 TRAP: 0000   Not tainted (2.6.35.7-hg98224f47aa52-dirty)
[  805.623992] MSR: 00029002 <EE,ME,CE>  CR: 00000000  XER: 00000000
[  805.696972] TASK = ec7116d0[1907] 'send_eth_socket' THREAD: ec6aa000 CPU: 0
[  805.778248] GPR00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  805.878359] GPR08: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  805.978452] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  806.078571] GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  806.180773] NIP [c000cc4c] udelay+0x24/0x30
[  806.230782] LR [00000000] (null)
[  806.269334] Call Trace:
[  806.298521] [efff3b50] [c00071b4] show_stack+0x78/0x18c (unreliable)
[  806.374600] [efff3b90] [c00078c4] show_regs+0x200/0x2ec
[  806.437125] [efff3bc0] [c00658d4] softlockup_tick+0x1dc/0x23c
[  806.505897] [efff3bf0] [c003cc50] run_local_timers+0x1c/0x2c
[  806.573626] [efff3c00] [c003cca4] update_process_times+0x44/0x80
[  806.645528] [efff3c20] [c0059bc4] tick_sched_timer+0xd0/0x128
[  806.714307] [efff3c50] [c004d8f0] __run_hrtimer+0x68/0x14c
[  806.779958] [efff3c70] [c004efa4] hrtimer_interrupt+0x1d8/0x41c
[  806.850812] [efff3cf0] [c000d8d8] timer_interrupt+0x1b4/0x238
[  806.919586] [efff3d10] [c0009ac4] __ipipe_do_timer+0x44/0x54
[  806.987315] [efff3d20] [c006d448] __ipipe_sync_stage+0x1d0/0x27c
[  807.059212] [efff3d60] [c0009728] __ipipe_grab_timer+0x104/0x12c
[  807.131112] [efff3d70] [c00129e0] __ipipe_ret_from_except+0x0/0xc
[  807.204063] --- Exception: 901 at _raw_spin_lock+0x30/0x3c
[  807.204068]     LR = tpacket_rcv+0x264/0x570
[  807.320754] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
[  807.397875] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
[  807.470811] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
[  807.539583] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
[  807.616693] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
[  807.684426] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
[  807.749031] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
[  807.814683] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
[  807.880333] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
[  807.947022] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
[  808.008503] --- Exception: ec6abbb0 at 0xec6abb70
[  808.008507]     LR = 0xec4e6c50
[  808.102274] [ec6abad0] [c00357cc] irq_exit+0x60/0xb8 (unreliable)
[  808.175227] [ec6abae0] [c0009b5c] __ipipe_do_IRQ+0x88/0xc0
[  808.240872] [ec6abb00] [c006d468] __ipipe_sync_stage+0x1f0/0x27c
[  808.312771] [ec6abb40] [c00095f4] __ipipe_handle_irq+0x1b8/0x1e8
[  808.384669] [ec6abb70] [c00098dc] __ipipe_grab_irq+0x18c/0x1bc
[  808.454482] [ec6abba0] [c00129e0] __ipipe_ret_from_except+0x0/0xc
[  808.527425] --- Exception: 501 at _raw_spin_lock+0x14/0x3c
[  808.527430]     LR = tpacket_rcv+0x264/0x570
[  808.644114] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
[  808.721232] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
[  808.794171] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
[  808.861901] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
[  808.925465] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
[  808.987988] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
[  809.055718] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
[  809.122407] --- Exception: c01 at 0x48051f00
[  809.122411]     LR = 0x4808e030
[  809.210966] Instruction dump:
[  809.246401] 7d204850 7f891840 419cfff0 7c421378 4e800020 3d20c04c 800967e0 7c0301d6
[  809.339215] 7d2c42a6 48000008 7c210b78 <7d6c42a6> <7d695850> 7f8b0040 419cfff0 7c421378
[  874.025894] BUG: soft lockup - CPU#0 stuck for 61s! [send_eth_socket:1907]
[  874.108198] Modules linked in: reboot_helper dpll_si53xx crave ndps_a_cpld
[  874.190551] NIP: c000cc48 LR: 00000000 CTR: 00000000
[  874.249937] REGS: c1f87040 TRAP: 0000   Not tainted (2.6.35.7-hg98224f47aa52-dirty)
[  874.342658] MSR: 00029002 <EE,ME,CE>  CR: 00000000  XER: 00000000
[  874.415638] TASK = ec7116d0[1907] 'send_eth_socket' THREAD: ec6aa000 CPU: 0
[  874.496907] GPR00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  874.597018] GPR08: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  874.697124] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  874.797235] GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  874.899421] NIP [c000cc40] udelay+0x18/0x30
[  874.949434] LR [00000000] (null)
[  874.987986] Call Trace:
[  875.017170] [efff3b50] [c00071b4] show_stack+0x78/0x18c (unreliable)
[  875.093240] [efff3b90] [c00078c4] show_regs+0x200/0x2ec
[  875.155763] [efff3bc0] [c00658d4] softlockup_tick+0x1dc/0x23c
[  875.224534] [efff3bf0] [c003cc50] run_local_timers+0x1c/0x2c
[  875.292265] [efff3c00] [c003cca4] update_process_times+0x44/0x80
[  875.364164] [efff3c20] [c0059bc4] tick_sched_timer+0xd0/0x128
[  875.432936] [efff3c50] [c004d8f0] __run_hrtimer+0x68/0x14c
[  875.498584] [efff3c70] [c004efa4] hrtimer_interrupt+0x1d8/0x41c
[  875.569437] [efff3cf0] [c000d8d8] timer_interrupt+0x1b4/0x238
[  875.638211] [efff3d10] [c0009ac4] __ipipe_do_timer+0x44/0x54
[  875.705941] [efff3d20] [c006d448] __ipipe_sync_stage+0x1d0/0x27c
[  875.777839] [efff3d60] [c0009728] __ipipe_grab_timer+0x104/0x12c
[  875.849736] [efff3d70] [c00129e0] __ipipe_ret_from_except+0x0/0xc
[  875.922680] --- Exception: 901 at _raw_spin_lock+0x30/0x3c
[  875.922684]     LR = tpacket_rcv+0x264/0x570
[  876.039367] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
[  876.116479] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
[  876.189418] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
[  876.258189] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
[  876.335297] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
[  876.403025] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
[  876.467632] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
[  876.533280] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
[  876.598926] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
[  876.665618] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
[  876.727097] --- Exception: ec6abbb0 at 0xec6abb70
[  876.727101]     LR = 0xec4e6c50
[  876.820868] [ec6abad0] [c00357cc] irq_exit+0x60/0xb8 (unreliable)
[  876.893814] [ec6abae0] [c0009b5c] __ipipe_do_IRQ+0x88/0xc0
[  876.959459] [ec6abb00] [c006d468] __ipipe_sync_stage+0x1f0/0x27c
[  877.031358] [ec6abb40] [c00095f4] __ipipe_handle_irq+0x1b8/0x1e8
[  877.103256] [ec6abb70] [c00098dc] __ipipe_grab_irq+0x18c/0x1bc
[  877.173069] [ec6abba0] [c00129e0] __ipipe_ret_from_except+0x0/0xc
[  877.246012] --- Exception: 501 at _raw_spin_lock+0x14/0x3c
[  877.246017]     LR = tpacket_rcv+0x264/0x570
[  877.362701] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
[  877.439819] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
[  877.512758] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
[  877.580487] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
[  877.644052] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
[  877.706575] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
[  877.774306] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
[  877.840994] --- Exception: c01 at 0x48051f00
[  877.840998]     LR = 0x4808e030
[  877.929553] Instruction dump:
[  877.964988] 419cfff0 7c421378 4e800020 3d20c04c 800967e0 7c0301d6 7d2c42a6 48000008
[  878.057802] 7c210b78 7d6c42a6 7d695850 7f8b0040 419cfff0 7c421378 4e800020 3d20c04a
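
As a reading aid for the dump, the "--- Exception:" markers show where one
context was interrupted by another. The small helper below extracts them and
names the vectors. The vector table is an assumption based on the usual
powerpc conventions (0x500 external interrupt, 0x900 decrementer, 0xc00
system call, with the low bit used as a flag rather than as part of the
vector); the function name is made up for the example.

```python
import re

# Assumed mapping of powerpc exception vectors to names.
VECTORS = {
    0x500: "external interrupt",
    0x900: "decrementer (timer)",
    0xC00: "system call",
}

MARKER = re.compile(r"--- Exception: ([0-9a-f]+) at (\S+)")


def decode_exceptions(log: str) -> list:
    """Return (vector name, interrupted location) for each marker in log."""
    result = []
    for match in MARKER.finditer(log):
        trap = int(match.group(1), 16)
        # Mask the low bit, which is a flag and not part of the vector.
        name = VECTORS.get(trap & ~1, "unknown")
        result.append((name, match.group(2)))
    return result
```

Applied to the traces here and read from the bottom up, this would suggest
the following sequence: a system call (sendto) reaches tpacket_rcv, an
external interrupt arrives while it is in _raw_spin_lock, the softirq that
runs on interrupt exit enters tpacket_rcv as well and spins in
_raw_spin_lock, and the timer interrupt on top of that detects the lockup.
The marker with the value ec6abbb0 does not look like a real vector and is
reported as unknown.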

I do not fully understand this dump, but it looks like both the
receive path (running in softirq context) and my transmitting
application are blocked on the spinlock taken in the tpacket_rcv
function:

[  876.039367] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
[  876.116479] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
[  876.189418] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
[  876.258189] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
[  876.335297] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
[  876.403025] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
[  876.467632] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
[  876.533280] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
[  876.598926] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
[  876.665618] [ec6abab0] [c000480c] do_softirq+0xb4/0xec

and

[  877.362701] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
[  877.439819] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
[  877.512758] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
[  877.580487] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
[  877.644052] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
[  877.706575] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
[  877.774306] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c

Is my analysis correct?
If so, could this have anything to do with the IPIPE mechanism we are
using (perhaps a known issue)?
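
The suspected pattern can be illustrated with a toy model: on a single CPU, a
non-recursive spinlock is held by one context, an interrupt-driven handler on
the same CPU tries to take the same lock, and the holder cannot run again
until the handler returns. The code below is only a simulation under that
assumption, with made-up names and a bounded spin so that it terminates. It
does not model the kernel's real locking rules and says nothing about why the
softirq was allowed to run at that point.

```python
class ToySpinLock:
    """Non-recursive lock: a second acquire spins until the lock is released."""

    def __init__(self):
        self.locked = False

    def try_acquire(self, max_spins: int) -> bool:
        """Spin up to max_spins times; return True once the lock is taken."""
        for _ in range(max_spins):
            if not self.locked:
                self.locked = True
                return True
        return False

    def release(self) -> None:
        self.locked = False


def transmit_path(lock: ToySpinLock, interrupt_handler) -> bool:
    """Model the sender: take the lock, get interrupted, then release.

    Returns whether the interrupt handler managed to take the lock.
    """
    assert lock.try_acquire(max_spins=1)
    # The handler runs on the same (only) CPU while the lock is held,
    # so the holder cannot make progress until the handler returns.
    handler_got_lock = interrupt_handler(lock)
    lock.release()
    return handler_got_lock


def receive_softirq(lock: ToySpinLock) -> bool:
    """Model the receive softirq: spin on the same lock, bounded here."""
    got_it = lock.try_acquire(max_spins=100_000)
    if got_it:
        lock.release()
    return got_it
```

In this model, transmit_path(ToySpinLock(), receive_softirq) returns False:
the handler never gets the lock, however long it spins. A real spinlock has
no spin limit, so the equivalent situation would never end, which is
consistent with a CPU reported as stuck.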

Any help would be much appreciated.

Thanks,
Ronny


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-02 21:33 [Adeos-main] Kernel blocked during send/receive raw ethernet packets Ronny Meeus
@ 2011-07-04  8:06 ` Ronny Meeus
  2011-07-04  8:20   ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Ronny Meeus @ 2011-07-04  8:06 UTC (permalink / raw)
  To: adeos-main

Hello

I ran a new test, this time with an older kernel (Linux version
2.6.34.6): the same tests were executed, but on a pure Linux build
(no IPIPE included). The issue can no longer be reproduced in this
environment; my tests keep running without locking up.

My next steps are:
- Run the same test on 2.6.35.7 without IPIPE. This environment is
currently building.
- Include only IPIPE, without Xenomai, and redo the test.

Best regards
Ronny


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-04  8:06 ` Ronny Meeus
@ 2011-07-04  8:20   ` Philippe Gerum
  2011-07-04 11:42     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2011-07-04  8:20 UTC (permalink / raw)
  To: Ronny Meeus; +Cc: adeos-main

On Mon, 2011-07-04 at 10:06 +0200, Ronny Meeus wrote:
> On Sat, Jul 2, 2011 at 11:33 PM, Ronny Meeus <ronny.meeus@domain.hid> wrote:
> > Hello
> >
> > we use have a FreeScale P4040 (powerpc) based board running Linux+Xenomai.
> > I copy-paste here some information I found in the bootlog:
> >
> > [    0.000000] Using P4080 DS machine description
> > [    0.000000] Memory CAM mapping: 256/256/256 Mb, residual: 1248Mb
> > [    0.000000] Linux version 2.6.35.7-hg98224f47aa52-dirty
> > (xxxxx@domain.hid) (gcc version 4.4.6 (Buildroot 2011.05-hg98224f47aa52)
> > ) #1 SMP Fri Jul 1 08:42:30 CEST 2011
> >
> > [    0.000000] clocksource: timebase mult[6aaaf09] shift[22] registered
> > [    0.000000] I-pipe 2.12-01: pipeline enabled.
> > [    0.000000] Console: colour dummy device 80x25
> > [    0.181150] pid_max: default: 32768 minimum: 301
> >
> > [    2.093842] I-pipe: Domain Xenomai registered.
> > [    2.146016] Xenomai: hal/powerpc started.
> > [    2.193904] Xenomai: scheduling class idle registered.
> > [    2.255328] Xenomai: scheduling class rt registered.
> > [    2.319092] Xenomai: real-time nucleus v2.5.5 (Ghosts) loaded.
> > [    2.388207] Xenomai: starting native API services.
> > [    2.445249] Xenomai: starting pSOS+ services.
> > [    2.497478] highmem bounce pool size: 64 pages
> > [    2.550932] fuse init (API version 7.14)
> >
> > Although the P4040 has 4 cores, we are currently using only 1 core.
> > This is specified in the device tree we are using.
> > The kernel runs SMP enabled.
> >
> > I start 2 test applications on this board.
> > The first application is sending raw Ethernet packets on a link that
> > is put in loop. The result is that all packets we send are received
> > (unmodified) back on the same interface.
> > The second application is listening on the same Ethernet interface
> > also via a raw Ethernet socket.
> > Both application are plain Linux application so no Xenomai code is used.
> >
> > One side effect of using raw Ethernet sockets is that all packets sent
> > on one socket will also be received by all other raw Ethernet sockets.
> > This means that the listening application will receive each packet 2
> > times: once while sending and a second time when it is received via
> > the loop. (A side question: can the behavior be disabled somehow? We
> > basically do not want to receive all packets we send ...)
> >
> > After a very short time (sending something like 30000 packets), both
> > applications block completely and 60 seconds later an indication is
> > displayed on the console that the kernel is locked.
> >
> > [  805.307213] BUG: soft lockup - CPU#0 stuck for 61s! [send_eth_socket:1907]
> > [  805.389519] Modules linked in: reboot_helper dpll_si53xx crave ndps_a_cpld
> > [  805.471880] NIP: c000cc4c LR: 00000000 CTR: 00000000
> > [  805.531274] REGS: c1f87040 TRAP: 0000   Not tainted
> > (2.6.35.7-hg98224f47aa52-dirty)
> > [  805.623992] MSR: 00029002 <EE,ME,CE>  CR: 00000000  XER: 00000000
> > [  805.696972] TASK = ec7116d0[1907] 'send_eth_socket' THREAD: ec6aa000 CPU: 0
> > [  805.778248] GPR00: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  805.878359] GPR08: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  805.978452] GPR16: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  806.078571] GPR24: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  806.180773] NIP [c000cc4c] udelay+0x24/0x30
> > [  806.230782] LR [00000000] (null)
> > [  806.269334] Call Trace:
> > [  806.298521] [efff3b50] [c00071b4] show_stack+0x78/0x18c (unreliable)
> > [  806.374600] [efff3b90] [c00078c4] show_regs+0x200/0x2ec
> > [  806.437125] [efff3bc0] [c00658d4] softlockup_tick+0x1dc/0x23c
> > [  806.505897] [efff3bf0] [c003cc50] run_local_timers+0x1c/0x2c
> > [  806.573626] [efff3c00] [c003cca4] update_process_times+0x44/0x80
> > [  806.645528] [efff3c20] [c0059bc4] tick_sched_timer+0xd0/0x128
> > [  806.714307] [efff3c50] [c004d8f0] __run_hrtimer+0x68/0x14c
> > [  806.779958] [efff3c70] [c004efa4] hrtimer_interrupt+0x1d8/0x41c
> > [  806.850812] [efff3cf0] [c000d8d8] timer_interrupt+0x1b4/0x238
> > [  806.919586] [efff3d10] [c0009ac4] __ipipe_do_timer+0x44/0x54
> > [  806.987315] [efff3d20] [c006d448] __ipipe_sync_stage+0x1d0/0x27c
> > [  807.059212] [efff3d60] [c0009728] __ipipe_grab_timer+0x104/0x12c
> > [  807.131112] [efff3d70] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [  807.204063] --- Exception: 901 at _raw_spin_lock+0x30/0x3c
> > [  807.204068]     LR = tpacket_rcv+0x264/0x570
> > [  807.320754] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [  807.397875] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
> > [  807.470811] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
> > [  807.539583] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
> > [  807.616693] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
> > [  807.684426] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
> > [  807.749031] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
> > [  807.814683] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
> > [  807.880333] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
> > [  807.947022] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
> > [  808.008503] --- Exception: ec6abbb0 at 0xec6abb70
> > [  808.008507]     LR = 0xec4e6c50
> > [  808.102274] [ec6abad0] [c00357cc] irq_exit+0x60/0xb8 (unreliable)
> > [  808.175227] [ec6abae0] [c0009b5c] __ipipe_do_IRQ+0x88/0xc0
> > [  808.240872] [ec6abb00] [c006d468] __ipipe_sync_stage+0x1f0/0x27c
> > [  808.312771] [ec6abb40] [c00095f4] __ipipe_handle_irq+0x1b8/0x1e8
> > [  808.384669] [ec6abb70] [c00098dc] __ipipe_grab_irq+0x18c/0x1bc
> > [  808.454482] [ec6abba0] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [  808.527425] --- Exception: 501 at _raw_spin_lock+0x14/0x3c
> > [  808.527430]     LR = tpacket_rcv+0x264/0x570
> > [  808.644114] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [  808.721232] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
> > [  808.794171] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
> > [  808.861901] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
> > [  808.925465] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
> > [  808.987988] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
> > [  809.055718] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
> > [  809.122407] --- Exception: c01 at 0x48051f00
> > [  809.122411]     LR = 0x4808e030
> > [  809.210966] Instruction dump:
> > [  809.246401] 7d204850 7f891840 419cfff0 7c421378 4e800020 3d20c04c
> > 800967e0 7c0301d6
> > [  809.339215] 7d2c42a6 48000008 7c210b78 <7d6c42a6> <7d695850>
> > 7f8b0040 419cfff0 7c421378
> > [  874.025894] BUG: soft lockup - CPU#0 stuck for 61s! [send_eth_socket:1907]
> > [  874.108198] Modules linked in: reboot_helper dpll_si53xx crave ndps_a_cpld
> > [  874.190551] NIP: c000cc48 LR: 00000000 CTR: 00000000
> > [  874.249937] REGS: c1f87040 TRAP: 0000   Not tainted
> > (2.6.35.7-hg98224f47aa52-dirty)
> > [  874.342658] MSR: 00029002 <EE,ME,CE>  CR: 00000000  XER: 00000000
> > [  874.415638] TASK = ec7116d0[1907] 'send_eth_socket' THREAD: ec6aa000 CPU: 0
> > [  874.496907] GPR00: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  874.597018] GPR08: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  874.697124] GPR16: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  874.797235] GPR24: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  874.899421] NIP [c000cc40] udelay+0x18/0x30
> > [  874.949434] LR [00000000] (null)
> > [  874.987986] Call Trace:
> > [  875.017170] [efff3b50] [c00071b4] show_stack+0x78/0x18c (unreliable)
> > [  875.093240] [efff3b90] [c00078c4] show_regs+0x200/0x2ec
> > [  875.155763] [efff3bc0] [c00658d4] softlockup_tick+0x1dc/0x23c
> > [  875.224534] [efff3bf0] [c003cc50] run_local_timers+0x1c/0x2c
> > [  875.292265] [efff3c00] [c003cca4] update_process_times+0x44/0x80
> > [  875.364164] [efff3c20] [c0059bc4] tick_sched_timer+0xd0/0x128
> > [  875.432936] [efff3c50] [c004d8f0] __run_hrtimer+0x68/0x14c
> > [  875.498584] [efff3c70] [c004efa4] hrtimer_interrupt+0x1d8/0x41c
> > [  875.569437] [efff3cf0] [c000d8d8] timer_interrupt+0x1b4/0x238
> > [  875.638211] [efff3d10] [c0009ac4] __ipipe_do_timer+0x44/0x54
> > [  875.705941] [efff3d20] [c006d448] __ipipe_sync_stage+0x1d0/0x27c
> > [  875.777839] [efff3d60] [c0009728] __ipipe_grab_timer+0x104/0x12c
> > [  875.849736] [efff3d70] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [  875.922680] --- Exception: 901 at _raw_spin_lock+0x30/0x3c
> > [  875.922684]     LR = tpacket_rcv+0x264/0x570
> > [  876.039367] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [  876.116479] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
> > [  876.189418] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
> > [  876.258189] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
> > [  876.335297] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
> > [  876.403025] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
> > [  876.467632] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
> > [  876.533280] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
> > [  876.598926] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
> > [  876.665618] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
> > [  876.727097] --- Exception: ec6abbb0 at 0xec6abb70
> > [  876.727101]     LR = 0xec4e6c50
> > [  876.820868] [ec6abad0] [c00357cc] irq_exit+0x60/0xb8 (unreliable)
> > [  876.893814] [ec6abae0] [c0009b5c] __ipipe_do_IRQ+0x88/0xc0
> > [  876.959459] [ec6abb00] [c006d468] __ipipe_sync_stage+0x1f0/0x27c
> > [  877.031358] [ec6abb40] [c00095f4] __ipipe_handle_irq+0x1b8/0x1e8
> > [  877.103256] [ec6abb70] [c00098dc] __ipipe_grab_irq+0x18c/0x1bc
> > [  877.173069] [ec6abba0] [c00129e0] __ipipe_ret_from_except+0x0/0xc
> > [  877.246012] --- Exception: 501 at _raw_spin_lock+0x14/0x3c
> > [  877.246017]     LR = tpacket_rcv+0x264/0x570
> > [  877.362701] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [  877.439819] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
> > [  877.512758] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
> > [  877.580487] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
> > [  877.644052] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
> > [  877.706575] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
> > [  877.774306] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
> > [  877.840994] --- Exception: c01 at 0x48051f00
> > [  877.840998]     LR = 0x4808e030
> > [  877.929553] Instruction dump:
> > [  877.964988] 419cfff0 7c421378 4e800020 3d20c04c 800967e0 7c0301d6
> > 7d2c42a6 48000008
> > [  878.057802] 7c210b78 7d6c42a6 7d695850 7f8b0040 419cfff0 7c421378
> > 4e800020 3d20c04a
> >
> > I do not completely understand this dump, but it looks like both the
> > receive direction (running in the context of a softirq) and my
> > transmitting application are blocked on the spinlock used in the
> > tpacket_rcv function:
> >
> > [  876.039367] [efff3e30] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [  876.116479] [efff3e80] [c02c43b0] __netif_receive_skb+0x2b4/0x2f0
> > [  876.189418] [efff3eb0] [c02c4fa0] netif_receive_skb+0x98/0xac
> > [  876.258189] [efff3ee0] [c0292838] ingress_rx_default_dqrr+0x428/0x4b4
> > [  876.335297] [efff3f10] [c02ac2a8] qman_poll_dqrr+0x1e0/0x284
> > [  876.403025] [efff3f50] [c0294088] dpaa_eth_poll+0x34/0xd0
> > [  876.467632] [efff3f70] [c02c5280] net_rx_action+0xc0/0x1e8
> > [  876.533280] [efff3fa0] [c0035ab0] __do_softirq+0x138/0x210
> > [  876.598926] [efff3ff0] [c00115e8] call_do_softirq+0x14/0x24
> > [  876.665618] [ec6abab0] [c000480c] do_softirq+0xb4/0xec
> >
> > and
> >
> > [  877.362701] [ec6abc60] [c0325e48] tpacket_rcv+0xf4/0x570 (unreliable)
> > [  877.439819] [ec6abcb0] [c02c6238] dev_hard_start_xmit+0x164/0x414
> > [  877.512758] [ec6abcf0] [c0325b94] packet_sendmsg+0x8c0/0x984
> > [  877.580487] [ec6abd70] [c02b32f0] sock_sendmsg+0x90/0xb4
> > [  877.644052] [ec6abe40] [c02b3ea8] sys_sendto+0xd0/0x114
> > [  877.706575] [ec6abf10] [c02b522c] sys_socketcall+0x148/0x210
> > [  877.774306] [ec6abf40] [c0011d0c] ret_from_syscall+0x0/0x3c
> >
> > Is my analysis correct?
> > If yes, can this have anything to do with the IPIPE mechanism we are
> > using (maybe a known issue?).
> >
> > Any help would be much appreciated.
> >
> > Thanks,
> > Ronny
> >
> 
> Hello
> 
> I did a new test (this time with an older kernel Linux version
> 2.6.34.6): same tests were executed but this time on a pure Linux
> build (no IPIPE included). The issue cannot be reproduced anymore in
> this environment. My test builds keep on running forever.
> 
> My next steps are:
> - Running the same test on 2.6.35.7 without IPIPE. This environment is
> currently building.
> - Include only IPIPE and no Xenomai and redo the test.
> 

Could you try 2.6.36-ipipe as well in case 2.6.35.7 without pipeline
does not exhibit the issue? A number of changes went in the IRQ replay
code during this time frame, and 2.6.35 was in a state of flux regarding
this.

> Best regards
> Ronny
> 
> _______________________________________________
> Adeos-main mailing list
> Adeos-main@domain.hid
> https://mail.gna.org/listinfo/adeos-main

-- 
Philippe.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-04  8:20   ` Philippe Gerum
@ 2011-07-04 11:42     ` Gilles Chanteperdrix
  2011-07-04 20:04       ` Ronny Meeus
  0 siblings, 1 reply; 9+ messages in thread
From: Gilles Chanteperdrix @ 2011-07-04 11:42 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

On 07/04/2011 10:20 AM, Philippe Gerum wrote:
> Could you try 2.6.36-ipipe as well in case 2.6.35.7 without pipeline
> does not exhibit the issue? A number of changes went in the IRQ replay
> code during this time frame, and 2.6.35 was in a state of flux regarding
> this.

And please try Xenomai 2.5.6.

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-04 11:42     ` Gilles Chanteperdrix
@ 2011-07-04 20:04       ` Ronny Meeus
  2011-07-04 20:09         ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Ronny Meeus @ 2011-07-04 20:04 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, Philippe Gerum

On Mon, Jul 4, 2011 at 1:42 PM, Gilles Chanteperdrix
<gilles.chanteperdrix@xenomai.org> wrote:
> On 07/04/2011 10:20 AM, Philippe Gerum wrote:
>> Could you try 2.6.36-ipipe as well in case 2.6.35.7 without pipeline
>> does not exhibit the issue? A number of changes went in the IRQ replay
>> code during this time frame, and 2.6.35 was in a state of flux regarding
>> this.
>
> And please try Xenomai 2.5.6.
>
> --
>                                            Gilles.
>

Hello

Today we tested: "Running the same test on 2.6.35.7 without IPIPE".
The result is also not OK. The problem can be reproduced on this
kernel, so it looks like the issue has nothing to do with the I-PIPE.
We are currently porting the FreeScale patches to the 2.6.36 kernel
but it looks like this is going to take some effort (Thomas already
spent half a day on it today).

Maybe a bit of history: we started with the 2.6.34 kernel with no SMP.
Here we observed issues in some scenarios. By playing with the
configuration we found that these issues were resolved by switching to
SMP (even on one core).
After the activation of SMP we started to see BADNESS issues once we
started to run our Xenomai based applications.
Thomas found a patch on the Xenomai mailing list to solve the badness
issue but this was based on 2.6.35.7. We ported the FreeScale patches
to that release and we observed the blocking application issue
described above.

We will try to port the FreeScale patches to 2.6.36 and see how the
system behaves. More information on this later.
Can you guys give us any hint of what would be the most stable version
of Linux + Xenomai to base our application on?

Thanks
Ronny


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-04 20:04       ` Ronny Meeus
@ 2011-07-04 20:09         ` Philippe Gerum
  2011-07-04 20:13           ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2011-07-04 20:09 UTC (permalink / raw)
  To: Ronny Meeus; +Cc: adeos-main

On Mon, 2011-07-04 at 22:04 +0200, Ronny Meeus wrote:
> On Mon, Jul 4, 2011 at 1:42 PM, Gilles Chanteperdrix
> <gilles.chanteperdrix@xenomai.org> wrote:
> > On 07/04/2011 10:20 AM, Philippe Gerum wrote:
> >> Could you try 2.6.36-ipipe as well in case 2.6.35.7 without pipeline
> >> does not exhibit the issue? A number of changes went in the IRQ replay
> >> code during this time frame, and 2.6.35 was in a state of flux regarding
> >> this.
> >
> > And please try Xenomai 2.5.6.
> >
> > --
> >                                            Gilles.
> >
> 
> Hello
> 
> Today we tested: "Running the same test on 2.6.35.7 without IPIPE".
> The result is also not OK. The problem can be reproduced on this
> kernel, so it looks like the issue has nothing to do with the I-PIPE.
> We are currently porting the FreeScale patches to the 2.6.36 kernel
> but it looks like this is going to take some effort (Thomas already
> spent half a day on it today).
> 
> Maybe a bit of history: we started with the 2.6.34 kernel with no SMP.
> Here we observed issues in some scenarios. By playing with the
> configuration we found that these issues were resolved by switching to
> SMP (even on one core).
> After the activation of SMP we started to see BADNESS issues once we
> started to run our Xenomai based applications.
> Thomas found a patch on the Xenomai mailing list to solve the badness
> issue but this was based on 2.6.35.7. We ported the FreeScale patches
> to that release and we observed the blocking application issue
> described above.
> 
> We will try to port the FreeScale patches to 2.6.36 and see how the
> system behaves. More information on this later.
> Can you guys give us any hint of what would be the most stable version
> of Linux + Xenomai to base our application on?

SMP-wise for powerpc, 2.6.36 + upcoming 2.5.6

> 
> Thanks
> Ronny
> 

-- 
Philippe.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-04 20:09         ` Philippe Gerum
@ 2011-07-04 20:13           ` Philippe Gerum
  2011-07-05  8:45             ` Ronny Meeus
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2011-07-04 20:13 UTC (permalink / raw)
  To: Ronny Meeus; +Cc: adeos-main

On Mon, 2011-07-04 at 22:09 +0200, Philippe Gerum wrote:
> On Mon, 2011-07-04 at 22:04 +0200, Ronny Meeus wrote:
> > On Mon, Jul 4, 2011 at 1:42 PM, Gilles Chanteperdrix
> > <gilles.chanteperdrix@xenomai.org> wrote:
> > > On 07/04/2011 10:20 AM, Philippe Gerum wrote:
> > >> Could you try 2.6.36-ipipe as well in case 2.6.35.7 without pipeline
> > >> does not exhibit the issue? A number of changes went in the IRQ replay
> > >> code during this time frame, and 2.6.35 was in a state of flux regarding
> > >> this.
> > >
> > > And please try Xenomai 2.5.6.
> > >
> > > --
> > >                                            Gilles.
> > >
> > 
> > Hello
> > 
> > Today we tested: "Running the same test on 2.6.35.7 without IPIPE".
> > The result is also not OK. The problem can be reproduced on this
> > kernel, so it looks like the issue has nothing to do with the I-PIPE.
> > We are currently porting the FreeScale patches to the 2.6.36 kernel
> > but it looks like this is going to take some effort (Thomas already
> > spent half a day on it today).
> > 
> > Maybe a bit of history: we started with the 2.6.34 kernel with no SMP.
> > Here we observed issues in some scenarios. By playing with the
> > configuration we found that these issues were resolved by switching to
> > SMP (even on one core).
> > After the activation of SMP we started to see BADNESS issues once we
> > started to run our Xenomai based applications.
> > Thomas found a patch on the Xenomai mailing list to solve the badness
> > issue but this was based on 2.6.35.7. We ported the FreeScale patches
> > to that release and we observed the blocking application issue
> > described above.
> > 
> > We will try to port the FreeScale patches to 2.6.36 and see how the
> > system behaves. More information on this later.
> > Can you guys give us any hint of what would be the most stable version
> > of Linux + Xenomai to base our application on?
> 
> SMP-wise for powerpc, 2.6.36 + upcoming 2.5.6

s,upcoming,,

I mean 2.5.6 stock.

> 
> > 
> > Thanks
> > Ronny
> > 
> 

-- 
Philippe.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-04 20:13           ` Philippe Gerum
@ 2011-07-05  8:45             ` Ronny Meeus
  2011-07-05 20:11               ` Ronny Meeus
  0 siblings, 1 reply; 9+ messages in thread
From: Ronny Meeus @ 2011-07-05  8:45 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

On Mon, Jul 4, 2011 at 10:13 PM, Philippe Gerum <rpm@xenomai.org> wrote:
> On Mon, 2011-07-04 at 22:09 +0200, Philippe Gerum wrote:
>> On Mon, 2011-07-04 at 22:04 +0200, Ronny Meeus wrote:
>> > On Mon, Jul 4, 2011 at 1:42 PM, Gilles Chanteperdrix
>> > <gilles.chanteperdrix@xenomai.org> wrote:
>> > > On 07/04/2011 10:20 AM, Philippe Gerum wrote:
>> > >> Could you try 2.6.36-ipipe as well in case 2.6.35.7 without pipeline
>> > >> does not exhibit the issue? A number of changes went in the IRQ replay
>> > >> code during this time frame, and 2.6.35 was in a state of flux regarding
>> > >> this.
>> > >
>> > > And please try Xenomai 2.5.6.
>> > >
>> > > --
>> > >                                            Gilles.
>> > >
>> >
>> > Hello
>> >
>> > Today we tested: "Running the same test on 2.6.35.7 without IPIPE".
>> > The result is also not OK. The problem can be reproduced on this
>> > kernel, so it looks like the issue has nothing to do with the I-PIPE.
>> > We are currently porting the FreeScale patches to the 2.6.36 kernel
>> > but it looks like this is going to take some effort (Thomas already
>> > spent half a day on it today).
>> >
>> > Maybe a bit of history: we started with the 2.6.34 kernel with no SMP.
>> > Here we observed issues in some scenarios. By playing with the
>> > configuration we found that these issues were resolved by switching to
>> > SMP (even on one core).
>> > After the activation of SMP we started to see BADNESS issues once we
>> > started to run our Xenomai based applications.
>> > Thomas found a patch on the Xenomai mailing list to solve the badness
>> > issue but this was based on 2.6.35.7. We ported the FreeScale patches
>> > to that release and we observed the blocking application issue
>> > described above.
>> >
>> > We will try to port the FreeScale patches to 2.6.36 and see how the
>> > system behaves. More information on this later.
>> > Can you guys give us any hint of what would be the most stable version
>> > of Linux + Xenomai to base our application on?
>>
>> SMP-wise for powerpc, 2.6.36 + upcoming 2.5.6
>
> s,upcoming,,
>
> I mean 2.5.6 stock.
>
>>
>> >
>> > Thanks
>> > Ronny
>> >
>>
>
> --
> Philippe.
>
>
>

Hello

we have done the test on both 2.6.36 and 2.6.36.4 without I-PIPE.
In both cases the issue is seen.

Any hints to help us debug this issue would be appreciated.

Best regards
Ronny


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Adeos-main] Kernel blocked during send/receive raw ethernet packets.
  2011-07-05  8:45             ` Ronny Meeus
@ 2011-07-05 20:11               ` Ronny Meeus
  0 siblings, 0 replies; 9+ messages in thread
From: Ronny Meeus @ 2011-07-05 20:11 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

On Tue, Jul 5, 2011 at 10:45 AM, Ronny Meeus <ronny.meeus@domain.hid> wrote:
> On Mon, Jul 4, 2011 at 10:13 PM, Philippe Gerum <rpm@xenomai.org> wrote:
>> On Mon, 2011-07-04 at 22:09 +0200, Philippe Gerum wrote:
>>> On Mon, 2011-07-04 at 22:04 +0200, Ronny Meeus wrote:
>>> > On Mon, Jul 4, 2011 at 1:42 PM, Gilles Chanteperdrix
>>> > <gilles.chanteperdrix@xenomai.org> wrote:
>>> > > On 07/04/2011 10:20 AM, Philippe Gerum wrote:
>>> > >> Could you try 2.6.36-ipipe as well in case 2.6.35.7 without pipeline
>>> > >> does not exhibit the issue? A number of changes went in the IRQ replay
>>> > >> code during this time frame, and 2.6.35 was in a state of flux regarding
>>> > >> this.
>>> > >
>>> > > And please try Xenomai 2.5.6.
>>> > >
>>> > > --
>>> > >                                            Gilles.
>>> > >
>>> >
>>> > Hello
>>> >
>>> > Today we tested: "Running the same test on 2.6.35.7 without IPIPE".
>>> > The result is also not OK. The problem can be reproduced on this
>>> > kernel, so it looks like the issue has nothing to do with the I-PIPE.
>>> > We are currently porting the FreeScale patches to the 2.6.36 kernel
>>> > but it looks like this is going to take some effort (Thomas already
>>> > spent half a day on it today).
>>> >
>>> > Maybe a bit of history: we started with the 2.6.34 kernel with no SMP.
>>> > Here we observed issues in some scenarios. By playing with the
>>> > configuration we found that these issues were resolved by switching to
>>> > SMP (even on one core).
>>> > After the activation of SMP we started to see BADNESS issues once we
>>> > started to run our Xenomai based applications.
>>> > Thomas found a patch on the Xenomai mailing list to solve the badness
>>> > issue but this was based on 2.6.35.7. We ported the FreeScale patches
>>> > to that release and we observed the blocking application issue
>>> > described above.
>>> >
>>> > We will try to port the FreeScale patches to 2.6.36 and see how the
>>> > system behaves. More information on this later.
>>> > Can you guys give us any hint of what would be the most stable version
>>> > of Linux + Xenomai to base our application on?
>>>
>>> SMP-wise for powerpc, 2.6.36 + upcoming 2.5.6
>>
>> s,upcoming,,
>>
>> I mean 2.5.6 stock.
>>
>>>
>>> >
>>> > Thanks
>>> > Ronny
>>> >
>>>
>>
>> --
>> Philippe.
>>
>>
>>
>
> Hello
>
> we have done the test on both 2.6.36 and 2.6.36.4 without I-PIPE.
> In both cases the issue is seen.
>
> Any hints to help us debug this issue would be appreciated.
>
> Best regards
> Ronny
>

Hello

Today we have identified the issue.
In net/packet/af_packet.c, a spin_lock is used where a
spin_lock_bh should be used. We observed that the lock was
taken recursively on the same CPU: first by the application and then
from the context of a softirq that interrupted it. The result was a
deadlock.
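
To make the failure mode concrete, here is a small userspace model of it
in C. It is only a sketch and not kernel code: struct cpu_model,
rx_softirq(), irq_exit_model() and send_path_deadlocks() are hypothetical
names invented for this illustration, and the "lock" is a plain flag
standing in for a non-recursive spinlock on a single CPU. The model
assumes that an interrupt arrives while the send path holds the lock and
that softirqs run on interrupt exit unless bottom halves are disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of a single CPU. Illustration only, NOT kernel code. */
struct cpu_model {
    bool lock_held;       /* stands in for a non-recursive spinlock */
    bool bh_disabled;     /* stands in for "bottom halves disabled" */
    bool softirq_pending; /* a receive softirq is waiting to run */
    bool deadlocked;      /* set when the lock is requested while already held */
};

static void model_lock(struct cpu_model *cpu)
{
    assert(!cpu->lock_held);
    cpu->lock_held = true;
}

static void model_unlock(struct cpu_model *cpu)
{
    assert(cpu->lock_held);
    cpu->lock_held = false;
}

/* Receive handler running in softirq context; it needs the same lock. */
static void rx_softirq(struct cpu_model *cpu)
{
    if (cpu->lock_held) {
        /* A real spinlock would spin forever here: the holder is the
         * interrupted task on this same CPU, so it can never run again
         * to release the lock. */
        cpu->deadlocked = true;
        return;
    }
    model_lock(cpu);
    /* ... queue the received packet ... */
    model_unlock(cpu);
}

/* Interrupt exit: pending softirqs run only if bottom halves are enabled. */
static void irq_exit_model(struct cpu_model *cpu)
{
    if (cpu->softirq_pending && !cpu->bh_disabled) {
        cpu->softirq_pending = false;
        rx_softirq(cpu);
    }
}

/* Send path in process context. Returns true if the model deadlocks. */
bool send_path_deadlocks(bool use_bh_variant)
{
    struct cpu_model cpu = { false, false, false, false };

    if (use_bh_variant)
        cpu.bh_disabled = true;   /* models the "_bh" part of the lock call */
    model_lock(&cpu);

    cpu.softirq_pending = true;   /* packet arrives while the lock is held */
    irq_exit_model(&cpu);         /* interrupt exit tries to run the softirq */

    /* On real hardware the plain variant never gets this far. */
    model_unlock(&cpu);
    if (use_bh_variant) {
        cpu.bh_disabled = false;  /* models the "_bh" part of the unlock call */
        irq_exit_model(&cpu);     /* the deferred softirq runs now, lock free */
    }
    return cpu.deadlocked;
}
```

In this model, send_path_deadlocks(false) reports the deadlock because the
softirq runs while the interrupted sender still holds the lock, whereas
send_path_deadlocks(true) defers the softirq until the lock has been
released.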

I will post a patch soon on the netdev mailing list.
We will upgrade most probably to 2.6.36.4 and Xenomai 2.5.6 like you
guys suggested.

Thanks for your support.

Best regards,
Ronny


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-07-05 20:11 UTC | newest]

Thread overview: 9+ messages
2011-07-02 21:33 [Adeos-main] Kernel blocked during send/receive raw ethernet packets Ronny Meeus
2011-07-04  8:06 ` Ronny Meeus
2011-07-04  8:20   ` Philippe Gerum
2011-07-04 11:42     ` Gilles Chanteperdrix
2011-07-04 20:04       ` Ronny Meeus
2011-07-04 20:09         ` Philippe Gerum
2011-07-04 20:13           ` Philippe Gerum
2011-07-05  8:45             ` Ronny Meeus
2011-07-05 20:11               ` Ronny Meeus
