* Re: linux sctp bug
@ 2009-09-28 16:45 Vlad Yasevich
  2009-09-30 12:49 ` Michael Krolikowski
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Vlad Yasevich @ 2009-09-28 16:45 UTC (permalink / raw)
  To: linux-sctp

Michael Krolikowski wrote:
> Hi,
> 
> I think I found a bug in the Linux SCTP implementation. I hope you are
> the right persons to ask for help with this.

The right place to ask is on the linux-sctp mailing list.

> 
> If I send an SCTP INIT to a host which does not support SCTP (e.g. the
> module is not loaded), the other host sends an ICMP Protocol Unreachable.
> This makes the SCTP module on the initiating host crash. It may not crash
> on the first try, but if I repeat the SCTP INIT 3-4 times it will crash.

Hm..  I've tried to reproduce and couldn't with top of tree 2.6.31.
I've tried repeating INITs over the same path and over multiple paths, but
didn't see a crash.

Would you be able to do a bisect?

Thanks
-vlad

> 
> See this message:
> SCTP: Hash tables configured (established 512 bind 512)
> 
> Modules linked in: sctp
> Pid: 610, comm: sctp_test Not tainted 2.6.31
> RIP: 0033:[<00000000646228f9>]
> RSP: 0000000063873810  EFLAGS: 00010246
> RAX: 0000000000200200 RBX: 0000000063a20000 RCX: 00000000638e6800
> RDX: 0000000000100100 RSI: 000000006384b8c0 RDI: 0000000063a20000
> RBP: 0000000063873830 R08: 0000003000000008 R09: 0000000000000000
> R10: 000000000000000f R11: 0000000000000000 R12: 00000000ffffffea
> R13: 00000000638e6800 R14: 0000000063a20000 R15: 0000000063a20000
> Call Trace: 
> 601f1ad8:  [<60014bcd>] segv+0x1fd/0x20f
> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 
> Kernel panic - not syncing: Kernel mode fault at addr 0x100108, ip
> 0x646228f9
> Call Trace: 
> 601f19d8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f19e8:  [<60158b8d>] panic+0xd3/0x174
> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 
> 
> Modules linked in: sctp
> Pid: 610, comm: sctp_test Not tainted 2.6.31
> RIP: 0033:[<00000000404ef5c0>]
> RSP: 0000007fbf8613f8  EFLAGS: 00000246
> RAX: ffffffffffffffda RBX: 0000007fbf861460 RCX: ffffffffffffffff
> RDX: 0000000000000100 RSI: 0000007fbf861410 RDI: 0000000000000003
> RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000607560
> R13: 0000000000000002 R14: 0000000000000000 R15: 0000007fbf861450
> Call Trace: 
> 601f1960:  [<6004c462>] __module_text_address+0xd/0x5b
> 601f1978:  [<60014e05>] panic_exit+0x2f/0x45
> 601f1998:  [<60043417>] notifier_call_chain+0x33/0x5b
> 601f19c8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f19d8:  [<60043459>] atomic_notifier_call_chain+0xf/0x11
> 601f19e8:  [<60158b9e>] panic+0xe4/0x174
> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
> 
> This error seems only to occur if the remote host answers with ICMP
> protocol unreachable.
> If the remote host answers with SCTP ABORT, the error won't occur.
> 
> 
> Thanks in advance,
> 
> Michael Krolikowski
> 



* RE: linux sctp bug
  2009-09-28 16:45 linux sctp bug Vlad Yasevich
@ 2009-09-30 12:49 ` Michael Krolikowski
  2009-09-30 14:31 ` Vlad Yasevich
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Michael Krolikowski @ 2009-09-30 12:49 UTC (permalink / raw)
  To: linux-sctp

Hi,

I'm testing with two UML machines, both running Linux 2.6.31.
I tried again today, and the error does not occur after only a few tries,
as I first said; it takes many more tries.
I also tried 2.6.31.1 (UML), with the same results.
I first got the error on Debian Lenny with a 2.6.26 kernel.

I hope this bit of information helps.


Regards,

Michael




* Re: linux sctp bug
  2009-09-28 16:45 linux sctp bug Vlad Yasevich
  2009-09-30 12:49 ` Michael Krolikowski
@ 2009-09-30 14:31 ` Vlad Yasevich
  2009-09-30 15:32 ` Michael Krolikowski
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vlad Yasevich @ 2009-09-30 14:31 UTC (permalink / raw)
  To: linux-sctp

Michael Krolikowski wrote:
> Hi,
> 
> I'm testing with two UML machines, both running Linux 2.6.31.
> I tried again today, and the error does not occur after only a few tries,
> as I first said; it takes many more tries.
> I also tried 2.6.31.1 (UML), with the same results.
> I first got the error on Debian Lenny with a 2.6.26 kernel.

So you were able to reproduce this with a 2.6.26 kernel?

How do you test?  Do you just call connect() in a loop?

I run under KVM with a connect() call in a tight loop and see
no issues.  My ICMP sender is an Ubuntu Jaunty (2.6.28-15-generic)
kernel.
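
For readers who want to try the same kind of test, here is a minimal sketch of
a connect() loop on a one-to-one SCTP socket. It is not the sctp_test tool and
not the exact program used in this thread; the function name
sctp_connect_loop and its parameters are made up for illustration. Each
connect() makes the local stack send an INIT; the target would be a host
without SCTP support, so that it answers with ICMP Protocol Unreachable.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: try `attempts` SCTP connects to ip:port, using a
 * fresh socket each time. Returns the number of attempts made, or -1 if
 * the address string cannot be parsed as IPv4.
 */
int sctp_connect_loop(const char *ip, unsigned short port, int attempts)
{
    struct sockaddr_in dst;
    int i;

    assert(attempts >= 0);

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    for (i = 0; i < attempts; i++) {
        /* One-to-one style SCTP socket; fails if SCTP is unavailable locally. */
        int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
        if (fd < 0) {
            fprintf(stderr, "socket: %s\n", strerror(errno));
            continue;
        }
        /* A blocking connect() triggers the INIT; a failure is expected here. */
        if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0)
            fprintf(stderr, "connect %d: %s\n", i, strerror(errno));
        close(fd);
    }
    return i;
}
```

Calling sctp_connect_loop() from a small main() with the remote address, a
port, and a large attempt count gives the tight-loop variant; running a
single attempt by hand several times is closer to a manual test.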

Looking at the stack trace you posted, the failure happens here:
        if (!asoc->temp) {
>>>              list_del(&asoc->asocs);

The addresses look very weird too.
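
One possible reading of those addresses, offered as a sketch rather than a
diagnosis: in kernels of this era, list_del() unlinks an entry with
"next->prev = prev; prev->next = next;" and then sets the entry's pointers to
LIST_POISON1 (0x00100100) and LIST_POISON2 (0x00200200) from
include/linux/poison.h. The dump shows RDX = 0x100100, RAX = 0x200200 and a
fault at 0x100108, which is what a second list_del() on the same entry would
produce on a 64-bit build. The struct and helper names below are invented for
the example, and treating RDX as next and RAX as prev is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Poison values from include/linux/poison.h in kernels of this era. */
#define LIST_POISON1 UINT64_C(0x00100100)
#define LIST_POISON2 UINT64_C(0x00200200)

/* Layout of struct list_head on a 64-bit build (two 8-byte pointers). */
struct list_head64 {
    uint64_t next;
    uint64_t prev;
};

/*
 * list_del() first executes "next->prev = prev". Given the entry's
 * current next pointer, return the address that store writes to.
 */
uint64_t first_store_address(uint64_t next)
{
    return next + offsetof(struct list_head64, prev);
}

/* Nonzero if the entry's pointers look like list_del() already ran on it. */
int looks_already_deleted(uint64_t next, uint64_t prev)
{
    return next == LIST_POISON1 && prev == LIST_POISON2;
}

/* Check the helpers against the values printed in the panic. */
void check_against_trace(void)
{
    /* Assumption: RDX held entry->next and RAX held entry->prev. */
    assert(looks_already_deleted(0x100100, 0x200200));
    /* "Kernel mode fault at addr 0x100108" */
    assert(first_store_address(0x100100) == 0x100108);
}
```

If that reading is right, asoc->asocs had already been removed from its list
once before sctp_association_free() reached this list_del(), which would
point at the association being torn down twice.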

Can you reproduce this on live systems, or under KVM?  I am suspecting UML...

-vlad





* RE: linux sctp bug
  2009-09-28 16:45 linux sctp bug Vlad Yasevich
  2009-09-30 12:49 ` Michael Krolikowski
  2009-09-30 14:31 ` Vlad Yasevich
@ 2009-09-30 15:32 ` Michael Krolikowski
  2009-09-30 15:58 ` Vlad Yasevich
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Michael Krolikowski @ 2009-09-30 15:32 UTC (permalink / raw)
  To: linux-sctp

I first saw the bug on Debian Lenny with Debian's patched Linux 2.6.
Now I've just installed Linux 2.6.26.8 (UML) and see different
behavior:

SCTP: Hash tables configured (established 512 bind 512)
BUG: soft lockup - CPU#0 stuck for 61s! [sctp_test:847]
Modules linked in: sctp

Modules linked in: sctp
Pid: 847, comm: sctp_test Not tainted 2.6.26.8
RIP: 0033:[<0000000062dad9c2>]
RSP: 0000000061f3b870  EFLAGS: 00000202
RAX: 7360adde2c000001 RBX: 0000000061e20000 RCX: 0000000061f3b910
RDX: 7360adde2c000001 RSI: 0000000000000000 RDI: 000000006150ea00
RBP: 0000000061f3b880 R08: 0000000061e20140 R09: 0000000000000000
R10: 0000000060228240 R11: 0000000000000049 R12: 0000000061e20000
R13: 0000000061e20000 R14: 0000000062dbfeb5 R15: 0000000062dc1a00
Call Trace:
601c7ae8:  [<6004e355>] softlockup_tick+0xf7/0x10a
601c7af8:  [<600318e7>] raise_softirq+0x64/0x6d
601c7b28:  [<60035bf0>] run_local_timers+0x18/0x1a
601c7b38:  [<60035c69>] update_process_times+0x2e/0x59
601c7b68:  [<600463c9>] tick_sched_timer+0x64/0x96
601c7b98:  [<600418da>] __run_hrtimer+0x26/0x6f
601c7bb8:  [<600421b2>] hrtimer_interrupt+0xe3/0x143
601c7bf8:  [<60012cd4>] um_timer+0xf/0x16
601c7c08:  [<6004e78a>] handle_IRQ_event+0x2b/0x5f
601c7c38:  [<6004e81f>] __do_IRQ+0x61/0xa6
601c7c68:  [<60010b8a>] do_IRQ+0x23/0x39
601c7c88:  [<60012d42>] timer_handler+0x21/0x2f
601c7ca8:  [<60020e87>] real_alarm_handler+0x3f/0x41
601c7cb8:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
601c7d30:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e [sctp]
601c7db8:  [<60020ee5>] alarm_handler+0x2e/0x39
601c7dd8:  [<60021179>] handle_signal+0x6b/0xa1
601c7e10:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
601c7e28:  [<60022a90>] hard_handler+0x10/0x14
601c7e98:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
601c7ee8:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e [sctp]

I ran the test with the sctp_test tool from http://lksctp.sf.net/
I just executed the tool repeatedly by hand, so no tight loop.
Both systems were always running the same Linux version, but that
shouldn't be the problem, should it? I always get the same ICMP message
from the remote host.
I also ran the test with Debian Lenny inside VMware, but didn't test
inside KVM. I couldn't reproduce the bug on live systems, but I only did
one quick test there. I'll give that a try and let you know, but it
might take me a while.

Michael





* Re: linux sctp bug
  2009-09-28 16:45 linux sctp bug Vlad Yasevich
                   ` (2 preceding siblings ...)
  2009-09-30 15:32 ` Michael Krolikowski
@ 2009-09-30 15:58 ` Vlad Yasevich
  2009-09-30 16:02 ` Michael Krolikowski
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vlad Yasevich @ 2009-09-30 15:58 UTC (permalink / raw)
  To: linux-sctp

Michael Krolikowski wrote:
> I first saw the bug on Debian Lenny with Debian's patched Linux 2.6.
> Now I've just installed Linux 2.6.26.8 (UML) and see different
> behavior:
> 
> SCTP: Hash tables configured (established 512 bind 512)
> BUG: soft lockup - CPU#0 stuck for 61s! [sctp_test:847]
> Modules linked in: sctp
> 
> Modules linked in: sctp
> Pid: 847, comm: sctp_test Not tainted 2.6.26.8
> RIP: 0033:[<0000000062dad9c2>]
> RSP: 0000000061f3b870  EFLAGS: 00000202
> RAX: 7360adde2c000001 RBX: 0000000061e20000 RCX: 0000000061f3b910
> RDX: 7360adde2c000001 RSI: 0000000000000000 RDI: 000000006150ea00
> RBP: 0000000061f3b880 R08: 0000000061e20140 R09: 0000000000000000
> R10: 0000000060228240 R11: 0000000000000049 R12: 0000000061e20000
> R13: 0000000061e20000 R14: 0000000062dbfeb5 R15: 0000000062dc1a00
> Call Trace:
> 601c7ae8:  [<6004e355>] softlockup_tick+0xf7/0x10a
> 601c7af8:  [<600318e7>] raise_softirq+0x64/0x6d
> 601c7b28:  [<60035bf0>] run_local_timers+0x18/0x1a
> 601c7b38:  [<60035c69>] update_process_times+0x2e/0x59
> 601c7b68:  [<600463c9>] tick_sched_timer+0x64/0x96
> 601c7b98:  [<600418da>] __run_hrtimer+0x26/0x6f
> 601c7bb8:  [<600421b2>] hrtimer_interrupt+0xe3/0x143
> 601c7bf8:  [<60012cd4>] um_timer+0xf/0x16
> 601c7c08:  [<6004e78a>] handle_IRQ_event+0x2b/0x5f
> 601c7c38:  [<6004e81f>] __do_IRQ+0x61/0xa6
> 601c7c68:  [<60010b8a>] do_IRQ+0x23/0x39
> 601c7c88:  [<60012d42>] timer_handler+0x21/0x2f
> 601c7ca8:  [<60020e87>] real_alarm_handler+0x3f/0x41
> 601c7cb8:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
> 601c7d30:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e [sctp]
> 601c7db8:  [<60020ee5>] alarm_handler+0x2e/0x39
> 601c7dd8:  [<60021179>] handle_signal+0x6b/0xa1
> 601c7e10:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
> 601c7e28:  [<60022a90>] hard_handler+0x10/0x14
> 601c7e98:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
> 601c7ee8:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e [sctp]
> 
> I did the test with the sctp_test tool from http://lksctp.sf.net/
> I just repeated executing the tool manually, so no tight loop.

Can you provide the command line args you use?  Want to try it in my KVM
sessions.

-vlad

> I always had both systems running with the same Linux Version. But this
> shouldn't be the problem should it? It's always the same ICMP message I
> get
> from the remote host.
> I did the test with Debian Lenny running inside VMware as well but
> didn't
> test inside KVM. I couldn't reproduce the bug in live systems but I did
> only one quick test there. I'll give that a try and let you know - but
> it
> might take me a while.
> 
> Michael
> 
> 
> -----Original Message-----
> From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
> Sent: Mittwoch, 30. September 2009 16:31
> To: Michael Krolikowski
> Cc: linux-sctp@vger.kernel.org
> Subject: Re: linux sctp bug
> 
> Michael Krolikowski wrote:
>> Hi,
>>
>> I'm testing it using two UML machines. Both of them running Linux
>> 2.6.31.
>> I tried it today again and it seems that the error occurs not as I
> first
>> said after only a few tries but many tries later it does.
>> I also tried with 2.6.31.1 (UML) with the same results.
>> I used Debian Lenny with a 2.6.26 Linux where I got the error for the
>> first time.
> 
> So you were able to reproduce this with 2.6.26 kernel?
> 
> How do you test?  Do you just try to call connect() in a loop?
> 
> I run under KVM with a connect() call in a tight loop and see
> not issues.  My ICMP sender is an Ubuntu Jaunty (2.6.28-15-generic)
> kernel.
> 
> Looking at the stack trace you posted, the failure happens here:
>         if (!asoc->temp) {
>>>>              list_del(&asoc->asocs);
> 
> The addresses look very weird to.
> 
> Can reproduce this with live systems, or KVM?  I am suspecting UML...
> 
> -vlad
> 
> 
>> I hope this little information helps you a bit.
>>
>>
>> Regards,
>>
>> Michael
>>
>>
>> -----Original Message-----
>> From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
>> Sent: Montag, 28. September 2009 18:46
>> To: Michael Krolikowski
>> Cc: Sridhar Samudrala; Linux SCTP Dev Mailing list
>> Subject: Re: linux sctp bug
>>
>> Michael Krolikowski wrote:
>>> Hi,
>>>
>>> I think I found a bug in the Linux SCTP implementation. I hope you
> are
>>> the right persons to ask for help with this.
>> The right place to ask is on linux-sctp mailing list.
>>
>>> If I send an SCTP INIT to a host which does not support SCTP (e.g.
> the
>>> module is not loaded), the
>>> other host sends an ICMP Protocol unreachable. This makes the SCTP
>>> module on the initiating host
>>> crash. It maybe that it crashes not at the first try but if I repeat
>> the
>>> SCTP INIT 3-4 times it will crash.
>> Hm..  I've tried to reproduce and couldn't with top of tree 2.6.31.
>> I've tried repeating INITs over the same path and over multiple paths,
>> but
>> didn't see a crash.
>>
>> Would you be able to do a bisect?
>>
>> Thanks
>> -vlad
>>
>>> See this message:
>>> SCTP: Hash tables configured (established 512 bind 512)
>>>
>>> Modules linked in: sctp
>>> Pid: 610, comm: sctp_test Not tainted 2.6.31
>>> RIP: 0033:[<00000000646228f9>]
>>> RSP: 0000000063873810  EFLAGS: 00010246
>>> RAX: 0000000000200200 RBX: 0000000063a20000 RCX: 00000000638e6800
>>> RDX: 0000000000100100 RSI: 000000006384b8c0 RDI: 0000000063a20000
>>> RBP: 0000000063873830 R08: 0000003000000008 R09: 0000000000000000
>>> R10: 000000000000000f R11: 0000000000000000 R12: 00000000ffffffea
>>> R13: 00000000638e6800 R14: 0000000063a20000 R15: 0000000063a20000
>>> Call Trace: 
>>> 601f1ad8:  [<60014bcd>] segv+0x1fd/0x20f
>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>
>>> Kernel panic - not syncing: Kernel mode fault at addr 0x100108, ip
>>> 0x646228f9
>>> Call Trace: 
>>> 601f19d8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f19e8:  [<60158b8d>] panic+0xd3/0x174
>>> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
>>> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
>>> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
>>> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
>>> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
>>> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>
>>>
>>> Modules linked in: sctp
>>> Pid: 610, comm: sctp_test Not tainted 2.6.31
>>> RIP: 0033:[<00000000404ef5c0>]
>>> RSP: 0000007fbf8613f8  EFLAGS: 00000246
>>> RAX: ffffffffffffffda RBX: 0000007fbf861460 RCX: ffffffffffffffff
>>> RDX: 0000000000000100 RSI: 0000007fbf861410 RDI: 0000000000000003
>>> RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000607560
>>> R13: 0000000000000002 R14: 0000000000000000 R15: 0000007fbf861450
>>> Call Trace: 
>>> 601f1960:  [<6004c462>] __module_text_address+0xd/0x5b
>>> 601f1978:  [<60014e05>] panic_exit+0x2f/0x45
>>> 601f1998:  [<60043417>] notifier_call_chain+0x33/0x5b
>>> 601f19c8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f19d8:  [<60043459>] atomic_notifier_call_chain+0xf/0x11
>>> 601f19e8:  [<60158b9e>] panic+0xe4/0x174
>>> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
>>> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
>>> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
>>> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
>>> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
>>> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>
>>> This error seems only to occur if the remote host answers with ICMP
>>> protocol unreachable.
>>> If the remote host answers with SCTP ABORT, the error won't occur.
>>>
>>>
>>> Thanks in advance,
>>>
>>> Michael Krolikowski
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-sctp"
> in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
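
A note on reading the 2.6.31 register dump quoted above: the fault
address 0x100108 and the register values RDX = 0x100100 and
RAX = 0x200200 match the kernel's list-poison constants (LIST_POISON1
and LIST_POISON2 in include/linux/poison.h for kernels of that era),
which list_del() stores into an entry after unlinking it. That would be
consistent with list_del(&asoc->asocs) running a second time on an
association that was already unlinked, although nothing in this thread
confirms it. The userspace sketch below only reproduces the address
arithmetic. The helper name list_del_fault_addr is hypothetical, and
the poison pointers are never dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Poison values as defined in include/linux/poison.h around 2.6.31
 * (assumption: no POISON_POINTER_DELTA offset applies). */
#define LIST_POISON1 ((uintptr_t)0x00100100)
#define LIST_POISON2 ((uintptr_t)0x00200200)

/* Same layout as the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * list_del(entry) first executes "entry->next->prev = entry->prev".
 * Given the value found in entry->next, return the address that this
 * store would write to. Hypothetical helper, for illustration only.
 */
static uintptr_t list_del_fault_addr(uintptr_t next)
{
	return next + offsetof(struct list_head, prev);
}
```

On a 64-bit build, list_del_fault_addr(LIST_POISON1) evaluates to
0x100108, the address reported in the panic line. On a 32-bit build the
offset of prev would be 4 rather than 8.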


^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: linux sctp bug
  2009-09-28 16:45 linux sctp bug Vlad Yasevich
                   ` (3 preceding siblings ...)
  2009-09-30 15:58 ` Vlad Yasevich
@ 2009-09-30 16:02 ` Michael Krolikowski
  2010-01-08 16:27 ` Michael Krolikowski
  2010-01-08 19:48 ` Vlad Yasevich
  6 siblings, 0 replies; 8+ messages in thread
From: Michael Krolikowski @ 2009-09-30 16:02 UTC (permalink / raw)
  To: linux-sctp

sctp_test -H 192.168.123.2 -P 12345 -h 192.168.123.3 -p 2345 -s

where 192.168.123.2 is the host that crashes and 192.168.123.3 is
the host that sends the ICMP messages.


Michael




^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: linux sctp bug
  2009-09-28 16:45 linux sctp bug Vlad Yasevich
                   ` (4 preceding siblings ...)
  2009-09-30 16:02 ` Michael Krolikowski
@ 2010-01-08 16:27 ` Michael Krolikowski
  2010-01-08 19:48 ` Vlad Yasevich
  6 siblings, 0 replies; 8+ messages in thread
From: Michael Krolikowski @ 2010-01-08 16:27 UTC (permalink / raw)
  To: linux-sctp

Hi!

After a long break I have continued testing the bug. I took one real
machine and a netfilter rule that always responds with ICMP "protocol
unreachable". The rule is as follows:
  iptables -A INPUT -p sctp --dport 12345 -j REJECT --reject-with \
    icmp-proto-unreachable
Then I start the following program, which repeatedly connects to
localhost:12345 and shuts down again.

/*BEGIN*/
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/sctp.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>

#define _RUNS_ 100
#define _CONNECT_PORT_ 12345

/* Report the failing call and leave test(). The socket is not closed
 * on this path, so it stays open until the process exits. */
#define _ERROR(a) { \
	perror(a);  \
	return 1;   \
}

int test(void)
{
	int sock;
	struct sockaddr_in sin_bind, sin_connect;
	struct sctp_initmsg init;

	/* create a one-to-one style SCTP socket */
	if((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP)) < 0)
		_ERROR("socket");

	/* bind socket to any local address, kernel-chosen port */
	memset(&sin_bind, 0, sizeof(struct sockaddr_in));
	sin_bind.sin_family = AF_INET;
	sin_bind.sin_addr.s_addr = INADDR_ANY;
	if(bind(sock, (struct sockaddr*)&sin_bind, sizeof(struct sockaddr_in)))
		_ERROR("bind");

	/* set sctp options (all four fields of struct sctp_initmsg) */
	init.sinit_num_ostreams   = 1;
	init.sinit_max_instreams  = 1;
	init.sinit_max_attempts   = 1;
	init.sinit_max_init_timeo = 1;
	if(setsockopt(sock, IPPROTO_SCTP, SCTP_INITMSG, &init,
		      sizeof(struct sctp_initmsg)))
		_ERROR("setsockopt");

	/* connect to the port covered by the REJECT rule */
	memset(&sin_connect, 0, sizeof(struct sockaddr_in));
	sin_connect.sin_family = AF_INET;
	sin_connect.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin_connect.sin_port = htons( _CONNECT_PORT_ );
	if(connect(sock, (struct sockaddr*)&sin_connect,
		   sizeof(struct sockaddr_in)))
		_ERROR("connect");

	/* shutdown socket in both directions */
	if(shutdown(sock, SHUT_RDWR))
		_ERROR("shutdown");

	return 0;
}

int main(int argc, char** argv)
{
	int i, ret;
	/* run the connect attempt _RUNS_ times; failures are only printed */
	for(i=0; i< _RUNS_ ; i++)
		ret = test();
	return 0;
}
/* END */

The initiation seems to be very important, while its values do not. I
launched the program once and waited a few seconds for the sctp module
to crash. I hope your machine crashes this time too ;-)
This time I tested on a real machine with Debian Lenny (Linux 2.6.26
Debian kernel) and a virtual machine with Linux 2.6.32.3. Both crashed.


Regards,

Michael

-----Original Message-----
From: linux-sctp-owner@vger.kernel.org
[mailto:linux-sctp-owner@vger.kernel.org] On Behalf Of Michael
Krolikowski
Sent: Mittwoch, 30. September 2009 18:02
To: Vlad Yasevich
Cc: linux-sctp@vger.kernel.org
Subject: RE: linux sctp bug

sctp_test -H 192.168.123.2 -P 12345 -h 192.168.123.3 -p 2345 -s

where 192.168.123.2 is the host which crashes and 192.168.123.3
The host which sends ICMP messages.


Michael


-----Original Message-----
From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
Sent: Mittwoch, 30. September 2009 17:58
To: Michael Krolikowski
Cc: linux-sctp@vger.kernel.org
Subject: Re: linux sctp bug

Michael Krolikowski wrote:
> I've first seen the bug in Debian Lenny with Debian's patched Linux
2.6.
> Now I've just installed Linux 2.6.26.8 (UML) and seen a different
> behavior:
> 
> SCTP: Hash tables configured (established 512 bind 512)
> BUG: soft lockup - CPU#0 stuck for 61s! [sctp_test:847]
> Modules linked in: sctp
> 
> Modules linked in: sctp
> Pid: 847, comm: sctp_test Not tainted 2.6.26.8
> RIP: 0033:[<0000000062dad9c2>]
> RSP: 0000000061f3b870  EFLAGS: 00000202
> RAX: 7360adde2c000001 RBX: 0000000061e20000 RCX: 0000000061f3b910
> RDX: 7360adde2c000001 RSI: 0000000000000000 RDI: 000000006150ea00
> RBP: 0000000061f3b880 R08: 0000000061e20140 R09: 0000000000000000
> R10: 0000000060228240 R11: 0000000000000049 R12: 0000000061e20000
> R13: 0000000061e20000 R14: 0000000062dbfeb5 R15: 0000000062dc1a00
> Call Trace:
> 601c7ae8:  [<6004e355>] softlockup_tick+0xf7/0x10a
> 601c7af8:  [<600318e7>] raise_softirq+0x64/0x6d
> 601c7b28:  [<60035bf0>] run_local_timers+0x18/0x1a
> 601c7b38:  [<60035c69>] update_process_times+0x2e/0x59
> 601c7b68:  [<600463c9>] tick_sched_timer+0x64/0x96
> 601c7b98:  [<600418da>] __run_hrtimer+0x26/0x6f
> 601c7bb8:  [<600421b2>] hrtimer_interrupt+0xe3/0x143
> 601c7bf8:  [<60012cd4>] um_timer+0xf/0x16
> 601c7c08:  [<6004e78a>] handle_IRQ_event+0x2b/0x5f
> 601c7c38:  [<6004e81f>] __do_IRQ+0x61/0xa6
> 601c7c68:  [<60010b8a>] do_IRQ+0x23/0x39
> 601c7c88:  [<60012d42>] timer_handler+0x21/0x2f
> 601c7ca8:  [<60020e87>] real_alarm_handler+0x3f/0x41
> 601c7cb8:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
> 601c7d30:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e
[sctp]
> 601c7db8:  [<60020ee5>] alarm_handler+0x2e/0x39
> 601c7dd8:  [<60021179>] handle_signal+0x6b/0xa1
> 601c7e10:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
> 601c7e28:  [<60022a90>] hard_handler+0x10/0x14
> 601c7e98:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
> 601c7ee8:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e
[sctp]
> 
> I did the test with the sctp_test tool from http://lksctp.sf.net/
> I just repeated executing the tool manually, so no tight loop.

Can you provide the command line args you use?  Want to try it in my KVM
sessions.

-vlad

> I always had both systems running with the same Linux Version. But
this
> shouldn't be the problem should it? It's always the same ICMP message
I
> get
> from the remote host.
> I did the test with Debian Lenny running inside VMware as well but
> didn't
> test inside KVM. I couldn't reproduce the bug in live systems but I
did
> only one quick test there. I'll give that a try and let you know - but
> it
> might take me a while.
> 
> Michael
> 
> 
> -----Original Message-----
> From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
> Sent: Mittwoch, 30. September 2009 16:31
> To: Michael Krolikowski
> Cc: linux-sctp@vger.kernel.org
> Subject: Re: linux sctp bug
> 
> Michael Krolikowski wrote:
>> Hi,
>>
>> I'm testing it using two UML machines. Both of them running Linux
>> 2.6.31.
>> I tried it today again and it seems that the error occurs not as I
> first
>> said after only a few tries but many tries later it does.
>> I also tried with 2.6.31.1 (UML) with the same results.
>> I used Debian Lenny with a 2.6.26 Linux where I got the error for the
>> first time.
> 
> So you were able to reproduce this with 2.6.26 kernel?
> 
> How do you test?  Do you just try to call connect() in a loop?
> 
> I run under KVM with a connect() call in a tight loop and see
> not issues.  My ICMP sender is an Ubuntu Jaunty (2.6.28-15-generic)
> kernel.
> 
> Looking at the stack trace you posted, the failure happens here:
>         if (!asoc->temp) {
>>>>              list_del(&asoc->asocs);
> 
> The addresses look very weird to.
> 
> Can reproduce this with live systems, or KVM?  I am suspecting UML...
> 
> -vlad
> 
> 
>> I hope this little information helps you a bit.
>>
>>
>> Regards,
>>
>> Michael
>>
>>
>> -----Original Message-----
>> From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
>> Sent: Montag, 28. September 2009 18:46
>> To: Michael Krolikowski
>> Cc: Sridhar Samudrala; Linux SCTP Dev Mailing list
>> Subject: Re: linux sctp bug
>>
>> Michael Krolikowski wrote:
>>> Hi,
>>>
>>> I think I found a bug in the Linux SCTP implementation. I hope you
> are
>>> the right persons to ask for help with this.
>> The right place to ask is on linux-sctp mailing list.
>>
>>> If I send an SCTP INIT to a host which does not support SCTP (e.g.
> the
>>> module is not loaded), the
>>> other host sends an ICMP Protocol unreachable. This makes the SCTP
>>> module on the initiating host
>>> crash. It maybe that it crashes not at the first try but if I repeat
>> the
>>> SCTP INIT 3-4 times it will crash.
>> Hm..  I've tried to reproduce and couldn't with top of tree 2.6.31.
>> I've tried repeating INITs over the same path and over multiple
paths,
>> but
>> didn't see a crash.
>>
>> Would you be able to do a bisect?
>>
>> Thanks
>> -vlad
>>
>>> See this message:
>>> SCTP: Hash tables configured (established 512 bind 512)
>>>
>>> Modules linked in: sctp
>>> Pid: 610, comm: sctp_test Not tainted 2.6.31
>>> RIP: 0033:[<00000000646228f9>]
>>> RSP: 0000000063873810  EFLAGS: 00010246
>>> RAX: 0000000000200200 RBX: 0000000063a20000 RCX: 00000000638e6800
>>> RDX: 0000000000100100 RSI: 000000006384b8c0 RDI: 0000000063a20000
>>> RBP: 0000000063873830 R08: 0000003000000008 R09: 0000000000000000
>>> R10: 000000000000000f R11: 0000000000000000 R12: 00000000ffffffea
>>> R13: 00000000638e6800 R14: 0000000063a20000 R15: 0000000063a20000
>>> Call Trace: 
>>> 601f1ad8:  [<60014bcd>] segv+0x1fd/0x20f
>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>
>>> Kernel panic - not syncing: Kernel mode fault at addr 0x100108, ip
>>> 0x646228f9
>>> Call Trace: 
>>> 601f19d8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f19e8:  [<60158b8d>] panic+0xd3/0x174
>>> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
>>> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
>>> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
>>> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
>>> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
>>> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>
>>>
>>> Modules linked in: sctp
>>> Pid: 610, comm: sctp_test Not tainted 2.6.31
>>> RIP: 0033:[<00000000404ef5c0>]
>>> RSP: 0000007fbf8613f8  EFLAGS: 00000246
>>> RAX: ffffffffffffffda RBX: 0000007fbf861460 RCX: ffffffffffffffff
>>> RDX: 0000000000000100 RSI: 0000007fbf861410 RDI: 0000000000000003
>>> RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000607560
>>> R13: 0000000000000002 R14: 0000000000000000 R15: 0000007fbf861450
>>> Call Trace: 
>>> 601f1960:  [<6004c462>] __module_text_address+0xd/0x5b
>>> 601f1978:  [<60014e05>] panic_exit+0x2f/0x45
>>> 601f1998:  [<60043417>] notifier_call_chain+0x33/0x5b
>>> 601f19c8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f19d8:  [<60043459>] atomic_notifier_call_chain+0xf/0x11
>>> 601f19e8:  [<60158b9e>] panic+0xe4/0x174
>>> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
>>> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
>>> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
>>> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
>>> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
>>> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>
>>> This error seems only to occur if the remote host answers with ICMP
>>> protocol unreachable.
>>> If the remote host answers with SCTP ABORT, the error won't occur.
>>>
>>>
>>> Thanks in advance,
>>>
>>> Michael Krolikowski
>>>

--
To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: linux sctp bug
  2009-09-28 16:45 linux sctp bug Vlad Yasevich
                   ` (5 preceding siblings ...)
  2010-01-08 16:27 ` Michael Krolikowski
@ 2010-01-08 19:48 ` Vlad Yasevich
  6 siblings, 0 replies; 8+ messages in thread
From: Vlad Yasevich @ 2010-01-08 19:48 UTC (permalink / raw)
  To: linux-sctp

Ok.  Seems to be crashing over loopback only...

When a remote is generating the ICMPs, I can't trigger the crash.

Thanks for the report.
-vlad

Michael Krolikowski wrote:
> Hi!
> 
> After a long break I have continued testing the bug. I took one real machine
> and a netfilter rule that always responds with ICMP "protocol unreachable".
> The rule is as follows:
>   iptables -A INPUT -p sctp --dport 12345 -j REJECT --reject-with \
>     icmp-proto-unreachable
> Then I start the following program, which repeatedly connects to
> localhost:12345 and shuts the socket down again.
> 
> /*BEGIN*/
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <netinet/in.h>
> #include <string.h>
> #include <stdio.h>
> #include <netinet/sctp.h>
> 
> #define RUNS 100
> #define CONNECT_PORT 12345
> 
> /* Report the failing call via perror() and make test() return 1. */
> #define FAIL(a) do { \
> 	perror(a);   \
> 	return 1;    \
> } while (0)
> 
> int test(void)
> {
> 	int sock;
> 	struct sockaddr_in sin_bind, sin_connect;
> 	struct sctp_initmsg init;
> 
> 	/* create a one-to-one style SCTP socket */
> 	if ((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP)) < 0)
> 		FAIL("socket");
> 
> 	/* bind socket to any local address; port 0 lets the kernel choose */
> 	memset(&sin_bind, 0, sizeof(sin_bind));
> 	sin_bind.sin_family = AF_INET;
> 	sin_bind.sin_addr.s_addr = INADDR_ANY;
> 	if (bind(sock, (struct sockaddr *)&sin_bind, sizeof(sin_bind)))
> 		FAIL("bind");
> 
> 	/* set sctp options */
> 	init.sinit_num_ostreams   = 1;
> 	init.sinit_max_instreams  = 1;
> 	init.sinit_max_attempts   = 1;
> 	init.sinit_max_init_timeo = 1;
> 	if (setsockopt(sock, IPPROTO_SCTP, SCTP_INITMSG, &init, sizeof(init)))
> 		FAIL("setsockopt");
> 
> 	/* connect to the loopback port covered by the REJECT rule */
> 	memset(&sin_connect, 0, sizeof(sin_connect));
> 	sin_connect.sin_family = AF_INET;
> 	sin_connect.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
> 	sin_connect.sin_port = htons(CONNECT_PORT);
> 	if (connect(sock, (struct sockaddr *)&sin_connect, sizeof(sin_connect)))
> 		FAIL("connect");
> 
> 	/* shutdown socket in both directions */
> 	if (shutdown(sock, SHUT_RDWR))
> 		FAIL("shutdown");
> 
> 	/* the socket is never closed; it stays open until the process exits */
> 	return 0;
> }
> 
> int main(void)
> {
> 	int i;
> 
> 	/* failed runs are reported by perror() and otherwise ignored */
> 	for (i = 0; i < RUNS; i++)
> 		test();
> 	return 0;
> }
> /* END */
> 
> The initialization seems to be very important, while its values do not
> matter. I launched the program once and waited a few seconds for the sctp
> module to crash. I hope this time your machine crashes too ;-)
> This time I tested on a real machine with Debian Lenny (Debian's Linux 2.6.26
> kernel) and a virtual machine with Linux 2.6.32.3. Both crashed.
> 
> 
> Regards,
> 
> Michael
> 
> -----Original Message-----
> From: linux-sctp-owner@vger.kernel.org
> [mailto:linux-sctp-owner@vger.kernel.org] On Behalf Of Michael Krolikowski
> Sent: Wednesday, 30 September 2009 18:02
> To: Vlad Yasevich
> Cc: linux-sctp@vger.kernel.org
> Subject: RE: linux sctp bug
> 
> sctp_test -H 192.168.123.2 -P 12345 -h 192.168.123.3 -p 2345 -s
> 
> where 192.168.123.2 is the host which crashes and 192.168.123.3 is
> the host which sends the ICMP messages.
> 
> 
> Michael
> 
> 
> -----Original Message-----
> From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
> Sent: Wednesday, 30 September 2009 17:58
> To: Michael Krolikowski
> Cc: linux-sctp@vger.kernel.org
> Subject: Re: linux sctp bug
> 
> Michael Krolikowski wrote:
>> I first saw the bug in Debian Lenny with Debian's patched Linux 2.6.
>> Now I've just installed Linux 2.6.26.8 (UML) and seen a different
>> behavior:
>>
>> SCTP: Hash tables configured (established 512 bind 512)
>> BUG: soft lockup - CPU#0 stuck for 61s! [sctp_test:847]
>> Modules linked in: sctp
>>
>> Modules linked in: sctp
>> Pid: 847, comm: sctp_test Not tainted 2.6.26.8
>> RIP: 0033:[<0000000062dad9c2>]
>> RSP: 0000000061f3b870  EFLAGS: 00000202
>> RAX: 7360adde2c000001 RBX: 0000000061e20000 RCX: 0000000061f3b910
>> RDX: 7360adde2c000001 RSI: 0000000000000000 RDI: 000000006150ea00
>> RBP: 0000000061f3b880 R08: 0000000061e20140 R09: 0000000000000000
>> R10: 0000000060228240 R11: 0000000000000049 R12: 0000000061e20000
>> R13: 0000000061e20000 R14: 0000000062dbfeb5 R15: 0000000062dc1a00
>> Call Trace:
>> 601c7ae8:  [<6004e355>] softlockup_tick+0xf7/0x10a
>> 601c7af8:  [<600318e7>] raise_softirq+0x64/0x6d
>> 601c7b28:  [<60035bf0>] run_local_timers+0x18/0x1a
>> 601c7b38:  [<60035c69>] update_process_times+0x2e/0x59
>> 601c7b68:  [<600463c9>] tick_sched_timer+0x64/0x96
>> 601c7b98:  [<600418da>] __run_hrtimer+0x26/0x6f
>> 601c7bb8:  [<600421b2>] hrtimer_interrupt+0xe3/0x143
>> 601c7bf8:  [<60012cd4>] um_timer+0xf/0x16
>> 601c7c08:  [<6004e78a>] handle_IRQ_event+0x2b/0x5f
>> 601c7c38:  [<6004e81f>] __do_IRQ+0x61/0xa6
>> 601c7c68:  [<60010b8a>] do_IRQ+0x23/0x39
>> 601c7c88:  [<60012d42>] timer_handler+0x21/0x2f
>> 601c7ca8:  [<60020e87>] real_alarm_handler+0x3f/0x41
>> 601c7cb8:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
>> 601c7d30:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e [sctp]
>> 601c7db8:  [<60020ee5>] alarm_handler+0x2e/0x39
>> 601c7dd8:  [<60021179>] handle_signal+0x6b/0xa1
>> 601c7e10:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
>> 601c7e28:  [<60022a90>] hard_handler+0x10/0x14
>> 601c7e98:  [<62dbfeb5>] sctp_pname+0x0/0x1a [sctp]
>> 601c7ee8:  [<62dad9c2>] sctp_assoc_update_retran_path+0x44/0x13e [sctp]
>> I did the test with the sctp_test tool from http://lksctp.sf.net/
>> I just repeated executing the tool manually, so no tight loop.
> 
> Can you provide the command line args you use?  Want to try it in my KVM
> sessions.
> 
> -vlad
> 
>> I always had both systems running the same Linux version. But this
>> shouldn't be the problem, should it? It's always the same ICMP message I
>> get from the remote host.
>> I did the test with Debian Lenny running inside VMware as well, but didn't
>> test inside KVM. I couldn't reproduce the bug on live systems, but I did
>> only one quick test there. I'll give that a try and let you know, but it
>> might take me a while.
>>
>> Michael
>>
>>
>> -----Original Message-----
>> From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
>> Sent: Wednesday, 30 September 2009 16:31
>> To: Michael Krolikowski
>> Cc: linux-sctp@vger.kernel.org
>> Subject: Re: linux sctp bug
>>
>> Michael Krolikowski wrote:
>>> Hi,
>>>
>>> I'm testing it using two UML machines, both of them running Linux 2.6.31.
>>> I tried it again today, and it seems the error does not occur after only
>>> a few tries, as I first said, but only after many more tries.
>>> I also tried with 2.6.31.1 (UML), with the same results.
>>> I first got the error on Debian Lenny with a 2.6.26 Linux kernel.
>> So you were able to reproduce this with 2.6.26 kernel?
>>
>> How do you test?  Do you just try to call connect() in a loop?
>>
>> I run under KVM with a connect() call in a tight loop and see
>> no issues.  My ICMP sender is an Ubuntu Jaunty (2.6.28-15-generic)
>> kernel.
>>
>> Looking at the stack trace you posted, the failure happens here:
>>         if (!asoc->temp) {
>>>>>              list_del(&asoc->asocs);
>> The addresses look very weird too.
>>
>> Can you reproduce this with live systems, or KVM?  I am suspecting UML...
>>
>> -vlad
>>
>>
>>> I hope this little information helps you a bit.
>>>
>>>
>>> Regards,
>>>
>>> Michael
>>>
>>>
>>> -----Original Message-----
>>> From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
>>> Sent: Monday, 28 September 2009 18:46
>>> To: Michael Krolikowski
>>> Cc: Sridhar Samudrala; Linux SCTP Dev Mailing list
>>> Subject: Re: linux sctp bug
>>>
>>> Michael Krolikowski wrote:
>>>> Hi,
>>>>
>>>> I think I found a bug in the Linux SCTP implementation. I hope you are
>>>> the right people to ask for help with this.
>>> The right place to ask is on linux-sctp mailing list.
>>>
>>>> If I send an SCTP INIT to a host which does not support SCTP (e.g. the
>>>> module is not loaded), the other host sends an ICMP Protocol
>>>> unreachable. This makes the SCTP module on the initiating host crash.
>>>> It may not crash on the first try, but if I repeat the SCTP INIT 3-4
>>>> times it will crash.
>>> Hm..  I've tried to reproduce and couldn't with top of tree 2.6.31.
>>> I've tried repeating INITs over the same path and over multiple paths,
>>> but didn't see a crash.
>>>
>>> Would you be able to do a bisect?
>>>
>>> Thanks
>>> -vlad
>>>
>>>> See this message:
>>>> SCTP: Hash tables configured (established 512 bind 512)
>>>>
>>>> Modules linked in: sctp
>>>> Pid: 610, comm: sctp_test Not tainted 2.6.31
>>>> RIP: 0033:[<00000000646228f9>]
>>>> RSP: 0000000063873810  EFLAGS: 00010246
>>>> RAX: 0000000000200200 RBX: 0000000063a20000 RCX: 00000000638e6800
>>>> RDX: 0000000000100100 RSI: 000000006384b8c0 RDI: 0000000063a20000
>>>> RBP: 0000000063873830 R08: 0000003000000008 R09: 0000000000000000
>>>> R10: 000000000000000f R11: 0000000000000000 R12: 00000000ffffffea
>>>> R13: 00000000638e6800 R14: 0000000063a20000 R15: 0000000063a20000
>>>> Call Trace: 
>>>> 601f1ad8:  [<60014bcd>] segv+0x1fd/0x20f
>>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>>
>>>> Kernel panic - not syncing: Kernel mode fault at addr 0x100108, ip 0x646228f9
>>>> Call Trace: 
>>>> 601f19d8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f19e8:  [<60158b8d>] panic+0xd3/0x174
>>>> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
>>>> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
>>>> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
>>>> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
>>>> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
>>>> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
>>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>>
>>>>
>>>> Modules linked in: sctp
>>>> Pid: 610, comm: sctp_test Not tainted 2.6.31
>>>> RIP: 0033:[<00000000404ef5c0>]
>>>> RSP: 0000007fbf8613f8  EFLAGS: 00000246
>>>> RAX: ffffffffffffffda RBX: 0000007fbf861460 RCX: ffffffffffffffff
>>>> RDX: 0000000000000100 RSI: 0000007fbf861410 RDI: 0000000000000003
>>>> RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000607560
>>>> R13: 0000000000000002 R14: 0000000000000000 R15: 0000007fbf861450
>>>> Call Trace: 
>>>> 601f1960:  [<6004c462>] __module_text_address+0xd/0x5b
>>>> 601f1978:  [<60014e05>] panic_exit+0x2f/0x45
>>>> 601f1998:  [<60043417>] notifier_call_chain+0x33/0x5b
>>>> 601f19c8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f19d8:  [<60043459>] atomic_notifier_call_chain+0xf/0x11
>>>> 601f19e8:  [<60158b9e>] panic+0xe4/0x174
>>>> 601f1a20:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f1a40:  [<6004c462>] __module_text_address+0xd/0x5b
>>>> 601f1a58:  [<6004c4b9>] is_module_text_address+0x9/0x11
>>>> 601f1a68:  [<6003e264>] __kernel_text_address+0x65/0x6b
>>>> 601f1a70:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f1a88:  [<60013a96>] show_trace+0x8e/0x92
>>>> 601f1aa8:  [<600271ff>] show_regs+0x2b/0x30
>>>> 601f1ad8:  [<60014bdf>] segv_handler+0x0/0xb9
>>>> 601f1b18:  [<601102f0>] process_backlog+0x8b/0xa9
>>>> 601f1b58:  [<60110904>] net_rx_action+0xe5/0x123
>>>> 601f1bb8:  [<60014c92>] segv_handler+0xb3/0xb9
>>>> 601f1bf8:  [<600329c4>] do_softirq+0x43/0x4a
>>>> 601f1c28:  [<60016439>] free_irqs+0x72/0xd4
>>>> 601f1c68:  [<60012108>] sigio_handler+0x5a/0x5f
>>>> 601f1c88:  [<60021a47>] sig_handler_common+0x87/0x9b
>>>> 601f1d10:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>> 601f1d30:  [<60017b51>] line_write_room+0x57/0x58
>>>> 601f1db8:  [<60021b90>] sig_handler+0x30/0x3b
>>>> 601f1dd8:  [<60021de9>] handle_signal+0x6b/0xa1
>>>> 601f1e28:  [<600236fc>] hard_handler+0x10/0x14
>>>> 601f1ee8:  [<646228f9>] sctp_association_free+0x2b/0x1e0 [sctp]
>>>>
>>>> This error seems only to occur if the remote host answers with ICMP
>>>> protocol unreachable.
>>>> If the remote host answers with SCTP ABORT, the error won't occur.
>>>>
>>>>
>>>> Thanks in advance,
>>>>
>>>> Michael Krolikowski
>>>>
> 
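A note on the list_del() line that Vlad points at in the quoted analysis: the
2.6.31 register dump shows RDX = 0x100100 and RAX = 0x200200, and the panic
reports a fault at address 0x100108. In 2.6.31, include/linux/poison.h defines
LIST_POISON1 as 0x00100100 and LIST_POISON2 as 0x00200200, and list_del()
stores them in the entry's next and prev pointers after unlinking it. A second
list_del() on the same entry would therefore first write to LIST_POISON1 plus
the offset of prev, which is 0x100108 on a 64-bit build. That is consistent
with the trace, but it is only one possible reading of it, not a confirmed
diagnosis. The user-space sketch below models the behaviour;
second_del_fault_addr() is a hypothetical helper that does not exist in the
kernel, and list_init() here stands in for the kernel's INIT_LIST_HEAD().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Poison values as in include/linux/poison.h of 2.6.31. */
#define LIST_POISON1 ((void *)0x00100100)
#define LIST_POISON2 ((void *)0x00200200)

/* Same layout as the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Make an empty list: the head points at itself in both directions. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry, then poison its pointers as the kernel does. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/*
 * Hypothetical helper (not a kernel function): the address that the
 * first store of another list_del(entry) would touch, i.e. the
 * address of entry->next->prev, computed without dereferencing it.
 */
static uintptr_t second_del_fault_addr(const struct list_head *entry)
{
	return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}
```

Calling list_init(), list_add() and list_del() on one entry and then passing
that entry to second_del_fault_addr() yields LIST_POISON1 plus the size of one
pointer, so 0x100108 where pointers are 8 bytes wide.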


end of thread, other threads:[~2010-01-08 19:48 UTC | newest]

Thread overview: 8+ messages
2009-09-28 16:45 linux sctp bug Vlad Yasevich
2009-09-30 12:49 ` Michael Krolikowski
2009-09-30 14:31 ` Vlad Yasevich
2009-09-30 15:32 ` Michael Krolikowski
2009-09-30 15:58 ` Vlad Yasevich
2009-09-30 16:02 ` Michael Krolikowski
2010-01-08 16:27 ` Michael Krolikowski
2010-01-08 19:48 ` Vlad Yasevich
