All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG in sctp crashes sles10sp2 kernel
@ 2008-12-11 14:52 Michal Hocko
  2008-12-11 15:28 ` Vlad Yasevich
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: Michal Hocko @ 2008-12-11 14:52 UTC (permalink / raw)
  To: linux-sctp

Hi Vlad,

I am starting this new thread because I am starting to believe that
sles10sp2 kernel (based on 2.6.16 upstream kernel) experiences different
issue than we can see in the upstream kernel (see bellow).

Karsten (CCing him) has found out following:
"
OK I think the
KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
net/ipv4/af_inet.c (149)

is related to the main problem here, it says that on the time a socket
get destroyed here is still some wmem allocated. This mean here is still
a transmit skb on the fly. Since sctp use skb destructors to do the
memory accounting, this also means that after destroying the socket, the
destructor of this skb will access the already freed socket struct,
which will let in some cases (if the memory is in use again and the
pointers are already overwritten) cause the crash with on
{sock_wfree+48} (which is a call to sk->sk_write_space(sk);).  Of course
it can crash in every other place, since the accounting may overwrite
pointers in any other struct, which reuse this memory.

I instrument some routines with extra debug (eg. inet_sock_destruct) too
see the amount of memory in sk->sk_wmem_alloc, it allmost show 

Dec 11 12:31:16 gw kernel: inet_sock_destruct:
sk(ffff810116960e00)->sk_wmem_alloc 496
Dec 11 12:31:17 gw kernel: inet_sock_destruct:
sk(ffff8101144f1b00)->sk_wmem_alloc 496
Dec 11 12:31:18 gw kernel: inet_sock_destruct:
sk(ffff8101144f1b00)->sk_wmem_alloc -496
Dec 11 12:31:20 gw kernel: inet_sock_destruct:
sk(ffff81011d461a00)->sk_wmem_alloc 496
Dec 11 12:31:21 gw kernel: inet_sock_destruct:
sk(ffff81011d460080)->sk_wmem_alloc 496

Note the -496, I think this is a case in which the same memory was again
allocated by a socket struct, so the memory still has valid pointers and
so on the destructor call for the old socket it did decrement the memory
on the new socket.

Do you agree with this analysis ?
"

I am trying to go through git logs but maybe you remember some fix in
this area.

If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
removes destructors from sctp completely, so the previous should not
happen in upstream, shouldn't it?

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
@ 2008-12-11 15:28 ` Vlad Yasevich
  2008-12-12 13:04 ` Karsten Keil
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2008-12-11 15:28 UTC (permalink / raw)
  To: linux-sctp

Michal Hocko wrote:
> Hi Vlad,
> 
> I am starting this new thread because I am starting to believe that
> sles10sp2 kernel (based on 2.6.16 upstream kernel) experiences different
> issue than we can see in the upstream kernel (see bellow).
> 
> Karsten (CCing him) has found out following:
> "
> OK I think the
> KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
> net/ipv4/af_inet.c (149)
> 
> is related to the main problem here, it says that on the time a socket
> get destroyed here is still some wmem allocated. This mean here is still
> a transmit skb on the fly. Since sctp use skb destructors to do the
> memory accounting, this also means that after destroying the socket, the
> destructor of this skb will access the already freed socket struct,
> which will let in some cases (if the memory is in use again and the
> pointers are already overwritten) cause the crash with on
> {sock_wfree+48} (which is a call to sk->sk_write_space(sk);).  Of course
> it can crash in every other place, since the accounting may overwrite
> pointers in any other struct, which reuse this memory.
> 
> I instrument some routines with extra debug (eg. inet_sock_destruct) too
> see the amount of memory in sk->sk_wmem_alloc, it allmost show 
> 
> Dec 11 12:31:16 gw kernel: inet_sock_destruct:
> sk(ffff810116960e00)->sk_wmem_alloc 496
> Dec 11 12:31:17 gw kernel: inet_sock_destruct:
> sk(ffff8101144f1b00)->sk_wmem_alloc 496
> Dec 11 12:31:18 gw kernel: inet_sock_destruct:
> sk(ffff8101144f1b00)->sk_wmem_alloc -496
> Dec 11 12:31:20 gw kernel: inet_sock_destruct:
> sk(ffff81011d461a00)->sk_wmem_alloc 496
> Dec 11 12:31:21 gw kernel: inet_sock_destruct:
> sk(ffff81011d460080)->sk_wmem_alloc 496
> 
> Note the -496, I think this is a case in which the same memory was again
> allocated by a socket struct, so the memory still has valid pointers and
> so on the destructor call for the old socket it did decrement the memory
> on the new socket.
> 
> Do you agree with this analysis ?
> "
> 
> I am trying to go through git logs but maybe you remember some fix in
> this area.
> 
> If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
> removes destructors from sctp completely, so the previous should not
> happen in upstream, shouldn't it?
> 


Here are a few commits that you need to check on:

61c9fed41638249f8b6ca5345064eb1beb50179f
[SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().

cfdeef3282705a4b872d3559c4e7d2561251363c
[SCTP]: Unhash the endpoint in sctp_endpoint_free().

f26f7c480555812ca7c4037e0a50fa54afe2cb4a
[SCTP]: Add bind hash locking to the migrate code


All of the above commits address races in the SCTP code and are not in the base
2.6.16 kernel.

-vlad


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
  2008-12-11 15:28 ` Vlad Yasevich
@ 2008-12-12 13:04 ` Karsten Keil
  2008-12-15 15:38 ` Vlad Yasevich
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Karsten Keil @ 2008-12-12 13:04 UTC (permalink / raw)
  To: linux-sctp

Hi Vlad,

On Thu, Dec 11, 2008 at 10:28:35AM -0500, Vlad Yasevich wrote:
> Michal Hocko wrote:
> > Hi Vlad,
> > 
> > I am starting this new thread because I am starting to believe that
> > sles10sp2 kernel (based on 2.6.16 upstream kernel) experiences different
> > issue than we can see in the upstream kernel (see bellow).
> > 
> > Karsten (CCing him) has found out following:
> > "
> > OK I think the
> > KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
> > net/ipv4/af_inet.c (149)
> > 
> > is related to the main problem here, it says that on the time a socket
> > get destroyed here is still some wmem allocated. This mean here is still
> > a transmit skb on the fly. Since sctp use skb destructors to do the
> > memory accounting, this also means that after destroying the socket, the
> > destructor of this skb will access the already freed socket struct,
> > which will let in some cases (if the memory is in use again and the
> > pointers are already overwritten) cause the crash with on
> > {sock_wfree+48} (which is a call to sk->sk_write_space(sk);).  Of course
> > it can crash in every other place, since the accounting may overwrite
> > pointers in any other struct, which reuse this memory.
> > 
> > I instrument some routines with extra debug (eg. inet_sock_destruct) too
> > see the amount of memory in sk->sk_wmem_alloc, it allmost show 
> > 
> > Dec 11 12:31:16 gw kernel: inet_sock_destruct:
> > sk(ffff810116960e00)->sk_wmem_alloc 496
> > Dec 11 12:31:17 gw kernel: inet_sock_destruct:
> > sk(ffff8101144f1b00)->sk_wmem_alloc 496
> > Dec 11 12:31:18 gw kernel: inet_sock_destruct:
> > sk(ffff8101144f1b00)->sk_wmem_alloc -496
> > Dec 11 12:31:20 gw kernel: inet_sock_destruct:
> > sk(ffff81011d461a00)->sk_wmem_alloc 496
> > Dec 11 12:31:21 gw kernel: inet_sock_destruct:
> > sk(ffff81011d460080)->sk_wmem_alloc 496
> > 
> > Note the -496, I think this is a case in which the same memory was again
> > allocated by a socket struct, so the memory still has valid pointers and
> > so on the destructor call for the old socket it did decrement the memory
> > on the new socket.
> > 
> > Do you agree with this analysis ?
> > "
> > 
> > I am trying to go through git logs but maybe you remember some fix in
> > this area.
> > 
> > If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
> > removes destructors from sctp completely, so the previous should not
> > happen in upstream, shouldn't it?
> > 
> 
> 
> Here are a few commits that you need to check on:
> 
> 61c9fed41638249f8b6ca5345064eb1beb50179f
> [SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().
> 
> cfdeef3282705a4b872d3559c4e7d2561251363c
> [SCTP]: Unhash the endpoint in sctp_endpoint_free().
> 
> f26f7c480555812ca7c4037e0a50fa54afe2cb4a
> [SCTP]: Add bind hash locking to the migrate code
> 
> 
> All of the above commits address races in the SCTP code and are not in the base
> 2.6.16 kernel.
> 

Thanks for your input.

61c9fed41638249f8b6ca5345064eb1beb50179f
[SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().

seems to fix this issue, I applied also the other patches.

Now I do not get any longer the "KERNEL: assertion
(!atomic_read(&sk->sk_wmem_alloc)) failed ..." messages.

But now I run into the skb_overflow BUG.
With some extra debug (based on your debug patch) I see:

Possible SKB overflow: packet size = 76, packet overhead = 32, packet chunk = 1/4, chunk len \x1040 packet padding 0 nskb len 12 mtu = 1500

packet chunk = 1/4 read as first chunk of total 4 chunks cause the overflow.

First I was thinking that maybe the padding cause this, so I also print this
value, but it is 0 in all traces.

I also applied
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h¹0a137d30a6322d76023d879d40fc31f3edf0a6

which sound likely to fix such kind of problem, but it seems that we do not
hit this, the bug is still here.


-- 
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 16746 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
  2008-12-11 15:28 ` Vlad Yasevich
  2008-12-12 13:04 ` Karsten Keil
@ 2008-12-15 15:38 ` Vlad Yasevich
  2008-12-15 17:02 ` Karsten Keil
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2008-12-15 15:38 UTC (permalink / raw)
  To: linux-sctp

Karsten Keil wrote:
> Hi Vlad,
> 
> On Thu, Dec 11, 2008 at 10:28:35AM -0500, Vlad Yasevich wrote:
>> Michal Hocko wrote:
>>> Hi Vlad,
>>>
>>> I am starting this new thread because I am starting to believe that
>>> sles10sp2 kernel (based on 2.6.16 upstream kernel) experiences different
>>> issue than we can see in the upstream kernel (see bellow).
>>>
>>> Karsten (CCing him) has found out following:
>>> "
>>> OK I think the
>>> KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
>>> net/ipv4/af_inet.c (149)
>>>
>>> is related to the main problem here, it says that on the time a socket
>>> get destroyed here is still some wmem allocated. This mean here is still
>>> a transmit skb on the fly. Since sctp use skb destructors to do the
>>> memory accounting, this also means that after destroying the socket, the
>>> destructor of this skb will access the already freed socket struct,
>>> which will let in some cases (if the memory is in use again and the
>>> pointers are already overwritten) cause the crash with on
>>> {sock_wfree+48} (which is a call to sk->sk_write_space(sk);).  Of course
>>> it can crash in every other place, since the accounting may overwrite
>>> pointers in any other struct, which reuse this memory.
>>>
>>> I instrument some routines with extra debug (eg. inet_sock_destruct) too
>>> see the amount of memory in sk->sk_wmem_alloc, it allmost show 
>>>
>>> Dec 11 12:31:16 gw kernel: inet_sock_destruct:
>>> sk(ffff810116960e00)->sk_wmem_alloc 496
>>> Dec 11 12:31:17 gw kernel: inet_sock_destruct:
>>> sk(ffff8101144f1b00)->sk_wmem_alloc 496
>>> Dec 11 12:31:18 gw kernel: inet_sock_destruct:
>>> sk(ffff8101144f1b00)->sk_wmem_alloc -496
>>> Dec 11 12:31:20 gw kernel: inet_sock_destruct:
>>> sk(ffff81011d461a00)->sk_wmem_alloc 496
>>> Dec 11 12:31:21 gw kernel: inet_sock_destruct:
>>> sk(ffff81011d460080)->sk_wmem_alloc 496
>>>
>>> Note the -496, I think this is a case in which the same memory was again
>>> allocated by a socket struct, so the memory still has valid pointers and
>>> so on the destructor call for the old socket it did decrement the memory
>>> on the new socket.
>>>
>>> Do you agree with this analysis ?
>>> "
>>>
>>> I am trying to go through git logs but maybe you remember some fix in
>>> this area.
>>>
>>> If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
>>> removes destructors from sctp completely, so the previous should not
>>> happen in upstream, shouldn't it?
>>>
>>
>> Here are a few commits that you need to check on:
>>
>> 61c9fed41638249f8b6ca5345064eb1beb50179f
>> [SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().
>>
>> cfdeef3282705a4b872d3559c4e7d2561251363c
>> [SCTP]: Unhash the endpoint in sctp_endpoint_free().
>>
>> f26f7c480555812ca7c4037e0a50fa54afe2cb4a
>> [SCTP]: Add bind hash locking to the migrate code
>>
>>
>> All of the above commits address races in the SCTP code and are not in the base
>> 2.6.16 kernel.
>>
> 
> Thanks for your input.
> 
> 61c9fed41638249f8b6ca5345064eb1beb50179f
> [SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().
> 
> seems to fix this issue, I applied also the other patches.
> 
> Now I do not get any longer the "KERNEL: assertion
> (!atomic_read(&sk->sk_wmem_alloc)) failed ..." messages.
> 
> But now I run into the skb_overflow BUG.
> With some extra debug (based on your debug patch) I see:
> 
> Possible SKB overflow: packet size = 76, packet overhead = 32, packet chunk = 1/4, chunk len \x1040 packet padding 0 nskb len 12 mtu = 1500
> 
> packet chunk = 1/4 read as first chunk of total 4 chunks cause the overflow.

OK.  It appears that the list of chunk on the packets is not cleaned up
correctly.  It could also be that in my prior debug patches I didn't reset
the number of chunks properly.

Since this is the same bug that I am trying to diagnose in 2.6.27+, it now
boils down to the same problem.  Either the packet is not cleaned up
correctly, or there is a race wrt packet chunk list.

I am currently somewhat off-line due to power issues at my house, but I'll
see if I can get a somewhat cleaned-up debug patch to you.

-vlad
> 
> First I was thinking that maybe the padding cause this, so I also print this
> value, but it is 0 in all traces.
> 
> I also applied
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h¹0a137d30a6322d76023d879d40fc31f3edf0a6
> 
> which sound likely to fix such kind of problem, but it seems that we do not
> hit this, the bug is still here.
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (2 preceding siblings ...)
  2008-12-15 15:38 ` Vlad Yasevich
@ 2008-12-15 17:02 ` Karsten Keil
  2008-12-15 17:41 ` Vlad Yasevich
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Karsten Keil @ 2008-12-15 17:02 UTC (permalink / raw)
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 8913 bytes --]

On Mon, Dec 15, 2008 at 10:38:56AM -0500, Vlad Yasevich wrote:
> Karsten Keil wrote:
> > Hi Vlad,
> > 
> > On Thu, Dec 11, 2008 at 10:28:35AM -0500, Vlad Yasevich wrote:
> >> Michal Hocko wrote:
> >>> Hi Vlad,
> >>>
> >>> I am starting this new thread because I am starting to believe that
> >>> sles10sp2 kernel (based on 2.6.16 upstream kernel) experiences different
> >>> issue than we can see in the upstream kernel (see bellow).
> >>>
> >>> Karsten (CCing him) has found out following:
> >>> "
> >>> OK I think the
> >>> KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
> >>> net/ipv4/af_inet.c (149)
> >>>
> >>> is related to the main problem here, it says that on the time a socket
> >>> get destroyed here is still some wmem allocated. This mean here is still
> >>> a transmit skb on the fly. Since sctp use skb destructors to do the
> >>> memory accounting, this also means that after destroying the socket, the
> >>> destructor of this skb will access the already freed socket struct,
> >>> which will let in some cases (if the memory is in use again and the
> >>> pointers are already overwritten) cause the crash with on
> >>> {sock_wfree+48} (which is a call to sk->sk_write_space(sk);).  Of course
> >>> it can crash in every other place, since the accounting may overwrite
> >>> pointers in any other struct, which reuse this memory.
> >>>
> >>> I instrument some routines with extra debug (eg. inet_sock_destruct) too
> >>> see the amount of memory in sk->sk_wmem_alloc, it allmost show 
> >>>
> >>> Dec 11 12:31:16 gw kernel: inet_sock_destruct:
> >>> sk(ffff810116960e00)->sk_wmem_alloc 496
> >>> Dec 11 12:31:17 gw kernel: inet_sock_destruct:
> >>> sk(ffff8101144f1b00)->sk_wmem_alloc 496
> >>> Dec 11 12:31:18 gw kernel: inet_sock_destruct:
> >>> sk(ffff8101144f1b00)->sk_wmem_alloc -496
> >>> Dec 11 12:31:20 gw kernel: inet_sock_destruct:
> >>> sk(ffff81011d461a00)->sk_wmem_alloc 496
> >>> Dec 11 12:31:21 gw kernel: inet_sock_destruct:
> >>> sk(ffff81011d460080)->sk_wmem_alloc 496
> >>>
> >>> Note the -496, I think this is a case in which the same memory was again
> >>> allocated by a socket struct, so the memory still has valid pointers and
> >>> so on the destructor call for the old socket it did decrement the memory
> >>> on the new socket.
> >>>
> >>> Do you agree with this analysis ?
> >>> "
> >>>
> >>> I am trying to go through git logs but maybe you remember some fix in
> >>> this area.
> >>>
> >>> If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
> >>> removes destructors from sctp completely, so the previous should not
> >>> happen in upstream, shouldn't it?
> >>>
> >>
> >> Here are a few commits that you need to check on:
> >>
> >> 61c9fed41638249f8b6ca5345064eb1beb50179f
> >> [SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().
> >>
> >> cfdeef3282705a4b872d3559c4e7d2561251363c
> >> [SCTP]: Unhash the endpoint in sctp_endpoint_free().
> >>
> >> f26f7c480555812ca7c4037e0a50fa54afe2cb4a
> >> [SCTP]: Add bind hash locking to the migrate code
> >>
> >>
> >> All of the above commits address races in the SCTP code and are not in the base
> >> 2.6.16 kernel.
> >>
> > 
> > Thanks for your input.
> > 
> > 61c9fed41638249f8b6ca5345064eb1beb50179f
> > [SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().
> > 
> > seems to fix this issue, I applied also the other patches.
> > 
> > Now I do not get any longer the "KERNEL: assertion
> > (!atomic_read(&sk->sk_wmem_alloc)) failed ..." messages.
> > 
> > But now I run into the skb_overflow BUG.
> > With some extra debug (based on your debug patch) I see:
> > 
> > Possible SKB overflow: packet size = 76, packet overhead = 32, packet chunk = 1/4, chunk len =1040 packet padding 0 nskb len 12 mtu = 1500
> > 
> > packet chunk = 1/4 read as first chunk of total 4 chunks cause the overflow.
> 
> OK.  It appears that the list of chunk on the packets is not cleaned up
> correctly.  It could also be that in my prior debug patches I didn't reset
> the number of chunks properly.

No I think that the debug patch is correct.
I already tried some things, so log "the history" of adding chunks in the
packet struct, but it does not really help me, but maybe you see something:

Possible SKB overflow: packet size = 1072, packet overhead = 32, packet chunk = 2/4, chunk len =1040 packet padding 0 nskb len 1052 mtu = 1500


chunk[1]: ps(36) ss(4) cs(4) ct(11)
chunk[2]: ps(48) ss(16) cs(16) ct(3)
chunk[3]: ps(1088) ss(1040) cs(1040) ct(0)
chunk[4]: ps(1072) ss(1040) cs(1040) ct(0)
skb_over_panic: text:ffffffff8824af48 len:2092 put:1040
head:ffff81006b8b8000 data:ffff81006b8b8074 tail:ffff81006b8b88a0
end:ffff81006b8b8500 dev:<NULL>

ps packet size after adding this chunk
ss chunk->skb->len
cs chunk_len
ct chunk type

I attach the patch at the end, so you see what I did.

All traces (with our 2.6.16) are similar to this one (which happend after
the above printout).

CPU 0
Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave
powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod
floppy r8169 parport_pc
Pid: 0, comm: swapper Tainted: G     U 2.6.16.60-0.33-smp #13
RIP: 0010:[<ffffffff8028667e>] <ffffffff8028667e>{skb_over_panic+77}
RSP: 0018:ffffffff8042fb68  EFLAGS: 00010296
RAX: 000000000000009d RBX: 0000000000000410 RCX: 00000000000046b2
RDX: ffff81006b8b8490 RSI: 0000000000000296 RDI: ffffffff8035a99c
RBP: ffff81010845e480 R08: 0000000000000000 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000000 R12: ffff81006bd87bc0
R13: ffff81006b8b8490 R14: ffff810069b3ad50 R15: ffff810069b3ac00
FS:  00002aba700166e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002ae33210c000 CR3: 0000000117424000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff803ee000, task ffffffff80354320)
Stack: ffff81006b8b88a0 ffff81006b8b8500 ffffffff803197b7 000000000000041c
       00000000000005dc ffffffff8824af52 0000000000000286 ffff810069b3ad58
       ffff810117a4f000 ffff810108042000
Call Trace: <IRQ> <ffffffff8824af52>{:sctp:sctp_packet_transmit+962}
       <ffffffff882437c7>{:sctp:sctp_outq_flush+1507}
<ffffffff8013d450>{do_timer+61}
       <ffffffff8010f5f6>{main_timer_handler+641}
<ffffffff801198b4>{smp_send_timer_broadcast_ipi+107}
       <ffffffff88243ab3>{:sctp:sctp_retransmit+116}
<ffffffff8010b35e>{ret_from_intr+0}
       <ffffffff8823baf1>{:sctp:sctp_do_sm+1503}
<ffffffff8014a88a>{ktime_get+14}
       <ffffffff8823c4c7>{:sctp:sctp_generate_t3_rtx_event+0}
       <ffffffff8823c53c>{:sctp:sctp_generate_t3_rtx_event+117}
       <ffffffff8013db23>{run_timer_softirq+352}
<ffffffff80139774>{__do_softirq+85}
       <ffffffff8010c222>{call_softirq+30} <ffffffff8010d1a4>{do_softirq+44}
       <ffffffff80109d41>{default_idle+0}
<ffffffff8010bb7c>{apic_timer_interrupt+132} <EOI>
       <ffffffff80109d41>{default_idle+0}
<ffffffff80109d7d>{default_idle+60}
       <ffffffff80109e57>{cpu_idle+160} <ffffffff803f0832>{start_kernel+626}
       <ffffffff803f02a2>{_sinittext+674}



> 
> Since this is the same bug that I am trying to diagnose in 2.6.27+, it now
> boils down to the same problem.  Either the packet is not cleaned up
> correctly, or there is a race wrt packet chunk list.
>

Unfortunately I could not reproduce it with 2.6.27 here (I did run the test
overnight, > 12 hour), with our 2.6.16 it triggers on my machine (Dual core
AMD 64) in 5 -10 minutes. I also tried 2.6.25, here I could see a similar
Oops after 3 hour.

I had the idea, that maybe the corruption comes from concurent access
to a chunk list, so I did instrument all list operation on the
chunklist with special debug spinlocks (use try_lock and print a message
if failed), the message was never printed, but it crashed with the same
trace, so I think we can rule out this (I used seperate locks for
each list_head).

 
> I am currently somewhat off-line due to power issues at my house, but I'll
> see if I can get a somewhat cleaned-up debug patch to you.
> 

What a pity, I had some power problems (with leak current triggering a
protection device) some weeks ago in my home as well (every few hour it shut
down my computers the hard way), very annoying.

So if you have any idea, what I can test or should log, I can do this.


> -vlad
> > 
> > First I was thinking that maybe the padding cause this, so I also print this
> > value, but it is 0 in all traces.
> > 
> > I also applied
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b90a137d30a6322d76023d879d40fc31f3edf0a6
> > 
> > which sound likely to fix such kind of problem, but it seems that we do not
> > hit this, the bug is still here.
> > 
> > 

-- 
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 16746 (AG Nuernberg)

[-- Attachment #2: debug.sctp --]
[-- Type: text/plain, Size: 3762 bytes --]

Index: linux/include/net/sctp/structs.h
===================================================================
--- linux.orig/include/net/sctp/structs.h
+++ linux/include/net/sctp/structs.h
@@ -735,6 +735,7 @@ typedef struct sctp_chunk *(sctp_packet_
 /* This structure holds lists of chunks as we are assembling for
  * transmission.
  */
+
 struct sctp_packet {
 	/* These are the SCTP header values (host order) for the packet. */
 	__u16 source_port;
@@ -743,7 +744,15 @@ struct sctp_packet {
 
 	/* This contains the payload chunks.  */
 	struct list_head chunk_list;
-
+	__u32 num_chunks;
+	__u32 num_pad;
+#ifdef SCTP_CHUNK_TRACEXDEBUG
+#define SCTP_CHUNK_MAXDEBUG	8
+	int dbg_psize[SCTP_CHUNK_MAXDEBUG];
+	int dbg_ssize[SCTP_CHUNK_MAXDEBUG];
+	int dbg_clen[SCTP_CHUNK_MAXDEBUG];
+	int dbg_ctype[SCTP_CHUNK_MAXDEBUG];
+#endif
 	/* This is the overhead of the sctp and ip headers. */
 	size_t overhead;
 	/* This is the total size of all chunks INCLUDING padding.  */
Index: linux/net/sctp/output.c
===================================================================
--- linux.orig/net/sctp/output.c
+++ linux/net/sctp/output.c
@@ -109,6 +109,8 @@ struct sctp_packet *sctp_packet_init(str
 	packet->source_port = sport;
 	packet->destination_port = dport;
 	INIT_LIST_HEAD(&packet->chunk_list);
+	packet->num_chunks = 0;
+	packet->num_pad = 0;
 	if (asoc) {
 		struct sctp_sock *sp = sctp_sk(asoc->base.sk);	
 		overhead = sp->pf->af->net_header_len; 
@@ -280,6 +282,15 @@ append:
 	/* It is OK to send this chunk.  */
 	list_add_tail(&chunk->list, &packet->chunk_list);
 	packet->size += chunk_len;
+#ifdef SCTP_CHUNK_TRACEXDEBUG
+	if (packet->num_chunks < SCTP_CHUNK_MAXDEBUG) {
+		packet->dbg_psize[packet->num_chunks] = packet->size;
+		packet->dbg_ssize[packet->num_chunks] = chunk->skb->len;
+		packet->dbg_clen[packet->num_chunks] = chunk_len;
+		packet->dbg_ctype[packet->num_chunks] = chunk->chunk_hdr->type;
+	}
+#endif
+	packet->num_chunks += 1;
 	chunk->transport = packet->transport;
 finish:
 	return retval;
@@ -302,6 +313,7 @@ int sctp_packet_transmit(struct sctp_pac
 	int err = 0;
 	int padding;		/* How much padding do we need?  */
 	__u8 has_data = 0;
+	int nr = 0;
 	struct dst_entry *dst;
 
 	SCTP_DEBUG_PRINTK("%s: packet:%p\n", __FUNCTION__, packet);
@@ -374,6 +386,7 @@ int sctp_packet_transmit(struct sctp_pac
 	SCTP_DEBUG_PRINTK("***sctp_transmit_packet***\n");
 	list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
 		list_del_init(&chunk->list);
+		nr++;
 		if (sctp_chunk_is_data(chunk)) {
 
 			if (!chunk->has_tsn) {
@@ -401,7 +414,23 @@ int sctp_packet_transmit(struct sctp_pac
 		padding = WORD_ROUND(chunk->skb->len) - chunk->skb->len;
 		if (padding)
 			memset(skb_put(chunk->skb, padding), 0, padding);
-
+		packet->num_pad += padding;
+		if ((nskb->tail + chunk->skb->len) > nskb->end) {
+			int i;
+			printk(KERN_EMERG "Possible SKB overflow: packet size = %u, packet overhead = %u, packet chunk = %d/%d, chunk len =%u packet padding %d nskb len %d mtu = %u\n",
+				packet->size, packet->overhead, nr, packet->num_chunks, chunk->skb->len, packet->num_pad, nskb->len, asoc?asoc->pathmtu:tp->pathmtu);
+#ifdef SKB_SOURCE_DEBUG
+			printk(KERN_EMERG "chunk type %d skb was from %s:%d\n", chunk->chunk_hdr->type, chunk->skb->_file, chunk->skb->_line);
+#endif
+#ifdef SCTP_CHUNK_TRACEXDEBUG
+			for (i = 0; i < packet->num_chunks; i++) {
+				if (i >= SCTP_CHUNK_MAXDEBUG)
+					break;
+				printk(KERN_EMERG "chunk[%d]: ps(%d) ss(%d) cs(%d) ct(%d)\n", i+ 1,
+					packet->dbg_psize[i], packet->dbg_ssize[i], packet->dbg_clen[i], packet->dbg_ctype[i]);
+			}
+#endif
+		}
 		crc32 = sctp_update_copy_cksum(skb_put(nskb, chunk->skb->len),
 					       chunk->skb->data,
 					       chunk->skb->len, crc32);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (3 preceding siblings ...)
  2008-12-15 17:02 ` Karsten Keil
@ 2008-12-15 17:41 ` Vlad Yasevich
  2008-12-15 17:42 ` Vlad Yasevich
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2008-12-15 17:41 UTC (permalink / raw)
  To: linux-sctp

Karsten Keil wrote:
> On Mon, Dec 15, 2008 at 10:38:56AM -0500, Vlad Yasevich wrote:
>> Karsten Keil wrote:
>>> Hi Vlad,
>>>
>>> On Thu, Dec 11, 2008 at 10:28:35AM -0500, Vlad Yasevich wrote:
>>>> Michal Hocko wrote:
>>>>> Hi Vlad,
>>>>>
>>>>> I am starting this new thread because I am starting to believe that
>>>>> sles10sp2 kernel (based on 2.6.16 upstream kernel) experiences different
>>>>> issue than we can see in the upstream kernel (see bellow).
>>>>>
>>>>> Karsten (CCing him) has found out following:
>>>>> "
>>>>> OK I think the
>>>>> KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
>>>>> net/ipv4/af_inet.c (149)
>>>>>
>>>>> is related to the main problem here, it says that on the time a socket
>>>>> get destroyed here is still some wmem allocated. This mean here is still
>>>>> a transmit skb on the fly. Since sctp use skb destructors to do the
>>>>> memory accounting, this also means that after destroying the socket, the
>>>>> destructor of this skb will access the already freed socket struct,
>>>>> which will let in some cases (if the memory is in use again and the
>>>>> pointers are already overwritten) cause the crash with on
>>>>> {sock_wfree+48} (which is a call to sk->sk_write_space(sk);).  Of course
>>>>> it can crash in every other place, since the accounting may overwrite
>>>>> pointers in any other struct, which reuse this memory.
>>>>>
>>>>> I instrument some routines with extra debug (eg. inet_sock_destruct) too
>>>>> see the amount of memory in sk->sk_wmem_alloc, it allmost show 
>>>>>
>>>>> Dec 11 12:31:16 gw kernel: inet_sock_destruct:
>>>>> sk(ffff810116960e00)->sk_wmem_alloc 496
>>>>> Dec 11 12:31:17 gw kernel: inet_sock_destruct:
>>>>> sk(ffff8101144f1b00)->sk_wmem_alloc 496
>>>>> Dec 11 12:31:18 gw kernel: inet_sock_destruct:
>>>>> sk(ffff8101144f1b00)->sk_wmem_alloc -496
>>>>> Dec 11 12:31:20 gw kernel: inet_sock_destruct:
>>>>> sk(ffff81011d461a00)->sk_wmem_alloc 496
>>>>> Dec 11 12:31:21 gw kernel: inet_sock_destruct:
>>>>> sk(ffff81011d460080)->sk_wmem_alloc 496
>>>>>
>>>>> Note the -496, I think this is a case in which the same memory was again
>>>>> allocated by a socket struct, so the memory still has valid pointers and
>>>>> so on the destructor call for the old socket it did decrement the memory
>>>>> on the new socket.
>>>>>
>>>>> Do you agree with this analysis ?
>>>>> "
>>>>>
>>>>> I am trying to go through git logs but maybe you remember some fix in
>>>>> this area.
>>>>>
>>>>> If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
>>>>> removes destructors from sctp completely, so the previous should not
>>>>> happen in upstream, shouldn't it?
>>>>>
>>>> Here are a few commits that you need to check on:
>>>>
>>>> 61c9fed41638249f8b6ca5345064eb1beb50179f
>>>> [SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().
>>>>
>>>> cfdeef3282705a4b872d3559c4e7d2561251363c
>>>> [SCTP]: Unhash the endpoint in sctp_endpoint_free().
>>>>
>>>> f26f7c480555812ca7c4037e0a50fa54afe2cb4a
>>>> [SCTP]: Add bind hash locking to the migrate code
>>>>
>>>>
>>>> All of the above commits address races in the SCTP code and are not in the base
>>>> 2.6.16 kernel.
>>>>
>>> Thanks for your input.
>>>
>>> 61c9fed41638249f8b6ca5345064eb1beb50179f
>>> [SCTP]: A better solution to fix therace between sctp_peeloff() and sctp_rcv().
>>>
>>> seems to fix this issue, I applied also the other patches.
>>>
>>> Now I do not get any longer the "KERNEL: assertion
>>> (!atomic_read(&sk->sk_wmem_alloc)) failed ..." messages.
>>>
>>> But now I run into the skb_overflow BUG.
>>> With some extra debug (based on your debug patch) I see:
>>>
>>> Possible SKB overflow: packet size = 76, packet overhead = 32, packet chunk = 1/4, chunk len \x1040 packet padding 0 nskb len 12 mtu = 1500
>>>
>>> packet chunk = 1/4 read as first chunk of total 4 chunks cause the overflow.
>> OK.  It appears that the list of chunk on the packets is not cleaned up
>> correctly.  It could also be that in my prior debug patches I didn't reset
>> the number of chunks properly.
> 
> No I think that the debug patch is correct.
> I already tried some things, so log "the history" of adding chunks in the
> packet struct, but it does not really help me, but maybe you see something:
> 
> Possible SKB overflow: packet size = 1072, packet overhead = 32, packet chunk = 2/4, chunk len \x1040 packet padding 0 nskb len 1052 mtu = 1500
> 
> 
> chunk[1]: ps(36) ss(4) cs(4) ct(11)
> chunk[2]: ps(48) ss(16) cs(16) ct(3)
> chunk[3]: ps(1088) ss(1040) cs(1040) ct(0)
> chunk[4]: ps(1072) ss(1040) cs(1040) ct(0)
> skb_over_panic: text:ffffffff8824af48 len:2092 put:1040
> head:ffff81006b8b8000 data:ffff81006b8b8074 tail:ffff81006b8b88a0
> end:ffff81006b8b8500 dev:<NULL>
> 
> ps packet size after adding this chunk
> ss chunk->skb->len
> cs chunk_len
> ct chunk type
> 
> I attach the patch at the end, so you see what I did.
> 
> All traces (with our 2.6.16) are similar to this one (which happend after
> the above printout).
> 
> CPU 0
> Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave
> powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod
> floppy r8169 parport_pc
> Pid: 0, comm: swapper Tainted: G     U 2.6.16.60-0.33-smp #13
> RIP: 0010:[<ffffffff8028667e>] <ffffffff8028667e>{skb_over_panic+77}
> RSP: 0018:ffffffff8042fb68  EFLAGS: 00010296
> RAX: 000000000000009d RBX: 0000000000000410 RCX: 00000000000046b2
> RDX: ffff81006b8b8490 RSI: 0000000000000296 RDI: ffffffff8035a99c
> RBP: ffff81010845e480 R08: 0000000000000000 R09: 0000000000000010
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff81006bd87bc0
> R13: ffff81006b8b8490 R14: ffff810069b3ad50 R15: ffff810069b3ac00
> FS:  00002aba700166e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00002ae33210c000 CR3: 0000000117424000 CR4: 00000000000006e0
> Process swapper (pid: 0, threadinfo ffffffff803ee000, task ffffffff80354320)
> Stack: ffff81006b8b88a0 ffff81006b8b8500 ffffffff803197b7 000000000000041c
>        00000000000005dc ffffffff8824af52 0000000000000286 ffff810069b3ad58
>        ffff810117a4f000 ffff810108042000
> Call Trace: <IRQ> <ffffffff8824af52>{:sctp:sctp_packet_transmit+962}
>        <ffffffff882437c7>{:sctp:sctp_outq_flush+1507}
> <ffffffff8013d450>{do_timer+61}
>        <ffffffff8010f5f6>{main_timer_handler+641}
> <ffffffff801198b4>{smp_send_timer_broadcast_ipi+107}
>        <ffffffff88243ab3>{:sctp:sctp_retransmit+116}
> <ffffffff8010b35e>{ret_from_intr+0}
>        <ffffffff8823baf1>{:sctp:sctp_do_sm+1503}
> <ffffffff8014a88a>{ktime_get+14}
>        <ffffffff8823c4c7>{:sctp:sctp_generate_t3_rtx_event+0}
>        <ffffffff8823c53c>{:sctp:sctp_generate_t3_rtx_event+117}
>        <ffffffff8013db23>{run_timer_softirq+352}
> <ffffffff80139774>{__do_softirq+85}
>        <ffffffff8010c222>{call_softirq+30} <ffffffff8010d1a4>{do_softirq+44}
>        <ffffffff80109d41>{default_idle+0}
> <ffffffff8010bb7c>{apic_timer_interrupt+132} <EOI>
>        <ffffffff80109d41>{default_idle+0}
> <ffffffff80109d7d>{default_idle+60}
>        <ffffffff80109e57>{cpu_idle+160} <ffffffff803f0832>{start_kernel+626}
>        <ffffffff803f02a2>{_sinittext+674}
> 
> 
> 
>> Since this is the same bug that I am trying to diagnose in 2.6.27+, it now
>> boils down to the same problem.  Either the packet is not cleaned up
>> correctly, or there is a race wrt packet chunk list.
>>
> 
> Unfortunately I could not reproduce it with 2.6.27 here (I did run the test
> overnight, > 12 hour), with our 2.6.16 it triggers on my machine (Dual core
> AMD 64) in 5 -10 minutes. I also tried 2.6.25, here I could see a similar
> Oops after 3 hour.
> 
> I had the idea, that maybe the corruption comes from concurent access
> to a chunk list, so I did instrument all list operation on the
> chunklist with special debug spinlocks (use try_lock and print a message
> if failed), the message was never printed, but it crashed with the same
> trace, so I think we can rule out this (I used seperate locks for
> each list_head).
> 

I've tried that as well with the same results.  However, what really got to me
was that when added

BUG_ON(!list_empty(&packet->chunk_list));

after the loop that took all the chunks from the list, I tripped over this
BUG.

The way the code is structured, at the end of sctp_packet_transmit() the
packet must be empty, i.e. the chunk_list must be empty and the number
of chunks must be 0.

See if you can play around with that.

FYI, I am currently running with the following patch on top of 2.6.28-rc6.
It is a cumulative of a few bugs I found and debugging for the chunk_list
corruption.  The BUG is currently commented out since I wanted to see how
it runs without it.

-vlad

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (4 preceding siblings ...)
  2008-12-15 17:41 ` Vlad Yasevich
@ 2008-12-15 17:42 ` Vlad Yasevich
  2008-12-18 12:35 ` Karsten Keil
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2008-12-15 17:42 UTC (permalink / raw)
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 40 bytes --]

Oh, forgot the actuall patch. ;)

-vlad

[-- Attachment #2: testing.patch --]
[-- Type: text/x-patch, Size: 10356 bytes --]

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 9661d7b..4ba2210 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -790,6 +790,7 @@ struct sctp_packet {
 	__u32 vtag;
 
 	/* This contains the payload chunks.  */
+	spinlock_t	lock;
 	struct list_head chunk_list;
 
 	/* This is the overhead of the sctp and ip headers. */
@@ -797,6 +798,7 @@ struct sctp_packet {
 	/* This is the total size of all chunks INCLUDING padding.  */
 	size_t size;
 
+	size_t num_chunks;
 	/* The packet is destined for this transport address.
 	 * The function we finally use to pass down to the next lower
 	 * layer lives in the transport structure.
diff --git a/net/sctp/output.c b/net/sctp/output.c
index c3f417f..529281f 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -79,12 +79,6 @@ struct sctp_packet *sctp_packet_config(struct sctp_packet *packet,
 			  packet, vtag);
 
 	packet->vtag = vtag;
-	packet->has_cookie_echo = 0;
-	packet->has_sack = 0;
-	packet->has_auth = 0;
-	packet->has_data = 0;
-	packet->ipfragok = 0;
-	packet->auth = NULL;
 
 	if (ecn_capable && sctp_packet_empty(packet)) {
 		chunk = sctp_get_ecne_prepend(packet->transport->asoc);
@@ -113,7 +107,9 @@ struct sctp_packet *sctp_packet_init(struct sctp_packet *packet,
 	packet->transport = transport;
 	packet->source_port = sport;
 	packet->destination_port = dport;
+	spin_lock_init(&packet->lock);
 	INIT_LIST_HEAD(&packet->chunk_list);
+	packet->num_chunks = 0;
 	if (asoc) {
 		struct sctp_sock *sp = sctp_sk(asoc->base.sk);
 		overhead = sp->pf->af->net_header_len;
@@ -134,6 +130,43 @@ struct sctp_packet *sctp_packet_init(struct sctp_packet *packet,
 	return packet;
 }
 
+static void sctp_packet_reset(struct sctp_packet *packet)
+{
+	struct sctp_chunk *chunk, *tmp;
+
+	SCTP_DEBUG_PRINTK("%s: packet:%p vtag:0x%x\n", __func__,
+			  packet, vtag);
+
+	packet->has_cookie_echo = 0;
+	packet->has_sack = 0;
+	packet->has_auth = 0;
+	packet->has_data = 0;
+	packet->ipfragok = 0;
+	packet->auth = NULL;
+	packet->size = packet->overhead;
+/*
+	spin_lock_bh(&packet->lock);
+*/
+	list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
+		list_del_init(&chunk->list);
+		packet->num_chunks--;
+		if (!sctp_chunk_is_data(chunk))
+			sctp_chunk_free(chunk);
+	}
+/*
+	if (packet->num_chunks ||
+	    !list_empty(&packet->chunk_list) {
+		printk(KERN_CRIT "packet %p, (%p,%p), num %d\n",
+			packet, packet->chunk_list.next,
+			packet->chunk_list.prev,
+			packet->num_chunks);
+		BUG();
+	}
+	spin_unlock_bh(&packet->lock);
+*/
+}
+
+
 /* Free a packet.  */
 void sctp_packet_free(struct sctp_packet *packet)
 {
@@ -141,6 +174,8 @@ void sctp_packet_free(struct sctp_packet *packet)
 
 	SCTP_DEBUG_PRINTK("%s: packet:%p\n", __func__, packet);
 
+	BUG_ON(spin_is_locked(&packet->lock));
+
 	list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
 		list_del_init(&chunk->list);
 		sctp_chunk_free(chunk);
@@ -324,14 +359,15 @@ append:
 	switch (chunk->chunk_hdr->type) {
 	    case SCTP_CID_DATA:
 		retval = sctp_packet_append_data(packet, chunk);
+		if (SCTP_XMIT_OK != retval)
+			goto finish;
 		/* Disallow SACK bundling after DATA. */
 		packet->has_sack = 1;
 		/* Disallow AUTH bundling after DATA */
 		packet->has_auth = 1;
 		/* Let it be knows that packet has DATA in it */
 		packet->has_data = 1;
-		if (SCTP_XMIT_OK != retval)
-			goto finish;
+		chunk->sent_at = jiffies;
 		break;
 	    case SCTP_CID_COOKIE_ECHO:
 		packet->has_cookie_echo = 1;
@@ -348,7 +384,10 @@ append:
 	}
 
 	/* It is OK to send this chunk.  */
+	//spin_lock_bh(&packet->lock);
+	packet->num_chunks++;
 	list_add_tail(&chunk->list, &packet->chunk_list);
+	//spin_unlock_bh(&packet->lock);
 	packet->size += chunk_len;
 	chunk->transport = packet->transport;
 finish:
@@ -367,7 +406,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	struct sctphdr *sh;
 	__be32 crc32 = __constant_cpu_to_be32(0);
 	struct sk_buff *nskb;
-	struct sctp_chunk *chunk, *tmp;
+	struct sctp_chunk *chunk;
 	struct sock *sk;
 	int err = 0;
 	int padding;		/* How much padding do we need?  */
@@ -382,6 +421,8 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	if (list_empty(&packet->chunk_list))
 		return err;
 
+	BUG_ON(packet->num_chunks == 0);
+
 	/* Set up convenience variables... */
 	chunk = list_entry(packet->chunk_list.next, struct sctp_chunk, list);
 	sk = chunk->skb->sk;
@@ -448,8 +489,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	 * [This whole comment explains WORD_ROUND() below.]
 	 */
 	SCTP_DEBUG_PRINTK("***sctp_transmit_packet***\n");
-	list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
-		list_del_init(&chunk->list);
+	list_for_each_entry(chunk, &packet->chunk_list, list) {
 		if (sctp_chunk_is_data(chunk)) {
 
 			if (!chunk->has_tsn) {
@@ -470,7 +510,6 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 			} else
 				chunk->resent = 1;
 
-			chunk->sent_at = jiffies;
 			has_data = 1;
 		}
 
@@ -485,6 +524,35 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 		if (chunk == packet->auth)
 			auth = skb_tail_pointer(nskb);
 
+		/* DEBUG: Check to see if this chunk will overflow the
+		 * skb.  Output needed info
+		 */
+		if ((nskb->tail + chunk->skb->len) > nskb->end) {
+			int i = 1;
+			struct sctp_chunk *tmp;
+			int num = packet->num_chunks;
+
+			printk("SKB %p overflow: p_size = %lu, n_chunks = %u, list(%p, %p)\n",
+			nskb,
+			packet->size, num,
+			packet->chunk_list.next, packet->chunk_list.prev);
+
+			list_for_each_entry(tmp, &packet->chunk_list, list) {
+			    if (i > num) {
+			    	printk(KERN_CRIT
+				"*** Overflow Chunk [%d] %p[%d], nxt %p, prv %p, skb_len %d\n",
+				i, tmp, tmp->chunk_hdr->type,
+				tmp->list.next, tmp->list.prev,
+				tmp->skb->len);
+			    } else {
+					printk("Chunk %p[%d], skb_length %d\n",
+						tmp,
+						tmp->chunk_hdr->type,
+						tmp->skb->len);
+			    }
+			    i++;
+			}
+		}
 		cksum_buf_len += chunk->skb->len;
 		memcpy(skb_put(nskb, chunk->skb->len),
 			       chunk->skb->data, chunk->skb->len);
@@ -500,13 +568,6 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 				  "chunk->skb->len", chunk->skb->len,
 				  "rtt_in_progress", chunk->rtt_in_progress);
 
-		/*
-		 * If this is a control chunk, this is our last
-		 * reference. Free data chunks after they've been
-		 * acknowledged or have failed.
-		 */
-		if (!sctp_chunk_is_data(chunk))
-			sctp_chunk_free(chunk);
 	}
 
 	/* SCTP-AUTH, Section 6.2
@@ -591,8 +652,9 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	(*tp->af_specific->sctp_xmit)(nskb, tp);
 
 out:
-	packet->size = packet->overhead;
+	sctp_packet_reset(packet);
 	return err;
+
 no_route:
 	kfree_skb(nskb);
 	IP_INC_STATS_BH(&init_net, IPSTATS_MIB_OUTNOROUTES);
@@ -604,21 +666,10 @@ no_route:
 	 * to the user via notifications. So setting this error may not be
 	 * required.
 	 */
-	 /* err = -EHOSTUNREACH; */
-err:
-	/* Control chunks are unreliable so just drop them.  DATA chunks
-	 * will get resent or dropped later.
-	 */
-
-	list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
-		list_del_init(&chunk->list);
-		if (!sctp_chunk_is_data(chunk))
-			sctp_chunk_free(chunk);
-	}
 	goto out;
 nomem:
 	err = -ENOMEM;
-	goto err;
+	goto out;
 }
 
 /********************************************************************
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 247ebc9..5d81bb6 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -560,6 +560,7 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 	asoc = q->asoc;
 	lqueue = &q->retransmit;
 	fast_rtx = q->fast_rtx;
+	q->fast_rtx = 0;
 
 	/* This loop handles time-out retransmissions, fast retransmissions,
 	 * and retransmissions due to opening of whindow.
@@ -592,8 +593,7 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 		 * next chunk.
 		 */
 		if (chunk->tsn_gap_acked) {
-			list_del(&chunk->transmitted_list);
-			list_add_tail(&chunk->transmitted_list,
+			list_move_tail(&chunk->transmitted_list,
 					&transport->transmitted);
 			continue;
 		}
@@ -604,6 +604,7 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 		if (fast_rtx && !chunk->fast_retransmit)
 			continue;
 
+again:
 		/* Attempt to append this chunk to the packet. */
 		status = sctp_packet_append_chunk(pkt, chunk);
 
@@ -617,20 +618,13 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			 */
 			if (rtx_timeout || fast_rtx)
 				done = 1;
+			else
+				goto again;
 
 			/* Bundle next chunk in the next round.  */
 			break;
 
 		case SCTP_XMIT_RWND_FULL:
-			/* Send this packet. */
-			error = sctp_packet_transmit(pkt);
-
-			/* Stop sending DATA as there is no more room
-			 * at the receiver.
-			 */
-			done = 1;
-			break;
-
 		case SCTP_XMIT_NAGLE_DELAY:
 			/* Send this packet. */
 			error = sctp_packet_transmit(pkt);
@@ -643,8 +637,7 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			/* The append was successful, so add this chunk to
 			 * the transmitted list.
 			 */
-			list_del(&chunk->transmitted_list);
-			list_add_tail(&chunk->transmitted_list,
+			list_move_tail(&chunk->transmitted_list,
 					&transport->transmitted);
 
 			/* Mark the chunk as ineligible for fast retransmit
@@ -687,10 +680,6 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 
 	*start_timer = timer;
 
-	/* Clear fast retransmit hint */
-	if (fast_rtx)
-		q->fast_rtx = 0;
-
 	return error;
 }
 
@@ -1028,7 +1017,7 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
 			list_add_tail(&chunk->transmitted_list,
 				      &transport->transmitted);
 
-			sctp_transport_reset_timers(transport, start_timer-1);
+			sctp_transport_reset_timers(transport, 0);
 
 			q->empty = 0;
 
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index a6a0ea7..442752f 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -2650,7 +2650,7 @@ sctp_disposition_t sctp_sf_do_9_2_shut_ctsn(const struct sctp_endpoint *ep,
 	sctp_add_cmd_sf(commands, SCTP_CMD_PROCESS_CTSN,
 			SCTP_BE32(sdh->cum_tsn_ack));
 
-	return SCTP_DISPOSITION_CONSUME;
+	return sctp_sf_do_9_2_shutdown_ack(ep, asoc, type, arg, commands);
 }
 
 /* RFC 2960 9.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (5 preceding siblings ...)
  2008-12-15 17:42 ` Vlad Yasevich
@ 2008-12-18 12:35 ` Karsten Keil
  2008-12-18 17:30 ` Karsten Keil
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Karsten Keil @ 2008-12-18 12:35 UTC (permalink / raw)
  To: linux-sctp

On Mon, Dec 15, 2008 at 12:42:01PM -0500, Vlad Yasevich wrote:
> Oh, forgot the actuall patch. ;)
> 
> -vlad


OK I did port back this patch to our 2.6.16 kernel and it seems it
fix some of the pathes to trigger the bug, but after 2.5 hour it
triggers a overflow again, this time with different call path in the
trace:

Dec 16 18:50:17 gw kernel: SKB ffff81012681f340 overflow: p_size = 32, n_chunks = 0, list(ffff81010be63160, ffff81010be63160)
Dec 16 18:50:17 gw kernel: skb_over_panic: text:ffffffff8824af13 len:2092 put:1040 head:ffff81010e1a7000 data:ffff81010e1a7074 tail:ffff81010e1a78a0 end:ffff81010e1a7500 dev:<NULL>
Dec 16 18:50:17 gw kernel: ----------- [cut here ] --------- [please bite here ] ---------
Dec 16 18:50:17 gw kernel: Kernel BUG at net/core/skbuff.c:103
Dec 16 18:50:17 gw kernel: invalid opcode: 0000 [1] SMP 
Dec 16 18:50:17 gw kernel: last sysfs file: /devices/pci0000:00/0000:00:14.4/0000:04:05.0/irq
Dec 16 18:50:17 gw kernel: CPU 0 
Dec 16 18:50:17 gw kernel: Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod floppy r8169 parport_pc lp parport ext3 jbd edd fan thermal processor atiixp sg ahci libata sd_mod scsi_mod ide_disk ide_core
Dec 16 18:50:17 gw kernel: Pid: 11882, comm: sctpcrash Tainted: G     U 2.6.16.60-0.33-smp #13
Dec 16 18:50:17 gw kernel: RIP: 0010:[<ffffffff8028667e>] <ffffffff8028667e>{skb_over_panic+77}
Dec 16 18:50:17 gw kernel: RSP: 0018:ffff81010cbe17e8  EFLAGS: 00010292
Dec 16 18:50:17 gw kernel: RAX: 000000000000009d RBX: 0000000000000410 RCX: 0000000000000292
Dec 16 18:50:17 gw kernel: RDX: ffffffff8035a968 RSI: 0000000000000292 RDI: ffffffff8035a960
Dec 16 18:50:17 gw kernel: RBP: ffff81012681f340 R08: ffffffff8035a968 R09: ffff810128190ac0
Dec 16 18:50:17 gw kernel: R10: ffff810001023780 R11: 0000000000000070 R12: ffff81010e1a7490
Dec 16 18:50:17 gw kernel: R13: ffff81010be63150 R14: ffff81010c926580 R15: ffff81010e1a7074
Dec 16 18:50:17 gw kernel: FS:  00002aeca5d8c6e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
Dec 16 18:50:17 gw kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 16 18:50:17 gw kernel: CR2: 00002b363a857000 CR3: 0000000110154000 CR4: 00000000000006e0
Dec 16 18:50:17 gw kernel: Process sctpcrash (pid: 11882, threadinfo ffff81010cbe0000, task ffff810129ece100)
Dec 16 18:50:17 gw kernel: Stack: ffff81010e1a78a0 ffff81010e1a7500 ffffffff803197b7 ffff81010c926580 
Dec 16 18:50:17 gw kernel:        ffff81012681f340 ffffffff8824af1d ffff81010be63160 ffff81010b6f3000 
Dec 16 18:50:17 gw kernel:        ffff81010be63000 ffff81010c948000 
Dec 16 18:50:17 gw kernel: Call Trace: <ffffffff8824af1d>{:sctp:sctp_packet_transmit+851}
Dec 16 18:50:17 gw kernel:        <ffffffff88243810>{:sctp:sctp_outq_flush+1524} <ffffffff8823c2a7>{:sctp:sctp_do_sm+3421}
Dec 16 18:50:17 gw kernel:        <ffffffff8823f22d>{:sctp:sctp_datamsg_from_user+689}
Dec 16 18:50:17 gw kernel:        <ffffffff8824ab05>{:sctp:sctp_primitive_SEND+57} <ffffffff882498ac>{:sctp:sctp_sendmsg+0}
Dec 16 18:50:18 gw kernel:        <ffffffff8824a158>{:sctp:sctp_sendmsg+2220} <ffffffff80283a6a>{release_sock+19}
Dec 16 18:50:18 gw kernel:        <ffffffff8824a6eb>{:sctp:sctp_recvmsg+1024} <ffffffff8028159b>{sock_sendmsg+251}
Dec 16 18:50:18 gw kernel:        <ffffffff80148092>{autoremove_wake_function+0} <ffffffff801fa28c>{sprintf+81}
Dec 16 18:50:18 gw kernel:        <ffffffff80281b6f>{sys_sendmsg+523} <ffffffff801ab20e>{inotify_d_instantiate+26}
Dec 16 18:50:18 gw kernel:        <ffffffff80184655>{fd_install+46} <ffffffff80283bae>{lock_sock+169}
Dec 16 18:50:18 gw kernel:        <ffffffff80283a6a>{release_sock+19} <ffffffff80284e5e>{sock_setsockopt+1230}
Dec 16 18:50:18 gw kernel:        <ffffffff8010ae42>{system_call+126}
Dec 16 18:50:18 gw kernel: 
Dec 16 18:50:18 gw kernel: Code: 0f 0b 68 86 90 32 80 c2 67 00 48 83 c4 28 c3 41 57 41 56 49 
Dec 16 18:50:18 gw kernel: RIP <ffffffff8028667e>{skb_over_panic+77} RSP <ffff81010cbe17e8>

-- 
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 16746 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (6 preceding siblings ...)
  2008-12-18 12:35 ` Karsten Keil
@ 2008-12-18 17:30 ` Karsten Keil
  2008-12-18 18:03 ` Vlad Yasevich
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Karsten Keil @ 2008-12-18 17:30 UTC (permalink / raw)
  To: linux-sctp

On Thu, Dec 18, 2008 at 01:35:08PM +0100, Karsten Keil wrote:
> On Mon, Dec 15, 2008 at 12:42:01PM -0500, Vlad Yasevich wrote:
> > Oh, forgot the actuall patch. ;)
> > 
> > -vlad
> 
> 
> OK I did port back this patch to our 2.6.16 kernel and it seems it
> fix some of the pathes to trigger the bug, but after 2.5 hour it
> triggers a overflow again, this time with different call path in the
> trace:
> 
> Dec 16 18:50:17 gw kernel: SKB ffff81012681f340 overflow: p_size = 32, n_chunks = 0, list(ffff81010be63160, ffff81010be63160)
> Dec 16 18:50:17 gw kernel: skb_over_panic: text:ffffffff8824af13 len:2092 put:1040 head:ffff81010e1a7000 data:ffff81010e1a7074 tail:ffff81010e1a78a0 end:ffff81010e1a7500 dev:<NULL>
> Dec 16 18:50:17 gw kernel: ----------- [cut here ] --------- [please bite here ] ---------
> Dec 16 18:50:17 gw kernel: Kernel BUG at net/core/skbuff.c:103
> Dec 16 18:50:17 gw kernel: invalid opcode: 0000 [1] SMP 
> Dec 16 18:50:17 gw kernel: last sysfs file: /devices/pci0000:00/0000:00:14.4/0000:04:05.0/irq
> Dec 16 18:50:17 gw kernel: CPU 0 
> Dec 16 18:50:17 gw kernel: Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod floppy r8169 parport_pc lp parport ext3 jbd edd fan thermal processor atiixp sg ahci libata sd_mod scsi_mod ide_disk ide_core
> Dec 16 18:50:17 gw kernel: Pid: 11882, comm: sctpcrash Tainted: G     U 2.6.16.60-0.33-smp #13
> Dec 16 18:50:17 gw kernel: RIP: 0010:[<ffffffff8028667e>] <ffffffff8028667e>{skb_over_panic+77}
> Dec 16 18:50:17 gw kernel: RSP: 0018:ffff81010cbe17e8  EFLAGS: 00010292
> Dec 16 18:50:17 gw kernel: RAX: 000000000000009d RBX: 0000000000000410 RCX: 0000000000000292
> Dec 16 18:50:17 gw kernel: RDX: ffffffff8035a968 RSI: 0000000000000292 RDI: ffffffff8035a960
> Dec 16 18:50:17 gw kernel: RBP: ffff81012681f340 R08: ffffffff8035a968 R09: ffff810128190ac0
> Dec 16 18:50:17 gw kernel: R10: ffff810001023780 R11: 0000000000000070 R12: ffff81010e1a7490
> Dec 16 18:50:17 gw kernel: R13: ffff81010be63150 R14: ffff81010c926580 R15: ffff81010e1a7074
> Dec 16 18:50:17 gw kernel: FS:  00002aeca5d8c6e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
> Dec 16 18:50:17 gw kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Dec 16 18:50:17 gw kernel: CR2: 00002b363a857000 CR3: 0000000110154000 CR4: 00000000000006e0
> Dec 16 18:50:17 gw kernel: Process sctpcrash (pid: 11882, threadinfo ffff81010cbe0000, task ffff810129ece100)
> Dec 16 18:50:17 gw kernel: Stack: ffff81010e1a78a0 ffff81010e1a7500 ffffffff803197b7 ffff81010c926580 
> Dec 16 18:50:17 gw kernel:        ffff81012681f340 ffffffff8824af1d ffff81010be63160 ffff81010b6f3000 
> Dec 16 18:50:17 gw kernel:        ffff81010be63000 ffff81010c948000 
> Dec 16 18:50:17 gw kernel: Call Trace: <ffffffff8824af1d>{:sctp:sctp_packet_transmit+851}
> Dec 16 18:50:17 gw kernel:        <ffffffff88243810>{:sctp:sctp_outq_flush+1524} <ffffffff8823c2a7>{:sctp:sctp_do_sm+3421}
> Dec 16 18:50:17 gw kernel:        <ffffffff8823f22d>{:sctp:sctp_datamsg_from_user+689}
> Dec 16 18:50:17 gw kernel:        <ffffffff8824ab05>{:sctp:sctp_primitive_SEND+57} <ffffffff882498ac>{:sctp:sctp_sendmsg+0}
> Dec 16 18:50:18 gw kernel:        <ffffffff8824a158>{:sctp:sctp_sendmsg+2220} <ffffffff80283a6a>{release_sock+19}
> Dec 16 18:50:18 gw kernel:        <ffffffff8824a6eb>{:sctp:sctp_recvmsg+1024} <ffffffff8028159b>{sock_sendmsg+251}
> Dec 16 18:50:18 gw kernel:        <ffffffff80148092>{autoremove_wake_function+0} <ffffffff801fa28c>{sprintf+81}
> Dec 16 18:50:18 gw kernel:        <ffffffff80281b6f>{sys_sendmsg+523} <ffffffff801ab20e>{inotify_d_instantiate+26}
> Dec 16 18:50:18 gw kernel:        <ffffffff80184655>{fd_install+46} <ffffffff80283bae>{lock_sock+169}
> Dec 16 18:50:18 gw kernel:        <ffffffff80283a6a>{release_sock+19} <ffffffff80284e5e>{sock_setsockopt+1230}
> Dec 16 18:50:18 gw kernel:        <ffffffff8010ae42>{system_call+126}
> Dec 16 18:50:18 gw kernel: 
> Dec 16 18:50:18 gw kernel: Code: 0f 0b 68 86 90 32 80 c2 67 00 48 83 c4 28 c3 41 57 41 56 49 
> Dec 16 18:50:18 gw kernel: RIP <ffffffff8028667e>{skb_over_panic+77} RSP <ffff81010cbe17e8>
> 


Next crash (after 2 hour):
Dec 18 16:32:47 gw kernel: ----------- [cut here ] --------- [please bite here ] ---------
Dec 18 16:32:47 gw kernel: Kernel BUG at net/sctp/output.c:352
Dec 18 16:32:47 gw kernel: invalid opcode: 0000 [1] SMP 
Dec 18 16:32:47 gw kernel: last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Dec 18 16:32:47 gw kernel: CPU 0 
Dec 18 16:32:47 gw kernel: Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod floppy r8169 parport_pc lp parport ext3 jbd edd fan thermal processor atiixp sg ahci libata sd_mod scsi_mod ide_disk ide_core
Dec 18 16:32:47 gw kernel: Pid: 3904, comm: sctpcrash Tainted: G     U 2.6.16.60-0.33-smp #13
Dec 18 16:32:47 gw kernel: RIP: 0010:[<ffffffff8824ac0d>] <ffffffff8824ac0d>{:sctp:sctp_packet_transmit+67}
Dec 18 16:32:47 gw kernel: RSP: 0018:ffff8101181af818  EFLAGS: 00010246
Dec 18 16:32:47 gw kernel: RAX: ffff810119ccfd60 RBX: ffff8101180e6cc0 RCX: 0000000000000020
Dec 18 16:32:47 gw kernel: RDX: ffff810119ccfd60 RSI: 0000000000000286 RDI: ffff810119ccfd50
Dec 18 16:32:47 gw kernel: RBP: ffff810119ccfc00 R08: 0000000000001f3a R09: 000000000000d400
Dec 18 16:32:47 gw kernel: R10: 0000000000000000 R11: ffff81011790d5d0 R12: 0000000000000000
Dec 18 16:32:47 gw kernel: R13: ffff810119ccfd50 R14: ffff8101190e15b0 R15: ffff810119ccfd50
Dec 18 16:32:47 gw kernel: FS:  00002b301e55b6e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
Dec 18 16:32:47 gw kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 18 16:32:47 gw kernel: CR2: 00002b539596d000 CR3: 000000011ccc0000 CR4: 00000000000006e0
Dec 18 16:32:47 gw kernel: Process sctpcrash (pid: 3904, threadinfo ffff8101181ae000, task ffff81012a32a080)
Dec 18 16:32:47 gw kernel: Stack: ffff810119ccfd60 ffff810119ccfc00 ffff810119ccfc00 ffff8101190e0000 
Dec 18 16:32:47 gw kernel:        ffff810119ccfd50 ffff8101180e6cc0 0000000000000286 ffff8101180e6cc0 
Dec 18 16:32:47 gw kernel:        ffff810119ccfc00 ffff8101181af998 
Dec 18 16:32:47 gw kernel: Call Trace: <ffffffff88243810>{:sctp:sctp_outq_flush+1524}
Dec 18 16:32:47 gw kernel:        <ffffffff8011e49a>{flush_gart+206} <ffffffff8011e31e>{dma_map_area+48}
Dec 18 16:32:47 gw kernel:        <ffffffff8011e4e4>{gart_map_simple+63} <ffffffff8823c2a7>{:sctp:sctp_do_sm+3421}
Dec 18 16:32:47 gw kernel:        <ffffffff8823f22d>{:sctp:sctp_datamsg_from_user+689}
Dec 18 16:32:47 gw kernel:        <ffffffff8824ab05>{:sctp:sctp_primitive_SEND+57} <ffffffff882498ac>{:sctp:sctp_sendmsg+0}
Dec 18 16:32:47 gw kernel:        <ffffffff8824a158>{:sctp:sctp_sendmsg+2220} <ffffffff80283a6a>{release_sock+19}
Dec 18 16:32:47 gw kernel:        <ffffffff8824a6eb>{:sctp:sctp_recvmsg+1024} <ffffffff8028159b>{sock_sendmsg+251}
Dec 18 16:32:47 gw kernel:        <ffffffff80148092>{autoremove_wake_function+0} <ffffffff801fa28c>{sprintf+81}
Dec 18 16:32:47 gw kernel:        <ffffffff80281b6f>{sys_sendmsg+523} <ffffffff801ab20e>{inotify_d_instantiate+26}
Dec 18 16:32:47 gw kernel:        <ffffffff80184655>{fd_install+46} <ffffffff80283bae>{lock_sock+169}
Dec 18 16:32:47 gw kernel:        <ffffffff80283a6a>{release_sock+19} <ffffffff80284e5e>{sock_setsockopt+1230}
Dec 18 16:32:47 gw kernel:        <ffffffff8010ae42>{system_call+126}
Dec 18 16:32:47 gw kernel: 
Dec 18 16:32:47 gw kernel: Code: 0f 0b 68 bf 0a 25 88 c2 60 01 49 8b 45 10 41 8b 7d 28 48 8b 


This is the BUG_ON(packet->num_chunks = 0);  in sctp_packet_transmit().



-- 
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 16746 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (7 preceding siblings ...)
  2008-12-18 17:30 ` Karsten Keil
@ 2008-12-18 18:03 ` Vlad Yasevich
  2008-12-18 23:01 ` Vlad Yasevich
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2008-12-18 18:03 UTC (permalink / raw)
  To: linux-sctp

Karsten Keil wrote:
> On Thu, Dec 18, 2008 at 01:35:08PM +0100, Karsten Keil wrote:
>> On Mon, Dec 15, 2008 at 12:42:01PM -0500, Vlad Yasevich wrote:
>>> Oh, forgot the actuall patch. ;)
>>>
>>> -vlad
>>
>> OK I did port back this patch to our 2.6.16 kernel and it seems it
>> fix some of the pathes to trigger the bug, but after 2.5 hour it
>> triggers a overflow again, this time with different call path in the
>> trace:
>>
>> Dec 16 18:50:17 gw kernel: SKB ffff81012681f340 overflow: p_size = 32, n_chunks = 0, list(ffff81010be63160, ffff81010be63160)
>> Dec 16 18:50:17 gw kernel: skb_over_panic: text:ffffffff8824af13 len:2092 put:1040 head:ffff81010e1a7000 data:ffff81010e1a7074 tail:ffff81010e1a78a0 end:ffff81010e1a7500 dev:<NULL>
>> Dec 16 18:50:17 gw kernel: ----------- [cut here ] --------- [please bite here ] ---------
>> Dec 16 18:50:17 gw kernel: Kernel BUG at net/core/skbuff.c:103
>> Dec 16 18:50:17 gw kernel: invalid opcode: 0000 [1] SMP 
>> Dec 16 18:50:17 gw kernel: last sysfs file: /devices/pci0000:00/0000:00:14.4/0000:04:05.0/irq
>> Dec 16 18:50:17 gw kernel: CPU 0 
>> Dec 16 18:50:17 gw kernel: Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod floppy r8169 parport_pc lp parport ext3 jbd edd fan thermal processor atiixp sg ahci libata sd_mod scsi_mod ide_disk ide_core
>> Dec 16 18:50:17 gw kernel: Pid: 11882, comm: sctpcrash Tainted: G     U 2.6.16.60-0.33-smp #13
>> Dec 16 18:50:17 gw kernel: RIP: 0010:[<ffffffff8028667e>] <ffffffff8028667e>{skb_over_panic+77}
>> Dec 16 18:50:17 gw kernel: RSP: 0018:ffff81010cbe17e8  EFLAGS: 00010292
>> Dec 16 18:50:17 gw kernel: RAX: 000000000000009d RBX: 0000000000000410 RCX: 0000000000000292
>> Dec 16 18:50:17 gw kernel: RDX: ffffffff8035a968 RSI: 0000000000000292 RDI: ffffffff8035a960
>> Dec 16 18:50:17 gw kernel: RBP: ffff81012681f340 R08: ffffffff8035a968 R09: ffff810128190ac0
>> Dec 16 18:50:17 gw kernel: R10: ffff810001023780 R11: 0000000000000070 R12: ffff81010e1a7490
>> Dec 16 18:50:17 gw kernel: R13: ffff81010be63150 R14: ffff81010c926580 R15: ffff81010e1a7074
>> Dec 16 18:50:17 gw kernel: FS:  00002aeca5d8c6e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
>> Dec 16 18:50:17 gw kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> Dec 16 18:50:17 gw kernel: CR2: 00002b363a857000 CR3: 0000000110154000 CR4: 00000000000006e0
>> Dec 16 18:50:17 gw kernel: Process sctpcrash (pid: 11882, threadinfo ffff81010cbe0000, task ffff810129ece100)
>> Dec 16 18:50:17 gw kernel: Stack: ffff81010e1a78a0 ffff81010e1a7500 ffffffff803197b7 ffff81010c926580 
>> Dec 16 18:50:17 gw kernel:        ffff81012681f340 ffffffff8824af1d ffff81010be63160 ffff81010b6f3000 
>> Dec 16 18:50:17 gw kernel:        ffff81010be63000 ffff81010c948000 
>> Dec 16 18:50:17 gw kernel: Call Trace: <ffffffff8824af1d>{:sctp:sctp_packet_transmit+851}
>> Dec 16 18:50:17 gw kernel:        <ffffffff88243810>{:sctp:sctp_outq_flush+1524} <ffffffff8823c2a7>{:sctp:sctp_do_sm+3421}
>> Dec 16 18:50:17 gw kernel:        <ffffffff8823f22d>{:sctp:sctp_datamsg_from_user+689}
>> Dec 16 18:50:17 gw kernel:        <ffffffff8824ab05>{:sctp:sctp_primitive_SEND+57} <ffffffff882498ac>{:sctp:sctp_sendmsg+0}
>> Dec 16 18:50:18 gw kernel:        <ffffffff8824a158>{:sctp:sctp_sendmsg+2220} <ffffffff80283a6a>{release_sock+19}
>> Dec 16 18:50:18 gw kernel:        <ffffffff8824a6eb>{:sctp:sctp_recvmsg+1024} <ffffffff8028159b>{sock_sendmsg+251}
>> Dec 16 18:50:18 gw kernel:        <ffffffff80148092>{autoremove_wake_function+0} <ffffffff801fa28c>{sprintf+81}
>> Dec 16 18:50:18 gw kernel:        <ffffffff80281b6f>{sys_sendmsg+523} <ffffffff801ab20e>{inotify_d_instantiate+26}
>> Dec 16 18:50:18 gw kernel:        <ffffffff80184655>{fd_install+46} <ffffffff80283bae>{lock_sock+169}
>> Dec 16 18:50:18 gw kernel:        <ffffffff80283a6a>{release_sock+19} <ffffffff80284e5e>{sock_setsockopt+1230}
>> Dec 16 18:50:18 gw kernel:        <ffffffff8010ae42>{system_call+126}
>> Dec 16 18:50:18 gw kernel: 
>> Dec 16 18:50:18 gw kernel: Code: 0f 0b 68 86 90 32 80 c2 67 00 48 83 c4 28 c3 41 57 41 56 49 
>> Dec 16 18:50:18 gw kernel: RIP <ffffffff8028667e>{skb_over_panic+77} RSP <ffff81010cbe17e8>
>>
> 
> 
> Next crash (after 2 hour):
> Dec 18 16:32:47 gw kernel: ----------- [cut here ] --------- [please bite here ] ---------
> Dec 18 16:32:47 gw kernel: Kernel BUG at net/sctp/output.c:352
> Dec 18 16:32:47 gw kernel: invalid opcode: 0000 [1] SMP 
> Dec 18 16:32:47 gw kernel: last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
> Dec 18 16:32:47 gw kernel: CPU 0 
> Dec 18 16:32:47 gw kernel: Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod floppy r8169 parport_pc lp parport ext3 jbd edd fan thermal processor atiixp sg ahci libata sd_mod scsi_mod ide_disk ide_core
> Dec 18 16:32:47 gw kernel: Pid: 3904, comm: sctpcrash Tainted: G     U 2.6.16.60-0.33-smp #13
> Dec 18 16:32:47 gw kernel: RIP: 0010:[<ffffffff8824ac0d>] <ffffffff8824ac0d>{:sctp:sctp_packet_transmit+67}
> Dec 18 16:32:47 gw kernel: RSP: 0018:ffff8101181af818  EFLAGS: 00010246
> Dec 18 16:32:47 gw kernel: RAX: ffff810119ccfd60 RBX: ffff8101180e6cc0 RCX: 0000000000000020
> Dec 18 16:32:47 gw kernel: RDX: ffff810119ccfd60 RSI: 0000000000000286 RDI: ffff810119ccfd50
> Dec 18 16:32:47 gw kernel: RBP: ffff810119ccfc00 R08: 0000000000001f3a R09: 000000000000d400
> Dec 18 16:32:47 gw kernel: R10: 0000000000000000 R11: ffff81011790d5d0 R12: 0000000000000000
> Dec 18 16:32:47 gw kernel: R13: ffff810119ccfd50 R14: ffff8101190e15b0 R15: ffff810119ccfd50
> Dec 18 16:32:47 gw kernel: FS:  00002b301e55b6e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
> Dec 18 16:32:47 gw kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Dec 18 16:32:47 gw kernel: CR2: 00002b539596d000 CR3: 000000011ccc0000 CR4: 00000000000006e0
> Dec 18 16:32:47 gw kernel: Process sctpcrash (pid: 3904, threadinfo ffff8101181ae000, task ffff81012a32a080)
> Dec 18 16:32:47 gw kernel: Stack: ffff810119ccfd60 ffff810119ccfc00 ffff810119ccfc00 ffff8101190e0000 
> Dec 18 16:32:47 gw kernel:        ffff810119ccfd50 ffff8101180e6cc0 0000000000000286 ffff8101180e6cc0 
> Dec 18 16:32:47 gw kernel:        ffff810119ccfc00 ffff8101181af998 
> Dec 18 16:32:47 gw kernel: Call Trace: <ffffffff88243810>{:sctp:sctp_outq_flush+1524}
> Dec 18 16:32:47 gw kernel:        <ffffffff8011e49a>{flush_gart+206} <ffffffff8011e31e>{dma_map_area+48}
> Dec 18 16:32:47 gw kernel:        <ffffffff8011e4e4>{gart_map_simple+63} <ffffffff8823c2a7>{:sctp:sctp_do_sm+3421}
> Dec 18 16:32:47 gw kernel:        <ffffffff8823f22d>{:sctp:sctp_datamsg_from_user+689}
> Dec 18 16:32:47 gw kernel:        <ffffffff8824ab05>{:sctp:sctp_primitive_SEND+57} <ffffffff882498ac>{:sctp:sctp_sendmsg+0}
> Dec 18 16:32:47 gw kernel:        <ffffffff8824a158>{:sctp:sctp_sendmsg+2220} <ffffffff80283a6a>{release_sock+19}
> Dec 18 16:32:47 gw kernel:        <ffffffff8824a6eb>{:sctp:sctp_recvmsg+1024} <ffffffff8028159b>{sock_sendmsg+251}
> Dec 18 16:32:47 gw kernel:        <ffffffff80148092>{autoremove_wake_function+0} <ffffffff801fa28c>{sprintf+81}
> Dec 18 16:32:47 gw kernel:        <ffffffff80281b6f>{sys_sendmsg+523} <ffffffff801ab20e>{inotify_d_instantiate+26}
> Dec 18 16:32:47 gw kernel:        <ffffffff80184655>{fd_install+46} <ffffffff80283bae>{lock_sock+169}
> Dec 18 16:32:47 gw kernel:        <ffffffff80283a6a>{release_sock+19} <ffffffff80284e5e>{sock_setsockopt+1230}
> Dec 18 16:32:47 gw kernel:        <ffffffff8010ae42>{system_call+126}
> Dec 18 16:32:47 gw kernel: 
> Dec 18 16:32:47 gw kernel: Code: 0f 0b 68 bf 0a 25 88 c2 60 01 49 8b 45 10 41 8b 7d 28 48 8b 
> 
> 
> This is the BUG_ON(packet->num_chunks = 0);  in sctp_packet_transmit().
> 
> 
> 

The two crashes are somewhat consistent.  In both cases, the packet's chunk_list
is not empty, but num_chunks shows up at 0.

Additionally, in the first crash, we see the packet size as 32, but the
allocated skb is 1280 bytes and we have an overflow.

So, either we have a memory corruption or a race.  I can't figure out how there
could be a race since the packet is embedded into the sctp_transport structure
and we have only 1 of those per association.

I am running a test on 2.6.28-rc6 kernel with my cleaned up patches to see if I
can trigger this fault.  It used to happen very quickly before my patches.  I am
not at 2+ hours, but will let run for the rest of the day to see what happens.

-vlad





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (8 preceding siblings ...)
  2008-12-18 18:03 ` Vlad Yasevich
@ 2008-12-18 23:01 ` Vlad Yasevich
  2008-12-23 19:23 ` Vlad Yasevich
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2008-12-18 23:01 UTC (permalink / raw)
  To: linux-sctp

Vlad Yasevich wrote:
> Karsten Keil wrote:
>> On Thu, Dec 18, 2008 at 01:35:08PM +0100, Karsten Keil wrote:
>>> On Mon, Dec 15, 2008 at 12:42:01PM -0500, Vlad Yasevich wrote:
>>>> Oh, forgot the actuall patch. ;)
>>>>
>>>> -vlad
>>> OK I did port back this patch to our 2.6.16 kernel and it seems it
>>> fix some of the pathes to trigger the bug, but after 2.5 hour it
>>> triggers a overflow again, this time with different call path in the
>>> trace:
>>>
>>> Dec 16 18:50:17 gw kernel: SKB ffff81012681f340 overflow: p_size = 32, n_chunks = 0, list(ffff81010be63160, ffff81010be63160)
>>> Dec 16 18:50:17 gw kernel: skb_over_panic: text:ffffffff8824af13 len:2092 put:1040 head:ffff81010e1a7000 data:ffff81010e1a7074 tail:ffff81010e1a78a0 end:ffff81010e1a7500 dev:<NULL>
>>> Dec 16 18:50:17 gw kernel: ----------- [cut here ] --------- [please bite here ] ---------
>>> Dec 16 18:50:17 gw kernel: Kernel BUG at net/core/skbuff.c:103
>>> Dec 16 18:50:17 gw kernel: invalid opcode: 0000 [1] SMP 
>>> Dec 16 18:50:17 gw kernel: last sysfs file: /devices/pci0000:00/0000:00:14.4/0000:04:05.0/irq
>>> Dec 16 18:50:17 gw kernel: CPU 0 
>>> Dec 16 18:50:17 gw kernel: Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod floppy r8169 parport_pc lp parport ext3 jbd edd fan thermal processor atiixp sg ahci libata sd_mod scsi_mod ide_disk ide_core
>>> Dec 16 18:50:17 gw kernel: Pid: 11882, comm: sctpcrash Tainted: G     U 2.6.16.60-0.33-smp #13
>>> Dec 16 18:50:17 gw kernel: RIP: 0010:[<ffffffff8028667e>] <ffffffff8028667e>{skb_over_panic+77}
>>> Dec 16 18:50:17 gw kernel: RSP: 0018:ffff81010cbe17e8  EFLAGS: 00010292
>>> Dec 16 18:50:17 gw kernel: RAX: 000000000000009d RBX: 0000000000000410 RCX: 0000000000000292
>>> Dec 16 18:50:17 gw kernel: RDX: ffffffff8035a968 RSI: 0000000000000292 RDI: ffffffff8035a960
>>> Dec 16 18:50:17 gw kernel: RBP: ffff81012681f340 R08: ffffffff8035a968 R09: ffff810128190ac0
>>> Dec 16 18:50:17 gw kernel: R10: ffff810001023780 R11: 0000000000000070 R12: ffff81010e1a7490
>>> Dec 16 18:50:17 gw kernel: R13: ffff81010be63150 R14: ffff81010c926580 R15: ffff81010e1a7074
>>> Dec 16 18:50:17 gw kernel: FS:  00002aeca5d8c6e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
>>> Dec 16 18:50:17 gw kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> Dec 16 18:50:17 gw kernel: CR2: 00002b363a857000 CR3: 0000000110154000 CR4: 00000000000006e0
>>> Dec 16 18:50:17 gw kernel: Process sctpcrash (pid: 11882, threadinfo ffff81010cbe0000, task ffff810129ece100)
>>> Dec 16 18:50:17 gw kernel: Stack: ffff81010e1a78a0 ffff81010e1a7500 ffffffff803197b7 ffff81010c926580 
>>> Dec 16 18:50:17 gw kernel:        ffff81012681f340 ffffffff8824af1d ffff81010be63160 ffff81010b6f3000 
>>> Dec 16 18:50:17 gw kernel:        ffff81010be63000 ffff81010c948000 
>>> Dec 16 18:50:17 gw kernel: Call Trace: <ffffffff8824af1d>{:sctp:sctp_packet_transmit+851}
>>> Dec 16 18:50:17 gw kernel:        <ffffffff88243810>{:sctp:sctp_outq_flush+1524} <ffffffff8823c2a7>{:sctp:sctp_do_sm+3421}
>>> Dec 16 18:50:17 gw kernel:        <ffffffff8823f22d>{:sctp:sctp_datamsg_from_user+689}
>>> Dec 16 18:50:17 gw kernel:        <ffffffff8824ab05>{:sctp:sctp_primitive_SEND+57} <ffffffff882498ac>{:sctp:sctp_sendmsg+0}
>>> Dec 16 18:50:18 gw kernel:        <ffffffff8824a158>{:sctp:sctp_sendmsg+2220} <ffffffff80283a6a>{release_sock+19}
>>> Dec 16 18:50:18 gw kernel:        <ffffffff8824a6eb>{:sctp:sctp_recvmsg+1024} <ffffffff8028159b>{sock_sendmsg+251}
>>> Dec 16 18:50:18 gw kernel:        <ffffffff80148092>{autoremove_wake_function+0} <ffffffff801fa28c>{sprintf+81}
>>> Dec 16 18:50:18 gw kernel:        <ffffffff80281b6f>{sys_sendmsg+523} <ffffffff801ab20e>{inotify_d_instantiate+26}
>>> Dec 16 18:50:18 gw kernel:        <ffffffff80184655>{fd_install+46} <ffffffff80283bae>{lock_sock+169}
>>> Dec 16 18:50:18 gw kernel:        <ffffffff80283a6a>{release_sock+19} <ffffffff80284e5e>{sock_setsockopt+1230}
>>> Dec 16 18:50:18 gw kernel:        <ffffffff8010ae42>{system_call+126}
>>> Dec 16 18:50:18 gw kernel: 
>>> Dec 16 18:50:18 gw kernel: Code: 0f 0b 68 86 90 32 80 c2 67 00 48 83 c4 28 c3 41 57 41 56 49 
>>> Dec 16 18:50:18 gw kernel: RIP <ffffffff8028667e>{skb_over_panic+77} RSP <ffff81010cbe17e8>
>>>
>>
>> Next crash (after 2 hour):
>> Dec 18 16:32:47 gw kernel: ----------- [cut here ] --------- [please bite here ] ---------
>> Dec 18 16:32:47 gw kernel: Kernel BUG at net/sctp/output.c:352
>> Dec 18 16:32:47 gw kernel: invalid opcode: 0000 [1] SMP 
>> Dec 18 16:32:47 gw kernel: last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
>> Dec 18 16:32:47 gw kernel: CPU 0 
>> Dec 18 16:32:47 gw kernel: Modules linked in: sctp cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table ipv6 af_packet button battery ac apparmor loop dm_mod floppy r8169 parport_pc lp parport ext3 jbd edd fan thermal processor atiixp sg ahci libata sd_mod scsi_mod ide_disk ide_core
>> Dec 18 16:32:47 gw kernel: Pid: 3904, comm: sctpcrash Tainted: G     U 2.6.16.60-0.33-smp #13
>> Dec 18 16:32:47 gw kernel: RIP: 0010:[<ffffffff8824ac0d>] <ffffffff8824ac0d>{:sctp:sctp_packet_transmit+67}
>> Dec 18 16:32:47 gw kernel: RSP: 0018:ffff8101181af818  EFLAGS: 00010246
>> Dec 18 16:32:47 gw kernel: RAX: ffff810119ccfd60 RBX: ffff8101180e6cc0 RCX: 0000000000000020
>> Dec 18 16:32:47 gw kernel: RDX: ffff810119ccfd60 RSI: 0000000000000286 RDI: ffff810119ccfd50
>> Dec 18 16:32:47 gw kernel: RBP: ffff810119ccfc00 R08: 0000000000001f3a R09: 000000000000d400
>> Dec 18 16:32:47 gw kernel: R10: 0000000000000000 R11: ffff81011790d5d0 R12: 0000000000000000
>> Dec 18 16:32:47 gw kernel: R13: ffff810119ccfd50 R14: ffff8101190e15b0 R15: ffff810119ccfd50
>> Dec 18 16:32:47 gw kernel: FS:  00002b301e55b6e0(0000) GS:ffffffff803d4000(0000) knlGS:0000000000000000
>> Dec 18 16:32:47 gw kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> Dec 18 16:32:47 gw kernel: CR2: 00002b539596d000 CR3: 000000011ccc0000 CR4: 00000000000006e0
>> Dec 18 16:32:47 gw kernel: Process sctpcrash (pid: 3904, threadinfo ffff8101181ae000, task ffff81012a32a080)
>> Dec 18 16:32:47 gw kernel: Stack: ffff810119ccfd60 ffff810119ccfc00 ffff810119ccfc00 ffff8101190e0000 
>> Dec 18 16:32:47 gw kernel:        ffff810119ccfd50 ffff8101180e6cc0 0000000000000286 ffff8101180e6cc0 
>> Dec 18 16:32:47 gw kernel:        ffff810119ccfc00 ffff8101181af998 
>> Dec 18 16:32:47 gw kernel: Call Trace: <ffffffff88243810>{:sctp:sctp_outq_flush+1524}
>> Dec 18 16:32:47 gw kernel:        <ffffffff8011e49a>{flush_gart+206} <ffffffff8011e31e>{dma_map_area+48}
>> Dec 18 16:32:47 gw kernel:        <ffffffff8011e4e4>{gart_map_simple+63} <ffffffff8823c2a7>{:sctp:sctp_do_sm+3421}
>> Dec 18 16:32:47 gw kernel:        <ffffffff8823f22d>{:sctp:sctp_datamsg_from_user+689}
>> Dec 18 16:32:47 gw kernel:        <ffffffff8824ab05>{:sctp:sctp_primitive_SEND+57} <ffffffff882498ac>{:sctp:sctp_sendmsg+0}
>> Dec 18 16:32:47 gw kernel:        <ffffffff8824a158>{:sctp:sctp_sendmsg+2220} <ffffffff80283a6a>{release_sock+19}
>> Dec 18 16:32:47 gw kernel:        <ffffffff8824a6eb>{:sctp:sctp_recvmsg+1024} <ffffffff8028159b>{sock_sendmsg+251}
>> Dec 18 16:32:47 gw kernel:        <ffffffff80148092>{autoremove_wake_function+0} <ffffffff801fa28c>{sprintf+81}
>> Dec 18 16:32:47 gw kernel:        <ffffffff80281b6f>{sys_sendmsg+523} <ffffffff801ab20e>{inotify_d_instantiate+26}
>> Dec 18 16:32:47 gw kernel:        <ffffffff80184655>{fd_install+46} <ffffffff80283bae>{lock_sock+169}
>> Dec 18 16:32:47 gw kernel:        <ffffffff80283a6a>{release_sock+19} <ffffffff80284e5e>{sock_setsockopt+1230}
>> Dec 18 16:32:47 gw kernel:        <ffffffff8010ae42>{system_call+126}
>> Dec 18 16:32:47 gw kernel: 
>> Dec 18 16:32:47 gw kernel: Code: 0f 0b 68 bf 0a 25 88 c2 60 01 49 8b 45 10 41 8b 7d 28 48 8b 
>>
>>
>> This is the BUG_ON(packet->num_chunks = 0);  in sctp_packet_transmit().
>>
>>
>>
> 
> The two crashes are somewhat consistent.  In both cases, the packet's chunk_list
> is not empty, but num_chunks shows up at 0.
> 
> Additionally, in the first crash, we see the packet size as 32, but the
> allocated skb is 1280 bytes and we have an overflow.
> 
> So, either we have a memory corruption or a race.  I can't figure out how there
> could be a race since the packet is embedded into the sctp_transport structure
> and we have only 1 of those per association.
> 
> I am running a test on 2.6.28-rc6 kernel with my cleaned up patches to see if I
> can trigger this fault.  It used to happen very quickly before my patches.  I am
> not at 2+ hours, but will let run for the rest of the day to see what happens.
> 
> -vlad

At this point, I am starting to think that this is a race, but I am not sure
what/who is racing what/who.

When I add an SCTP level setsockopt call in the accept() loop of the server, I
get 4+ hours of normal operation (I killed the test at this point).  It doesn't
matter what the socket option does.  As a test, I used SCTP_DISABLE_FRAGMENTS
with a value of 0 which is essentially a no-op with locks around it, and it worked.

The few crashes I've received on 2.6.28-rc6 seem to always eminate from
the retransmission timeout.  After poking around the crash dump, I see
the following:

crash> sctp_transport.packet 0xffff88013dd830d8
  packet = {
    source_port = 10003,
    destination_port = 36107,
    vtag = 4043516048,
    chunk_list = {
      next = 0xffff88013c395e80,
      prev = 0xffff88013c395e80
    },
    ...
crash> struct sctp_chunk 0xffff88013c395e80
struct sctp_chunk {
  list = {
    next = 0xffff88013c395e80,
    prev = 0xffff88013c395e80
  },
  refcnt = {
    counter = 2
  },
  transmitted_list = {
    next = 0xffff88013dd83228,
    prev = 0xffff88013dd83228
  },


Note that the transmitted_list is good (it points back to the association).
However, the list{} in the sctp chunk points to itself, while chunk_list in
the packet also points to it.  This results an infinite iteration over the same
chunk while trying to copy it into the transmission skb and triggers the skb
overflow that we BUG() with.

I am going to see if I can poison the chunk->list from the start and see who
dies.

-vlad

p.s.  the crash I am seeing is with locks added around packet->chunk_list
manipulations.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (9 preceding siblings ...)
  2008-12-18 23:01 ` Vlad Yasevich
@ 2008-12-23 19:23 ` Vlad Yasevich
  2009-01-05 23:05 ` Vlad Yasevich
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2008-12-23 19:23 UTC (permalink / raw)
  To: linux-sctp

Vlad Yasevich wrote:
> 
> At this point, I am starting to think that this is a race, but I am not sure
> what/who is racing what/who.
> 
> When I add an SCTP level setsockopt call in the accept() loop of the server, I
> get 4+ hours of normal operation (I killed the test at this point).  It doesn't
> matter what the socket option does.  As a test, I used SCTP_DISABLE_FRAGMENTS
> with a value of 0 which is essentially a no-op with locks around it, and it worked.
> 
> The few crashes I've received on 2.6.28-rc6 seem to always eminate from
> the retransmission timeout.  After poking around the crash dump, I see
> the following:
> 
> crash> sctp_transport.packet 0xffff88013dd830d8
>   packet = {
>     source_port = 10003,
>     destination_port = 36107,
>     vtag = 4043516048,
>     chunk_list = {
>       next = 0xffff88013c395e80,
>       prev = 0xffff88013c395e80
>     },
>     ...
> crash> struct sctp_chunk 0xffff88013c395e80
> struct sctp_chunk {
>   list = {
>     next = 0xffff88013c395e80,
>     prev = 0xffff88013c395e80
>   },
>   refcnt = {
>     counter = 2
>   },
>   transmitted_list = {
>     next = 0xffff88013dd83228,
>     prev = 0xffff88013dd83228
>   },
> 
> 
> Note that the transmitted_list is good (it points back to the association).
> However, the list{} in the sctp chunk points to itself, while chunk_list in
> the packet also points to it.  This results an infinite iteration over the same
> chunk while trying to copy it into the transmission skb and triggers the skb
> overflow that we BUG() with.
> 
> I am going to see if I can poison the chunk->list from the start and see who
> dies.
> 
> -vlad
> 
> p.s.  the crash I am seeing is with locks added around packet->chunk_list
> manipulations.
> 

Ok, I was able to prove that there is a race condition accessing the packet and
the chunk_list.  The way to do this is by adding a "void *last_thread" to the
sctp_chunk structure and then using the following code:

in scpt_packet_init:
	packet->last_thread = NULL;

in sctp_packet_free:
	packet->last_thread = 0xdeadbeef; /* to catch errors */

in sctp_packet_append:

	spin_lock_bh(&packet->lock);
	if (packet->last_thread && packet->last_thread != current) {
		/* print warning with interesting info. I printed the packet */
		BUG();
	}
	packet->last_thread = current;
	...
	spin_unlock_bh(&packet->lock);

in sctp_packet_reset:
	/* after the loop to free chunks */
	packet->last_thread = NULL;


In my builds with 2.6.28-rc6 + my patches, it tripped the BUG in sctp_packet_append() with two
different threads accessing the same packet structure.  Looking in crash, sure enough, one CPU was
spinning on a lock in packet_reset, while the other CPU was holding the lock while adding a chunk.

As you can tell, you need locks enabled around the chunk list handling.

I can't do much more since my company has shut down for the holidays and I'll restart in the new
year.  Meanwhile, if you still have access to equipment, you can do some looking to see if you can
figure out the race.

-vlad

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (10 preceding siblings ...)
  2008-12-23 19:23 ` Vlad Yasevich
@ 2009-01-05 23:05 ` Vlad Yasevich
  2009-01-06 13:30 ` Michal Hocko
  2009-01-06 13:50 ` Vlad Yasevich
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2009-01-05 23:05 UTC (permalink / raw)
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 700 bytes --]

Karsten, Michal

I think I found this cursed race.

The problem is that nothing is guarding the asoc->base.sk reference.  This
reference changes during the accept()/peeloff() code paths and the only
guard around it is the socket lock.  But that lock is not taken when
the reference is used for reading as it is done in sctp_rcv().

So, when we do sctp_bh_lock_sock(sk) in sctp_rcv(), we may be using
a stale cached copy of the sk which might have changed during sctp_accept().

What this allows us to do is to have a user sending a packet on a socket
at the same time we may be processing and incoming packet on the same socket.
We just end up taking different locks which totally hoses us.

-vlad



[-- Attachment #2: race_test.patch --]
[-- Type: text/x-diff, Size: 769 bytes --]

diff --git a/net/sctp/input.c b/net/sctp/input.c
index bf612d9..2e4a864 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -249,6 +249,19 @@ int sctp_rcv(struct sk_buff *skb)
 	 */
 	sctp_bh_lock_sock(sk);
 
+	if (sk != rcvr->sk) {
+		/* Our cached sk is different from the rcvr->sk.  This is
+		 * because migrate()/accept() may have moved the association
+		 * to a new socket and released all the sockets.  So now we
+		 * are holding a lock on the old socket while the user may
+		 * be doing something with the new socket.  Switch our veiw
+		 * of the current sk.
+		 */
+		sctp_bh_unlock_sock(sk);
+		sk = rcvr->sk;
+		sctp_bh_lock_sock(sk);
+	}
+
 	if (sock_owned_by_user(sk)) {
 		SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_BACKLOG);
 		sctp_add_backlog(sk, skb);

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (11 preceding siblings ...)
  2009-01-05 23:05 ` Vlad Yasevich
@ 2009-01-06 13:30 ` Michal Hocko
  2009-01-06 13:50 ` Vlad Yasevich
  13 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2009-01-06 13:30 UTC (permalink / raw)
  To: linux-sctp

On Mon 05-01-09 18:05:21, Vlad Yasevich wrote:
> Karsten, Michal

Hi Vlad,

> 
> I think I found this cursed race.
> 
> The problem is that nothing is guarding the asoc->base.sk reference.  This
> reference changes during the accept()/peeloff() code paths and the only
> guard around it is the socket lock.  But that lock is not taken when
> the reference is used for reading as it is done in sctp_rcv().
> 
> So, when we do sctp_bh_lock_sock(sk) in sctp_rcv(), we may be using
> a stale cached copy of the sk which might have changed during sctp_accept().
> 
> What this allows us to do is to have a user sending a packet on a socket
> at the same time we may be processing and incoming packet on the same socket.
> We just end up taking different locks which totally hoses us.
> 
> -vlad
> 
> 

> diff --git a/net/sctp/input.c b/net/sctp/input.c
> index bf612d9..2e4a864 100644
> --- a/net/sctp/input.c
> +++ b/net/sctp/input.c
> @@ -249,6 +249,19 @@ int sctp_rcv(struct sk_buff *skb)
>  	 */
>  	sctp_bh_lock_sock(sk);
>  
> +	if (sk != rcvr->sk) {
> +		/* Our cached sk is different from the rcvr->sk.  This is
> +		 * because migrate()/accept() may have moved the association
> +		 * to a new socket and released all the sockets.  So now we
> +		 * are holding a lock on the old socket while the user may
> +		 * be doing something with the new socket.  Switch our veiw
> +		 * of the current sk.
> +		 */
> +		sctp_bh_unlock_sock(sk);
> +		sk = rcvr->sk;
> +		sctp_bh_lock_sock(sk);
> +	}
> +
>  	if (sock_owned_by_user(sk)) {
>  		SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_BACKLOG);
>  		sctp_add_backlog(sk, skb);

Can you reproduce with this patch? Unfortunately, I don't have any HW
configuration available at the moment.

Our SLES10SP2 kernel contains similar code:
[...]
        /* Acquire access to the sock lock. Note: We are safe from other
         * bottom halves on this lock, but a user may be in the lock too,
         * so check if it is busy.
         */
        sctp_bh_lock_sock(sk);

        /* It is possible that the association could have moved to a different
         * socket if it is peeled off. If so, update the sk.
         */
        if (sk != rcvr->sk) {
                sctp_bh_lock_sock(rcvr->sk);
                sctp_bh_unlock_sock(sk);
                sk = rcvr->sk;
        }

        if (sock_owned_by_user(sk))
                sk_add_backlog(sk, skb);
        else
                sctp_backlog_rcv(sk, skb);

        /* Release the sock and the sock ref we took in the lookup calls.
         * The asoc/ep ref will be released in sctp_backlog_rcv.
         */
        sctp_bh_unlock_sock(sk);
        sock_put(sk);

        return ret;
[...]

This test, however, has been removed by 61c9fed41638249f8b6ca5345064eb1beb50179f.
AFAIU this patch deals with similar race in the receive path when socket
is moved.

As you can see the test&relock part differs only in ordering of the old
unlock and the new lock.
Is this ordering important? 
Do we have to unlock old socket sooner to enable rcvr->sk change? 
Personally, I don't think so, because we would see deadlock otherwise,
right? 
Are there any other patches which depend on the above one?

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: BUG in sctp crashes sles10sp2 kernel
  2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
                   ` (12 preceding siblings ...)
  2009-01-06 13:30 ` Michal Hocko
@ 2009-01-06 13:50 ` Vlad Yasevich
  13 siblings, 0 replies; 15+ messages in thread
From: Vlad Yasevich @ 2009-01-06 13:50 UTC (permalink / raw)
  To: linux-sctp

Michal Hocko wrote:
> On Mon 05-01-09 18:05:21, Vlad Yasevich wrote:
>> Karsten, Michal
> 
> Hi Vlad,
> 
>> I think I found this cursed race.
>>
>> The problem is that nothing is guarding the asoc->base.sk reference.  This
>> reference changes during the accept()/peeloff() code paths and the only
>> guard around it is the socket lock.  But that lock is not taken when
>> the reference is used for reading as it is done in sctp_rcv().
>>
>> So, when we do sctp_bh_lock_sock(sk) in sctp_rcv(), we may be using
>> a stale cached copy of the sk which might have changed during sctp_accept().
>>
>> What this allows us to do is to have a user sending a packet on a socket
>> at the same time we may be processing and incoming packet on the same socket.
>> We just end up taking different locks which totally hoses us.
>>
>> -vlad
>>
>>
> 
>> diff --git a/net/sctp/input.c b/net/sctp/input.c
>> index bf612d9..2e4a864 100644
>> --- a/net/sctp/input.c
>> +++ b/net/sctp/input.c
>> @@ -249,6 +249,19 @@ int sctp_rcv(struct sk_buff *skb)
>>  	 */
>>  	sctp_bh_lock_sock(sk);
>>  
>> +	if (sk != rcvr->sk) {
>> +		/* Our cached sk is different from the rcvr->sk.  This is
>> +		 * because migrate()/accept() may have moved the association
>> +		 * to a new socket and released all the sockets.  So now we
>> +		 * are holding a lock on the old socket while the user may
>> +		 * be doing something with the new socket.  Switch our veiw
>> +		 * of the current sk.
>> +		 */
>> +		sctp_bh_unlock_sock(sk);
>> +		sk = rcvr->sk;
>> +		sctp_bh_lock_sock(sk);
>> +	}
>> +
>>  	if (sock_owned_by_user(sk)) {
>>  		SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_BACKLOG);
>>  		sctp_add_backlog(sk, skb);
> 
> Can you reproduce with this patch? Unfortunately, I don't have any HW
> configuration available at the moment.

I've had the test run for a few hours without any issues.  Usually the
system crashes with the first 15 minutes.

> 
> Our SLES10SP2 kernel contains similar code:
> [...]
>         /* Acquire access to the sock lock. Note: We are safe from other
>          * bottom halves on this lock, but a user may be in the lock too,
>          * so check if it is busy.
>          */
>         sctp_bh_lock_sock(sk);
> 
>         /* It is possible that the association could have moved to a different
>          * socket if it is peeled off. If so, update the sk.
>          */
>         if (sk != rcvr->sk) {
>                 sctp_bh_lock_sock(rcvr->sk);
>                 sctp_bh_unlock_sock(sk);
>                 sk = rcvr->sk;
>         }
> 
>         if (sock_owned_by_user(sk))
>                 sk_add_backlog(sk, skb);
>         else
>                 sctp_backlog_rcv(sk, skb);
> 
>         /* Release the sock and the sock ref we took in the lookup calls.
>          * The asoc/ep ref will be released in sctp_backlog_rcv.
>          */
>         sctp_bh_unlock_sock(sk);
>         sock_put(sk);
> 
>         return ret;
> [...]
> 
> This test, however, has been removed by 61c9fed41638249f8b6ca5345064eb1beb50179f.
> AFAIU this patch deals with similar race in the receive path when socket
> is moved.

Yes.  The old code was insufficient, but this part should not have been removed.
The old code was still racy in other parts of the peel-off and resulted in
crashes you've seen.   However, this particular part or something along these
lines needs to be there.

I really hate the this is done and currently looking at a way to protect this
pointer somehow, or rework it so it's not needed.  The problem is that we
perform other checks using the sk without locking it so there could be other
races as well that we just haven't seen because noone has an app yet that needs
the additional functionality.  So, even this patch I sent is not a complete
solution, but it is enough to fix the crash your customer is seeing.

> 
> As you can see the test&relock part differs only in ordering of the old
> unlock and the new lock.
> Is this ordering important? 

I don't think so.  In fact, I think having both sockets locked will make lockdep
complain.  Releasing the lock on the socket can let the user lock it, but then
we enter the backlog code and everything is ok.  If the socket is not owned,
then everything proceeds normally.  As long as we own the correct lock, it
should be safe.  Right now, we don't own the correct lock and that causes trouble.

> Do we have to unlock old socket sooner to enable rcvr->sk change? 
> Personally, I don't think so, because we would see deadlock otherwise,
> right? 
> Are there any other patches which depend on the above one?
> 

I don't think so.

-vlad

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2009-01-06 13:50 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-12-11 14:52 BUG in sctp crashes sles10sp2 kernel Michal Hocko
2008-12-11 15:28 ` Vlad Yasevich
2008-12-12 13:04 ` Karsten Keil
2008-12-15 15:38 ` Vlad Yasevich
2008-12-15 17:02 ` Karsten Keil
2008-12-15 17:41 ` Vlad Yasevich
2008-12-15 17:42 ` Vlad Yasevich
2008-12-18 12:35 ` Karsten Keil
2008-12-18 17:30 ` Karsten Keil
2008-12-18 18:03 ` Vlad Yasevich
2008-12-18 23:01 ` Vlad Yasevich
2008-12-23 19:23 ` Vlad Yasevich
2009-01-05 23:05 ` Vlad Yasevich
2009-01-06 13:30 ` Michal Hocko
2009-01-06 13:50 ` Vlad Yasevich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.