linux-kernel.vger.kernel.org archive mirror
* 3.4.0-rc2: skb_put() -> skb_over_panic
@ 2012-04-15 12:24 Alexander Beregalov
  2012-04-15 15:16 ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Alexander Beregalov @ 2012-04-15 12:24 UTC (permalink / raw)
  To: netdev, Linux Kernel Mailing List

Hi

kernel 3.4.0-rc2-00333-g668ce0a

calltrace is lost on netconsole, sorry.

ethernet is realtek/r8169

It has already happened twice with rtorrent, so I can perhaps reproduce
it. What else can I provide?

skb_over_panic: text:ffffffff8136d919 len:1016 put:815
head:ffff8800afe4d800 data:ffff8800afe4dd6b tail:0x963 end:0x6c0
dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:127!
invalid opcode: 0000 [#1] SMP
CPU 2
Modules linked in:

Pid: 0, comm: swapper/2 Not tainted 3.4.0-rc2-00333-g668ce0a #1
          /D525MW
RIP: 0010:[<ffffffff8132414a>]  [<ffffffff8132414a>] skb_put+0x7c/0x86
RSP: 0018:ffff8800bed03950  EFLAGS: 00010246
RAX: 0000000000000098 RBX: ffff88002883a580 RCX: 0000000000000015
RDX: 0000000000000001 RSI: 0000000000000046 RDI: ffffffff8162a0b0
RBP: ffff8800bed03970 R08: ffffffff813cd540 R09: ffff8800bed03650
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88002895c380
R13: ffff88002895ca80 R14: ffff88002895c3a8 R15: ffff88002883a660
FS:  0000000000000000(0000) GS:ffff8800bed00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fcbba6e1000 CR3: 0000000001520000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/2 (pid: 0, threadinfo ffff8800bb0a8000, task ffff8800bb06d780)
Stack:
 0000000000000963 00000000000006c0 ffffffff814d06e7 ffff88002895c380
 ffff8800bed039d0 ffffffff8136d919 ffff88000000032f 000000c98102ebbd
 ffff88002883a660 0000019c0000250f ffff88002883a580 ffff88002883a580
Call Trace:
 <IRQ>


* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-15 12:24 3.4.0-rc2: skb_put() -> skb_over_panic Alexander Beregalov
@ 2012-04-15 15:16 ` Eric Dumazet
  2012-04-17 20:15   ` Alexander Beregalov
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-04-15 15:16 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: netdev, Linux Kernel Mailing List

On Sun, 2012-04-15 at 16:24 +0400, Alexander Beregalov wrote:
> Hi
> 
> kernel 3.4.0-rc2-00333-g668ce0a
> 
> calltrace is lost on netconsole, sorry.
> 
> ethernet is realtek/r8169
> 
> It happened already two times with rtorrent, perhaps I can reproduce
> it, what else can I provide to you?
> 

full stack trace needed please.




* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-15 15:16 ` Eric Dumazet
@ 2012-04-17 20:15   ` Alexander Beregalov
  2012-04-17 20:45     ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Alexander Beregalov @ 2012-04-17 20:15 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Linux Kernel Mailing List

On 15 April 2012 19:16, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sun, 2012-04-15 at 16:24 +0400, Alexander Beregalov wrote:
>> Hi
>>
>> kernel 3.4.0-rc2-00333-g668ce0a
>>
>> calltrace is lost on netconsole, sorry.
>>
>> ethernet is realtek/r8169
>>
>> It happened already two times with rtorrent, perhaps I can reproduce
>> it, what else can I provide to you?
>>
> full stack trace needed please.
>

This time the call trace is different (I saw only the last few lines of
the call trace on the display, which was not enough) and netconsole
transmitted the complete message, but it is perhaps the same problem.
At least 'end' is the same.


skb_over_panic: text:ffffffff8136d919 len:1248 put:932
head:ffff8800babcf800 data:ffff8800babcfd10 tail:0x9f0 end:0x6c0
dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:127!
invalid opcode: 0000 [#1] SMP
CPU 3
Modules linked in:

Pid: 1926, comm: rtorrent Not tainted 3.4.0-rc2-00333-g668ce0a #1
            /D525MW
RIP: 0010:[<ffffffff8132414a>]  [<ffffffff8132414a>] skb_put+0x7c/0x86
RSP: 0018:ffff8800bed83d60  EFLAGS: 00010246
RAX: 0000000000000098 RBX: ffff8800b0b244c0 RCX: 000000000000003d
RDX: 000000000000000d RSI: 0000000000000046 RDI: ffffffff8162a0b0
RBP: ffff8800bed83d80 R08: 0000000000000001 R09: 0000000000000000
R10: ffff88002ed840c0 R11: 00000000007e28f6 R12: ffff8800b9cb9180
R13: ffff8800b9cb96c0 R14: ffff8800b9cb91a8 R15: ffff8800b0b245a0
FS:  00007fe639e75720(0000) GS:ffff8800bed80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe63571a000 CR3: 00000000b53fa000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rtorrent (pid: 1926, threadinfo ffff8800bb0a0000, task ffff8800bb1dddc0)
Stack:
 00000000000009f0 00000000000006c0 ffffffff814d06e7 ffff8800b9cb9180
 ffff8800bed83de0 ffffffff8136d919 ffff8800000003a4 0000013c81536640
 ffff8800b0b245a0 000000cc00000006 ffff8800b0b244c0 ffff8800b0b244c0
Call Trace:
 <IRQ>
 [<ffffffff8136d919>] tcp_retransmit_skb+0x29a/0x529
 [<ffffffff8136eebd>] tcp_retransmit_timer+0x358/0x4e7
 [<ffffffff8136f0e8>] tcp_write_timer+0x9c/0x17d
 [<ffffffff81035bd0>] run_timer_softirq+0x1eb/0x2bf
 [<ffffffff8136f04c>] ? tcp_retransmit_timer+0x4e7/0x4e7
 [<ffffffff810184e9>] ? native_smp_send_reschedule+0x4f/0x51
 [<ffffffff8102fb52>] __do_softirq+0xbf/0x17d
 [<ffffffff810188ec>] ? lapic_next_event+0x18/0x1c
 [<ffffffff813c05cc>] call_softirq+0x1c/0x30
 [<ffffffff810033d6>] do_softirq+0x33/0x69
 [<ffffffff8102fdc6>] irq_exit+0x44/0x9c
 [<ffffffff81018c6f>] smp_apic_timer_interrupt+0x86/0x94
 [<ffffffff813bfe47>] apic_timer_interrupt+0x67/0x70
 <EOI>
 [<ffffffff8131d87e>] ? sock_sendmsg+0xe6/0x106
 [<ffffffff8135fb7b>] ? tcp_poll+0xaf/0x168
 [<ffffffff810ff814>] ? ep_send_events_proc+0x67/0x116
 [<ffffffff8131b8ef>] sock_poll+0x15/0x17
 [<ffffffff810ff823>] ep_send_events_proc+0x76/0x116
 [<ffffffff810ff7ad>] ? ep_read_events_proc+0x99/0x99
 [<ffffffff810fff24>] ep_scan_ready_list.clone.6+0x8f/0x16f
 [<ffffffff81100277>] ep_poll+0x25f/0x2e2
 [<ffffffff8131e886>] ? sys_accept4+0x133/0x15f
 [<ffffffff81100d8e>] sys_epoll_wait+0x90/0xae
 [<ffffffff813bf2e6>] system_call_fastpath+0x1a/0x1f
Code: 8b 57 60 48 89 44 24 10 8b 87 ac 00 00 00 48 89 44 24 08 31 c0
8b bf a8 00 00 00 48 89 3c 24 48 c7 c7 43 07 4d 81 e8 2c 27 09 00 <0f>
0b 89 c0 49 8d 04 00 c9 c3 55 48 89 e5 41 57 41 56 41 55 41
RIP  [<ffffffff8132414a>] skb_put+0x7c/0x86
 RSP <ffff8800bed83d60>
---[ end trace a721715cd86be064 ]---
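
The numbers on the two skb_over_panic lines reported in this thread can be cross-checked with a little arithmetic. The sketch below is not kernel code: the struct and helper names are hypothetical, and it assumes that tail and end are offsets from head and that the printed len and tail already include the put bytes, which the reported values are consistent with.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fields copied from one skb_over_panic line (hypothetical holder struct). */
struct panic_report {
	uint64_t head;     /* head: buffer start address */
	uint64_t data;     /* data: payload start address */
	unsigned int tail; /* tail: offset from head, after the put */
	unsigned int end;  /* end: offset from head */
	unsigned int len;  /* len: payload length, after the put */
	unsigned int put;  /* put: bytes the caller asked to append */
};

/* Report from 2012-04-15. */
const struct panic_report first_report = {
	UINT64_C(0xffff8800afe4d800), UINT64_C(0xffff8800afe4dd6b),
	0x963, 0x6c0, 1016, 815
};

/* Report from 2012-04-17. */
const struct panic_report second_report = {
	UINT64_C(0xffff8800babcf800), UINT64_C(0xffff8800babcfd10),
	0x9f0, 0x6c0, 1248, 932
};

/* Bytes by which the new tail runs past end (0 if it fits). */
unsigned int overrun(const struct panic_report *r)
{
	return r->tail > r->end ? r->tail - r->end : 0;
}

/* Room that was really left between the old tail and end. */
unsigned int real_tailroom(const struct panic_report *r)
{
	unsigned int old_tail = r->tail - r->put;

	return r->end > old_tail ? r->end - old_tail : 0;
}

/* True when (data - head) + len lands exactly on tail (linear payload). */
int is_linear_consistent(const struct panic_report *r)
{
	assert(r->data >= r->head);
	return (r->data - r->head) + r->len == r->tail;
}

/* Print a one-line summary of a report. */
void print_report(const char *name, const struct panic_report *r)
{
	printf("%s: put %u bytes, only %u fit, overrun %u\n",
	       name, r->put, real_tailroom(r), overrun(r));
}
```

In both reports the requested put is several hundred bytes larger than the room really left before end, so the caller's idea of the free space was too large by the overrun amount.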


* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-17 20:15   ` Alexander Beregalov
@ 2012-04-17 20:45     ` Eric Dumazet
  2012-04-17 20:47       ` David Miller
  2012-04-18  6:37       ` Alexander Beregalov
  0 siblings, 2 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-04-17 20:45 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: netdev, Linux Kernel Mailing List

On Wed, 2012-04-18 at 00:15 +0400, Alexander Beregalov wrote:
> On 15 April 2012 19:16, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Sun, 2012-04-15 at 16:24 +0400, Alexander Beregalov wrote:
> >> Hi
> >>
> >> kernel 3.4.0-rc2-00333-g668ce0a
> >>
> >> calltrace is lost on netconsole, sorry.
> >>
> >> ethernet is realtek/r8169
> >>
> >> It happened already two times with rtorrent, perhaps I can reproduce
> >> it, what else can I provide to you?
> >>
> > full stack trace needed please.
> >
> 
> This time calltrace is different (I saw few last lines of calltrace on
> a display, but not enough) and netconsole transmitted complete
> message, but perhaps it is the same problem. At least 'end' is the
> same.

Thanks a lot, I believe I know where the problem is.

Could you check whether commit a21d45726acacc963d8baddf74607d9b74e2b723
(tcp: avoid order-1 allocations on wifi and tx path)
is in your tree?





* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-17 20:45     ` Eric Dumazet
@ 2012-04-17 20:47       ` David Miller
  2012-04-17 20:53         ` Eric Dumazet
  2012-04-18  6:37       ` Alexander Beregalov
  1 sibling, 1 reply; 11+ messages in thread
From: David Miller @ 2012-04-17 20:47 UTC (permalink / raw)
  To: eric.dumazet; +Cc: a.beregalov, netdev, linux-kernel

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 17 Apr 2012 22:45:06 +0200

> Could you check if commit a21d45726acacc963d8baddf74607d9b74e2b723
> (tcp: avoid order-1 allocations on wifi and tx path)
> was in your tree ?

I was about to say that I think this is the guilty commit too.

Good thing I held off the -stable submission of that change
for a bit :-)


* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-17 20:47       ` David Miller
@ 2012-04-17 20:53         ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-04-17 20:53 UTC (permalink / raw)
  To: David Miller; +Cc: a.beregalov, netdev, linux-kernel

On Tue, 2012-04-17 at 16:47 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 17 Apr 2012 22:45:06 +0200
> 
> > Could you check if commit a21d45726acacc963d8baddf74607d9b74e2b723
> > (tcp: avoid order-1 allocations on wifi and tx path)
> > was in your tree ?
> 
> I was about to say that I think this is the guilty commit too.
> 
> Good thing I held off the -stable submission of that change
> for a bit :-)

The fix should be easy, I think, but yes, you can of course hold the
stable submission.

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 376b2cf..7ac6423 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1096,6 +1096,7 @@ static void __pskb_trim_head(struct sk_buff *skb, int len)
 	eat = min_t(int, len, skb_headlen(skb));
 	if (eat) {
 		__skb_pull(skb, eat);
+		skb->avail_size -= eat;
 		len -= eat;
 		if (!len)
 			return;




* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-17 20:45     ` Eric Dumazet
  2012-04-17 20:47       ` David Miller
@ 2012-04-18  6:37       ` Alexander Beregalov
  2012-04-18  7:29         ` Eric Dumazet
  1 sibling, 1 reply; 11+ messages in thread
From: Alexander Beregalov @ 2012-04-18  6:37 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Linux Kernel Mailing List

On 18 April 2012 00:45, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Thanks a lot, I believe I know where the problem is.
>
> Could you check if commit a21d45726acacc963d8baddf74607d9b74e2b723
> (tcp: avoid order-1 allocations on wifi and tx path)
> was in your tree ?
>

It was,
git show a21d4572 v3.4-rc2..668ce0a
shows it


* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-18  6:37       ` Alexander Beregalov
@ 2012-04-18  7:29         ` Eric Dumazet
  2012-04-18  7:54           ` Alexander Beregalov
  2012-04-18 20:14           ` [PATCH] tcp: fix retransmit of partially acked frames Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-04-18  7:29 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: netdev, Linux Kernel Mailing List

On Wed, 2012-04-18 at 10:37 +0400, Alexander Beregalov wrote:
> It was,
> git show a21d4572 v3.4-rc2..668ce0a
> shows it

Thanks for the confirmation.

Did you see the patch I sent some hours ago, and can you test it?

If not, I can probably reproduce the problem in my lab.

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 376b2cf..7ac6423 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1096,6 +1096,7 @@ static void __pskb_trim_head(struct sk_buff *skb, int len)
 	eat = min_t(int, len, skb_headlen(skb));
 	if (eat) {
 		__skb_pull(skb, eat);
+		skb->avail_size -= eat;
 		len -= eat;
 		if (!len)
 			return;




* Re: 3.4.0-rc2: skb_put() -> skb_over_panic
  2012-04-18  7:29         ` Eric Dumazet
@ 2012-04-18  7:54           ` Alexander Beregalov
  2012-04-18 20:14           ` [PATCH] tcp: fix retransmit of partially acked frames Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Alexander Beregalov @ 2012-04-18  7:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Linux Kernel Mailing List

>> > Thanks a lot, I believe I know where the problem is.
>> >
>> > Could you check if commit a21d45726acacc963d8baddf74607d9b74e2b723
>> > (tcp: avoid order-1 allocations on wifi and tx path)
>> > was in your tree ?
>> >
>>
>> It was,
>> git show a21d4572 v3.4-rc2..668ce0a
>> shows it
>
> Thanks for the confirmation.
>
> Did you see the patch I sent some hours ago, and can you test it?
>
> If not, I can probably reproduce the problem in my lab.
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 376b2cf..7ac6423 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1096,6 +1096,7 @@ static void __pskb_trim_head(struct sk_buff *skb, int len)
>        eat = min_t(int, len, skb_headlen(skb));
>        if (eat) {
>                __skb_pull(skb, eat);
> +               skb->avail_size -= eat;
>                len -= eat;
>                if (!len)
>                        return;
>
>

Yes, I am testing it.


* [PATCH] tcp: fix retransmit of partially acked frames
  2012-04-18  7:29         ` Eric Dumazet
  2012-04-18  7:54           ` Alexander Beregalov
@ 2012-04-18 20:14           ` Eric Dumazet
  2012-04-18 20:54             ` David Miller
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-04-18 20:14 UTC (permalink / raw)
  To: Alexander Beregalov, David Miller
  Cc: netdev, Linux Kernel Mailing List, Marc MERLIN

From: Eric Dumazet <edumazet@google.com>

Alexander Beregalov reported skb_over_panic errors and provided a stack
trace.

It appears that commit a21d45726aca (tcp: avoid order-1 allocations on
wifi and tx path) added a regression when a retransmit is done after a
partial ACK.

tcp_retransmit_skb() tries to aggregate several frames if the first one
has enough available room to hold the payload of the following ones.
This is controlled by the /proc/sys/net/ipv4/tcp_retrans_collapse
tunable (default: enabled).
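
For anyone trying to reproduce the report, the current value of that tunable can be read like any other sysctl file. This is a minimal user-space sketch, not part of the patch; read_int_tunable() is a hypothetical helper name, and the path is the one given in the paragraph above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one integer from a sysctl-style text file such as
 * /proc/sys/net/ipv4/tcp_retrans_collapse.
 * Returns 0 on success and stores the value in *value,
 * or -1 if the file cannot be opened or does not start with an integer.
 */
int read_int_tunable(const char *path, int *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);

	return ok ? 0 : -1;
}
```

Calling read_int_tunable("/proc/sys/net/ipv4/tcp_retrans_collapse", &v) on the affected machine and getting a non-zero v would indicate that the collapse path described here is enabled.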

The problem is that we must make sure __pskb_trim_head() doesn't fool
skb_availroom() when pulling some bytes from the skb (this pull is done
when the receiver ACKs part of the frame).

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Marc MERLIN <marc@merlins.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_output.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 376b2cf..7ac6423 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1096,6 +1096,7 @@ static void __pskb_trim_head(struct sk_buff *skb, int len)
 	eat = min_t(int, len, skb_headlen(skb));
 	if (eat) {
 		__skb_pull(skb, eat);
+		skb->avail_size -= eat;
 		len -= eat;
 		if (!len)
 			return;
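
The effect of the one-line change above can be illustrated with a user-space toy model. This is only a sketch under stated assumptions: the toy_* names are hypothetical, the struct tracks offsets rather than real memory, and it assumes skb_availroom() is computed as avail_size minus len for a linear skb. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct sk_buff: offsets only, no real memory. */
struct toy_skb {
	unsigned int data;       /* offset of the first payload byte */
	unsigned int tail;       /* offset one past the last payload byte */
	unsigned int end;        /* offset one past the buffer */
	unsigned int len;        /* payload bytes (linear area only) */
	unsigned int avail_size; /* payload budget recorded at allocation */
};

/* Allocate with `headroom` reserved bytes and `size` bytes of payload budget. */
struct toy_skb toy_alloc(unsigned int headroom, unsigned int size)
{
	struct toy_skb skb = { headroom, headroom, headroom + size, 0, size };

	return skb;
}

/* Room physically left behind tail. */
unsigned int toy_tailroom(const struct toy_skb *skb)
{
	return skb->end - skb->tail;
}

/* Room the collapse logic believes it has (assumed: avail_size - len). */
unsigned int toy_availroom(const struct toy_skb *skb)
{
	return skb->avail_size - skb->len;
}

/* Append n bytes; returns false where the kernel would hit skb_over_panic(). */
bool toy_put(struct toy_skb *skb, unsigned int n)
{
	if (skb->tail + n > skb->end)
		return false;
	skb->tail += n;
	skb->len += n;
	return true;
}

/* Drop `eat` ACKed bytes from the front; `fixed` applies the patch's extra line. */
void toy_trim_head(struct toy_skb *skb, unsigned int eat, bool fixed)
{
	assert(eat <= skb->len);
	skb->data += eat;
	skb->len -= eat;
	if (fixed)
		skb->avail_size -= eat;
}

/*
 * Merge next_len bytes the way a collapse would: trust availroom, then put.
 * Returns 1 if merged, 0 if skipped for lack of room, -1 on overrun.
 */
int toy_collapse(struct toy_skb *skb, unsigned int next_len)
{
	if (next_len > toy_availroom(skb))
		return 0;
	return toy_put(skb, next_len) ? 1 : -1;
}
```

With 200 bytes of headroom, an 800-byte budget, 600 bytes queued, and 300 of them ACKed, the unpatched trim reports 500 bytes of availroom while only 200 bytes remain behind tail, so collapsing a 400-byte frame overruns. With the extra subtraction, availroom stays at 200 and the same collapse is skipped.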




* Re: [PATCH] tcp: fix retransmit of partially acked frames
  2012-04-18 20:14           ` [PATCH] tcp: fix retransmit of partially acked frames Eric Dumazet
@ 2012-04-18 20:54             ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2012-04-18 20:54 UTC (permalink / raw)
  To: eric.dumazet; +Cc: a.beregalov, netdev, linux-kernel, marc

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 18 Apr 2012 22:14:23 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Alexander Beregalov reported skb_over_panic errors and provided a stack
> trace.
> 
> It appears that commit a21d45726aca (tcp: avoid order-1 allocations on
> wifi and tx path) added a regression when a retransmit is done after a
> partial ACK.
> 
> tcp_retransmit_skb() tries to aggregate several frames if the first one
> has enough available room to hold the payload of the following ones.
> This is controlled by the /proc/sys/net/ipv4/tcp_retrans_collapse
> tunable (default: enabled).
> 
> The problem is that we must make sure __pskb_trim_head() doesn't fool
> skb_availroom() when pulling some bytes from the skb (this pull is done
> when the receiver ACKs part of the frame).
> 
> Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
> Cc: Marc MERLIN <marc@merlins.org>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks.

