netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
       [not found] <529376a4-cf63-f225-ce7c-4747e9966938@tessares.net>
@ 2019-08-24  6:03 ` Tim Froidcoeur
  2019-08-24 22:05   ` Jonathan Lemon
  2019-08-31 12:20   ` Sasha Levin
  0 siblings, 2 replies; 11+ messages in thread
From: Tim Froidcoeur @ 2019-08-24  6:03 UTC (permalink / raw)
  To: matthieu.baerts
  Cc: aprout, cpaasch, davem, edumazet, gregkh, jonathan.lemon, jtl,
	linux-kernel, mkubecek, ncardwell, sashal, stable,
	tim.froidcoeur, ycheng, netdev

Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
triggers following stack trace:

[25244.848046] kernel BUG at ./include/linux/skbuff.h:1406!
[25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc
[25244.888167] Call Trace:
[25244.889182]  <IRQ>
[25244.890001]  tcp_fragment+0x9c/0x2cf
[25244.891295]  tcp_write_xmit+0x68f/0x988
[25244.892732]  __tcp_push_pending_frames+0x3b/0xa0
[25244.894347]  tcp_data_snd_check+0x2a/0xc8
[25244.895775]  tcp_rcv_established+0x2a8/0x30d
[25244.897282]  tcp_v4_do_rcv+0xb2/0x158
[25244.898666]  tcp_v4_rcv+0x692/0x956
[25244.899959]  ip_local_deliver_finish+0xeb/0x169
[25244.901547]  __netif_receive_skb_core+0x51c/0x582
[25244.903193]  ? inet_gro_receive+0x239/0x247
[25244.904756]  netif_receive_skb_internal+0xab/0xc6
[25244.906395]  napi_gro_receive+0x8a/0xc0
[25244.907760]  receive_buf+0x9a1/0x9cd
[25244.909160]  ? load_balance+0x17a/0x7b7
[25244.910536]  ? vring_unmap_one+0x18/0x61
[25244.911932]  ? detach_buf+0x60/0xfa
[25244.913234]  virtnet_poll+0x128/0x1e1
[25244.914607]  net_rx_action+0x12a/0x2b1
[25244.915953]  __do_softirq+0x11c/0x26b
[25244.917269]  ? handle_irq_event+0x44/0x56
[25244.918695]  irq_exit+0x61/0xa0
[25244.919947]  do_IRQ+0x9d/0xbb
[25244.921065]  common_interrupt+0x85/0x85
[25244.922479]  </IRQ>

tcp_rtx_queue_tail() (called by tcp_fragment()) can call
tcp_write_queue_prev() on the first packet in the queue, which will trigger
the BUG in tcp_write_queue_prev(), because there is no previous packet.

This happens when the retransmit queue is empty, for example in case of a
zero window.

Patch is needed for 4.4, 4.9 and 4.14 stable branches.

Fixes: 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
Change-Id: I839bde7167ae59e2f7d916c913507372445765c5
Signed-off-by: Tim Froidcoeur <tim.froidcoeur@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
---
 include/net/tcp.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 9de2c8cdcc51..1e70ca75c8bf 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1705,6 +1705,10 @@ static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
 {
 	struct sk_buff *skb = tcp_send_head(sk);

+	/* empty retransmit queue, for example due to zero window */
+	if (skb == tcp_write_queue_head(sk))
+		return NULL;
+
 	return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
 }

--
2.23.0


-- 


Disclaimer: https://www.tessares.net/mail-disclaimer/ 
<https://www.tessares.net/mail-disclaimer/>



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-24  6:03 ` [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue Tim Froidcoeur
@ 2019-08-24 22:05   ` Jonathan Lemon
  2019-08-30 23:26     ` Christoph Paasch
  2019-08-31 12:20   ` Sasha Levin
  1 sibling, 1 reply; 11+ messages in thread
From: Jonathan Lemon @ 2019-08-24 22:05 UTC (permalink / raw)
  To: Tim Froidcoeur
  Cc: matthieu.baerts, aprout, cpaasch, davem, edumazet, gregkh, jtl,
	linux-kernel, mkubecek, ncardwell, sashal, stable, ycheng,
	netdev



On 23 Aug 2019, at 23:03, Tim Froidcoeur wrote:

> Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
> triggers following stack trace:
>
> [25244.848046] kernel BUG at ./include/linux/skbuff.h:1406!
> [25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc
> [25244.888167] Call Trace:
> [25244.889182]  <IRQ>
> [25244.890001]  tcp_fragment+0x9c/0x2cf
> [25244.891295]  tcp_write_xmit+0x68f/0x988
> [25244.892732]  __tcp_push_pending_frames+0x3b/0xa0
> [25244.894347]  tcp_data_snd_check+0x2a/0xc8
> [25244.895775]  tcp_rcv_established+0x2a8/0x30d
> [25244.897282]  tcp_v4_do_rcv+0xb2/0x158
> [25244.898666]  tcp_v4_rcv+0x692/0x956
> [25244.899959]  ip_local_deliver_finish+0xeb/0x169
> [25244.901547]  __netif_receive_skb_core+0x51c/0x582
> [25244.903193]  ? inet_gro_receive+0x239/0x247
> [25244.904756]  netif_receive_skb_internal+0xab/0xc6
> [25244.906395]  napi_gro_receive+0x8a/0xc0
> [25244.907760]  receive_buf+0x9a1/0x9cd
> [25244.909160]  ? load_balance+0x17a/0x7b7
> [25244.910536]  ? vring_unmap_one+0x18/0x61
> [25244.911932]  ? detach_buf+0x60/0xfa
> [25244.913234]  virtnet_poll+0x128/0x1e1
> [25244.914607]  net_rx_action+0x12a/0x2b1
> [25244.915953]  __do_softirq+0x11c/0x26b
> [25244.917269]  ? handle_irq_event+0x44/0x56
> [25244.918695]  irq_exit+0x61/0xa0
> [25244.919947]  do_IRQ+0x9d/0xbb
> [25244.921065]  common_interrupt+0x85/0x85
> [25244.922479]  </IRQ>
>
> tcp_rtx_queue_tail() (called by tcp_fragment()) can call
> tcp_write_queue_prev() on the first packet in the queue, which will trigger
> the BUG in tcp_write_queue_prev(), because there is no previous packet.
>
> This happens when the retransmit queue is empty, for example in case of a
> zero window.
>
> Patch is needed for 4.4, 4.9 and 4.14 stable branches.
>
> Fixes: 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
> Change-Id: I839bde7167ae59e2f7d916c913507372445765c5
> Signed-off-by: Tim Froidcoeur <tim.froidcoeur@tessares.net>
> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Reviewed-by: Christoph Paasch <cpaasch@apple.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-24 22:05   ` Jonathan Lemon
@ 2019-08-30 23:26     ` Christoph Paasch
  2019-08-31  2:20       ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Paasch @ 2019-08-30 23:26 UTC (permalink / raw)
  To: Jonathan Lemon, stable, gregkh
  Cc: Tim Froidcoeur, matthieu.baerts, aprout, davem, edumazet, jtl,
	linux-kernel, mkubecek, ncardwell, sashal, ycheng, netdev

Hello,

On 24/08/19 - 15:05:20, Jonathan Lemon wrote:
> 
> 
> On 23 Aug 2019, at 23:03, Tim Froidcoeur wrote:
> 
> > Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
> > triggers following stack trace:
> >
> > [25244.848046] kernel BUG at ./include/linux/skbuff.h:1406!
> > [25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc
> > [25244.888167] Call Trace:
> > [25244.889182]  <IRQ>
> > [25244.890001]  tcp_fragment+0x9c/0x2cf
> > [25244.891295]  tcp_write_xmit+0x68f/0x988
> > [25244.892732]  __tcp_push_pending_frames+0x3b/0xa0
> > [25244.894347]  tcp_data_snd_check+0x2a/0xc8
> > [25244.895775]  tcp_rcv_established+0x2a8/0x30d
> > [25244.897282]  tcp_v4_do_rcv+0xb2/0x158
> > [25244.898666]  tcp_v4_rcv+0x692/0x956
> > [25244.899959]  ip_local_deliver_finish+0xeb/0x169
> > [25244.901547]  __netif_receive_skb_core+0x51c/0x582
> > [25244.903193]  ? inet_gro_receive+0x239/0x247
> > [25244.904756]  netif_receive_skb_internal+0xab/0xc6
> > [25244.906395]  napi_gro_receive+0x8a/0xc0
> > [25244.907760]  receive_buf+0x9a1/0x9cd
> > [25244.909160]  ? load_balance+0x17a/0x7b7
> > [25244.910536]  ? vring_unmap_one+0x18/0x61
> > [25244.911932]  ? detach_buf+0x60/0xfa
> > [25244.913234]  virtnet_poll+0x128/0x1e1
> > [25244.914607]  net_rx_action+0x12a/0x2b1
> > [25244.915953]  __do_softirq+0x11c/0x26b
> > [25244.917269]  ? handle_irq_event+0x44/0x56
> > [25244.918695]  irq_exit+0x61/0xa0
> > [25244.919947]  do_IRQ+0x9d/0xbb
> > [25244.921065]  common_interrupt+0x85/0x85
> > [25244.922479]  </IRQ>
> >
> > tcp_rtx_queue_tail() (called by tcp_fragment()) can call
> > tcp_write_queue_prev() on the first packet in the queue, which will trigger
> > the BUG in tcp_write_queue_prev(), because there is no previous packet.
> >
> > This happens when the retransmit queue is empty, for example in case of a
> > zero window.
> >
> > Patch is needed for 4.4, 4.9 and 4.14 stable branches.
> >
> > Fixes: 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
> > Change-Id: I839bde7167ae59e2f7d916c913507372445765c5
> > Signed-off-by: Tim Froidcoeur <tim.froidcoeur@tessares.net>
> > Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> > Reviewed-by: Christoph Paasch <cpaasch@apple.com>
> 
> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>

just checking in, if the patch is getting picked up for -stable ?

(I don't see it in the stable-queue)


Thanks,
Christoph



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-30 23:26     ` Christoph Paasch
@ 2019-08-31  2:20       ` David Miller
  2019-08-31 10:53         ` maowenan
  2019-08-31 11:44         ` maowenan
  0 siblings, 2 replies; 11+ messages in thread
From: David Miller @ 2019-08-31  2:20 UTC (permalink / raw)
  To: cpaasch
  Cc: jonathan.lemon, stable, gregkh, tim.froidcoeur, matthieu.baerts,
	aprout, edumazet, jtl, linux-kernel, mkubecek, ncardwell, sashal,
	ycheng, netdev

From: Christoph Paasch <cpaasch@apple.com>
Date: Fri, 30 Aug 2019 16:26:57 -0700

> (I don't see it in the stable-queue)

I don't handle any stable branch before the most recent two, so this isn't
my territory.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-31  2:20       ` David Miller
@ 2019-08-31 10:53         ` maowenan
  2019-08-31 11:44         ` maowenan
  1 sibling, 0 replies; 11+ messages in thread
From: maowenan @ 2019-08-31 10:53 UTC (permalink / raw)
  To: David Miller, cpaasch
  Cc: jonathan.lemon, stable, gregkh, tim.froidcoeur, matthieu.baerts,
	aprout, edumazet, jtl, linux-kernel, mkubecek, ncardwell, sashal,
	ycheng, netdev

[<ffffff90094930dc>] skb_peek_tail include/linux/skbuff.h:1516 [inline]
[<ffffff90094930dc>] tcp_write_queue_tail include/net/tcp.h:1478 [inline]
[<ffffff90094930dc>] tcp_rtx_queue_tail include/net/tcp.h:1543 [inline]
[<ffffff90094930dc>] tcp_fragment+0xc64/0xce8 net/ipv4/tcp_output.c:1175
[<ffffff90094a37f0>] tcp_write_wakeup+0x3f8/0x4a0 net/ipv4/tcp_output.c:3496
[<ffffff90094a38d0>] tcp_send_probe0+0x38/0x2d8 net/ipv4/tcp_output.c:3523
[<ffffff90094a75a0>] tcp_probe_timer net/ipv4/tcp_timer.c:343 [inline]
[<ffffff90094a75a0>] tcp_write_timer_handler+0x640/0x720 net/ipv4/tcp_timer.c:548
[<ffffff90094a76f8>] tcp_write_timer+0x78/0x1d0 net/ipv4/tcp_timer.c:562
[<ffffff90082610b0>] call_timer_fn.isra.1+0x58/0x180 kernel/time/timer.c:1144
[<ffffff90082616ec>] __run_timers kernel/time/timer.c:1218 [inline]
[<ffffff90082616ec>] run_timer_softirq+0x514/0x848 kernel/time/timer.c:1401
[<ffffff9008141a28>] __do_softirq+0x340/0x878 kernel/softirq.c:273
[<ffffff9008142890>] do_softirq_own_stack include/linux/interrupt.h:498 [inline]
[<ffffff9008142890>] invoke_softirq kernel/softirq.c:357 [inline]
[<ffffff9008142890>] irq_exit+0x170/0x370 kernel/softirq.c:391
[<ffffff900821d550>] __handle_domain_irq+0x100/0x1c0 kernel/irq/irqdesc.c:459
[<ffffff90080914a0>] handle_domain_irq include/linux/irqdesc.h:168 [inline]
[<ffffff90080914a0>] gic_handle_irq+0xd0/0x1f0 drivers/irqchip/irq-gic.c:377

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-31  2:20       ` David Miller
  2019-08-31 10:53         ` maowenan
@ 2019-08-31 11:44         ` maowenan
       [not found]           ` <CAOj+RUsqTUF9fuetskRRw26Z=sBM-mELSMcV21Ch06007aP5yQ@mail.gmail.com>
  1 sibling, 1 reply; 11+ messages in thread
From: maowenan @ 2019-08-31 11:44 UTC (permalink / raw)
  To: David Miller, cpaasch
  Cc: jonathan.lemon, stable, gregkh, tim.froidcoeur, matthieu.baerts,
	aprout, edumazet, jtl, linux-kernel, mkubecek, ncardwell, sashal,
	ycheng, netdev

Tim 

 Can you share the reproduce steps for this issue? C or syzkaller is ok.
 Thanks a lot.
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-24  6:03 ` [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue Tim Froidcoeur
  2019-08-24 22:05   ` Jonathan Lemon
@ 2019-08-31 12:20   ` Sasha Levin
  2019-08-31 13:14     ` Matthieu Baerts
  1 sibling, 1 reply; 11+ messages in thread
From: Sasha Levin @ 2019-08-31 12:20 UTC (permalink / raw)
  To: Tim Froidcoeur
  Cc: matthieu.baerts, aprout, cpaasch, davem, edumazet, gregkh,
	jonathan.lemon, jtl, linux-kernel, mkubecek, ncardwell, stable,
	ycheng, netdev

On Sat, Aug 24, 2019 at 08:03:51AM +0200, Tim Froidcoeur wrote:
>Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
>triggers following stack trace:
>
>[25244.848046] kernel BUG at ./include/linux/skbuff.h:1406!
>[25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc
>[25244.888167] Call Trace:
>[25244.889182]  <IRQ>
>[25244.890001]  tcp_fragment+0x9c/0x2cf
>[25244.891295]  tcp_write_xmit+0x68f/0x988
>[25244.892732]  __tcp_push_pending_frames+0x3b/0xa0
>[25244.894347]  tcp_data_snd_check+0x2a/0xc8
>[25244.895775]  tcp_rcv_established+0x2a8/0x30d
>[25244.897282]  tcp_v4_do_rcv+0xb2/0x158
>[25244.898666]  tcp_v4_rcv+0x692/0x956
>[25244.899959]  ip_local_deliver_finish+0xeb/0x169
>[25244.901547]  __netif_receive_skb_core+0x51c/0x582
>[25244.903193]  ? inet_gro_receive+0x239/0x247
>[25244.904756]  netif_receive_skb_internal+0xab/0xc6
>[25244.906395]  napi_gro_receive+0x8a/0xc0
>[25244.907760]  receive_buf+0x9a1/0x9cd
>[25244.909160]  ? load_balance+0x17a/0x7b7
>[25244.910536]  ? vring_unmap_one+0x18/0x61
>[25244.911932]  ? detach_buf+0x60/0xfa
>[25244.913234]  virtnet_poll+0x128/0x1e1
>[25244.914607]  net_rx_action+0x12a/0x2b1
>[25244.915953]  __do_softirq+0x11c/0x26b
>[25244.917269]  ? handle_irq_event+0x44/0x56
>[25244.918695]  irq_exit+0x61/0xa0
>[25244.919947]  do_IRQ+0x9d/0xbb
>[25244.921065]  common_interrupt+0x85/0x85
>[25244.922479]  </IRQ>
>
>tcp_rtx_queue_tail() (called by tcp_fragment()) can call
>tcp_write_queue_prev() on the first packet in the queue, which will trigger
>the BUG in tcp_write_queue_prev(), because there is no previous packet.
>
>This happens when the retransmit queue is empty, for example in case of a
>zero window.
>
>Patch is needed for 4.4, 4.9 and 4.14 stable branches.

There needs to be a better explanation of why it's not needed
upstream...

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-31 12:20   ` Sasha Levin
@ 2019-08-31 13:14     ` Matthieu Baerts
  2019-09-01  0:07       ` Sasha Levin
  0 siblings, 1 reply; 11+ messages in thread
From: Matthieu Baerts @ 2019-08-31 13:14 UTC (permalink / raw)
  To: Sasha Levin, Tim Froidcoeur
  Cc: aprout, cpaasch, davem, edumazet, gregkh, jonathan.lemon, jtl,
	linux-kernel, mkubecek, ncardwell, stable, ycheng, netdev

Hi Sasha,

Thank you for your reply!

On 31/08/2019 14:20, Sasha Levin wrote:
> On Sat, Aug 24, 2019 at 08:03:51AM +0200, Tim Froidcoeur wrote:
>> Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
>> triggers following stack trace:
>>
>> [25244.848046] kernel BUG at ./include/linux/skbuff.h:1406!
>> [25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc
>> [25244.888167] Call Trace:
>> [25244.889182]  <IRQ>
>> [25244.890001]  tcp_fragment+0x9c/0x2cf
>> [25244.891295]  tcp_write_xmit+0x68f/0x988
>> [25244.892732]  __tcp_push_pending_frames+0x3b/0xa0
>> [25244.894347]  tcp_data_snd_check+0x2a/0xc8
>> [25244.895775]  tcp_rcv_established+0x2a8/0x30d
>> [25244.897282]  tcp_v4_do_rcv+0xb2/0x158
>> [25244.898666]  tcp_v4_rcv+0x692/0x956
>> [25244.899959]  ip_local_deliver_finish+0xeb/0x169
>> [25244.901547]  __netif_receive_skb_core+0x51c/0x582
>> [25244.903193]  ? inet_gro_receive+0x239/0x247
>> [25244.904756]  netif_receive_skb_internal+0xab/0xc6
>> [25244.906395]  napi_gro_receive+0x8a/0xc0
>> [25244.907760]  receive_buf+0x9a1/0x9cd
>> [25244.909160]  ? load_balance+0x17a/0x7b7
>> [25244.910536]  ? vring_unmap_one+0x18/0x61
>> [25244.911932]  ? detach_buf+0x60/0xfa
>> [25244.913234]  virtnet_poll+0x128/0x1e1
>> [25244.914607]  net_rx_action+0x12a/0x2b1
>> [25244.915953]  __do_softirq+0x11c/0x26b
>> [25244.917269]  ? handle_irq_event+0x44/0x56
>> [25244.918695]  irq_exit+0x61/0xa0
>> [25244.919947]  do_IRQ+0x9d/0xbb
>> [25244.921065]  common_interrupt+0x85/0x85
>> [25244.922479]  </IRQ>
>>
>> tcp_rtx_queue_tail() (called by tcp_fragment()) can call
>> tcp_write_queue_prev() on the first packet in the queue, which will
>> trigger
>> the BUG in tcp_write_queue_prev(), because there is no previous packet.
>>
>> This happens when the retransmit queue is empty, for example in case of a
>> zero window.
>>
>> Patch is needed for 4.4, 4.9 and 4.14 stable branches.
> 
> There needs to be a better explanation of why it's not needed
> upstream...

Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()") was not a
simple cherry-pick of the original one from master (b617158dc096)
because there is a specific TCP rtx queue only since v4.15. For more
details, please see the commit message of b617158dc096 ("tcp: be more
careful in tcp_fragment()").

The BUG() is hit due to the specific code added to versions older than
v4.15. The comment in skb_queue_prev() (include/linux/skbuff.h:1406),
just before the BUG_ON() somehow suggests to add a check before using
it, what Tim did.

In master, this code path causing the issue will not be taken because
the implementation of tcp_rtx_queue_tail() is different:

    tcp_fragment() → tcp_rtx_queue_tail() → tcp_write_queue_prev() →
skb_queue_prev() → BUG_ON()

Because this patch is specific to versions older than the two last
stable ones but still linked to the network architecture, who can review
and approve it? :)

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts@tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-08-31 13:14     ` Matthieu Baerts
@ 2019-09-01  0:07       ` Sasha Levin
  0 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2019-09-01  0:07 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: Tim Froidcoeur, aprout, cpaasch, davem, edumazet, gregkh,
	jonathan.lemon, jtl, linux-kernel, mkubecek, ncardwell, stable,
	ycheng, netdev

On Sat, Aug 31, 2019 at 03:14:35PM +0200, Matthieu Baerts wrote:
>Hi Sasha,
>
>Thank you for your reply!
>
>On 31/08/2019 14:20, Sasha Levin wrote:
>> On Sat, Aug 24, 2019 at 08:03:51AM +0200, Tim Froidcoeur wrote:
>>> Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
>>> triggers following stack trace:
>>>
>>> [25244.848046] kernel BUG at ./include/linux/skbuff.h:1406!
>>> [25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc
>>> [25244.888167] Call Trace:
>>> [25244.889182]  <IRQ>
>>> [25244.890001]  tcp_fragment+0x9c/0x2cf
>>> [25244.891295]  tcp_write_xmit+0x68f/0x988
>>> [25244.892732]  __tcp_push_pending_frames+0x3b/0xa0
>>> [25244.894347]  tcp_data_snd_check+0x2a/0xc8
>>> [25244.895775]  tcp_rcv_established+0x2a8/0x30d
>>> [25244.897282]  tcp_v4_do_rcv+0xb2/0x158
>>> [25244.898666]  tcp_v4_rcv+0x692/0x956
>>> [25244.899959]  ip_local_deliver_finish+0xeb/0x169
>>> [25244.901547]  __netif_receive_skb_core+0x51c/0x582
>>> [25244.903193]  ? inet_gro_receive+0x239/0x247
>>> [25244.904756]  netif_receive_skb_internal+0xab/0xc6
>>> [25244.906395]  napi_gro_receive+0x8a/0xc0
>>> [25244.907760]  receive_buf+0x9a1/0x9cd
>>> [25244.909160]  ? load_balance+0x17a/0x7b7
>>> [25244.910536]  ? vring_unmap_one+0x18/0x61
>>> [25244.911932]  ? detach_buf+0x60/0xfa
>>> [25244.913234]  virtnet_poll+0x128/0x1e1
>>> [25244.914607]  net_rx_action+0x12a/0x2b1
>>> [25244.915953]  __do_softirq+0x11c/0x26b
>>> [25244.917269]  ? handle_irq_event+0x44/0x56
>>> [25244.918695]  irq_exit+0x61/0xa0
>>> [25244.919947]  do_IRQ+0x9d/0xbb
>>> [25244.921065]  common_interrupt+0x85/0x85
>>> [25244.922479]  </IRQ>
>>>
>>> tcp_rtx_queue_tail() (called by tcp_fragment()) can call
>>> tcp_write_queue_prev() on the first packet in the queue, which will
>>> trigger
>>> the BUG in tcp_write_queue_prev(), because there is no previous packet.
>>>
>>> This happens when the retransmit queue is empty, for example in case of a
>>> zero window.
>>>
>>> Patch is needed for 4.4, 4.9 and 4.14 stable branches.
>>
>> There needs to be a better explanation of why it's not needed
>> upstream...
>
>Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()") was not a
>simple cherry-pick of the original one from master (b617158dc096)
>because there is a specific TCP rtx queue only since v4.15. For more
>details, please see the commit message of b617158dc096 ("tcp: be more
>careful in tcp_fragment()").
>
>The BUG() is hit due to the specific code added to versions older than
>v4.15. The comment in skb_queue_prev() (include/linux/skbuff.h:1406),
>just before the BUG_ON() somehow suggests to add a check before using
>it, what Tim did.
>
>In master, this code path causing the issue will not be taken because
>the implementation of tcp_rtx_queue_tail() is different:
>
>    tcp_fragment() → tcp_rtx_queue_tail() → tcp_write_queue_prev() →
>skb_queue_prev() → BUG_ON()
>
>Because this patch is specific to versions older than the two last
>stable ones but still linked to the network architecture, who can review
>and approve it? :)

Thanks for the explanation. I've changed the commit message to include
this explanation and queued it for 4.4, 4.9 and 4.14.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
       [not found]             ` <F95AC9340317A84688A5F0DF0246F3F21AAB8F82@dggeml512-mbx.china.huawei.com>
@ 2019-09-03  6:58               ` Tim Froidcoeur
  2019-09-03  8:55                 ` maowenan
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Froidcoeur @ 2019-09-03  6:58 UTC (permalink / raw)
  To: maowenan
  Cc: David Miller, cpaasch, jonathan.lemon, stable, gregkh,
	matthieu.baerts, aprout, edumazet, jtl, linux-kernel, mkubecek,
	ncardwell, sashal, ycheng, netdev

Hi,

I also tried to reproduce this in a targeted way, and run into the
same difficulty as you: satisfying the first condition “
(sk->sk_wmem_queued >> 1) > limit “.
I will not have bandwidth the coming days to try and reproduce it in
this way. Maybe simply forcing a very small send buffer using sysctl
net.ipv4.tcp_wmem might even do the trick?

I suspect that the bug is easier to trigger with the MPTCP patch like
I did originally, due to the way this patch manages the tcp subflow
buffers (it can temporarily overfill the buffers, satisfying that
first condition more often).

another thing, the stacktrace you shared before seems caused by
another issue (corrupted socket?), it will not be solved by the patch
we submitted.

kind regards,

Tim


On Tue, Sep 3, 2019 at 5:22 AM maowenan <maowenan@huawei.com> wrote:
>
> Hi Tim,
>
>
>
> I try to reproduce it with packetdrill or user application, but I can’t.
>
> The first condition “ (sk->sk_wmem_queued >> 1) > limit “    can’t be satisfied,
>
> This condition is to avoid tiny SO_SNDBUF values set by user.
>
> It also adds the some room due to the fact that tcp_sendmsg()
>
> and tcp_sendpage() might overshoot sk_wmem_queued by about one full
>
> TSO skb (64KB size).
>
>
>
>         limit = sk->sk_sndbuf + 2 * SKB_TRUESIZE(GSO_MAX_SIZE);
>
>         if (unlikely((sk->sk_wmem_queued >> 1) > limit &&
>
>                      skb != tcp_rtx_queue_head(sk) &&
>
>                      skb != tcp_rtx_queue_tail(sk))) {
>
>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>
>                 return -ENOMEM;
>
>         }
>
>
>
> Can you try to reproduce it with packetdrill or C socket application?
>
>



-- 
Tim Froidcoeur | R&D engineer HAG
tim.froidcoeur@tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

-- 


Disclaimer: https://www.tessares.net/mail-disclaimer/ 
<https://www.tessares.net/mail-disclaimer/>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
  2019-09-03  6:58               ` Tim Froidcoeur
@ 2019-09-03  8:55                 ` maowenan
  0 siblings, 0 replies; 11+ messages in thread
From: maowenan @ 2019-09-03  8:55 UTC (permalink / raw)
  To: Tim Froidcoeur, eric.dumazet
  Cc: David Miller, cpaasch, jonathan.lemon, stable, gregkh,
	matthieu.baerts, aprout, edumazet, jtl, linux-kernel, mkubecek,
	ncardwell, sashal, ycheng, netdev


On 2019/9/3 14:58, Tim Froidcoeur wrote:
> Hi,
> 
> I also tried to reproduce this in a targeted way, and run into the
> same difficulty as you: satisfying the first condition “
> (sk->sk_wmem_queued >> 1) > limit “.
> I will not have bandwidth the coming days to try and reproduce it in
> this way. Maybe simply forcing a very small send buffer using sysctl
> net.ipv4.tcp_wmem might even do the trick?
> 
> I suspect that the bug is easier to trigger with the MPTCP patch like
> I did originally, due to the way this patch manages the tcp subflow
> buffers (it can temporarily overfill the buffers, satisfying that
> first condition more often).
> 
> another thing, the stacktrace you shared before seems caused by
> another issue (corrupted socket?), it will not be solved by the patch
> we submitted.

The trace shows zero window probe message can be BUG_ON in skb_queue_prev,
this is reproduced on our platform with syzkaller. It can be resolved by
your fix patch.
The thing I need to think is why the first condition can be satisfied?
Eric, Do you have any comments to reproduce it as the first condition
is hard to be true?
(sk->sk_wmem_queued >> 1) > limit

> 
> kind regards,
> 
> Tim
> 
> 
> On Tue, Sep 3, 2019 at 5:22 AM maowenan <maowenan@huawei.com> wrote:
>>
>> Hi Tim,
>>
>>
>>
>> I try to reproduce it with packetdrill or user application, but I can’t.
>>
>> The first condition “ (sk->sk_wmem_queued >> 1) > limit “    can’t be satisfied,
>>
>> This condition is to avoid tiny SO_SNDBUF values set by user.
>>
>> It also adds the some room due to the fact that tcp_sendmsg()
>>
>> and tcp_sendpage() might overshoot sk_wmem_queued by about one full
>>
>> TSO skb (64KB size).
>>
>>
>>
>>         limit = sk->sk_sndbuf + 2 * SKB_TRUESIZE(GSO_MAX_SIZE);
>>
>>         if (unlikely((sk->sk_wmem_queued >> 1) > limit &&
>>
>>                      skb != tcp_rtx_queue_head(sk) &&
>>
>>                      skb != tcp_rtx_queue_tail(sk))) {
>>
>>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>>
>>                 return -ENOMEM;
>>
>>         }
>>
>>
>>
>> Can you try to reproduce it with packetdrill or C socket application?
>>
>>
> 
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-09-03  8:55 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <529376a4-cf63-f225-ce7c-4747e9966938@tessares.net>
2019-08-24  6:03 ` [PATCH 4.14] tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue Tim Froidcoeur
2019-08-24 22:05   ` Jonathan Lemon
2019-08-30 23:26     ` Christoph Paasch
2019-08-31  2:20       ` David Miller
2019-08-31 10:53         ` maowenan
2019-08-31 11:44         ` maowenan
     [not found]           ` <CAOj+RUsqTUF9fuetskRRw26Z=sBM-mELSMcV21Ch06007aP5yQ@mail.gmail.com>
     [not found]             ` <F95AC9340317A84688A5F0DF0246F3F21AAB8F82@dggeml512-mbx.china.huawei.com>
2019-09-03  6:58               ` Tim Froidcoeur
2019-09-03  8:55                 ` maowenan
2019-08-31 12:20   ` Sasha Levin
2019-08-31 13:14     ` Matthieu Baerts
2019-09-01  0:07       ` Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).