netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo
@ 2017-12-07 17:43 Neal Cardwell
  2017-12-07 17:43 ` [PATCH net 1/3] tcp_bbr: record "full bw reached" decision in new full_bw_reached bit Neal Cardwell
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Neal Cardwell @ 2017-12-07 17:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell

This patch series has a few minor bug fixes for cases where spurious
loss recoveries can trick BBR estimators into estimating that the
available bandwidth is much lower than the true available bandwidth.
In both cases the fix here is to just reset the estimator upon loss
recovery undo.

Neal Cardwell (3):
  tcp_bbr: record "full bw reached" decision in new full_bw_reached bit
  tcp_bbr: reset full pipe detection on loss recovery undo
  tcp_bbr: reset long-term bandwidth sampling on loss recovery undo

 net/ipv4/tcp_bbr.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

-- 
2.15.1.424.g9478a66081-goog

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net 1/3] tcp_bbr: record "full bw reached" decision in new full_bw_reached bit
  2017-12-07 17:43 [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo Neal Cardwell
@ 2017-12-07 17:43 ` Neal Cardwell
  2017-12-07 17:43 ` [PATCH net 2/3] tcp_bbr: reset full pipe detection on loss recovery undo Neal Cardwell
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Neal Cardwell @ 2017-12-07 17:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell

This commit records the "full bw reached" decision in a new
full_bw_reached bit. This is a pure refactor that does not change the
current behavior, but enables subsequent fixes and improvements.

In particular, this enables simple and clean fixes because the full_bw
and full_bw_cnt can be unconditionally zeroed without worrying about
forgetting that we estimated we filled the pipe in Startup. And it
enables future improvements because multiple code paths can be used
for estimating that we filled the pipe in Startup; any new code paths
only need to set this bit when they think the pipe is full.

Note that this fix intentionally reduces the width of the full_bw_cnt
counter, since we have never used the most significant bit.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp_bbr.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 69ee877574d0..3089c956b9f9 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -110,7 +110,8 @@ struct bbr {
 	u32	lt_last_lost;	     /* LT intvl start: tp->lost */
 	u32	pacing_gain:10,	/* current gain for setting pacing rate */
 		cwnd_gain:10,	/* current gain for setting cwnd */
-		full_bw_cnt:3,	/* number of rounds without large bw gains */
+		full_bw_reached:1,   /* reached full bw in Startup? */
+		full_bw_cnt:2,	/* number of rounds without large bw gains */
 		cycle_idx:3,	/* current index in pacing_gain cycle array */
 		has_seen_rtt:1, /* have we seen an RTT sample yet? */
 		unused_b:5;
@@ -180,7 +181,7 @@ static bool bbr_full_bw_reached(const struct sock *sk)
 {
 	const struct bbr *bbr = inet_csk_ca(sk);
 
-	return bbr->full_bw_cnt >= bbr_full_bw_cnt;
+	return bbr->full_bw_reached;
 }
 
 /* Return the windowed max recent bandwidth sample, in pkts/uS << BW_SCALE. */
@@ -717,6 +718,7 @@ static void bbr_check_full_bw_reached(struct sock *sk,
 		return;
 	}
 	++bbr->full_bw_cnt;
+	bbr->full_bw_reached = bbr->full_bw_cnt >= bbr_full_bw_cnt;
 }
 
 /* If pipe is probably full, drain the queue and then enter steady-state. */
@@ -850,6 +852,7 @@ static void bbr_init(struct sock *sk)
 	bbr->restore_cwnd = 0;
 	bbr->round_start = 0;
 	bbr->idle_restart = 0;
+	bbr->full_bw_reached = 0;
 	bbr->full_bw = 0;
 	bbr->full_bw_cnt = 0;
 	bbr->cycle_mstamp = 0;
-- 
2.15.1.424.g9478a66081-goog

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 2/3] tcp_bbr: reset full pipe detection on loss recovery undo
  2017-12-07 17:43 [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo Neal Cardwell
  2017-12-07 17:43 ` [PATCH net 1/3] tcp_bbr: record "full bw reached" decision in new full_bw_reached bit Neal Cardwell
@ 2017-12-07 17:43 ` Neal Cardwell
  2017-12-07 17:43 ` [PATCH net 3/3] tcp_bbr: reset long-term bandwidth sampling " Neal Cardwell
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Neal Cardwell @ 2017-12-07 17:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell

Fix BBR so that upon notification of a loss recovery undo BBR resets
the full pipe detection (STARTUP exit) state machine.

Under high reordering, reordering events can be interpreted as loss.
If the reordering and spurious loss estimates are high enough, this
could previously cause BBR to spuriously estimate that the pipe is
full.

Since spurious loss recovery means that our overall sending will have
slowed down spuriously, this commit gives a flow more time to probe
robustly for bandwidth and decide the pipe is really full.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp_bbr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 3089c956b9f9..ab3ff14ea7f7 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -874,6 +874,10 @@ static u32 bbr_sndbuf_expand(struct sock *sk)
  */
 static u32 bbr_undo_cwnd(struct sock *sk)
 {
+	struct bbr *bbr = inet_csk_ca(sk);
+
+	bbr->full_bw = 0;   /* spurious slow-down; reset full pipe detection */
+	bbr->full_bw_cnt = 0;
 	return tcp_sk(sk)->snd_cwnd;
 }
 
-- 
2.15.1.424.g9478a66081-goog

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 3/3] tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
  2017-12-07 17:43 [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo Neal Cardwell
  2017-12-07 17:43 ` [PATCH net 1/3] tcp_bbr: record "full bw reached" decision in new full_bw_reached bit Neal Cardwell
  2017-12-07 17:43 ` [PATCH net 2/3] tcp_bbr: reset full pipe detection on loss recovery undo Neal Cardwell
@ 2017-12-07 17:43 ` Neal Cardwell
  2017-12-07 17:48 ` [PATCH net 0/3] TCP BBR sampling fixes for " Neal Cardwell
  2017-12-08 18:28 ` David Miller
  4 siblings, 0 replies; 7+ messages in thread
From: Neal Cardwell @ 2017-12-07 17:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell

Fix BBR so that upon notification of a loss recovery undo BBR resets
long-term bandwidth sampling.

Under high reordering, reordering events can be interpreted as loss.
If the reordering and spurious loss estimates are high enough, this
can cause BBR to spuriously estimate that we are seeing loss rates
high enough to trigger long-term bandwidth estimation. To avoid that
problem, this commit resets long-term bandwidth sampling on loss
recovery undo events.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp_bbr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index ab3ff14ea7f7..8322f26e770e 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -878,6 +878,7 @@ static u32 bbr_undo_cwnd(struct sock *sk)
 
 	bbr->full_bw = 0;   /* spurious slow-down; reset full pipe detection */
 	bbr->full_bw_cnt = 0;
+	bbr_reset_lt_bw_sampling(sk);
 	return tcp_sk(sk)->snd_cwnd;
 }
 
-- 
2.15.1.424.g9478a66081-goog

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo
  2017-12-07 17:43 [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo Neal Cardwell
                   ` (2 preceding siblings ...)
  2017-12-07 17:43 ` [PATCH net 3/3] tcp_bbr: reset long-term bandwidth sampling " Neal Cardwell
@ 2017-12-07 17:48 ` Neal Cardwell
  2017-12-07 18:07   ` David Miller
  2017-12-08 18:28 ` David Miller
  4 siblings, 1 reply; 7+ messages in thread
From: Neal Cardwell @ 2017-12-07 17:48 UTC (permalink / raw)
  To: David Miller; +Cc: Netdev, Neal Cardwell

On Thu, Dec 7, 2017 at 12:43 PM, Neal Cardwell <ncardwell@google.com> wrote:
> This patch series has a few minor bug fixes for cases where spurious
> loss recoveries can trick BBR estimators into estimating that the
> available bandwidth is much lower than the true available bandwidth.
> In both cases the fix here is to just reset the estimator upon loss
> recovery undo.
>
> Neal Cardwell (3):
>   tcp_bbr: record "full bw reached" decision in new full_bw_reached bit
>   tcp_bbr: reset full pipe detection on loss recovery undo
>   tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
>
>  net/ipv4/tcp_bbr.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

Sorry, I should have mentioned these patches are all stable candidates
and should have all carried a footer of:

  Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")

David, I can resend a v2 with that footer on each patch if you want.

neal

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo
  2017-12-07 17:48 ` [PATCH net 0/3] TCP BBR sampling fixes for " Neal Cardwell
@ 2017-12-07 18:07   ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2017-12-07 18:07 UTC (permalink / raw)
  To: ncardwell; +Cc: netdev

From: Neal Cardwell <ncardwell@google.com>
Date: Thu, 7 Dec 2017 12:48:40 -0500

> On Thu, Dec 7, 2017 at 12:43 PM, Neal Cardwell <ncardwell@google.com> wrote:
>> This patch series has a few minor bug fixes for cases where spurious
>> loss recoveries can trick BBR estimators into estimating that the
>> available bandwidth is much lower than the true available bandwidth.
>> In both cases the fix here is to just reset the estimator upon loss
>> recovery undo.
>>
>> Neal Cardwell (3):
>>   tcp_bbr: record "full bw reached" decision in new full_bw_reached bit
>>   tcp_bbr: reset full pipe detection on loss recovery undo
>>   tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
>>
>>  net/ipv4/tcp_bbr.c | 12 ++++++++++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> Sorry, I should have mentioned these patches are all stable candidates
> and should have all carried a footer of:
> 
>   Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
> 
> David, I can resend a v2 with that footer on each patch if you want.

It is not necessary to resend.

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo
  2017-12-07 17:43 [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo Neal Cardwell
                   ` (3 preceding siblings ...)
  2017-12-07 17:48 ` [PATCH net 0/3] TCP BBR sampling fixes for " Neal Cardwell
@ 2017-12-08 18:28 ` David Miller
  4 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2017-12-08 18:28 UTC (permalink / raw)
  To: ncardwell; +Cc: netdev

From: Neal Cardwell <ncardwell@google.com>
Date: Thu,  7 Dec 2017 12:43:29 -0500

> This patch series has a few minor bug fixes for cases where spurious
> loss recoveries can trick BBR estimators into estimating that the
> available bandwidth is much lower than the true available bandwidth.
> In both cases the fix here is to just reset the estimator upon loss
> recovery undo.

Series applied and queued up for -stable, thanks Neal.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-12-08 18:28 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-07 17:43 [PATCH net 0/3] TCP BBR sampling fixes for loss recovery undo Neal Cardwell
2017-12-07 17:43 ` [PATCH net 1/3] tcp_bbr: record "full bw reached" decision in new full_bw_reached bit Neal Cardwell
2017-12-07 17:43 ` [PATCH net 2/3] tcp_bbr: reset full pipe detection on loss recovery undo Neal Cardwell
2017-12-07 17:43 ` [PATCH net 3/3] tcp_bbr: reset long-term bandwidth sampling " Neal Cardwell
2017-12-07 17:48 ` [PATCH net 0/3] TCP BBR sampling fixes for " Neal Cardwell
2017-12-07 18:07   ` David Miller
2017-12-08 18:28 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).