stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Patch "netem: apply correct delay when rate throttling" has been added to the 4.9-stable tree
@ 2018-03-18 16:02 gregkh
  0 siblings, 0 replies; only message in thread
From: gregkh @ 2018-03-18 16:02 UTC (permalink / raw)
  To: njunger, alexander.levin, davem, gregkh, stephen; +Cc: stable, stable-commits


This is a note to let you know that I've just added the patch titled

    netem: apply correct delay when rate throttling

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     netem-apply-correct-delay-when-rate-throttling.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


>From foo@baz Sun Mar 18 16:55:33 CET 2018
From: Nik Unger <njunger@uwaterloo.ca>
Date: Mon, 13 Mar 2017 10:16:58 -0700
Subject: netem: apply correct delay when rate throttling

From: Nik Unger <njunger@uwaterloo.ca>


[ Upstream commit 5080f39e8c72e01cf37e8359023e7018e2a4901e ]

I recently reported on the netem list that iperf network benchmarks
show unexpected results when a bandwidth throttling rate has been
configured for netem. Specifically:

1) The measured link bandwidth *increases* when a higher delay is added
2) The measured link bandwidth appears higher than the specified limit
3) The measured link bandwidth for the same very slow settings varies significantly across
  machines

The issue can be reproduced by using tc to configure netem with a
512kbit rate and various (none, 1us, 50ms, 100ms, 200ms) delays on a
veth pair between network namespaces, and then using iperf (or any
other network benchmarking tool) to test throughput. Complete detailed
instructions are in the original email chain here:
https://lists.linuxfoundation.org/pipermail/netem/2017-February/001672.html

There appear to be two underlying bugs causing these effects:

- The first issue causes long delays when the rate is slow and no
  delay is configured (e.g., "rate 512kbit"). This is because SKBs are
  not orphaned when no delay is configured, so orphaning does not
  occur until *after* the rate-induced delay has been applied. For
  this reason, adding a tiny delay (e.g., "rate 512kbit delay 1us")
  dramatically increases the measured bandwidth.

- The second issue is that rate-induced delays are not correctly
  applied, allowing SKB delays to occur in parallel. The indended
  approach is to compute the delay for an SKB and to add this delay to
  the end of the current queue. However, the code does not detect
  existing SKBs in the queue due to improperly testing sch->q.qlen,
  which is nonzero even when packets exist only in the
  rbtree. Consequently, new SKBs do not wait for the current queue to
  empty. When packet delays vary significantly (e.g., if packet sizes
  are different), then this also causes unintended reordering.

I modified the code to expect a delay (and orphan the SKB) when a rate
is configured. I also added some defensive tests that correctly find
the latest scheduled delivery time, even if it is (unexpectedly) for a
packet in sch->q. I have tested these changes on the latest kernel
(4.11.0-rc1+) and the iperf / ping test results are as expected.

Signed-off-by: Nik Unger <njunger@uwaterloo.ca>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_netem.c |   26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -462,7 +462,7 @@ static int netem_enqueue(struct sk_buff
 	/* If a delay is expected, orphan the skb. (orphaning usually takes
 	 * place at TX completion time, so _before_ the link transit delay)
 	 */
-	if (q->latency || q->jitter)
+	if (q->latency || q->jitter || q->rate)
 		skb_orphan_partial(skb);
 
 	/*
@@ -530,21 +530,31 @@ static int netem_enqueue(struct sk_buff
 		now = psched_get_time();
 
 		if (q->rate) {
-			struct sk_buff *last;
+			struct netem_skb_cb *last = NULL;
+
+			if (sch->q.tail)
+				last = netem_skb_cb(sch->q.tail);
+			if (q->t_root.rb_node) {
+				struct sk_buff *t_skb;
+				struct netem_skb_cb *t_last;
+
+				t_skb = netem_rb_to_skb(rb_last(&q->t_root));
+				t_last = netem_skb_cb(t_skb);
+				if (!last ||
+				    t_last->time_to_send > last->time_to_send) {
+					last = t_last;
+				}
+			}
 
-			if (sch->q.qlen)
-				last = sch->q.tail;
-			else
-				last = netem_rb_to_skb(rb_last(&q->t_root));
 			if (last) {
 				/*
 				 * Last packet in queue is reference point (now),
 				 * calculate this time bonus and subtract
 				 * from delay.
 				 */
-				delay -= netem_skb_cb(last)->time_to_send - now;
+				delay -= last->time_to_send - now;
 				delay = max_t(psched_tdiff_t, 0, delay);
-				now = netem_skb_cb(last)->time_to_send;
+				now = last->time_to_send;
 			}
 
 			delay += packet_len_2_sched_time(qdisc_pkt_len(skb), q);


Patches currently in stable-queue which might be from njunger@uwaterloo.ca are

queue-4.9/netem-apply-correct-delay-when-rate-throttling.patch

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2018-03-18 16:04 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-18 16:02 Patch "netem: apply correct delay when rate throttling" has been added to the 4.9-stable tree gregkh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).