All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] netem: fix rate extension and drop accounting
@ 2012-07-03  9:25 Eric Dumazet
  2012-07-03  9:54 ` Eric Dumazet
       [not found] ` <CAPVr9VP7DniPZj4vZi_myJWfL5JLYKYTXXtrXcKHo9LjEQzjYw@mail.gmail.com>
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2012-07-03  9:25 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Hagen Paul Pfeifer, Yuchung Cheng, Andreas Terzis, Mark Gordon

From: Eric Dumazet <edumazet@google.com>

commit 7bc0f28c7a0c (netem: rate extension) did wrong maths when packet
is enqueued while queue is not empty.

Result is unexpected cumulative delays

# tc qd add dev eth0 root est 1sec 4sec netem delay 200ms rate 100kbit
# ping -i 0.1 172.30.42.18
PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=208 ms
64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=424 ms
64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=838 ms
64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=1142 ms
64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=1335 ms
64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=1949 ms
64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=2450 ms
64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=2840 ms
64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=3121 ms
64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=3291 ms
64 bytes from 172.30.42.18: icmp_req=11 ttl=64 time=3784 ms

This patch also fixes a double drop accounting in case packet is dropped
in tfifo_enqueue()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Andreas Terzis <aterzis@google.com>
Cc: Mark Gordon <msg@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
---
 net/sched/sch_netem.c |   14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index a2a95aa..e8b5ac3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -368,7 +368,6 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	/* We don't fill cb now as skb_unshare() may invalidate it */
 	struct netem_skb_cb *cb;
 	struct sk_buff *skb2;
-	int ret;
 	int count = 1;
 
 	/* Random duplication */
@@ -443,14 +442,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 				 * calculate this time bonus and substract
 				 * from delay.
 				 */
-				delay -= now - netem_skb_cb(skb_peek(list))->time_to_send;
+				delay -= netem_skb_cb(skb_peek(list))->time_to_send - now;
 				now = netem_skb_cb(skb_peek_tail(list))->time_to_send;
 			}
 		}
 
 		cb->time_to_send = now + delay;
 		++q->counter;
-		ret = tfifo_enqueue(skb, sch);
+		return tfifo_enqueue(skb, sch);
 	} else {
 		/*
 		 * Do re-ordering by putting one out of N packets at the front
@@ -462,16 +461,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		__skb_queue_head(&sch->q, skb);
 		sch->qstats.backlog += qdisc_pkt_len(skb);
 		sch->qstats.requeues++;
-		ret = NET_XMIT_SUCCESS;
-	}
-
-	if (ret != NET_XMIT_SUCCESS) {
-		if (net_xmit_drop_count(ret)) {
-			sch->qstats.drops++;
-			return ret;
-		}
 	}
-
 	return NET_XMIT_SUCCESS;
 }
 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
  2012-07-03  9:25 [PATCH] netem: fix rate extension and drop accounting Eric Dumazet
@ 2012-07-03  9:54 ` Eric Dumazet
  2012-07-03 22:04   ` Hagen Paul Pfeifer
       [not found] ` <CAPVr9VP7DniPZj4vZi_myJWfL5JLYKYTXXtrXcKHo9LjEQzjYw@mail.gmail.com>
  1 sibling, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2012-07-03  9:54 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Hagen Paul Pfeifer, Yuchung Cheng, Andreas Terzis, Mark Gordon

On Tue, 2012-07-03 at 11:25 +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> commit 7bc0f28c7a0c (netem: rate extension) did wrong maths when packet
> is enqueued while queue is not empty.
> 
> Result is unexpected cumulative delays
> 
> # tc qd add dev eth0 root est 1sec 4sec netem delay 200ms rate 100kbit
> # ping -i 0.1 172.30.42.18
> PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
> 64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=208 ms
> 64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=424 ms
> 64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=838 ms
> 64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=1142 ms
> 64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=1335 ms
> 64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=1949 ms
> 64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=2450 ms
> 64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=2840 ms
> 64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=3121 ms
> 64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=3291 ms
> 64 bytes from 172.30.42.18: icmp_req=11 ttl=64 time=3784 ms
> 
> This patch also fixes a double drop accounting in case packet is dropped
> in tfifo_enqueue()
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Andreas Terzis <aterzis@google.com>
> Cc: Mark Gordon <msg@google.com>
> Cc: Hagen Paul Pfeifer <hagen@jauu.net>
> ---

Hmm, I'll send a v2

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
  2012-07-03  9:54 ` Eric Dumazet
@ 2012-07-03 22:04   ` Hagen Paul Pfeifer
  2012-07-04  5:58     ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Hagen Paul Pfeifer @ 2012-07-03 22:04 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Yuchung Cheng, Andreas Terzis, Mark Gordon

* Eric Dumazet | 2012-07-03 11:54:17 [+0200]:

>> commit 7bc0f28c7a0c (netem: rate extension) did wrong maths when packet
>> is enqueued while queue is not empty.
>> 
>> Result is unexpected cumulative delays
>> 
>> # tc qd add dev eth0 root est 1sec 4sec netem delay 200ms rate 100kbit
>> # ping -i 0.1 172.30.42.18
>> PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
>> 64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=208 ms
>> 64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=424 ms
>> 64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=838 ms
>> 64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=1142 ms
>> 64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=1335 ms
>> 64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=1949 ms
>> 64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=2450 ms
>> 64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=2840 ms
>> 64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=3121 ms
>> 64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=3291 ms
>> 64 bytes from 172.30.42.18: icmp_req=11 ttl=64 time=3784 ms

Strange, we test the patch in detail. I will take a look ...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
  2012-07-03 22:04   ` Hagen Paul Pfeifer
@ 2012-07-04  5:58     ` Eric Dumazet
  2012-07-04 16:51       ` Hagen Paul Pfeifer
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2012-07-04  5:58 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: netdev, Yuchung Cheng, Andreas Terzis, Mark Gordon

On Wed, 2012-07-04 at 00:04 +0200, Hagen Paul Pfeifer wrote:

> Strange, we test the patch in detail. I will take a look ...

I tried to fix the thing but lacked time yesterday.


I had to use the good old way for my tests.


DEV=eth0
tc qdisc del dev $DEV root
tc qdisc add dev $DEV root handle 30: est 1sec 4sec netem \
        delay 100ms 10ms reorder 3
tc qdisc add dev $DEV handle 40:0 parent 30:0 tbf \
        burst 5000 limit 10000 mtu 1514 rate 100kbit
tc qdisc add dev $DEV handle 50:00 parent 40:0 pfifo limit 200

fundamentally, mixing the TBF is going to be hard with "delay ..."
especially with jitter.

Same problem for reorder : since packets are put at head of queue,
they have no effect on the 'time_to_send' of packets already in queue
and netem use more bandwidth than allowed.

I'll send the patch on the double drop accounting problem because the
fix is easy enough, but fir the rate limiting, I prefer letting you work
on it if you dont mind ?

Thanks

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
  2012-07-04  5:58     ` Eric Dumazet
@ 2012-07-04 16:51       ` Hagen Paul Pfeifer
  2012-07-04 17:23         ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Hagen Paul Pfeifer @ 2012-07-04 16:51 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Yuchung Cheng, Andreas Terzis, Mark Gordon

* Eric Dumazet | 2012-07-04 07:58:08 [+0200]:


>DEV=eth0
>tc qdisc del dev $DEV root
>tc qdisc add dev $DEV root handle 30: est 1sec 4sec netem \
>        delay 100ms 10ms reorder 3
>tc qdisc add dev $DEV handle 40:0 parent 30:0 tbf \
>        burst 5000 limit 10000 mtu 1514 rate 100kbit
>tc qdisc add dev $DEV handle 50:00 parent 40:0 pfifo limit 200
>
>fundamentally, mixing the TBF is going to be hard with "delay ..."
>especially with jitter.
>
>Same problem for reorder : since packets are put at head of queue,
>they have no effect on the 'time_to_send' of packets already in queue
>and netem use more bandwidth than allowed.
>
>I'll send the patch on the double drop accounting problem because the
>fix is easy enough, but fir the rate limiting, I prefer letting you work
>on it if you dont mind ?

OK, I will work on it tomorrow! But Eric, keep in mind that this accumulative
behavior is intended: think about a hypothetical satellite link with a
bandwidth (rate) of 1000 byte/s. If you send three 1000 byte consecutively
packets. The first packet is delayed for 1 second, the second then is
transmitted after 2 seconds, the third after three seconds and so on. So
_this_ accumulative behavior is correct. Anyway, I will look at this tomorrow!

Thanks Eric!

PS: one last question: what do you want to test? TBF and netem rate at the
same time looks, mmhh, special ... ;-) I ask myself what link exhibit this
characteristic?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
  2012-07-04 16:51       ` Hagen Paul Pfeifer
@ 2012-07-04 17:23         ` Eric Dumazet
  2012-07-04 17:30           ` Hagen Paul Pfeifer
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2012-07-04 17:23 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: netdev, Yuchung Cheng, Andreas Terzis, Mark Gordon

On Wed, 2012-07-04 at 18:51 +0200, Hagen Paul Pfeifer wrote:
> OK, I will work on it tomorrow! But Eric, keep in mind that this accumulative
> behavior is intended: think about a hypothetical satellite link with a
> bandwidth (rate) of 1000 byte/s. If you send three 1000 byte consecutively
> packets. The first packet is delayed for 1 second, the second then is
> transmitted after 2 seconds, the third after three seconds and so on. So
> _this_ accumulative behavior is correct. Anyway, I will look at this tomorrow!
> 

I fear you did your tests with no delay on netem.

Try to setup a rate of 100kbit and a delay of 100ms and to really get
full bandwith (100kbit), I am afraid it doesnt work.

Your algo is OK only if no packets are in queue (obviously)

But if you have 2 or 3 packets, the delay are cumulative,
but the delay should be a fixed bias for each packet.


> Thanks Eric!
> 
> PS: one last question: what do you want to test? TBF and netem rate at the
> same time looks, mmhh, special ... ;-) I ask myself what link exhibit this
> characteristic?

TBF as a netem child was the usual way to have delay + rate before your
patch ?

Not sure why you find it special ?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
  2012-07-04 17:23         ` Eric Dumazet
@ 2012-07-04 17:30           ` Hagen Paul Pfeifer
  0 siblings, 0 replies; 10+ messages in thread
From: Hagen Paul Pfeifer @ 2012-07-04 17:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Yuchung Cheng, Andreas Terzis, Mark Gordon

* Eric Dumazet | 2012-07-04 19:23:21 [+0200]:

>I fear you did your tests with no delay on netem.
>
>Try to setup a rate of 100kbit and a delay of 100ms and to really get
>full bandwith (100kbit), I am afraid it doesnt work.
>
>Your algo is OK only if no packets are in queue (obviously)
>
>But if you have 2 or 3 packets, the delay are cumulative,
>but the delay should be a fixed bias for each packet.

Right, we did not test it with delay because we emulated satelite links.
Static delay was negligible. All delay was caused by a limited rate. ok, I
will look at this!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
       [not found] ` <CAPVr9VP7DniPZj4vZi_myJWfL5JLYKYTXXtrXcKHo9LjEQzjYw@mail.gmail.com>
@ 2012-07-16 23:26   ` Hagen Paul Pfeifer
  2012-07-17  5:12     ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Hagen Paul Pfeifer @ 2012-07-16 23:26 UTC (permalink / raw)
  To: Mark Gordon
  Cc: Eric Dumazet, David Miller, netdev, Yuchung Cheng, Andreas Terzis

Sorry Eric for the delay! I am a little bit under project pressure. But I will
test patches where I can.

* Mark Gordon | 2012-07-16 15:15:07 [-0700]:

>Sorry for the late reply on this.  I'm still pretty unsure about the math
>we're doing on
>
>delay -= netem_skb_cb(skb_peek(list))->time_to_send - now;
>
>First, shouldn't it be skb_peek_tail(list)?  Even if you do that the math

The delta is between now and when the _next_ packet is to send. New packets
are enqueued on the tail.

>doesn't really seem to work out.  In my mind there ought to be at least one
>more non-linearity.  The code before/after this patch does not work as I
>would expect and seems to have fairly random effects on the bandwidth and
>that the below patch does work for me.  Here is what I'm suggesting
>(relative to the new patches)

As Eric said: there are problems in combination with a static delay. During
rate extension development we tested the raw/vanilla "rate" functionality.
The rate part works faultless[TM] - at least independet of any other
"delay-latency generator".

Hagen

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
  2012-07-16 23:26   ` Hagen Paul Pfeifer
@ 2012-07-17  5:12     ` Eric Dumazet
       [not found]       ` <CAPVr9VMCYFO-7uEzO6ft2vpPhVvRgHB3EWJJG62OqGqux1LsZQ@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2012-07-17  5:12 UTC (permalink / raw)
  To: Hagen Paul Pfeifer
  Cc: Mark Gordon, David Miller, netdev, Yuchung Cheng, Andreas Terzis

On Tue, 2012-07-17 at 01:26 +0200, Hagen Paul Pfeifer wrote:

> As Eric said: there are problems in combination with a static delay. During
> rate extension development we tested the raw/vanilla "rate" functionality.
> The rate part works faultless[TM] - at least independet of any other
> "delay-latency generator".

Well, to get correct rate, I use the following unofficial patch :

(Or else, the rate is really wrong)



diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index c412ad0..2740a75 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -433,16 +433,8 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 			delay += packet_len_2_sched_time(skb->len, q);
 
-			if (!skb_queue_empty(list)) {
-				/*
-				 * Last packet in queue is reference point (now).
-				 * First packet in queue is already in flight,
-				 * calculate this time bonus and substract
-				 * from delay.
-				 */
-				delay -= now - netem_skb_cb(skb_peek(list))->time_to_send;
+			if (!skb_queue_empty(list))
 				now = netem_skb_cb(skb_peek_tail(list))->time_to_send;
-			}
 		}
 
 		cb->time_to_send = now + delay;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] netem: fix rate extension and drop accounting
       [not found]       ` <CAPVr9VMCYFO-7uEzO6ft2vpPhVvRgHB3EWJJG62OqGqux1LsZQ@mail.gmail.com>
@ 2012-07-17 17:39         ` Eric Dumazet
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Dumazet @ 2012-07-17 17:39 UTC (permalink / raw)
  To: Mark Gordon
  Cc: Hagen Paul Pfeifer, David Miller, netdev, Yuchung Cheng, Andreas Terzis

Please Mark :

1) Dont top post on netdev

2) Dont write HTML mails on netdev (your mail never went to netdev,
   only to CCed people). Only text mails are allowed.

On Tue, 2012-07-17 at 10:20 -0700, Mark Gordon wrote:
> Even the static delay case seems wrong with the new patch.  Assume all
> packets have the same sched_time.  Then if you spam packets that get
> processed at the same time by netem they will all get scheduled with
> the same time_to_send because the first packet will get time_to_send
> of [1] = clock_time + sched_time.  Then packet n compute 'now' as
> [n-1] and delay as sched_time - (clock_time - [1]) = 0 so that [n] =
> [n-1].  Therefore every packet gets scheduled at the same time.
> 
> 
> The above modification seems to fix the issue when latency/jitter is 0
> but suffers from a missing non-linearity when delay is present.  Is
> there a technical reason I'm missing that prevents us from doing rate
> and latency here?  Why wouldn't the 'official' patch have correct
> rate?

Because delay is variable (jitter)

netem as is is not working correctly if you have both a rate limit and
delay.

Hagen is working on a solution, but there is no easy fix.

The right solution is to have :

1) A rate stage, using a child qdisc (that you can graft to install your
own qdisc hierarchy if needed, say if you want codel or fq_codel ;))

  Thats basically a TBF...

2) skb orphan

3) drops/reorders/corrupt/additional delay (variable delay)
   using an internal tfifo, to mimic real networks behavior.

Thats the reverse of how its currently done.

Alternatively, this could be implemented as a special network device,
like bonding, instead of a qdisc.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-07-17 17:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-03  9:25 [PATCH] netem: fix rate extension and drop accounting Eric Dumazet
2012-07-03  9:54 ` Eric Dumazet
2012-07-03 22:04   ` Hagen Paul Pfeifer
2012-07-04  5:58     ` Eric Dumazet
2012-07-04 16:51       ` Hagen Paul Pfeifer
2012-07-04 17:23         ` Eric Dumazet
2012-07-04 17:30           ` Hagen Paul Pfeifer
     [not found] ` <CAPVr9VP7DniPZj4vZi_myJWfL5JLYKYTXXtrXcKHo9LjEQzjYw@mail.gmail.com>
2012-07-16 23:26   ` Hagen Paul Pfeifer
2012-07-17  5:12     ` Eric Dumazet
     [not found]       ` <CAPVr9VMCYFO-7uEzO6ft2vpPhVvRgHB3EWJJG62OqGqux1LsZQ@mail.gmail.com>
2012-07-17 17:39         ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.