All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] netem: refine early skb orphaning
@ 2012-07-14 13:16 Eric Dumazet
  2012-07-14 21:53 ` Stephen Hemminger
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Dumazet @ 2012-07-14 13:16 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Hagen Paul Pfeifer, Mark Gordon, Andreas Terzis, Yuchung Cheng

From: Eric Dumazet <edumazet@google.com>

netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
or any mechanism relying on socket sk_wmem_alloc feedback.

Ideally, we should perform this orphaning after the rate module and
before the delay module, to mimic what happens on a real link :

skb orphaning is indeed normally done at TX completion, before the
transit on the link.

+-------+   +--------+  +---------------+  +-----------------+
+ Qdisc +---> Device +--> TX completion +--> links / hops    +->
+       +   +  xmit  +  + skb orphaning +  + propagation     +
+-------+   +--------+  +---------------+  +-----------------+
      < rate limiting >                  < delay, drops, reorders >

If netem is used without delay feature (drops, reorders, rate
limiting), then we should avoid early skb orphaning, to keep pressure
on sockets as long as packets are still in qdisc queue.

Ideally, netem should be refactored to implement delay module
as the last stage. Current algorithm merges the two phases
(rate limiting + delay) so its not correct.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Mark Gordon <msg@google.com>
Cc: Andreas Terzis <aterzis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/sched/sch_netem.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index c412ad0..298c0dd 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -380,7 +380,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
 	}
 
-	skb_orphan(skb);
+	/* If a delay is expected, orphan the skb. (orphaning usually takes
+	 * place at TX completion time, so _before_ the link transit delay)
+	 * Ideally, this orphaning should be done after the rate limiting
+	 * module, because this breaks TCP Small Queue, and other mechanisms
+	 * based on socket sk_wmem_alloc.
+	 */
+	if (q->latency || q->jitter)
+		skb_orphan(skb);
 
 	/*
 	 * If we need to duplicate packet, then re-insert at top of the

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next] netem: refine early skb orphaning
  2012-07-14 13:16 [PATCH net-next] netem: refine early skb orphaning Eric Dumazet
@ 2012-07-14 21:53 ` Stephen Hemminger
  2012-07-17  6:08   ` David Miller
  0 siblings, 1 reply; 3+ messages in thread
From: Stephen Hemminger @ 2012-07-14 21:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Hagen Paul Pfeifer, Mark Gordon,
	Andreas Terzis, Yuchung Cheng

On Sat, 14 Jul 2012 15:16:27 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> From: Eric Dumazet <edumazet@google.com>
> 
> netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
> or any mechanism relying on socket sk_wmem_alloc feedback.
> 
> Ideally, we should perform this orphaning after the rate module and
> before the delay module, to mimic what happens on a real link :
> 
> skb orphaning is indeed normally done at TX completion, before the
> transit on the link.
> 
> +-------+   +--------+  +---------------+  +-----------------+
> + Qdisc +---> Device +--> TX completion +--> links / hops    +->
> +       +   +  xmit  +  + skb orphaning +  + propagation     +
> +-------+   +--------+  +---------------+  +-----------------+
>       < rate limiting >                  < delay, drops, reorders >
> 
> If netem is used without delay feature (drops, reorders, rate
> limiting), then we should avoid early skb orphaning, to keep pressure
> on sockets as long as packets are still in qdisc queue.
> 
> Ideally, netem should be refactored to implement delay module
> as the last stage. Current algorithm merges the two phases
> (rate limiting + delay) so its not correct.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Hagen Paul Pfeifer <hagen@jauu.net>
> Cc: Mark Gordon <msg@google.com>
> Cc: Andreas Terzis <aterzis@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> ---
>  net/sched/sch_netem.c |    9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index c412ad0..298c0dd 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -380,7 +380,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  		return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
>  	}
>  
> -	skb_orphan(skb);
> +	/* If a delay is expected, orphan the skb. (orphaning usually takes
> +	 * place at TX completion time, so _before_ the link transit delay)
> +	 * Ideally, this orphaning should be done after the rate limiting
> +	 * module, because this breaks TCP Small Queue, and other mechanisms
> +	 * based on socket sk_wmem_alloc.
> +	 */
> +	if (q->latency || q->jitter)
> +		skb_orphan(skb);
>  
>  	/*
>  	 * If we need to duplicate packet, then re-insert at top of the
> 

Acked-by: Stephen Hemminger <shemminger@vyatta.com>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next] netem: refine early skb orphaning
  2012-07-14 21:53 ` Stephen Hemminger
@ 2012-07-17  6:08   ` David Miller
  0 siblings, 0 replies; 3+ messages in thread
From: David Miller @ 2012-07-17  6:08 UTC (permalink / raw)
  To: shemminger; +Cc: eric.dumazet, netdev, hagen, msg, aterzis, ycheng

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Sat, 14 Jul 2012 14:53:33 -0700

> On Sat, 14 Jul 2012 15:16:27 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> From: Eric Dumazet <edumazet@google.com>
>> 
>> netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
>> or any mechanism relying on socket sk_wmem_alloc feedback.
>> 
>> Ideally, we should perform this orphaning after the rate module and
>> before the delay module, to mimic what happens on a real link :
>> 
>> skb orphaning is indeed normally done at TX completion, before the
>> transit on the link.
>> 
>> +-------+   +--------+  +---------------+  +-----------------+
>> + Qdisc +---> Device +--> TX completion +--> links / hops    +->
>> +       +   +  xmit  +  + skb orphaning +  + propagation     +
>> +-------+   +--------+  +---------------+  +-----------------+
>>       < rate limiting >                  < delay, drops, reorders >
>> 
>> If netem is used without delay feature (drops, reorders, rate
>> limiting), then we should avoid early skb orphaning, to keep pressure
>> on sockets as long as packets are still in qdisc queue.
>> 
>> Ideally, netem should be refactored to implement delay module
>> as the last stage. Current algorithm merges the two phases
>> (rate limiting + delay) so its not correct.
>> 
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
 ...
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-07-17  6:08 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-14 13:16 [PATCH net-next] netem: refine early skb orphaning Eric Dumazet
2012-07-14 21:53 ` Stephen Hemminger
2012-07-17  6:08   ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.