* [RFC] fq_codel : interval servo on hosts
       [not found] <1346396137.2586.301.camel@edumazet-glaptop>
@ 2012-08-31 13:50 ` Eric Dumazet
  2012-08-31 13:57   ` [RFC v2] " Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-08-31 13:50 UTC
  To: codel; +Cc: Tomas Hruby, Nandita Dukkipati, netdev

On Thu, 2012-08-30 at 23:55 -0700, Eric Dumazet wrote:
> On locally generated TCP traffic (host), we can override the 100 ms
> interval value using the more accurate RTT estimation maintained by TCP
> stack (tp->srtt)
> 
> Datacenter workload benefits using shorter feedback (say if RTT is below
> 1 ms, we can react 100 times faster to a congestion)
> 
> Idea from Yuchung Cheng.
> 

Linux patch would be the following :

I'll do tests next week, but I am sending a raw patch right now if
anybody wants to try it.

Presumably we want to adjust target as well.

To get more precise srtt values in the datacenter, we might avoid the
'one jiffy slack' on small values in tcp_rtt_estimator(), where we force
m to be 1 before the scaling by 8 :

if (m == 0)
	m = 1;

We only need to force the least significant bit of srtt to be set.
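
A minimal user-space sketch of that idea, assuming srtt is kept in
jiffies scaled by 8 and that srtt == 0 is reserved for "no RTT sample
yet". The names srtt_update_slack() and srtt_update_lsb() are
hypothetical, and this is a simplified model, not the kernel's
tcp_rtt_estimator():

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model: srtt is the smoothed RTT in jiffies, scaled by 8.
 * srtt == 0 is reserved to mean "no RTT sample yet".
 */

/* Behaviour described above: a 0-jiffy sample is rounded up to 1 jiffy. */
static uint32_t srtt_update_slack(uint32_t srtt, long m)
{
	assert(m >= 0);
	if (m == 0)
		m = 1;
	if (srtt == 0)
		return (uint32_t)(m << 3);	/* first sample */
	m -= (long)(srtt >> 3);			/* error against the estimate */
	return (uint32_t)((long)srtt + m);	/* srtt = 7/8 srtt + 1/8 sample */
}

/* Hypothetical alternative: keep the 0 sample, only force srtt non-zero
 * by setting its least significant bit.
 */
static uint32_t srtt_update_lsb(uint32_t srtt, long m)
{
	assert(m >= 0);
	if (srtt == 0)
		return (uint32_t)(m << 3) | 1;
	m -= (long)(srtt >> 3);
	return (uint32_t)((long)srtt + m) | 1;
}
```

With a first sample of 0 jiffies, the first variant reports srtt = 8
(one full jiffy) while the second reports srtt = 1 (one eighth of a
jiffy), which is the finer granularity a sub-jiffy datacenter RTT
would need.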


 net/sched/sch_fq_codel.c |   23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 9fc1c62..7d2fe35 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -25,6 +25,7 @@
 #include <net/pkt_sched.h>
 #include <net/flow_keys.h>
 #include <net/codel.h>
+#include <linux/tcp.h>
 
 /*	Fair Queue CoDel.
  *
@@ -59,6 +60,7 @@ struct fq_codel_sched_data {
 	u32		perturbation;	/* hash perturbation */
 	u32		quantum;	/* psched_mtu(qdisc_dev(sch)); */
 	struct codel_params cparams;
+	codel_time_t	default_interval;
 	struct codel_stats cstats;
 	u32		drop_overlimit;
 	u32		new_flow_count;
@@ -211,6 +213,14 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	return NET_XMIT_SUCCESS;
 }
 
+/* Given TCP srtt evaluation, return codel interval.
+ * srtt is given in jiffies, scaled by 8.
+ */
+static codel_time_t tcp_srtt_to_codel(unsigned int srtt)
+{
+	return srtt * ((NSEC_PER_SEC >> (CODEL_SHIFT + 3)) / HZ);
+}
+
 /* This is the specific function called from codel_dequeue()
  * to dequeue a packet from queue. Note: backlog is handled in
  * codel, we dont need to reduce it here.
@@ -220,12 +230,21 @@ static struct sk_buff *dequeue(struct codel_vars *vars, struct Qdisc *sch)
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 	struct fq_codel_flow *flow;
 	struct sk_buff *skb = NULL;
+	struct sock *sk;
 
 	flow = container_of(vars, struct fq_codel_flow, cvars);
 	if (flow->head) {
 		skb = dequeue_head(flow);
 		q->backlogs[flow - q->flows] -= qdisc_pkt_len(skb);
 		sch->q.qlen--;
+		sk = skb->sk;
+		q->cparams.interval = q->default_interval;
+		if (sk && sk->sk_protocol == IPPROTO_TCP) {
+			u32 srtt = tcp_sk(sk)->srtt;
+
+			if (srtt)
+				q->cparams.interval = tcp_srtt_to_codel(srtt);
+		}
 	}
 	return skb;
 }
@@ -330,7 +349,7 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt)
 	if (tb[TCA_FQ_CODEL_INTERVAL]) {
 		u64 interval = nla_get_u32(tb[TCA_FQ_CODEL_INTERVAL]);
 
-		q->cparams.interval = (interval * NSEC_PER_USEC) >> CODEL_SHIFT;
+		q->default_interval = (interval * NSEC_PER_USEC) >> CODEL_SHIFT;
 	}
 
 	if (tb[TCA_FQ_CODEL_LIMIT])
@@ -441,7 +460,7 @@ static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb)
 	    nla_put_u32(skb, TCA_FQ_CODEL_LIMIT,
 			sch->limit) ||
 	    nla_put_u32(skb, TCA_FQ_CODEL_INTERVAL,
-			codel_time_to_us(q->cparams.interval)) ||
+			codel_time_to_us(q->default_interval)) ||
 	    nla_put_u32(skb, TCA_FQ_CODEL_ECN,
 			q->cparams.ecn) ||
 	    nla_put_u32(skb, TCA_FQ_CODEL_QUANTUM,
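
As a worked example of the conversion done by tcp_srtt_to_codel() in
the patch above, here is a user-space sketch. It assumes CODEL_SHIFT
is 10 (one codel time unit is 1024 ns) and takes HZ as a parameter;
srtt_to_codel() and codel_to_ns() are hypothetical helper names used
only for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000L
#define CODEL_SHIFT	10	/* assumption: one codel unit = 1024 ns */

/* srtt8: smoothed RTT in jiffies, scaled by 8; hz: ticks per second.
 * Same arithmetic as the patch: the extra ">> 3" undoes the scaling by 8.
 */
static uint32_t srtt_to_codel(uint32_t srtt8, long hz)
{
	assert(hz > 0);
	return srtt8 * (uint32_t)((NSEC_PER_SEC >> (CODEL_SHIFT + 3)) / hz);
}

/* Convert codel time units back to nanoseconds. */
static uint64_t codel_to_ns(uint32_t t)
{
	return (uint64_t)t << CODEL_SHIFT;
}
```

With hz = 1000, an srtt of 8 (one jiffy, 1 ms) maps to 976 codel
units, or 999424 ns. With hz = 100 the same srtt maps to 9760 units,
about 10 ms. The integer division in the constant loses less than
0.1% of the value.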


* [RFC v2] fq_codel : interval servo on hosts
  2012-08-31 13:50 ` [RFC] fq_codel : interval servo on hosts Eric Dumazet
@ 2012-08-31 13:57   ` Eric Dumazet
  2012-09-01  1:37     ` Yuchung Cheng
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-08-31 13:57 UTC
  To: codel; +Cc: Tomas Hruby, Nandita Dukkipati, netdev

On Fri, 2012-08-31 at 06:50 -0700, Eric Dumazet wrote:
> On Thu, 2012-08-30 at 23:55 -0700, Eric Dumazet wrote:
> > On locally generated TCP traffic (host), we can override the 100 ms
> > interval value using the more accurate RTT estimation maintained by TCP
> > stack (tp->srtt)
> > 
> > Datacenter workload benefits using shorter feedback (say if RTT is below
> > 1 ms, we can react 100 times faster to a congestion)
> > 
> > Idea from Yuchung Cheng.
> > 
> 
> Linux patch would be the following :
> 
> I'll do tests next week, but I am sending a raw patch right now if
> anybody wants to try it.
> 
> Presumably we also want to adjust target as well.
> 
> To get more precise srtt values in the datacenter, we might avoid the
> 'one jiffie slack' on small values in tcp_rtt_estimator(), as we force
> m to be 1 before the scaling by 8 :
> 
> if (m == 0)
> 	m = 1;
> 
> We only need to force the least significant bit of srtt to be set.
> 

Hmm, I also need to properly init default_interval after
codel_params_init(&q->cparams) :

 net/sched/sch_fq_codel.c |   24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 9fc1c62..f04ff6a 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -25,6 +25,7 @@
 #include <net/pkt_sched.h>
 #include <net/flow_keys.h>
 #include <net/codel.h>
+#include <linux/tcp.h>
 
 /*	Fair Queue CoDel.
  *
@@ -59,6 +60,7 @@ struct fq_codel_sched_data {
 	u32		perturbation;	/* hash perturbation */
 	u32		quantum;	/* psched_mtu(qdisc_dev(sch)); */
 	struct codel_params cparams;
+	codel_time_t	default_interval;
 	struct codel_stats cstats;
 	u32		drop_overlimit;
 	u32		new_flow_count;
@@ -211,6 +213,14 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	return NET_XMIT_SUCCESS;
 }
 
+/* Given TCP srtt evaluation, return codel interval.
+ * srtt is given in jiffies, scaled by 8.
+ */
+static codel_time_t tcp_srtt_to_codel(unsigned int srtt)
+{
+	return srtt * ((NSEC_PER_SEC >> (CODEL_SHIFT + 3)) / HZ);
+}
+
 /* This is the specific function called from codel_dequeue()
  * to dequeue a packet from queue. Note: backlog is handled in
  * codel, we dont need to reduce it here.
@@ -220,12 +230,21 @@ static struct sk_buff *dequeue(struct codel_vars *vars, struct Qdisc *sch)
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 	struct fq_codel_flow *flow;
 	struct sk_buff *skb = NULL;
+	struct sock *sk;
 
 	flow = container_of(vars, struct fq_codel_flow, cvars);
 	if (flow->head) {
 		skb = dequeue_head(flow);
 		q->backlogs[flow - q->flows] -= qdisc_pkt_len(skb);
 		sch->q.qlen--;
+		sk = skb->sk;
+		q->cparams.interval = q->default_interval;
+		if (sk && sk->sk_protocol == IPPROTO_TCP) {
+			u32 srtt = tcp_sk(sk)->srtt;
+
+			if (srtt)
+				q->cparams.interval = tcp_srtt_to_codel(srtt);
+		}
 	}
 	return skb;
 }
@@ -330,7 +349,7 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt)
 	if (tb[TCA_FQ_CODEL_INTERVAL]) {
 		u64 interval = nla_get_u32(tb[TCA_FQ_CODEL_INTERVAL]);
 
-		q->cparams.interval = (interval * NSEC_PER_USEC) >> CODEL_SHIFT;
+		q->default_interval = (interval * NSEC_PER_USEC) >> CODEL_SHIFT;
 	}
 
 	if (tb[TCA_FQ_CODEL_LIMIT])
@@ -395,6 +414,7 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
 	INIT_LIST_HEAD(&q->new_flows);
 	INIT_LIST_HEAD(&q->old_flows);
 	codel_params_init(&q->cparams);
+	q->default_interval = q->cparams.interval;
 	codel_stats_init(&q->cstats);
 	q->cparams.ecn = true;
 
@@ -441,7 +461,7 @@ static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb)
 	    nla_put_u32(skb, TCA_FQ_CODEL_LIMIT,
 			sch->limit) ||
 	    nla_put_u32(skb, TCA_FQ_CODEL_INTERVAL,
-			codel_time_to_us(q->cparams.interval)) ||
+			codel_time_to_us(q->default_interval)) ||
 	    nla_put_u32(skb, TCA_FQ_CODEL_ECN,
 			q->cparams.ecn) ||
 	    nla_put_u32(skb, TCA_FQ_CODEL_QUANTUM,


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-08-31 13:57   ` [RFC v2] " Eric Dumazet
@ 2012-09-01  1:37     ` Yuchung Cheng
  2012-09-01 12:51       ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Yuchung Cheng @ 2012-09-01  1:37 UTC
  To: Eric Dumazet; +Cc: Tomas Hruby, Nandita Dukkipati, netdev, codel

On Fri, Aug 31, 2012 at 6:57 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-08-31 at 06:50 -0700, Eric Dumazet wrote:
>> On Thu, 2012-08-30 at 23:55 -0700, Eric Dumazet wrote:
>> > On locally generated TCP traffic (host), we can override the 100 ms
>> > interval value using the more accurate RTT estimation maintained by TCP
>> > stack (tp->srtt)
>> >
>> > Datacenter workload benefits using shorter feedback (say if RTT is below
>> > 1 ms, we can react 100 times faster to a congestion)
>> >
>> > Idea from Yuchung Cheng.
>> >
>>
>> Linux patch would be the following :
>>
>> I'll do tests next week, but I am sending a raw patch right now if
>> anybody wants to try it.
>>
>> Presumably we also want to adjust target as well.
>>
>> To get more precise srtt values in the datacenter, we might avoid the
>> 'one jiffie slack' on small values in tcp_rtt_estimator(), as we force
>> m to be 1 before the scaling by 8 :
>>
>> if (m == 0)
>>       m = 1;
>>
>> We only need to force the least significant bit of srtt to be set.
>>
Just curious: tp->srtt is a very rough estimator, e.g., delayed ACKs
can easily add 40 - 200 ms of fuzziness. Will this affect short flows?




* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-01  1:37     ` Yuchung Cheng
@ 2012-09-01 12:51       ` Eric Dumazet
  2012-09-04 15:10         ` Nandita Dukkipati
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-09-01 12:51 UTC
  To: Yuchung Cheng; +Cc: Tomas Hruby, Nandita Dukkipati, netdev, codel

On Fri, 2012-08-31 at 18:37 -0700, Yuchung Cheng wrote:

> Just curious: tp->srtt is a very rough estimator, e.g., Delayed-ACks
> can easily add 40 - 200ms fuzziness. Will this affect short flows?

Good point

Delayed ACKs shouldn't matter, because they happen when the flow has
been idle for a while.

I guess we should clamp the srtt to the default interval

if (srtt)
	q->cparams.interval = min(tcp_srtt_to_codel(srtt),
				  q->default_interval);
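
The clamp suggested above can be sketched in user space as follows.
This is only an illustration under stated assumptions: pick_interval()
is a hypothetical helper, codel_time_t is modelled as a 32-bit value,
and a srtt-derived interval of 0 stands for "no TCP socket or no RTT
sample yet":

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t codel_time_t;	/* assumption: 32-bit codel time units */

static codel_time_t min_time(codel_time_t a, codel_time_t b)
{
	return a < b ? a : b;
}

/* srtt_interval: interval derived from the flow's srtt, or 0 when the
 * packet has no TCP socket or no RTT sample yet.
 * The result never exceeds the configured default interval.
 */
static codel_time_t pick_interval(codel_time_t srtt_interval,
				  codel_time_t default_interval)
{
	assert(default_interval > 0);
	if (!srtt_interval)
		return default_interval;
	return min_time(srtt_interval, default_interval);
}
```

With a 100 ms default (97656 units of 1024 ns), a 1 ms srtt shortens
the interval to 976 units, while a srtt inflated to 200 ms leaves the
interval at the default.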


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-01 12:51       ` Eric Dumazet
@ 2012-09-04 15:10         ` Nandita Dukkipati
  2012-09-04 15:25           ` Jonathan Morton
  2012-09-04 15:34           ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: Nandita Dukkipati @ 2012-09-04 15:10 UTC
  To: Eric Dumazet; +Cc: Tomas Hruby, netdev, codel

The idea of using srtt as interval makes sense to me if alongside we
also hash flows with similar RTTs into same bucket. But with just the
change in interval, I am not sure how codel is expected to behave.

My understanding is: the interval (usually set to worst case expected
RTT) is used to measure the standing queue or the "bad" queue. Suppose
1ms and 100ms RTT flows get hashed to same bucket, then the interval
with this patch will flip flop between 1ms and 100ms. How is this
expected to measure a standing queue? In fact I think the 1ms flow may
end up measuring the burstiness or the "good" queue created by the
long RTT flows, and this isn't desirable.
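
The flip-flop described above can be made concrete with a small
user-space sketch. It assumes the interval is overwritten on every
dequeue with the value derived from that packet's flow;
interval_flips() is a hypothetical helper, not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how often the shared interval changes while dequeuing n packets
 * from one bucket. pkt_interval[i] is the interval derived from the srtt
 * of the flow that owns packet i, in dequeue order.
 */
static size_t interval_flips(const uint32_t *pkt_interval, size_t n)
{
	size_t flips = 0;
	uint32_t shared;
	size_t i;

	assert(pkt_interval != NULL || n == 0);
	if (n == 0)
		return 0;

	shared = pkt_interval[0];
	for (i = 1; i < n; i++) {
		if (pkt_interval[i] != shared)
			flips++;
		shared = pkt_interval[i];	/* overwritten on every dequeue */
	}
	return flips;
}
```

For a bucket whose packets alternate between a 1 ms flow (976 units)
and a 100 ms flow (97656 units), the interval changes on every single
dequeue, so no measurement window stays stable for a full interval.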


On Sat, Sep 1, 2012 at 5:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-08-31 at 18:37 -0700, Yuchung Cheng wrote:
>
>> Just curious: tp->srtt is a very rough estimator, e.g., Delayed-ACks
>> can easily add 40 - 200ms fuzziness. Will this affect short flows?
>
> Good point
>
> Delayed acks shouldnt matter, because they happen when flow had been
> idle for a while.
>
> I guess we should clamp the srtt to the default interval
>
> if (srtt)
>         q->cparams.interval = min(tcp_srtt_to_codel(srtt),
>                                   q->default_interval);
>
>
>


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-04 15:10         ` Nandita Dukkipati
@ 2012-09-04 15:25           ` Jonathan Morton
  2012-09-04 15:39             ` Eric Dumazet
  2012-09-04 15:34           ` Eric Dumazet
  1 sibling, 1 reply; 11+ messages in thread
From: Jonathan Morton @ 2012-09-04 15:25 UTC
  To: Nandita Dukkipati; +Cc: netdev, codel, Tomas Hruby

I think that in most cases, a long RTT flow and a short RTT flow on the same interface means that the long RTT flow isn't bottlenecked here, and therefore won't ever build up a significant queue - and that means you would want to track over the shorter interval. Is that a reasonable assumption?

The key to knowledge is not to rely on others to teach you it. 

On 4 Sep 2012, at 18:10, Nandita Dukkipati <nanditad@google.com> wrote:

> The idea of using srtt as interval makes sense to me if alongside we
> also hash flows with similar RTTs into same bucket. But with just the
> change in interval, I am not sure how codel is expected to behave.
> 
> My understanding is: the interval (usually set to worst case expected
> RTT) is used to measure the standing queue or the "bad" queue. Suppose
> 1ms and 100ms RTT flows get hashed to same bucket, then the interval
> with this patch will flip flop between 1ms and 100ms. How is this
> expected to measure a standing queue? In fact I think the 1ms flow may
> land up measuring the burstiness or the "good" queue created by the
> long RTT flows, and this isn't desirable.
> 
> 
> On Sat, Sep 1, 2012 at 5:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Fri, 2012-08-31 at 18:37 -0700, Yuchung Cheng wrote:
>> 
>>> Just curious: tp->srtt is a very rough estimator, e.g., Delayed-ACks
>>> can easily add 40 - 200ms fuzziness. Will this affect short flows?
>> 
>> Good point
>> 
>> Delayed acks shouldnt matter, because they happen when flow had been
>> idle for a while.
>> 
>> I guess we should clamp the srtt to the default interval
>> 
>> if (srtt)
>>        q->cparams.interval = min(tcp_srtt_to_codel(srtt),
>>                                  q->default_interval);
>> 
>> 
>> 


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-04 15:10         ` Nandita Dukkipati
  2012-09-04 15:25           ` Jonathan Morton
@ 2012-09-04 15:34           ` Eric Dumazet
  2012-09-04 16:40             ` Dave Taht
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-09-04 15:34 UTC
  To: Nandita Dukkipati; +Cc: Tomas Hruby, netdev, codel

On Tue, 2012-09-04 at 08:10 -0700, Nandita Dukkipati wrote:
> The idea of using srtt as interval makes sense to me if alongside we
> also hash flows with similar RTTs into same bucket. But with just the
> change in interval, I am not sure how codel is expected to behave.
> 
> My understanding is: the interval (usually set to worst case expected
> RTT) is used to measure the standing queue or the "bad" queue. Suppose
> 1ms and 100ms RTT flows get hashed to same bucket, then the interval
> with this patch will flip flop between 1ms and 100ms. How is this
> expected to measure a standing queue? In fact I think the 1ms flow may
> land up measuring the burstiness or the "good" queue created by the
> long RTT flows, and this isn't desirable.
> 

Well, how do things settle with pure codel when mixing flows of very
different RTTs, then ?

There seems to be strong resistance to the SFQ/fq_codel model because of
the probability of flows sharing a bucket.

So what about removing the stochastic part and switching to a hash with
collision resolution ?


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-04 15:25           ` Jonathan Morton
@ 2012-09-04 15:39             ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-09-04 15:39 UTC
  To: Jonathan Morton; +Cc: Nandita Dukkipati, netdev, codel, Tomas Hruby

On Tue, 2012-09-04 at 18:25 +0300, Jonathan Morton wrote:
> I think that in most cases, a long RTT flow and a short RTT flow on
> the same interface means that the long RTT flow isn't bottlenecked
> here, and therefore won't ever build up a significant queue - and that
> means you would want to track over the shorter interval. Is that a
> reasonable assumption?
> 

This would be reasonable, but if we have a shorter interval, this means
we could drop packets of the long RTT flow sooner than expected.

That's because the drop_next value is set up on the previous packet, and
not based on the 'next packet'.

Re-evaluating drop_next at the right time would need more cpu cycles.
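
A sketch of why this matters: CoDel schedules the next drop at
interval / sqrt(count) after the current time, so the interval in
effect when drop_next is computed decides when the following packet
can be dropped. The code below is a user-space illustration that uses
an exact integer square root instead of the kernel's reciprocal
approximation; isqrt64() and drop_next_sketch() are hypothetical
names:

```c
#include <assert.h>
#include <stdint.h>

/* Integer square root (floor), bit-by-bit method. */
static uint64_t isqrt64(uint64_t x)
{
	uint64_t r = 0, bit = (uint64_t)1 << 62;

	while (bit > x)
		bit >>= 2;
	while (bit) {
		if (x >= r + bit) {
			x -= r + bit;
			r = (r >> 1) + bit;
		} else {
			r >>= 1;
		}
		bit >>= 2;
	}
	return r;
}

/* Next drop time: t + interval / sqrt(count), in codel time units.
 * interval / sqrt(count) is computed as sqrt(interval^2 / count).
 */
static uint32_t drop_next_sketch(uint32_t t, uint32_t interval, uint32_t count)
{
	assert(count > 0);
	return t + (uint32_t)isqrt64((uint64_t)interval * interval / count);
}
```

If drop_next is set while a 1 ms flow's packet is handled (interval of
976 units), the next drop becomes possible 976 units later, even when
the next packet in the bucket belongs to a 100 ms flow that would have
been given 97656 units.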


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-04 15:34           ` Eric Dumazet
@ 2012-09-04 16:40             ` Dave Taht
  2012-09-04 16:54               ` Eric Dumazet
  2012-09-04 16:57               ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: Dave Taht @ 2012-09-04 16:40 UTC
  To: Eric Dumazet
  Cc: Nandita Dukkipati, Yuchung Cheng, codel, Kathleen Nichols,
	netdev, Tomas Hruby

On Tue, Sep 4, 2012 at 8:34 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2012-09-04 at 08:10 -0700, Nandita Dukkipati wrote:
>> The idea of using srtt as interval makes sense to me if alongside we
>> also hash flows with similar RTTs into same bucket. But with just the
>> change in interval, I am not sure how codel is expected to behave.
>>
>> My understanding is: the interval (usually set to worst case expected
>> RTT) is used to measure the standing queue or the "bad" queue. Suppose
>> 1ms and 100ms RTT flows get hashed to same bucket, then the interval
>> with this patch will flip flop between 1ms and 100ms. How is this
>> expected to measure a standing queue? In fact I think the 1ms flow may
>> land up measuring the burstiness or the "good" queue created by the
>> long RTT flows, and this isn't desirable.

Experiments would be good.

>
> Well, how things settle with a pure codel, mixing flows of very
> different RTT then ?

Elephants are shot statistically more often than mice.

> It seems there is a high resistance on SFQ/fq_codel model because of the
> probabilities of flows sharing a bucket.

I was going to do this in a separate email, because it is a little off-topic.

fq_codel has a standing queue problem, based on the fact that when a
queue empties, codel.h resets. This made sense for the single FIFO
codel but not multi-queued fq_codel. So after we hit X high rate
flows, target can never be achieved, even straining mightily, and we
end up with a standing queue again.

Easily seen with something like 150 bidirectional flows at 10 or 100Mbit.

(As queues go, it's still a pretty good queue. And: I've fiddled with
various means of draining multi-queue behavior thus far, and they
ended up unstable/unfair.)

> So what about removing the stochastic thing and switch to a hash with
> collision resolution ?

Was considered and discarded in the original SFQ paper as being too
computationally intensive (in 1993). Worth revisiting.

http://www2.rdrop.com/~paulmck/scalability/paper/sfq.2002.06.04.pdf

>
>



-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
with fq_codel!"


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-04 16:40             ` Dave Taht
@ 2012-09-04 16:54               ` Eric Dumazet
  2012-09-04 16:57               ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-09-04 16:54 UTC
  To: Dave Taht; +Cc: Tomas Hruby, Nandita Dukkipati, netdev, codel

On Tue, 2012-09-04 at 09:40 -0700, Dave Taht wrote:

> >
> > Well, how things settle with a pure codel, mixing flows of very
> > different RTT then ?
> 
> Elephants are shot statistically more often than mice.

This doesn't answer the question.

Long/short RTT has nothing to do with elephants and mice.


* Re: [RFC v2] fq_codel : interval servo on hosts
  2012-09-04 16:40             ` Dave Taht
  2012-09-04 16:54               ` Eric Dumazet
@ 2012-09-04 16:57               ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-09-04 16:57 UTC
  To: Dave Taht; +Cc: Tomas Hruby, Nandita Dukkipati, netdev, codel

On Tue, 2012-09-04 at 09:40 -0700, Dave Taht wrote:

> fq_codel has a standing queue problem, based on the fact that when a
> queue empties, codel.h resets. This made sense for the single FIFO
> codel but not multi-queued fq_codel. So after we hit X high rate
> flows, target can never be achieved, even straining mightily, and we
> end up with a standing queue again.
> 
> Easily seen with like 150 bidirectional flows at 10 or 100Mbit.
> 
> (as queues go, it's still pretty good queue. And: I've fiddled with
> various means of draining multi-queue behavior thus far, and they
> ended up unstable/unfair)

No idea what you mean by "codel.h resets".

Please keep mails short, one idea per mail.


Thread overview: 11+ messages
     [not found] <1346396137.2586.301.camel@edumazet-glaptop>
2012-08-31 13:50 ` [RFC] fq_codel : interval servo on hosts Eric Dumazet
2012-08-31 13:57   ` [RFC v2] " Eric Dumazet
2012-09-01  1:37     ` Yuchung Cheng
2012-09-01 12:51       ` Eric Dumazet
2012-09-04 15:10         ` Nandita Dukkipati
2012-09-04 15:25           ` Jonathan Morton
2012-09-04 15:39             ` Eric Dumazet
2012-09-04 15:34           ` Eric Dumazet
2012-09-04 16:40             ` Dave Taht
2012-09-04 16:54               ` Eric Dumazet
2012-09-04 16:57               ` Eric Dumazet
