* [PATCH net-next] fq_codel: Fair Queue Codel AQM
@ 2012-05-11 13:59 Eric Dumazet
  2012-05-11 15:03 ` Changli Gao
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2012-05-11 13:59 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Dave Taht, Kathleen Nichols, Van Jacobson, Tom Herbert,
	Matt Mathis, Yuchung Cheng, Stephen Hemminger

From: Eric Dumazet <edumazet@google.com>

Fair Queue Codel implementation.

Principles :

- Packets are classified (internal classifier or external) into flows.
- This is a Stochastic model (as we use a hash, several flows might
                              be hashed to the same slot)
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
  so that new flows have priority over old ones.

- For a given flow, packets are not reordered (CoDel uses a FIFO)
- Head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)

tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                      [ target TIME ] [ interval TIME ] [ noecn ]

defaults : 1024 flows, 10240 packets limit
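
To illustrate the stochastic classification listed above, here is a minimal
user-space sketch of how a 32-bit hash can be reduced to a slot index. It
mirrors the multiply-and-shift at the end of fq_codel_hash() in the patch
below; the function name slot_from_hash and the use of stdint types are
assumptions made for this example, and the hash value is simply passed in
rather than computed with jhash_3words().

```c
#include <assert.h>
#include <stdint.h>

/* Map a 32-bit hash onto [0, flows_cnt) without a modulo.
 * (hash * flows_cnt) is a 64-bit product; its upper 32 bits are
 * floor(hash * flows_cnt / 2^32), which is always < flows_cnt.
 * Hypothetical helper for illustration, not part of the patch.
 */
static uint32_t slot_from_hash(uint32_t hash, uint32_t flows_cnt)
{
	assert(flows_cnt != 0);
	return (uint32_t)(((uint64_t)hash * flows_cnt) >> 32);
}
```

With the default of 1024 flows, two different flows whose hashes land in the
same 1/1024th of the hash space share a slot, which is what makes the model
stochastic rather than exact.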

Impressive results under load :

# tc -s -d class show dev eth9

class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0 
 Sent 1267974946 bytes 837585 pkt (dropped 0, overlimits 0 requeues 0) 
 rate 202298Kbit 16702pps backlog 0b 103p requeues 0 
 lended: 837482 borrowed: 0 giants: 0
 tokens: -912 ctokens: -912

class fq_codel 10:a7 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 18168b 12p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.0ms
class fq_codel 10:10b parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 16654b 11p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 6.4ms
class fq_codel 10:146 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 13626b 9p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 5.2ms
class fq_codel 10:1c0 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 12112b 8p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 2.8ms
class fq_codel 10:2ba parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 13626b 9p requeues 0 
  deficit 926 count 1 lastcount 1 ldelay 5.2ms
class fq_codel 10:31d parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 16654b 11p requeues 0 
  deficit 0 count 1 lastcount 1 ldelay 6.4ms
class fq_codel 10:32c parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 15140b 10p requeues 0 
  deficit 80 count 1 lastcount 1 ldelay 6.4ms
class fq_codel 10:342 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 16654b 11p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 6.4ms
class fq_codel 10:3ab parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 18168b 12p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.0ms
class fq_codel 10:3c2 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 15140b 10p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 6.4ms

# tc -s -d qdisc show dev eth9

qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 1267878050 bytes 837521 pkt (dropped 0, overlimits 1666567 requeues 1) 
 rate 202305Kbit 16703pps backlog 0b 104p requeues 1 
qdisc fq_codel 10: parent 1:1 limit 10240p target 5.0ms interval 100.0ms ecn 
 Sent 1267878050 bytes 837521 pkt (dropped 0, overlimits 0 requeues 0) 
 rate 202305Kbit 16703pps backlog 157456b 104p requeues 0 
  maxpacket 1514 drop_overlimit 0 new_flow_count 87 ecn_mark 4071
  new_flows_len 0 old_flows_len 10

# ping -c 10 172.30.42.18
PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms

--- 172.30.42.18 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8999ms
rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms

Much better than SFQ because of priority given to new flows

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
---
 include/linux/pkt_sched.h |   45 ++
 net/sched/Kconfig         |   11 
 net/sched/Makefile        |    1 
 net/sched/sch_fq_codel.c  |  595 ++++++++++++++++++++++++++++++++++++
 4 files changed, 652 insertions(+)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index cde56c2..3ffdaec 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -681,4 +681,49 @@ struct tc_codel_xstats {
 	__u32	dropping;  /* are we in dropping state ? */
 };
 
+/* FQ_CODEL */
+
+enum {
+	TCA_FQ_CODEL_UNSPEC,
+	TCA_FQ_CODEL_TARGET,
+	TCA_FQ_CODEL_LIMIT,
+	TCA_FQ_CODEL_INTERVAL,
+	TCA_FQ_CODEL_ECN,
+	TCA_FQ_CODEL_FLOWS,
+	__TCA_FQ_CODEL_MAX
+};
+
+#define TCA_FQ_CODEL_MAX	(__TCA_FQ_CODEL_MAX - 1)
+
+enum {
+	TCA_FQ_CODEL_XSTATS_QDISC,
+	TCA_FQ_CODEL_XSTATS_CLASS,
+};
+
+struct tc_fq_codel_xstats {
+	__u32	type;
+	union {
+		struct {
+			__u32	maxpacket; /* largest packet we've seen so far */
+			__u32	drop_overlimit; /* number of times max qdisc packet limit was hit */
+			__u32	ecn_mark;  /* number of packets we ECN marked
+					    * instead of dropped
+					    */
+			__u32	new_flow_count; /* number of times packets created a 'new flow' */
+			__u32	new_flows_len;	/* count of flows in new list */
+			__u32	old_flows_len;	/* count of flows in old list */ 
+		} qdisc_stats;
+		struct {
+			__s32	deficit;
+			__u32	ldelay; /* in-queue delay seen by most recently
+					 * dequeued packet
+					 */
+			__u32	count;
+			__u32	lastcount;
+			__u32	dropping;
+			__s32	drop_next;
+		} class_stats;
+	};
+};
+
 #endif
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index fadd252..e7a8976 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -261,6 +261,17 @@ config NET_SCH_CODEL
 
 	  If unsure, say N.
 
+config NET_SCH_FQ_CODEL
+	tristate "Fair Queue Controlled Delay AQM (FQ_CODEL)"
+	help
+	  Say Y here if you want to use the FQ Controlled Delay (FQ_CODEL)
+	  packet scheduling algorithm.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called sch_fq_codel.
+
+	  If unsure, say N.
+
 config NET_SCH_INGRESS
 	tristate "Ingress Qdisc"
 	depends on NET_CLS_ACT
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 30fab03..5940a19 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_NET_SCH_MQPRIO)	+= sch_mqprio.o
 obj-$(CONFIG_NET_SCH_CHOKE)	+= sch_choke.o
 obj-$(CONFIG_NET_SCH_QFQ)	+= sch_qfq.o
 obj-$(CONFIG_NET_SCH_CODEL)	+= sch_codel.o
+obj-$(CONFIG_NET_SCH_FQ_CODEL)	+= sch_fq_codel.o
 
 obj-$(CONFIG_NET_CLS_U32)	+= cls_u32.o
 obj-$(CONFIG_NET_CLS_ROUTE4)	+= cls_route.o
diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
new file mode 100644
index 0000000..8675ff8
--- /dev/null
+++ b/net/sched/sch_fq_codel.c
@@ -0,0 +1,595 @@
+/*
+ * Fair Queue CoDel discipline
+ *
+ *	This program is free software; you can redistribute it and/or
+ *	modify it under the terms of the GNU General Public License
+ *	as published by the Free Software Foundation; either version
+ *	2 of the License, or (at your option) any later version.
+ *
+ *  Copyright (C) 2012 Eric Dumazet <edumazet@google.com>
+ */
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/jiffies.h>
+#include <linux/string.h>
+#include <linux/in.h>
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/skbuff.h>
+#include <linux/jhash.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <net/flow_keys.h>
+#include <net/codel.h>
+
+/*	Fair Queue CoDel.
+ *
+ * Principles :
+ * Packets are classified (internal classifier or external) on flows.
+ * This is a Stochastic model (as we use a hash, several flows
+ *			       might be hashed on same slot)
+ * Each flow has a CoDel managed queue.
+ * Flows are linked onto two (Round Robin) lists,
+ * so that new flows have priority on old ones.
+ *
+ * For a given flow, packets are not reordered (CoDel uses a FIFO)
+ * head drops only.
+ * ECN capability is on by default.
+ * Low memory footprint (64 bytes per flow)
+ */
+
+struct fq_codel_flow {
+	struct sk_buff	  *head;
+	struct sk_buff	  *tail;
+	struct list_head  flowchain;
+	int		  deficit;
+	struct codel_vars cvars;
+};
+
+struct fq_codel_sched_data {
+	struct tcf_proto *filter_list;	/* external classifier */
+	struct fq_codel_flow *flows;	/* Flows table [flows_cnt] */
+	u32		*backlogs;	/* backlog table [flows_cnt] */
+	u32		flows_cnt;	/* number of flows */
+	u32		perturbation;	/* hash perturbation */
+	u32		quantum;	/* psched_mtu(qdisc_dev(sch)); */
+	struct codel_params cparams;
+	struct codel_stats cstats;
+	u32		drop_overlimit;
+	u32		new_flow_count;
+
+	struct list_head new_flows;	/* list of new flows */
+	struct list_head old_flows;	/* list of old flows */
+};
+
+static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
+				  const struct sk_buff *skb)
+{
+	struct flow_keys keys;
+	unsigned int hash;
+
+	skb_flow_dissect(skb, &keys);
+	hash = jhash_3words((__force u32)keys.dst,
+			    (__force u32)keys.src ^ keys.ip_proto,
+			    (__force u32)keys.ports, q->perturbation);
+	return ((u64)hash * q->flows_cnt) >> 32;
+}
+
+static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch,
+				      int *qerr)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct tcf_result res;
+	int result;
+
+	if (TC_H_MAJ(skb->priority) == sch->handle &&
+	    TC_H_MIN(skb->priority) > 0 &&
+	    TC_H_MIN(skb->priority) <= q->flows_cnt)
+		return TC_H_MIN(skb->priority);
+
+	if (!q->filter_list)
+		return fq_codel_hash(q, skb) + 1;
+
+	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
+	result = tc_classify(skb, q->filter_list, &res);
+	if (result >= 0) {
+#ifdef CONFIG_NET_CLS_ACT
+		switch (result) {
+		case TC_ACT_STOLEN:
+		case TC_ACT_QUEUED:
+			*qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN;
+		case TC_ACT_SHOT:
+			return 0;
+		}
+#endif
+		if (TC_H_MIN(res.classid) <= q->flows_cnt)
+			return TC_H_MIN(res.classid);
+	}
+	return 0;
+}
+
+/* helper functions : might be changed when/if skb use a standard list_head */
+
+/* remove one skb from head of slot queue */
+static inline struct sk_buff *dequeue_head(struct fq_codel_flow *flow)
+{
+	struct sk_buff *skb = flow->head;
+
+	flow->head = skb->next;
+	skb->next = NULL;
+	return skb;
+}
+
+/* add skb to flow queue (tail add) */
+static inline void flow_queue_add(struct fq_codel_flow *flow,
+				  struct sk_buff *skb)
+{
+	if (flow->head == NULL)
+		flow->head = skb;
+	else
+		flow->tail->next = skb;
+	flow->tail = skb;
+	skb->next = NULL;
+}
+
+static unsigned int fq_codel_drop(struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct sk_buff *skb;
+	unsigned int maxbacklog = 0, idx = 0, i, len;
+	struct fq_codel_flow *flow;
+
+	/* Queue is full! Find the fat flow and drop packet from it.
+	 * This might sound expensive, but with 1024 flows, we scan
+	 * 4KB of memory, and we don't need to handle a complex tree
+	 * in fast path (packet queue/enqueue) with many cache misses.
+	 */
+	for (i = 0; i < q->flows_cnt; i++) {
+		if (q->backlogs[i] > maxbacklog) {
+			maxbacklog = q->backlogs[i];
+			idx = i;
+		}
+	}
+	flow = &q->flows[idx];
+	skb = dequeue_head(flow);
+	len = qdisc_pkt_len(skb);
+	q->backlogs[idx] -= len;
+	kfree_skb(skb);
+	sch->q.qlen--;
+	sch->qstats.drops++;
+	sch->qstats.backlog -= len;
+	return idx;
+}
+
+static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	unsigned int idx;
+	struct fq_codel_flow *flow;
+	int uninitialized_var(ret);
+
+	idx = fq_codel_classify(skb, sch, &ret);
+	if (idx == 0) {
+		if (ret & __NET_XMIT_BYPASS)
+			sch->qstats.drops++;
+		kfree_skb(skb);
+		return ret;
+	}
+	idx--;
+
+	codel_set_enqueue_time(skb);
+	flow = &q->flows[idx];
+	flow_queue_add(flow, skb);
+	q->backlogs[idx] += qdisc_pkt_len(skb);
+	sch->qstats.backlog += qdisc_pkt_len(skb);
+
+	if (list_empty(&flow->flowchain)) {
+		list_add_tail(&flow->flowchain, &q->new_flows);
+		codel_vars_init(&flow->cvars);
+		q->new_flow_count++;
+		flow->deficit = q->quantum;
+	}
+	if (++sch->q.qlen < sch->limit)
+		return NET_XMIT_SUCCESS;
+
+	q->drop_overlimit++;
+	/* Return Congestion Notification only if we dropped a packet
+	 * from this flow.
+	 */
+	if (fq_codel_drop(sch) == idx)
+		return NET_XMIT_CN;
+
+	/* As we dropped a packet, better let upper stack know this */
+	qdisc_tree_decrease_qlen(sch, 1);
+	return NET_XMIT_SUCCESS;
+}
+
+/* This is the specific function called from codel_dequeue()
+ * to dequeue a packet from queue. Note: backlog is handled in
+ * codel, we don't need to reduce it here.
+ */
+static struct sk_buff *dequeue(struct codel_vars *vars, struct Qdisc *sch)
+{
+	struct fq_codel_flow *flow;
+	struct sk_buff *skb = NULL;
+
+	flow = container_of(vars, struct fq_codel_flow, cvars);
+	if (flow->head) {
+		skb = dequeue_head(flow);
+		sch->qstats.backlog -= qdisc_pkt_len(skb);
+		sch->q.qlen--;
+	}
+	return skb;
+}
+
+static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct sk_buff *skb;
+	struct fq_codel_flow *flow;
+
+begin:
+	if (!list_empty(&q->new_flows))
+		flow = list_first_entry(&q->new_flows,
+					struct fq_codel_flow,
+					flowchain);
+	else if (!list_empty(&q->old_flows))
+		flow = list_first_entry(&q->old_flows,
+					struct fq_codel_flow,
+					flowchain);
+	else
+		return NULL;
+
+	if (flow->deficit <= 0) {
+		flow->deficit += q->quantum;
+		list_move_tail(&flow->flowchain, &q->old_flows);
+		goto begin;
+	}
+	skb = codel_dequeue(sch, &q->cparams, &flow->cvars, &q->cstats,
+			    dequeue, &q->backlogs[flow - q->flows]);
+	if (!skb) {
+		list_del_init(&flow->flowchain);
+		goto begin;
+	}
+	qdisc_bstats_update(sch, skb);
+	flow->deficit -= qdisc_pkt_len(skb);
+	return skb;
+}
+
+static void fq_codel_reset(struct Qdisc *sch)
+{
+	struct sk_buff *skb;
+
+	while ((skb = fq_codel_dequeue(sch)) != NULL)
+		kfree_skb(skb);
+}
+
+static const struct nla_policy fq_codel_policy[TCA_FQ_CODEL_MAX + 1] = {
+	[TCA_FQ_CODEL_TARGET]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_LIMIT]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_INTERVAL]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_ECN]	= { .type = NLA_U32 },
+};
+
+static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct nlattr *tb[TCA_FQ_CODEL_MAX + 1];
+	int err;
+
+	if (!opt)
+		return -EINVAL;
+
+	err = nla_parse_nested(tb, TCA_FQ_CODEL_MAX, opt, fq_codel_policy);
+	if (err < 0)
+		return err;
+	if (tb[TCA_FQ_CODEL_FLOWS]) {
+		if (q->flows)
+			return -EINVAL;
+		q->flows_cnt = nla_get_u32(tb[TCA_FQ_CODEL_FLOWS]);
+		if (!q->flows_cnt ||
+		    q->flows_cnt > 65536)
+			return -EINVAL;
+	}
+	sch_tree_lock(sch);
+
+	if (tb[TCA_FQ_CODEL_TARGET]) {
+		u64 target = nla_get_u32(tb[TCA_FQ_CODEL_TARGET]);
+
+		q->cparams.target = (target * NSEC_PER_USEC) >> CODEL_SHIFT;
+	}
+
+	if (tb[TCA_FQ_CODEL_INTERVAL]) {
+		u64 interval = nla_get_u32(tb[TCA_FQ_CODEL_INTERVAL]);
+
+		q->cparams.interval = (interval * NSEC_PER_USEC) >> CODEL_SHIFT;
+	}
+
+	if (tb[TCA_FQ_CODEL_LIMIT])
+		sch->limit = nla_get_u32(tb[TCA_FQ_CODEL_LIMIT]);
+
+	if (tb[TCA_FQ_CODEL_ECN])
+		q->cparams.ecn = !!nla_get_u32(tb[TCA_FQ_CODEL_ECN]);
+
+	while (sch->q.qlen > sch->limit) {
+		struct sk_buff *skb = fq_codel_dequeue(sch);
+
+		kfree_skb(skb);
+		q->cstats.drop_count++;
+	}
+	qdisc_tree_decrease_qlen(sch, q->cstats.drop_count);
+	q->cstats.drop_count = 0;
+
+	sch_tree_unlock(sch);
+	return 0;
+}
+
+static void *fq_codel_zalloc(size_t sz)
+{
+	void *ptr = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN);
+
+	if (!ptr)
+		ptr = vzalloc(sz);
+	return ptr;
+}
+
+static void fq_codel_free(void *addr)
+{
+	if (addr) {
+		if (is_vmalloc_addr(addr))
+			vfree(addr);
+		else
+			kfree(addr);
+	}
+}
+
+static void fq_codel_destroy(struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+
+	tcf_destroy_chain(&q->filter_list);
+	fq_codel_free(q->backlogs);
+	fq_codel_free(q->flows);
+}
+
+static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	int i;
+
+	sch->limit = 10*1024;
+	q->flows_cnt = 1024;
+	q->quantum = psched_mtu(qdisc_dev(sch));
+	q->perturbation = net_random();
+	INIT_LIST_HEAD(&q->new_flows);
+	INIT_LIST_HEAD(&q->old_flows);
+	codel_params_init(&q->cparams);
+	codel_stats_init(&q->cstats);
+	q->cparams.ecn = true;
+
+	if (opt) {
+		int err = fq_codel_change(sch, opt);
+		if (err)
+			return err;
+	}
+
+	if (!q->flows) {
+		q->flows = fq_codel_zalloc(q->flows_cnt *
+					   sizeof(struct fq_codel_flow));
+		if (!q->flows)
+			return -ENOMEM;
+		q->backlogs = fq_codel_zalloc(q->flows_cnt * sizeof(u32));
+		if (!q->backlogs) {
+			fq_codel_free(q->flows);
+			return -ENOMEM;
+		}
+		for (i = 0; i < q->flows_cnt; i++) {
+			struct fq_codel_flow *flow = q->flows + i;
+
+			INIT_LIST_HEAD(&flow->flowchain);
+		}
+	}
+	if (sch->limit >= 1)
+		sch->flags |= TCQ_F_CAN_BYPASS;
+	else
+		sch->flags &= ~TCQ_F_CAN_BYPASS;
+	return 0;
+}
+
+static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct nlattr *opts;
+
+	opts = nla_nest_start(skb, TCA_OPTIONS);
+	if (opts == NULL)
+		goto nla_put_failure;
+
+	if (nla_put_u32(skb, TCA_FQ_CODEL_TARGET,
+			codel_time_to_us(q->cparams.target)) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_LIMIT,
+			sch->limit) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_INTERVAL,
+			codel_time_to_us(q->cparams.interval)) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_ECN,
+			q->cparams.ecn) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_FLOWS,
+			q->flows_cnt))
+		goto nla_put_failure;
+
+	return nla_nest_end(skb, opts);
+
+nla_put_failure:
+	nla_nest_cancel(skb, opts);
+	return -1;
+}
+
+static int fq_codel_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct tc_fq_codel_xstats st = {
+		.type				= TCA_FQ_CODEL_XSTATS_QDISC,
+		.qdisc_stats.maxpacket		= q->cstats.maxpacket,
+		.qdisc_stats.drop_overlimit	= q->drop_overlimit,
+		.qdisc_stats.ecn_mark		= q->cstats.ecn_mark,
+		.qdisc_stats.new_flow_count	= q->new_flow_count,
+	};
+	struct list_head *pos;
+
+	list_for_each(pos, &q->new_flows)
+		st.qdisc_stats.new_flows_len++;
+
+	list_for_each(pos, &q->old_flows)
+		st.qdisc_stats.old_flows_len++;
+
+	return gnet_stats_copy_app(d, &st, sizeof(st));
+}
+
+static struct Qdisc *fq_codel_leaf(struct Qdisc *sch, unsigned long arg)
+{
+	return NULL;
+}
+
+static unsigned long fq_codel_get(struct Qdisc *sch, u32 classid)
+{
+	return 0;
+}
+
+static unsigned long fq_codel_bind(struct Qdisc *sch, unsigned long parent,
+			      u32 classid)
+{
+	/* we cannot bypass queue discipline anymore */
+	sch->flags &= ~TCQ_F_CAN_BYPASS;
+	return 0;
+}
+
+static void fq_codel_put(struct Qdisc *q, unsigned long cl)
+{
+}
+
+static struct tcf_proto **fq_codel_find_tcf(struct Qdisc *sch, unsigned long cl)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+
+	if (cl)
+		return NULL;
+	return &q->filter_list;
+}
+
+static int fq_codel_dump_class(struct Qdisc *sch, unsigned long cl,
+			  struct sk_buff *skb, struct tcmsg *tcm)
+{
+	tcm->tcm_handle |= TC_H_MIN(cl);
+	return 0;
+}
+
+static int fq_codel_dump_class_stats(struct Qdisc *sch, unsigned long cl,
+				     struct gnet_dump *d)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	u32 idx = cl - 1;
+	struct gnet_stats_queue qs = { 0 };
+	struct tc_fq_codel_xstats xstats;
+
+	if (idx < q->flows_cnt) {
+		const struct fq_codel_flow *flow = &q->flows[idx];
+		const struct sk_buff *skb = flow->head;
+
+		memset(&xstats, 0, sizeof(xstats));
+		xstats.type = TCA_FQ_CODEL_XSTATS_CLASS;
+		xstats.class_stats.deficit = flow->deficit;
+		xstats.class_stats.ldelay =
+			codel_time_to_us(flow->cvars.ldelay);
+		xstats.class_stats.count = flow->cvars.count;
+		xstats.class_stats.lastcount = flow->cvars.lastcount;
+		xstats.class_stats.dropping = flow->cvars.dropping;
+		if (flow->cvars.dropping) {
+			codel_tdiff_t delta = flow->cvars.drop_next -
+					      codel_get_time();
+
+			xstats.class_stats.drop_next = (delta >= 0) ?
+				codel_time_to_us(delta) :
+				-codel_time_to_us(-delta);
+		}
+		while (skb) {
+			qs.qlen++;
+			skb = skb->next;
+		}
+		qs.backlog = q->backlogs[idx];
+	}
+	if (gnet_stats_copy_queue(d, &qs) < 0)
+		return -1;
+	if (idx < q->flows_cnt)
+		return gnet_stats_copy_app(d, &xstats, sizeof(xstats));
+	return 0;
+}
+
+static void fq_codel_walk(struct Qdisc *sch, struct qdisc_walker *arg)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	unsigned int i;
+
+	if (arg->stop)
+		return;
+
+	for (i = 0; i < q->flows_cnt; i++) {
+		if (list_empty(&q->flows[i].flowchain) ||
+		    arg->count < arg->skip) {
+			arg->count++;
+			continue;
+		}
+		if (arg->fn(sch, i + 1, arg) < 0) {
+			arg->stop = 1;
+			break;
+		}
+		arg->count++;
+	}
+}
+
+static const struct Qdisc_class_ops fq_codel_class_ops = {
+	.leaf		=	fq_codel_leaf,
+	.get		=	fq_codel_get,
+	.put		=	fq_codel_put,
+	.tcf_chain	=	fq_codel_find_tcf,
+	.bind_tcf	=	fq_codel_bind,
+	.unbind_tcf	=	fq_codel_put,
+	.dump		=	fq_codel_dump_class,
+	.dump_stats	=	fq_codel_dump_class_stats,
+	.walk		=	fq_codel_walk,
+};
+
+static struct Qdisc_ops fq_codel_qdisc_ops __read_mostly = {
+	.cl_ops		=	&fq_codel_class_ops,
+	.id		=	"fq_codel",
+	.priv_size	=	sizeof(struct fq_codel_sched_data),
+	.enqueue	=	fq_codel_enqueue,
+	.dequeue	=	fq_codel_dequeue,
+	.peek		=	qdisc_peek_dequeued,
+	.drop		=	fq_codel_drop,
+	.init		=	fq_codel_init,
+	.reset		=	fq_codel_reset,
+	.destroy	=	fq_codel_destroy,
+	.change		=	NULL,
+	.dump		=	fq_codel_dump,
+	.dump_stats =	fq_codel_dump_stats,
+	.owner		=	THIS_MODULE,
+};
+
+static int __init fq_codel_module_init(void)
+{
+	return register_qdisc(&fq_codel_qdisc_ops);
+}
+
+static void __exit fq_codel_module_exit(void)
+{
+	unregister_qdisc(&fq_codel_qdisc_ops);
+}
+
+module_init(fq_codel_module_init)
+module_exit(fq_codel_module_exit)
+MODULE_AUTHOR("Eric Dumazet");
+MODULE_LICENSE("GPL");


* Re: [PATCH net-next] fq_codel: Fair Queue Codel AQM
  2012-05-11 13:59 [PATCH net-next] fq_codel: Fair Queue Codel AQM Eric Dumazet
@ 2012-05-11 15:03 ` Changli Gao
  2012-05-11 15:23   ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Changli Gao @ 2012-05-11 15:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Dave Taht, Kathleen Nichols, Van Jacobson,
	Tom Herbert, Matt Mathis, Yuchung Cheng, Stephen Hemminger

On Fri, May 11, 2012 at 9:59 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Fair Queue Codel implementation.
>
> Principles :
>
> - Packets are classified (internal classifier or external) on flows.
> - This is a Stochastic model (as we use a hash, several flows might
>                              be hashed on same slot)
> - Each flow has a CoDel managed queue.
> - Flows are linked onto two (Round Robin) lists,
>  so that new flows have priority on old ones.

I don't think it is a good idea, as the old ones may be starved. It isn't
fair. Why not use the conventional DRR?

> +
> +       /* Queue is full! Find the fat flow and drop packet from it.
> +        * This might sound expensive, but with 1024 flows, we scan
> +        * 4KB of memory, and we dont need to handle a complex tree
> +        * in fast path (packet queue/enqueue) with many cache misses.
> +        */

How about the tricks used by SFQ?

Thanks.

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)


* Re: [PATCH net-next] fq_codel: Fair Queue Codel AQM
  2012-05-11 15:03 ` Changli Gao
@ 2012-05-11 15:23   ` Eric Dumazet
  2012-05-11 16:08     ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2012-05-11 15:23 UTC (permalink / raw)
  To: Changli Gao
  Cc: David Miller, netdev, Dave Taht, Kathleen Nichols, Van Jacobson,
	Tom Herbert, Matt Mathis, Yuchung Cheng, Stephen Hemminger

On Fri, 2012-05-11 at 23:03 +0800, Changli Gao wrote:
> On Fri, May 11, 2012 at 9:59 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Fair Queue Codel implementation.
> >
> > Principles :
> >
> > - Packets are classified (internal classifier or external) on flows.
> > - This is a Stochastic model (as we use a hash, several flows might
> >                              be hashed on same slot)
> > - Each flow has a CoDel managed queue.
> > - Flows are linked onto two (Round Robin) lists,
> >  so that new flows have priority on old ones.
> 
> I don't think it is a good idea, as the old ones may be starved. It isn't
> fair. Why not use the conventional DRR?
> 

Hey, it's DRR, but with 64 bytes per flow instead of more than 256.
One cache line per flow, that was my goal, sharing the codel_params and
stats among all flows.

A 'struct fq_codel_flow' can be in three states :

- Detached state
- In new flow list
- In old flow list

And it's dequeue() that can put a flow into the detached state, and only
when the flow comes from the old flow list.
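
The three states above can be sketched as a small state machine. This is a
user-space illustration with hypothetical names (flow_state, on_enqueue,
on_deficit_exhausted, on_queue_empty), not code from the patch. The
empty-queue rule follows the follow-up diff against v1 posted later in this
thread, which still detaches a new-list flow when the old list is empty; the
v1 patch as posted calls list_del_init() unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

enum flow_state { FLOW_DETACHED, FLOW_NEW, FLOW_OLD };

/* A packet arrives: a detached flow is linked at the tail of the new list,
 * a flow that is already linked stays where it is.
 */
static enum flow_state on_enqueue(enum flow_state s)
{
	return s == FLOW_DETACHED ? FLOW_NEW : s;
}

/* The flow at the head of its list has no deficit left: it gets a fresh
 * quantum and is moved to the tail of the old list.
 */
static enum flow_state on_deficit_exhausted(enum flow_state s)
{
	assert(s != FLOW_DETACHED);
	return FLOW_OLD;
}

/* Dequeue found nothing to send for this flow. A flow taken from the new
 * list is pushed through the old list first when that list has entries;
 * otherwise the flow is detached.
 */
static enum flow_state on_queue_empty(enum flow_state s, bool old_list_nonempty)
{
	assert(s != FLOW_DETACHED);
	if (s == FLOW_NEW && old_list_nonempty)
		return FLOW_OLD;
	return FLOW_DETACHED;
}
```

Only on_queue_empty() can return FLOW_DETACHED, which matches the remark
that dequeue is the only place where a flow gets unlinked.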

It's possible I missed something, because my first version had 3
lists.

Anyway, I'll send a V2 because I left the .change method set to NULL, while
the intent was to permit changes to fq_codel.

> > +
> > +       /* Queue is full! Find the fat flow and drop packet from it.
> > +        * This might sound expensive, but with 1024 flows, we scan
> > +        * 4KB of memory, and we dont need to handle a complex tree
> > +        * in fast path (packet queue/enqueue) with many cache misses.
> > +        */
> 
> How about the tricks used by SFQ?

They are too expensive in terms of cache misses and limits.
The code is complex and difficult to maintain.
That was a nice compromise 20 years ago when memory was expensive.
Now, memory is cheap but still slow.

Also adding the 'priority to new flows' is too difficult with SFQ.
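
For reference, the linear scan discussed above can be sketched in user space
as follows. The function name fattest_flow and the stdint types are
assumptions made for this example; the loop mirrors the one in
fq_codel_drop(). With 1024 flows and one 32-bit backlog counter per flow,
the scan touches 1024 * 4 = 4096 bytes of contiguous memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the flow with the largest backlog (in bytes).
 * Ties go to the lowest index because the comparison is strict; if every
 * backlog is zero the result is index 0.
 * Hypothetical helper for illustration, not part of the patch.
 */
static size_t fattest_flow(const uint32_t *backlogs, size_t flows_cnt)
{
	uint32_t maxbacklog = 0;
	size_t idx = 0;

	assert(backlogs != NULL && flows_cnt > 0);
	for (size_t i = 0; i < flows_cnt; i++) {
		if (backlogs[i] > maxbacklog) {
			maxbacklog = backlogs[i];
			idx = i;
		}
	}
	return idx;
}
```

The scan only runs when the qdisc is over its packet limit, so the
enqueue/dequeue fast path needs no ordered structure to track the largest
flow.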


* Re: [PATCH net-next] fq_codel: Fair Queue Codel AQM
  2012-05-11 15:23   ` Eric Dumazet
@ 2012-05-11 16:08     ` Eric Dumazet
  2012-05-11 18:49       ` pch_gbe oops with vlan Andy Cress
  2012-05-11 19:30       ` [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM Eric Dumazet
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2012-05-11 16:08 UTC (permalink / raw)
  To: Changli Gao
  Cc: David Miller, netdev, Dave Taht, Kathleen Nichols, Van Jacobson,
	Tom Herbert, Matt Mathis, Yuchung Cheng, Stephen Hemminger

On Fri, 2012-05-11 at 17:23 +0200, Eric Dumazet wrote:

> Its possible I missed something, because in my first coding I had 3
> lists.
> 
> Anyway I'll send a V2 because I left .change method to NULL, while the
> intent was to permit a change on fq_codel.

Before sending v2, here is the diff against v1 :

 net/sched/sch_fq_codel.c |   26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 8675ff8..f29a967 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -231,18 +231,16 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *skb;
 	struct fq_codel_flow *flow;
+	struct list_head *head;
 
 begin:
-	if (!list_empty(&q->new_flows))
-		flow = list_first_entry(&q->new_flows,
-					struct fq_codel_flow,
-					flowchain);
-	else if (!list_empty(&q->old_flows))
-		flow = list_first_entry(&q->old_flows,
-					struct fq_codel_flow,
-					flowchain);
-	else
-		return NULL;
+	head = &q->new_flows;
+	if (list_empty(head)) {
+		head = &q->old_flows;
+		if (list_empty(head))
+			return NULL;
+	}
+	flow = list_first_entry(head, struct fq_codel_flow, flowchain);
 
 	if (flow->deficit <= 0) {
 		flow->deficit += q->quantum;
@@ -252,7 +250,11 @@ begin:
 	skb = codel_dequeue(sch, &q->cparams, &flow->cvars, &q->cstats,
 			    dequeue, &q->backlogs[flow - q->flows]);
 	if (!skb) {
-		list_del_init(&flow->flowchain);
+		/* force a pass through old_flows to prevent starvation */
+		if ((head == &q->new_flows) && !list_empty(&q->old_flows))
+			list_move_tail(&flow->flowchain, &q->old_flows);
+		else
+			list_del_init(&flow->flowchain);
 		goto begin;
 	}
 	qdisc_bstats_update(sch, skb);
@@ -573,7 +575,7 @@ static struct Qdisc_ops fq_codel_qdisc_ops __read_mostly = {
 	.init		=	fq_codel_init,
 	.reset		=	fq_codel_reset,
 	.destroy	=	fq_codel_destroy,
-	.change		=	NULL,
+	.change		=	fq_codel_change,
 	.dump		=	fq_codel_dump,
 	.dump_stats =	fq_codel_dump_stats,
 	.owner		=	THIS_MODULE,
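
For readers less familiar with DRR: the deficit test that appears as
unchanged context in the diff above can be sketched in user space as below.
This is a toy model under stated assumptions, not the qdisc code: it uses a
single ring instead of the new/old lists, flows that always have packets
queued, a fixed packet length per flow, and hypothetical names (drr_flow,
drr_serve_one).

```c
#include <assert.h>
#include <stddef.h>

struct drr_flow {
	int deficit;      /* bytes this flow may still send in the current round */
	int pkt_len;      /* every packet of this toy flow has the same length */
	long sent_bytes;  /* total bytes served so far */
};

/* Serve one packet from a ring of always-backlogged flows.
 * *cur is the index of the flow currently at the head of the ring.
 * Returns the index of the flow that sent the packet.
 */
static size_t drr_serve_one(struct drr_flow *flows, size_t n, size_t *cur,
			    int quantum)
{
	assert(flows != NULL && n > 0 && quantum > 0);
	for (;;) {
		struct drr_flow *f = &flows[*cur];

		if (f->deficit <= 0) {
			/* Out of credit: refill, then let the next flow run. */
			f->deficit += quantum;
			*cur = (*cur + 1) % n;
			continue;
		}
		/* Credit left: send one packet and charge its length. */
		f->deficit -= f->pkt_len;
		f->sent_bytes += f->pkt_len;
		return (size_t)(f - flows);
	}
}
```

Starting each flow with deficit equal to the quantum (as the enqueue path in
the patch does), a flow sending 1514-byte packets and a flow sending 300-byte
packets end up with byte counts that stay within a few quanta of each other,
however many packets are served.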


* pch_gbe oops with vlan
  2012-05-11 16:08     ` Eric Dumazet
@ 2012-05-11 18:49       ` Andy Cress
  2012-05-11 20:36         ` David Miller
  2012-05-11 19:30       ` [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM Eric Dumazet
  1 sibling, 1 reply; 12+ messages in thread
From: Andy Cress @ 2012-05-11 18:49 UTC (permalink / raw)
  To: netdev

Folks,

I am looking for help in debugging a pch_gbe driver oops/abort.

Kernel: version 2.6.32-220.el6.i686 (RHEL6.2)
Driver: pch_gbe version 0.91-NAPI  (source tarball we used is at https://sendfile.kontron.com/message/24tdUi6MXklnUtBLnOsumq until May 16)
NIC: 0b:00.1 Ethernet controller [0200]: Intel Corporation Platform Controller Hub EG20T Gigabit Ethernet Controller [8086:8802] (rev 02)

Configuration, with VLAN:
 eth0 (not started)
 eth0.100 = 192.168.100.1 
 eth0.200 = 192.168.200.1 
 eth0.6  = 192.168.6.1

After starting the VLAN configuration and running a ping test for >= 5 minutes, I get a kernel oops/abort message as shown below.  This does not happen without VLAN configured.
Where should I look for possible causes of a transmit queue timeout like this?

I have contacted the OKI/LAPIS driver authors, but have had no response so far.  I thought that this group might be able to comment from similar experiences.

Andy

May 11 11:06:09 kontron kernel: ------------[ cut here ]------------
May 11 11:06:09 kontron kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x1ec/0x200() (Not tainted)
May 11 11:06:09 kontron kernel: Hardware name: N/A
May 11 11:06:09 kontron kernel: NETDEV WATCHDOG: eth0 (pch_gbe): transmit queue 0 timed out
May 11 11:06:09 kontron kernel: Modules linked in: fuse ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables tun bridge autofs4 sunrpc cpufreq_ondemand acpi_cpufreq mperf 8021q garp stp llc ipv6 ext3 jbd uinput ppdev parport_pc parport sg microcode pch_gbe(U) mii serio_raw snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc ext4 mbcache jbd2 sd_mod crc_t10dif ahci sdhci_pci sdhci mmc_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
May 11 11:06:09 kontron kernel: Pid: 0, comm: swapper Not tainted 2.6.32-220.el6.i686 #1
May 11 11:06:09 kontron kernel: Call Trace:
May 11 11:06:09 kontron kernel: [<c0454c81>] ? warn_slowpath_common+0x81/0xc0
May 11 11:06:09 kontron kernel: [<c07a16bc>] ? dev_watchdog+0x1ec/0x200
May 11 11:06:09 kontron kernel: [<c07a16bc>] ? dev_watchdog+0x1ec/0x200
May 11 11:06:09 kontron kernel: [<c0454d53>] ? warn_slowpath_fmt+0x33/0x40
May 11 11:06:09 kontron kernel: [<c07a16bc>] ? dev_watchdog+0x1ec/0x200
May 11 11:06:09 kontron kernel: [<c0471bfa>] ? insert_work+0x5a/0xb0
May 11 11:06:09 kontron kernel: [<c04656f9>] ? run_timer_softirq+0x139/0x2c0
May 11 11:06:09 kontron kernel: [<c0831315>] ? apic_timer_interrupt+0x31/0x38
May 11 11:06:09 kontron kernel: [<c07a14d0>] ? dev_watchdog+0x0/0x200
May 11 11:06:09 kontron kernel: [<c045be4a>] ? __do_softirq+0x8a/0x1a0
May 11 11:06:09 kontron kernel: [<c045bf9d>] ? do_softirq+0x3d/0x50
May 11 11:06:09 kontron kernel: [<c045c0f5>] ? irq_exit+0x65/0x70
May 11 11:06:09 kontron kernel: [<c0428473>] ? smp_apic_timer_interrupt+0x53/0x90
May 11 11:06:09 kontron kernel: [<c0831315>] ? apic_timer_interrupt+0x31/0x38
May 11 11:06:09 kontron kernel: [<c045007b>] ? throttle_cfs_rq+0x6b/0x130
May 11 11:06:09 kontron kernel: [<c064735f>] ? intel_idle+0xaf/0x140
May 11 11:06:09 kontron kernel: [<c075c282>] ? cpuidle_idle_call+0x72/0x100
May 11 11:06:09 kontron kernel: [<c0408964>] ? cpu_idle+0x94/0xd0
May 11 11:06:09 kontron kernel: [<c082a645>] ? start_secondary+0x20d/0x252
May 11 11:06:09 kontron kernel: ---[ end trace 3672ff56500ae344 ]---
May 11 11:06:09 kontron NetworkManager[1608]: <info> (eth0): carrier now OFF (device state 3)
May 11 11:06:09 kontron NetworkManager[1608]: <info> (eth0): device state change: 3 -> 2 (reason 40)
May 11 11:06:09 kontron NetworkManager[1608]: <info> (eth0): deactivating device (reason: 40).
May 11 11:06:10 kontron abrtd: Directory 'oops-2012-05-11-11:06:10-1924-0' creation detected
May 11 11:06:10 kontron abrt-dump-oops: Reported 1 kernel oopses to Abrt

* [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM
  2012-05-11 16:08     ` Eric Dumazet
  2012-05-11 18:49       ` pch_gbe oops with vlan Andy Cress
@ 2012-05-11 19:30       ` Eric Dumazet
  2012-05-11 19:49         ` [PATCH v2 iproute2] " Eric Dumazet
  2012-05-11 22:16         ` [PATCH v2 net-next] " Eric Dumazet
  1 sibling, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2012-05-11 19:30 UTC (permalink / raw)
  To: Changli Gao, David Miller
  Cc: netdev, Dave Taht, Kathleen Nichols, Van Jacobson, Tom Herbert,
	Matt Mathis, Yuchung Cheng, Stephen Hemminger,
	Maciej Żenczykowski, Nandita Dukkipati

From: Eric Dumazet <edumazet@google.com>

Fair Queue Codel packet scheduler

Principles :

- Packets are classified (internal classifier or external) on flows.
- This is a Stochastic model (as we use a hash, several flows might
                              be hashed on same slot)
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
  so that new flows have priority on old ones.

- For a given flow, packets are not reordered (CoDel uses a FIFO)
- head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)
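
The two-list round robin described above can be sketched in user space as
follows. This is only an illustration under stated assumptions, not the
kernel code: the names (struct sched, fq_sketch_enqueue, fq_sketch_dequeue),
the tiny NFLOWS table and the QCAP per-flow FIFO are hypothetical, and CoDel
itself (sojourn-time drops and ECN marks) is left out. It keeps only the
deficit handling and the rule that an emptied new flow makes one pass
through old_flows.

```c
#include <stddef.h>

#define QUANTUM 1514	/* bytes of credit per round (MTU-sized default) */
#define NFLOWS  4	/* hypothetical tiny flow table */
#define QCAP    8	/* hypothetical per-flow FIFO capacity */

struct flow {
	int pkts[QCAP];		/* packet lengths, FIFO order */
	int head, tail;
	int deficit;
	int linked;		/* non-zero while on new_flows or old_flows */
	struct flow *next;
};

struct flow_list { struct flow *first, *last; };

struct sched {
	struct flow flows[NFLOWS];
	struct flow_list new_flows, old_flows;
};

static void flow_list_add_tail(struct flow_list *l, struct flow *f)
{
	f->next = NULL;
	if (l->last)
		l->last->next = f;
	else
		l->first = f;
	l->last = f;
}

static void flow_list_del_first(struct flow_list *l)
{
	struct flow *f = l->first;

	l->first = f->next;
	if (!l->first)
		l->last = NULL;
	f->next = NULL;
}

/* Queue a packet of 'len' bytes on flow 'idx'; returns 0, or -1 if rejected. */
static int fq_sketch_enqueue(struct sched *s, int idx, int len)
{
	struct flow *f;

	if (idx < 0 || idx >= NFLOWS)
		return -1;
	f = &s->flows[idx];
	if (f->tail == QCAP)
		return -1;
	f->pkts[f->tail++] = len;
	if (!f->linked) {		/* idle flow: it starts as a "new" flow */
		f->linked = 1;
		f->deficit = QUANTUM;
		flow_list_add_tail(&s->new_flows, f);
	}
	return 0;
}

/* Returns the index of the flow that sent a packet, or -1 if all are empty. */
static int fq_sketch_dequeue(struct sched *s)
{
	for (;;) {
		struct flow_list *head = s->new_flows.first ?
					 &s->new_flows : &s->old_flows;
		struct flow *f = head->first;

		if (!f)
			return -1;
		if (f->deficit <= 0) {	/* credit used up: refill, go to old list */
			f->deficit += QUANTUM;
			flow_list_del_first(head);
			flow_list_add_tail(&s->old_flows, f);
			continue;
		}
		if (f->head == f->tail) {	/* flow has no packet left */
			flow_list_del_first(head);
			if (head == &s->new_flows && s->old_flows.first) {
				/* force a pass through old_flows (no starvation) */
				flow_list_add_tail(&s->old_flows, f);
			} else {
				f->linked = 0;
				f->head = f->tail = 0;
			}
			continue;
		}
		f->deficit -= f->pkts[f->head++];
		return (int)(f - s->flows);
	}
}
```

With a zero-initialised struct sched, queueing three 1514-byte packets on
flow 0, dequeuing once, and then queueing one small packet on flow 1 makes
the next dequeue serve flow 1 before the rest of flow 0's backlog.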

tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                      [ target TIME ] [ interval TIME ] [ noecn ]
                      [ quantum BYTES ]

defaults : 1024 flows, 10240 packets limit, quantum : device MTU
           target : 5ms (CoDel default)
           interval : 100ms (CoDel default)
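
The 'flows' value sizes the flow table, and fq_codel_hash() in the patch
reduces a 32-bit hash to a slot with a multiply and a shift rather than a
modulo. A minimal stand-alone sketch of that reduction, assuming a helper
name (slot_of) that is not in the patch:

```c
#include <stdint.h>

/* Map a 32-bit hash onto [0, flows_cnt) without a division:
 * (hash * flows_cnt) / 2^32, computed in 64 bits.
 */
static uint32_t slot_of(uint32_t hash, uint32_t flows_cnt)
{
	return (uint32_t)(((uint64_t)hash * flows_cnt) >> 32);
}
```

The result is always below flows_cnt, for any table size up to the 65536
maximum accepted by fq_codel_change(), so flows_cnt does not need to be a
power of two.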

Impressive results on load :

# tc -s -d cl show dev eth9

class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0 
 Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0) 
 rate 201691Kbit 28595pps backlog 0b 312p requeues 0 
 lended: 33063109 borrowed: 0 giants: 0
 tokens: -912 ctokens: -912

class fq_codel 10:1735 parent 10: 
 (dropped 1292, overlimits 0 requeues 0) 
 backlog 15140b 10p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4524 parent 10: 
 (dropped 1291, overlimits 0 requeues 0) 
 backlog 16654b 11p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4e74 parent 10: 
 (dropped 1290, overlimits 0 requeues 0) 
 backlog 6056b 4p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
class fq_codel 10:628a parent 10: 
 (dropped 1289, overlimits 0 requeues 0) 
 backlog 7570b 5p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
class fq_codel 10:a4b3 parent 10: 
 (dropped 302, overlimits 0 requeues 0) 
 backlog 16654b 11p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:c3c2 parent 10: 
 (dropped 1284, overlimits 0 requeues 0) 
 backlog 13626b 9p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:d331 parent 10: 
 (dropped 299, overlimits 0 requeues 0) 
 backlog 15140b 10p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.0ms
class fq_codel 10:d526 parent 10: 
 (dropped 12160, overlimits 0 requeues 0) 
 backlog 35870b 211p requeues 0 
  deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
class fq_codel 10:e2c6 parent 10: 
 (dropped 1288, overlimits 0 requeues 0) 
 backlog 15140b 10p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:eab5 parent 10: 
 (dropped 1285, overlimits 0 requeues 0) 
 backlog 16654b 11p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:f220 parent 10: 
 (dropped 1289, overlimits 0 requeues 0) 
 backlog 15140b 10p requeues 0 
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms

# tc -s -d qd show dev eth9

qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71) 
 rate 201697Kbit 28602pps backlog 0b 260p requeues 71 
qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn 
 Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0) 
 rate 201697Kbit 28602pps backlog 189352b 260p requeues 0 
  maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
  new_flows_len 0 old_flows_len 11


# ping -c 10 172.30.42.18
PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms

--- 172.30.42.18 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8999ms
rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms

Much better than SFQ because new flows are given priority and the fast
path dirties fewer cache lines.
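
One way the fast path stays cheap, going by the comment in fq_codel_drop(),
is that no sorted structure of flows is maintained: only when the packet
limit is hit does the qdisc scan the per-flow backlog array for the fattest
flow and drop from its head. A stand-alone sketch of that scan, where the
helper name fattest_flow is an assumption and not part of the patch:

```c
#include <stdint.h>

/* Return the index of the flow with the largest byte backlog.
 * Ties resolve to the lowest index; an all-zero table yields 0.
 */
static unsigned int fattest_flow(const uint32_t *backlogs,
				 unsigned int flows_cnt)
{
	unsigned int i, idx = 0;
	uint32_t maxbacklog = 0;

	for (i = 0; i < flows_cnt; i++) {
		if (backlogs[i] > maxbacklog) {
			maxbacklog = backlogs[i];
			idx = i;
		}
	}
	return idx;
}
```

With the default 1024 flows the table is 4KB of u32 counters, so the scan
touches a small contiguous region of memory.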

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Changli Gao <xiaosuo@gmail.com>
---
v2: added 'dropped' counter per flow (sum of drops and marks)
    .change method allowed (tc qdisc change .... )
    quantum is a tunable
    no starvation of old flows because of new ones.
    drop_count correctly handled in dequeue() (upcall to parents)
    pkt_sched.h cleanups

 include/linux/pkt_sched.h |   54 +++
 net/sched/Kconfig         |   11 
 net/sched/Makefile        |    1 
 net/sched/sch_fq_codel.c  |  625 ++++++++++++++++++++++++++++++++++++
 4 files changed, 691 insertions(+)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index cde56c2..32aef0a 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -681,4 +681,58 @@ struct tc_codel_xstats {
 	__u32	dropping;  /* are we in dropping state ? */
 };
 
+/* FQ_CODEL */
+
+enum {
+	TCA_FQ_CODEL_UNSPEC,
+	TCA_FQ_CODEL_TARGET,
+	TCA_FQ_CODEL_LIMIT,
+	TCA_FQ_CODEL_INTERVAL,
+	TCA_FQ_CODEL_ECN,
+	TCA_FQ_CODEL_FLOWS,
+	TCA_FQ_CODEL_QUANTUM,
+	__TCA_FQ_CODEL_MAX
+};
+
+#define TCA_FQ_CODEL_MAX	(__TCA_FQ_CODEL_MAX - 1)
+
+enum {
+	TCA_FQ_CODEL_XSTATS_QDISC,
+	TCA_FQ_CODEL_XSTATS_CLASS,
+};
+
+struct tc_fq_codel_qd_stats {
+	__u32	maxpacket;	/* largest packet we've seen so far */
+	__u32	drop_overlimit; /* number of times max qdisc
+				 * packet limit was hit
+				 */
+	__u32	ecn_mark;	/* number of packets we ECN marked
+				 * instead of being dropped
+				 */
+	__u32	new_flow_count; /* number of times packets
+				 * created a 'new flow'
+				 */
+	__u32	new_flows_len;	/* count of flows in new list */
+	__u32	old_flows_len;	/* count of flows in old list */
+};
+
+struct tc_fq_codel_cl_stats {
+	__s32	deficit;
+	__u32	ldelay;		/* in-queue delay seen by most recently
+				 * dequeued packet
+				 */
+	__u32	count;
+	__u32	lastcount;
+	__u32	dropping;
+	__s32	drop_next;
+};
+
+struct tc_fq_codel_xstats {
+	__u32	type;
+	union {
+		struct tc_fq_codel_qd_stats qdisc_stats;
+		struct tc_fq_codel_cl_stats class_stats;
+	};
+};
+
 #endif
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index fadd252..e7a8976 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -261,6 +261,17 @@ config NET_SCH_CODEL
 
 	  If unsure, say N.
 
+config NET_SCH_FQ_CODEL
+	tristate "Fair Queue Controlled Delay AQM (FQ_CODEL)"
+	help
+	  Say Y here if you want to use the FQ Controlled Delay (FQ_CODEL)
+	  packet scheduling algorithm.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called sch_fq_codel.
+
+	  If unsure, say N.
+
 config NET_SCH_INGRESS
 	tristate "Ingress Qdisc"
 	depends on NET_CLS_ACT
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 30fab03..5940a19 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_NET_SCH_MQPRIO)	+= sch_mqprio.o
 obj-$(CONFIG_NET_SCH_CHOKE)	+= sch_choke.o
 obj-$(CONFIG_NET_SCH_QFQ)	+= sch_qfq.o
 obj-$(CONFIG_NET_SCH_CODEL)	+= sch_codel.o
+obj-$(CONFIG_NET_SCH_FQ_CODEL)	+= sch_fq_codel.o
 
 obj-$(CONFIG_NET_CLS_U32)	+= cls_u32.o
 obj-$(CONFIG_NET_CLS_ROUTE4)	+= cls_route.o
diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
new file mode 100644
index 0000000..1f538e4
--- /dev/null
+++ b/net/sched/sch_fq_codel.c
@@ -0,0 +1,625 @@
+/*
+ * Fair Queue CoDel discipline
+ *
+ *	This program is free software; you can redistribute it and/or
+ *	modify it under the terms of the GNU General Public License
+ *	as published by the Free Software Foundation; either version
+ *	2 of the License, or (at your option) any later version.
+ *
+ *  Copyright (C) 2012 Eric Dumazet <edumazet@google.com>
+ */
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/jiffies.h>
+#include <linux/string.h>
+#include <linux/in.h>
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/skbuff.h>
+#include <linux/jhash.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <net/flow_keys.h>
+#include <net/codel.h>
+
+/*	Fair Queue CoDel.
+ *
+ * Principles :
+ * Packets are classified (internal classifier or external) on flows.
+ * This is a Stochastic model (as we use a hash, several flows
+ *			       might be hashed on same slot)
+ * Each flow has a CoDel managed queue.
+ * Flows are linked onto two (Round Robin) lists,
+ * so that new flows have priority on old ones.
+ *
+ * For a given flow, packets are not reordered (CoDel uses a FIFO)
+ * head drops only.
+ * ECN capability is on by default.
+ * Low memory footprint (64 bytes per flow)
+ */
+
+struct fq_codel_flow {
+	struct sk_buff	  *head;
+	struct sk_buff	  *tail;
+	struct list_head  flowchain;
+	int		  deficit;
+	u32		  dropped; /* number of drops (or ECN marks) on this flow */
+	struct codel_vars cvars;
+}; /* please try to keep this structure <= 64 bytes */
+
+struct fq_codel_sched_data {
+	struct tcf_proto *filter_list;	/* optional external classifier */
+	struct fq_codel_flow *flows;	/* Flows table [flows_cnt] */
+	u32		*backlogs;	/* backlog table [flows_cnt] */
+	u32		flows_cnt;	/* number of flows */
+	u32		perturbation;	/* hash perturbation */
+	u32		quantum;	/* psched_mtu(qdisc_dev(sch)); */
+	struct codel_params cparams;
+	struct codel_stats cstats;
+	u32		drop_overlimit;
+	u32		new_flow_count;
+
+	struct list_head new_flows;	/* list of new flows */
+	struct list_head old_flows;	/* list of old flows */
+};
+
+static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
+				  const struct sk_buff *skb)
+{
+	struct flow_keys keys;
+	unsigned int hash;
+
+	skb_flow_dissect(skb, &keys);
+	hash = jhash_3words((__force u32)keys.dst,
+			    (__force u32)keys.src ^ keys.ip_proto,
+			    (__force u32)keys.ports, q->perturbation);
+	return ((u64)hash * q->flows_cnt) >> 32;
+}
+
+static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch,
+				      int *qerr)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct tcf_result res;
+	int result;
+
+	if (TC_H_MAJ(skb->priority) == sch->handle &&
+	    TC_H_MIN(skb->priority) > 0 &&
+	    TC_H_MIN(skb->priority) <= q->flows_cnt)
+		return TC_H_MIN(skb->priority);
+
+	if (!q->filter_list)
+		return fq_codel_hash(q, skb) + 1;
+
+	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
+	result = tc_classify(skb, q->filter_list, &res);
+	if (result >= 0) {
+#ifdef CONFIG_NET_CLS_ACT
+		switch (result) {
+		case TC_ACT_STOLEN:
+		case TC_ACT_QUEUED:
+			*qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN;
+		case TC_ACT_SHOT:
+			return 0;
+		}
+#endif
+		if (TC_H_MIN(res.classid) <= q->flows_cnt)
+			return TC_H_MIN(res.classid);
+	}
+	return 0;
+}
+
+/* helper functions : might be changed when/if skb use a standard list_head */
+
+/* remove one skb from head of slot queue */
+static inline struct sk_buff *dequeue_head(struct fq_codel_flow *flow)
+{
+	struct sk_buff *skb = flow->head;
+
+	flow->head = skb->next;
+	skb->next = NULL;
+	return skb;
+}
+
+/* add skb to flow queue (tail add) */
+static inline void flow_queue_add(struct fq_codel_flow *flow,
+				  struct sk_buff *skb)
+{
+	if (flow->head == NULL)
+		flow->head = skb;
+	else
+		flow->tail->next = skb;
+	flow->tail = skb;
+	skb->next = NULL;
+}
+
+static unsigned int fq_codel_drop(struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct sk_buff *skb;
+	unsigned int maxbacklog = 0, idx = 0, i, len;
+	struct fq_codel_flow *flow;
+
+	/* Queue is full! Find the fat flow and drop packet from it.
+	 * This might sound expensive, but with 1024 flows, we scan
+	 * 4KB of memory, and we don't need to handle a complex tree
+	 * in fast path (packet enqueue/dequeue) with many cache misses.
+	 */
+	for (i = 0; i < q->flows_cnt; i++) {
+		if (q->backlogs[i] > maxbacklog) {
+			maxbacklog = q->backlogs[i];
+			idx = i;
+		}
+	}
+	flow = &q->flows[idx];
+	skb = dequeue_head(flow);
+	len = qdisc_pkt_len(skb);
+	q->backlogs[idx] -= len;
+	kfree_skb(skb);
+	sch->q.qlen--;
+	sch->qstats.drops++;
+	sch->qstats.backlog -= len;
+	flow->dropped++;
+	return idx;
+}
+
+static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	unsigned int idx;
+	struct fq_codel_flow *flow;
+	int uninitialized_var(ret);
+
+	idx = fq_codel_classify(skb, sch, &ret);
+	if (idx == 0) {
+		if (ret & __NET_XMIT_BYPASS)
+			sch->qstats.drops++;
+		kfree_skb(skb);
+		return ret;
+	}
+	idx--;
+
+	codel_set_enqueue_time(skb);
+	flow = &q->flows[idx];
+	flow_queue_add(flow, skb);
+	q->backlogs[idx] += qdisc_pkt_len(skb);
+	sch->qstats.backlog += qdisc_pkt_len(skb);
+
+	if (list_empty(&flow->flowchain)) {
+		list_add_tail(&flow->flowchain, &q->new_flows);
+		codel_vars_init(&flow->cvars);
+		q->new_flow_count++;
+		flow->deficit = q->quantum;
+		flow->dropped = 0;
+	}
+	if (++sch->q.qlen < sch->limit)
+		return NET_XMIT_SUCCESS;
+
+	q->drop_overlimit++;
+	/* Return Congestion Notification only if we dropped a packet
+	 * from this flow.
+	 */
+	if (fq_codel_drop(sch) == idx)
+		return NET_XMIT_CN;
+
+	/* As we dropped a packet, better let upper stack know this */
+	qdisc_tree_decrease_qlen(sch, 1);
+	return NET_XMIT_SUCCESS;
+}
+
+/* This is the specific function called from codel_dequeue()
+ * to dequeue a packet from queue. Note: backlog is handled in
+ * codel, we dont need to reduce it here.
+ */
+static struct sk_buff *dequeue(struct codel_vars *vars, struct Qdisc *sch)
+{
+	struct fq_codel_flow *flow;
+	struct sk_buff *skb = NULL;
+
+	flow = container_of(vars, struct fq_codel_flow, cvars);
+	if (flow->head) {
+		skb = dequeue_head(flow);
+		sch->qstats.backlog -= qdisc_pkt_len(skb);
+		sch->q.qlen--;
+	}
+	return skb;
+}
+
+static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct sk_buff *skb;
+	struct fq_codel_flow *flow;
+	struct list_head *head;
+	u32 prev_drop_count, prev_ecn_mark;
+
+begin:
+	head = &q->new_flows;
+	if (list_empty(head)) {
+		head = &q->old_flows;
+		if (list_empty(head))
+			return NULL;
+	}
+	flow = list_first_entry(head, struct fq_codel_flow, flowchain);
+
+	if (flow->deficit <= 0) {
+		flow->deficit += q->quantum;
+		list_move_tail(&flow->flowchain, &q->old_flows);
+		goto begin;
+	}
+
+	prev_drop_count = q->cstats.drop_count;
+	prev_ecn_mark = q->cstats.ecn_mark;
+
+	skb = codel_dequeue(sch, &q->cparams, &flow->cvars, &q->cstats,
+			    dequeue, &q->backlogs[flow - q->flows]);
+
+	flow->dropped += q->cstats.drop_count - prev_drop_count;
+	flow->dropped += q->cstats.ecn_mark - prev_ecn_mark;
+
+	if (!skb) {
+		/* force a pass through old_flows to prevent starvation */
+		if ((head == &q->new_flows) && !list_empty(&q->old_flows))
+			list_move_tail(&flow->flowchain, &q->old_flows);
+		else
+			list_del_init(&flow->flowchain);
+		goto begin;
+	}
+	qdisc_bstats_update(sch, skb);
+	flow->deficit -= qdisc_pkt_len(skb);
+	/* We cant call qdisc_tree_decrease_qlen() if our qlen is 0,
+	 * or HTB crashes. Defer it for next round.
+	 */
+	if (q->cstats.drop_count && sch->q.qlen) {
+		qdisc_tree_decrease_qlen(sch, q->cstats.drop_count);
+		q->cstats.drop_count = 0;
+	}
+	return skb;
+}
+
+static void fq_codel_reset(struct Qdisc *sch)
+{
+	struct sk_buff *skb;
+
+	while ((skb = fq_codel_dequeue(sch)) != NULL)
+		kfree_skb(skb);
+}
+
+static const struct nla_policy fq_codel_policy[TCA_FQ_CODEL_MAX + 1] = {
+	[TCA_FQ_CODEL_TARGET]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_LIMIT]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_INTERVAL]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_ECN]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_FLOWS]	= { .type = NLA_U32 },
+	[TCA_FQ_CODEL_QUANTUM]	= { .type = NLA_U32 },
+};
+
+static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct nlattr *tb[TCA_FQ_CODEL_MAX + 1];
+	int err;
+
+	if (!opt)
+		return -EINVAL;
+
+	err = nla_parse_nested(tb, TCA_FQ_CODEL_MAX, opt, fq_codel_policy);
+	if (err < 0)
+		return err;
+	if (tb[TCA_FQ_CODEL_FLOWS]) {
+		if (q->flows)
+			return -EINVAL;
+		q->flows_cnt = nla_get_u32(tb[TCA_FQ_CODEL_FLOWS]);
+		if (!q->flows_cnt ||
+		    q->flows_cnt > 65536)
+			return -EINVAL;
+	}
+	sch_tree_lock(sch);
+
+	if (tb[TCA_FQ_CODEL_TARGET]) {
+		u64 target = nla_get_u32(tb[TCA_FQ_CODEL_TARGET]);
+
+		q->cparams.target = (target * NSEC_PER_USEC) >> CODEL_SHIFT;
+	}
+
+	if (tb[TCA_FQ_CODEL_INTERVAL]) {
+		u64 interval = nla_get_u32(tb[TCA_FQ_CODEL_INTERVAL]);
+
+		q->cparams.interval = (interval * NSEC_PER_USEC) >> CODEL_SHIFT;
+	}
+
+	if (tb[TCA_FQ_CODEL_LIMIT])
+		sch->limit = nla_get_u32(tb[TCA_FQ_CODEL_LIMIT]);
+
+	if (tb[TCA_FQ_CODEL_ECN])
+		q->cparams.ecn = !!nla_get_u32(tb[TCA_FQ_CODEL_ECN]);
+
+	if (tb[TCA_FQ_CODEL_QUANTUM])
+		q->quantum = max(256U, nla_get_u32(tb[TCA_FQ_CODEL_QUANTUM]));
+
+	while (sch->q.qlen > sch->limit) {
+		struct sk_buff *skb = fq_codel_dequeue(sch);
+
+		kfree_skb(skb);
+		q->cstats.drop_count++;
+	}
+	qdisc_tree_decrease_qlen(sch, q->cstats.drop_count);
+	q->cstats.drop_count = 0;
+
+	sch_tree_unlock(sch);
+	return 0;
+}
+
+static void *fq_codel_zalloc(size_t sz)
+{
+	void *ptr = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN);
+
+	if (!ptr)
+		ptr = vzalloc(sz);
+	return ptr;
+}
+
+static void fq_codel_free(void *addr)
+{
+	if (addr) {
+		if (is_vmalloc_addr(addr))
+			vfree(addr);
+		else
+			kfree(addr);
+	}
+}
+
+static void fq_codel_destroy(struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+
+	tcf_destroy_chain(&q->filter_list);
+	fq_codel_free(q->backlogs);
+	fq_codel_free(q->flows);
+}
+
+static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	int i;
+
+	sch->limit = 10*1024;
+	q->flows_cnt = 1024;
+	q->quantum = psched_mtu(qdisc_dev(sch));
+	q->perturbation = net_random();
+	INIT_LIST_HEAD(&q->new_flows);
+	INIT_LIST_HEAD(&q->old_flows);
+	codel_params_init(&q->cparams);
+	codel_stats_init(&q->cstats);
+	q->cparams.ecn = true;
+
+	if (opt) {
+		int err = fq_codel_change(sch, opt);
+		if (err)
+			return err;
+	}
+
+	if (!q->flows) {
+		q->flows = fq_codel_zalloc(q->flows_cnt *
+					   sizeof(struct fq_codel_flow));
+		if (!q->flows)
+			return -ENOMEM;
+		q->backlogs = fq_codel_zalloc(q->flows_cnt * sizeof(u32));
+		if (!q->backlogs) {
+			fq_codel_free(q->flows);
+			return -ENOMEM;
+		}
+		for (i = 0; i < q->flows_cnt; i++) {
+			struct fq_codel_flow *flow = q->flows + i;
+
+			INIT_LIST_HEAD(&flow->flowchain);
+		}
+	}
+	if (sch->limit >= 1)
+		sch->flags |= TCQ_F_CAN_BYPASS;
+	else
+		sch->flags &= ~TCQ_F_CAN_BYPASS;
+	return 0;
+}
+
+static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct nlattr *opts;
+
+	opts = nla_nest_start(skb, TCA_OPTIONS);
+	if (opts == NULL)
+		goto nla_put_failure;
+
+	if (nla_put_u32(skb, TCA_FQ_CODEL_TARGET,
+			codel_time_to_us(q->cparams.target)) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_LIMIT,
+			sch->limit) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_INTERVAL,
+			codel_time_to_us(q->cparams.interval)) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_ECN,
+			q->cparams.ecn) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_QUANTUM,
+			q->quantum) ||
+	    nla_put_u32(skb, TCA_FQ_CODEL_FLOWS,
+			q->flows_cnt))
+		goto nla_put_failure;
+
+	nla_nest_end(skb, opts);
+	return skb->len;
+
+nla_put_failure:
+	return -1;
+}
+
+static int fq_codel_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	struct tc_fq_codel_xstats st = {
+		.type				= TCA_FQ_CODEL_XSTATS_QDISC,
+		.qdisc_stats.maxpacket		= q->cstats.maxpacket,
+		.qdisc_stats.drop_overlimit	= q->drop_overlimit,
+		.qdisc_stats.ecn_mark		= q->cstats.ecn_mark,
+		.qdisc_stats.new_flow_count	= q->new_flow_count,
+	};
+	struct list_head *pos;
+
+	list_for_each(pos, &q->new_flows)
+		st.qdisc_stats.new_flows_len++;
+
+	list_for_each(pos, &q->old_flows)
+		st.qdisc_stats.old_flows_len++;
+
+	return gnet_stats_copy_app(d, &st, sizeof(st));
+}
+
+static struct Qdisc *fq_codel_leaf(struct Qdisc *sch, unsigned long arg)
+{
+	return NULL;
+}
+
+static unsigned long fq_codel_get(struct Qdisc *sch, u32 classid)
+{
+	return 0;
+}
+
+static unsigned long fq_codel_bind(struct Qdisc *sch, unsigned long parent,
+			      u32 classid)
+{
+	/* we cannot bypass queue discipline anymore */
+	sch->flags &= ~TCQ_F_CAN_BYPASS;
+	return 0;
+}
+
+static void fq_codel_put(struct Qdisc *q, unsigned long cl)
+{
+}
+
+static struct tcf_proto **fq_codel_find_tcf(struct Qdisc *sch, unsigned long cl)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+
+	if (cl)
+		return NULL;
+	return &q->filter_list;
+}
+
+static int fq_codel_dump_class(struct Qdisc *sch, unsigned long cl,
+			  struct sk_buff *skb, struct tcmsg *tcm)
+{
+	tcm->tcm_handle |= TC_H_MIN(cl);
+	return 0;
+}
+
+static int fq_codel_dump_class_stats(struct Qdisc *sch, unsigned long cl,
+				     struct gnet_dump *d)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	u32 idx = cl - 1;
+	struct gnet_stats_queue qs = { 0 };
+	struct tc_fq_codel_xstats xstats;
+
+	WARN_ON_ONCE(1);
+	if (idx < q->flows_cnt) {
+		const struct fq_codel_flow *flow = &q->flows[idx];
+		const struct sk_buff *skb = flow->head;
+
+		memset(&xstats, 0, sizeof(xstats));
+		xstats.type = TCA_FQ_CODEL_XSTATS_CLASS;
+		xstats.class_stats.deficit = flow->deficit;
+		xstats.class_stats.ldelay =
+			codel_time_to_us(flow->cvars.ldelay);
+		xstats.class_stats.count = flow->cvars.count;
+		xstats.class_stats.lastcount = flow->cvars.lastcount;
+		xstats.class_stats.dropping = flow->cvars.dropping;
+		if (flow->cvars.dropping) {
+			codel_tdiff_t delta = flow->cvars.drop_next -
+					      codel_get_time();
+
+			xstats.class_stats.drop_next = (delta >= 0) ?
+				codel_time_to_us(delta) :
+				-codel_time_to_us(-delta);
+		}
+		while (skb) {
+			qs.qlen++;
+			skb = skb->next;
+		}
+		qs.backlog = q->backlogs[idx];
+		qs.drops = flow->dropped;
+	}
+	if (gnet_stats_copy_queue(d, &qs) < 0)
+		return -1;
+	if (idx < q->flows_cnt)
+		return gnet_stats_copy_app(d, &xstats, sizeof(xstats));
+	return 0;
+}
+
+static void fq_codel_walk(struct Qdisc *sch, struct qdisc_walker *arg)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+	unsigned int i;
+
+	if (arg->stop)
+		return;
+
+	for (i = 0; i < q->flows_cnt; i++) {
+		if (list_empty(&q->flows[i].flowchain) ||
+		    arg->count < arg->skip) {
+			arg->count++;
+			continue;
+		}
+		if (arg->fn(sch, i + 1, arg) < 0) {
+			arg->stop = 1;
+			break;
+		}
+		arg->count++;
+	}
+}
+
+static const struct Qdisc_class_ops fq_codel_class_ops = {
+	.leaf		=	fq_codel_leaf,
+	.get		=	fq_codel_get,
+	.put		=	fq_codel_put,
+	.tcf_chain	=	fq_codel_find_tcf,
+	.bind_tcf	=	fq_codel_bind,
+	.unbind_tcf	=	fq_codel_put,
+	.dump		=	fq_codel_dump_class,
+	.dump_stats	=	fq_codel_dump_class_stats,
+	.walk		=	fq_codel_walk,
+};
+
+static struct Qdisc_ops fq_codel_qdisc_ops __read_mostly = {
+	.cl_ops		=	&fq_codel_class_ops,
+	.id		=	"fq_codel",
+	.priv_size	=	sizeof(struct fq_codel_sched_data),
+	.enqueue	=	fq_codel_enqueue,
+	.dequeue	=	fq_codel_dequeue,
+	.peek		=	qdisc_peek_dequeued,
+	.drop		=	fq_codel_drop,
+	.init		=	fq_codel_init,
+	.reset		=	fq_codel_reset,
+	.destroy	=	fq_codel_destroy,
+	.change		=	fq_codel_change,
+	.dump		=	fq_codel_dump,
+	.dump_stats =	fq_codel_dump_stats,
+	.owner		=	THIS_MODULE,
+};
+
+static int __init fq_codel_module_init(void)
+{
+	return register_qdisc(&fq_codel_qdisc_ops);
+}
+
+static void __exit fq_codel_module_exit(void)
+{
+	unregister_qdisc(&fq_codel_qdisc_ops);
+}
+
+module_init(fq_codel_module_init)
+module_exit(fq_codel_module_exit)
+MODULE_AUTHOR("Eric Dumazet");
+MODULE_LICENSE("GPL");

* [PATCH v2 iproute2] fq_codel: Fair Queue Codel AQM
  2012-05-11 19:30       ` [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM Eric Dumazet
@ 2012-05-11 19:49         ` Eric Dumazet
  2012-05-11 22:16         ` [PATCH v2 net-next] " Eric Dumazet
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2012-05-11 19:49 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, Changli Gao, netdev, Dave Taht, Kathleen Nichols,
	Van Jacobson, Tom Herbert, Matt Mathis, Yuchung Cheng,
	Stephen Hemminger, Maciej Żenczykowski, Nandita Dukkipati

From: Eric Dumazet <edumazet@google.com>

Fair Queue Codel packet scheduler

Principles :

- Packets are classified (internal classifier or external) on flows.
- This is a Stochastic model (as we use a hash, several flows might
                              be hashed on same slot)
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
  so that new flows have priority on old ones.

- For a given flow, packets are not reordered (CoDel uses a FIFO)
- head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)

tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                      [ target TIME ] [ interval TIME ] [ noecn ]
                      [ quantum BYTES ]

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Changli Gao <xiaosuo@gmail.com>
---
 include/linux/pkt_sched.h |   54 ++++++++
 tc/Makefile               |    1 
 tc/q_fq_codel.c           |  232 ++++++++++++++++++++++++++++++++++++
 3 files changed, 287 insertions(+)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index cde56c2..32aef0a 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -681,4 +681,58 @@ struct tc_codel_xstats {
 	__u32	dropping;  /* are we in dropping state ? */
 };
 
+/* FQ_CODEL */
+
+enum {
+	TCA_FQ_CODEL_UNSPEC,
+	TCA_FQ_CODEL_TARGET,
+	TCA_FQ_CODEL_LIMIT,
+	TCA_FQ_CODEL_INTERVAL,
+	TCA_FQ_CODEL_ECN,
+	TCA_FQ_CODEL_FLOWS,
+	TCA_FQ_CODEL_QUANTUM,
+	__TCA_FQ_CODEL_MAX
+};
+
+#define TCA_FQ_CODEL_MAX	(__TCA_FQ_CODEL_MAX - 1)
+
+enum {
+	TCA_FQ_CODEL_XSTATS_QDISC,
+	TCA_FQ_CODEL_XSTATS_CLASS,
+};
+
+struct tc_fq_codel_qd_stats {
+	__u32	maxpacket;	/* largest packet we've seen so far */
+	__u32	drop_overlimit; /* number of times max qdisc
+				 * packet limit was hit
+				 */
+	__u32	ecn_mark;	/* number of packets we ECN marked
+				 * instead of being dropped
+				 */
+	__u32	new_flow_count; /* number of times packets
+				 * created a 'new flow'
+				 */
+	__u32	new_flows_len;	/* count of flows in new list */
+	__u32	old_flows_len;	/* count of flows in old list */
+};
+
+struct tc_fq_codel_cl_stats {
+	__s32	deficit;
+	__u32	ldelay;		/* in-queue delay seen by most recently
+				 * dequeued packet
+				 */
+	__u32	count;
+	__u32	lastcount;
+	__u32	dropping;
+	__s32	drop_next;
+};
+
+struct tc_fq_codel_xstats {
+	__u32	type;
+	union {
+		struct tc_fq_codel_qd_stats qdisc_stats;
+		struct tc_fq_codel_cl_stats class_stats;
+	};
+};
+
 #endif
diff --git a/tc/Makefile b/tc/Makefile
index 8a7cc8d..64d93ad 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -48,6 +48,7 @@ TCMODULES += em_u32.o
 TCMODULES += em_meta.o
 TCMODULES += q_mqprio.o
 TCMODULES += q_codel.o
+TCMODULES += q_fq_codel.o
 
 TCSO :=
 ifeq ($(TC_CONFIG_ATM),y)
diff --git a/tc/q_fq_codel.c b/tc/q_fq_codel.c
new file mode 100644
index 0000000..3b3b074
--- /dev/null
+++ b/tc/q_fq_codel.c
@@ -0,0 +1,232 @@
+/*
+ * Fair Queue Codel
+ *
+ *  Copyright (C) 2012 Eric Dumazet <edumazet@google.com>
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions, and the following disclaimer,
+ *    without modification.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. The names of the authors may not be used to endorse or promote products
+ *    derived from this software without specific prior written permission.
+ *
+ * Alternatively, provided that this notice is retained in full, this
+ * software may be distributed under the terms of the GNU General
+ * Public License ("GPL") version 2, in which case the provisions of the
+ * GPL apply INSTEAD OF those given above.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
+ * DAMAGE.
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <syslog.h>
+#include <fcntl.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <string.h>
+
+#include "utils.h"
+#include "tc_util.h"
+
+static void explain(void)
+{
+	fprintf(stderr, "Usage: ... fq_codel [ limit PACKETS ] [ flows NUMBER ]\n");
+	fprintf(stderr, "                    [ target TIME ] [ interval TIME ]\n");
+	fprintf(stderr, "                    [ quantum BYTES ] [ [no]ecn ]\n");
+}
+
+static int fq_codel_parse_opt(struct qdisc_util *qu, int argc, char **argv,
+			      struct nlmsghdr *n)
+{
+	unsigned limit = 0;
+	unsigned flows = 0;
+	unsigned target = 0;
+	unsigned interval = 0;
+	unsigned quantum = 0;
+	int ecn = -1;
+	struct rtattr *tail;
+
+	while (argc > 0) {
+		if (strcmp(*argv, "limit") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&limit, *argv, 0)) {
+				fprintf(stderr, "Illegal \"limit\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "flows") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&flows, *argv, 0)) {
+				fprintf(stderr, "Illegal \"flows\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "quantum") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&quantum, *argv, 0)) {
+				fprintf(stderr, "Illegal \"quantum\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "target") == 0) {
+			NEXT_ARG();
+			if (get_time(&target, *argv)) {
+				fprintf(stderr, "Illegal \"target\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "interval") == 0) {
+			NEXT_ARG();
+			if (get_time(&interval, *argv)) {
+				fprintf(stderr, "Illegal \"interval\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "ecn") == 0) {
+			ecn = 1;
+		} else if (strcmp(*argv, "noecn") == 0) {
+			ecn = 0;
+		} else if (strcmp(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "What is \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		argc--; argv++;
+	}
+
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+	if (limit)
+		addattr_l(n, 1024, TCA_FQ_CODEL_LIMIT, &limit, sizeof(limit));
+	if (flows)
+		addattr_l(n, 1024, TCA_FQ_CODEL_FLOWS, &flows, sizeof(flows));
+	if (quantum)
+		addattr_l(n, 1024, TCA_FQ_CODEL_QUANTUM, &quantum, sizeof(quantum));
+	if (interval)
+		addattr_l(n, 1024, TCA_FQ_CODEL_INTERVAL, &interval, sizeof(interval));
+	if (target)
+		addattr_l(n, 1024, TCA_FQ_CODEL_TARGET, &target, sizeof(target));
+	if (ecn != -1)
+		addattr_l(n, 1024, TCA_FQ_CODEL_ECN, &ecn, sizeof(ecn));
+	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
+	return 0;
+}
+
+static int fq_codel_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
+{
+	struct rtattr *tb[TCA_FQ_CODEL_MAX + 1];
+	unsigned limit;
+	unsigned flows;
+	unsigned interval;
+	unsigned target;
+	unsigned ecn;
+	unsigned quantum;
+	SPRINT_BUF(b1);
+
+	if (opt == NULL)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_FQ_CODEL_MAX, opt);
+
+	if (tb[TCA_FQ_CODEL_LIMIT] &&
+	    RTA_PAYLOAD(tb[TCA_FQ_CODEL_LIMIT]) >= sizeof(__u32)) {
+		limit = rta_getattr_u32(tb[TCA_FQ_CODEL_LIMIT]);
+		fprintf(f, "limit %up ", limit);
+	}
+	if (tb[TCA_FQ_CODEL_FLOWS] &&
+	    RTA_PAYLOAD(tb[TCA_FQ_CODEL_FLOWS]) >= sizeof(__u32)) {
+		flows = rta_getattr_u32(tb[TCA_FQ_CODEL_FLOWS]);
+		fprintf(f, "flows %u ", flows);
+	}
+	if (tb[TCA_FQ_CODEL_QUANTUM] &&
+	    RTA_PAYLOAD(tb[TCA_FQ_CODEL_QUANTUM]) >= sizeof(__u32)) {
+		quantum = rta_getattr_u32(tb[TCA_FQ_CODEL_QUANTUM]);
+		fprintf(f, "quantum %u ", quantum);
+	}
+	if (tb[TCA_FQ_CODEL_TARGET] &&
+	    RTA_PAYLOAD(tb[TCA_FQ_CODEL_TARGET]) >= sizeof(__u32)) {
+		target = rta_getattr_u32(tb[TCA_FQ_CODEL_TARGET]);
+		fprintf(f, "target %s ", sprint_time(target, b1));
+	}
+	if (tb[TCA_FQ_CODEL_INTERVAL] &&
+	    RTA_PAYLOAD(tb[TCA_FQ_CODEL_INTERVAL]) >= sizeof(__u32)) {
+		interval = rta_getattr_u32(tb[TCA_FQ_CODEL_INTERVAL]);
+		fprintf(f, "interval %s ", sprint_time(interval, b1));
+	}
+	if (tb[TCA_FQ_CODEL_ECN] &&
+	    RTA_PAYLOAD(tb[TCA_FQ_CODEL_ECN]) >= sizeof(__u32)) {
+		ecn = rta_getattr_u32(tb[TCA_FQ_CODEL_ECN]);
+		if (ecn)
+			fprintf(f, "ecn ");
+	}
+
+	return 0;
+}
+
+static int fq_codel_print_xstats(struct qdisc_util *qu, FILE *f,
+				 struct rtattr *xstats)
+{
+	struct tc_fq_codel_xstats *st;
+	SPRINT_BUF(b1);
+
+	if (xstats == NULL)
+		return 0;
+
+	if (RTA_PAYLOAD(xstats) < sizeof(*st))
+		return -1;
+
+	st = RTA_DATA(xstats);
+	if (st->type == TCA_FQ_CODEL_XSTATS_QDISC) {
+		fprintf(f, "  maxpacket %u drop_overlimit %u new_flow_count %u ecn_mark %u",
+			st->qdisc_stats.maxpacket,
+			st->qdisc_stats.drop_overlimit,
+			st->qdisc_stats.new_flow_count,
+			st->qdisc_stats.ecn_mark);
+		fprintf(f, "\n  new_flows_len %u old_flows_len %u",
+			st->qdisc_stats.new_flows_len,
+			st->qdisc_stats.old_flows_len);
+	}
+	if (st->type == TCA_FQ_CODEL_XSTATS_CLASS) {
+		fprintf(f, "  deficit %d count %u lastcount %u ldelay %s",
+			st->class_stats.deficit,
+			st->class_stats.count,
+			st->class_stats.lastcount,
+			sprint_time(st->class_stats.ldelay, b1));
+		if (st->class_stats.dropping) {
+			fprintf(f, " dropping");
+			if (st->class_stats.drop_next < 0)
+				fprintf(f, " drop_next -%s",
+					sprint_time(-st->class_stats.drop_next, b1));
+			else
+				fprintf(f, " drop_next %s",
+					sprint_time(st->class_stats.drop_next, b1));
+		}
+	}
+	return 0;
+
+}
+
+struct qdisc_util fq_codel_qdisc_util = {
+	.id		= "fq_codel",
+	.parse_qopt	= fq_codel_parse_opt,
+	.print_qopt	= fq_codel_print_opt,
+	.print_xstats	= fq_codel_print_xstats,
+};
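
A consumer of struct tc_fq_codel_xstats has to check the type field before
reading either member of the union. The block below is a minimal user-space
sketch of that pattern, not part of the patch: every demo_ name, the
trimmed-down mirror structs and the numeric tag values are assumptions made
for illustration, and real code would take the definitions from
<linux/pkt_sched.h> instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tag values; real code uses the kernel header's constants. */
enum { DEMO_XSTATS_QDISC, DEMO_XSTATS_CLASS };

/* Trimmed-down mirrors of the patch's structs (illustration only). */
struct demo_qd_stats { uint32_t maxpacket; uint32_t drop_overlimit; };
struct demo_cl_stats { int32_t deficit; uint32_t ldelay; };

struct demo_xstats {
	uint32_t type;			/* selects the valid union member */
	union {
		struct demo_qd_stats qdisc_stats;
		struct demo_cl_stats class_stats;
	};
};

/*
 * Format one stats record into buf.
 * Returns 0 on success, -1 if the payload is too short or the type is unknown.
 */
static int demo_format_xstats(const void *payload, size_t len,
			      char *buf, size_t buflen)
{
	struct demo_xstats st;

	assert(payload != NULL && buf != NULL);
	if (len < sizeof(st))
		return -1;		/* reject short payloads before reading */
	memcpy(&st, payload, sizeof(st));	/* copy out; payload may be unaligned */

	switch (st.type) {
	case DEMO_XSTATS_QDISC:
		snprintf(buf, buflen, "maxpacket %u drop_overlimit %u",
			 (unsigned)st.qdisc_stats.maxpacket,
			 (unsigned)st.qdisc_stats.drop_overlimit);
		return 0;
	case DEMO_XSTATS_CLASS:
		snprintf(buf, buflen, "deficit %d ldelay %u",
			 (int)st.class_stats.deficit,
			 (unsigned)st.class_stats.ldelay);
		return 0;
	default:
		return -1;		/* unknown tag: do not guess at the layout */
	}
}
```

The sketch returns an error for an unknown type value, so a reader built
against an older header would not misinterpret a record it does not know.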


* Re: pch_gbe oops with vlan
  2012-05-11 18:49       ` pch_gbe oops with vlan Andy Cress
@ 2012-05-11 20:36         ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2012-05-11 20:36 UTC (permalink / raw)
  To: andy.cress; +Cc: netdev


Ummm, no.  You can't do this.

You replied to Eric Dumazet's patch posting, which is completely
unrelated to what you want to post about.  Then you edited the
Subject: and thought that was OK.

This is wrong, because the thread ID and other related fields still
refer to Eric's posting, so all thread indexing facilities still
think your posting is a reply to Eric's.

Don't do this, write a new email to the list properly.
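
The threading point can be illustrated with a small sketch. Mail archives
generally build threads from the In-Reply-To and References headers rather
than from the Subject, so a reply with an edited subject still lands in the
original thread. The helper names below are hypothetical, and the check is a
simplification that ignores folded (multi-line) headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Return 1 if some line of the header block starts with `name` (case-insensitive). */
static int demo_has_header(const char *headers, const char *name)
{
	size_t n;
	const char *line = headers;

	assert(headers != NULL && name != NULL);
	n = strlen(name);
	while (*line) {
		if (strncasecmp(line, name, n) == 0)
			return 1;
		line = strchr(line, '\n');
		if (!line)
			break;
		line++;		/* step past the newline to the next header line */
	}
	return 0;
}

/* A mail is filed as a reply if it still references an earlier Message-ID. */
static int demo_is_threaded_reply(const char *headers)
{
	return demo_has_header(headers, "In-Reply-To:") ||
	       demo_has_header(headers, "References:");
}
```

Under these assumptions, a message with a fresh Subject but a leftover
In-Reply-To line is still reported as a reply, which is the situation
described above.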


* Re: [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM
  2012-05-11 19:30       ` [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM Eric Dumazet
  2012-05-11 19:49         ` [PATCH v2 iproute2] " Eric Dumazet
@ 2012-05-11 22:16         ` Eric Dumazet
  2012-05-11 22:17           ` David Miller
  2012-05-12 19:55           ` David Miller
  1 sibling, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2012-05-11 22:16 UTC (permalink / raw)
  To: Changli Gao
  Cc: David Miller, netdev, Dave Taht, Kathleen Nichols, Van Jacobson,
	Tom Herbert, Matt Mathis, Yuchung Cheng, Stephen Hemminger,
	Maciej Żenczykowski, Nandita Dukkipati

On Fri, 2012-05-11 at 21:30 +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>

...

> +static int fq_codel_dump_class_stats(struct Qdisc *sch, unsigned long cl,
> +				     struct gnet_dump *d)
> +{
> +	struct fq_codel_sched_data *q = qdisc_priv(sch);
> +	u32 idx = cl - 1;
> +	struct gnet_stats_queue qs = { 0 };
> +	struct tc_fq_codel_xstats xstats;
> +
> +	WARN_ON_ONCE(1);
> +	if (idx < q->flows_cnt) {
> +		const struct fq_codel_flow *flow = &q->flows[idx];
> +		const struct sk_buff *skb = flow->head;

Oh well, I forgot to remove this WARN_ON_ONCE(1)


* Re: [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM
  2012-05-11 22:16         ` [PATCH v2 net-next] " Eric Dumazet
@ 2012-05-11 22:17           ` David Miller
  2012-05-12 19:55           ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2012-05-11 22:17 UTC (permalink / raw)
  To: eric.dumazet
  Cc: xiaosuo, netdev, dave.taht, nichols, van, therbert, mattmathis,
	ycheng, shemminger, maze, nanditad

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 12 May 2012 00:16:16 +0200

> On Fri, 2012-05-11 at 21:30 +0200, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google.com>
> 
> ...
> 
>> +static int fq_codel_dump_class_stats(struct Qdisc *sch, unsigned long cl,
>> +				     struct gnet_dump *d)
>> +{
>> +	struct fq_codel_sched_data *q = qdisc_priv(sch);
>> +	u32 idx = cl - 1;
>> +	struct gnet_stats_queue qs = { 0 };
>> +	struct tc_fq_codel_xstats xstats;
>> +
>> +	WARN_ON_ONCE(1);
>> +	if (idx < q->flows_cnt) {
>> +		const struct fq_codel_flow *flow = &q->flows[idx];
>> +		const struct sk_buff *skb = flow->head;
> 
> Oh well, I forgot to remove this WARN_ON_ONCE(1)

I can do it.


* Re: [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM
  2012-05-11 22:16         ` [PATCH v2 net-next] " Eric Dumazet
  2012-05-11 22:17           ` David Miller
@ 2012-05-12 19:55           ` David Miller
  2012-05-12 20:42             ` Eric Dumazet
  1 sibling, 1 reply; 12+ messages in thread
From: David Miller @ 2012-05-12 19:55 UTC (permalink / raw)
  To: eric.dumazet
  Cc: xiaosuo, netdev, dave.taht, nichols, van, therbert, mattmathis,
	ycheng, shemminger, maze, nanditad

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 12 May 2012 00:16:16 +0200

> On Fri, 2012-05-11 at 21:30 +0200, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google.com>
> 
> ...
> 
>> +static int fq_codel_dump_class_stats(struct Qdisc *sch, unsigned long cl,
>> +				     struct gnet_dump *d)
>> +{
>> +	struct fq_codel_sched_data *q = qdisc_priv(sch);
>> +	u32 idx = cl - 1;
>> +	struct gnet_stats_queue qs = { 0 };
>> +	struct tc_fq_codel_xstats xstats;
>> +
>> +	WARN_ON_ONCE(1);
>> +	if (idx < q->flows_cnt) {
>> +		const struct fq_codel_flow *flow = &q->flows[idx];
>> +		const struct sk_buff *skb = flow->head;
> 
> Oh well, I forgot to remove this WARN_ON_ONCE(1)

I applied this with the WARN_ON_ONCE(1) removed but there was another
problem.

When you included ping output in your commit message, the "---" string
in it told git that this was the end of the commit message, when in fact
there was more content, including your signoff.

I caught it and fixed it up, but please be more mindful of this in
the future.

Thanks.
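
A commit message can be checked for this problem before the patch is sent.
The function below is a rough sketch modelled loosely on how git's mail
parsing recognizes the start of a patch; it is an assumption-laden
simplification, not git's exact rule, and the function name is hypothetical.
It flags a line that begins with "---" and is followed either by a space and
more text (as in ping's "--- host ping statistics ---") or by nothing but
whitespace.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `line` is likely to be read as the end of the commit message. */
static int demo_is_separator_line(const char *line)
{
	size_t i;

	assert(line != NULL);
	if (strncmp(line, "---", 3) != 0)
		return 0;
	/* "--- something": looks like the "--- <filename>" line of a diff */
	if (line[3] == ' ' && line[4] != '\0' &&
	    !isspace((unsigned char)line[4]))
		return 1;
	/* "---" followed only by whitespace: a manual separator */
	for (i = 3; line[i] != '\0'; i++)
		if (!isspace((unsigned char)line[i]))
			return 0;
	return 1;
}
```

Running each line of a draft commit message through such a check and
indenting any flagged line (for example, pasted ping output) by a few spaces
is one way to keep it from being mistaken for a separator.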


* Re: [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM
  2012-05-12 19:55           ` David Miller
@ 2012-05-12 20:42             ` Eric Dumazet
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2012-05-12 20:42 UTC (permalink / raw)
  To: David Miller
  Cc: xiaosuo, netdev, dave.taht, nichols, van, therbert, mattmathis,
	ycheng, shemminger, maze, nanditad

On Sat, 2012-05-12 at 15:55 -0400, David Miller wrote:

> I applied this with the WARN_ON_ONCE(1) removed but there was another
> problem.
> 
> When you include ping output in your commit message that "---" string
> told GIT that this was the end of the commit message when in fact
> there was more content including your signoff.
> 
> I caught it and fixed it up, but please be more mindful of this in
> the future.

Oops... I am sorry, and I fear this will happen again, because I use
"ping" a lot. I'll try to remember this.

Thanks a lot David.


Thread overview: 12+ messages
2012-05-11 13:59 [PATCH net-next] fq_codel: Fair Queue Codel AQM Eric Dumazet
2012-05-11 15:03 ` Changli Gao
2012-05-11 15:23   ` Eric Dumazet
2012-05-11 16:08     ` Eric Dumazet
2012-05-11 18:49       ` pch_gbe oops with vlan Andy Cress
2012-05-11 20:36         ` David Miller
2012-05-11 19:30       ` [PATCH v2 net-next] fq_codel: Fair Queue Codel AQM Eric Dumazet
2012-05-11 19:49         ` [PATCH v2 iproute2] " Eric Dumazet
2012-05-11 22:16         ` [PATCH v2 net-next] " Eric Dumazet
2012-05-11 22:17           ` David Miller
2012-05-12 19:55           ` David Miller
2012-05-12 20:42             ` Eric Dumazet
