* Re: [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable [not found] <201607120341.UDv5wy6R%fengguang.wu@intel.com> @ 2016-07-14 14:14 ` Jiri Kosina 2016-07-28 12:51 ` Fengguang Wu 0 siblings, 1 reply; 5+ messages in thread From: Jiri Kosina @ 2016-07-14 14:14 UTC (permalink / raw) To: kbuild test robot; +Cc: kbuild-all, linux-kernel, netdev [ added CCs ] On Tue, 12 Jul 2016, kbuild test robot wrote: > Hi, > > [auto build test ERROR on net/master] > [also build test ERROR on v4.7-rc7 next-20160711] > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] > > url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160711-220527 > config: arm-tct_hammer_defconfig (attached as .config) > compiler: arm-linux-gnueabi-gcc (Debian 5.3.1-8) 5.3.1 20160205 > reproduce: > wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # save the attached .config to linux build tree > make.cross ARCH=arm > > All errors (new ones prefixed by >>): > > net/built-in.o: In function `dev_activate': > >> wext-proc.c:(.text+0x38544): undefined reference to `qdisc_hash_add' This issue is be there even without my patch (but with qdisc_list_add instead), isn't it? The problem is that sch_generic.c (where dev_activate() is) is being compiled everytime CONFIG_NET is set, but sch_api.c (where qdisc_list_add() is defined) only when CONFIG_NET_SCHED is set (and there is no stub for !CONFIG_NET_SCHED case). -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable 2016-07-14 14:14 ` [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable Jiri Kosina @ 2016-07-28 12:51 ` Fengguang Wu 0 siblings, 0 replies; 5+ messages in thread From: Fengguang Wu @ 2016-07-28 12:51 UTC (permalink / raw) To: Jiri Kosina; +Cc: kbuild-all, linux-kernel, netdev Hi Jiri, On Thu, Jul 14, 2016 at 04:14:58PM +0200, Jiri Kosina wrote: > >[ added CCs ] > >On Tue, 12 Jul 2016, kbuild test robot wrote: > >> Hi, >> >> [auto build test ERROR on net/master] >> [also build test ERROR on v4.7-rc7 next-20160711] >> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] >> >> url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160711-220527 >> config: arm-tct_hammer_defconfig (attached as .config) >> compiler: arm-linux-gnueabi-gcc (Debian 5.3.1-8) 5.3.1 20160205 >> reproduce: >> wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross >> chmod +x ~/bin/make.cross >> # save the attached .config to linux build tree >> make.cross ARCH=arm >> >> All errors (new ones prefixed by >>): >> >> net/built-in.o: In function `dev_activate': >> >> wext-proc.c:(.text+0x38544): undefined reference to `qdisc_hash_add' > >This issue is be there even without my patch (but with qdisc_list_add >instead), isn't it? Yes it looks so, this number happens in a number of places: dns_query.c:(.text+0x39b84): undefined reference to `qdisc_hash_add' include/linux/netdevice.h:1935: undefined reference to `qdisc_hash_add' net/core/netevent.c:31: undefined reference to `qdisc_hash_add' net/sched/sch_generic.c:789: undefined reference to `qdisc_hash_add' sch_generic.c:(.text+0x33487): undefined reference to `qdisc_hash_add' switchdev.c:(.text+0x3bf58): undefined reference to `qdisc_hash_add' sysctl_net.c:(.text+0x31f70): undefined reference to `qdisc_hash_add' (.text.dev_activate+0x228): undefined reference to `qdisc_hash_add' (.text+0x37d0b): undefined reference to `qdisc_hash_add' wext-proc.c:(.text+0x390a8): undefined reference to `qdisc_hash_add' >The problem is that sch_generic.c (where dev_activate() is) is being >compiled everytime CONFIG_NET is set, but sch_api.c (where >qdisc_list_add() is defined) only when CONFIG_NET_SCHED is set (and there >is no stub for !CONFIG_NET_SCHED case). So it looks like a more general problem than specific to this patch. Thanks, Fengguang ^ permalink raw reply [flat|nested] 5+ messages in thread
* Deleting child qdisc doesn't reset parent to default qdisc? @ 2016-04-14 14:44 Jiri Kosina 2016-04-14 15:01 ` Eric Dumazet 0 siblings, 1 reply; 5+ messages in thread From: Jiri Kosina @ 2016-04-14 14:44 UTC (permalink / raw) To: Jamal Hadi Salim; +Cc: netdev, linux-kernel Hi, I've came across the behavior where adding a child qdisc and then deleting it again makes the networking dysfunctional (I guess that's because all of a sudden there is absolutely no working qdisc on the device, although there originally was a default one in the parent). In a nutshell, is this expected behavior or bug? ===== jikos:~ # tc qdisc show qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms jikos:~ # ping -c 1 nix.cz | head -2 PING nix.cz (195.47.235.3) 56(84) bytes of data. 64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.59 ms jikos:~ # tc qdisc add dev eth0 parent 10:1 sfq jikos:~ # tc qdisc show qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms qdisc sfq 8008: dev eth0 parent 10:1 limit 127p quantum 1514b depth 127 divisor 1024 jikos:~ # ping -c 1 nix.cz | head -2 PING nix.cz (195.47.235.3) 56(84) bytes of data. 64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.67 ms jikos:~ # tc qdisc del dev eth0 parent 10:1 sfq jikos:~ # tc qdisc show qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms jikos:~ # ping -c 1 nix.cz | head -2 PING nix.cz (195.47.235.3) 56(84) bytes of data. [ ... nothing happens ... ] ^C jikos:~ # tc qdisc add dev eth0 parent 10:1 sfq jikos:~ # ping -c 1 nix.cz | head -2 PING nix.cz (195.47.235.3) 56(84) bytes of data. 64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.66 ms ===== Thanks, -- Jiri Kosina ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc? 2016-04-14 14:44 Deleting child qdisc doesn't reset parent to default qdisc? Jiri Kosina @ 2016-04-14 15:01 ` Eric Dumazet 2016-04-14 15:18 ` Phil Sutter 0 siblings, 1 reply; 5+ messages in thread From: Eric Dumazet @ 2016-04-14 15:01 UTC (permalink / raw) To: Jiri Kosina; +Cc: Jamal Hadi Salim, netdev, linux-kernel On Thu, 2016-04-14 at 16:44 +0200, Jiri Kosina wrote: > Hi, > > I've came across the behavior where adding a child qdisc and then deleting > it again makes the networking dysfunctional (I guess that's because all of > a sudden there is absolutely no working qdisc on the device, although > there originally was a default one in the parent). > > In a nutshell, is this expected behavior or bug? This is the expected behavior. If the kernel was suddenly doing a 'replace' when you ask a delete, then the scripts doing a delete , than a add would break. tc users are skilled admins ;) ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc? 2016-04-14 15:01 ` Eric Dumazet @ 2016-04-14 15:18 ` Phil Sutter 2016-04-14 16:08 ` Jiri Kosina 0 siblings, 1 reply; 5+ messages in thread From: Phil Sutter @ 2016-04-14 15:18 UTC (permalink / raw) To: Eric Dumazet; +Cc: Jiri Kosina, Jamal Hadi Salim, netdev, linux-kernel On Thu, Apr 14, 2016 at 08:01:39AM -0700, Eric Dumazet wrote: > On Thu, 2016-04-14 at 16:44 +0200, Jiri Kosina wrote: > > Hi, > > > > I've came across the behavior where adding a child qdisc and then deleting > > it again makes the networking dysfunctional (I guess that's because all of > > a sudden there is absolutely no working qdisc on the device, although > > there originally was a default one in the parent). > > > > In a nutshell, is this expected behavior or bug? > > This is the expected behavior. OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default one upon deletion instead of noop_qdisc, hence I would describe the situation using the words 'inconsistent' and 'accident' rather than 'expected'. :) Anyhow, the problem with skilled admins is they accept quirks too easily and just build their scripts around them - the same scripts we have to keep compatible to then. Cheers, Phil ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc? 2016-04-14 15:18 ` Phil Sutter @ 2016-04-14 16:08 ` Jiri Kosina 2016-04-14 17:49 ` Eric Dumazet 0 siblings, 1 reply; 5+ messages in thread From: Jiri Kosina @ 2016-04-14 16:08 UTC (permalink / raw) To: Phil Sutter; +Cc: Eric Dumazet, Jamal Hadi Salim, netdev, linux-kernel On Thu, 14 Apr 2016, Phil Sutter wrote: > > > I've came across the behavior where adding a child qdisc and then deleting > > > it again makes the networking dysfunctional (I guess that's because all of > > > a sudden there is absolutely no working qdisc on the device, although > > > there originally was a default one in the parent). > > > > > > In a nutshell, is this expected behavior or bug? > > > > This is the expected behavior. > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default > one upon deletion instead of noop_qdisc, hence I would describe > the situation using the words 'inconsistent' and 'accident' rather than > 'expected'. :) Would a patch that'd unify this in a sense that all qdiscs would assign the default one upon deletion acceptable? Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc? 2016-04-14 16:08 ` Jiri Kosina @ 2016-04-14 17:49 ` Eric Dumazet 2016-04-15 12:42 ` Jamal Hadi Salim 0 siblings, 1 reply; 5+ messages in thread From: Eric Dumazet @ 2016-04-14 17:49 UTC (permalink / raw) To: Jiri Kosina; +Cc: Phil Sutter, Jamal Hadi Salim, netdev, linux-kernel On Thu, 2016-04-14 at 18:08 +0200, Jiri Kosina wrote: > On Thu, 14 Apr 2016, Phil Sutter wrote: > > > > > I've came across the behavior where adding a child qdisc and then deleting > > > > it again makes the networking dysfunctional (I guess that's because all of > > > > a sudden there is absolutely no working qdisc on the device, although > > > > there originally was a default one in the parent). > > > > > > > > In a nutshell, is this expected behavior or bug? > > > > > > This is the expected behavior. > > > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default > > one upon deletion instead of noop_qdisc, hence I would describe > > the situation using the words 'inconsistent' and 'accident' rather than > > 'expected'. :) > > Would a patch that'd unify this in a sense that all qdiscs would assign > the default one upon deletion acceptable? > And what would be the chosen behavior ? Relying on TBF installing a bfifo for you at delete would be hazardous. For example CBQ got it differently than HFSC If qdisc_create_dflt() fails in CBQ, we fail the 'delete', while HFSC falls back to noop_qdisc, without warning the user :( At least always using noop_qdisc is consistent. No magic there. Doing 'unification' right now would break existing scripts. This is too late, I am afraid. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc? 2016-04-14 17:49 ` Eric Dumazet @ 2016-04-15 12:42 ` Jamal Hadi Salim 2016-04-15 14:58 ` Eric Dumazet 0 siblings, 1 reply; 5+ messages in thread From: Jamal Hadi Salim @ 2016-04-15 12:42 UTC (permalink / raw) To: Eric Dumazet, Jiri Kosina; +Cc: Phil Sutter, netdev, linux-kernel On 16-04-14 01:49 PM, Eric Dumazet wrote: > And what would be the chosen behavior ? > TBF is probably a bad example because it started life as a classless qdisc. There was only one built-in fifo queue that was shaped. Then someone made it classful and changed this behavior. To me it sounds reasonable to have the default behavior restored. At minimal consistency. > Relying on TBF installing a bfifo for you at delete would be hazardous. > > For example CBQ got it differently than HFSC > > If qdisc_create_dflt() fails in CBQ, we fail the 'delete', while HFSC > falls back to noop_qdisc, without warning the user :( > > At least always using noop_qdisc is consistent. No magic there. > > Doing 'unification' right now would break existing scripts. > > This is too late, I am afraid. Sigh. So rant: IMO, we should let any new APIs and API updates stay longer in discussion. Or better mark them as unstable for sometime. The excuse that "it is out in the wild therefore cant be changed" is harmful because the timeline is "forever" whereas patches are applied after a short period of posting and discussions and sometimes not involving the right people. It is like having a jury issuing a death sentence after 1 week of deliberation. You cant take it back after execution. cheers, jamal ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc? 2016-04-15 12:42 ` Jamal Hadi Salim @ 2016-04-15 14:58 ` Eric Dumazet 2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?) Jiri Kosina 0 siblings, 1 reply; 5+ messages in thread From: Eric Dumazet @ 2016-04-15 14:58 UTC (permalink / raw) To: Jamal Hadi Salim; +Cc: Jiri Kosina, Phil Sutter, netdev, linux-kernel On Fri, 2016-04-15 at 08:42 -0400, Jamal Hadi Salim wrote: > On 16-04-14 01:49 PM, Eric Dumazet wrote: > > > And what would be the chosen behavior ? > > > > TBF is probably a bad example because it started life as > a classless qdisc. There was only one built-in fifo queue > that was shaped. Then someone made it classful and changed > this behavior. To me it sounds reasonable to have the > default behavior restored. At minimal consistency. Then you need to save the initial qdisc (bfifo for TBF) in a special place, to make sure the delete operation is guaranteed to succeed. Or fail the delete if the bfifo can not be allocated. I can tell that determinism if far more interesting than usability for some users occasionally playing with tc. Surely the silent fallback to noop_qdisc is wrong. Anyway, we probably need to improve our ability to understand qdisc hierarchies. Having some hidden qdiscs is the real problem here. We need to add some hash table so that qdisc_match_from_root() does not have to scan hundred of qdiscs. ^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?) 2016-04-15 14:58 ` Eric Dumazet @ 2016-07-07 9:04 ` Jiri Kosina 2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina 0 siblings, 1 reply; 5+ messages in thread From: Jiri Kosina @ 2016-07-07 9:04 UTC (permalink / raw) To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel On Fri, 15 Apr 2016, Eric Dumazet wrote: > Anyway, we probably need to improve our ability to understand qdisc > hierarchies. Having some hidden qdiscs is the real problem here. > > We need to add some hash table so that qdisc_match_from_root() does not > have to scan hundred of qdiscs. So how about something like the patch below? I already have preliminary patches on top which unhide the default qdiscs, but let's make this one step after the other. Thanks. From: Jiri Kosina <jkosina@suse.cz> Subject: [PATCH] net: sched: convert qdisc linked list to hashtable Convert the per-device linked list into a hashtable. The primary motivation for this change is that currently, we're not tracking all the qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup performed over the linked list by qdisc_match_from_root() is rather expensive. The ultimate goal is to get rid of hidden qdiscs completely, which will bring much more determinism in user experience. Signed-off-by: Jiri Kosina <jkosina@suse.cz> --- include/linux/netdevice.h | 2 ++ include/net/pkt_sched.h | 4 ++-- include/net/sch_generic.h | 2 +- net/core/dev.c | 1 + net/sched/sch_api.c | 23 +++++++++++++---------- net/sched/sch_generic.c | 6 +++--- net/sched/sch_mq.c | 2 +- net/sched/sch_mqprio.c | 2 +- 8 files changed, 24 insertions(+), 18 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f45929c..630838e 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -52,6 +52,7 @@ #include <uapi/linux/netdevice.h> #include <uapi/linux/if_bonding.h> #include <uapi/linux/pkt_cls.h> +#include <linux/hashtable.h> struct netpoll_info; struct device; @@ -1778,6 +1779,7 @@ struct net_device { unsigned int num_tx_queues; unsigned int real_num_tx_queues; struct Qdisc *qdisc; + DECLARE_HASHTABLE (qdisc_hash, 16); unsigned long tx_queue_len; spinlock_t tx_global_lock; int watchdog_timeo; diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index fea53f4..8ba11b4 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops); void qdisc_get_default(char *id, size_t len); int qdisc_set_default(const char *id); -void qdisc_list_add(struct Qdisc *q); -void qdisc_list_del(struct Qdisc *q); +void qdisc_hash_add(struct Qdisc *q); +void qdisc_hash_del(struct Qdisc *q); struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle); struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle); struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r, diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 62d5531..26f5cb3 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -67,7 +67,7 @@ struct Qdisc { u32 limit; const struct Qdisc_ops *ops; struct qdisc_size_table __rcu *stab; - struct list_head list; + struct hlist_node hash; u32 handle; u32 parent; int (*reshape_fail)(struct sk_buff *skb, diff --git a/net/core/dev.c b/net/core/dev.c index 904ff43..edc1617 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7511,6 +7511,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, INIT_LIST_HEAD(&dev->all_adj_list.lower); INIT_LIST_HEAD(&dev->ptype_all); INIT_LIST_HEAD(&dev->ptype_specific); + hash_init(dev->qdisc_hash); dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM; setup(dev); diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index ddf047d..82953cb 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -29,6 +29,7 @@ #include <linux/hrtimer.h> #include <linux/lockdep.h> #include <linux/slab.h> +#include <linux/hashtable.h> #include <net/net_namespace.h> #include <net/sock.h> @@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle) root->handle == handle) return root; - list_for_each_entry_rcu(q, &root->list, list) { + hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) { if (q->handle == handle) return q; } return NULL; } -void qdisc_list_add(struct Qdisc *q) +void qdisc_hash_add(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) { struct Qdisc *root = qdisc_dev(q)->qdisc; WARN_ON_ONCE(root == &noop_qdisc); ASSERT_RTNL(); - list_add_tail_rcu(&q->list, &root->list); + hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle); } } -EXPORT_SYMBOL(qdisc_list_add); +EXPORT_SYMBOL(qdisc_hash_add); -void qdisc_list_del(struct Qdisc *q) +void qdisc_hash_del(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) { ASSERT_RTNL(); - list_del_rcu(&q->list); + hash_del_rcu(&q->hash); } } -EXPORT_SYMBOL(qdisc_list_del); +EXPORT_SYMBOL(qdisc_hash_del); struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle) { @@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue, goto err_out4; } - qdisc_list_add(sch); + qdisc_hash_add(sch); return sch; } @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb, { int ret = 0, q_idx = *q_idx_p; struct Qdisc *q; + int b; if (!root) return 0; @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb, goto done; q_idx++; } - list_for_each_entry(q, &root->list, list) { + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { if (q_idx < s_q_idx) { q_idx++; continue; @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, int *t_p, int s_t) { struct Qdisc *q; + int b; if (!root) return 0; @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) return -1; - list_for_each_entry(q, &root->list, list) { + hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) { if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) return -1; } diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index f9e0e9c..7efc923 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = { .dequeue = noop_dequeue, .flags = TCQ_F_BUILTIN, .ops = &noop_qdisc_ops, - .list = LIST_HEAD_INIT(noop_qdisc.list), .q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock), .dev_queue = &noop_netdev_queue, .busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock), @@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue, sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p); sch->padded = (char *) sch - (char *) p; } - INIT_LIST_HEAD(&sch->list); skb_queue_head_init(&sch->q); spin_lock_init(&sch->busylock); @@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc) return; #ifdef CONFIG_NET_SCHED - qdisc_list_del(qdisc); + qdisc_hash_del(qdisc); qdisc_put_stab(rtnl_dereference(qdisc->stab)); #endif @@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev) qdisc->ops->attach(qdisc); } } + if (dev->qdisc) + qdisc_hash_add(dev->qdisc); } static void transition_one_qdisc(struct net_device *dev, diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c index 56a77b8..3bee15d 100644 --- a/net/sched/sch_mq.c +++ b/net/sched/sch_mq.c @@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch) qdisc_destroy(old); #ifdef CONFIG_NET_SCHED if (ntx < dev->real_num_tx_queues) - qdisc_list_add(qdisc); + qdisc_hash_add(qdisc); #endif } diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index b8002ce..dbfb3a5 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch) if (old) qdisc_destroy(old); if (ntx < dev->real_num_tx_queues) - qdisc_list_add(qdisc); + qdisc_hash_add(qdisc); } kfree(priv->qdiscs); priv->qdiscs = NULL; -- Jiri Kosina SUSE Labs ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable 2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?) Jiri Kosina @ 2016-07-07 20:36 ` Jiri Kosina 2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina 0 siblings, 1 reply; 5+ messages in thread From: Jiri Kosina @ 2016-07-07 20:36 UTC (permalink / raw) To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel From: Jiri Kosina <jkosina@suse.cz> Convert the per-device linked list into a hashtable. The primary motivation for this change is that currently, we're not tracking all the qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup performed over the linked list by qdisc_match_from_root() is rather expensive. The ultimate goal is to get rid of hidden qdiscs completely, which will bring much more determinism in user experience. As we're adding hashtable.h include into generic netdevice.h, we have to make sure HASH_SIZE macro is now non-conflicting with local definitions. Signed-off-by: Jiri Kosina <jkosina@suse.cz> --- v1 -> v2: fix up RCU hastable usage wrt. rtnl fix compilation of .c files which define their own HASH_SIZE that now oncflicts with the one from hashtable.h (newly included via netdevice.h) include/linux/netdevice.h | 2 ++ include/net/pkt_sched.h | 4 ++-- include/net/sch_generic.h | 2 +- net/core/dev.c | 1 + net/ipv6/ip6_gre.c | 8 ++++---- net/ipv6/ip6_tunnel.c | 6 +++--- net/ipv6/ip6_vti.c | 6 +++--- net/ipv6/sit.c | 10 +++++----- net/sched/sch_api.c | 23 +++++++++++++---------- net/sched/sch_generic.c | 6 +++--- net/sched/sch_mq.c | 2 +- net/sched/sch_mqprio.c | 2 +- 12 files changed, 39 insertions(+), 33 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f45929c..630838e 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -52,6 +52,7 @@ #include <uapi/linux/netdevice.h> #include <uapi/linux/if_bonding.h> #include <uapi/linux/pkt_cls.h> +#include <linux/hashtable.h> struct netpoll_info; struct device; @@ -1778,6 +1779,7 @@ struct net_device { unsigned int num_tx_queues; unsigned int real_num_tx_queues; struct Qdisc *qdisc; + DECLARE_HASHTABLE (qdisc_hash, 16); unsigned long tx_queue_len; spinlock_t tx_global_lock; int watchdog_timeo; diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index fea53f4..8ba11b4 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops); void qdisc_get_default(char *id, size_t len); int qdisc_set_default(const char *id); -void qdisc_list_add(struct Qdisc *q); -void qdisc_list_del(struct Qdisc *q); +void qdisc_hash_add(struct Qdisc *q); +void qdisc_hash_del(struct Qdisc *q); struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle); struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle); struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r, diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 62d5531..26f5cb3 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -67,7 +67,7 @@ struct Qdisc { u32 limit; const struct Qdisc_ops *ops; struct qdisc_size_table __rcu *stab; - struct list_head list; + struct hlist_node hash; u32 handle; u32 parent; int (*reshape_fail)(struct sk_buff *skb, diff --git a/net/core/dev.c b/net/core/dev.c index 904ff43..edc1617 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7511,6 +7511,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, INIT_LIST_HEAD(&dev->all_adj_list.lower); INIT_LIST_HEAD(&dev->ptype_all); INIT_LIST_HEAD(&dev->ptype_specific); + hash_init(dev->qdisc_hash); dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM; setup(dev); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index fdc9de2..0f70ecc 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -62,11 +62,11 @@ module_param(log_ecn_error, bool, 0644); MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN"); #define HASH_SIZE_SHIFT 5 -#define HASH_SIZE (1 << HASH_SIZE_SHIFT) +#define __HASH_SIZE (1 << HASH_SIZE_SHIFT) static int ip6gre_net_id __read_mostly; struct ip6gre_net { - struct ip6_tnl __rcu *tunnels[4][HASH_SIZE]; + struct ip6_tnl __rcu *tunnels[4][__HASH_SIZE]; struct net_device *fb_tunnel_dev; }; @@ -96,7 +96,7 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu); will match fallback tunnel. */ -#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(HASH_SIZE - 1)) +#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(__HASH_SIZE - 1)) static u32 HASH_ADDR(const struct in6_addr *addr) { u32 hash = ipv6_addr_hash(addr); @@ -1089,7 +1089,7 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head) for (prio = 0; prio < 4; prio++) { int h; - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < __HASH_SIZE; h++) { struct ip6_tnl *t; t = rtnl_dereference(ign->tunnels[prio][h]); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 7b0481e..a9da620 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -65,7 +65,7 @@ MODULE_ALIAS_RTNL_LINK("ip6tnl"); MODULE_ALIAS_NETDEV("ip6tnl0"); #define HASH_SIZE_SHIFT 5 -#define HASH_SIZE (1 << HASH_SIZE_SHIFT) +#define __HASH_SIZE (1 << HASH_SIZE_SHIFT) static bool log_ecn_error = true; module_param(log_ecn_error, bool, 0644); @@ -87,7 +87,7 @@ struct ip6_tnl_net { /* the IPv6 tunnel fallback device */ struct net_device *fb_tnl_dev; /* lists for storing tunnels in use */ - struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE]; + struct ip6_tnl __rcu *tnls_r_l[__HASH_SIZE]; struct ip6_tnl __rcu *tnls_wc[1]; struct ip6_tnl __rcu **tnls[2]; }; @@ -2031,7 +2031,7 @@ static void __net_exit ip6_tnl_destroy_tunnels(struct net *net) if (dev->rtnl_link_ops == &ip6_link_ops) unregister_netdevice_queue(dev, &list); - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < __HASH_SIZE; h++) { t = rtnl_dereference(ip6n->tnls_r_l[h]); while (t) { /* If dev is in the same netns, it has already diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index d90a11f..2d192af 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -51,7 +51,7 @@ #include <net/netns/generic.h> #define HASH_SIZE_SHIFT 5 -#define HASH_SIZE (1 << HASH_SIZE_SHIFT) +#define __HASH_SIZE (1 << HASH_SIZE_SHIFT) static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2) { @@ -69,7 +69,7 @@ struct vti6_net { /* the vti6 tunnel fallback device */ struct net_device *fb_tnl_dev; /* lists for storing tunnels in use */ - struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE]; + struct ip6_tnl __rcu *tnls_r_l[__HASH_SIZE]; struct ip6_tnl __rcu *tnls_wc[1]; struct ip6_tnl __rcu **tnls[2]; }; @@ -1040,7 +1040,7 @@ static void __net_exit vti6_destroy_tunnels(struct vti6_net *ip6n) struct ip6_tnl *t; LIST_HEAD(list); - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < __HASH_SIZE; h++) { t = rtnl_dereference(ip6n->tnls_r_l[h]); while (t) { unregister_netdevice_queue(t->dev, &list); diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 0a5a255..9f776d7 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -62,7 +62,7 @@ For comments look at net/ipv4/ip_gre.c --ANK */ -#define HASH_SIZE 16 +#define __HASH_SIZE 16 #define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF) static bool log_ecn_error = true; @@ -78,9 +78,9 @@ static struct rtnl_link_ops sit_link_ops __read_mostly; static int sit_net_id __read_mostly; struct sit_net { - struct ip_tunnel __rcu *tunnels_r_l[HASH_SIZE]; - struct ip_tunnel __rcu *tunnels_r[HASH_SIZE]; - struct ip_tunnel __rcu *tunnels_l[HASH_SIZE]; + struct ip_tunnel __rcu *tunnels_r_l[__HASH_SIZE]; + struct ip_tunnel __rcu *tunnels_r[__HASH_SIZE]; + struct ip_tunnel __rcu *tunnels_l[__HASH_SIZE]; struct ip_tunnel __rcu *tunnels_wc[1]; struct ip_tunnel __rcu **tunnels[4]; @@ -1773,7 +1773,7 @@ static void __net_exit sit_destroy_tunnels(struct net *net, for (prio = 1; prio < 4; prio++) { int h; - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < __HASH_SIZE; h++) { struct ip_tunnel *t; t = rtnl_dereference(sitn->tunnels[prio][h]); diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index ddf047d..c093d32 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -29,6 +29,7 @@ #include <linux/hrtimer.h> #include <linux/lockdep.h> #include <linux/slab.h> +#include <linux/hashtable.h> #include <net/net_namespace.h> #include <net/sock.h> @@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle) root->handle == handle) return root; - list_for_each_entry_rcu(q, &root->list, list) { + hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) { if (q->handle == handle) return q; } return NULL; } -void qdisc_list_add(struct Qdisc *q) +void qdisc_hash_add(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) { struct Qdisc *root = qdisc_dev(q)->qdisc; WARN_ON_ONCE(root == &noop_qdisc); ASSERT_RTNL(); - list_add_tail_rcu(&q->list, &root->list); + hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle); } } -EXPORT_SYMBOL(qdisc_list_add); +EXPORT_SYMBOL(qdisc_hash_add); -void qdisc_list_del(struct Qdisc *q) +void qdisc_hash_del(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) { ASSERT_RTNL(); - list_del_rcu(&q->list); + hash_del_rcu(&q->hash); } } -EXPORT_SYMBOL(qdisc_list_del); +EXPORT_SYMBOL(qdisc_hash_del); struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle) { @@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue, goto err_out4; } - qdisc_list_add(sch); + qdisc_hash_add(sch); return sch; } @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb, { int ret = 0, q_idx = *q_idx_p; struct Qdisc *q; + int b; if (!root) return 0; @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb, goto done; q_idx++; } - list_for_each_entry(q, &root->list, list) { + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { if (q_idx < s_q_idx) { q_idx++; continue; @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, int *t_p, int s_t) { struct Qdisc *q; + int b; if (!root) return 0; @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) return -1; - list_for_each_entry(q, &root->list, list) { + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) return -1; } diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index f9e0e9c..7efc923 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = { .dequeue = noop_dequeue, .flags = TCQ_F_BUILTIN, .ops = &noop_qdisc_ops, - .list = LIST_HEAD_INIT(noop_qdisc.list), .q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock), .dev_queue = &noop_netdev_queue, .busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock), @@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue, sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p); sch->padded = (char *) sch - (char *) p; } - INIT_LIST_HEAD(&sch->list); skb_queue_head_init(&sch->q); spin_lock_init(&sch->busylock); @@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc) return; #ifdef CONFIG_NET_SCHED - qdisc_list_del(qdisc); + qdisc_hash_del(qdisc); qdisc_put_stab(rtnl_dereference(qdisc->stab)); #endif @@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev) qdisc->ops->attach(qdisc); } } + if (dev->qdisc) + qdisc_hash_add(dev->qdisc); } static void transition_one_qdisc(struct net_device *dev, diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c index 56a77b8..3bee15d 100644 --- a/net/sched/sch_mq.c +++ b/net/sched/sch_mq.c @@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch) qdisc_destroy(old); #ifdef CONFIG_NET_SCHED if (ntx < dev->real_num_tx_queues) - qdisc_list_add(qdisc); + qdisc_hash_add(qdisc); #endif } diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index b8002ce..dbfb3a5 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch) if (old) qdisc_destroy(old); if (ntx < dev->real_num_tx_queues) - qdisc_list_add(qdisc); + qdisc_hash_add(qdisc); } kfree(priv->qdiscs); priv->qdiscs = NULL; -- Jiri Kosina SUSE Labs ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable 2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina @ 2016-07-11 14:02 ` Jiri Kosina 2016-07-12 17:36 ` Cong Wang 0 siblings, 1 reply; 5+ messages in thread From: Jiri Kosina @ 2016-07-11 14:02 UTC (permalink / raw) To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel From: Jiri Kosina <jkosina@suse.cz> Convert the per-device linked list into a hashtable. The primary motivation for this change is that currently, we're not tracking all the qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup performed over the linked list by qdisc_match_from_root() is rather expensive. The ultimate goal is to get rid of hidden qdiscs completely, which will bring much more determinism in user experience. As we're adding hashtable.h include into generic netdevice.h, we have to make sure HASH_SIZE macro is now non-conflicting with local definitions. Signed-off-by: Jiri Kosina <jkosina@suse.cz> --- v1 -> v2: fix up RCU hastable usage wrt. rtnl fix compilation of .c files which define their own HASH_SIZE that now oncflicts with the one from hashtable.h (newly included via netdevice.h) v2 -> v3: resolve HASH_SIZE identifier conflicts in a cleaner way fix up the number of hash bucket bits (4 bits for 16 buckets) include/linux/netdevice.h | 2 ++ include/net/pkt_sched.h | 4 ++-- include/net/sch_generic.h | 2 +- net/core/dev.c | 1 + net/ipv6/ip6_gre.c | 12 ++++++------ net/ipv6/ip6_tunnel.c | 10 +++++----- net/ipv6/ip6_vti.c | 10 +++++----- net/ipv6/sit.c | 10 +++++----- net/sched/sch_api.c | 23 +++++++++++++---------- net/sched/sch_generic.c | 6 +++--- net/sched/sch_mq.c | 2 +- net/sched/sch_mqprio.c | 2 +- 12 files changed, 45 insertions(+), 39 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f45929c..0b5c172e 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -52,6 +52,7 @@ #include <uapi/linux/netdevice.h> #include <uapi/linux/if_bonding.h> #include <uapi/linux/pkt_cls.h> +#include <linux/hashtable.h> struct netpoll_info; struct device; @@ -1778,6 +1779,7 @@ struct net_device { unsigned int num_tx_queues; unsigned int real_num_tx_queues; struct Qdisc *qdisc; + DECLARE_HASHTABLE (qdisc_hash, 4); unsigned long tx_queue_len; spinlock_t tx_global_lock; int watchdog_timeo; diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index fea53f4..8ba11b4 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops); void qdisc_get_default(char *id, size_t len); int qdisc_set_default(const char *id); -void qdisc_list_add(struct Qdisc *q); -void qdisc_list_del(struct Qdisc *q); +void qdisc_hash_add(struct Qdisc *q); +void qdisc_hash_del(struct Qdisc *q); struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle); struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle); struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r, diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 62d5531..26f5cb3 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -67,7 +67,7 @@ struct Qdisc { u32 limit; const struct Qdisc_ops *ops; struct qdisc_size_table __rcu *stab; - struct list_head list; + struct hlist_node hash; u32 handle; u32 parent; int (*reshape_fail)(struct sk_buff *skb, diff --git a/net/core/dev.c b/net/core/dev.c index 904ff43..edc1617 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7511,6 +7511,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, INIT_LIST_HEAD(&dev->all_adj_list.lower); INIT_LIST_HEAD(&dev->ptype_all); INIT_LIST_HEAD(&dev->ptype_specific); + hash_init(dev->qdisc_hash); dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM; setup(dev); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index fdc9de2..d3697a4 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -61,12 +61,12 @@ static bool log_ecn_error = true; module_param(log_ecn_error, bool, 0644); MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN"); -#define HASH_SIZE_SHIFT 5 -#define HASH_SIZE (1 << HASH_SIZE_SHIFT) +#define IP6_GRE_HASH_SIZE_SHIFT 5 +#define IP6_GRE_HASH_SIZE (1 << IP6_GRE_HASH_SIZE_SHIFT) static int ip6gre_net_id __read_mostly; struct ip6gre_net { - struct ip6_tnl __rcu *tunnels[4][HASH_SIZE]; + struct ip6_tnl __rcu *tunnels[4][IP6_GRE_HASH_SIZE]; struct net_device *fb_tunnel_dev; }; @@ -96,12 +96,12 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu); will match fallback tunnel. */ -#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(HASH_SIZE - 1)) +#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(IP6_GRE_HASH_SIZE - 1)) static u32 HASH_ADDR(const struct in6_addr *addr) { u32 hash = ipv6_addr_hash(addr); - return hash_32(hash, HASH_SIZE_SHIFT); + return hash_32(hash, IP6_GRE_HASH_SIZE_SHIFT); } #define tunnels_r_l tunnels[3] @@ -1089,7 +1089,7 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head) for (prio = 0; prio < 4; prio++) { int h; - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < IP6_GRE_HASH_SIZE; h++) { struct ip6_tnl *t; t = rtnl_dereference(ign->tunnels[prio][h]); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 7b0481e..2050217 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -64,8 +64,8 @@ MODULE_LICENSE("GPL"); MODULE_ALIAS_RTNL_LINK("ip6tnl"); MODULE_ALIAS_NETDEV("ip6tnl0"); -#define HASH_SIZE_SHIFT 5 -#define HASH_SIZE (1 << HASH_SIZE_SHIFT) +#define IP6_TUNNEL_HASH_SIZE_SHIFT 5 +#define IP6_TUNNEL_HASH_SIZE (1 << IP6_TUNNEL_HASH_SIZE_SHIFT) static bool log_ecn_error = true; module_param(log_ecn_error, bool, 0644); @@ -75,7 +75,7 @@ static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2) { u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2); - return hash_32(hash, HASH_SIZE_SHIFT); + return hash_32(hash, IP6_TUNNEL_HASH_SIZE_SHIFT); } static int ip6_tnl_dev_init(struct net_device *dev); @@ -87,7 +87,7 @@ struct ip6_tnl_net { /* the IPv6 tunnel fallback device */ struct net_device *fb_tnl_dev; /* lists for storing tunnels in use */ - struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE]; + struct ip6_tnl __rcu *tnls_r_l[IP6_TUNNEL_HASH_SIZE]; struct ip6_tnl __rcu *tnls_wc[1]; struct ip6_tnl __rcu **tnls[2]; }; @@ -2031,7 +2031,7 @@ static void __net_exit ip6_tnl_destroy_tunnels(struct net *net) if (dev->rtnl_link_ops == &ip6_link_ops) unregister_netdevice_queue(dev, &list); - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < IP6_TUNNEL_HASH_SIZE; h++) { t = rtnl_dereference(ip6n->tnls_r_l[h]); while (t) { /* If dev is in the same netns, it has already diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index d90a11f..cc7e058 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -50,14 +50,14 @@ #include <net/net_namespace.h> #include <net/netns/generic.h> -#define HASH_SIZE_SHIFT 5 -#define HASH_SIZE (1 << HASH_SIZE_SHIFT) +#define IP6_VTI_HASH_SIZE_SHIFT 5 +#define IP6_VTI_HASH_SIZE (1 << IP6_VTI_HASH_SIZE_SHIFT) static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2) { u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2); - return hash_32(hash, HASH_SIZE_SHIFT); + return hash_32(hash, IP6_VTI_HASH_SIZE_SHIFT); } static int vti6_dev_init(struct net_device *dev); @@ -69,7 +69,7 @@ struct vti6_net { /* the vti6 tunnel fallback device */ struct net_device *fb_tnl_dev; /* lists for storing tunnels in use */ - struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE]; + struct ip6_tnl __rcu *tnls_r_l[IP6_VTI_HASH_SIZE]; struct ip6_tnl __rcu *tnls_wc[1]; struct ip6_tnl __rcu **tnls[2]; }; @@ -1040,7 +1040,7 @@ static void __net_exit vti6_destroy_tunnels(struct vti6_net *ip6n) struct ip6_tnl *t; LIST_HEAD(list); - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < IP6_VTI_HASH_SIZE; h++) { t = rtnl_dereference(ip6n->tnls_r_l[h]); while (t) { unregister_netdevice_queue(t->dev, &list); diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 0a5a255..94dd0f0 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -62,7 +62,7 @@ For comments look at net/ipv4/ip_gre.c --ANK */ -#define HASH_SIZE 16 +#define IP6_SIT_HASH_SIZE 16 #define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF) static bool log_ecn_error = true; @@ -78,9 +78,9 @@ static struct rtnl_link_ops sit_link_ops __read_mostly; static int sit_net_id __read_mostly; struct sit_net { - struct ip_tunnel __rcu *tunnels_r_l[HASH_SIZE]; - struct ip_tunnel __rcu *tunnels_r[HASH_SIZE]; - struct ip_tunnel __rcu *tunnels_l[HASH_SIZE]; + struct ip_tunnel __rcu *tunnels_r_l[IP6_SIT_HASH_SIZE]; + struct ip_tunnel __rcu *tunnels_r[IP6_SIT_HASH_SIZE]; + struct ip_tunnel __rcu *tunnels_l[IP6_SIT_HASH_SIZE]; struct ip_tunnel __rcu *tunnels_wc[1]; struct ip_tunnel __rcu **tunnels[4]; @@ -1773,7 +1773,7 @@ static void __net_exit sit_destroy_tunnels(struct net *net, for (prio = 1; prio < 4; prio++) { int h; - for (h = 0; h < HASH_SIZE; h++) { + for (h = 0; h < IP6_SIT_HASH_SIZE; h++) { struct ip_tunnel *t; t = rtnl_dereference(sitn->tunnels[prio][h]); diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index ddf047d..c093d32 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -29,6 +29,7 @@ #include <linux/hrtimer.h> #include <linux/lockdep.h> #include <linux/slab.h> +#include <linux/hashtable.h> #include <net/net_namespace.h> #include <net/sock.h> @@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle) root->handle == handle) return root; - list_for_each_entry_rcu(q, &root->list, list) { + hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) { if (q->handle == handle) return q; } return NULL; } -void qdisc_list_add(struct Qdisc *q) +void qdisc_hash_add(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) { struct Qdisc *root = qdisc_dev(q)->qdisc; WARN_ON_ONCE(root == &noop_qdisc); ASSERT_RTNL(); - list_add_tail_rcu(&q->list, &root->list); + hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle); } } -EXPORT_SYMBOL(qdisc_list_add); +EXPORT_SYMBOL(qdisc_hash_add); -void qdisc_list_del(struct Qdisc *q) +void qdisc_hash_del(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) { ASSERT_RTNL(); - list_del_rcu(&q->list); + hash_del_rcu(&q->hash); } } -EXPORT_SYMBOL(qdisc_list_del); +EXPORT_SYMBOL(qdisc_hash_del); struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle) { @@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue, goto err_out4; } - qdisc_list_add(sch); + qdisc_hash_add(sch); return sch; } @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb, { int ret = 0, q_idx = *q_idx_p; struct Qdisc *q; + int b; if (!root) return 0; @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb, goto done; q_idx++; } - list_for_each_entry(q, &root->list, list) { + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { if (q_idx < s_q_idx) { q_idx++; continue; @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, int *t_p, int s_t) { struct Qdisc *q; + int b; if (!root) return 0; @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) return -1; - list_for_each_entry(q, &root->list, list) { + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) return -1; } diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index f9e0e9c..7efc923 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = { .dequeue = noop_dequeue, .flags = TCQ_F_BUILTIN, .ops = &noop_qdisc_ops, - .list = LIST_HEAD_INIT(noop_qdisc.list), .q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock), .dev_queue = &noop_netdev_queue, .busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock), @@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue, sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p); sch->padded = (char *) sch - (char *) p; } - INIT_LIST_HEAD(&sch->list); skb_queue_head_init(&sch->q); spin_lock_init(&sch->busylock); @@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc) return; #ifdef CONFIG_NET_SCHED - qdisc_list_del(qdisc); + qdisc_hash_del(qdisc); qdisc_put_stab(rtnl_dereference(qdisc->stab)); #endif @@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev) qdisc->ops->attach(qdisc); } } + if (dev->qdisc) + qdisc_hash_add(dev->qdisc); } static void transition_one_qdisc(struct net_device *dev, diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c index 56a77b8..3bee15d 100644 --- a/net/sched/sch_mq.c +++ b/net/sched/sch_mq.c @@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch) qdisc_destroy(old); #ifdef CONFIG_NET_SCHED if (ntx < dev->real_num_tx_queues) - qdisc_list_add(qdisc); + qdisc_hash_add(qdisc); #endif } diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index b8002ce..dbfb3a5 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch) if (old) qdisc_destroy(old); if (ntx < dev->real_num_tx_queues) - qdisc_list_add(qdisc); + qdisc_hash_add(qdisc); } kfree(priv->qdiscs); priv->qdiscs = NULL; -- Jiri Kosina SUSE Labs ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable 2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina @ 2016-07-12 17:36 ` Cong Wang 2016-07-13 13:47 ` Jiri Kosina 0 siblings, 1 reply; 5+ messages in thread From: Cong Wang @ 2016-07-12 17:36 UTC (permalink / raw) To: Jiri Kosina Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter, Linux Kernel Network Developers, LKML On Mon, Jul 11, 2016 at 7:02 AM, Jiri Kosina <jikos@kernel.org> wrote: > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index f45929c..0b5c172e 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -52,6 +52,7 @@ > #include <uapi/linux/netdevice.h> > #include <uapi/linux/if_bonding.h> > #include <uapi/linux/pkt_cls.h> > +#include <linux/hashtable.h> > > struct netpoll_info; > struct device; > @@ -1778,6 +1779,7 @@ struct net_device { > unsigned int num_tx_queues; > unsigned int real_num_tx_queues; > struct Qdisc *qdisc; > + DECLARE_HASHTABLE (qdisc_hash, 4); > unsigned long tx_queue_len; > spinlock_t tx_global_lock; > int watchdog_timeo; Should it be surrounded by CONFIG_NET_SCHED? To save several bytes for !CONFIG_NET_SCHED case. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable 2016-07-12 17:36 ` Cong Wang @ 2016-07-13 13:47 ` Jiri Kosina 0 siblings, 0 replies; 5+ messages in thread From: Jiri Kosina @ 2016-07-13 13:47 UTC (permalink / raw) To: Cong Wang Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter, Linux Kernel Network Developers, LKML On Tue, 12 Jul 2016, Cong Wang wrote: > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > > index f45929c..0b5c172e 100644 > > --- a/include/linux/netdevice.h > > +++ b/include/linux/netdevice.h > > @@ -52,6 +52,7 @@ > > #include <uapi/linux/netdevice.h> > > #include <uapi/linux/if_bonding.h> > > #include <uapi/linux/pkt_cls.h> > > +#include <linux/hashtable.h> > > > > struct netpoll_info; > > struct device; > > @@ -1778,6 +1779,7 @@ struct net_device { > > unsigned int num_tx_queues; > > unsigned int real_num_tx_queues; > > struct Qdisc *qdisc; > > + DECLARE_HASHTABLE (qdisc_hash, 4); > > unsigned long tx_queue_len; > > spinlock_t tx_global_lock; > > int watchdog_timeo; > > Should it be surrounded by CONFIG_NET_SCHED? > To save several bytes for !CONFIG_NET_SCHED case. Makes sense. I'll wait a bit for more feedback (if there is any) before including this in potential v4. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-07-28 12:52 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <201607120341.UDv5wy6R%fengguang.wu@intel.com> 2016-07-14 14:14 ` [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable Jiri Kosina 2016-07-28 12:51 ` Fengguang Wu 2016-04-14 14:44 Deleting child qdisc doesn't reset parent to default qdisc? Jiri Kosina 2016-04-14 15:01 ` Eric Dumazet 2016-04-14 15:18 ` Phil Sutter 2016-04-14 16:08 ` Jiri Kosina 2016-04-14 17:49 ` Eric Dumazet 2016-04-15 12:42 ` Jamal Hadi Salim 2016-04-15 14:58 ` Eric Dumazet 2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?) Jiri Kosina 2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina 2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina 2016-07-12 17:36 ` Cong Wang 2016-07-13 13:47 ` Jiri Kosina
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).