* Deleting child qdisc doesn't reset parent to default qdisc?
@ 2016-04-14 14:44 Jiri Kosina
2016-04-14 14:51 ` Jiri Kosina
2016-04-14 15:01 ` Eric Dumazet
0 siblings, 2 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-04-14 14:44 UTC (permalink / raw)
To: Jamal Hadi Salim; +Cc: netdev, linux-kernel
Hi,
I've came across the behavior where adding a child qdisc and then deleting
it again makes the networking dysfunctional (I guess that's because all of
a sudden there is absolutely no working qdisc on the device, although
there originally was a default one in the parent).
In a nutshell, is this expected behavior or bug?
=====
jikos:~ # tc qdisc show
qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
jikos:~ # ping -c 1 nix.cz | head -2
PING nix.cz (195.47.235.3) 56(84) bytes of data.
64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.59 ms
jikos:~ # tc qdisc add dev eth0 parent 10:1 sfq
jikos:~ # tc qdisc show
qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
qdisc sfq 8008: dev eth0 parent 10:1 limit 127p quantum 1514b depth 127 divisor 1024
jikos:~ # ping -c 1 nix.cz | head -2
PING nix.cz (195.47.235.3) 56(84) bytes of data.
64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.67 ms
jikos:~ # tc qdisc del dev eth0 parent 10:1 sfq
jikos:~ # tc qdisc show
qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
jikos:~ # ping -c 1 nix.cz | head -2
PING nix.cz (195.47.235.3) 56(84) bytes of data.
[ ... nothing happens ... ]
^C
jikos:~ # tc qdisc add dev eth0 parent 10:1 sfq
jikos:~ # ping -c 1 nix.cz | head -2
PING nix.cz (195.47.235.3) 56(84) bytes of data.
64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.66 ms
=====
Thanks,
--
Jiri Kosina
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 14:44 Deleting child qdisc doesn't reset parent to default qdisc? Jiri Kosina
@ 2016-04-14 14:51 ` Jiri Kosina
2016-04-14 15:01 ` Eric Dumazet
1 sibling, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-04-14 14:51 UTC (permalink / raw)
To: Jamal Hadi Salim; +Cc: netdev, linux-kernel
On Thu, 14 Apr 2016, Jiri Kosina wrote:
> In a nutshell, is this expected behavior or bug?
Just to clarify what seems to suggest to me that this is rather a bug that
needs to be fixed (but apparently one that has been there for quite a long
time) can be demonstrated by this:
>
> =====
> jikos:~ # tc qdisc show
> qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
The above configuration works.
> jikos:~ # ping -c 1 nix.cz | head -2
> PING nix.cz (195.47.235.3) 56(84) bytes of data.
> 64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.59 ms
>
> jikos:~ # tc qdisc add dev eth0 parent 10:1 sfq
> jikos:~ # tc qdisc show
> qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
> qdisc sfq 8008: dev eth0 parent 10:1 limit 127p quantum 1514b depth 127 divisor 1024
>
> jikos:~ # ping -c 1 nix.cz | head -2
> PING nix.cz (195.47.235.3) 56(84) bytes of data.
> 64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.67 ms
>
> jikos:~ # tc qdisc del dev eth0 parent 10:1 sfq
> jikos:~ # tc qdisc show
> qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
The above configuration doesn't although it's identical to the working one
at the beginning.
> jikos:~ # ping -c 1 nix.cz | head -2
> PING nix.cz (195.47.235.3) 56(84) bytes of data.
> [ ... nothing happens ... ]
> ^C
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 14:44 Deleting child qdisc doesn't reset parent to default qdisc? Jiri Kosina
2016-04-14 14:51 ` Jiri Kosina
@ 2016-04-14 15:01 ` Eric Dumazet
2016-04-14 15:18 ` Phil Sutter
1 sibling, 1 reply; 43+ messages in thread
From: Eric Dumazet @ 2016-04-14 15:01 UTC (permalink / raw)
To: Jiri Kosina; +Cc: Jamal Hadi Salim, netdev, linux-kernel
On Thu, 2016-04-14 at 16:44 +0200, Jiri Kosina wrote:
> Hi,
>
> I've came across the behavior where adding a child qdisc and then deleting
> it again makes the networking dysfunctional (I guess that's because all of
> a sudden there is absolutely no working qdisc on the device, although
> there originally was a default one in the parent).
>
> In a nutshell, is this expected behavior or bug?
This is the expected behavior.
If the kernel was suddenly doing a 'replace' when you ask a delete,
then the scripts doing a delete , than a add would break.
tc users are skilled admins ;)
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 15:01 ` Eric Dumazet
@ 2016-04-14 15:18 ` Phil Sutter
2016-04-14 15:34 ` Jiri Kosina
2016-04-14 16:08 ` Jiri Kosina
0 siblings, 2 replies; 43+ messages in thread
From: Phil Sutter @ 2016-04-14 15:18 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jiri Kosina, Jamal Hadi Salim, netdev, linux-kernel
On Thu, Apr 14, 2016 at 08:01:39AM -0700, Eric Dumazet wrote:
> On Thu, 2016-04-14 at 16:44 +0200, Jiri Kosina wrote:
> > Hi,
> >
> > I've came across the behavior where adding a child qdisc and then deleting
> > it again makes the networking dysfunctional (I guess that's because all of
> > a sudden there is absolutely no working qdisc on the device, although
> > there originally was a default one in the parent).
> >
> > In a nutshell, is this expected behavior or bug?
>
> This is the expected behavior.
OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default
one upon deletion instead of noop_qdisc, hence I would describe
the situation using the words 'inconsistent' and 'accident' rather than
'expected'. :)
Anyhow, the problem with skilled admins is they accept quirks too easily
and just build their scripts around them - the same scripts we have to
keep compatible to then.
Cheers, Phil
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 15:18 ` Phil Sutter
@ 2016-04-14 15:34 ` Jiri Kosina
2016-04-14 15:44 ` Eric Dumazet
2016-04-14 16:08 ` Jiri Kosina
1 sibling, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-04-14 15:34 UTC (permalink / raw)
To: Phil Sutter; +Cc: Eric Dumazet, Jamal Hadi Salim, netdev, linux-kernel
On Thu, 14 Apr 2016, Phil Sutter wrote:
> OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default
> one upon deletion instead of noop_qdisc, hence I would describe
> the situation using the words 'inconsistent' and 'accident' rather than
> 'expected'. :)
Exactly. I'd again like to stress the fact that this configuration works:
jikos:~ # tc qdisc show
qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
and this (after performing add/delete operation) doesn't:
jikos:~ # tc qdisc show
qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
It's hard to spot a difference (hint: there is none).
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 15:34 ` Jiri Kosina
@ 2016-04-14 15:44 ` Eric Dumazet
2016-04-14 16:22 ` Phil Sutter
0 siblings, 1 reply; 43+ messages in thread
From: Eric Dumazet @ 2016-04-14 15:44 UTC (permalink / raw)
To: Jiri Kosina; +Cc: Phil Sutter, Jamal Hadi Salim, netdev, linux-kernel
On Thu, 2016-04-14 at 17:34 +0200, Jiri Kosina wrote:
> On Thu, 14 Apr 2016, Phil Sutter wrote:
>
> > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default
> > one upon deletion instead of noop_qdisc, hence I would describe
> > the situation using the words 'inconsistent' and 'accident' rather than
> > 'expected'. :)
>
> Exactly. I'd again like to stress the fact that this configuration works:
>
> jikos:~ # tc qdisc show
> qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
>
> and this (after performing add/delete operation) doesn't:
>
> jikos:~ # tc qdisc show
> qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
>
> It's hard to spot a difference (hint: there is none).
This is because some qdisc are not visible in the dump.
qdisc_list_add() uses a single list, so adding too much stuff in it
could slow down fast path (qdisc_lookup(), called from
qdisc_tree_reduce_backlog())
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 15:18 ` Phil Sutter
2016-04-14 15:34 ` Jiri Kosina
@ 2016-04-14 16:08 ` Jiri Kosina
2016-04-14 17:49 ` Eric Dumazet
1 sibling, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-04-14 16:08 UTC (permalink / raw)
To: Phil Sutter; +Cc: Eric Dumazet, Jamal Hadi Salim, netdev, linux-kernel
On Thu, 14 Apr 2016, Phil Sutter wrote:
> > > I've came across the behavior where adding a child qdisc and then deleting
> > > it again makes the networking dysfunctional (I guess that's because all of
> > > a sudden there is absolutely no working qdisc on the device, although
> > > there originally was a default one in the parent).
> > >
> > > In a nutshell, is this expected behavior or bug?
> >
> > This is the expected behavior.
>
> OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default
> one upon deletion instead of noop_qdisc, hence I would describe
> the situation using the words 'inconsistent' and 'accident' rather than
> 'expected'. :)
Would a patch that'd unify this in a sense that all qdiscs would assign
the default one upon deletion acceptable?
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 15:44 ` Eric Dumazet
@ 2016-04-14 16:22 ` Phil Sutter
2016-04-14 16:40 ` Eric Dumazet
0 siblings, 1 reply; 43+ messages in thread
From: Phil Sutter @ 2016-04-14 16:22 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jiri Kosina, Jamal Hadi Salim, netdev, linux-kernel
On Thu, Apr 14, 2016 at 08:44:40AM -0700, Eric Dumazet wrote:
> On Thu, 2016-04-14 at 17:34 +0200, Jiri Kosina wrote:
> > On Thu, 14 Apr 2016, Phil Sutter wrote:
> >
> > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default
> > > one upon deletion instead of noop_qdisc, hence I would describe
> > > the situation using the words 'inconsistent' and 'accident' rather than
> > > 'expected'. :)
> >
> > Exactly. I'd again like to stress the fact that this configuration works:
> >
> > jikos:~ # tc qdisc show
> > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
> >
> > and this (after performing add/delete operation) doesn't:
> >
> > jikos:~ # tc qdisc show
> > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms
> >
> > It's hard to spot a difference (hint: there is none).
>
> This is because some qdisc are not visible in the dump.
And those being invisible can be overridden using 'tc qd add', right?
AFAIR they're not listed because they don't properly register, so the
system doesn't care to override them. In this case we could change all
classful qdiscs to restore the default qdisc if a leaf qdisc is being
deleted instead of noop (which is probably not what the user wants
anyway).
Cheers, Phil
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 16:22 ` Phil Sutter
@ 2016-04-14 16:40 ` Eric Dumazet
0 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-04-14 16:40 UTC (permalink / raw)
To: Phil Sutter; +Cc: Jiri Kosina, Jamal Hadi Salim, netdev, linux-kernel
On Thu, 2016-04-14 at 18:22 +0200, Phil Sutter wrote:
> And those being invisible can be overridden using 'tc qd add', right?
> AFAIR they're not listed because they don't properly register, so the
> system doesn't care to override them. In this case we could change all
> classful qdiscs to restore the default qdisc if a leaf qdisc is being
> deleted instead of noop (which is probably not what the user wants
> anyway).
Even if they properly register, they are not visible.
Take a look at
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=95dc19299f741c986227ec33e23cbf9b3321f812
for some context.
When a default pfifo is created on say a HTB class, you do not see it by
default in a dump.
If you have 100 HTB classes, HTB created 100 pfifo just fine, and it
works, unless an admin tries to delete them maybe ;)
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 16:08 ` Jiri Kosina
@ 2016-04-14 17:49 ` Eric Dumazet
2016-04-15 12:42 ` Jamal Hadi Salim
0 siblings, 1 reply; 43+ messages in thread
From: Eric Dumazet @ 2016-04-14 17:49 UTC (permalink / raw)
To: Jiri Kosina; +Cc: Phil Sutter, Jamal Hadi Salim, netdev, linux-kernel
On Thu, 2016-04-14 at 18:08 +0200, Jiri Kosina wrote:
> On Thu, 14 Apr 2016, Phil Sutter wrote:
>
> > > > I've came across the behavior where adding a child qdisc and then deleting
> > > > it again makes the networking dysfunctional (I guess that's because all of
> > > > a sudden there is absolutely no working qdisc on the device, although
> > > > there originally was a default one in the parent).
> > > >
> > > > In a nutshell, is this expected behavior or bug?
> > >
> > > This is the expected behavior.
> >
> > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default
> > one upon deletion instead of noop_qdisc, hence I would describe
> > the situation using the words 'inconsistent' and 'accident' rather than
> > 'expected'. :)
>
> Would a patch that'd unify this in a sense that all qdiscs would assign
> the default one upon deletion acceptable?
>
And what would be the chosen behavior ?
Relying on TBF installing a bfifo for you at delete would be hazardous.
For example CBQ got it differently than HFSC
If qdisc_create_dflt() fails in CBQ, we fail the 'delete', while HFSC
falls back to noop_qdisc, without warning the user :(
At least always using noop_qdisc is consistent. No magic there.
Doing 'unification' right now would break existing scripts.
This is too late, I am afraid.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-14 17:49 ` Eric Dumazet
@ 2016-04-15 12:42 ` Jamal Hadi Salim
2016-04-15 14:58 ` Eric Dumazet
0 siblings, 1 reply; 43+ messages in thread
From: Jamal Hadi Salim @ 2016-04-15 12:42 UTC (permalink / raw)
To: Eric Dumazet, Jiri Kosina; +Cc: Phil Sutter, netdev, linux-kernel
On 16-04-14 01:49 PM, Eric Dumazet wrote:
> And what would be the chosen behavior ?
>
TBF is probably a bad example because it started life as
a classless qdisc. There was only one built-in fifo queue
that was shaped. Then someone made it classful and changed
this behavior. To me it sounds reasonable to have the
default behavior restored. At minimal consistency.
> Relying on TBF installing a bfifo for you at delete would be hazardous.
>
> For example CBQ got it differently than HFSC
>
> If qdisc_create_dflt() fails in CBQ, we fail the 'delete', while HFSC
> falls back to noop_qdisc, without warning the user :(
>
> At least always using noop_qdisc is consistent. No magic there.
>
> Doing 'unification' right now would break existing scripts.
>
> This is too late, I am afraid.
Sigh. So rant:
IMO, we should let any new APIs and API updates stay longer
in discussion. Or better mark them as unstable for sometime.
The excuse that "it is out in the wild therefore cant be changed"
is harmful because the timeline is "forever" whereas
patches are applied after a short period of posting
and discussions and sometimes not involving the right people.
It is like having a jury issuing a death sentence after 1 week of
deliberation. You cant take it back after execution.
cheers,
jamal
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-15 12:42 ` Jamal Hadi Salim
@ 2016-04-15 14:58 ` Eric Dumazet
2016-04-15 17:13 ` David Miller
` (3 more replies)
0 siblings, 4 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-04-15 14:58 UTC (permalink / raw)
To: Jamal Hadi Salim; +Cc: Jiri Kosina, Phil Sutter, netdev, linux-kernel
On Fri, 2016-04-15 at 08:42 -0400, Jamal Hadi Salim wrote:
> On 16-04-14 01:49 PM, Eric Dumazet wrote:
>
> > And what would be the chosen behavior ?
> >
>
> TBF is probably a bad example because it started life as
> a classless qdisc. There was only one built-in fifo queue
> that was shaped. Then someone made it classful and changed
> this behavior. To me it sounds reasonable to have the
> default behavior restored. At minimal consistency.
Then you need to save the initial qdisc (bfifo for TBF) in a special
place, to make sure the delete operation is guaranteed to succeed.
Or fail the delete if the bfifo can not be allocated.
I can tell that determinism if far more interesting than usability for
some users occasionally playing with tc.
Surely the silent fallback to noop_qdisc is wrong.
Anyway, we probably need to improve our ability to understand qdisc
hierarchies. Having some hidden qdiscs is the real problem here.
We need to add some hash table so that qdisc_match_from_root() does not
have to scan hundred of qdiscs.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-15 14:58 ` Eric Dumazet
@ 2016-04-15 17:13 ` David Miller
2016-06-28 15:19 ` Jiri Kosina
` (2 subsequent siblings)
3 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2016-04-15 17:13 UTC (permalink / raw)
To: eric.dumazet; +Cc: jhs, jikos, phil, netdev, linux-kernel
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 15 Apr 2016 07:58:48 -0700
> Having some hidden qdiscs is the real problem here.
+1
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-04-15 14:58 ` Eric Dumazet
2016-04-15 17:13 ` David Miller
@ 2016-06-28 15:19 ` Jiri Kosina
2016-06-28 17:28 ` Cong Wang
2016-06-28 16:49 ` [RFC PATCH] sch_tbf: avoid silent fallback to noop_qdisc (was Re: Deleting child qdisc doesn't reset parent to default qdisc?) Jiri Kosina
2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable " Jiri Kosina
3 siblings, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-06-28 15:19 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Fri, 15 Apr 2016, Eric Dumazet wrote:
> > TBF is probably a bad example because it started life as a classless
> > qdisc. There was only one built-in fifo queue that was shaped. Then
> > someone made it classful and changed this behavior. To me it sounds
> > reasonable to have the default behavior restored. At minimal
> > consistency.
>
>
> Then you need to save the initial qdisc (bfifo for TBF) in a special
> place, to make sure the delete operation is guaranteed to succeed.
>
> Or fail the delete if the bfifo can not be allocated.
>
> I can tell that determinism if far more interesting than usability for
> some users occasionally playing with tc.
BTW, I've started to actually work on fixing this, and I've noticed that
TBF behavior actually violates what's stated in pfifo_fast manpage:
==========
Whenever an interface is created, the pfifo_fast qdisc is
automatically used as a queue. If another qdisc is
attached, it preempts the default pfifo_fast, which automatically
returns to function when an existing qdisc is detached.
In this sense this qdisc is magic, and unlike other qdiscs.
==========
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC PATCH] sch_tbf: avoid silent fallback to noop_qdisc (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
2016-04-15 14:58 ` Eric Dumazet
2016-04-15 17:13 ` David Miller
2016-06-28 15:19 ` Jiri Kosina
@ 2016-06-28 16:49 ` Jiri Kosina
2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable " Jiri Kosina
3 siblings, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-06-28 16:49 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Fri, 15 Apr 2016, Eric Dumazet wrote:
> Then you need to save the initial qdisc (bfifo for TBF) in a special
> place, to make sure the delete operation is guaranteed to succeed.
>
> Or fail the delete if the bfifo can not be allocated.
>
> I can tell that determinism if far more interesting than usability for
> some users occasionally playing with tc.
>
> Surely the silent fallback to noop_qdisc is wrong.
So before we go further and fix the fact that we actually do have hidden
qdiscs (by refactoring qdisc_match_from_root() and friends), I'd still
like to bring the patch below up for consideration.
Thanks.
From: Jiri Kosina <jkosina@suse.cz>
Subject: [PATCH] sch_tbf: avoid silent fallback to noop_qdisc
TBF started its life as a classless qdisc with a single builtin FIFO queue
which was being shaped.
When it got later turned into classful qdisc, it was written in a way that
the fallback qdisc was noop_qdisc, which produces bad user experience (delete
of last manually added class doesn't reset it to initial default, but renders
networking unusable instead).
Switch the default fallback to bfifo; this also mimics how the other guys
(HTB, HFSC, CBQ, ...) are behaving.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
net/sched/sch_tbf.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index 3161e49..b06dffe 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -508,8 +508,12 @@ static int tbf_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new,
{
struct tbf_sched_data *q = qdisc_priv(sch);
- if (new == NULL)
- new = &noop_qdisc;
+ if (new == NULL) {
+ /* reset to default qdisc */
+ new = qdisc_create_dflt(sch->dev_queue, &bfifo_qdisc_ops, sch->parent);
+ if (!new)
+ return -ENOBUFS;
+ }
*old = qdisc_replace(sch, new, &q->qdisc);
return 0;
--
Jiri Kosina
SUSE Labs
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-06-28 15:19 ` Jiri Kosina
@ 2016-06-28 17:28 ` Cong Wang
2016-06-28 17:33 ` Jiri Kosina
0 siblings, 1 reply; 43+ messages in thread
From: Cong Wang @ 2016-06-28 17:28 UTC (permalink / raw)
To: Jiri Kosina
Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter,
Linux Kernel Network Developers, LKML
On Tue, Jun 28, 2016 at 8:19 AM, Jiri Kosina <jikos@kernel.org> wrote:
> BTW, I've started to actually work on fixing this, and I've noticed that
> TBF behavior actually violates what's stated in pfifo_fast manpage:
>
> ==========
> Whenever an interface is created, the pfifo_fast qdisc is
> automatically used as a queue. If another qdisc is
> attached, it preempts the default pfifo_fast, which automatically
> returns to function when an existing qdisc is detached.
>
> In this sense this qdisc is magic, and unlike other qdiscs.
> ==========
It is out of date, now default qdisc can be set to any other qdisc
via /proc. Also, probably due to historical reasons, we don't have
a unified default default qdisc, some uses bfifo, some uses pfifo,
we may break some existing script if we change that.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Deleting child qdisc doesn't reset parent to default qdisc?
2016-06-28 17:28 ` Cong Wang
@ 2016-06-28 17:33 ` Jiri Kosina
0 siblings, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-06-28 17:33 UTC (permalink / raw)
To: Cong Wang
Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter,
Linux Kernel Network Developers, LKML
On Tue, 28 Jun 2016, Cong Wang wrote:
> > BTW, I've started to actually work on fixing this, and I've noticed that
> > TBF behavior actually violates what's stated in pfifo_fast manpage:
> >
> > ==========
> > Whenever an interface is created, the pfifo_fast qdisc is
> > automatically used as a queue. If another qdisc is
> > attached, it preempts the default pfifo_fast, which automatically
> > returns to function when an existing qdisc is detached.
> >
> > In this sense this qdisc is magic, and unlike other qdiscs.
> > ==========
>
> It is out of date, now default qdisc can be set to any other qdisc
> via /proc. Also, probably due to historical reasons, we don't have
> a unified default default qdisc, some uses bfifo, some uses pfifo,
> we may break some existing script if we change that.
While I do understand that reasoning, I'd argue that unpredictable and
unexpected behavior of TBF causing systems with non-working networking is
much more likely than any userspace having hard dependency on the fact
that default (*) qdisc for TBF is noop.
(*) where 'default upon creation' != 'default when reset'
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
2016-04-15 14:58 ` Eric Dumazet
` (2 preceding siblings ...)
2016-06-28 16:49 ` [RFC PATCH] sch_tbf: avoid silent fallback to noop_qdisc (was Re: Deleting child qdisc doesn't reset parent to default qdisc?) Jiri Kosina
@ 2016-07-07 9:04 ` Jiri Kosina
2016-07-07 13:51 ` Eric Dumazet
2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina
3 siblings, 2 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-07-07 9:04 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Fri, 15 Apr 2016, Eric Dumazet wrote:
> Anyway, we probably need to improve our ability to understand qdisc
> hierarchies. Having some hidden qdiscs is the real problem here.
>
> We need to add some hash table so that qdisc_match_from_root() does not
> have to scan hundred of qdiscs.
So how about something like the patch below? I already have preliminary
patches on top which unhide the default qdiscs, but let's make this one
step after the other.
Thanks.
From: Jiri Kosina <jkosina@suse.cz>
Subject: [PATCH] net: sched: convert qdisc linked list to hashtable
Convert the per-device linked list into a hashtable. The primary motivation
for this change is that currently, we're not tracking all the qdiscs in
hierarchy (e.g. excluding default qdiscs), as the lookup performed over the
linked list by qdisc_match_from_root() is rather expensive.
The ultimate goal is to get rid of hidden qdiscs completely, which will bring
much more determinism in user experience.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
include/linux/netdevice.h | 2 ++
include/net/pkt_sched.h | 4 ++--
include/net/sch_generic.h | 2 +-
net/core/dev.c | 1 +
net/sched/sch_api.c | 23 +++++++++++++----------
net/sched/sch_generic.c | 6 +++---
net/sched/sch_mq.c | 2 +-
net/sched/sch_mqprio.c | 2 +-
8 files changed, 24 insertions(+), 18 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f45929c..630838e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -52,6 +52,7 @@
#include <uapi/linux/netdevice.h>
#include <uapi/linux/if_bonding.h>
#include <uapi/linux/pkt_cls.h>
+#include <linux/hashtable.h>
struct netpoll_info;
struct device;
@@ -1778,6 +1779,7 @@ struct net_device {
unsigned int num_tx_queues;
unsigned int real_num_tx_queues;
struct Qdisc *qdisc;
+ DECLARE_HASHTABLE (qdisc_hash, 16);
unsigned long tx_queue_len;
spinlock_t tx_global_lock;
int watchdog_timeo;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index fea53f4..8ba11b4 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops);
void qdisc_get_default(char *id, size_t len);
int qdisc_set_default(const char *id);
-void qdisc_list_add(struct Qdisc *q);
-void qdisc_list_del(struct Qdisc *q);
+void qdisc_hash_add(struct Qdisc *q);
+void qdisc_hash_del(struct Qdisc *q);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle);
struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle);
struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 62d5531..26f5cb3 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -67,7 +67,7 @@ struct Qdisc {
u32 limit;
const struct Qdisc_ops *ops;
struct qdisc_size_table __rcu *stab;
- struct list_head list;
+ struct hlist_node hash;
u32 handle;
u32 parent;
int (*reshape_fail)(struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index 904ff43..edc1617 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7511,6 +7511,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
INIT_LIST_HEAD(&dev->all_adj_list.lower);
INIT_LIST_HEAD(&dev->ptype_all);
INIT_LIST_HEAD(&dev->ptype_specific);
+ hash_init(dev->qdisc_hash);
dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
setup(dev);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ddf047d..82953cb 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -29,6 +29,7 @@
#include <linux/hrtimer.h>
#include <linux/lockdep.h>
#include <linux/slab.h>
+#include <linux/hashtable.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
root->handle == handle)
return root;
- list_for_each_entry_rcu(q, &root->list, list) {
+ hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) {
if (q->handle == handle)
return q;
}
return NULL;
}
-void qdisc_list_add(struct Qdisc *q)
+void qdisc_hash_add(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
struct Qdisc *root = qdisc_dev(q)->qdisc;
WARN_ON_ONCE(root == &noop_qdisc);
ASSERT_RTNL();
- list_add_tail_rcu(&q->list, &root->list);
+ hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle);
}
}
-EXPORT_SYMBOL(qdisc_list_add);
+EXPORT_SYMBOL(qdisc_hash_add);
-void qdisc_list_del(struct Qdisc *q)
+void qdisc_hash_del(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
ASSERT_RTNL();
- list_del_rcu(&q->list);
+ hash_del_rcu(&q->hash);
}
}
-EXPORT_SYMBOL(qdisc_list_del);
+EXPORT_SYMBOL(qdisc_hash_del);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
{
@@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
goto err_out4;
}
- qdisc_list_add(sch);
+ qdisc_hash_add(sch);
return sch;
}
@@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
{
int ret = 0, q_idx = *q_idx_p;
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
goto done;
q_idx++;
}
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (q_idx < s_q_idx) {
q_idx++;
continue;
@@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
int *t_p, int s_t)
{
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
return -1;
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
return -1;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f9e0e9c..7efc923 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = {
.dequeue = noop_dequeue,
.flags = TCQ_F_BUILTIN,
.ops = &noop_qdisc_ops,
- .list = LIST_HEAD_INIT(noop_qdisc.list),
.q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock),
.dev_queue = &noop_netdev_queue,
.busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock),
@@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
sch->padded = (char *) sch - (char *) p;
}
- INIT_LIST_HEAD(&sch->list);
skb_queue_head_init(&sch->q);
spin_lock_init(&sch->busylock);
@@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
return;
#ifdef CONFIG_NET_SCHED
- qdisc_list_del(qdisc);
+ qdisc_hash_del(qdisc);
qdisc_put_stab(rtnl_dereference(qdisc->stab));
#endif
@@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev)
qdisc->ops->attach(qdisc);
}
}
+ if (dev->qdisc)
+ qdisc_hash_add(dev->qdisc);
}
static void transition_one_qdisc(struct net_device *dev,
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index 56a77b8..3bee15d 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch)
qdisc_destroy(old);
#ifdef CONFIG_NET_SCHED
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
#endif
}
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b8002ce..dbfb3a5 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch)
if (old)
qdisc_destroy(old);
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
}
kfree(priv->qdiscs);
priv->qdiscs = NULL;
--
Jiri Kosina
SUSE Labs
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable " Jiri Kosina
@ 2016-07-07 13:51 ` Eric Dumazet
2016-07-07 16:32 ` Jiri Kosina
2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina
1 sibling, 1 reply; 43+ messages in thread
From: Eric Dumazet @ 2016-07-07 13:51 UTC (permalink / raw)
To: Jiri Kosina, Craig Gallek
Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Thu, 2016-07-07 at 11:04 +0200, Jiri Kosina wrote:
>
>
> From: Jiri Kosina <jkosina@suse.cz>
> Subject: [PATCH] net: sched: convert qdisc linked list to hashtable
>
> Convert the per-device linked list into a hashtable. The primary motivation
> for this change is that currently, we're not tracking all the qdiscs in
> hierarchy (e.g. excluding default qdiscs), as the lookup performed over the
> linked list by qdisc_match_from_root() is rather expensive.
...
> }
> @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
> {
> int ret = 0, q_idx = *q_idx_p;
> struct Qdisc *q;
> + int b;
>
> if (!root)
> return 0;
> @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
> goto done;
> q_idx++;
> }
> - list_for_each_entry(q, &root->list, list) {
> + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
> if (q_idx < s_q_idx) {
> q_idx++;
> continue;
> @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
> int *t_p, int s_t)
> {
> struct Qdisc *q;
> + int b;
>
> if (!root)
> return 0;
> @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
> if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
> return -1;
>
> - list_for_each_entry(q, &root->list, list) {
> + hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) {
> if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
> return -1;
> }
Not sure why you used the rcu version here, but the non rcu version in
tc_dump_qdisc_root()
Thanks.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
2016-07-07 13:51 ` Eric Dumazet
@ 2016-07-07 16:32 ` Jiri Kosina
2016-07-07 16:53 ` Eric Dumazet
0 siblings, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-07-07 16:32 UTC (permalink / raw)
To: Eric Dumazet
Cc: Craig Gallek, Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Thu, 7 Jul 2016, Eric Dumazet wrote:
> > @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
> > {
> > int ret = 0, q_idx = *q_idx_p;
> > struct Qdisc *q;
> > + int b;
> >
> > if (!root)
> > return 0;
> > @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
> > goto done;
> > q_idx++;
> > }
> > - list_for_each_entry(q, &root->list, list) {
> > + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
> > if (q_idx < s_q_idx) {
> > q_idx++;
> > continue;
> > @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
> > int *t_p, int s_t)
> > {
> > struct Qdisc *q;
> > + int b;
> >
> > if (!root)
> > return 0;
> > @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
> > if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
> > return -1;
> >
> > - list_for_each_entry(q, &root->list, list) {
> > + hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) {
> > if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
> > return -1;
> > }
>
>
> Not sure why you used the rcu version here, but the non rcu version in
> tc_dump_qdisc_root()
Good catch.
Actually even the current code is odd in this regard --
qdisc_match_from_root() uses RCU iterator, while tc_dump_*() use the
non-RCU one; addition and deletion is performed using RCU primitives.
I haven't got my head around this yet; if it's correct at all, it'd at
least deserve a comment somewhere.
I'll respin v2 of the patch (there is also a conflict on HASH_SIZE
definition in ip6_tunnel.c, ip6_gre.c and sit.c due to hashtable.h include
in netdevice.h that needs to be resolved as well) that'd make RCU usage
consistent.
Any other objections/comments? I was namely curious about any opinions
regarding the hashtable size.
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
2016-07-07 16:32 ` Jiri Kosina
@ 2016-07-07 16:53 ` Eric Dumazet
0 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-07-07 16:53 UTC (permalink / raw)
To: Jiri Kosina
Cc: Craig Gallek, Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Thu, 2016-07-07 at 18:32 +0200, Jiri Kosina wrote:
> On Thu, 7 Jul 2016, Eric Dumazet wrote:
>
> > > @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
> > > {
> > > int ret = 0, q_idx = *q_idx_p;
> > > struct Qdisc *q;
> > > + int b;
> > >
> > > if (!root)
> > > return 0;
> > > @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
> > > goto done;
> > > q_idx++;
> > > }
> > > - list_for_each_entry(q, &root->list, list) {
> > > + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
> > > if (q_idx < s_q_idx) {
> > > q_idx++;
> > > continue;
> > > @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
> > > int *t_p, int s_t)
> > > {
> > > struct Qdisc *q;
> > > + int b;
> > >
> > > if (!root)
> > > return 0;
> > > @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
> > > if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
> > > return -1;
> > >
> > > - list_for_each_entry(q, &root->list, list) {
> > > + hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) {
> > > if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
> > > return -1;
> > > }
> >
> >
> > Not sure why you used the rcu version here, but the non rcu version in
> > tc_dump_qdisc_root()
>
> Good catch.
>
> Actually even the current code is odd in this regard --
> qdisc_match_from_root() uses RCU iterator,
Because it can be run from qdisc enqueue() dequeue(), not holding RTNL.
> while tc_dump_*() use the
> non-RCU one; addition and deletion is performed using RCU primitives.
It really is protected by RTNL (qdiscs can not change during the dump)
>
> I haven't got my head around this yet; if it's correct at all, it'd at
> least deserve a comment somewhere.
>
> I'll respin v2 of the patch (there is also a conflict on HASH_SIZE
> definition in ip6_tunnel.c, ip6_gre.c and sit.c due to hashtable.h include
> in netdevice.h that needs to be resolved as well) that'd make RCU usage
> consistent.
>
> Any other objections/comments? I was namely curious about any opinions
> regarding the hashtable size.
Well, this is the tricky part, but rhashtable would mean way more
changes...
^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable
2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable " Jiri Kosina
2016-07-07 13:51 ` Eric Dumazet
@ 2016-07-07 20:36 ` Jiri Kosina
2016-07-08 8:50 ` Eric Dumazet
` (2 more replies)
1 sibling, 3 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-07-07 20:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
From: Jiri Kosina <jkosina@suse.cz>
Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.
The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.
As we're adding hashtable.h include into generic netdevice.h, we have to make
sure HASH_SIZE macro is now non-conflicting with local definitions.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
v1 -> v2: fix up RCU hastable usage wrt. rtnl
fix compilation of .c files which define their own
HASH_SIZE that now oncflicts with the one from
hashtable.h (newly included via netdevice.h)
include/linux/netdevice.h | 2 ++
include/net/pkt_sched.h | 4 ++--
include/net/sch_generic.h | 2 +-
net/core/dev.c | 1 +
net/ipv6/ip6_gre.c | 8 ++++----
net/ipv6/ip6_tunnel.c | 6 +++---
net/ipv6/ip6_vti.c | 6 +++---
net/ipv6/sit.c | 10 +++++-----
net/sched/sch_api.c | 23 +++++++++++++----------
net/sched/sch_generic.c | 6 +++---
net/sched/sch_mq.c | 2 +-
net/sched/sch_mqprio.c | 2 +-
12 files changed, 39 insertions(+), 33 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f45929c..630838e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -52,6 +52,7 @@
#include <uapi/linux/netdevice.h>
#include <uapi/linux/if_bonding.h>
#include <uapi/linux/pkt_cls.h>
+#include <linux/hashtable.h>
struct netpoll_info;
struct device;
@@ -1778,6 +1779,7 @@ struct net_device {
unsigned int num_tx_queues;
unsigned int real_num_tx_queues;
struct Qdisc *qdisc;
+ DECLARE_HASHTABLE (qdisc_hash, 16);
unsigned long tx_queue_len;
spinlock_t tx_global_lock;
int watchdog_timeo;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index fea53f4..8ba11b4 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops);
void qdisc_get_default(char *id, size_t len);
int qdisc_set_default(const char *id);
-void qdisc_list_add(struct Qdisc *q);
-void qdisc_list_del(struct Qdisc *q);
+void qdisc_hash_add(struct Qdisc *q);
+void qdisc_hash_del(struct Qdisc *q);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle);
struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle);
struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 62d5531..26f5cb3 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -67,7 +67,7 @@ struct Qdisc {
u32 limit;
const struct Qdisc_ops *ops;
struct qdisc_size_table __rcu *stab;
- struct list_head list;
+ struct hlist_node hash;
u32 handle;
u32 parent;
int (*reshape_fail)(struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index 904ff43..edc1617 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7511,6 +7511,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
INIT_LIST_HEAD(&dev->all_adj_list.lower);
INIT_LIST_HEAD(&dev->ptype_all);
INIT_LIST_HEAD(&dev->ptype_specific);
+ hash_init(dev->qdisc_hash);
dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
setup(dev);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index fdc9de2..0f70ecc 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -62,11 +62,11 @@ module_param(log_ecn_error, bool, 0644);
MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define __HASH_SIZE (1 << HASH_SIZE_SHIFT)
static int ip6gre_net_id __read_mostly;
struct ip6gre_net {
- struct ip6_tnl __rcu *tunnels[4][HASH_SIZE];
+ struct ip6_tnl __rcu *tunnels[4][__HASH_SIZE];
struct net_device *fb_tunnel_dev;
};
@@ -96,7 +96,7 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu);
will match fallback tunnel.
*/
-#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(HASH_SIZE - 1))
+#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(__HASH_SIZE - 1))
static u32 HASH_ADDR(const struct in6_addr *addr)
{
u32 hash = ipv6_addr_hash(addr);
@@ -1089,7 +1089,7 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head)
for (prio = 0; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < __HASH_SIZE; h++) {
struct ip6_tnl *t;
t = rtnl_dereference(ign->tunnels[prio][h]);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 7b0481e..a9da620 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -65,7 +65,7 @@ MODULE_ALIAS_RTNL_LINK("ip6tnl");
MODULE_ALIAS_NETDEV("ip6tnl0");
#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define __HASH_SIZE (1 << HASH_SIZE_SHIFT)
static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
@@ -87,7 +87,7 @@ struct ip6_tnl_net {
/* the IPv6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[__HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -2031,7 +2031,7 @@ static void __net_exit ip6_tnl_destroy_tunnels(struct net *net)
if (dev->rtnl_link_ops == &ip6_link_ops)
unregister_netdevice_queue(dev, &list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < __HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
/* If dev is in the same netns, it has already
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index d90a11f..2d192af 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -51,7 +51,7 @@
#include <net/netns/generic.h>
#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define __HASH_SIZE (1 << HASH_SIZE_SHIFT)
static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
@@ -69,7 +69,7 @@ struct vti6_net {
/* the vti6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[__HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -1040,7 +1040,7 @@ static void __net_exit vti6_destroy_tunnels(struct vti6_net *ip6n)
struct ip6_tnl *t;
LIST_HEAD(list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < __HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
unregister_netdevice_queue(t->dev, &list);
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 0a5a255..9f776d7 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -62,7 +62,7 @@
For comments look at net/ipv4/ip_gre.c --ANK
*/
-#define HASH_SIZE 16
+#define __HASH_SIZE 16
#define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF)
static bool log_ecn_error = true;
@@ -78,9 +78,9 @@ static struct rtnl_link_ops sit_link_ops __read_mostly;
static int sit_net_id __read_mostly;
struct sit_net {
- struct ip_tunnel __rcu *tunnels_r_l[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_r[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_l[HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r_l[__HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r[__HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_l[__HASH_SIZE];
struct ip_tunnel __rcu *tunnels_wc[1];
struct ip_tunnel __rcu **tunnels[4];
@@ -1773,7 +1773,7 @@ static void __net_exit sit_destroy_tunnels(struct net *net,
for (prio = 1; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < __HASH_SIZE; h++) {
struct ip_tunnel *t;
t = rtnl_dereference(sitn->tunnels[prio][h]);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ddf047d..c093d32 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -29,6 +29,7 @@
#include <linux/hrtimer.h>
#include <linux/lockdep.h>
#include <linux/slab.h>
+#include <linux/hashtable.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
root->handle == handle)
return root;
- list_for_each_entry_rcu(q, &root->list, list) {
+ hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) {
if (q->handle == handle)
return q;
}
return NULL;
}
-void qdisc_list_add(struct Qdisc *q)
+void qdisc_hash_add(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
struct Qdisc *root = qdisc_dev(q)->qdisc;
WARN_ON_ONCE(root == &noop_qdisc);
ASSERT_RTNL();
- list_add_tail_rcu(&q->list, &root->list);
+ hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle);
}
}
-EXPORT_SYMBOL(qdisc_list_add);
+EXPORT_SYMBOL(qdisc_hash_add);
-void qdisc_list_del(struct Qdisc *q)
+void qdisc_hash_del(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
ASSERT_RTNL();
- list_del_rcu(&q->list);
+ hash_del_rcu(&q->hash);
}
}
-EXPORT_SYMBOL(qdisc_list_del);
+EXPORT_SYMBOL(qdisc_hash_del);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
{
@@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
goto err_out4;
}
- qdisc_list_add(sch);
+ qdisc_hash_add(sch);
return sch;
}
@@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
{
int ret = 0, q_idx = *q_idx_p;
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
goto done;
q_idx++;
}
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (q_idx < s_q_idx) {
q_idx++;
continue;
@@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
int *t_p, int s_t)
{
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
return -1;
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
return -1;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f9e0e9c..7efc923 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = {
.dequeue = noop_dequeue,
.flags = TCQ_F_BUILTIN,
.ops = &noop_qdisc_ops,
- .list = LIST_HEAD_INIT(noop_qdisc.list),
.q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock),
.dev_queue = &noop_netdev_queue,
.busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock),
@@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
sch->padded = (char *) sch - (char *) p;
}
- INIT_LIST_HEAD(&sch->list);
skb_queue_head_init(&sch->q);
spin_lock_init(&sch->busylock);
@@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
return;
#ifdef CONFIG_NET_SCHED
- qdisc_list_del(qdisc);
+ qdisc_hash_del(qdisc);
qdisc_put_stab(rtnl_dereference(qdisc->stab));
#endif
@@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev)
qdisc->ops->attach(qdisc);
}
}
+ if (dev->qdisc)
+ qdisc_hash_add(dev->qdisc);
}
static void transition_one_qdisc(struct net_device *dev,
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index 56a77b8..3bee15d 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch)
qdisc_destroy(old);
#ifdef CONFIG_NET_SCHED
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
#endif
}
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b8002ce..dbfb3a5 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch)
if (old)
qdisc_destroy(old);
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
}
kfree(priv->qdiscs);
priv->qdiscs = NULL;
--
Jiri Kosina
SUSE Labs
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable
2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina
@ 2016-07-08 8:50 ` Eric Dumazet
2016-07-08 9:02 ` Jiri Kosina
2016-07-08 11:07 ` Thomas Graf
2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina
2 siblings, 1 reply; 43+ messages in thread
From: Eric Dumazet @ 2016-07-08 8:50 UTC (permalink / raw)
To: Jiri Kosina; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Thu, 2016-07-07 at 22:36 +0200, Jiri Kosina wrote:
> From: Jiri Kosina <jkosina@suse.cz>
>
> Convert the per-device linked list into a hashtable. The primary
> motivation for this change is that currently, we're not tracking all the
> qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
> performed over the linked list by qdisc_match_from_root() is rather
> expensive.
>
> The ultimate goal is to get rid of hidden qdiscs completely, which will
> bring much more determinism in user experience.
>
> As we're adding hashtable.h include into generic netdevice.h, we have to make
> sure HASH_SIZE macro is now non-conflicting with local definitions.
>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> ---
> diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
> index fdc9de2..0f70ecc 100644
> --- a/net/ipv6/ip6_gre.c
> +++ b/net/ipv6/ip6_gre.c
> @@ -62,11 +62,11 @@ module_param(log_ecn_error, bool, 0644);
> MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
>
> #define HASH_SIZE_SHIFT 5
> -#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
> +#define __HASH_SIZE (1 << HASH_SIZE_SHIFT)
__ prefix is mostly used for functions having some kind of
shells/helpers.
I would rather use IP6_GRE_HASH_SIZE or something which has lower
chances of being used elsewhere.
Or maybe you could use new HASH_SIZE(name), providing proper 'name'
@@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev)
> qdisc->ops->attach(qdisc);
> }
> }
> + if (dev->qdisc)
> + qdisc_hash_add(dev->qdisc);
> }
>
I do not understand this addition, could you comment on it ?
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable
2016-07-08 8:50 ` Eric Dumazet
@ 2016-07-08 9:02 ` Jiri Kosina
0 siblings, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-07-08 9:02 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Fri, 8 Jul 2016, Eric Dumazet wrote:
> > diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
> > index fdc9de2..0f70ecc 100644
> > --- a/net/ipv6/ip6_gre.c
> > +++ b/net/ipv6/ip6_gre.c
> > @@ -62,11 +62,11 @@ module_param(log_ecn_error, bool, 0644);
> > MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
> >
> > #define HASH_SIZE_SHIFT 5
> > -#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
> > +#define __HASH_SIZE (1 << HASH_SIZE_SHIFT)
>
> __ prefix is mostly used for functions having some kind of
> shells/helpers.
>
> I would rather use IP6_GRE_HASH_SIZE or something which has lower
> chances of being used elsewhere.
Alright, makes sense, will do this in v3.
> @@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev)
> > qdisc->ops->attach(qdisc);
> > }
> > }
> > + if (dev->qdisc)
> > + qdisc_hash_add(dev->qdisc);
> > }
> >
>
> I do not understand this addition, could you comment on it ?
With linked lists, assigning to struct net_device's Qdisc pointer is
enough to "initialize" the linked list and have it contain one (root)
item. With hashtable, this is not the case, it needs to be explicitly
added.
Hmm, dev_init_scheduler() (and perhaps also dev_shutdown()) would possibly
need similar treatment in order to have accurate data there 100% of the
time even during initialization.
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable
2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina
2016-07-08 8:50 ` Eric Dumazet
@ 2016-07-08 11:07 ` Thomas Graf
2016-07-08 13:52 ` Eric Dumazet
2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina
2 siblings, 1 reply; 43+ messages in thread
From: Thomas Graf @ 2016-07-08 11:07 UTC (permalink / raw)
To: Jiri Kosina
Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On 07/07/16 at 10:36pm, Jiri Kosina wrote:
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index f45929c..630838e 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -52,6 +52,7 @@
> #include <uapi/linux/netdevice.h>
> #include <uapi/linux/if_bonding.h>
> #include <uapi/linux/pkt_cls.h>
> +#include <linux/hashtable.h>
>
> struct netpoll_info;
> struct device;
> @@ -1778,6 +1779,7 @@ struct net_device {
> unsigned int num_tx_queues;
> unsigned int real_num_tx_queues;
> struct Qdisc *qdisc;
> + DECLARE_HASHTABLE (qdisc_hash, 16);
This blows up net_device to an insane size: 64K * sizeof(struct
hlist_head). Can we allocate this on demand for net_devices where
it is actually needed? The majority of virtual devices won't need
this. Doesn't have to be rhashtable, can still be fixed size but
at least allocate it.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable
2016-07-08 11:07 ` Thomas Graf
@ 2016-07-08 13:52 ` Eric Dumazet
0 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-07-08 13:52 UTC (permalink / raw)
To: Thomas Graf
Cc: Jiri Kosina, Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
On Fri, 2016-07-08 at 13:07 +0200, Thomas Graf wrote:
> On 07/07/16 at 10:36pm, Jiri Kosina wrote:
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index f45929c..630838e 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -52,6 +52,7 @@
> > #include <uapi/linux/netdevice.h>
> > #include <uapi/linux/if_bonding.h>
> > #include <uapi/linux/pkt_cls.h>
> > +#include <linux/hashtable.h>
> >
> > struct netpoll_info;
> > struct device;
> > @@ -1778,6 +1779,7 @@ struct net_device {
> > unsigned int num_tx_queues;
> > unsigned int real_num_tx_queues;
> > struct Qdisc *qdisc;
> > + DECLARE_HASHTABLE (qdisc_hash, 16);
>
> This blows up net_device to an insane size: 64K * sizeof(struct
> hlist_head). Can we allocate this on demand for net_devices where
> it is actually needed? The majority of virtual devices won't need
> this. Doesn't have to be rhashtable, can still be fixed size but
> at least allocate it.
Jiri probably misread the API and should have used :
DECLARE_HASHTABLE (qdisc_hash, 4);
Google has a very similar patch with 16 buckets, and it is 'good enough',
although we do not hit the qdisc_tree_reduce_backlog() penalty.
^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable
2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina
2016-07-08 8:50 ` Eric Dumazet
2016-07-08 11:07 ` Thomas Graf
@ 2016-07-11 14:02 ` Jiri Kosina
2016-07-12 17:36 ` Cong Wang
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
2 siblings, 2 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-07-11 14:02 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel
From: Jiri Kosina <jkosina@suse.cz>
Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.
The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.
As we're adding hashtable.h include into generic netdevice.h, we have to
make sure HASH_SIZE macro is now non-conflicting with local definitions.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
v1 -> v2: fix up RCU hastable usage wrt. rtnl
fix compilation of .c files which define their own
HASH_SIZE that now oncflicts with the one from
hashtable.h (newly included via netdevice.h)
v2 -> v3: resolve HASH_SIZE identifier conflicts in a cleaner way
fix up the number of hash bucket bits (4 bits for 16 buckets)
include/linux/netdevice.h | 2 ++
include/net/pkt_sched.h | 4 ++--
include/net/sch_generic.h | 2 +-
net/core/dev.c | 1 +
net/ipv6/ip6_gre.c | 12 ++++++------
net/ipv6/ip6_tunnel.c | 10 +++++-----
net/ipv6/ip6_vti.c | 10 +++++-----
net/ipv6/sit.c | 10 +++++-----
net/sched/sch_api.c | 23 +++++++++++++----------
net/sched/sch_generic.c | 6 +++---
net/sched/sch_mq.c | 2 +-
net/sched/sch_mqprio.c | 2 +-
12 files changed, 45 insertions(+), 39 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f45929c..0b5c172e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -52,6 +52,7 @@
#include <uapi/linux/netdevice.h>
#include <uapi/linux/if_bonding.h>
#include <uapi/linux/pkt_cls.h>
+#include <linux/hashtable.h>
struct netpoll_info;
struct device;
@@ -1778,6 +1779,7 @@ struct net_device {
unsigned int num_tx_queues;
unsigned int real_num_tx_queues;
struct Qdisc *qdisc;
+ DECLARE_HASHTABLE (qdisc_hash, 4);
unsigned long tx_queue_len;
spinlock_t tx_global_lock;
int watchdog_timeo;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index fea53f4..8ba11b4 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops);
void qdisc_get_default(char *id, size_t len);
int qdisc_set_default(const char *id);
-void qdisc_list_add(struct Qdisc *q);
-void qdisc_list_del(struct Qdisc *q);
+void qdisc_hash_add(struct Qdisc *q);
+void qdisc_hash_del(struct Qdisc *q);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle);
struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle);
struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 62d5531..26f5cb3 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -67,7 +67,7 @@ struct Qdisc {
u32 limit;
const struct Qdisc_ops *ops;
struct qdisc_size_table __rcu *stab;
- struct list_head list;
+ struct hlist_node hash;
u32 handle;
u32 parent;
int (*reshape_fail)(struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index 904ff43..edc1617 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7511,6 +7511,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
INIT_LIST_HEAD(&dev->all_adj_list.lower);
INIT_LIST_HEAD(&dev->ptype_all);
INIT_LIST_HEAD(&dev->ptype_specific);
+ hash_init(dev->qdisc_hash);
dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
setup(dev);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index fdc9de2..d3697a4 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -61,12 +61,12 @@ static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_GRE_HASH_SIZE_SHIFT 5
+#define IP6_GRE_HASH_SIZE (1 << IP6_GRE_HASH_SIZE_SHIFT)
static int ip6gre_net_id __read_mostly;
struct ip6gre_net {
- struct ip6_tnl __rcu *tunnels[4][HASH_SIZE];
+ struct ip6_tnl __rcu *tunnels[4][IP6_GRE_HASH_SIZE];
struct net_device *fb_tunnel_dev;
};
@@ -96,12 +96,12 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu);
will match fallback tunnel.
*/
-#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(HASH_SIZE - 1))
+#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(IP6_GRE_HASH_SIZE - 1))
static u32 HASH_ADDR(const struct in6_addr *addr)
{
u32 hash = ipv6_addr_hash(addr);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_GRE_HASH_SIZE_SHIFT);
}
#define tunnels_r_l tunnels[3]
@@ -1089,7 +1089,7 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head)
for (prio = 0; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_GRE_HASH_SIZE; h++) {
struct ip6_tnl *t;
t = rtnl_dereference(ign->tunnels[prio][h]);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 7b0481e..2050217 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -64,8 +64,8 @@ MODULE_LICENSE("GPL");
MODULE_ALIAS_RTNL_LINK("ip6tnl");
MODULE_ALIAS_NETDEV("ip6tnl0");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_TUNNEL_HASH_SIZE_SHIFT 5
+#define IP6_TUNNEL_HASH_SIZE (1 << IP6_TUNNEL_HASH_SIZE_SHIFT)
static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
@@ -75,7 +75,7 @@ static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_TUNNEL_HASH_SIZE_SHIFT);
}
static int ip6_tnl_dev_init(struct net_device *dev);
@@ -87,7 +87,7 @@ struct ip6_tnl_net {
/* the IPv6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_TUNNEL_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -2031,7 +2031,7 @@ static void __net_exit ip6_tnl_destroy_tunnels(struct net *net)
if (dev->rtnl_link_ops == &ip6_link_ops)
unregister_netdevice_queue(dev, &list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_TUNNEL_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
/* If dev is in the same netns, it has already
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index d90a11f..cc7e058 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -50,14 +50,14 @@
#include <net/net_namespace.h>
#include <net/netns/generic.h>
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_VTI_HASH_SIZE_SHIFT 5
+#define IP6_VTI_HASH_SIZE (1 << IP6_VTI_HASH_SIZE_SHIFT)
static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_VTI_HASH_SIZE_SHIFT);
}
static int vti6_dev_init(struct net_device *dev);
@@ -69,7 +69,7 @@ struct vti6_net {
/* the vti6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_VTI_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -1040,7 +1040,7 @@ static void __net_exit vti6_destroy_tunnels(struct vti6_net *ip6n)
struct ip6_tnl *t;
LIST_HEAD(list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_VTI_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
unregister_netdevice_queue(t->dev, &list);
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 0a5a255..94dd0f0 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -62,7 +62,7 @@
For comments look at net/ipv4/ip_gre.c --ANK
*/
-#define HASH_SIZE 16
+#define IP6_SIT_HASH_SIZE 16
#define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF)
static bool log_ecn_error = true;
@@ -78,9 +78,9 @@ static struct rtnl_link_ops sit_link_ops __read_mostly;
static int sit_net_id __read_mostly;
struct sit_net {
- struct ip_tunnel __rcu *tunnels_r_l[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_r[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_l[HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r_l[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_l[IP6_SIT_HASH_SIZE];
struct ip_tunnel __rcu *tunnels_wc[1];
struct ip_tunnel __rcu **tunnels[4];
@@ -1773,7 +1773,7 @@ static void __net_exit sit_destroy_tunnels(struct net *net,
for (prio = 1; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_SIT_HASH_SIZE; h++) {
struct ip_tunnel *t;
t = rtnl_dereference(sitn->tunnels[prio][h]);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ddf047d..c093d32 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -29,6 +29,7 @@
#include <linux/hrtimer.h>
#include <linux/lockdep.h>
#include <linux/slab.h>
+#include <linux/hashtable.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
root->handle == handle)
return root;
- list_for_each_entry_rcu(q, &root->list, list) {
+ hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) {
if (q->handle == handle)
return q;
}
return NULL;
}
-void qdisc_list_add(struct Qdisc *q)
+void qdisc_hash_add(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
struct Qdisc *root = qdisc_dev(q)->qdisc;
WARN_ON_ONCE(root == &noop_qdisc);
ASSERT_RTNL();
- list_add_tail_rcu(&q->list, &root->list);
+ hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle);
}
}
-EXPORT_SYMBOL(qdisc_list_add);
+EXPORT_SYMBOL(qdisc_hash_add);
-void qdisc_list_del(struct Qdisc *q)
+void qdisc_hash_del(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
ASSERT_RTNL();
- list_del_rcu(&q->list);
+ hash_del_rcu(&q->hash);
}
}
-EXPORT_SYMBOL(qdisc_list_del);
+EXPORT_SYMBOL(qdisc_hash_del);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
{
@@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
goto err_out4;
}
- qdisc_list_add(sch);
+ qdisc_hash_add(sch);
return sch;
}
@@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
{
int ret = 0, q_idx = *q_idx_p;
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
goto done;
q_idx++;
}
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (q_idx < s_q_idx) {
q_idx++;
continue;
@@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
int *t_p, int s_t)
{
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
return -1;
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
return -1;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f9e0e9c..7efc923 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = {
.dequeue = noop_dequeue,
.flags = TCQ_F_BUILTIN,
.ops = &noop_qdisc_ops,
- .list = LIST_HEAD_INIT(noop_qdisc.list),
.q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock),
.dev_queue = &noop_netdev_queue,
.busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock),
@@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
sch->padded = (char *) sch - (char *) p;
}
- INIT_LIST_HEAD(&sch->list);
skb_queue_head_init(&sch->q);
spin_lock_init(&sch->busylock);
@@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
return;
#ifdef CONFIG_NET_SCHED
- qdisc_list_del(qdisc);
+ qdisc_hash_del(qdisc);
qdisc_put_stab(rtnl_dereference(qdisc->stab));
#endif
@@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev)
qdisc->ops->attach(qdisc);
}
}
+ if (dev->qdisc)
+ qdisc_hash_add(dev->qdisc);
}
static void transition_one_qdisc(struct net_device *dev,
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index 56a77b8..3bee15d 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch)
qdisc_destroy(old);
#ifdef CONFIG_NET_SCHED
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
#endif
}
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b8002ce..dbfb3a5 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch)
if (old)
qdisc_destroy(old);
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
}
kfree(priv->qdiscs);
priv->qdiscs = NULL;
--
Jiri Kosina
SUSE Labs
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable
2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina
@ 2016-07-12 17:36 ` Cong Wang
2016-07-13 13:47 ` Jiri Kosina
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
1 sibling, 1 reply; 43+ messages in thread
From: Cong Wang @ 2016-07-12 17:36 UTC (permalink / raw)
To: Jiri Kosina
Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter,
Linux Kernel Network Developers, LKML
On Mon, Jul 11, 2016 at 7:02 AM, Jiri Kosina <jikos@kernel.org> wrote:
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index f45929c..0b5c172e 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -52,6 +52,7 @@
> #include <uapi/linux/netdevice.h>
> #include <uapi/linux/if_bonding.h>
> #include <uapi/linux/pkt_cls.h>
> +#include <linux/hashtable.h>
>
> struct netpoll_info;
> struct device;
> @@ -1778,6 +1779,7 @@ struct net_device {
> unsigned int num_tx_queues;
> unsigned int real_num_tx_queues;
> struct Qdisc *qdisc;
> + DECLARE_HASHTABLE (qdisc_hash, 4);
> unsigned long tx_queue_len;
> spinlock_t tx_global_lock;
> int watchdog_timeo;
Should it be surrounded by CONFIG_NET_SCHED?
To save several bytes for !CONFIG_NET_SCHED case.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH v3] net: sched: convert qdisc linked list to hashtable
2016-07-12 17:36 ` Cong Wang
@ 2016-07-13 13:47 ` Jiri Kosina
0 siblings, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-07-13 13:47 UTC (permalink / raw)
To: Cong Wang
Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter,
Linux Kernel Network Developers, LKML
On Tue, 12 Jul 2016, Cong Wang wrote:
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index f45929c..0b5c172e 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -52,6 +52,7 @@
> > #include <uapi/linux/netdevice.h>
> > #include <uapi/linux/if_bonding.h>
> > #include <uapi/linux/pkt_cls.h>
> > +#include <linux/hashtable.h>
> >
> > struct netpoll_info;
> > struct device;
> > @@ -1778,6 +1779,7 @@ struct net_device {
> > unsigned int num_tx_queues;
> > unsigned int real_num_tx_queues;
> > struct Qdisc *qdisc;
> > + DECLARE_HASHTABLE (qdisc_hash, 4);
> > unsigned long tx_queue_len;
> > spinlock_t tx_global_lock;
> > int watchdog_timeo;
>
> Should it be surrounded by CONFIG_NET_SCHED?
> To save several bytes for !CONFIG_NET_SCHED case.
Makes sense. I'll wait a bit for more feedback (if there is any) before
including this in potential v4.
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina
2016-07-12 17:36 ` Cong Wang
@ 2016-07-28 9:56 ` Jiri Kosina
2016-07-28 11:10 ` kbuild test robot
` (4 more replies)
1 sibling, 5 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-07-28 9:56 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel, Cong Wang
From: Jiri Kosina <jkosina@suse.cz>
Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.
The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.
As we're adding hashtable.h include into generic netdevice.h, we have to
make sure HASH_SIZE macro is now non-conflicting with local definitions.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
v1 -> v2: fix up RCU hastable usage wrt. rtnl
fix compilation of .c files which define their own
HASH_SIZE that now oncflicts with the one from
hashtable.h (newly included via netdevice.h)
v2 -> v3: resolve HASH_SIZE identifier conflicts in a cleaner way
fix up the number of hash bucket bits (4 bits for 16 buckets)
v3 -> v4: put the hastable into struct netdevice only if
CONFIG_NET_SCHED has been enabled
include/linux/netdevice.h | 4 ++++
include/net/pkt_sched.h | 4 ++--
include/net/sch_generic.h | 2 +-
net/core/dev.c | 3 +++
net/ipv6/ip6_gre.c | 12 ++++++------
net/ipv6/ip6_tunnel.c | 10 +++++-----
net/ipv6/ip6_vti.c | 10 +++++-----
net/ipv6/sit.c | 10 +++++-----
net/sched/sch_api.c | 23 +++++++++++++----------
net/sched/sch_generic.c | 6 +++---
net/sched/sch_mq.c | 2 +-
net/sched/sch_mqprio.c | 2 +-
12 files changed, 49 insertions(+), 39 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f45929c..17c6499 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -52,6 +52,7 @@
#include <uapi/linux/netdevice.h>
#include <uapi/linux/if_bonding.h>
#include <uapi/linux/pkt_cls.h>
+#include <linux/hashtable.h>
struct netpoll_info;
struct device;
@@ -1778,6 +1779,9 @@ struct net_device {
unsigned int num_tx_queues;
unsigned int real_num_tx_queues;
struct Qdisc *qdisc;
+#ifdef CONFIG_NET_SCHED
+ DECLARE_HASHTABLE (qdisc_hash, 4);
+#endif
unsigned long tx_queue_len;
spinlock_t tx_global_lock;
int watchdog_timeo;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index fea53f4..8ba11b4 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops);
void qdisc_get_default(char *id, size_t len);
int qdisc_set_default(const char *id);
-void qdisc_list_add(struct Qdisc *q);
-void qdisc_list_del(struct Qdisc *q);
+void qdisc_hash_add(struct Qdisc *q);
+void qdisc_hash_del(struct Qdisc *q);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle);
struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle);
struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 62d5531..26f5cb3 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -67,7 +67,7 @@ struct Qdisc {
u32 limit;
const struct Qdisc_ops *ops;
struct qdisc_size_table __rcu *stab;
- struct list_head list;
+ struct hlist_node hash;
u32 handle;
u32 parent;
int (*reshape_fail)(struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index 904ff43..d3736d5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7511,6 +7511,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
INIT_LIST_HEAD(&dev->all_adj_list.lower);
INIT_LIST_HEAD(&dev->ptype_all);
INIT_LIST_HEAD(&dev->ptype_specific);
+#ifdef CONFIG_NET_SCHED
+ hash_init(dev->qdisc_hash);
+#endif
dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
setup(dev);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index fdc9de2..d3697a4 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -61,12 +61,12 @@ static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_GRE_HASH_SIZE_SHIFT 5
+#define IP6_GRE_HASH_SIZE (1 << IP6_GRE_HASH_SIZE_SHIFT)
static int ip6gre_net_id __read_mostly;
struct ip6gre_net {
- struct ip6_tnl __rcu *tunnels[4][HASH_SIZE];
+ struct ip6_tnl __rcu *tunnels[4][IP6_GRE_HASH_SIZE];
struct net_device *fb_tunnel_dev;
};
@@ -96,12 +96,12 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu);
will match fallback tunnel.
*/
-#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(HASH_SIZE - 1))
+#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(IP6_GRE_HASH_SIZE - 1))
static u32 HASH_ADDR(const struct in6_addr *addr)
{
u32 hash = ipv6_addr_hash(addr);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_GRE_HASH_SIZE_SHIFT);
}
#define tunnels_r_l tunnels[3]
@@ -1089,7 +1089,7 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head)
for (prio = 0; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_GRE_HASH_SIZE; h++) {
struct ip6_tnl *t;
t = rtnl_dereference(ign->tunnels[prio][h]);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 7b0481e..2050217 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -64,8 +64,8 @@ MODULE_LICENSE("GPL");
MODULE_ALIAS_RTNL_LINK("ip6tnl");
MODULE_ALIAS_NETDEV("ip6tnl0");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_TUNNEL_HASH_SIZE_SHIFT 5
+#define IP6_TUNNEL_HASH_SIZE (1 << IP6_TUNNEL_HASH_SIZE_SHIFT)
static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
@@ -75,7 +75,7 @@ static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_TUNNEL_HASH_SIZE_SHIFT);
}
static int ip6_tnl_dev_init(struct net_device *dev);
@@ -87,7 +87,7 @@ struct ip6_tnl_net {
/* the IPv6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_TUNNEL_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -2031,7 +2031,7 @@ static void __net_exit ip6_tnl_destroy_tunnels(struct net *net)
if (dev->rtnl_link_ops == &ip6_link_ops)
unregister_netdevice_queue(dev, &list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_TUNNEL_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
/* If dev is in the same netns, it has already
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index d90a11f..cc7e058 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -50,14 +50,14 @@
#include <net/net_namespace.h>
#include <net/netns/generic.h>
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_VTI_HASH_SIZE_SHIFT 5
+#define IP6_VTI_HASH_SIZE (1 << IP6_VTI_HASH_SIZE_SHIFT)
static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_VTI_HASH_SIZE_SHIFT);
}
static int vti6_dev_init(struct net_device *dev);
@@ -69,7 +69,7 @@ struct vti6_net {
/* the vti6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_VTI_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -1040,7 +1040,7 @@ static void __net_exit vti6_destroy_tunnels(struct vti6_net *ip6n)
struct ip6_tnl *t;
LIST_HEAD(list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_VTI_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
unregister_netdevice_queue(t->dev, &list);
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 0a5a255..94dd0f0 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -62,7 +62,7 @@
For comments look at net/ipv4/ip_gre.c --ANK
*/
-#define HASH_SIZE 16
+#define IP6_SIT_HASH_SIZE 16
#define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF)
static bool log_ecn_error = true;
@@ -78,9 +78,9 @@ static struct rtnl_link_ops sit_link_ops __read_mostly;
static int sit_net_id __read_mostly;
struct sit_net {
- struct ip_tunnel __rcu *tunnels_r_l[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_r[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_l[HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r_l[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_l[IP6_SIT_HASH_SIZE];
struct ip_tunnel __rcu *tunnels_wc[1];
struct ip_tunnel __rcu **tunnels[4];
@@ -1773,7 +1773,7 @@ static void __net_exit sit_destroy_tunnels(struct net *net,
for (prio = 1; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_SIT_HASH_SIZE; h++) {
struct ip_tunnel *t;
t = rtnl_dereference(sitn->tunnels[prio][h]);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ddf047d..c093d32 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -29,6 +29,7 @@
#include <linux/hrtimer.h>
#include <linux/lockdep.h>
#include <linux/slab.h>
+#include <linux/hashtable.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
root->handle == handle)
return root;
- list_for_each_entry_rcu(q, &root->list, list) {
+ hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) {
if (q->handle == handle)
return q;
}
return NULL;
}
-void qdisc_list_add(struct Qdisc *q)
+void qdisc_hash_add(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
struct Qdisc *root = qdisc_dev(q)->qdisc;
WARN_ON_ONCE(root == &noop_qdisc);
ASSERT_RTNL();
- list_add_tail_rcu(&q->list, &root->list);
+ hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle);
}
}
-EXPORT_SYMBOL(qdisc_list_add);
+EXPORT_SYMBOL(qdisc_hash_add);
-void qdisc_list_del(struct Qdisc *q)
+void qdisc_hash_del(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
ASSERT_RTNL();
- list_del_rcu(&q->list);
+ hash_del_rcu(&q->hash);
}
}
-EXPORT_SYMBOL(qdisc_list_del);
+EXPORT_SYMBOL(qdisc_hash_del);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
{
@@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
goto err_out4;
}
- qdisc_list_add(sch);
+ qdisc_hash_add(sch);
return sch;
}
@@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
{
int ret = 0, q_idx = *q_idx_p;
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
goto done;
q_idx++;
}
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (q_idx < s_q_idx) {
q_idx++;
continue;
@@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
int *t_p, int s_t)
{
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
return -1;
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
return -1;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f9e0e9c..7efc923 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = {
.dequeue = noop_dequeue,
.flags = TCQ_F_BUILTIN,
.ops = &noop_qdisc_ops,
- .list = LIST_HEAD_INIT(noop_qdisc.list),
.q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock),
.dev_queue = &noop_netdev_queue,
.busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock),
@@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
sch->padded = (char *) sch - (char *) p;
}
- INIT_LIST_HEAD(&sch->list);
skb_queue_head_init(&sch->q);
spin_lock_init(&sch->busylock);
@@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
return;
#ifdef CONFIG_NET_SCHED
- qdisc_list_del(qdisc);
+ qdisc_hash_del(qdisc);
qdisc_put_stab(rtnl_dereference(qdisc->stab));
#endif
@@ -732,6 +730,8 @@ static void attach_default_qdiscs(struct net_device *dev)
qdisc->ops->attach(qdisc);
}
}
+ if (dev->qdisc)
+ qdisc_hash_add(dev->qdisc);
}
static void transition_one_qdisc(struct net_device *dev,
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index 56a77b8..3bee15d 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch)
qdisc_destroy(old);
#ifdef CONFIG_NET_SCHED
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
#endif
}
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b8002ce..dbfb3a5 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch)
if (old)
qdisc_destroy(old);
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
}
kfree(priv->qdiscs);
priv->qdiscs = NULL;
--
Jiri Kosina
SUSE Labs
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
@ 2016-07-28 11:10 ` kbuild test robot
2016-07-28 11:18 ` Jiri Kosina
2016-07-28 11:22 ` kbuild test robot
` (3 subsequent siblings)
4 siblings, 1 reply; 43+ messages in thread
From: kbuild test robot @ 2016-07-28 11:10 UTC (permalink / raw)
To: Jiri Kosina
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
[-- Attachment #1: Type: text/plain, Size: 874 bytes --]
Hi,
[auto build test ERROR on v4.7-rc7]
[also build test ERROR on next-20160728]
[cannot apply to net/master net-next/master ipsec-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160728-182303
config: i386-randconfig-s0-201630 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
net/built-in.o: In function `dev_activate':
>> (.text+0x37ccb): undefined reference to `qdisc_hash_add'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 23936 bytes --]
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 11:10 ` kbuild test robot
@ 2016-07-28 11:18 ` Jiri Kosina
2016-07-28 12:53 ` Fengguang Wu
0 siblings, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-07-28 11:18 UTC (permalink / raw)
To: kbuild test robot
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
On Thu, 28 Jul 2016, kbuild test robot wrote:
> [auto build test ERROR on v4.7-rc7]
> [also build test ERROR on next-20160728]
> [cannot apply to net/master net-next/master ipsec-next/master]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160728-182303
> config: i386-randconfig-s0-201630 (attached as .config)
> compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386
>
> All errors (new ones prefixed by >>):
>
> net/built-in.o: In function `dev_activate':
> >> (.text+0x37ccb): undefined reference to `qdisc_hash_add'
Dear 0-day team,
could you please check my question regarding this very build failure here?
lkml.kernel.org/r/alpine.LNX.2.00.1607141612560.24757@cbobk.fhfr.pm
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
2016-07-28 11:10 ` kbuild test robot
@ 2016-07-28 11:22 ` kbuild test robot
2016-07-28 11:52 ` kbuild test robot
` (2 subsequent siblings)
4 siblings, 0 replies; 43+ messages in thread
From: kbuild test robot @ 2016-07-28 11:22 UTC (permalink / raw)
To: Jiri Kosina
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
[-- Attachment #1: Type: text/plain, Size: 1341 bytes --]
Hi,
[auto build test ERROR on v4.7-rc7]
[also build test ERROR on next-20160728]
[cannot apply to net/master net-next/master ipsec-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160728-182303
config: sh-edosk7760_defconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sh
All errors (new ones prefixed by >>):
net/built-in.o: In function `dev_activate':
>> net/sched/sch_generic.c:733: undefined reference to `qdisc_hash_add'
vim +733 net/sched/sch_generic.c
727 qdisc = qdisc_create_dflt(txq, &mq_qdisc_ops, TC_H_ROOT);
728 if (qdisc) {
729 dev->qdisc = qdisc;
730 qdisc->ops->attach(qdisc);
731 }
732 }
> 733 if (dev->qdisc)
734 qdisc_hash_add(dev->qdisc);
735 }
736
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 12180 bytes --]
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
2016-07-28 11:10 ` kbuild test robot
2016-07-28 11:22 ` kbuild test robot
@ 2016-07-28 11:52 ` kbuild test robot
2016-07-28 16:52 ` Cong Wang
2016-07-29 7:49 ` [PATCH v5] " Jiri Kosina
4 siblings, 0 replies; 43+ messages in thread
From: kbuild test robot @ 2016-07-28 11:52 UTC (permalink / raw)
To: Jiri Kosina
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
[-- Attachment #1: Type: text/plain, Size: 1053 bytes --]
Hi,
[auto build test ERROR on v4.7-rc7]
[also build test ERROR on next-20160728]
[cannot apply to net/master net-next/master ipsec-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160728-182303
config: arm-sunxi_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm
All errors (new ones prefixed by >>):
net/built-in.o: In function `dev_activate':
>> dns_query.c:(.text+0x39adc): undefined reference to `qdisc_hash_add'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 19385 bytes --]
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 11:18 ` Jiri Kosina
@ 2016-07-28 12:53 ` Fengguang Wu
2016-07-28 16:53 ` Cong Wang
2016-07-31 11:21 ` Fengguang Wu
0 siblings, 2 replies; 43+ messages in thread
From: Fengguang Wu @ 2016-07-28 12:53 UTC (permalink / raw)
To: Jiri Kosina
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
On Thu, Jul 28, 2016 at 01:18:27PM +0200, Jiri Kosina wrote:
>On Thu, 28 Jul 2016, kbuild test robot wrote:
>
>> [auto build test ERROR on v4.7-rc7]
>> [also build test ERROR on next-20160728]
>> [cannot apply to net/master net-next/master ipsec-next/master]
>> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>>
>> url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160728-182303
>> config: i386-randconfig-s0-201630 (attached as .config)
>> compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
>> reproduce:
>> # save the attached .config to linux build tree
>> make ARCH=i386
>>
>> All errors (new ones prefixed by >>):
>>
>> net/built-in.o: In function `dev_activate':
>> >> (.text+0x37ccb): undefined reference to `qdisc_hash_add'
>
>Dear 0-day team,
>
>could you please check my question regarding this very build failure here?
>
> lkml.kernel.org/r/alpine.LNX.2.00.1607141612560.24757@cbobk.fhfr.pm
Sorry I missed that. For your convenience, here is the answer to the
original email:
>This issue is be there even without my patch (but with qdisc_list_add
>instead), isn't it?
Yes it looks so, this number happens in a number of places:
dns_query.c:(.text+0x39b84): undefined reference to `qdisc_hash_add'
include/linux/netdevice.h:1935: undefined reference to `qdisc_hash_add'
net/core/netevent.c:31: undefined reference to `qdisc_hash_add'
net/sched/sch_generic.c:789: undefined reference to `qdisc_hash_add'
sch_generic.c:(.text+0x33487): undefined reference to `qdisc_hash_add'
switchdev.c:(.text+0x3bf58): undefined reference to `qdisc_hash_add'
sysctl_net.c:(.text+0x31f70): undefined reference to `qdisc_hash_add'
(.text.dev_activate+0x228): undefined reference to `qdisc_hash_add'
(.text+0x37d0b): undefined reference to `qdisc_hash_add'
wext-proc.c:(.text+0x390a8): undefined reference to `qdisc_hash_add'
>The problem is that sch_generic.c (where dev_activate() is) is being
>compiled everytime CONFIG_NET is set, but sch_api.c (where
>qdisc_list_add() is defined) only when CONFIG_NET_SCHED is set (and there
>is no stub for !CONFIG_NET_SCHED case).
So it looks like a more general problem than specific to this patch.
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
` (2 preceding siblings ...)
2016-07-28 11:52 ` kbuild test robot
@ 2016-07-28 16:52 ` Cong Wang
2016-07-29 7:49 ` [PATCH v5] " Jiri Kosina
4 siblings, 0 replies; 43+ messages in thread
From: Cong Wang @ 2016-07-28 16:52 UTC (permalink / raw)
To: Jiri Kosina
Cc: Eric Dumazet, Jamal Hadi Salim, Phil Sutter,
Linux Kernel Network Developers, LKML
On Thu, Jul 28, 2016 at 2:56 AM, Jiri Kosina <jikos@kernel.org> wrote:
> From: Jiri Kosina <jkosina@suse.cz>
>
> Convert the per-device linked list into a hashtable. The primary
> motivation for this change is that currently, we're not tracking all the
> qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
> performed over the linked list by qdisc_match_from_root() is rather
> expensive.
>
> The ultimate goal is to get rid of hidden qdiscs completely, which will
> bring much more determinism in user experience.
>
> As we're adding hashtable.h include into generic netdevice.h, we have to
> make sure HASH_SIZE macro is now non-conflicting with local definitions.
>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> ---
> v1 -> v2: fix up RCU hastable usage wrt. rtnl
> fix compilation of .c files which define their own
> HASH_SIZE that now oncflicts with the one from
> hashtable.h (newly included via netdevice.h)
>
> v2 -> v3: resolve HASH_SIZE identifier conflicts in a cleaner way
> fix up the number of hash bucket bits (4 bits for 16 buckets)
>
> v3 -> v4: put the hastable into struct netdevice only if
> CONFIG_NET_SCHED has been enabled
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Thanks!
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 12:53 ` Fengguang Wu
@ 2016-07-28 16:53 ` Cong Wang
2016-07-31 11:21 ` Fengguang Wu
1 sibling, 0 replies; 43+ messages in thread
From: Cong Wang @ 2016-07-28 16:53 UTC (permalink / raw)
To: Fengguang Wu
Cc: Jiri Kosina, kbuild-all, Eric Dumazet, Jamal Hadi Salim,
Phil Sutter, Linux Kernel Network Developers, LKML
On Thu, Jul 28, 2016 at 5:53 AM, Fengguang Wu <lkp@intel.com> wrote:
> On Thu, Jul 28, 2016 at 01:18:27PM +0200, Jiri Kosina wrote:
>> This issue is be there even without my patch (but with qdisc_list_add
>> instead), isn't it?
>
>
> Yes it looks so, this number happens in a number of places:
>
> dns_query.c:(.text+0x39b84): undefined reference to `qdisc_hash_add'
> include/linux/netdevice.h:1935: undefined reference to `qdisc_hash_add'
> net/core/netevent.c:31: undefined reference to `qdisc_hash_add'
> net/sched/sch_generic.c:789: undefined reference to `qdisc_hash_add'
> sch_generic.c:(.text+0x33487): undefined reference to `qdisc_hash_add'
> switchdev.c:(.text+0x3bf58): undefined reference to `qdisc_hash_add'
> sysctl_net.c:(.text+0x31f70): undefined reference to `qdisc_hash_add'
> (.text.dev_activate+0x228): undefined reference to `qdisc_hash_add'
> (.text+0x37d0b): undefined reference to `qdisc_hash_add'
> wext-proc.c:(.text+0x390a8): undefined reference to `qdisc_hash_add'
>
>> The problem is that sch_generic.c (where dev_activate() is) is being
>> compiled everytime CONFIG_NET is set, but sch_api.c (where
>> qdisc_list_add() is defined) only when CONFIG_NET_SCHED is set (and there
>> is no stub for !CONFIG_NET_SCHED case).
>
>
> So it looks like a more general problem than specific to this patch.
Agreed. I can send a patch if Jiri doesn't. ;)
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v5] net: sched: convert qdisc linked list to hashtable
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
` (3 preceding siblings ...)
2016-07-28 16:52 ` Cong Wang
@ 2016-07-29 7:49 ` Jiri Kosina
2016-07-29 20:01 ` kbuild test robot
` (2 more replies)
4 siblings, 3 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-07-29 7:49 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel, Cong Wang
From: Jiri Kosina <jkosina@suse.cz>
Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.
The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.
As we're adding hashtable.h include into generic netdevice.h, we have to
make sure HASH_SIZE macro is now non-conflicting with local definitions.
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
v1 -> v2: fix up RCU hastable usage wrt. rtnl
fix compilation of .c files which define their own
HASH_SIZE that now oncflicts with the one from
hashtable.h (newly included via netdevice.h)
v2 -> v3: resolve HASH_SIZE identifier conflicts in a cleaner way
fix up the number of hash bucket bits (4 bits for 16 buckets)
v3 -> v4: put the hastable into struct netdevice only if
CONFIG_NET_SCHED has been enabled
v4 -> v5: fix !CONFIG_NET_SCHED build (reported by Fengguang Wu)
add Cong Wang's reviewed-by
include/linux/netdevice.h | 4 ++++
include/net/pkt_sched.h | 4 ++--
include/net/sch_generic.h | 2 +-
net/core/dev.c | 3 +++
net/ipv6/ip6_gre.c | 12 ++++++------
net/ipv6/ip6_tunnel.c | 10 +++++-----
net/ipv6/ip6_vti.c | 10 +++++-----
net/ipv6/sit.c | 10 +++++-----
net/sched/sch_api.c | 23 +++++++++++++----------
net/sched/sch_generic.c | 8 +++++---
net/sched/sch_mq.c | 2 +-
net/sched/sch_mqprio.c | 2 +-
12 files changed, 51 insertions(+), 39 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f45929c..17c6499 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -52,6 +52,7 @@
#include <uapi/linux/netdevice.h>
#include <uapi/linux/if_bonding.h>
#include <uapi/linux/pkt_cls.h>
+#include <linux/hashtable.h>
struct netpoll_info;
struct device;
@@ -1778,6 +1779,9 @@ struct net_device {
unsigned int num_tx_queues;
unsigned int real_num_tx_queues;
struct Qdisc *qdisc;
+#ifdef CONFIG_NET_SCHED
+ DECLARE_HASHTABLE (qdisc_hash, 4);
+#endif
unsigned long tx_queue_len;
spinlock_t tx_global_lock;
int watchdog_timeo;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index fea53f4..8ba11b4 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops);
void qdisc_get_default(char *id, size_t len);
int qdisc_set_default(const char *id);
-void qdisc_list_add(struct Qdisc *q);
-void qdisc_list_del(struct Qdisc *q);
+void qdisc_hash_add(struct Qdisc *q);
+void qdisc_hash_del(struct Qdisc *q);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle);
struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle);
struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 62d5531..26f5cb3 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -67,7 +67,7 @@ struct Qdisc {
u32 limit;
const struct Qdisc_ops *ops;
struct qdisc_size_table __rcu *stab;
- struct list_head list;
+ struct hlist_node hash;
u32 handle;
u32 parent;
int (*reshape_fail)(struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index 904ff43..d3736d5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7511,6 +7511,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
INIT_LIST_HEAD(&dev->all_adj_list.lower);
INIT_LIST_HEAD(&dev->ptype_all);
INIT_LIST_HEAD(&dev->ptype_specific);
+#ifdef CONFIG_NET_SCHED
+ hash_init(dev->qdisc_hash);
+#endif
dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
setup(dev);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index fdc9de2..d3697a4 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -61,12 +61,12 @@ static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_GRE_HASH_SIZE_SHIFT 5
+#define IP6_GRE_HASH_SIZE (1 << IP6_GRE_HASH_SIZE_SHIFT)
static int ip6gre_net_id __read_mostly;
struct ip6gre_net {
- struct ip6_tnl __rcu *tunnels[4][HASH_SIZE];
+ struct ip6_tnl __rcu *tunnels[4][IP6_GRE_HASH_SIZE];
struct net_device *fb_tunnel_dev;
};
@@ -96,12 +96,12 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu);
will match fallback tunnel.
*/
-#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(HASH_SIZE - 1))
+#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(IP6_GRE_HASH_SIZE - 1))
static u32 HASH_ADDR(const struct in6_addr *addr)
{
u32 hash = ipv6_addr_hash(addr);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_GRE_HASH_SIZE_SHIFT);
}
#define tunnels_r_l tunnels[3]
@@ -1089,7 +1089,7 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head)
for (prio = 0; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_GRE_HASH_SIZE; h++) {
struct ip6_tnl *t;
t = rtnl_dereference(ign->tunnels[prio][h]);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 7b0481e..2050217 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -64,8 +64,8 @@ MODULE_LICENSE("GPL");
MODULE_ALIAS_RTNL_LINK("ip6tnl");
MODULE_ALIAS_NETDEV("ip6tnl0");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_TUNNEL_HASH_SIZE_SHIFT 5
+#define IP6_TUNNEL_HASH_SIZE (1 << IP6_TUNNEL_HASH_SIZE_SHIFT)
static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
@@ -75,7 +75,7 @@ static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_TUNNEL_HASH_SIZE_SHIFT);
}
static int ip6_tnl_dev_init(struct net_device *dev);
@@ -87,7 +87,7 @@ struct ip6_tnl_net {
/* the IPv6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_TUNNEL_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -2031,7 +2031,7 @@ static void __net_exit ip6_tnl_destroy_tunnels(struct net *net)
if (dev->rtnl_link_ops == &ip6_link_ops)
unregister_netdevice_queue(dev, &list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_TUNNEL_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
/* If dev is in the same netns, it has already
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index d90a11f..cc7e058 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -50,14 +50,14 @@
#include <net/net_namespace.h>
#include <net/netns/generic.h>
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_VTI_HASH_SIZE_SHIFT 5
+#define IP6_VTI_HASH_SIZE (1 << IP6_VTI_HASH_SIZE_SHIFT)
static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_VTI_HASH_SIZE_SHIFT);
}
static int vti6_dev_init(struct net_device *dev);
@@ -69,7 +69,7 @@ struct vti6_net {
/* the vti6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_VTI_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -1040,7 +1040,7 @@ static void __net_exit vti6_destroy_tunnels(struct vti6_net *ip6n)
struct ip6_tnl *t;
LIST_HEAD(list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_VTI_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
unregister_netdevice_queue(t->dev, &list);
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 0a5a255..94dd0f0 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -62,7 +62,7 @@
For comments look at net/ipv4/ip_gre.c --ANK
*/
-#define HASH_SIZE 16
+#define IP6_SIT_HASH_SIZE 16
#define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF)
static bool log_ecn_error = true;
@@ -78,9 +78,9 @@ static struct rtnl_link_ops sit_link_ops __read_mostly;
static int sit_net_id __read_mostly;
struct sit_net {
- struct ip_tunnel __rcu *tunnels_r_l[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_r[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_l[HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r_l[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_l[IP6_SIT_HASH_SIZE];
struct ip_tunnel __rcu *tunnels_wc[1];
struct ip_tunnel __rcu **tunnels[4];
@@ -1773,7 +1773,7 @@ static void __net_exit sit_destroy_tunnels(struct net *net,
for (prio = 1; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_SIT_HASH_SIZE; h++) {
struct ip_tunnel *t;
t = rtnl_dereference(sitn->tunnels[prio][h]);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ddf047d..c093d32 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -29,6 +29,7 @@
#include <linux/hrtimer.h>
#include <linux/lockdep.h>
#include <linux/slab.h>
+#include <linux/hashtable.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
root->handle == handle)
return root;
- list_for_each_entry_rcu(q, &root->list, list) {
+ hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) {
if (q->handle == handle)
return q;
}
return NULL;
}
-void qdisc_list_add(struct Qdisc *q)
+void qdisc_hash_add(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
struct Qdisc *root = qdisc_dev(q)->qdisc;
WARN_ON_ONCE(root == &noop_qdisc);
ASSERT_RTNL();
- list_add_tail_rcu(&q->list, &root->list);
+ hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle);
}
}
-EXPORT_SYMBOL(qdisc_list_add);
+EXPORT_SYMBOL(qdisc_hash_add);
-void qdisc_list_del(struct Qdisc *q)
+void qdisc_hash_del(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
ASSERT_RTNL();
- list_del_rcu(&q->list);
+ hash_del_rcu(&q->hash);
}
}
-EXPORT_SYMBOL(qdisc_list_del);
+EXPORT_SYMBOL(qdisc_hash_del);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
{
@@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
goto err_out4;
}
- qdisc_list_add(sch);
+ qdisc_hash_add(sch);
return sch;
}
@@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
{
int ret = 0, q_idx = *q_idx_p;
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
goto done;
q_idx++;
}
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (q_idx < s_q_idx) {
q_idx++;
continue;
@@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
int *t_p, int s_t)
{
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
return -1;
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
return -1;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f9e0e9c..94d5999 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = {
.dequeue = noop_dequeue,
.flags = TCQ_F_BUILTIN,
.ops = &noop_qdisc_ops,
- .list = LIST_HEAD_INIT(noop_qdisc.list),
.q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock),
.dev_queue = &noop_netdev_queue,
.busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock),
@@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
sch->padded = (char *) sch - (char *) p;
}
- INIT_LIST_HEAD(&sch->list);
skb_queue_head_init(&sch->q);
spin_lock_init(&sch->busylock);
@@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
return;
#ifdef CONFIG_NET_SCHED
- qdisc_list_del(qdisc);
+ qdisc_hash_del(qdisc);
qdisc_put_stab(rtnl_dereference(qdisc->stab));
#endif
@@ -732,6 +730,10 @@ static void attach_default_qdiscs(struct net_device *dev)
qdisc->ops->attach(qdisc);
}
}
+#ifdef CONFIG_NET_SCHED
+ if (dev->qdisc)
+ qdisc_hash_add(dev->qdisc);
+#endif
}
static void transition_one_qdisc(struct net_device *dev,
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index 56a77b8..3bee15d 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch)
qdisc_destroy(old);
#ifdef CONFIG_NET_SCHED
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
#endif
}
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b8002ce..dbfb3a5 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch)
if (old)
qdisc_destroy(old);
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
}
kfree(priv->qdiscs);
priv->qdiscs = NULL;
--
Jiri Kosina
SUSE Labs
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v5] net: sched: convert qdisc linked list to hashtable
2016-07-29 7:49 ` [PATCH v5] " Jiri Kosina
@ 2016-07-29 20:01 ` kbuild test robot
2016-07-31 0:11 ` kbuild test robot
2016-08-01 10:23 ` [PATCH v6] " Jiri Kosina
2 siblings, 0 replies; 43+ messages in thread
From: kbuild test robot @ 2016-07-29 20:01 UTC (permalink / raw)
To: Jiri Kosina
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
[-- Attachment #1: Type: text/plain, Size: 3331 bytes --]
Hi,
[auto build test WARNING on v4.7-rc7]
[cannot apply to net/master net-next/master ipsec-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160729-155412
config: x86_64-randconfig-s1-07292101 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All warnings (new ones prefixed by >>):
In file included from drivers/net/ethernet/dec/tulip/de4x5.c:480:
drivers/net/ethernet/dec/tulip/de4x5.h:864:1: warning: "HASH_BITS" redefined
In file included from include/linux/netdevice.h:55,
from drivers/net/ethernet/dec/tulip/de4x5.c:459:
>> include/linux/hashtable.h:27:1: warning: this is the location of the previous definition
drivers/net/ethernet/dec/tulip/de4x5.o: warning: objtool: dma_free_coherent()+0x4b: function has unreachable instruction
vim +27 include/linux/hashtable.h
d9b482c8 Sasha Levin 2012-10-30 11 #include <linux/kernel.h>
d9b482c8 Sasha Levin 2012-10-30 12 #include <linux/hash.h>
d9b482c8 Sasha Levin 2012-10-30 13 #include <linux/rculist.h>
d9b482c8 Sasha Levin 2012-10-30 14
d9b482c8 Sasha Levin 2012-10-30 15 #define DEFINE_HASHTABLE(name, bits) \
d9b482c8 Sasha Levin 2012-10-30 16 struct hlist_head name[1 << (bits)] = \
d9b482c8 Sasha Levin 2012-10-30 17 { [0 ... ((1 << (bits)) - 1)] = HLIST_HEAD_INIT }
d9b482c8 Sasha Levin 2012-10-30 18
6180d9de Eric Dumazet 2015-11-18 19 #define DEFINE_READ_MOSTLY_HASHTABLE(name, bits) \
6180d9de Eric Dumazet 2015-11-18 20 struct hlist_head name[1 << (bits)] __read_mostly = \
6180d9de Eric Dumazet 2015-11-18 21 { [0 ... ((1 << (bits)) - 1)] = HLIST_HEAD_INIT }
6180d9de Eric Dumazet 2015-11-18 22
d9b482c8 Sasha Levin 2012-10-30 23 #define DECLARE_HASHTABLE(name, bits) \
d9b482c8 Sasha Levin 2012-10-30 24 struct hlist_head name[1 << (bits)]
d9b482c8 Sasha Levin 2012-10-30 25
d9b482c8 Sasha Levin 2012-10-30 26 #define HASH_SIZE(name) (ARRAY_SIZE(name))
d9b482c8 Sasha Levin 2012-10-30 @27 #define HASH_BITS(name) ilog2(HASH_SIZE(name))
d9b482c8 Sasha Levin 2012-10-30 28
d9b482c8 Sasha Levin 2012-10-30 29 /* Use hash_32 when possible to allow for fast 32bit hashing in 64bit kernels. */
d9b482c8 Sasha Levin 2012-10-30 30 #define hash_min(val, bits) \
d9b482c8 Sasha Levin 2012-10-30 31 (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits))
d9b482c8 Sasha Levin 2012-10-30 32
d9b482c8 Sasha Levin 2012-10-30 33 static inline void __hash_init(struct hlist_head *ht, unsigned int sz)
d9b482c8 Sasha Levin 2012-10-30 34 {
d9b482c8 Sasha Levin 2012-10-30 35 unsigned int i;
:::::: The code at line 27 was first introduced by commit
:::::: d9b482c8ba1973a189f2d4c8175d405b87fbf2d7 hashtable: introduce a small and naive hashtable
:::::: TO: Sasha Levin <levinsasha928@gmail.com>
:::::: CC: Linus Torvalds <torvalds@linux-foundation.org>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 25043 bytes --]
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v5] net: sched: convert qdisc linked list to hashtable
2016-07-29 7:49 ` [PATCH v5] " Jiri Kosina
2016-07-29 20:01 ` kbuild test robot
@ 2016-07-31 0:11 ` kbuild test robot
2016-08-01 10:23 ` [PATCH v6] " Jiri Kosina
2 siblings, 0 replies; 43+ messages in thread
From: kbuild test robot @ 2016-07-31 0:11 UTC (permalink / raw)
To: Jiri Kosina
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
[-- Attachment #1: Type: text/plain, Size: 13361 bytes --]
Hi,
[auto build test ERROR on v4.7-rc7]
[also build test ERROR on next-20160729]
[cannot apply to net/master net-next/master ipsec-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160729-155412
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm
All errors (new ones prefixed by >>):
>> drivers/net/ethernet/ti/davinci_emac.c:736:57: error: macro "hash_add" requires 3 arguments, but only 2 given
static int hash_add(struct emac_priv *priv, u8 *mac_addr)
^
>> drivers/net/ethernet/ti/davinci_emac.c:737:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token
{
^
>> drivers/net/ethernet/ti/davinci_emac.c:778:12: error: conflicting types for 'hash_del'
static int hash_del(struct emac_priv *priv, u8 *mac_addr)
^
In file included from include/linux/netdevice.h:55:0,
from drivers/net/ethernet/ti/davinci_emac.c:44:
include/linux/hashtable.h:104:20: note: previous definition of 'hash_del' was here
static inline void hash_del(struct hlist_node *node)
^
drivers/net/ethernet/ti/davinci_emac.c: In function 'emac_add_mcast':
drivers/net/ethernet/ti/davinci_emac.c:828:35: error: macro "hash_add" requires 3 arguments, but only 2 given
update = hash_add(priv, mac_addr);
^
>> drivers/net/ethernet/ti/davinci_emac.c:828:12: error: 'hash_add' undeclared (first use in this function)
update = hash_add(priv, mac_addr);
^
drivers/net/ethernet/ti/davinci_emac.c:828:12: note: each undeclared identifier is reported only once for each function it appears in
vim +/hash_add +736 drivers/net/ethernet/ti/davinci_emac.c
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 730 * @priv: The DaVinci EMAC private adapter structure
49ce9c2c drivers/net/ethernet/ti/davinci_emac.c Ben Hutchings 2012-07-10 731 * @mac_addr: mac address to delete from hash table
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 732 *
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 733 * Adds mac address to the internal hash table
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 734 *
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 735 */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 @736 static int hash_add(struct emac_priv *priv, u8 *mac_addr)
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 @737 {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 738 struct device *emac_dev = &priv->ndev->dev;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 739 u32 rc = 0;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 740 u32 hash_bit;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 741 u32 hash_value = hash_get(mac_addr);
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 742
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 743 if (hash_value >= EMAC_NUM_MULTICAST_BITS) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 744 if (netif_msg_drv(priv)) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 745 dev_err(emac_dev, "DaVinci EMAC: hash_add(): Invalid "\
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 746 "Hash %08x, should not be greater than %08x",
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 747 hash_value, (EMAC_NUM_MULTICAST_BITS - 1));
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 748 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 749 return -1;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 750 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 751
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 752 /* set the hash bit only if not previously set */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 753 if (priv->multicast_hash_cnt[hash_value] == 0) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 754 rc = 1; /* hash value changed */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 755 if (hash_value < 32) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 756 hash_bit = BIT(hash_value);
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 757 priv->mac_hash1 |= hash_bit;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 758 } else {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 759 hash_bit = BIT((hash_value - 32));
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 760 priv->mac_hash2 |= hash_bit;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 761 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 762 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 763
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 764 /* incr counter for num of mcast addr's mapped to "this" hash bit */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 765 ++priv->multicast_hash_cnt[hash_value];
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 766
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 767 return rc;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 768 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 769
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 770 /**
49ce9c2c drivers/net/ethernet/ti/davinci_emac.c Ben Hutchings 2012-07-10 771 * hash_del - Hash function to delete mac addr from hash table
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 772 * @priv: The DaVinci EMAC private adapter structure
49ce9c2c drivers/net/ethernet/ti/davinci_emac.c Ben Hutchings 2012-07-10 773 * @mac_addr: mac address to delete from hash table
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 774 *
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 775 * Removes mac address from the internal hash table
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 776 *
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 777 */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 @778 static int hash_del(struct emac_priv *priv, u8 *mac_addr)
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 779 {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 780 u32 hash_value;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 781 u32 hash_bit;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 782
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 783 hash_value = hash_get(mac_addr);
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 784 if (priv->multicast_hash_cnt[hash_value] > 0) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 785 /* dec cntr for num of mcast addr's mapped to this hash bit */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 786 --priv->multicast_hash_cnt[hash_value];
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 787 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 788
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 789 /* if counter still > 0, at least one multicast address refers
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 790 * to this hash bit. so return 0 */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 791 if (priv->multicast_hash_cnt[hash_value] > 0)
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 792 return 0;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 793
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 794 if (hash_value < 32) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 795 hash_bit = BIT(hash_value);
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 796 priv->mac_hash1 &= ~hash_bit;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 797 } else {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 798 hash_bit = BIT((hash_value - 32));
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 799 priv->mac_hash2 &= ~hash_bit;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 800 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 801
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 802 /* return 1 to indicate change in mac_hash registers reqd */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 803 return 1;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 804 }
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 805
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 806 /* EMAC multicast operation */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 807 #define EMAC_MULTICAST_ADD 0
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 808 #define EMAC_MULTICAST_DEL 1
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 809 #define EMAC_ALL_MULTI_SET 2
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 810 #define EMAC_ALL_MULTI_CLR 3
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 811
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 812 /**
49ce9c2c drivers/net/ethernet/ti/davinci_emac.c Ben Hutchings 2012-07-10 813 * emac_add_mcast - Set multicast address in the EMAC adapter (Internal)
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 814 * @priv: The DaVinci EMAC private adapter structure
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 815 * @action: multicast operation to perform
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 816 * mac_addr: mac address to set
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 817 *
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 818 * Set multicast addresses in EMAC adapter - internal function
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 819 *
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 820 */
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 821 static void emac_add_mcast(struct emac_priv *priv, u32 action, u8 *mac_addr)
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 822 {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 823 struct device *emac_dev = &priv->ndev->dev;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 824 int update = -1;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 825
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 826 switch (action) {
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 827 case EMAC_MULTICAST_ADD:
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 @828 update = hash_add(priv, mac_addr);
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 829 break;
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 830 case EMAC_MULTICAST_DEL:
a6286ee6 drivers/net/davinci_emac.c Anant Gole 2009-05-18 831 update = hash_del(priv, mac_addr);
:::::: The code at line 736 was first introduced by commit
:::::: a6286ee630f6d95f8466e19d7f1ae38d677028ae net: Add TI DaVinci EMAC driver
:::::: TO: Anant Gole <anantgole@ti.com>
:::::: CC: David S. Miller <davem@davemloft.net>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 57608 bytes --]
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-28 12:53 ` Fengguang Wu
2016-07-28 16:53 ` Cong Wang
@ 2016-07-31 11:21 ` Fengguang Wu
2016-08-01 10:12 ` Jiri Kosina
1 sibling, 1 reply; 43+ messages in thread
From: Fengguang Wu @ 2016-07-31 11:21 UTC (permalink / raw)
To: Jiri Kosina
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
On Thu, Jul 28, 2016 at 08:53:12PM +0800, Fengguang Wu wrote:
>On Thu, Jul 28, 2016 at 01:18:27PM +0200, Jiri Kosina wrote:
>>On Thu, 28 Jul 2016, kbuild test robot wrote:
>>
>>> [auto build test ERROR on v4.7-rc7]
>>> [also build test ERROR on next-20160728]
>>> [cannot apply to net/master net-next/master ipsec-next/master]
>>> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>>>
>>> url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/net-sched-convert-qdisc-linked-list-to-hashtable/20160728-182303
>>> config: i386-randconfig-s0-201630 (attached as .config)
>>> compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
>>> reproduce:
>>> # save the attached .config to linux build tree
>>> make ARCH=i386
>>>
>>> All errors (new ones prefixed by >>):
>>>
>>> net/built-in.o: In function `dev_activate':
>>> >> (.text+0x37ccb): undefined reference to `qdisc_hash_add'
>>
>>Dear 0-day team,
>>
>>could you please check my question regarding this very build failure here?
>>
>> lkml.kernel.org/r/alpine.LNX.2.00.1607141612560.24757@cbobk.fhfr.pm
>
>Sorry I missed that. For your convenience, here is the answer to the
>original email:
>
>>This issue is be there even without my patch (but with qdisc_list_add
>>instead), isn't it?
>
>Yes it looks so, this number happens in a number of places:
>
>dns_query.c:(.text+0x39b84): undefined reference to `qdisc_hash_add'
>include/linux/netdevice.h:1935: undefined reference to `qdisc_hash_add'
>net/core/netevent.c:31: undefined reference to `qdisc_hash_add'
>net/sched/sch_generic.c:789: undefined reference to `qdisc_hash_add'
>sch_generic.c:(.text+0x33487): undefined reference to `qdisc_hash_add'
>switchdev.c:(.text+0x3bf58): undefined reference to `qdisc_hash_add'
>sysctl_net.c:(.text+0x31f70): undefined reference to `qdisc_hash_add'
>(.text.dev_activate+0x228): undefined reference to `qdisc_hash_add'
>(.text+0x37d0b): undefined reference to `qdisc_hash_add'
>wext-proc.c:(.text+0x390a8): undefined reference to `qdisc_hash_add'
Jiri, I just double checked and find no similar errors related to
qdisc_list_add(). The parent commit 95556a8838 ("dccp: avoid deadlock
in dccp_v4_ctl_send_reset") builds fine without error.
Thanks,
Fengguang
>>The problem is that sch_generic.c (where dev_activate() is) is being
>>compiled everytime CONFIG_NET is set, but sch_api.c (where
>>qdisc_list_add() is defined) only when CONFIG_NET_SCHED is set (and there
>>is no stub for !CONFIG_NET_SCHED case).
>
>So it looks like a more general problem than specific to this patch.
>
>Thanks,
>Fengguang
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v4] net: sched: convert qdisc linked list to hashtable
2016-07-31 11:21 ` Fengguang Wu
@ 2016-08-01 10:12 ` Jiri Kosina
0 siblings, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-08-01 10:12 UTC (permalink / raw)
To: Fengguang Wu
Cc: kbuild-all, Eric Dumazet, Jamal Hadi Salim, Phil Sutter, netdev,
linux-kernel, Cong Wang
On Sun, 31 Jul 2016, Fengguang Wu wrote:
> Jiri, I just double checked and find no similar errors related to
> qdisc_list_add(). The parent commit 95556a8838 ("dccp: avoid deadlock
> in dccp_v4_ctl_send_reset") builds fine without error.
You are right, I realized my mistake afterwards. This is fixed in v5 of
the patch.
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6] net: sched: convert qdisc linked list to hashtable
2016-07-29 7:49 ` [PATCH v5] " Jiri Kosina
2016-07-29 20:01 ` kbuild test robot
2016-07-31 0:11 ` kbuild test robot
@ 2016-08-01 10:23 ` Jiri Kosina
2 siblings, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-08-01 10:23 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jamal Hadi Salim, Phil Sutter, netdev, linux-kernel, Cong Wang
From: Jiri Kosina <jkosina@suse.cz>
Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.
The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.
As we're adding hashtable.h include into generic netdevice.h, we have to
make sure HASH_SIZE macro is now non-conflicting with local definitions.
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
v1 -> v2: fix up RCU hastable usage wrt. rtnl
fix compilation of .c files which define their own
HASH_SIZE that now oncflicts with the one from
hashtable.h (newly included via netdevice.h)
v2 -> v3: resolve HASH_SIZE identifier conflicts in a cleaner way
fix up the number of hash bucket bits (4 bits for 16 buckets)
v3 -> v4: put the hastable into struct netdevice only if
CONFIG_NET_SCHED has been enabled
v4 -> v5: fix !CONFIG_NET_SCHED build (reported by Fengguang Wu)
add Cong Wang's reviewed-by
v5 -> v6: build fix for davinci_emac driver that got symbol conflict
due to hashtable.h include, reported by 0day bot
drivers/net/ethernet/ti/davinci_emac.c | 14 +++++++-------
include/linux/netdevice.h | 4 ++++
include/net/pkt_sched.h | 4 ++--
include/net/sch_generic.h | 2 +-
net/core/dev.c | 3 +++
net/ipv6/ip6_gre.c | 12 ++++++------
net/ipv6/ip6_tunnel.c | 10 +++++-----
net/ipv6/ip6_vti.c | 10 +++++-----
net/ipv6/sit.c | 10 +++++-----
net/sched/sch_api.c | 23 +++++++++++++----------
net/sched/sch_generic.c | 8 +++++---
net/sched/sch_mq.c | 2 +-
net/sched/sch_mqprio.c | 2 +-
13 files changed, 58 insertions(+), 46 deletions(-)
diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index f56d66e..91ca2b2 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -726,14 +726,14 @@ static u32 hash_get(u8 *addr)
}
/**
- * hash_add - Hash function to add mac addr from hash table
+ * emac_hash_add - Hash function to add mac addr from hash table
* @priv: The DaVinci EMAC private adapter structure
* @mac_addr: mac address to delete from hash table
*
* Adds mac address to the internal hash table
*
*/
-static int hash_add(struct emac_priv *priv, u8 *mac_addr)
+static int emac_hash_add(struct emac_priv *priv, u8 *mac_addr)
{
struct device *emac_dev = &priv->ndev->dev;
u32 rc = 0;
@@ -742,7 +742,7 @@ static int hash_add(struct emac_priv *priv, u8 *mac_addr)
if (hash_value >= EMAC_NUM_MULTICAST_BITS) {
if (netif_msg_drv(priv)) {
- dev_err(emac_dev, "DaVinci EMAC: hash_add(): Invalid "\
+ dev_err(emac_dev, "DaVinci EMAC: emac_hash_add(): Invalid "\
"Hash %08x, should not be greater than %08x",
hash_value, (EMAC_NUM_MULTICAST_BITS - 1));
}
@@ -768,14 +768,14 @@ static int hash_add(struct emac_priv *priv, u8 *mac_addr)
}
/**
- * hash_del - Hash function to delete mac addr from hash table
+ * emac_hash_del - Hash function to delete mac addr from hash table
* @priv: The DaVinci EMAC private adapter structure
* @mac_addr: mac address to delete from hash table
*
* Removes mac address from the internal hash table
*
*/
-static int hash_del(struct emac_priv *priv, u8 *mac_addr)
+static int emac_hash_del(struct emac_priv *priv, u8 *mac_addr)
{
u32 hash_value;
u32 hash_bit;
@@ -825,10 +825,10 @@ static void emac_add_mcast(struct emac_priv *priv, u32 action, u8 *mac_addr)
switch (action) {
case EMAC_MULTICAST_ADD:
- update = hash_add(priv, mac_addr);
+ update = emac_hash_add(priv, mac_addr);
break;
case EMAC_MULTICAST_DEL:
- update = hash_del(priv, mac_addr);
+ update = emac_hash_del(priv, mac_addr);
break;
case EMAC_ALL_MULTI_SET:
update = 1;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f45929c..17c6499 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -52,6 +52,7 @@
#include <uapi/linux/netdevice.h>
#include <uapi/linux/if_bonding.h>
#include <uapi/linux/pkt_cls.h>
+#include <linux/hashtable.h>
struct netpoll_info;
struct device;
@@ -1778,6 +1779,9 @@ struct net_device {
unsigned int num_tx_queues;
unsigned int real_num_tx_queues;
struct Qdisc *qdisc;
+#ifdef CONFIG_NET_SCHED
+ DECLARE_HASHTABLE (qdisc_hash, 4);
+#endif
unsigned long tx_queue_len;
spinlock_t tx_global_lock;
int watchdog_timeo;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index fea53f4..8ba11b4 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops);
void qdisc_get_default(char *id, size_t len);
int qdisc_set_default(const char *id);
-void qdisc_list_add(struct Qdisc *q);
-void qdisc_list_del(struct Qdisc *q);
+void qdisc_hash_add(struct Qdisc *q);
+void qdisc_hash_del(struct Qdisc *q);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle);
struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle);
struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 62d5531..26f5cb3 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -67,7 +67,7 @@ struct Qdisc {
u32 limit;
const struct Qdisc_ops *ops;
struct qdisc_size_table __rcu *stab;
- struct list_head list;
+ struct hlist_node hash;
u32 handle;
u32 parent;
int (*reshape_fail)(struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index 904ff43..d3736d5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7511,6 +7511,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
INIT_LIST_HEAD(&dev->all_adj_list.lower);
INIT_LIST_HEAD(&dev->ptype_all);
INIT_LIST_HEAD(&dev->ptype_specific);
+#ifdef CONFIG_NET_SCHED
+ hash_init(dev->qdisc_hash);
+#endif
dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
setup(dev);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index fdc9de2..d3697a4 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -61,12 +61,12 @@ static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_GRE_HASH_SIZE_SHIFT 5
+#define IP6_GRE_HASH_SIZE (1 << IP6_GRE_HASH_SIZE_SHIFT)
static int ip6gre_net_id __read_mostly;
struct ip6gre_net {
- struct ip6_tnl __rcu *tunnels[4][HASH_SIZE];
+ struct ip6_tnl __rcu *tunnels[4][IP6_GRE_HASH_SIZE];
struct net_device *fb_tunnel_dev;
};
@@ -96,12 +96,12 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu);
will match fallback tunnel.
*/
-#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(HASH_SIZE - 1))
+#define HASH_KEY(key) (((__force u32)key^((__force u32)key>>4))&(IP6_GRE_HASH_SIZE - 1))
static u32 HASH_ADDR(const struct in6_addr *addr)
{
u32 hash = ipv6_addr_hash(addr);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_GRE_HASH_SIZE_SHIFT);
}
#define tunnels_r_l tunnels[3]
@@ -1089,7 +1089,7 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head)
for (prio = 0; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_GRE_HASH_SIZE; h++) {
struct ip6_tnl *t;
t = rtnl_dereference(ign->tunnels[prio][h]);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 7b0481e..2050217 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -64,8 +64,8 @@ MODULE_LICENSE("GPL");
MODULE_ALIAS_RTNL_LINK("ip6tnl");
MODULE_ALIAS_NETDEV("ip6tnl0");
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_TUNNEL_HASH_SIZE_SHIFT 5
+#define IP6_TUNNEL_HASH_SIZE (1 << IP6_TUNNEL_HASH_SIZE_SHIFT)
static bool log_ecn_error = true;
module_param(log_ecn_error, bool, 0644);
@@ -75,7 +75,7 @@ static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_TUNNEL_HASH_SIZE_SHIFT);
}
static int ip6_tnl_dev_init(struct net_device *dev);
@@ -87,7 +87,7 @@ struct ip6_tnl_net {
/* the IPv6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_TUNNEL_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -2031,7 +2031,7 @@ static void __net_exit ip6_tnl_destroy_tunnels(struct net *net)
if (dev->rtnl_link_ops == &ip6_link_ops)
unregister_netdevice_queue(dev, &list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_TUNNEL_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
/* If dev is in the same netns, it has already
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index d90a11f..cc7e058 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -50,14 +50,14 @@
#include <net/net_namespace.h>
#include <net/netns/generic.h>
-#define HASH_SIZE_SHIFT 5
-#define HASH_SIZE (1 << HASH_SIZE_SHIFT)
+#define IP6_VTI_HASH_SIZE_SHIFT 5
+#define IP6_VTI_HASH_SIZE (1 << IP6_VTI_HASH_SIZE_SHIFT)
static u32 HASH(const struct in6_addr *addr1, const struct in6_addr *addr2)
{
u32 hash = ipv6_addr_hash(addr1) ^ ipv6_addr_hash(addr2);
- return hash_32(hash, HASH_SIZE_SHIFT);
+ return hash_32(hash, IP6_VTI_HASH_SIZE_SHIFT);
}
static int vti6_dev_init(struct net_device *dev);
@@ -69,7 +69,7 @@ struct vti6_net {
/* the vti6 tunnel fallback device */
struct net_device *fb_tnl_dev;
/* lists for storing tunnels in use */
- struct ip6_tnl __rcu *tnls_r_l[HASH_SIZE];
+ struct ip6_tnl __rcu *tnls_r_l[IP6_VTI_HASH_SIZE];
struct ip6_tnl __rcu *tnls_wc[1];
struct ip6_tnl __rcu **tnls[2];
};
@@ -1040,7 +1040,7 @@ static void __net_exit vti6_destroy_tunnels(struct vti6_net *ip6n)
struct ip6_tnl *t;
LIST_HEAD(list);
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_VTI_HASH_SIZE; h++) {
t = rtnl_dereference(ip6n->tnls_r_l[h]);
while (t) {
unregister_netdevice_queue(t->dev, &list);
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 0a5a255..94dd0f0 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -62,7 +62,7 @@
For comments look at net/ipv4/ip_gre.c --ANK
*/
-#define HASH_SIZE 16
+#define IP6_SIT_HASH_SIZE 16
#define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF)
static bool log_ecn_error = true;
@@ -78,9 +78,9 @@ static struct rtnl_link_ops sit_link_ops __read_mostly;
static int sit_net_id __read_mostly;
struct sit_net {
- struct ip_tunnel __rcu *tunnels_r_l[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_r[HASH_SIZE];
- struct ip_tunnel __rcu *tunnels_l[HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r_l[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_r[IP6_SIT_HASH_SIZE];
+ struct ip_tunnel __rcu *tunnels_l[IP6_SIT_HASH_SIZE];
struct ip_tunnel __rcu *tunnels_wc[1];
struct ip_tunnel __rcu **tunnels[4];
@@ -1773,7 +1773,7 @@ static void __net_exit sit_destroy_tunnels(struct net *net,
for (prio = 1; prio < 4; prio++) {
int h;
- for (h = 0; h < HASH_SIZE; h++) {
+ for (h = 0; h < IP6_SIT_HASH_SIZE; h++) {
struct ip_tunnel *t;
t = rtnl_dereference(sitn->tunnels[prio][h]);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ddf047d..c093d32 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -29,6 +29,7 @@
#include <linux/hrtimer.h>
#include <linux/lockdep.h>
#include <linux/slab.h>
+#include <linux/hashtable.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
root->handle == handle)
return root;
- list_for_each_entry_rcu(q, &root->list, list) {
+ hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) {
if (q->handle == handle)
return q;
}
return NULL;
}
-void qdisc_list_add(struct Qdisc *q)
+void qdisc_hash_add(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
struct Qdisc *root = qdisc_dev(q)->qdisc;
WARN_ON_ONCE(root == &noop_qdisc);
ASSERT_RTNL();
- list_add_tail_rcu(&q->list, &root->list);
+ hash_add_rcu(qdisc_dev(q)->qdisc_hash, &q->hash, q->handle);
}
}
-EXPORT_SYMBOL(qdisc_list_add);
+EXPORT_SYMBOL(qdisc_hash_add);
-void qdisc_list_del(struct Qdisc *q)
+void qdisc_hash_del(struct Qdisc *q)
{
if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
ASSERT_RTNL();
- list_del_rcu(&q->list);
+ hash_del_rcu(&q->hash);
}
}
-EXPORT_SYMBOL(qdisc_list_del);
+EXPORT_SYMBOL(qdisc_hash_del);
struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
{
@@ -1004,7 +1005,7 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
goto err_out4;
}
- qdisc_list_add(sch);
+ qdisc_hash_add(sch);
return sch;
}
@@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
{
int ret = 0, q_idx = *q_idx_p;
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
goto done;
q_idx++;
}
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (q_idx < s_q_idx) {
q_idx++;
continue;
@@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
int *t_p, int s_t)
{
struct Qdisc *q;
+ int b;
if (!root)
return 0;
@@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
return -1;
- list_for_each_entry(q, &root->list, list) {
+ hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
return -1;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f9e0e9c..94d5999 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -378,7 +378,6 @@ struct Qdisc noop_qdisc = {
.dequeue = noop_dequeue,
.flags = TCQ_F_BUILTIN,
.ops = &noop_qdisc_ops,
- .list = LIST_HEAD_INIT(noop_qdisc.list),
.q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock),
.dev_queue = &noop_netdev_queue,
.busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock),
@@ -565,7 +564,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
sch->padded = (char *) sch - (char *) p;
}
- INIT_LIST_HEAD(&sch->list);
skb_queue_head_init(&sch->q);
spin_lock_init(&sch->busylock);
@@ -645,7 +643,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
return;
#ifdef CONFIG_NET_SCHED
- qdisc_list_del(qdisc);
+ qdisc_hash_del(qdisc);
qdisc_put_stab(rtnl_dereference(qdisc->stab));
#endif
@@ -732,6 +730,10 @@ static void attach_default_qdiscs(struct net_device *dev)
qdisc->ops->attach(qdisc);
}
}
+#ifdef CONFIG_NET_SCHED
+ if (dev->qdisc)
+ qdisc_hash_add(dev->qdisc);
+#endif
}
static void transition_one_qdisc(struct net_device *dev,
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index 56a77b8..3bee15d 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -88,7 +88,7 @@ static void mq_attach(struct Qdisc *sch)
qdisc_destroy(old);
#ifdef CONFIG_NET_SCHED
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
#endif
}
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b8002ce..dbfb3a5 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -182,7 +182,7 @@ static void mqprio_attach(struct Qdisc *sch)
if (old)
qdisc_destroy(old);
if (ntx < dev->real_num_tx_queues)
- qdisc_list_add(qdisc);
+ qdisc_hash_add(qdisc);
}
kfree(priv->qdiscs);
priv->qdiscs = NULL;
--
Jiri Kosina
SUSE Labs
^ permalink raw reply related [flat|nested] 43+ messages in thread
end of thread, other threads:[~2016-08-01 10:24 UTC | newest]
Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-14 14:44 Deleting child qdisc doesn't reset parent to default qdisc? Jiri Kosina
2016-04-14 14:51 ` Jiri Kosina
2016-04-14 15:01 ` Eric Dumazet
2016-04-14 15:18 ` Phil Sutter
2016-04-14 15:34 ` Jiri Kosina
2016-04-14 15:44 ` Eric Dumazet
2016-04-14 16:22 ` Phil Sutter
2016-04-14 16:40 ` Eric Dumazet
2016-04-14 16:08 ` Jiri Kosina
2016-04-14 17:49 ` Eric Dumazet
2016-04-15 12:42 ` Jamal Hadi Salim
2016-04-15 14:58 ` Eric Dumazet
2016-04-15 17:13 ` David Miller
2016-06-28 15:19 ` Jiri Kosina
2016-06-28 17:28 ` Cong Wang
2016-06-28 17:33 ` Jiri Kosina
2016-06-28 16:49 ` [RFC PATCH] sch_tbf: avoid silent fallback to noop_qdisc (was Re: Deleting child qdisc doesn't reset parent to default qdisc?) Jiri Kosina
2016-07-07 9:04 ` [RFC PATCH] net: sched: convert qdisc linked list to hashtable " Jiri Kosina
2016-07-07 13:51 ` Eric Dumazet
2016-07-07 16:32 ` Jiri Kosina
2016-07-07 16:53 ` Eric Dumazet
2016-07-07 20:36 ` [RFC PATCH v2] net: sched: convert qdisc linked list to hashtable Jiri Kosina
2016-07-08 8:50 ` Eric Dumazet
2016-07-08 9:02 ` Jiri Kosina
2016-07-08 11:07 ` Thomas Graf
2016-07-08 13:52 ` Eric Dumazet
2016-07-11 14:02 ` [RFC PATCH v3] " Jiri Kosina
2016-07-12 17:36 ` Cong Wang
2016-07-13 13:47 ` Jiri Kosina
2016-07-28 9:56 ` [PATCH v4] " Jiri Kosina
2016-07-28 11:10 ` kbuild test robot
2016-07-28 11:18 ` Jiri Kosina
2016-07-28 12:53 ` Fengguang Wu
2016-07-28 16:53 ` Cong Wang
2016-07-31 11:21 ` Fengguang Wu
2016-08-01 10:12 ` Jiri Kosina
2016-07-28 11:22 ` kbuild test robot
2016-07-28 11:52 ` kbuild test robot
2016-07-28 16:52 ` Cong Wang
2016-07-29 7:49 ` [PATCH v5] " Jiri Kosina
2016-07-29 20:01 ` kbuild test robot
2016-07-31 0:11 ` kbuild test robot
2016-08-01 10:23 ` [PATCH v6] " Jiri Kosina
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).