* [PATCH net-next] sch_netem: faster rb tree removal
From: Eric Dumazet @ 2017-09-23 18:07 UTC
To: David Miller; +Cc: netdev, Stephen Hemminger
From: Eric Dumazet <edumazet@google.com>
While running TCP tests involving netem storing millions of packets,
I had the idea to speed up tfifo_reset() and did experiments.
I tried the rbtree_postorder_for_each_entry_safe() method that is
used in skb_rbtree_purge() but discovered it was slower than the
current tfifo_reset() method.
I measured the time taken to release skbs at three occupation levels
(10^4, 10^5 and 10^6 skbs) with three methods:
1) (current 'naive' method)

	while ((p = rb_first(&q->t_root))) {
		struct sk_buff *skb = netem_rb_to_skb(p);

		rb_erase(p, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

2) Use rb_next() instead of rb_first() in the loop:

	p = rb_first(&q->t_root);
	while (p) {
		struct sk_buff *skb = netem_rb_to_skb(p);

		p = rb_next(p);
		rb_erase(&skb->rbnode, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

3) "optimized" method using rbtree_postorder_for_each_entry_safe()

	struct sk_buff *skb, *next;

	rbtree_postorder_for_each_entry_safe(skb, next,
					     &q->t_root, rbnode) {
		rtnl_kfree_skbs(skb, skb);
	}
	q->t_root = RB_ROOT;
Results:
method_1:while (rb_first()) rb_erase() 10000 skbs in 690378 ns (69 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...} 10000 skbs in 541846 ns (54 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 10000 skbs in 868307 ns (86 ns per skb)
method_1:while (rb_first()) rb_erase() 99996 skbs in 7804021 ns (78 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...} 100000 skbs in 5942456 ns (59 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 100000 skbs in 11584940 ns (115 ns per skb)
method_1:while (rb_first()) rb_erase() 1000000 skbs in 108577838 ns (108 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...} 1000000 skbs in 82619635 ns (82 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 1000000 skbs in 127328743 ns (127 ns per skb)
Method 2) is simply faster, probably because it maintains a smaller
working set size.
Note that this is the method we use in tcp_ofo_queue() already.
I will also change skb_rbtree_purge() in a second patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_netem.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 063a4bdb9ee6f26b01387959e8f6ccd15ec16191..5a4f1008029068372019a965186e7a3c0a18aac3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -361,12 +361,13 @@ static psched_time_t packet_len_2_sched_time(unsigned int len, struct netem_sche
 static void tfifo_reset(struct Qdisc *sch)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
-	struct rb_node *p;
+	struct rb_node *p = rb_first(&q->t_root);
 
-	while ((p = rb_first(&q->t_root))) {
+	while (p) {
 		struct sk_buff *skb = netem_rb_to_skb(p);
 
-		rb_erase(p, &q->t_root);
+		p = rb_next(p);
+		rb_erase(&skb->rbnode, &q->t_root);
 		rtnl_kfree_skbs(skb, skb);
 	}
 }
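The skb_rbtree_purge() follow-up mentioned above is not part of this
patch; a minimal sketch of how the same loop shape could apply there,
assuming rb_entry() and kfree_skb() as used in net/core/skbuff.c (the
actual second patch may differ):

	void skb_rbtree_purge(struct rb_root *root)
	{
		struct rb_node *p = rb_first(root);

		while (p) {
			struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode);

			/* advance first, then erase behind the iterator */
			p = rb_next(p);
			rb_erase(&skb->rbnode, root);
			kfree_skb(skb);
		}
	}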
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: David Ahern @ 2017-09-25 1:57 UTC
To: Eric Dumazet, David Miller; +Cc: netdev, Stephen Hemminger
On 9/23/17 12:07 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> While running TCP tests involving netem storing millions of packets,
> I had the idea to speed up tfifo_reset() and did experiments.
[ snip ]
> -	while ((p = rb_first(&q->t_root))) {
> +	while (p) {
>  		struct sk_buff *skb = netem_rb_to_skb(p);
>
> -		rb_erase(p, &q->t_root);
> +		p = rb_next(p);
> +		rb_erase(&skb->rbnode, &q->t_root);
>  		rtnl_kfree_skbs(skb, skb);
>  	}
Hi Eric:
I'm guessing the cost is in the rb_first and rb_next computations. Did
you consider something like this:
	struct rb_root *root;
	struct rb_node **p = &root->rb_node;

	while (*p != NULL) {
		struct foobar *fb;

		fb = container_of(*p, struct foobar, rb_node);
		// fb processing
		p = &root->rb_node;
	}
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: David Ahern @ 2017-09-25 2:05 UTC
To: Eric Dumazet, David Miller; +Cc: netdev, Stephen Hemminger
On 9/24/17 7:57 PM, David Ahern wrote:
> On 9/23/17 12:07 PM, Eric Dumazet wrote:
[ snip ]
> Hi Eric:
>
> I'm guessing the cost is in the rb_first and rb_next computations. Did
> you consider something like this:
>
> 	struct rb_root *root;
> 	struct rb_node **p = &root->rb_node;
>
> 	while (*p != NULL) {
> 		struct foobar *fb;
>
> 		fb = container_of(*p, struct foobar, rb_node);
> 		// fb processing
  		rb_erase(&fb->rb_node, root);
> 		p = &root->rb_node;
> 	}
>

Oops, I dropped the rb_erase when consolidating the code into this snippet.
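With that correction applied, the full suggested loop reads as below; a
sketch assuming the netem types from the patch (q, netem_rb_to_skb()),
which repeatedly erases whatever node is currently at the root:

	struct rb_root *root = &q->t_root;
	struct rb_node **p = &root->rb_node;

	while (*p != NULL) {
		struct sk_buff *skb = netem_rb_to_skb(*p);

		rb_erase(&skb->rbnode, root);
		rtnl_kfree_skbs(skb, skb);
		p = &root->rb_node;	/* re-read the new root */
	}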
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: Eric Dumazet @ 2017-09-25 5:27 UTC
To: David Ahern; +Cc: David Miller, netdev, Stephen Hemminger
On Sun, 2017-09-24 at 20:05 -0600, David Ahern wrote:
> On 9/24/17 7:57 PM, David Ahern wrote:
> > I'm guessing the cost is in the rb_first and rb_next computations. Did
> > you consider something like this:
[ snip ]
> Oops, I dropped the rb_erase when consolidating the code into this snippet.
Hi David

This gives about the same numbers as method_1.

I tried with 10^7 skbs in the tree:

Your suggestion takes 66 ns per skb, while the one I chose takes 37 ns
per skb.

Thanks.
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: David Ahern @ 2017-09-25 16:14 UTC
To: Eric Dumazet; +Cc: David Miller, netdev, Stephen Hemminger
On 9/24/17 11:27 PM, Eric Dumazet wrote:
> On Sun, 2017-09-24 at 20:05 -0600, David Ahern wrote:
[ snip ]
>
> This gives about the same numbers as method_1.
>
> I tried with 10^7 skbs in the tree:
>
> Your suggestion takes 66 ns per skb, while the one I chose takes 37 ns
> per skb.
Thanks for the test.
I made a simple program this morning and ran it under perf. With the
above suggestion, rb_erase() has a high cost because it always deletes
the root node. Your method 1 has a high cost in rb_first(), which is
expected given its definition, and rb_first() is run on each removal.
Both options increase in time with the number of entries in the tree.

Your method 2 is fairly constant from 10,000 entries to 10M entries,
which makes sense: there is a one-time cost to find rb_first(), and
after that the loop always removes a bottom node, so rb_erase() stays
light.
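A minimal sketch of that kind of program, covering the method 2 path;
the node layout, key choice and timing harness here are illustrative,
and it assumes the kernel rbtree sources (e.g. tools/lib/rbtree.c with
tools/include on the include path) are compiled in:

	#include <linux/rbtree.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <time.h>

	struct node {
		struct rb_node rb;
		unsigned long key;
	};

	/* ordered insertion so the flush loops see a real rbtree shape */
	static void insert(struct rb_root *root, struct node *n)
	{
		struct rb_node **p = &root->rb_node, *parent = NULL;

		while (*p) {
			struct node *cur = rb_entry(*p, struct node, rb);

			parent = *p;
			if (n->key < cur->key)
				p = &(*p)->rb_left;
			else
				p = &(*p)->rb_right;
		}
		rb_link_node(&n->rb, parent, p);
		rb_insert_color(&n->rb, root);
	}

	static long long now_ns(void)
	{
		struct timespec ts;

		clock_gettime(CLOCK_MONOTONIC, &ts);
		return ts.tv_sec * 1000000000LL + ts.tv_nsec;
	}

	int main(int argc, char **argv)
	{
		unsigned long i, count = argc > 1 ? strtoul(argv[1], NULL, 0) : 10000;
		struct node *nodes = calloc(count, sizeof(*nodes));
		struct rb_root root = RB_ROOT;
		struct rb_node *p;
		long long t0;

		if (!nodes)
			return 1;
		for (i = 0; i < count; i++) {
			nodes[i].key = random();
			insert(&root, &nodes[i]);
		}

		/* method 2: walk with rb_next(), erase behind the iterator */
		t0 = now_ns();
		p = rb_first(&root);
		while (p) {
			struct node *n = rb_entry(p, struct node, rb);

			p = rb_next(p);
			rb_erase(&n->rb, &root);
		}
		printf("%lu nodes flushed in %lld ns\n", count, now_ns() - t0);
		free(nodes);
		return 0;
	}

The other two strategies can be timed the same way by swapping the flush
loop, then comparing runs at 10^4 through 10^7 nodes under perf record.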
As for the change:
Acked-by: David Ahern <dsahern@gmail.com>
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: Eric Dumazet @ 2017-09-25 16:52 UTC
To: David Ahern; +Cc: David Miller, netdev, Stephen Hemminger
On Mon, 2017-09-25 at 10:14 -0600, David Ahern wrote:
> Thanks for the test.
[ snip ]
> As for the change:
>
> Acked-by: David Ahern <dsahern@gmail.com>
Thanks a lot for double checking!
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: David Miller @ 2017-09-25 20:11 UTC
To: dsahern; +Cc: eric.dumazet, netdev, stephen
From: David Ahern <dsahern@gmail.com>
Date: Mon, 25 Sep 2017 10:14:23 -0600
> I made a simple program this morning and ran it under perf.
If possible please submit this for selftests.
Thank you.
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: David Ahern @ 2017-09-25 21:52 UTC
To: David Miller; +Cc: eric.dumazet, netdev, stephen
On 9/25/17 2:11 PM, David Miller wrote:
> From: David Ahern <dsahern@gmail.com>
> Date: Mon, 25 Sep 2017 10:14:23 -0600
>
>> I made a simple program this morning and ran it under perf.
>
> If possible please submit this for selftests.
>
It is more of a microbenchmark of options for flushing an rbtree than a
self-test. Further, it relies on tools/lib/rbtree.c rather than
lib/rbtree.c. The tools/lib version was imported by Arnaldo in July 2015
and is out of date, though it is good enough to show the intent w.r.t.
the flushing options.
* Re: [PATCH net-next] sch_netem: faster rb tree removal
From: David Miller @ 2017-09-26 3:33 UTC
To: eric.dumazet; +Cc: netdev, stephen
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 23 Sep 2017 11:07:28 -0700
> From: Eric Dumazet <edumazet@google.com>
>
> While running TCP tests involving netem storing millions of packets,
> I had the idea to speed up tfifo_reset() and did experiments.
>
> I tried the rbtree_postorder_for_each_entry_safe() method that is
> used in skb_rbtree_purge() but discovered it was slower than the
> current tfifo_reset() method.
>
> I measured the time taken to release skbs at three occupation levels
> (10^4, 10^5 and 10^6 skbs) with three methods:
...
> Results:
...
> I will also change skb_rbtree_purge() in a second patch.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, thanks Eric.