All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] sch_netem: faster rb tree removal
@ 2017-09-23 18:07 Eric Dumazet
  2017-09-25  1:57 ` David Ahern
  2017-09-26  3:33 ` David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2017-09-23 18:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Stephen Hemminger

From: Eric Dumazet <edumazet@google.com>

While running TCP tests involving netem storing millions of packets,
I had the idea to speed up tfifo_reset() and did experiments.

I tried the rbtree_postorder_for_each_entry_safe() method that is
used in skb_rbtree_purge() but discovered it was slower than the
current tfifo_reset() method.

I measured time taken to release skbs with three occupation levels :
10^4, 10^5 and 10^6 skbs with three methods :

1) (current 'naive' method)

	while ((p = rb_first(&q->t_root))) {
		struct sk_buff *skb = netem_rb_to_skb(p);
 
		rb_erase(p, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

2) Use rb_next() instead of rb_first() in the loop :

	p = rb_first(&q->t_root);
	while (p) {
		struct sk_buff *skb = netem_rb_to_skb(p);

		p = rb_next(p);
		rb_erase(&skb->rbnode, &q->t_root);
		rtnl_kfree_skbs(skb, skb);
	}

3) "optimized" method using rbtree_postorder_for_each_entry_safe()

	struct sk_buff *skb, *next;

	rbtree_postorder_for_each_entry_safe(skb, next,
					     &q->t_root, rbnode) {
               rtnl_kfree_skbs(skb, skb);
	}
	q->t_root = RB_ROOT;

Results :

method_1:while (rb_first()) rb_erase() 10000 skbs in 690378 ns (69 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  10000 skbs in 541846 ns (54 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 10000 skbs in 868307 ns (86 ns per skb)

method_1:while (rb_first()) rb_erase() 99996 skbs in 7804021 ns (78 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  100000 skbs in 5942456 ns (59 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 100000 skbs in 11584940 ns (115 ns per skb)

method_1:while (rb_first()) rb_erase() 1000000 skbs in 108577838 ns (108 ns per skb)
method_2:rb_first; while (p) { p = rb_next(p); ...}  1000000 skbs in 82619635 ns (82 ns per skb)
method_3:rbtree_postorder_for_each_entry_safe() 1000000 skbs in 127328743 ns (127 ns per skb)

Method 2) is simply faster, probably because it maintains a smaller
working size set.

Note that this is the method we use in tcp_ofo_queue() already.

I will also change skb_rbtree_purge() in a second patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/sched/sch_netem.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 063a4bdb9ee6f26b01387959e8f6ccd15ec16191..5a4f1008029068372019a965186e7a3c0a18aac3 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -361,12 +361,13 @@ static psched_time_t packet_len_2_sched_time(unsigned int len, struct netem_sche
 static void tfifo_reset(struct Qdisc *sch)
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
-	struct rb_node *p;
+	struct rb_node *p = rb_first(&q->t_root);
 
-	while ((p = rb_first(&q->t_root))) {
+	while (p) {
 		struct sk_buff *skb = netem_rb_to_skb(p);
 
-		rb_erase(p, &q->t_root);
+		p = rb_next(p);
+		rb_erase(&skb->rbnode, &q->t_root);
 		rtnl_kfree_skbs(skb, skb);
 	}
 }

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-23 18:07 [PATCH net-next] sch_netem: faster rb tree removal Eric Dumazet
@ 2017-09-25  1:57 ` David Ahern
  2017-09-25  2:05   ` David Ahern
  2017-09-26  3:33 ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: David Ahern @ 2017-09-25  1:57 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: netdev, Stephen Hemminger

On 9/23/17 12:07 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> While running TCP tests involving netem storing millions of packets,
> I had the idea to speed up tfifo_reset() and did experiments.
> 
> I tried the rbtree_postorder_for_each_entry_safe() method that is
> used in skb_rbtree_purge() but discovered it was slower than the
> current tfifo_reset() method.
> 
> I measured time taken to release skbs with three occupation levels :
> 10^4, 10^5 and 10^6 skbs with three methods :
> 
> 1) (current 'naive' method)
> 
> 	while ((p = rb_first(&q->t_root))) {
> 		struct sk_buff *skb = netem_rb_to_skb(p);
>  
> 		rb_erase(p, &q->t_root);
> 		rtnl_kfree_skbs(skb, skb);
> 	}
> 
> 2) Use rb_next() instead of rb_first() in the loop :
> 
> 	p = rb_first(&q->t_root);
> 	while (p) {
> 		struct sk_buff *skb = netem_rb_to_skb(p);
> 
> 		p = rb_next(p);
> 		rb_erase(&skb->rbnode, &q->t_root);
> 		rtnl_kfree_skbs(skb, skb);
> 	}
> 
> 3) "optimized" method using rbtree_postorder_for_each_entry_safe()
> 
> 	struct sk_buff *skb, *next;
> 
> 	rbtree_postorder_for_each_entry_safe(skb, next,
> 					     &q->t_root, rbnode) {
>                rtnl_kfree_skbs(skb, skb);
> 	}
> 	q->t_root = RB_ROOT;
> 
> Results :
> 
> method_1:while (rb_first()) rb_erase() 10000 skbs in 690378 ns (69 ns per skb)
> method_2:rb_first; while (p) { p = rb_next(p); ...}  10000 skbs in 541846 ns (54 ns per skb)
> method_3:rbtree_postorder_for_each_entry_safe() 10000 skbs in 868307 ns (86 ns per skb)
> 
> method_1:while (rb_first()) rb_erase() 99996 skbs in 7804021 ns (78 ns per skb)
> method_2:rb_first; while (p) { p = rb_next(p); ...}  100000 skbs in 5942456 ns (59 ns per skb)
> method_3:rbtree_postorder_for_each_entry_safe() 100000 skbs in 11584940 ns (115 ns per skb)
> 
> method_1:while (rb_first()) rb_erase() 1000000 skbs in 108577838 ns (108 ns per skb)
> method_2:rb_first; while (p) { p = rb_next(p); ...}  1000000 skbs in 82619635 ns (82 ns per skb)
> method_3:rbtree_postorder_for_each_entry_safe() 1000000 skbs in 127328743 ns (127 ns per skb)
> 
> Method 2) is simply faster, probably because it maintains a smaller
> working size set.
> 
> Note that this is the method we use in tcp_ofo_queue() already.
> 
> I will also change skb_rbtree_purge() in a second patch.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/sched/sch_netem.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 063a4bdb9ee6f26b01387959e8f6ccd15ec16191..5a4f1008029068372019a965186e7a3c0a18aac3 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -361,12 +361,13 @@ static psched_time_t packet_len_2_sched_time(unsigned int len, struct netem_sche
>  static void tfifo_reset(struct Qdisc *sch)
>  {
>  	struct netem_sched_data *q = qdisc_priv(sch);
> -	struct rb_node *p;
> +	struct rb_node *p = rb_first(&q->t_root);
>  
> -	while ((p = rb_first(&q->t_root))) {
> +	while (p) {
>  		struct sk_buff *skb = netem_rb_to_skb(p);
>  
> -		rb_erase(p, &q->t_root);
> +		p = rb_next(p);
> +		rb_erase(&skb->rbnode, &q->t_root);
>  		rtnl_kfree_skbs(skb, skb);
>  	}
>  }
> 
> 

Hi Eric:

I'm guessing the cost is in the rb_first and rb_next computations. Did
you consider something like this:

        struct rb_root *root
        struct rb_node **p = &root->rb_node;

        while (*p != NULL) {
                struct foobar *fb;

                fb = container_of(*p, struct foobar, rb_node);
                // fb processing

                p = &root->rb_node;
        }

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-25  1:57 ` David Ahern
@ 2017-09-25  2:05   ` David Ahern
  2017-09-25  5:27     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: David Ahern @ 2017-09-25  2:05 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: netdev, Stephen Hemminger

On 9/24/17 7:57 PM, David Ahern wrote:
> On 9/23/17 12:07 PM, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google.com>
>>
>> While running TCP tests involving netem storing millions of packets,
>> I had the idea to speed up tfifo_reset() and did experiments.
>>
>> I tried the rbtree_postorder_for_each_entry_safe() method that is
>> used in skb_rbtree_purge() but discovered it was slower than the
>> current tfifo_reset() method.
>>
>> I measured time taken to release skbs with three occupation levels :
>> 10^4, 10^5 and 10^6 skbs with three methods :
>>
>> 1) (current 'naive' method)
>>
>> 	while ((p = rb_first(&q->t_root))) {
>> 		struct sk_buff *skb = netem_rb_to_skb(p);
>>  
>> 		rb_erase(p, &q->t_root);
>> 		rtnl_kfree_skbs(skb, skb);
>> 	}
>>
>> 2) Use rb_next() instead of rb_first() in the loop :
>>
>> 	p = rb_first(&q->t_root);
>> 	while (p) {
>> 		struct sk_buff *skb = netem_rb_to_skb(p);
>>
>> 		p = rb_next(p);
>> 		rb_erase(&skb->rbnode, &q->t_root);
>> 		rtnl_kfree_skbs(skb, skb);
>> 	}
>>
>> 3) "optimized" method using rbtree_postorder_for_each_entry_safe()
>>
>> 	struct sk_buff *skb, *next;
>>
>> 	rbtree_postorder_for_each_entry_safe(skb, next,
>> 					     &q->t_root, rbnode) {
>>                rtnl_kfree_skbs(skb, skb);
>> 	}
>> 	q->t_root = RB_ROOT;
>>
>> Results :
>>
>> method_1:while (rb_first()) rb_erase() 10000 skbs in 690378 ns (69 ns per skb)
>> method_2:rb_first; while (p) { p = rb_next(p); ...}  10000 skbs in 541846 ns (54 ns per skb)
>> method_3:rbtree_postorder_for_each_entry_safe() 10000 skbs in 868307 ns (86 ns per skb)
>>
>> method_1:while (rb_first()) rb_erase() 99996 skbs in 7804021 ns (78 ns per skb)
>> method_2:rb_first; while (p) { p = rb_next(p); ...}  100000 skbs in 5942456 ns (59 ns per skb)
>> method_3:rbtree_postorder_for_each_entry_safe() 100000 skbs in 11584940 ns (115 ns per skb)
>>
>> method_1:while (rb_first()) rb_erase() 1000000 skbs in 108577838 ns (108 ns per skb)
>> method_2:rb_first; while (p) { p = rb_next(p); ...}  1000000 skbs in 82619635 ns (82 ns per skb)
>> method_3:rbtree_postorder_for_each_entry_safe() 1000000 skbs in 127328743 ns (127 ns per skb)
>>
>> Method 2) is simply faster, probably because it maintains a smaller
>> working size set.
>>
>> Note that this is the method we use in tcp_ofo_queue() already.
>>
>> I will also change skb_rbtree_purge() in a second patch.
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>> ---
>>  net/sched/sch_netem.c |    7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
>> index 063a4bdb9ee6f26b01387959e8f6ccd15ec16191..5a4f1008029068372019a965186e7a3c0a18aac3 100644
>> --- a/net/sched/sch_netem.c
>> +++ b/net/sched/sch_netem.c
>> @@ -361,12 +361,13 @@ static psched_time_t packet_len_2_sched_time(unsigned int len, struct netem_sche
>>  static void tfifo_reset(struct Qdisc *sch)
>>  {
>>  	struct netem_sched_data *q = qdisc_priv(sch);
>> -	struct rb_node *p;
>> +	struct rb_node *p = rb_first(&q->t_root);
>>  
>> -	while ((p = rb_first(&q->t_root))) {
>> +	while (p) {
>>  		struct sk_buff *skb = netem_rb_to_skb(p);
>>  
>> -		rb_erase(p, &q->t_root);
>> +		p = rb_next(p);
>> +		rb_erase(&skb->rbnode, &q->t_root);
>>  		rtnl_kfree_skbs(skb, skb);
>>  	}
>>  }
>>
>>
> 
> Hi Eric:
> 
> I'm guessing the cost is in the rb_first and rb_next computations. Did
> you consider something like this:
> 
>         struct rb_root *root
>         struct rb_node **p = &root->rb_node;
> 
>         while (*p != NULL) {
>                 struct foobar *fb;
> 
>                 fb = container_of(*p, struct foobar, rb_node);
>                 // fb processing
		  rb_erase(&nh->rb_node, root);

>                 p = &root->rb_node;
>         }
> 

Oops, dropped the rb_erase in my consolidating the code to this snippet.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-25  2:05   ` David Ahern
@ 2017-09-25  5:27     ` Eric Dumazet
  2017-09-25 16:14       ` David Ahern
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2017-09-25  5:27 UTC (permalink / raw)
  To: David Ahern; +Cc: David Miller, netdev, Stephen Hemminger

On Sun, 2017-09-24 at 20:05 -0600, David Ahern wrote:
> On 9/24/17 7:57 PM, David Ahern wrote:

> > Hi Eric:
> > 
> > I'm guessing the cost is in the rb_first and rb_next computations. Did
> > you consider something like this:
> > 
> >         struct rb_root *root
> >         struct rb_node **p = &root->rb_node;
> > 
> >         while (*p != NULL) {
> >                 struct foobar *fb;
> > 
> >                 fb = container_of(*p, struct foobar, rb_node);
> >                 // fb processing
> 		  rb_erase(&nh->rb_node, root);
> 
> >                 p = &root->rb_node;
> >         }
> > 
> 
> Oops, dropped the rb_erase in my consolidating the code to this snippet.

Hi David

This gives about same numbers than method_1

I tried with 10^7 skbs in the tree :

Your suggestion takes 66ns per skb, while the one I chose takes 37ns per
skb.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-25  5:27     ` Eric Dumazet
@ 2017-09-25 16:14       ` David Ahern
  2017-09-25 16:52         ` Eric Dumazet
  2017-09-25 20:11         ` David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: David Ahern @ 2017-09-25 16:14 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Stephen Hemminger

On 9/24/17 11:27 PM, Eric Dumazet wrote:
> On Sun, 2017-09-24 at 20:05 -0600, David Ahern wrote:
>> On 9/24/17 7:57 PM, David Ahern wrote:
> 
>>> Hi Eric:
>>>
>>> I'm guessing the cost is in the rb_first and rb_next computations. Did
>>> you consider something like this:
>>>
>>>         struct rb_root *root
>>>         struct rb_node **p = &root->rb_node;
>>>
>>>         while (*p != NULL) {
>>>                 struct foobar *fb;
>>>
>>>                 fb = container_of(*p, struct foobar, rb_node);
>>>                 // fb processing
>> 		  rb_erase(&nh->rb_node, root);
>>
>>>                 p = &root->rb_node;
>>>         }
>>>
>>
>> Oops, dropped the rb_erase in my consolidating the code to this snippet.
> 
> Hi David
> 
> This gives about same numbers than method_1
> 
> I tried with 10^7 skbs in the tree :
> 
> Your suggestion takes 66ns per skb, while the one I chose takes 37ns per
> skb.

Thanks for the test.

I made a simple program this morning and ran it under perf. With the
above suggestion the rb_erase has a high cost because it always deletes
the root node. Your method 1 has a high cost on rb_first which is
expected given its definition and it is run on each removal. Both
options increase in time with the number of entries in the tree.

Your method 2 is fairly constant from 10,000 entries to 10M entries
which makes sense: a one time cost at finding rb_first and then always
removing a bottom node so rb_erase is light.

As for the change:

Acked-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-25 16:14       ` David Ahern
@ 2017-09-25 16:52         ` Eric Dumazet
  2017-09-25 20:11         ` David Miller
  1 sibling, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2017-09-25 16:52 UTC (permalink / raw)
  To: David Ahern; +Cc: David Miller, netdev, Stephen Hemminger

On Mon, 2017-09-25 at 10:14 -0600, David Ahern wrote:

> Thanks for the test.
> 
> I made a simple program this morning and ran it under perf. With the
> above suggestion the rb_erase has a high cost because it always deletes
> the root node. Your method 1 has a high cost on rb_first which is
> expected given its definition and it is run on each removal. Both
> options increase in time with the number of entries in the tree.
> 
> Your method 2 is fairly constant from 10,000 entries to 10M entries
> which makes sense: a one time cost at finding rb_first and then always
> removing a bottom node so rb_erase is light.
> 
> As for the change:
> 
> Acked-by: David Ahern <dsahern@gmail.com>

Thanks a lot for double checking !

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-25 16:14       ` David Ahern
  2017-09-25 16:52         ` Eric Dumazet
@ 2017-09-25 20:11         ` David Miller
  2017-09-25 21:52           ` David Ahern
  1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2017-09-25 20:11 UTC (permalink / raw)
  To: dsahern; +Cc: eric.dumazet, netdev, stephen

From: David Ahern <dsahern@gmail.com>
Date: Mon, 25 Sep 2017 10:14:23 -0600

> I made a simple program this morning and ran it under perf.

If possible please submit this for selftests.

Thank you.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-25 20:11         ` David Miller
@ 2017-09-25 21:52           ` David Ahern
  0 siblings, 0 replies; 9+ messages in thread
From: David Ahern @ 2017-09-25 21:52 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, stephen

On 9/25/17 2:11 PM, David Miller wrote:
> From: David Ahern <dsahern@gmail.com>
> Date: Mon, 25 Sep 2017 10:14:23 -0600
> 
>> I made a simple program this morning and ran it under perf.
> 
> If possible please submit this for selftests.
> 

It is more of a microbenchmark of options to flush an rbtree than a
self-test. Further, it relies on the tools/lib/rbtree.c versus
lib/rbtree.c. The tools/lib version was imported by Arnaldo in July 2015
and is a out of date, though it is good enough to show the intent w.r.t.
flushing options.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] sch_netem: faster rb tree removal
  2017-09-23 18:07 [PATCH net-next] sch_netem: faster rb tree removal Eric Dumazet
  2017-09-25  1:57 ` David Ahern
@ 2017-09-26  3:33 ` David Miller
  1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2017-09-26  3:33 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, stephen

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 23 Sep 2017 11:07:28 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> While running TCP tests involving netem storing millions of packets,
> I had the idea to speed up tfifo_reset() and did experiments.
> 
> I tried the rbtree_postorder_for_each_entry_safe() method that is
> used in skb_rbtree_purge() but discovered it was slower than the
> current tfifo_reset() method.
> 
> I measured time taken to release skbs with three occupation levels :
> 10^4, 10^5 and 10^6 skbs with three methods :
 ...
> Results :
 ...
> I will also change skb_rbtree_purge() in a second patch.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-09-26  3:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-23 18:07 [PATCH net-next] sch_netem: faster rb tree removal Eric Dumazet
2017-09-25  1:57 ` David Ahern
2017-09-25  2:05   ` David Ahern
2017-09-25  5:27     ` Eric Dumazet
2017-09-25 16:14       ` David Ahern
2017-09-25 16:52         ` Eric Dumazet
2017-09-25 20:11         ` David Miller
2017-09-25 21:52           ` David Ahern
2017-09-26  3:33 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.