* Re: 8-CPU (SMP) #s for lockfree rtcache
@ 2002-05-28 19:57 Dipankar Sarma
  0 siblings, 0 replies; 16+ messages in thread
From: Dipankar Sarma @ 2002-05-28 19:57 UTC (permalink / raw)
  To: alan; +Cc: Dave Miller, Paul McKenney, Linus Torvalds, linux-kernel, Andi Kleen


Hi Alan,

In article <1022609447.4123.126.camel@irongate.swansea.linux.org.uk> Alan Cox wrote:
> On Tue, 2002-05-28 at 17:34, Andi Kleen wrote:
>> And gain tons of new atomic_incs and decs everywhere in the process?  
>> I would prefer RCU. 

> Lots of people write drivers, many of them not SMP kernel locking gurus
> who have time to understand RCU and when they can or cannot sleep, and
> what happens if their unload is pre-empted and RCU is in use. The kernel
> core has to provide a clean easy interface. The network code is a superb
> example of this. All the hard thinking is done outside of the driver, at
> least unless you choose to join in that thinking to get the last scraps
> of performance.

FWIW, recent RCU implementations support preemption. synchronize_kernel()
in the rcu_poll_preempt patches uses call_rcu_preempt(), whose callbacks
wait until all CPUs have done a voluntary context switch.
See http://prdownloads.sourceforge.net/lse/rcu_poll_preempt-2.5.14-2.patch
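
As a minimal sketch of how a blocking synchronize_kernel() can be
layered on an asynchronous callback primitive (assuming the early
call_rcu(head, func, arg) style interface; this is illustrative,
not the actual patch code):

struct rcu_sync {
	struct rcu_head head;
	struct completion done;
};

static void rcu_sync_wakeup(void *arg)
{
	struct rcu_sync *s = arg;

	/* Runs once the grace period has elapsed. */
	complete(&s->done);
}

void synchronize_kernel(void)
{
	struct rcu_sync s;

	init_completion(&s.done);
	call_rcu(&s.head, rcu_sync_wakeup, &s);
	/* Block until every CPU has passed a quiescent state. */
	wait_for_completion(&s.done);
}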

Thanks
-- 
Dipankar Sarma  <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 15:49     ` Robert Love
  2002-05-28 16:25       ` Dipankar Sarma
  2002-05-29 17:44       ` kuznet
@ 2002-06-03 12:08       ` Dipankar Sarma
  2 siblings, 0 replies; 16+ messages in thread
From: Dipankar Sarma @ 2002-06-03 12:08 UTC (permalink / raw)
  To: Robert Love
  Cc: David S. Miller, Linus Torvalds, linux-kernel, Paul McKenney,
	Andrea Arcangeli

On Tue, May 28, 2002 at 08:49:58AM -0700, Robert Love wrote:
> I agree the numbers posted are nice, but I remain skeptical like Linus. 
> Sure, the locking overhead is nearly gone in the profiled function where
> RCU is used.  But the overhead has just been _moved_ to wherever the RCU
> work is now done.  Any benchmark needs to include the damage done there,
> too.

Hi Robert,

I did a crude analysis of RCU overhead for rt_rcu-2.5.3-1.patch
and the corresponding RCU infrastructure patch rcu_ltimer-2.5.3-1.patch.
(http://prdownloads.sourceforge.net/lse/rcu_ltimer-2.5.3-1.patch).
The rcu_ltimer patch uses the local timer interrupt handler to check
whether there is any RCU work pending for that CPU. The
smp_local_timer_interrupt() routine is never counted in profiling,
but it runs every 10ms and the RCU overhead there is limited
to checking a few CPU-local things and scheduling the per-CPU
RCU tasklet. The rest of the RCU code is entirely in rcupdate.c
and was measured in kernel profiling.
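
As a rough sketch of that per-tick hook (names are illustrative,
not necessarily those used in the patch):

/* Called from the local timer interrupt path, every 10ms per CPU. */
void rcu_check_callbacks(int cpu)
{
	/* A few cheap CPU-local checks ... */
	if (rcu_pending(cpu))
		/* ... then defer the real work to tasklet context,
		 * where it shows up in the rcupdate.c profile counts. */
		tasklet_schedule(&rcu_tasklet[cpu]);
}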

Here is an analysis of what we can measure -

1. rt_rcu with the neighbor table garbage collection threshold increased
   to prevent frequent overflow (due to random dest addresses).
   (8-1-32-gc31048576gcint60 configuration in earlier published results).


Function                              2.5.3              rt_rcu-2.5.3
--------                              ------             ------------
ip_route_output_key [c0214470]:        4486                2026

call_rcu [c0125f40]:                   N/A                 11
rcu_process_callbacks [c01261d0]:      N/A                 4
rcu_invoke_callbacks [c0125fc0]:       N/A                 4

So with infrequent updates, the RCU overhead is clearly negligible.


2. rt_rcu with frequent neighbor table overflow (due to random dest addresses)
   (8-1-32 configuration in earlier published results).


Function                              2.5.3              rt_rcu-2.5.3
--------                              ------             ------------
ip_route_output_key [c0214470]:       2358                 1646

call_rcu [c0125f40]:                  N/A                  262
rcu_invoke_callbacks [c0125fc0]:      N/A                  57
rcu_process_callbacks [c01261d0]:     N/A                  49
rcu_check_quiescent_state [c0126030]: N/A                  27
rcu_check_callbacks [c01260d0]:       N/A                  24
rcu_reg_batch [c0125ff0]:             N/A                  3

This shows that with very frequent RCU updates, the real gain
made in ip_route_output_key() is smaller but still outweighs the RCU
overhead. I suspect that such frequent updates are not a common
occurrence, but Davem can confirm that.

The bottom line is that the RCU overhead is tolerable where we know that
updates are not going to be frequent. Also, different RCU
algorithms are likely to have different overheads. We will
present analyses of these algorithms as we go along.

Thanks
-- 
Dipankar Sarma  <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 15:49     ` Robert Love
  2002-05-28 16:25       ` Dipankar Sarma
@ 2002-05-29 17:44       ` kuznet
  2002-06-03 12:08       ` Dipankar Sarma
  2 siblings, 0 replies; 16+ messages in thread
From: kuznet @ 2002-05-29 17:44 UTC (permalink / raw)
  To: Robert Love; +Cc: Dave Miller, linux-kernel

Hello!

> I also balk at implicit locking...

RCU is not implicit. At least in the only demo that I have read,
i.e. the routing cache, it is more explicit than spinlocks are. :-)

I also strongly dislike any kind of implicit serialization, and even
non-standard serialization. So the RCU used in route.c is supposed
to be cleaned of assembly-style code and converted to something
more intelligible.

Alexey


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 15:45       ` Andi Kleen
  2002-05-28 17:03         ` Alan Cox
@ 2002-05-29  4:44         ` Rusty Russell
  1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2002-05-29  4:44 UTC (permalink / raw)
  To: Andi Kleen; +Cc: davem, torvalds, linux-kernel, paul.mckenney, andrea

On 28 May 2002 17:45:56 +0200
Andi Kleen <ak@muc.de> wrote:

> "David S. Miller" <davem@redhat.com> writes:
> 
> >    From: Dipankar Sarma <dipankar@in.ibm.com>
> >    Date: Tue, 28 May 2002 18:28:06 +0530
> >    
> >    Well, the last time RCU was discussed, Linus said that he would
> >    like to see someplace where RCU clearly helps.
> > 
> > Alexey and I are in firm agreement that the routing cache
> > clearly benefits from RCU.
> 
> The next obvious beneficiary IMHO is module unloading.

There is a much bigger question here, which is "are modules first class
citizens"?  Doing it properly turns us into a poor-man's microkernel.
We would standardize our registration interfaces (similar to the standard
notifier.h), and have them all do the incs and decs, as in the sketch
below.
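
A hypothetical sketch of what that looks like (the thing_ops
interface is invented purely for illustration; try_inc_mod_count()
and __MOD_DEC_USE_COUNT() are the existing module.h primitives):

struct thing_ops {
	struct module *owner;
	void (*handler)(void *data);
};

static void core_invoke(struct thing_ops *ops, void *data)
{
	/* The core pins the module, not the driver. */
	if (!try_inc_mod_count(ops->owner))
		return;		/* module is already being unloaded */
	ops->handler(data);
	if (ops->owner)
		__MOD_DEC_USE_COUNT(ops->owner);
}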

OTOH, if you treat module removal as a CONFIG_DEBUG_KERNEL thing, life
becomes much much simpler.

I have the code, I'll be serious about it in ~2 months.
Rusty.
-- 
   there are those who do and those who hang on and you don't see too
   many doers quoting their contemporaries.  -- Larry McVoy


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 16:34           ` Andi Kleen
@ 2002-05-28 18:10             ` Alan Cox
  2002-05-28 17:24               ` Andi Kleen
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Cox @ 2002-05-28 18:10 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, torvalds, linux-kernel, paul.mckenney, andrea

On Tue, 2002-05-28 at 17:34, Andi Kleen wrote:
> And gain tons of new atomic_incs and decs everywhere in the process?  
> I would prefer RCU. 

RCU is a great way to make sure people get module unloading *wrong*. It
has to be simple for the driver authors. The odd locked operation on
things like open() of a device file is not a performance issue, not
remotely. 

Lots of people write drivers, many of them not SMP kernel locking gurus
who have time to understand RCU and when they can or cannot sleep, and
what happens if their unload is pre-empted and RCU is in use. The kernel
core has to provide a clean easy interface. The network code is a superb
example of this. All the hard thinking is done outside of the driver, at
least unless you choose to join in that thinking to get the last scraps
of performance.

Alan



* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 18:10             ` Alan Cox
@ 2002-05-28 17:24               ` Andi Kleen
  0 siblings, 0 replies; 16+ messages in thread
From: Andi Kleen @ 2002-05-28 17:24 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andi Kleen, David S. Miller, torvalds, linux-kernel,
	paul.mckenney, andrea

On Tue, May 28, 2002 at 08:10:47PM +0200, Alan Cox wrote:
> On Tue, 2002-05-28 at 17:34, Andi Kleen wrote:
> > And gain tons of new atomic_incs and decs everywhere in the process?  
> > I would prefer RCU. 
> 
> RCU is a great way to make sure people get module unloading *wrong*. It
> has to be simple for the driver authors. The odd locked operation on
> things like open() of a device file is not a performance issue, not
> remotely. 

open() of a device file is not the problem. The problem is all the other
data structures that gain module owners and atomic counters all the time.

> 
> Lots of people write drivers, many of them not SMP kernel locking gurus
> who have time to understand RCU and when they can or cannot sleep, and
> what happens if their unload is pre-empted and RCU is in use. The kernel

The current RCU patch doesn't kick in for preemption, so preemption is
a non-issue.

They have to understand when things can or cannot sleep. Without that,
I think they shouldn't write Linux drivers, because they will get many
things wrong (like spinlocks or even interrupt disabling in 2.5).

With the "simple" module unload RCU variant you just stick a
synchronize_kernel() after the module destructor call (see the sketch
below).
It's also no problem for them to sleep in the destructor; the simple
variant obviously makes no difference here. It also doesn't change any
sleeping rules, or at least they are no different than in 2.0/2.2.

Just this simple variant plugs a lot of the races and would allow dropping
some module counts. It also makes all the nasty "cannot do MOD_*_USE_COUNT
in the driver code itself" issues go away.

The remaining hole is the driver being reentered while the cleanup runs.
The simple RCU unload assumes that open and cleanup are atomic with
respect to each other, which is usually not true. Fixing that properly
likely requires the two-stage cleanup proposed by Rusty/Kaos.

Still, given that the simple variant is not a complete solution, just
making the issue of not being able to do MOD_*_USE_COUNT in driver code
go away would IMHO be a big step forward. In fact, following your initial
point, it would make some code much more obvious to device driver writers.

-Andi


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 16:25       ` Dipankar Sarma
@ 2002-05-28 17:09         ` Dipankar Sarma
  0 siblings, 0 replies; 16+ messages in thread
From: Dipankar Sarma @ 2002-05-28 17:09 UTC (permalink / raw)
  To: Robert Love
  Cc: David S. Miller, Linus Torvalds, linux-kernel, Paul McKenney,
	Andrea Arcangeli

On Tue, May 28, 2002 at 09:55:35PM +0530, Dipankar Sarma wrote:
> Hi Robert,
> 
> On Tue, May 28, 2002 at 08:49:58AM -0700, Robert Love wrote:
> > 
> > > Well, the last time RCU was discussed, Linus said that he would
> > > like to see someplace where RCU clearly helps.
> > 
> > I agree the numbers posted are nice, but I remain skeptical like Linus. 
> > Sure, the locking overhead is nearly gone in the profiled function where
> > RCU is used.  But the overhead has just been _moved_ to wherever the RCU
> > work is now done.  Any benchmark needs to include the damage done there,
> > too.
> 
> Have you looked at the rt_rcu patch? Where do you think there
> is overhead compared to what the route cache already does? In my
> profiles, the RCU routines and the kernel mechanisms they use
> don't show up high. If you have any suggestions, I can
> investigate.

Hi Robert,

While we are at it, I think this is a good point to analyze.
So here is a brief analysis of the rt_rcu patch from the overhead
standpoint -

1. The read side has no overhead; we just don't take the per-bucket lock.
2. For the route cache portion of the code, RCU comes into the picture
   only when dst entries are deleted. Two things keep such deletions
   infrequent: a) expiry of dst entries is checked through an
   infrequent timer, and b) the lease on recently used dst entries is
   extended. So we don't do frequent RCU-based deletion of dst entries.
   Periodically a set of dst entries expires and, instead of
   freeing them immediately, we just put them in RCU queue(s)
   for freeing after the grace period (call_rcu() in rt_free(),
   sketched below).
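
A rough sketch of that deferred-free path (assuming the early
call_rcu(head, func, arg) style interface; field names are
illustrative, not necessarily the patch's):

static void rt_rcu_free(void *arg)
{
	struct rtable *rt = arg;

	/* Runs after the grace period; this is exactly what non-RCU
	 * code would have done immediately under the bucket lock. */
	dst_free(&rt->u.dst);
}

static __inline__ void rt_free(struct rtable *rt)
{
	call_rcu(&rt->u.dst.rcu_head, rt_rcu_free, rt);
}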

Coming to the RCU mechanism -

1. Grace period detection: Different RCU algorithms do it
   differently; however, if there is no RCU pending, *nothing*
   is done regarding this. One RCU implementation uses
   a 10ms timer to check for grace period completion, and another,
   rcu_poll, uses a repeating tasklet to poll for it. The grace period
   detection is based on a per-CPU context switch counter (see the
   sketch after this list). I have not seen significant profile counts
   for the grace period detection scheme, but nevertheless I will put
   up the profile counts for Dave's test at the LSE website.

2. Actual update: RCU processes the batched update callbacks from tasklet
   context. The rt_rcu callbacks don't do anything other than
   call dst_free(), which would have been called by non-RCU
   code under a lock in any case. I am not sure doing this from
   tasklet context adds any overhead, and I suspect that it doesn't.
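
The sketch promised in point 1 - a minimal context-switch-counter
scheme (RCU_qsctr() is a hypothetical accessor for the per-CPU
quiescent-state counter):

static long rcu_qsctr_snap[NR_CPUS];

void rcu_start_batch(void)
{
	int cpu;

	/* Snapshot every CPU's context switch counter at the
	 * start of a batch of callbacks. */
	for (cpu = 0; cpu < smp_num_cpus; cpu++)
		rcu_qsctr_snap[cpu] = RCU_qsctr(cpu);
}

int rcu_batch_done(void)
{
	int cpu;

	/* The grace period is over once every CPU's counter has
	 * moved, i.e. every CPU has context-switched at least once. */
	for (cpu = 0; cpu < smp_num_cpus; cpu++)
		if (RCU_qsctr(cpu) == rcu_qsctr_snap[cpu])
			return 0;
	return 1;
}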

Comments/suggestions?

Thanks
-- 
Dipankar Sarma  <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 15:45       ` Andi Kleen
@ 2002-05-28 17:03         ` Alan Cox
  2002-05-28 16:34           ` Andi Kleen
  2002-05-29  4:44         ` Rusty Russell
  1 sibling, 1 reply; 16+ messages in thread
From: Alan Cox @ 2002-05-28 17:03 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, torvalds, linux-kernel, paul.mckenney, andrea

On Tue, 2002-05-28 at 16:45, Andi Kleen wrote: 
> The next obvious beneficiary IMHO is module unloading. Just putting
> a synchronize_kernel() somewhere at the end of sys_delete_module()
> after the destructor makes module unloading much less nasty than it
> used to be (yes, it doesn't fix all possible module unload races, but
> a large share of them, and it makes the problem much more controllable).

For 2.5 it would be much more productive to make sys_delete_module
memset the entire vmalloced space of the module to an illegal
instruction before returning.
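
A sketch of that poisoning idea (x86-flavoured and purely
illustrative; 0xCC is the int3 breakpoint opcode, so any call into
stale module text traps immediately instead of silently executing
freed memory):

static void poison_module_space(void *base, unsigned long size)
{
	/* Fill the module's vmalloced region with a trapping opcode. */
	memset(base, 0xCC, size);
}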



* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 17:03         ` Alan Cox
@ 2002-05-28 16:34           ` Andi Kleen
  2002-05-28 18:10             ` Alan Cox
  0 siblings, 1 reply; 16+ messages in thread
From: Andi Kleen @ 2002-05-28 16:34 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andi Kleen, David S. Miller, torvalds, linux-kernel,
	paul.mckenney, andrea

On Tue, May 28, 2002 at 07:03:13PM +0200, Alan Cox wrote:
> On Tue, 2002-05-28 at 16:45, Andi Kleen wrote: 
> > The next obvious beneficiary IMHO is module unloading. Just putting
> > a synchronize_kernel() somewhere at the end of sys_delete_module()
> > after the destructor makes module unloading much less nasty than it
> > used to be (yes, it doesn't fix all possible module unload races, but
> > a large share of them, and it makes the problem much more controllable).
> 
> For 2.5 it would be much more productive to make sys_delete_module
> memset the entire vmalloced space of the module to an illegal
> instruction before returning

And gain tons of new atomic_incs and decs everywhere in the process?  
I would prefer RCU. 

-Andi


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 15:49     ` Robert Love
@ 2002-05-28 16:25       ` Dipankar Sarma
  2002-05-28 17:09         ` Dipankar Sarma
  2002-05-29 17:44       ` kuznet
  2002-06-03 12:08       ` Dipankar Sarma
  2 siblings, 1 reply; 16+ messages in thread
From: Dipankar Sarma @ 2002-05-28 16:25 UTC (permalink / raw)
  To: Robert Love
  Cc: David S. Miller, Linus Torvalds, linux-kernel, Paul McKenney,
	Andrea Arcangeli

Hi Robert,

On Tue, May 28, 2002 at 08:49:58AM -0700, Robert Love wrote:
> 
> > Well, the last time RCU was discussed, Linus said that he would
> > like to see someplace where RCU clearly helps.
> 
> I agree the numbers posted are nice, but I remain skeptical like Linus. 
> Sure, the locking overhead is nearly gone in the profiled function where
> RCU is used.  But the overhead has just been _moved_ to wherever the RCU
> work is now done.  Any benchmark needs to include the damage done there,
> too.

Have you looked at the rt_rcu patch? Where do you think there
is overhead compared to what the route cache already does? In my
profiles, the RCU routines and the kernel mechanisms they use
don't show up high. If you have any suggestions, I can
investigate.

> 
> I also balk at implicit locking...
> 

I agree that it is better to keep things simple and that RCU isn't a
replacement for locking. However, the route cache hash table with
refcounting is a relatively simple use of RCU, and since it has
benefits, we shouldn't shy away from using it where it is useful.

Thanks
-- 
Dipankar Sarma  <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 12:58   ` Dipankar Sarma
  2002-05-28 12:40     ` David S. Miller
@ 2002-05-28 15:49     ` Robert Love
  2002-05-28 16:25       ` Dipankar Sarma
                         ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Robert Love @ 2002-05-28 15:49 UTC (permalink / raw)
  To: dipankar
  Cc: David S. Miller, Linus Torvalds, linux-kernel, Paul McKenney,
	Andrea Arcangeli

On Tue, 2002-05-28 at 05:58, Dipankar Sarma wrote:

> > Thanks, I am convinced RCU is the way to go.

I am not. :P

> Well, the last time RCU was discussed, Linus said that he would
> like to see someplace where RCU clearly helps.

I agree the numbers posted are nice, but I remain skeptical like Linus. 
Sure, the locking overhead is nearly gone in the profiled function where
RCU is used.  But the overhead has just been _moved_ to wherever the RCU
work is now done.  Any benchmark needs to include the damage done there,
too.

I also balk at implicit locking...

	Robert Love



* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 12:40     ` David S. Miller
@ 2002-05-28 15:45       ` Andi Kleen
  2002-05-28 17:03         ` Alan Cox
  2002-05-29  4:44         ` Rusty Russell
  0 siblings, 2 replies; 16+ messages in thread
From: Andi Kleen @ 2002-05-28 15:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: torvalds, linux-kernel, paul.mckenney, andrea

"David S. Miller" <davem@redhat.com> writes:

>    From: Dipankar Sarma <dipankar@in.ibm.com>
>    Date: Tue, 28 May 2002 18:28:06 +0530
>    
>    Well, the last time RCU was discussed, Linus said that he would
>    like to see someplace where RCU clearly helps.
> 
> Alexey and I are in firm agreement that the routing cache
> clearly benefits from RCU.

The next obvious beneficiary IMHO is module unloading. Just putting
a synchronize_kernel() somewhere at the end of sys_delete_module()
after the destructor makes module unloading much less nasty than it
used to be (yes, it doesn't fix all possible module unload races, but
a large share of them, and it makes the problem much more controllable).

-Andi


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 11:25 ` David S. Miller
@ 2002-05-28 12:58   ` Dipankar Sarma
  2002-05-28 12:40     ` David S. Miller
  2002-05-28 15:49     ` Robert Love
  0 siblings, 2 replies; 16+ messages in thread
From: Dipankar Sarma @ 2002-05-28 12:58 UTC (permalink / raw)
  To: David S. Miller, Linus Torvalds
  Cc: linux-kernel, Paul McKenney, Andrea Arcangeli

On Tue, May 28, 2002 at 04:25:14AM -0700, David S. Miller wrote:
>    From: Dipankar Sarma <dipankar@in.ibm.com>
>    Date: Tue, 28 May 2002 17:11:04 +0530
>    
>    Here are the results in terms of profile counts in
>    ip_route_output_key() - gc stands for the neighbor table garbage
>    collection adjustment and u2000 stands for the 2ms packet
>    rate delay. All measurements were done on the 2.5.3 kernel.
> 
> Thanks, I am convinced RCU is the way to go.
> 
> Once the generic RCU bits are in the 2.5.x tree, feel free to
> send me your ipv4 routing cache changes.

Well, the last time RCU was discussed, Linus said that he would
like to see someplace where RCU clearly helps.

Linus, would you consider this to be such a case and consider
including the rcu_poll patch from the aa series of kernels? It
has been a part of the aa kernels for quite a while now. The latest
RCU patches support preemption, and AFAICS the new module unloading
and CPU hotplug frameworks can use the RCU synchronize_kernel()
interface.

Or at least, we can perhaps discuss RCU and see if there are
potential issues that have not been dissected so far.

Thanks
-- 
Dipankar Sarma  <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 12:58   ` Dipankar Sarma
@ 2002-05-28 12:40     ` David S. Miller
  2002-05-28 15:45       ` Andi Kleen
  2002-05-28 15:49     ` Robert Love
  1 sibling, 1 reply; 16+ messages in thread
From: David S. Miller @ 2002-05-28 12:40 UTC (permalink / raw)
  To: dipankar; +Cc: torvalds, linux-kernel, paul.mckenney, andrea

   From: Dipankar Sarma <dipankar@in.ibm.com>
   Date: Tue, 28 May 2002 18:28:06 +0530
   
   Well, the last time RCU was discussed, Linus said that he would
   like to see someplace where RCU clearly helps.

Alexey and I are in firm agreement that the routing cache
clearly benefits from RCU.


* 8-CPU (SMP) #s for lockfree rtcache
@ 2002-05-28 11:41 Dipankar Sarma
  2002-05-28 11:25 ` David S. Miller
  0 siblings, 1 reply; 16+ messages in thread
From: Dipankar Sarma @ 2002-05-28 11:41 UTC (permalink / raw)
  To: Dave Miller; +Cc: linux-kernel

Hi Dave,

Here are the SMP numbers that you had asked for. They were measured
on an 8-CPU SMP box (PIII Xeon 700 MHz processors, 1MB L2 cache
and 6GB RAM).

The test was as per your suggestion. rt_rcu is a patch that
uses RCU to do lockfree lookup of the ipv4 route cache.

The basic test sends a fixed number of packets to random
destination addresses, repeating every dest for 5 packets.
We tried two configurations of this test: 8-1-32, where
32 test processes simultaneously use 32 different
random seeds to generate different dest addresses, and 8-4-8,
where 8 sets of 4 processes each, all of whom use the same
random seed and hence generate the same dest addresses, send
packets to those addresses, repeating each dest for 5 packets.

With these basic tests, we measured under two additional
conditions: avoiding forced neighbor table garbage collection
by increasing the threshold and interval (to 31048576 and
60), and slowing the packet rate by introducing a delay of 2ms
between bursts of 5 packets.
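
For concreteness, here is a minimal user-space sketch of that traffic
pattern (illustrative only - not the actual test harness):

#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static void blast(long npackets, unsigned int seed, useconds_t delay)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in dst;
	char payload[64] = {0};
	long i;

	srandom(seed);			/* per-process seed (8-1-32 vs 8-4-8) */
	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(9);	/* discard port */

	for (i = 0; i < npackets; i += 5) {
		int j;

		dst.sin_addr.s_addr = htonl((unsigned int)random());
		for (j = 0; j < 5; j++)	/* repeat each dest 5 times */
			sendto(fd, payload, sizeof(payload), 0,
			       (struct sockaddr *)&dst, sizeof(dst));
		if (delay)
			usleep(delay);	/* 2000us in the u2000 runs */
	}
	close(fd);
}

int main(void)
{
	blast(1000000, (unsigned int)getpid(), 2000);
	return 0;
}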

Here are the results in terms of profile counts in
ip_route_output_key() - gc stands for the neighbor table garbage
collection adjustment and u2000 stands for the 2ms packet
rate delay. All measurements were done on the 2.5.3 kernel.


Test                            base    rtrcu    speedup
----                            ----    -----    -------
8-1-32                          2358    1655     29.8%
8-1-32-gc                       4486    2176     51.4%
8-1-32-u2000                    2990    1942     35.0%
8-1-32-u2000-gc                 4047    2029     49.8%

8-4-8                           2870    1965     31.5%
8-4-8-gc                        3389    2083     38.5%
8-4-8-u2000                     3459    2373     31.3%
8-4-8-u2000-gc                  4686    2603     44.4%

Thanks
-- 
Dipankar Sarma  <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


* Re: 8-CPU (SMP) #s for lockfree rtcache
  2002-05-28 11:41 Dipankar Sarma
@ 2002-05-28 11:25 ` David S. Miller
  2002-05-28 12:58   ` Dipankar Sarma
  0 siblings, 1 reply; 16+ messages in thread
From: David S. Miller @ 2002-05-28 11:25 UTC (permalink / raw)
  To: dipankar; +Cc: linux-kernel

   From: Dipankar Sarma <dipankar@in.ibm.com>
   Date: Tue, 28 May 2002 17:11:04 +0530
   
   Here are the results in terms of profile counts in
   ip_route_output_key() - gc stands for the neighbor table garbage
   collection adjustment and u2000 stands for the 2ms packet
   rate delay. All measurements were done on the 2.5.3 kernel.

Thanks, I am convinced RCU is the way to go.

Once the generic RCU bits are in the 2.5.x tree, feel free to
send me your ipv4 routing cache changes.

