linux-kernel.vger.kernel.org archive mirror
* Balancing near the locking cliff, with some numbers
From: Andi Kleen @ 2005-11-04 19:56 UTC
  To: linux-kernel; +Cc: dipankar


I recently generated some data on how many locks/semaphores/atomics/memory
barriers simple system calls use. The original idea for this is from Dipankar, 
who did similar things for an older kernel.

I thought some folks might find these numbers interesting.

Making the kernel more scalable requires more locks, atomic counts, etc.,
but each lock adds overhead even when it is uncontended (on x86 anything
with a LOCK prefix is costly, depending on the CPU). The locking cliff is
reached when you have added so many locks that (a) nobody can understand or
debug all the dependencies anymore and (b) basic performance gets worse and
worse due to the locking overhead itself.
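
As a rough illustration of that uncontended cost, a minimal userspace
micro-benchmark (an added sketch, not part of the original mail) can compare
a plain increment with an atomic one; gcc's __sync_fetch_and_add emits a
LOCK-prefixed instruction on x86. Build with gcc -O2 (older glibc may need
-lrt for clock_gettime):

#include <stdio.h>
#include <time.h>

#define ITERS 100000000ULL

static unsigned long long now_ns(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (unsigned long long)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

int main(void)
{
        static volatile unsigned long counter;
        unsigned long long t0, t1, t2, i;

        t0 = now_ns();
        for (i = 0; i < ITERS; i++)
                counter++;                         /* plain add */
        t1 = now_ns();
        for (i = 0; i < ITERS; i++)
                __sync_fetch_and_add(&counter, 1); /* LOCK-prefixed add */
        t2 = now_ns();

        printf("plain:  %.2f ns/op\n", (double)(t1 - t0) / ITERS);
        printf("atomic: %.2f ns/op\n", (double)(t2 - t1) / ITERS);
        return 0;
}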

Here are some results:

open:
locks: 106 sems: 32 atomics: 50 rwlocks: 30 irqsaves: 89 barriers: 47
read:
locks: 52 sems: 0 atomics: 16 rwlocks: 12 irqsaves: 69 barriers: 11
write:
locks: 38 sems: 4 atomics: 20 rwlocks: 8 irqsaves: 42 barriers: 12
page fault:
locks: 4 sems: 2 atomics: 2 rwlocks: 0 irqsaves: 9 barriers: 0

open: open a file on ext3
read: read a cache-cold file from disk
write: write a new file
page fault: fault in an empty page

Notes:
I generated the numbers by running an instrumented SMP kernel
on my P-M laptop; it's not from a real multiprocessor machine.
The numbers always count lock and unlock separately.
atomics count any atomic_t operation.
barriers include mb/wmb/rmb.
irqsaves don't have a LOCK prefix, but at least on some CPUs they tend to be
quite slow too.
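
For illustration, a minimal sketch of this kind of instrumentation could
look like the following; the lock_stats[] field in task_struct, the
count_lock_op() helper and the __real_* names are hypothetical, picked for
the example (the actual patches were not posted here):

enum lock_op {
        OP_SPINLOCK, OP_SEM, OP_ATOMIC, OP_RWLOCK, OP_IRQSAVE, OP_BARRIER,
        NR_LOCK_OPS,
};

/* assumed addition to struct task_struct:
 *      unsigned long lock_stats[NR_LOCK_OPS];
 */
static inline void count_lock_op(enum lock_op op)
{
        current->lock_stats[op]++;      /* lock and unlock each count once */
}

/* the instrumented build wraps the normal primitives, e.g.: */
#define spin_lock(l)    do { count_lock_op(OP_SPINLOCK); __real_spin_lock(l); } while (0)
#define spin_unlock(l)  do { count_lock_op(OP_SPINLOCK); __real_spin_unlock(l); } while (0)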

Summary:
I don't think Linux has fallen over the locking cliff yet, but it's getting
dangerously close. Before you add new locking or atomics, always think twice
and only add them when there is a really good reason for it.
Even on MP systems, using fewer locks and atomics can be faster.

-Andi


* Re: Balancing near the locking cliff, with some numbers
From: Al Viro @ 2005-11-05  3:12 UTC
  To: Andi Kleen; +Cc: linux-kernel, dipankar

On Fri, Nov 04, 2005 at 08:56:59PM +0100, Andi Kleen wrote:
> open:
> locks: 106 sems: 32 atomics: 50 rwlocks: 30 irqsaves: 89 barriers: 47

How long was the pathname and how much of that was in cache?


* Re: Balancing near the locking cliff, with some numbers
From: Andi Kleen @ 2005-11-05  4:46 UTC
  To: Al Viro; +Cc: linux-kernel, dipankar

On Saturday 05 November 2005 04:12, Al Viro wrote:
> On Fri, Nov 04, 2005 at 08:56:59PM +0100, Andi Kleen wrote:
> > open:
> > locks: 106 sems: 32 atomics: 50 rwlocks: 30 irqsaves: 89 barriers: 47
>
> How long was the pathname and how much of that was in cache?

It read random files from /usr/X11R6/bin to get something that was out of cache.
The bin directory was likely all in the dcache and the directory pages in the
buffer cache.

-Andi


* Re: Balancing near the locking cliff, with some numbers
From: Andi Kleen @ 2005-11-15  3:26 UTC
  To: linux; +Cc: linux-kernel

Mr Linux,

On Monday 14 November 2005 13:03, linux@horizon.com wrote:

> This is very interesting data, thank you!
> This is using the standard IDE driver?
> And the path names were absolute?

Yes. No.

> 
> What would be really nice is a full trace of the locks acquired so we
> can look for specific problems.  (I can see the OpenSolaris folks puffing
> up to crow about dtrace already.)

> Barring that, a few variants like hot-cache cases, different file systems
> (including tmpfs), and different device drivers would be informative.
> (You could also try the different ext3 journalling modes.)

I have no plans to generate such data right now, but if you want to do it
yourself I can send you my patches as a starting point. It should be easy
enough using relayfs.
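
For reference, a minimal sketch of such relay-based logging might look
roughly like this; it uses the relay API as found in current kernels (the
2005 relayfs interface differed), and log_lock_event() plus the event layout
are made up for the example:

#include <linux/module.h>
#include <linux/relay.h>
#include <linux/debugfs.h>

static struct rchan *chan;

struct lock_event {
        unsigned long ip;       /* call site, e.g. _RET_IP_ */
        unsigned char op;       /* which primitive was used */
};

/* relay wants callbacks that create/remove the per-CPU buffer files */
static struct dentry *lockstats_create_buf_file(const char *filename,
                                                struct dentry *parent,
                                                umode_t mode,
                                                struct rchan_buf *buf,
                                                int *is_global)
{
        return debugfs_create_file(filename, mode, parent, buf,
                                   &relay_file_operations);
}

static int lockstats_remove_buf_file(struct dentry *dentry)
{
        debugfs_remove(dentry);
        return 0;
}

static const struct rchan_callbacks lockstats_cbs = {
        .create_buf_file = lockstats_create_buf_file,
        .remove_buf_file = lockstats_remove_buf_file,
};

/* hypothetical hook, called from the instrumented locking macros */
void log_lock_event(unsigned long ip, unsigned char op)
{
        struct lock_event ev = { .ip = ip, .op = op };

        if (chan)
                relay_write(chan, &ev, sizeof(ev));
}

static int __init lockstats_init(void)
{
        chan = relay_open("lockstats", NULL, 256 * 1024, 4,
                          &lockstats_cbs, NULL);
        return chan ? 0 : -ENOMEM;
}
module_init(lockstats_init);

static void __exit lockstats_exit(void)
{
        relay_close(chan);
}
module_exit(lockstats_exit);
MODULE_LICENSE("GPL");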

> I'm not sure quite how you did this, but assuming you just installed global
> counters via macros 

Per-process counters.

> and ran the test by booting with init=

From a normal shell in a running system.

-Andi


* Re: Balancing near the locking cliff, with some numbers
From: linux @ 2005-11-14 12:03 UTC
  To: ak; +Cc: linux-kernel

> open:
> locks: 106 sems: 32 atomics: 50 rwlocks: 30 irqsaves: 89 barriers: 47
> read:
> locks: 52 sems: 0 atomics: 16 rwlocks: 12 irqsaves: 69 barriers: 11
> write:
> locks: 38 sems: 4 atomics: 20 rwlocks: 8 irqsaves: 42 barriers: 12
> page fault:
> locks: 4 sems: 2 atomics: 2 rwlocks: 0 irqsaves: 9 barriers: 0
> 
> open: open a file on ext3
> read: read a cache cold file from disk
> write: write a new file
> page fault: fault in an empty page

> It read random files from /usr/X11R6/bin to get something that was out of cache.
> The bin directory was likely all in the dcache and the directory pages in the
> buffer cache.

This is very interesting data, thank you!
This is using the standard IDE driver?
And the path names were absolute?

What would be really nice is a full trace of the locks acquired so we
can look for specific problems.  (I can see the OpenSolaris folks puffing
up to crow about dtrace already.)

Barring that, a few variants like hot-cache cases, different file systems
(including tmpfs), and different device drivers would be informative.
(You could also try the different ext3 journalling modes.)


I'm not sure quite how you did this, but assuming you just installed global
counters via macros and ran the test by booting with init=, you could do
a bit better with a hook that would log 1000 filename/line number pairs,
stopping when the log was full, and another to read and clear the log.
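
A minimal sketch of that kind of hook (userspace-flavoured here so it is
self-contained; the names and the unlocked single buffer are simplifications
for illustration) might be:

#include <stdio.h>

#define LOG_SIZE 1000

struct lock_site {
        const char *file;
        int line;
};

static struct lock_site log_buf[LOG_SIZE];
static int log_used;

/* record the current file/line; silently stop once the log is full */
#define LOG_LOCK_SITE() do {                                    \
        if (log_used < LOG_SIZE) {                              \
                log_buf[log_used].file = __FILE__;              \
                log_buf[log_used].line = __LINE__;              \
                log_used++;                                     \
        }                                                       \
} while (0)

/* the second hook: print everything collected so far and clear the log */
static void log_dump_and_clear(void)
{
        int i;

        for (i = 0; i < log_used; i++)
                printf("%s:%d\n", log_buf[i].file, log_buf[i].line);
        log_used = 0;
}

int main(void)
{
        LOG_LOCK_SITE();   /* in the kernel this would sit in the lock wrappers */
        LOG_LOCK_SITE();
        log_dump_and_clear();
        return 0;
}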

