RE: [PATCH 1/19] MUTEX: Introduce simple mutex implementation

* RE: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
@ 2005-12-15 17:45 ` Luck, Tony
  0 siblings, 0 replies; 239+ messages in thread
From: Luck, Tony @ 2005-12-15 17:45 UTC (permalink / raw)
  To: dhowells, Andrew Morton
  Cc: Mark Lord, tglx, alan, pj, mingo, hch, torvalds, arjan, matthew,
	linux-kernel, linux-arch

     Okay, spinlocks are null ops when CONFIG_SMP and CONFIG_DEBUG_SPINLOCK
     are both disabled, but you still have to disable interrupts, and that
     slows things down, sometimes quite appreciably. It is, for example,
     something I really want to avoid doing on FRV as it takes a *lot* of
     cycles.

There was a USENIX paper a couple of decades ago that described how
to do a fast s/w disable of interrupts on machines where really disabling
interrupts was expensive.  The rough gist was that the spl[1-7]()
functions would just set a flag in memory to hold the desired interrupt
mask.  If an interrupt actually occurred when it was s/w blocked, the
handler would set a pending flag, and just rfi with interrupts disabled.
Then the splx() code checked to see whether there was a pending interrupt
and dealt with it if there was.
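
As a rough illustration of the scheme described above, here is a minimal
user-space model in C. Every name in it (soft_level, pending, irq_entry,
run_handler, handled) is hypothetical, a single spl(level) stands in for the
spl[1-7]() family, and it ignores the care a real implementation needs around
an interrupt arriving while the flags are being updated.

```c
#include <assert.h>

/* Hypothetical user-space model of s/w interrupt masking; not kernel code. */
static int soft_level;      /* desired mask: IRQ levels <= this are blocked */
static unsigned pending;    /* bit n set: IRQ n fired while s/w blocked */
static int handled[8];      /* how often each IRQ's real handler has run */

static void run_handler(int irq)
{
    handled[irq]++;
}

/* Entry point the "hardware" calls when IRQ level irq (1..7) fires. */
static void irq_entry(int irq)
{
    assert(irq >= 1 && irq <= 7);
    if (irq <= soft_level) {
        pending |= 1u << irq;   /* blocked in software: only note it */
        return;                 /* real h/w would rfi with interrupts disabled */
    }
    run_handler(irq);
}

/* Set the mask level. This is just a memory write; returns the old level. */
static int spl(int level)
{
    int old = soft_level;

    soft_level = level;
    return old;
}

/* Restore the old level and deal with anything that is now deliverable. */
static void splx(int old)
{
    soft_level = old;
    for (int irq = 7; irq > soft_level; irq--) {
        if (pending & (1u << irq)) {
            pending &= ~(1u << irq);
            run_handler(irq);
        }
    }
}
```

The common path (spl() followed by splx() with nothing pending) touches only
memory; the expensive hardware mask is needed only when an interrupt really
does arrive while blocked.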

-Tony

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 17:45 ` Luck, Tony
  (?)
@ 2005-12-15 18:00 ` David Howells
  -1 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-15 18:00 UTC (permalink / raw)
  To: Luck, Tony
  Cc: dhowells, Andrew Morton, Mark Lord, tglx, alan, pj, mingo, hch,
	torvalds, arjan, matthew, linux-kernel, linux-arch

Luck, Tony <tony.luck@intel.com> wrote:

> There was a USENIX paper a couple of decades ago that described how
> to do a fast s/w disable of interrupts on machines where really disabling
> interrupts was expensive.  The rough gist was that the spl[1-7]()
> functions would just set a flag in memory to hold the desired interrupt
> mask.

Cute. The slow bit on FRV is any time you access the PSR register (read or
write). It seems to be something on the order of 60 clock cycles a pop - in
which time the CPU could have executed 120 instructions under ideal
circumstances.

I do something like this to implement "atomic" operations, playing on the
FRV's ability to pack two instructions atomically together and to have
conditionally executed instructions:

	Documentation/fujitsu/frv/atomic-ops.txt.

Trading off against the memory speed might just do it - though you have to do
a write and a read (the latter of which should hopefully be cached). I could
always steal another register (I have 31-ish to play with, plus a bunch of
single-bit condition values).

It'd make the exception prologue even more "interesting" though...:-)

Hmmm...

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* RE: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 17:45 ` Luck, Tony
  (?)
  (?)
@ 2005-12-15 18:48 ` James Bottomley
  -1 siblings, 0 replies; 239+ messages in thread
From: James Bottomley @ 2005-12-15 18:48 UTC (permalink / raw)
  To: Luck, Tony
  Cc: dhowells, Andrew Morton, Mark Lord, tglx, alan, pj, mingo, hch,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Thu, 2005-12-15 at 09:45 -0800, Luck, Tony wrote:
> There was a USENIX paper a couple of decades ago that described how
> to do a fast s/w disable of interrupts on machines where really disabling
> interrupts was expensive.  The rough gist was that the spl[1-7]()
> functions would just set a flag in memory to hold the desired interrupt
> mask.  If an interrupt actually occurred when it was s/w blocked, the
> handler would set a pending flag, and just rfi with interrupts disabled.
> Then the splx() code checked to see whether there was a pending interrupt
> and dealt with it if there was.

Would you believe that that paper was written about the NCR Voyager
architecture (the VIC is very expensive for interrupt disables) and that
the current Linux Voyager subarchitecture still makes partial use of the
scheme?

James



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 17:45 ` Luck, Tony
                   ` (2 preceding siblings ...)
  (?)
@ 2005-12-15 20:38 ` Jeff Dike
  2005-12-15 23:45   ` Stephen Rothwell
  -1 siblings, 1 reply; 239+ messages in thread
From: Jeff Dike @ 2005-12-15 20:38 UTC (permalink / raw)
  To: Luck, Tony
  Cc: dhowells, Andrew Morton, Mark Lord, tglx, alan, pj, mingo, hch,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Thu, Dec 15, 2005 at 09:45:10AM -0800, Luck, Tony wrote:
> There was a USENIX paper a couple of decades ago that described how
> to do a fast s/w disable of interrupts on machines where really disabling
> interrupts was expensive.  The rough gist was that the spl[1-7]()
> functions would just set a flag in memory to hold the desired interrupt
> mask.  If an interrupt actually occurred when it was s/w blocked, the
> handler would set a pending flag, and just rfi with interrupts disabled.
> Then the splx() code checked to see whether there was a pending interrupt
> and dealt with it if there was.

... and this is currently implemented (but not yet merged to mainline) in
UML.

				Jeff

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 20:38 ` Jeff Dike
@ 2005-12-15 23:45   ` Stephen Rothwell
  0 siblings, 0 replies; 239+ messages in thread
From: Stephen Rothwell @ 2005-12-15 23:45 UTC (permalink / raw)
  To: Jeff Dike
  Cc: tony.luck, dhowells, akpm, lkml, tglx, alan, pj, mingo, hch,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Thu, 15 Dec 2005 15:38:18 -0500 Jeff Dike <jdike@addtoit.com> wrote:
>
> On Thu, Dec 15, 2005 at 09:45:10AM -0800, Luck, Tony wrote:
> > There was a USENIX paper a couple of decades ago that described how
> > to do a fast s/w disable of interrupts on machines where really disabling
> > interrupts was expensive.  The rough gist was that the spl[1-7]()
> > functions would just set a flag in memory to hold the desired interrupt
> > mask.  If an interrupt actually occurred when it was s/w blocked, the
> > handler would set a pending flag, and just rfi with interrupts disabled.
> > Then the splx() code checked to see whether there was a pending interrupt
> > and dealt with it if there was.
> 
> ... and this is currently implemented (but not yet merged to mainline) in
> UML.

And, of course, this is the way the PowerPC iSeries has always worked because
we are not allowed to disable hardware interrupts for long periods of time or
the hypervisor will consider that our logical partition is dead.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
@ 2005-12-16 12:49 linux
  2005-12-16 15:24 ` David Howells
  0 siblings, 1 reply; 239+ messages in thread
From: linux @ 2005-12-16 12:49 UTC (permalink / raw)
  To: dhowells; +Cc: linux, linux-kernel

> Now my point about using LL/SC is that:
> 
> 	1,C,A	cmpxchg(0,1) [failed]
> 	1,C,A	cmpxchg(1,3) [success]
> 	3,C,A	...
> 
> Can be turned into:
> 
> 	1,C,A	x = LL()
> 	1,C,A	x |= 2;
> 	1,C,A	SC(3) [success]
> 	3,C,A	...

... which can be turned back into

 	1,C,A	x = load()
 	1,C,A	x' = x | 2;
 	1,C,A	cmpxchg(x,x') [success]
 	3,C,A	...

which will fail and retry in exactly the same contention cases as the
LL/SC.  The only thing that LL gives you that's nice is a hint that
an SC is due very soon and so resisting a cache eviction for a couple
of cycles might be a good idea.

The reason that we tend to do the former is optimism that the lock
won't be held.  If that's a bad assumption, make it more pessimistic.

LL/SC can detect double changes during the critical section, but it's
very similar in expressive power to load + CMPXCHG.
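
For concreteness, the load + CMPXCHG form might look like the following sketch
using C11 atomics. The function name set_flag_bit and the choice of bit are
made up for illustration; the thread does not give actual code.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Set bit 1 (value 2) in *word and return the value seen just before the
 * update. Illustrative only: the name and the meaning of the bit are
 * assumptions.
 */
static unsigned set_flag_bit(atomic_uint *word)
{
    unsigned x;

    assert(word);
    x = atomic_load(word);                      /* x = load() */
    /*
     * On failure the compare-exchange writes the current value back into x,
     * so the loop retries only when the word changed under us.
     */
    while (!atomic_compare_exchange_weak(word, &x, x | 2u))
        ;                                       /* cmpxchg(x, x') */
    return x;
}
```

On an architecture whose native primitive is LL/SC, a compiler would normally
lower this loop to an LL/SC sequence, so the source-level shape is the same
either way.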

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 12:49 linux
@ 2005-12-16 15:24 ` David Howells
  2005-12-16 18:03   ` linux
  0 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-16 15:24 UTC (permalink / raw)
  To: linux; +Cc: dhowells, linux-kernel

linux@horizon.com wrote:

> > Can be turned into:
> > 
> > 	1,C,A	x = LL()
> > 	1,C,A	x |= 2;
> > 	1,C,A	SC(3) [success]
> > 	3,C,A	...
> 
> ... which can be turned back into
> 
>  	1,C,A	x = load()
>  	1,C,A	x' = x | 2;
>  	1,C,A	cmpxchg(x,x') [success]
>  	3,C,A	...

Which would be totally pointless.

If you have LL/SC, then the odds are you _don't_ have CMPXCHG, and that
CMPXCHG is implemented using LL/SC, so what you end up with is:


 	1,C,A	x = load()
 	1,C,A	x' = x | 2;
	1,C,A	y = LL()
	1,C,A	if (y == x)
	1,X,A		SC(x');
 	3,C,A	...

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 15:24 ` David Howells
@ 2005-12-16 18:03   ` linux
  0 siblings, 0 replies; 239+ messages in thread
From: linux @ 2005-12-16 18:03 UTC (permalink / raw)
  To: dhowells; +Cc: linux, linux-kernel

> Which would be totally pointless.
> 
> If you have LL/SC, then the odds are you _don't_ have CMPXCHG, and that
> CMPXCHG is implemented using LL/SC, so what you end up with is:

Ah, you're not quite understanding what I wrote, but I see the confusion.

I took "turned into" to mean "ported to an architecture with the
other primitive", and intended it that way when I said "turned back".
That's obviously pointless if you're emulating one with the other.

The point I was making is that, for any LL/SC sequence, there is an
exactly analogous LD/CMPXCHG version, so you never have to have more
CMPXCHGs than SCs.

This was an attempt to disprove your claim that LL/SC was better by more
than a very small factor.

It is possible to optimize for the contention-free case and do away
with the initial load, at the expense of an additional CMPXCHG in the
failure case.
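
A minimal sketch of that trade-off in C11 atomics, assuming a lock word where
0 means unlocked, 1 means locked and bit 1 flags contention (matching the
0/1/3 states quoted earlier). The names and the cas_count parameter are
hypothetical; the counter exists only to make the number of CMPXCHGs visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { UNLOCKED = 0, LOCKED = 1, CONTENDED = 2 };   /* assumed encoding */

/*
 * Try to take the lock without an initial load. Returns true if the lock
 * was acquired; otherwise makes sure the contended bit is set and returns
 * false. *cas_count is incremented once per compare-exchange issued.
 */
static bool lock_optimistic(atomic_uint *word, int *cas_count)
{
    unsigned seen = UNLOCKED;   /* optimistic guess instead of a load */

    assert(word && cas_count);
    for (;;) {
        unsigned want = (seen == UNLOCKED) ? LOCKED : (seen | CONTENDED);

        if (seen == want)
            return false;       /* already flagged as contended */
        ++*cas_count;
        /* On failure, seen is refreshed with the current lock word. */
        if (atomic_compare_exchange_strong(word, &seen, want))
            return seen == UNLOCKED;
    }
}
```

The uncontended path costs one CMPXCHG and no load. When the lock is held
but not yet flagged, the first CMPXCHG fails and serves as the load, and a
second one sets the contended bit: that is the extra CMPXCHG in the failure
case.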

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
@ 2005-12-15 13:58 linux
  2005-12-15 16:15 ` Linus Torvalds
  0 siblings, 1 reply; 239+ messages in thread
From: linux @ 2005-12-15 13:58 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel

Our Fearless Leader, in a fit of madness, intoned:
> A real semaphore is counting. 
> 
> Dammit, unless the pure mutex has a _huge_ performance advantage on major 
> architectures, we're not changing it. There's absolutely zero point. A 
> counting semaphore is a perfectly fine mutex - the fact that it can _also_ 
> be used to allow more than 1 user into a critical region and generally do 
> other things is totally immaterial.

You're being thick today.  Pay attention to the arguments.

A counting semaphore is NOT a perfectly fine mutex, and it SHOULD be changed.

People are indeed unhappy with the naming, and whether patching 95%
of the callers of up() and down() is a good idea is a valid and active
subject of debate.  (For an out-of-tree -rt patch, it was certainly
an extremely practical solution.)

But regardless of the eventual naming convention, mutexes are a good idea.
A mutex is *safer* than a counting semaphore.  That's the main benefit.
Indeed, unless there's a performance advantage to a counting semaphore,
you should use a mutex!

It documents in the source what you're doing.  Using a counting semaphore
for a mutex is as silly as using a float for a loop index.  Even if
there isn't much speed penalty on modern processors.

Or perhaps I should compare it to using void * everywhere.  That's a
perfectly fine pointer; why type-check it?

A separate mutex type allows extra error-checking.  You can keep track
of the current holder (since there can be only one) and check that the
same person released it and didn't try to double-acquire it.

You can do priority inheritance, which is the main motivation for doing
this work in the -rt patches.

This isn't about speed, it's about bug-free code.  And having a mutex
abstraction distinct from a general counting semaphore is damn useful
error-checking, even if it is simply a thin wrapper over a counting
semaphore.  The only reason the code is full of counting semaphores
right now is that that's all people had.

Better to give them the right tool.
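
To make the "thin wrapper" idea concrete, here is a single-threaded sketch in
C. All names (dbg_mutex, sem_try_down, the error codes) are hypothetical,
tasks are modelled as positive integers, and real code would need atomic
operations and would sleep rather than return MUTEX_BUSY.

```c
#include <assert.h>

/* Toy counting semaphore: hypothetical, single-threaded, never sleeps. */
struct semaphore {
    int count;
};

static int sem_try_down(struct semaphore *s)
{
    if (s->count <= 0)
        return 0;
    s->count--;
    return 1;
}

static void sem_up(struct semaphore *s)
{
    s->count++;
}

enum mutex_err {
    MUTEX_OK = 0,
    MUTEX_BUSY,             /* held by another task */
    MUTEX_DOUBLE_ACQUIRE,   /* caller already holds it */
    MUTEX_NOT_OWNER         /* release by a task that does not hold it */
};

/* A mutex as a thin wrapper: a semaphore fixed at 1, plus an owner. */
struct dbg_mutex {
    struct semaphore sem;
    int owner;              /* task id of the holder, 0 when free */
};

static void dbg_mutex_init(struct dbg_mutex *m)
{
    m->sem.count = 1;
    m->owner = 0;
}

static enum mutex_err dbg_mutex_trylock(struct dbg_mutex *m, int task)
{
    assert(task > 0);
    if (m->owner == task)
        return MUTEX_DOUBLE_ACQUIRE;
    if (!sem_try_down(&m->sem))
        return MUTEX_BUSY;
    m->owner = task;
    return MUTEX_OK;
}

static enum mutex_err dbg_mutex_unlock(struct dbg_mutex *m, int task)
{
    if (m->owner != task)
        return MUTEX_NOT_OWNER;     /* also catches a double release */
    m->owner = 0;
    sem_up(&m->sem);
    return MUTEX_OK;
}
```

None of these checks can be expressed on the bare semaphore, because up() and
down() carry no notion of who is calling them.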

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 13:58 linux
@ 2005-12-15 16:15 ` Linus Torvalds
  2005-12-15 16:52   ` Erik Mouw
                     ` (3 more replies)
  0 siblings, 4 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-15 16:15 UTC (permalink / raw)
  To: linux; +Cc: linux-kernel

On Thu, 15 Dec 2005, linux@horizon.com wrote:
> 
> A counting semaphore is NOT a perfectly fine mutex, and it SHOULD be changed.

Don't be silly.

First off, the data structure is called a "semaphore", and always has 
been. It's _never_ been called a "mutex" in the first place, and the 
operations have been called "down()" and "up()", because I thought calling 
them P() and V() was just too damn traditional and confusing (I don't 
speak Dutch, and even if I did, I think shortening names to that degree is 
just evil).

And dammit, a counting semaphore (and usually you don't even say the 
"counting" part, since counting is really always there) is just about 
_the_ classical mutual exclusion mechanism. If somebody doesn't know that, 
he has absolutely _no_ place talking about mutexes etc.

And a semaphore _is_ a mutex. Anybody who disputes that is just being a 
total troll. Even classically, the case where the semaphore was 
initialized to 1 is very very traditional, and is very much part of the 
whole point of a semaphore. Sometimes they are called "binary semaphores", 
but dammit, they are just the same thing.

A patch that
 - creates a non-counting mutex
 - .. that is SLOWER than the current counting one
 - .. and keeps the old "semaphore" and "up/down" naming

is simply INCREDIBLY BROKEN. It has absolutely _zero_ redeeming features. 
I can't understand how there are a hundred emails in my mailbox even 
discussing it. 

And I can't understand how somebody has the balls to even say that a 
semaphore isn't a mutex. That's like saying that an object of type "long" 
isn't an integer, because only "int" objects are integers. That's just 
INSANE.

> People are indeed unhappy with the naming, and whether patching 95%
> of the callers of up() and down() is a good idea is a valid and active
> subject of debate.  (For an out-of-tree -rt patch, it was certainly
> an extremely practical solution.)

Whatever people you claim are unhappy with the naming are
 - obviously totally unaware of very basic synchronization primitives
   used in concurrent programming
 - likely haven't spent any time at all looking at the kernel source code.
 - haven't _ever_ complained that I've seen before this totally made-up 
   discussion.

In other words, you are
 (a) totally making up the claim that people are really unhappy
 (b) jerking people around who _do_ know about semaphores and _have_ 
     worked with the kernel locking primitives and understand them well

So tell me, what do you think about your own arguments in that light?

> But regardless of the eventual naming convention, mutexes are a good idea.
> A mutex is *safer* than a counting semaphore.  That's the main benefit.
> Indeed, unless there's a performance advantage to a counting semaphore,
> you should use a mutex!

Hey, feel free to introduce a mutex, but DAMMIT, just call it that, 
instead of switching people over. 

And even then, it should damn well also:
 - really _be_ faster. On platforms that matter. 
 - have enough real other advantages that it's worth introducing another 
   abstraction, and more conceptual complexity. At least the RT patches 
   had a reason for them.

And besides, all your "safer" arguments are pretty damn pointless in the 
face of the fact that we have basically had zero bugs with the semaphores. 
This is not where the bugs happen. Yeah, yeah, double releases can happen, 
but it sure as hell isn't on my radar of things I remember people doing.

So when you say "This isn't about speed, this is about bug-free code", 
you're just making that up. It's doubly silly when your "safer" 
implementation uses totally illogical names. THAT is what creates bugs.

So go away.

Come back if you have pondered, and accepted reality, and perhaps have an 
acceptable patch that introduces a separate data structure. 

And no, we're not switching users over whole-sale. First you introduce the 
new concept. Only THEN can you switch over INDIVIDUAL LOCKS with 
reasons for why it's worth it.

And hell yes, performance does matter.

			Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:15 ` Linus Torvalds
@ 2005-12-15 16:52   ` Erik Mouw
  2005-12-15 17:23     ` Dick Streefland
  2005-12-16 12:17     ` Erik Mouw
  2005-12-15 19:02   ` Nikita Danilov
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 239+ messages in thread
From: Erik Mouw @ 2005-12-15 16:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux, linux-kernel

On Thu, Dec 15, 2005 at 08:15:49AM -0800, Linus Torvalds wrote:
> First off, the data structure is called a "semaphore", and always has 
> been. It's _never_ been called a "mutex" in the first place, and the 
> operations have been called "down()" and "up()", because I thought calling 
> them P() and V() was just too damn traditional and confusing (I don't 
> speak dutch, and even if I did, I think shortening names to that degree is 
> just evil).

Just FYI, according to Dijkstra[1] V means "verhoog" which is Dutch for
"increase". P means "prolaag" which isn't a Dutch word, just something
Dijkstra invented. I guess he did that because "decrease" is "verlaag"
in Dutch and that would give you the confusing V() and V()
operations...

Other explanations you see in Dutch CS courses are "passeer" (pass),
"probeer" (try), "vrijgave" (unlock).

I do agree that Dijkstra should have used Prolaag() and Verhoog(), but
I guess those operations wouldn't have stuck in the English CS
literature.

Erik

[1] http://www.cs.utexas.edu/users/EWD/ewd00xx/EWD74.PDF

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:52   ` Erik Mouw
@ 2005-12-15 17:23     ` Dick Streefland
  2005-12-16 12:17     ` Erik Mouw
  1 sibling, 0 replies; 239+ messages in thread
From: Dick Streefland @ 2005-12-15 17:23 UTC (permalink / raw)
  To: linux-kernel

Erik Mouw <erik@harddisk-recovery.com> wrote:
| Just FYI, according to Dijkstra[1] V means "verhoog" which is dutch for
| "increase". P means "prolaag" which isn't a dutch word, just something
| Dijkstra invented. I guess he did that because "decrease" is "verlaag"
| in dutch and that would give you the confusing V() and V()
| operations...
| 
| Other explanations you see in dutch CS courses are "passeer" (pass),
| "probeer" (try), "vrijgave" (unlock).

As far as I can remember, P() stands for "pakken" (grab) and V()
stands for "vrijgeven" (release).

-- 
Dick Streefland                      ////                      Altium BV
dick.streefland@altium.nl           (@ @)          http://www.altium.com
--------------------------------oOO--(_)--OOo---------------------------

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:52   ` Erik Mouw
  2005-12-15 17:23     ` Dick Streefland
@ 2005-12-16 12:17     ` Erik Mouw
  2005-12-17 10:59       ` Sander
  1 sibling, 1 reply; 239+ messages in thread
From: Erik Mouw @ 2005-12-16 12:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux, linux-kernel

On Thu, Dec 15, 2005 at 05:52:55PM +0100, Erik Mouw wrote:
> Just FYI, according to Dijkstra[1] V means "verhoog" which is dutch for
> "increase". P means "prolaag" which isn't a dutch word, just something
> Dijkstra invented. I guess he did that because "decrease" is "verlaag"
> in dutch and that would give you the confusing V() and V()
> operations...

Last night I browsed a little more through Dijkstra's papers,
and in a completely unrelated paper[1] about a now-obsolete computer I
found that "prolaag" is a neologism coming from "probeer te verlagen",
which means "try and decrease".


Erik

[1] http://www.cs.utexas.edu/users/EWD/transcriptions/EWD00xx/EWD51.html

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 12:17     ` Erik Mouw
@ 2005-12-17 10:59       ` Sander
  2005-12-17 14:14         ` Douglas McNaught
  2005-12-19 10:44         ` Erik Mouw
  0 siblings, 2 replies; 239+ messages in thread
From: Sander @ 2005-12-17 10:59 UTC (permalink / raw)
  To: Erik Mouw; +Cc: Linus Torvalds, linux, linux-kernel

Erik Mouw wrote (ao):
> On Thu, Dec 15, 2005 at 05:52:55PM +0100, Erik Mouw wrote:
> > Just FYI, according to Dijkstra[1] V means "verhoog" which is dutch for
> > "increase". P means "prolaag" which isn't a dutch word, just something
> > Dijkstra invented. I guess he did that because "decrease" is "verlaag"
> > in dutch and that would give you the confusing V() and V()
> > operations...
> 
> Last night I've been browsing a little more through Dijkstra's papers,
> and in a completely unrelated paper[1] about a now obsolete computer I
> found that "prolaag" is a neologism coming from "probeer te verlagen",
> which means "try and decrease".

"probeer te verlagen" translates to "try to decrease".

"try and decrease" would be "probeer en verlaag".

-- 
Humilis IT Services and Solutions
http://www.humilis.net

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17 10:59       ` Sander
@ 2005-12-17 14:14         ` Douglas McNaught
  2005-12-17 15:09           ` Sander
  2005-12-19 10:44         ` Erik Mouw
  1 sibling, 1 reply; 239+ messages in thread
From: Douglas McNaught @ 2005-12-17 14:14 UTC (permalink / raw)
  To: sander; +Cc: Erik Mouw, Linus Torvalds, linux, linux-kernel

Sander <sander@humilis.net> writes:

> Erik Mouw wrote (ao):

>> Last night I've been browsing a little more through Dijkstra's papers,
>> and in a completely unrelated paper[1] about a now obsolete computer I
>> found that "prolaag" is a neologism coming from "probeer te verlagen",
>> which means "try and decrease".
>
> "probeer te verlagen" translates to "try to decrease".
>
> "try and decrease" would be "probeer en verlaag".

Just in case you don't know, "try and" in English is an informal
equivalent of "try to".  I agree your translation is probably better.  :)

-Doug

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17 14:14         ` Douglas McNaught
@ 2005-12-17 15:09           ` Sander
  0 siblings, 0 replies; 239+ messages in thread
From: Sander @ 2005-12-17 15:09 UTC (permalink / raw)
  To: Douglas McNaught; +Cc: sander, Erik Mouw, Linus Torvalds, linux, linux-kernel

Douglas McNaught wrote (ao):
> Sander <sander@humilis.net> writes:
> > Erik Mouw wrote (ao):
> >> Last night I've been browsing a little more through Dijkstra's papers,
> >> and in a completely unrelated paper[1] about a now obsolete computer I
> >> found that "prolaag" is a neologism coming from "probeer te verlagen",
> >> which means "try and decrease".
> >
> > "probeer te verlagen" translates to "try to decrease".
> >
> > "try and decrease" would be "probeer en verlaag".
> 
> Just in case you don't know, "try and" in English is an informal
> equivalent of "try to".  I agree your translation is probably better.  :)

I didn't, so thanks for the education :-)

I read it as two things (try something, and decrease), but I see what is
meant now. And I'm sure Erik Mouw knows his way around in both Dutch and
English :-)

-- 
Humilis IT Services and Solutions
http://www.humilis.net

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17 10:59       ` Sander
  2005-12-17 14:14         ` Douglas McNaught
@ 2005-12-19 10:44         ` Erik Mouw
  1 sibling, 0 replies; 239+ messages in thread
From: Erik Mouw @ 2005-12-19 10:44 UTC (permalink / raw)
  To: Sander; +Cc: Linus Torvalds, linux, linux-kernel

On Sat, Dec 17, 2005 at 11:59:14AM +0100, Sander wrote:
> Erik Mouw wrote (ao):
> > Last night I've been browsing a little more through Dijkstra's papers,
> > and in a completely unrelated paper[1] about a now obsolete computer I
> > found that "prolaag" is a neologism coming from "probeer te verlagen",
> > which means "try and decrease".
> 
> "probeer te verlagen" translates to "try to decrease".
> 
> "try and decrease" would be "probeer en verlaag".

I know, but that's how Dijkstra translates it. I guess he knew what he
meant.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.nl -- 0800 220 20 20 --
| Eigen lab: Delftechpark 26, 2628 XH, Delft, Nederland
| Files foetsie, bestanden kwijt, alle data weg?!
| Blijf kalm en neem contact op met Harddisk-recovery.nl!

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:15 ` Linus Torvalds
  2005-12-15 16:52   ` Erik Mouw
@ 2005-12-15 19:02   ` Nikita Danilov
  2005-12-15 19:09   ` linux
  2005-12-15 20:52   ` Steven Rostedt
  3 siblings, 0 replies; 239+ messages in thread
From: Nikita Danilov @ 2005-12-15 19:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

Linus Torvalds writes:
 > 
 > 
 > On Thu, 15 Dec 2005, linux@horizon.com wrote:
 > > 
 > > A counting semaphore is NOT a perfectly fine mutex, and it SHOULD be changed.
 > 
 > Don't be silly.
 > 
 > First off, the data structure is called a "semaphore", and always has 
 > been. It's _never_ been called a "mutex" in the first place, and the 
 > operations have been called "down()" and "up()", because I thought calling 
 > them P() and V() was just too damn traditional and confusing (I don't 
 > speak dutch, and even if I did, I think shortening names to that degree is 
 > just evil).
 > 
 > And dammit, a counting semaphore (and usually you don't even say the 
 > "counting" part, since counting is really always there) is just about 
 > _the_ classical mutual exclusion mechanism. If somebody doesn't know that, 
 > he has absolutely _no_ place talking about mutexes etc.

Dijkstra (who cannot talk about this for much more serious reasons)
didn't know this, because semaphores were initially used as a
wait/signal mechanism to provide concurrency control between "process
context" and "interrupts", however they were called at the time, and
calling this "just mutual exclusion" is stretching it a bit far.

A mutex implies a usage pattern much narrower than that of a generic semaphore. 

 > 
 > And a semaphore _is_ a mutex. 

Nope, a mutex is a semaphore and not the other way around. For one thing, the
notion of ownership is well defined for a mutex, but it is not for a
semaphore. This is what they call a "sub-type".

 >                              Anybody who disputes that is just being a 
 > total troll. 

Oh wait... what is that thing on the right of my screen? This
is... gnome task-list!

[...]

 > 
 > And I can't understand how somebody has the balls to even say that a 
 > semaphore isn't a mutex. That's like saying that an object of type "long" 
 > isn't an integer, because only "int" objects are integers. That's just 
 > INSANE.

And the person that claims that "long" is an "int" is non-portable. :-)

[...]

 > 
 > 			Linus

Nikita.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:15 ` Linus Torvalds
  2005-12-15 16:52   ` Erik Mouw
  2005-12-15 19:02   ` Nikita Danilov
@ 2005-12-15 19:09   ` linux
  2005-12-15 19:52     ` Linus Torvalds
  2005-12-15 21:18     ` Steven Rostedt
  2005-12-15 20:52   ` Steven Rostedt
  3 siblings, 2 replies; 239+ messages in thread
From: linux @ 2005-12-15 19:09 UTC (permalink / raw)
  To: linux, torvalds; +Cc: linux-kernel

> And I can't understand how somebody has the balls to even say that a 
> semaphore isn't a mutex. That's like saying that an object of type "long" 
> isn't an integer, because only "int" objects are integers. That's just 
> INSANE.

I didn't say it isn't a mutex, I said it isn't a GOOD one!

The fundamental reason is that a semaphore doesn't have an owner, and
a mutex does.  And you can do a lot when you know who owns the lock.

>> People are indeed unhappy with the naming, and whether patching 95%
>> of the callers of up() and down() is a good idea is a valid and active
>> subject of debate.  (For an out-of-tree -rt patch, it was certainly
>> an extremely practical solution.)

> In other words, you are
>  (a) totally making up the claim that people are really unhappy

Huh?  I thought *you* were violently unhappy with the idea of naming
mutex acquire and release down() and up(), and your e-mail is an example
of this unhappiness.

Am I making it up that you are unhappy with usurping the down() and up()
names for mutex use?  If this is you being happy, I'd hate to see
unhappy.

> So tell me, what do you think about your own arguments in that light?

I think they're still completely valid.  Nothing you've said even seems
to address the points I've raised.

>> But regardless of the eventual naming convention, mutexes are a good idea.
>> A mutex is *safer* than a counting semaphore.  That's the main benefit.
>> Indeed, unless there's a performance advantage to a counting semaphore,
>> you should use a mutex!

> Hey, feel free to introduce a mutex, but DAMMIT, just call it that, 
> instead of switching people over. 

As I said, as long as the -rt patch was not in the main tree, taking
advantage of the fact that 95% of the down() and up() callers just want
a mutex was a sensible implementation tradeoff.  For merging it into the
tree, it's ugly, and people don't like that.  The -rt folks have gotten
used to their naming perversions and so don't feel as much repugnance.

> And even then, it should damn well also:
>  - really _be_ faster. On platforms that matter. 
>  - have enough real other advantages that it's worth introducing another 
>    abstraction, and more conceptual complexity. At least the RT patches 
>    had a reason for them.

Agreed.  A mutex that's slower than a counting semaphore needs to be
dragged out behind the woodshed and strangled.  If you can't do
any better, it can just *be* a counting semaphore.

> And besides, all your "safer" arguments are pretty damn pointless in the 
> face of the fact that we have basically had zero bugs with the semaphores. 
> This is not where the bugs happen. Yeah, yeah, double releases can happen, 
> but it sure as hell isn't on my radar of things I remember people doing.

There haven't been problems with the semaphore *implementation*, but
people screw up and deadlock themselves often enough.  I sure remember
double-acquire lockups.  Forgive me if I don't grep the archives, but
I remember people showing code paths that led to them.

Admittedly, lock *ordering* problems are the most common deadlock
situation, but hey, guess what!  Priority inheritance code can be
extended to notice that, too.  (There's a performance hit, so it'd
be a debug option.)

But all of this requires that a lock have an identifiable owner, which
is something that a mutex has and a semaphore fundamentally doesn't.

> So when you say "This isn't about speed, this is about bug-free code", 
> you're just making that up.
>
> It's doubly silly when your "safer" 
> implementation uses totally illogical names. THAT is what creates bugs.

If you want to argue about names, go discuss gay marriage.

I don't care what it's *called*.  I care that we have stronger
conditions that we can test for correctness.

> So go away.
> 
> Come back if you have pondered, and accepted reality, and perhaps have an 
> acceptable patch that introduces a separate data structure. 

Ha!  I still say you're wrong, and I'm not going to fold over an obvious
technical point just because of flaming.

Are we having some communication problems?  I find it hard to believe
that you're actually this *stupid*, but we might not be talking about
the same thing.

I took your posting to say that

a) Using the names "struct semaphore", "up()" and "down()" for a mutex
   is monumentally brain-dead.  I'm not arguing, although I understand
   the pragmatic reasons for the original abuse of notation.

b) There is no need for a mutex implementation, because a semaphore can
   do anything that a mutex can.  Here, I absolutely disagree.  There
   are things you can do with a mutex that you CANNOT do with a
   general semaphore, because a mutex has stronger invariants.

   A counting semaphore can do MOST of what a mutex does, and is
   demonstrably close enough for a lot of uses.

> And no, we're not switching users over whole-sale.  First you introduce the 
> new concept.  Only THEN can you switch over INDIVIDUAL LOCKS with 
> reasons for why it's worth it.

Given that 95% of callers are using it as mutex, you're making this 20
times more work than necessary.  Convert 'em all and change the 5%
that need the counting back.

> And hell yes, performance does matter.

I'm not arguing, but this seems to be at odds with your earlier statement:
>>> Dammit, unless the pure mutex has a _huge_ performance advantage on major 
>>> architectures, we're not changing it.

There is obviously no reason to accept a performance *decrease*, but
any potential performance *increase* is unimportant and incidental.

Which is exactly what I said:
>> Indeed, unless there's a performance advantage to a counting semaphore,
>> you should use a mutex!

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 19:09   ` linux
@ 2005-12-15 19:52     ` Linus Torvalds
  2005-12-16  1:33       ` linux
  2005-12-15 21:18     ` Steven Rostedt
  1 sibling, 1 reply; 239+ messages in thread
From: Linus Torvalds @ 2005-12-15 19:52 UTC (permalink / raw)
  To: linux; +Cc: linux-kernel

On Thu, 15 Dec 2005, linux@horizon.com wrote:
> 
> Huh?  I thought *you* were violently unhappy with the idea of naming
> mutex acquire and release down() and up(), and your e-mail is an example
> of this unhappiness.

Ahh, I thought you were considering naming unhappiness to be a reason
_for_ the mutex change.

down() and up() aren't the traditional names for the operations, P()/V()
are, and I thought you were arguing that some people might dislike the
_current_ naming.

> There haven't been problems with the semaphore *implementation*, but
> people screw up and deadlock themselves often enough.  I sure remember
> double-acquire lockups.  Forgive me if I don't grep the archives, but
> I remember people showing code paths that led to them.

Double acquires certainly occasionally happen, but they are (assuming it's 
a real double acquire, and not a race due to lock ordering) easy to see, 
since it just hangs the process and you get a traceback and find it.

But mutexes don't help either. A mutex will hang exactly the same way, 
with exactly the same behaviour, and aren't any easier to debug (as 
mentioned, a hung semaphore isn't exactly hard to debug).

Or maybe you're talking about a _recursive_ mutex, which is something that 
actually has totally different semantics. We've discussed adding them, but 
pretty much every time it was the result of some really really bad locking 
design, so at least so far we've instead decided to bite the bullet and 
fix the locking.

So yes, recursive mutexes can be easier to use, but they really do allow 
(and thus indirectly encourage) bad locking. So I'm not convinced we want 
one.

> But all of this requires that a lock have an identifiable owner, which
> is something that a mutex has and a semaphore fundamentally doesn't.

Actually, we've certainly had semaphore debugging patches which consider 
the last person who successfully got a semaphore to be the "owner".

Sure, it's not well-defined for the generic semaphore case, but so what? 
In the generic case, you can't use mutexes anyway. For _debugging_ 
ownership as in "who got this lock last" is still useful. I know I've 
cooked up trivial patches to do that when I was trying to figure out a 
lock ordering bug.

Google is your friend. Just try "semaphore owner debug", and the #2 hit is 
a patch that does exactly that for Linux.

> I don't care what it's *called*.  I care that we have stronger
> conditions that we can test for correctness.

Hey, if so, please don't encourage the current patch. 

We can certainly add a new locking mechanism, I'm just not at all 
convinced that it's warranted. I certainly disagree with you that using 
semaphores would somehow be less easy to test for correctness.

		Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 19:52     ` Linus Torvalds
@ 2005-12-16  1:33       ` linux
  0 siblings, 0 replies; 239+ messages in thread
From: linux @ 2005-12-16  1:33 UTC (permalink / raw)
  To: linux, torvalds; +Cc: linux-kernel

> Ahh, I thought you were considering naming unhappiness to be a reason
> _for_ the mutex change.

Good gods, no.  The best mnemonics for P() and V() I've ever seen
were in the Amiga which called them Procure() and Vacate(), but the
names still suck mightily.

> Double acquires certainly occasionally happen, but they are (assuming it's 
> a real double acquire, and not a race due to lock ordering) easy to see, 
> since it just hangs the process and you get a traceback and find it.

Ah, now we get to the valid point.

> But mutexes don't help either. A mutex will hang exactly the same way, 
> with exactly the same behaviour, and aren't any easier to debug (as 
> mentioned, a hung semaphore isn't exactly hard to debug).

Well, a mutex can detect it immediately, rather than via a timeout,
but that's a matter of a few seconds, and the vast majority of the
evidence you want is frozen by the deadlock itself.

> So yes, recursive mutexes can be easier to use, but they really do allow 
> (and thus indirectly encourage) bad locking. So I'm not convinced we want 
> one.

Agreed.  Maybe someone will someday find an application where there's
really no way around it, but avoiding the need is generally better
design.

>> But all of this requires that a lock have an identifiable owner, which
>> is something that a mutex has and a semaphore fundamentally doesn't.

> Actually, we've certainly had semaphore debugging patches which consider 
> the last person who successfully got a semaphore to be the "owner".
> 
> Sure, it's not well-defined for the generic semaphore case, but so what? 

An excellent point.  As long as it's only used for post-mortem debugging,
and not to verify invariants at run-time, you can keep track of
"who would be the owner if this were a mutex".

>> I don't care what it's *called*.  I care that we have stronger
>> conditions that we can test for correctness.

> Hey, if so, please don't encourage the current patch. 
>
> We can certainly add a new locking mechanism, I'm just not at all 
> convinced that it's warranted. I certainly disagree with you that using 
> semaphores would somehow be less easy to test for correctness.

Let's go through what you lose if you give up a known lock owner and
just use something probabilistic....

Detecting deadlocks - can be done immediately and definitively with
a lock owner, and only via timeouts without.  Still, not a deal-breaker.

Multi-lock deadlocks - same, although the detection code is
more complex so probably shouldn't be enabled all the time.

Double-release: can be instantly detected with a known mutex, but will
produce really odd misbehaviour with a semaphore.  Still, you're quite
right that failing to release on a failure path is by far the more
common bug.

Priority inheritance.  This is the original reason that the -rt patch
implemented mutexes, and requires accurate lock owner information.
Not having it is a show-stopper here.

Source documentation.  This is more a style thing, but I really
like putting as much information into the source as possible.
(If comments worked, we wouldn't need sparse.)  

So the latter two are the only really good reasons.  Still, I think
they're persuasive.

Can anyone think of any other benefits?
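
As a rough illustration of the owner-based checks listed above, here is a
minimal userspace sketch.  It is not kernel code: struct toy_task,
struct toy_mutex and both functions are hypothetical names made up for
this example, and where a real mutex would sleep the sketch just returns
-EBUSY.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's task_struct or struct mutex. */
struct toy_task {
	int id;
};

struct toy_mutex {
	const struct toy_task *owner;	/* NULL while the mutex is free */
};

/*
 * Try to take the mutex on behalf of 'self'.
 * Returns 0 on success, -EDEADLK if 'self' already holds it (a double
 * acquire, caught immediately instead of via a timeout), or -EBUSY if
 * another task holds it (a real mutex would sleep here).
 */
static int toy_mutex_lock(struct toy_mutex *m, const struct toy_task *self)
{
	if (m->owner == self)
		return -EDEADLK;
	if (m->owner != NULL)
		return -EBUSY;
	m->owner = self;
	return 0;
}

/*
 * Release the mutex on behalf of 'self'.
 * Returns 0 on success, or -EPERM if 'self' is not the owner, which
 * covers both a double release and a release by the wrong task.
 */
static int toy_mutex_unlock(struct toy_mutex *m, const struct toy_task *self)
{
	if (m->owner != self)
		return -EPERM;
	m->owner = NULL;
	return 0;
}
```

A counting semaphore has no owner field to compare against, so neither
check can be made exact there; at best it can record the last acquirer
for post-mortem use.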

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 19:09   ` linux
  2005-12-15 19:52     ` Linus Torvalds
@ 2005-12-15 21:18     ` Steven Rostedt
  1 sibling, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-15 21:18 UTC (permalink / raw)
  To: linux; +Cc: linux-kernel, torvalds

On Thu, 2005-12-15 at 14:09 -0500, linux@horizon.com wrote:
> >
> 
> As I said, as long as the -rt patch was not in the main tree, taking
> advantage of the fact that 95% of the down() and up() callers just want
> a mutex was a sensible implementation tradeoff.  For merging it into the
> tree, it's ugly, and people don't like that.  The -rt folks have gotten
> used to their naming perversions and so don't feel as much repugnance.
> 

The naming in the -rt side is to try to keep things as much in parallel
to the mainline as possible.  I don't think Ingo would have a problem
with up / down just being used for semaphores, and having true mutex
names for just that.  In fact that would help us a lot.  A lot of the bug
fixes that I send to Ingo are for places that use a mutex when they
are really counting semaphores, and thus can't have PI.

>

...

> 
> > So when you say "This isn't about speed, this is about bug-free code", 
> > you're just making that up.
> >
> > It's doubly silly when your "safer" 
> > implementation uses totally illogical names. THAT is what creates bugs.
> 
> If you want to argue about names, go discuss gay marriage.

Are you suggesting a "mutex union"?

> 
> I don't care what it's *called*.  I care that we have stronger
> conditions that we can test for correctness.

Well a name is helpful in understanding what's going on.  Especially if
we want new up and coming kernel programmers to help out.  Instead of
staying with what is there now.

Also, while we're at it, let's fix that damn down_trylock (or
mutex_trylock) to return 1 on success, 0 on contention, just like the
spin_trylock does!!!

Actually, that alone is a good argument to not keep the same names.  We
can keep down_trylock as the same perverted self, and have mutex_trylock
do it right.  Of course, special care is needed when doing this
conversion, but a wrong pick should show itself right away.
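
To make the two conventions concrete, here is a small userspace sketch.
The toy_* names and struct toy_lock are hypothetical and exist only for
this example; the only thing taken from the kernel is the return
convention (down_trylock returns 0 on success, spin_trylock returns 1 on
success).

```c
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for a lock; illustration only. */
struct toy_lock {
	bool held;
};

/* Semaphore-style convention: 0 on success, 1 on contention. */
static int toy_down_trylock(struct toy_lock *l)
{
	if (l->held)
		return 1;
	l->held = true;
	return 0;
}

/* spin_trylock-style convention: 1 on success, 0 on contention. */
static int toy_mutex_trylock(struct toy_lock *l)
{
	return !toy_down_trylock(l);
}
```

A caller written as "if (toy_mutex_trylock(&l))" takes the branch when it
got the lock, while "if (toy_down_trylock(&l))" takes it when it did not,
which is why a wrong pick during conversion tends to show up quickly.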

> 
> > So go away.
> > 
> > Come back if you have pondered, and accepted reality, and perhaps have an 
> > acceptable patch that introduces a separate data structure. 
> 
> Ha!  I still say you're wrong, and I'm not going to fold over an obvious
> technical point just because of flaming.
> 
> Are we having some communication problems?  I find it hard to believe
> that you're actually this *stupid*, but we might not be talking about
> the same thing.
> 
> I took your posting to say that
> 
> a) Using the names "struct semaphore", "up()" and "down()" for a mutex
>    is monumentally brain-dead.  I'm not arguing, although I understand
>    the pragmatic reasons for the original abuse of notation.
> 
> b) There is no need for a mutex implementation, because a semaphore can
>    do anything that a mutex can.  Here, I absolutely disagree.  There
>    are things you can do with a mutex that you CANNOT do with a
>    general semaphore, because a mutex has stronger invariants.
> 
>    A counting semaphore can do MOST of what a mutex does, and is
>    demonstrably close enough for a lot of uses.
> 
> > And no, we're not switching users over whole-sale.  First you introduce the 
> > new concept.  Only THEN can you switch over INDIVIDUAL LOCKS with 
> > reasons for why it's worth it.
> 
> Given that 95% of callers are using it as mutex, you're making this 20
> times more work than necessary.  Convert 'em all and change the 5%
> that need the counting back.

I disagree with doing that.  Especially since I've argued that a mutex
is a semaphore, but a semaphore is not a mutex.  So I'd rather go slowly,
changing the semaphores that are acting as mutexes (since those that
are not changed are not broken), than change all semaphores to
mutexes, where a mutex can not always act like a semaphore, and then
go and break those 5%.

In reality, this is what the RT patch did. All semaphores (up / down)
became mutexes, and then we manually found the counting semaphores and
started switching them to compat_semaphores (what semaphore is today).
I'm still sending in patches to fix these.

-- Steve

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:15 ` Linus Torvalds
                     ` (2 preceding siblings ...)
  2005-12-15 19:09   ` linux
@ 2005-12-15 20:52   ` Steven Rostedt
  3 siblings, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-15 20:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux

On Thu, 2005-12-15 at 08:15 -0800, Linus Torvalds wrote:
> 
> On Thu, 15 Dec 2005, linux@horizon.com wrote:
> > 
> > A counting semaphore is NOT a perfectly fine mutex, and it SHOULD be changed.
> 
> Don't be silly.
> 
> First off, the data structure is called a "semaphore", and always has 
> been. It's _never_ been called a "mutex" in the first place, and the 
> operations have been called "down()" and "up()", because I thought calling 
> them P() and V() was just too damn traditional and confusing (I don't 
> speak dutch, and even if I did, I think shortening names to that degree is 
> just evil).

Thank you for the down and up, I always had problems way back when I was
in college.  I could never remember which was which (between P and V).

> And dammit, a counting semaphore (and usually you don't even say the 
> "counting" part, since counting is really always there) is just about 
> _the_ classical mutual exclusion mechanism. If somebody doesn't know that, 
> he has absolutely _no_ place talking about mutexes etc.
> 
> And a semaphore _is_ a mutex. Anybody who disputes that is just being a 
> total troll. Even classically, the case where the semaphore was 
> initialized to 1 is very very traditional, and is very much part of the 
> whole point of a semaphore. Sometimes they are called "binary semaphores", 
> but dammit, they are just the same thing.

<Total Troll> ;)

Uh, I think that a semaphore is _not_ a mutex.  As someone said earlier,
a mutex is a semaphore, but not the other way around.  I have used
counting semaphores for resource allocations, not the semaphore=1 kind.
This is not mutual exclusion; it's shared.

Also a semaphore can be declared locked, a mutex can't.

Of course, if you said "binary semaphores" is a mutex, then I would
agree with you.

</Total Troll>


> 
> A patch that
>  - creates a non-counting mutex
>  - .. that is SLOWER than the current counting one
>  - .. and keeps the old "semaphore" and "up/down" naming
> 
> is simply INCREDIBLY BROKEN. It has absolutely _zero_ redeeming features. 
> I can't understand how there are a hundred emails in my mailbox even 
> discussing it. 

ACK

But that was just somebody's (David's) first crack at this patch.  I've
been pushing that he should first submit the mutex, where everyone
else can help make it really fast, and then submit, case by case, the
places where the semaphores should be replaced with a mutex.  That's the
most logical way I see it.  And yes, that means lots of patches,
but it makes it easier for more than one person to submit and
review them.

> 
> And I can't understand how somebody has the balls to even say that a 
> semaphore isn't a mutex. That's like saying that an object of type "long" 
> isn't an integer, because only "int" objects are integers. That's just 
> INSANE.

No, it's like saying an integer is a long. Wait did someone else say that
already?

> 
> > People are indeed unhappy with the naming, and whether patching 95%
> > of the callers of up() and down() is a good idea is a valid and active
> > subject of debate.  (For an out-of-tree -rt patch, it was certainly
> > an extremely practical solution.)
> 
> Whatever people you claim are unhappy with the naming are
>  - obviously totally unaware of very basic synchronization primitives
>    used in concurrent programming
>  - likely haven't spent any time at all looking at the kernel source code.
>  - haven't _ever_ complained that I've seen before this totally made-up 
>    discussion.

:( I'm unhappy with calling a mutex down and up. But let's see if I am
any of the above?

1. Being the one to remove the global PI lock in the rt patch, gives me
some credit that I understand basic synchronization primitives used in
concurrent programming.

2. Have been playing with the Linux kernel source since 1998 (just not
publicly until 2003).

3.  OK, you got me there ;)


> 
> In other words, you are
>  (a) totally making up the claim that people are really unhappy
>  (b) jerking people around who _do_ know about semaphores and _have_ 
>      worked with the kernel locking primitives and understand them well

I'm one that would like to see semaphores turn to mutex when they really
are one.  But I'd like to keep up / down for semaphores (as they are
today) and introduce a new mutex api mutex_lock / mutex_unlock, since I
think that is the best way to explain what's going on.

> 
> So tell me, what do you think about your own arguments in that light?
> 
> > But regardless of the eventual naming convention, mutexes are a good idea.
> > A mutex is *safer* than a counting semaphore.  That's the main benefit.
> > Indeed, unless there's a performance advantage to a counting semaphore,
> > you should use a mutex!
> 
> Hey, feel free to introduce a mutex, but DAMMIT, just call it that, 
> instead of switching people over. 

ACK

> 
> And even then, it should damn well also:
>  - really _be_ faster. On platforms that matter. 
>  - have enough real other advantages that it's worth introducing another 
>    abstraction, and more conceptual complexity. At least the RT patches 
>    had a reason for them.

Actually, I would like this change to make it in mainline to help with
the RT patches.

> 
> And besides, all your "safer" arguments are pretty damn pointless in the 
> face of the fact that we have basically had zero bugs with the semaphores. 
> This is not where the bugs happen. Yeah, yeah, double releases can happen, 
> but it sure as hell isn't on my radar of things I remember people doing.
> 
> So when you say "This isn't about speed, this is about bug-free code", 
> you're just making that up. It's doubly silly when your "safer" 
> implementation uses totally illogical names. THAT is what creates bugs.
> 
> So go away.
> 
> Come back if you have pondered, and accepted reality, and perhaps have an 
> acceptable patch that introduces a separate data structure. 
> 
> And no, we're not switching users over whole-sale. First you introduce the 
> > new concept. Only THEN can you switch over INDIVIDUAL LOCKS with 
> reasons for why it's worth it.
> 
> And hell yes, performance does matter.

And so do a lot of other things.

-- Steve



^ permalink raw reply	[flat|nested] 239+ messages in thread

* [PATCH 1/19] MUTEX: Introduce simple mutex implementation
@ 2005-12-12 23:45 David Howells
  2005-12-13  0:13 ` Nick Piggin
                   ` (14 more replies)
  0 siblings, 15 replies; 239+ messages in thread
From: David Howells @ 2005-12-12 23:45 UTC (permalink / raw)
  To: torvalds, akpm, hch, arjan, matthew; +Cc: linux-kernel, linux-arch

The attached patch introduces a simple mutex implementation as an alternative
to the usual semaphore implementation where simple mutex functionality is all
that is required.

This is useful in two ways:

 (1) A number of archs only provide very simple atomic instructions (such as
     XCHG on i386, TAS on M68K, SWAP on FRV) which aren't sufficient to
     implement full semaphore support directly. Instead spinlocks must be
     employed to implement fuller functionality.

 (2) This makes it obvious in what way the semaphore is being used: whether
     it's being used as a mutex or being used as a counter.

This patch set does the following:

 (1) Provides a simple xchg() based semaphore as a default for all
     architectures that don't wish to override it and provide their own.

     Overriding is possible by setting CONFIG_ARCH_IMPLEMENTS_MUTEX and
     supplying asm/mutex.h

     Partial overriding is possible by #defining mutex_grab(), mutex_release()
     and is_mutex_locked() to perform the appropriate optimised functions.

 (2) Provides linux/mutex.h as a common include for gaining access to mutex
     semaphores.

 (3) Provides linux/semaphore.h as a common include for gaining access to all
     the different types of semaphore that may be used from within the kernel.

 (4) Renames down*() to down_sem*() and up() to up_sem() for the traditional
     semaphores, and removes init_MUTEX*() and DECLARE_MUTEX*() from
     asm/semaphore.h

 (5) Redirects the following to apply to the new mutexes rather than the
     traditional semaphores:

	down()
	down_trylock()
	down_interruptible()
	up()
	init_MUTEX()
	init_MUTEX_LOCKED()
	DECLARE_MUTEX()
	DECLARE_MUTEX_LOCKED()

     On the basis that most usages of semaphores are as mutexes, this makes
     sense, for in most cases it's then just a matter of changing the type from
     struct semaphore to struct mutex. In some cases, sema_init() has to be
     changed to init_MUTEX*() also.

 (6) Generally include linux/semaphore.h in place of asm/semaphore.h.

 (7) Provides a debugging config option CONFIG_DEBUG_MUTEX_OWNER by which the
     mutex owner can be tracked and by which over-upping can be detected.
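
For readers without the kernel tree to hand, the exchange-based fast path
in (1) can be sketched in userspace with C11 atomics.  This is only an
approximation under stated assumptions: atomic_exchange() stands in for
the kernel's xchg(), the toy_* names are hypothetical, and the wait list
and slow path are left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in: state 0 = available, non-zero = busy. */
struct toy_xchg_mutex {
	atomic_int state;
};

/* Swap in 1 unconditionally; we got the mutex only if the old value was 0. */
static bool toy_grab(struct toy_xchg_mutex *m)
{
	return atomic_exchange(&m->state, 1) == 0;
}

/* Mark the mutex available again (no waiters are handled in this sketch). */
static void toy_release(struct toy_xchg_mutex *m)
{
	atomic_store(&m->state, 0);
}

/* Returns true if the mutex is currently held. */
static bool toy_is_locked(struct toy_xchg_mutex *m)
{
	return atomic_load(&m->state) != 0;
}
```

A failed grab leaves the state at 1, so it does not disturb the current
holder; that is what lets a plain exchange instruction suffice where no
atomic decrement-and-test is available.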

Signed-Off-By: David Howells <dhowells@redhat.com>
---
warthog>diffstat -p1 mutex-simple-2615rc5.diff
 include/linux/mutex-simple.h |  194 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/mutex.h        |   32 +++++++
 include/linux/semaphore.h    |   30 ++++++
 lib/Kconfig.debug            |    8 +
 lib/Makefile                 |    4 
 lib/mutex-simple.c           |  178 +++++++++++++++++++++++++++++++++++++++
 lib/semaphore-sleepers.c     |    8 -
 7 files changed, 450 insertions(+), 4 deletions(-)

diff -uNrp /warthog/kernels/linux-2.6.15-rc5/include/linux/semaphore.h linux-2.6.15-rc5-mutex/include/linux/semaphore.h
--- /warthog/kernels/linux-2.6.15-rc5/include/linux/semaphore.h	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc5-mutex/include/linux/semaphore.h	2005-12-12 22:03:53.000000000 +0000
@@ -0,0 +1,30 @@
+/* semaphore.h: include the various types of semaphore in one package
+ *
+ * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_SEMAPHORE_H
+#define _LINUX_SEMAPHORE_H
+
+/*
+ * simple mutex semaphores
+ */
+#include <linux/mutex.h>
+
+/*
+ * multiple-count semaphores
+ */
+#include <asm/semaphore.h>
+
+/*
+ * read/write semaphores
+ */
+#include <linux/rwsem.h>
+
+#endif /* _LINUX_SEMAPHORE_H */
diff -uNrp /warthog/kernels/linux-2.6.15-rc5/include/linux/mutex.h linux-2.6.15-rc5-mutex/include/linux/mutex.h
--- /warthog/kernels/linux-2.6.15-rc5/include/linux/mutex.h	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc5-mutex/include/linux/mutex.h	2005-12-12 22:13:30.000000000 +0000
@@ -0,0 +1,32 @@
+/* mutex.h: mutex semaphore implementation base
+ *
+ * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#ifndef _LINUX_MUTEX_H
+#define _LINUX_MUTEX_H
+
+#include <linux/config.h>
+
+#ifdef CONFIG_ARCH_IMPLEMENTS_MUTEX
+
+/*
+ * the arch wants to implement the whole mutex itself
+ */
+#include <asm/mutex.h>
+#else
+
+/*
+ * simple exchange-based mutex
+ * - the arch may override mutex_grab(), mutex_release() and is_mutex_locked()
+ *   to use something other than xchg() by #defining mutex_grab
+ */
+#include <linux/mutex-simple.h>
+#endif
+
+#endif /* _LINUX_MUTEX_H */
diff -uNrp /warthog/kernels/linux-2.6.15-rc5/include/linux/mutex-simple.h linux-2.6.15-rc5-mutex/include/linux/mutex-simple.h
--- /warthog/kernels/linux-2.6.15-rc5/include/linux/mutex-simple.h	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc5-mutex/include/linux/mutex-simple.h	2005-12-12 22:26:11.000000000 +0000
@@ -0,0 +1,194 @@
+/* mutex-simple.h: simple exchange-based mutexes
+ *
+ * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ *
+ * This doesn't require the arch to do anything for straightforward xchg()
+ * based mutexes
+ *
+ * If the arch sets CONFIG_ARCH_IMPLEMENTS_MUTEX then this implementation will
+ * not be used, and the arch should supply asm/mutex.h.
+ *
+ * If the arch defines mutex_grab(), mutex_release() and is_mutex_locked() for
+ * itself, then those will be used to provide the appropriate functionality
+ *
+ * See lib/mutex-simple.c for the slow-path implementation.
+ */
+#ifndef _LINUX_MUTEX_SIMPLE_H
+#define _LINUX_MUTEX_SIMPLE_H
+
+#ifndef _LINUX_MUTEX_H
+#error linux/mutex-simple.h should not be included directly; use linux/mutex.h instead
+#endif
+
+#ifndef __ASSEMBLY__
+
+#include <linux/linkage.h>
+#include <linux/wait.h>
+#include <linux/spinlock.h>
+#include <asm/system.h>
+
+/*
+ * the mutex semaphore definition
+ * - if state is 0, then the mutex is available
+ * - if state is non-zero, then the mutex is busy
+ * - if wait_list is not empty, then there are processes waiting for the mutex
+ */
+struct mutex {
+	int			state;
+	spinlock_t		wait_lock;
+	struct list_head	wait_list;
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+	struct task_struct	*__owner;
+#endif
+};
+
+#ifndef mutex_grab
+/*
+ * mutex_grab() attempts to grab the mutex and returns true if successful
+ */
+#define mutex_grab(mutex)	(xchg(&(mutex)->state, 1) == 0)
+
+/*
+ * mutex_release() releases the mutex
+ */
+#define mutex_release(mutex)	do { (mutex)->state = 0; } while(0)
+
+/*
+ * is_mutex_locked() returns non-zero if the mutex is locked
+ */
+#define is_mutex_locked(mutex)	((mutex)->state)
+#endif
+
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+# define __MUTEX_OWNER_INIT(owner) , .__owner = (owner)
+#else
+# define __MUTEX_OWNER_INIT(owner)
+#endif
+
+#define __MUTEX_INITIALISER(name,_state,owner)			\
+{								\
+	.state		= (_state),				\
+	.wait_lock	= SPIN_LOCK_UNLOCKED,			\
+	.wait_list	= LIST_HEAD_INIT((name).wait_list)	\
+	__MUTEX_OWNER_INIT(owner)				\
+}
+
+#define __DECLARE_MUTEX_GENERIC(name, state, owner)			\
+	struct mutex name = __MUTEX_INITIALISER(name, state, owner)
+
+#define DECLARE_MUTEX(name) \
+	__DECLARE_MUTEX_GENERIC(name, 0, NULL)
+
+#define DECLARE_MUTEX_LOCKED(name, owner) \
+	__DECLARE_MUTEX_GENERIC(name, 1, (owner))
+
+static inline void mutex_init(struct mutex *mutex,
+			      unsigned state,
+			      struct task_struct *owner)
+{
+	mutex->state = state;
+	spin_lock_init(&mutex->wait_lock);
+	INIT_LIST_HEAD(&mutex->wait_list);
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+	mutex->__owner = owner;
+#endif
+}
+
+static inline void init_MUTEX(struct mutex *mutex)
+{
+	mutex_init(mutex, 0, NULL);
+}
+
+static inline void init_MUTEX_LOCKED (struct mutex *mutex)
+{
+	mutex_init(mutex, 1, current);
+}
+
+extern void __down(struct mutex *mutex);
+extern int  __down_interruptible(struct mutex *mutex);
+extern void __up(struct mutex *mutex);
+
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+extern void __up_bad(struct mutex *mutex);
+#endif
+
+/*
+ * sleep until we get the mutex
+ */
+static inline void down(struct mutex *mutex)
+{
+	if (mutex_grab(mutex)) {
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+		mutex->__owner = current;
+#endif
+	}
+	else {
+		__down(mutex);
+	}
+}
+
+/*
+ * sleep interruptibly until we get the mutex
+ * - return 0 if successful, -EINTR if interrupted
+ */
+static inline int down_interruptible(struct mutex *mutex)
+{
+	if (mutex_grab(mutex)) {
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+		mutex->__owner = current;
+#endif
+		return 0;
+	}
+
+	return __down_interruptible(mutex);
+}
+
+/*
+ * attempt to grab the mutex without waiting for it to become available
+ * - returns zero if we acquired it
+ */
+static inline int down_trylock(struct mutex *mutex)
+{
+	if (mutex_grab(mutex)) {
+		/* success */
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+		mutex->__owner = current;
+#endif
+		return 0;
+	}
+
+	/* failure */
+	return 1;
+}
+
+/*
+ * release the mutex
+ */
+static inline void up(struct mutex *mutex)
+{
+	unsigned long flags;
+
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+	if (mutex->__owner != current)
+		__up_bad(mutex);
+	mutex->__owner = NULL;
+#endif
+
+	/* must prevent a race */
+	spin_lock_irqsave(&mutex->wait_lock, flags);
+	if (!list_empty(&mutex->wait_list))
+		__up(mutex);
+	else
+		mutex_release(mutex);
+	spin_unlock_irqrestore(&mutex->wait_lock, flags);
+}
+
+#endif /* __ASSEMBLY__ */
+#endif /* _LINUX_MUTEX_SIMPLE_H */
diff -uNrp /warthog/kernels/linux-2.6.15-rc5/lib/Kconfig.debug linux-2.6.15-rc5-mutex/lib/Kconfig.debug
--- /warthog/kernels/linux-2.6.15-rc5/lib/Kconfig.debug	2005-12-08 16:23:56.000000000 +0000
+++ linux-2.6.15-rc5-mutex/lib/Kconfig.debug	2005-12-12 16:59:35.000000000 +0000
@@ -111,6 +111,14 @@ config DEBUG_SPINLOCK_SLEEP
 	  If you say Y here, various routines which may sleep will become very
 	  noisy if they are called with a spinlock held.
 
+config DEBUG_MUTEX_OWNER
+	bool "Mutex owner tracking and checking"
+	depends on DEBUG_KERNEL
+	help
+	  If you say Y here, the process currently owning a mutex will be
+	  remembered, and a warning will be issued if anyone other than that
+	  process releases it.
+
 config DEBUG_KOBJECT
 	bool "kobject debugging"
 	depends on DEBUG_KERNEL
diff -uNrp /warthog/kernels/linux-2.6.15-rc5/lib/Makefile linux-2.6.15-rc5-mutex/lib/Makefile
--- /warthog/kernels/linux-2.6.15-rc5/lib/Makefile	2005-12-08 16:23:56.000000000 +0000
+++ linux-2.6.15-rc5-mutex/lib/Makefile	2005-12-12 18:59:21.000000000 +0000
@@ -28,6 +28,10 @@ ifneq ($(CONFIG_HAVE_DEC_LOCK),y)
   lib-y += dec_and_lock.o
 endif
 
+ifneq ($(CONFIG_ARCH_IMPLEMENTS_MUTEX),y)
+  lib-y += mutex-simple.o
+endif
+
 obj-$(CONFIG_CRC_CCITT)	+= crc-ccitt.o
 obj-$(CONFIG_CRC16)	+= crc16.o
 obj-$(CONFIG_CRC32)	+= crc32.o
diff -uNrp /warthog/kernels/linux-2.6.15-rc5/lib/mutex-simple.c linux-2.6.15-rc5-mutex/lib/mutex-simple.c
--- /warthog/kernels/linux-2.6.15-rc5/lib/mutex-simple.c	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.15-rc5-mutex/lib/mutex-simple.c	2005-12-12 22:27:00.000000000 +0000
@@ -0,0 +1,178 @@
+/* mutex-simple.c: simple mutex slow paths
+ *
+ * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/config.h>
+#include <linux/sched.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+
+struct mutex_waiter {
+	struct list_head	list;
+	struct task_struct	*task;
+};
+
+/*
+ * wait for a token to be granted from a mutex
+ */
+void __down(struct mutex *mutex)
+{
+	struct mutex_waiter waiter;
+	struct task_struct *tsk = current;
+	unsigned long flags;
+
+	/* set up my own style of waitqueue */
+	waiter.task = tsk;
+
+	spin_lock_irqsave(&mutex->wait_lock, flags);
+
+	if (mutex_grab(mutex)) {
+		/* we got the mutex anyway */
+		spin_unlock_irqrestore(&mutex->wait_lock, flags);
+		return;
+	}
+
+	/* need to sleep */
+	get_task_struct(tsk);
+	list_add_tail(&waiter.list, &mutex->wait_list);
+
+	/* we don't need to touch the mutex struct anymore */
+	spin_unlock_irqrestore(&mutex->wait_lock, flags);
+
+	/* wait to be given the mutex */
+	set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+
+	for (;;) {
+		if (list_empty(&waiter.list))
+			break;
+		schedule();
+		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+	}
+
+	tsk->state = TASK_RUNNING;
+}
+
+EXPORT_SYMBOL(__down);
+
+/*
+ * interruptibly wait for a token to be granted from a mutex
+ */
+int __down_interruptible(struct mutex *mutex)
+{
+	struct mutex_waiter waiter;
+	struct task_struct *tsk = current;
+	unsigned long flags;
+	int ret;
+
+	/* set up my own style of waitqueue */
+	waiter.task = tsk;
+
+	spin_lock_irqsave(&mutex->wait_lock, flags);
+
+	if (mutex_grab(mutex)) {
+		/* we got the mutex anyway */
+		spin_unlock_irqrestore(&mutex->wait_lock, flags);
+		return 0;
+	}
+
+	/* need to sleep */
+	get_task_struct(tsk);
+	list_add_tail(&waiter.list, &mutex->wait_list);
+
+	/* we don't need to touch the mutex struct anymore */
+	set_task_state(tsk, TASK_INTERRUPTIBLE);
+
+	spin_unlock_irqrestore(&mutex->wait_lock, flags);
+
+	/* wait to be given the mutex */
+	for (;;) {
+		if (list_empty(&waiter.list))
+			break;
+		if (unlikely(signal_pending(current)))
+			goto interrupted;
+		schedule();
+		set_task_state(tsk, TASK_INTERRUPTIBLE);
+	}
+
+	tsk->state = TASK_RUNNING;
+	return 0;
+
+interrupted:
+	spin_lock_irqsave(&mutex->wait_lock, flags);
+
+	/* we may still have been given the mutex */
+	ret = 0;
+	if (!list_empty(&waiter.list)) {
+		list_del(&waiter.list);
+		ret = -EINTR;
+	}
+
+	spin_unlock_irqrestore(&mutex->wait_lock, flags);
+	if (ret == -EINTR)
+		put_task_struct(current);
+	return ret;
+}
+
+EXPORT_SYMBOL(__down_interruptible);
+
+/*
+ * release a single token back to a mutex
+ * - entered with lock held and interrupts disabled
+ * - the queue will not be empty
+ */
+void __up(struct mutex *mutex)
+{
+	struct mutex_waiter *waiter;
+	struct task_struct *tsk;
+
+	/* grant the token to the process at the front of the queue */
+	waiter = list_entry(mutex->wait_list.next, struct mutex_waiter, list);
+
+	/* we must be careful not to touch 'waiter' after we set ->task = NULL.
+	 * - it is allocated on the waiter's stack and may become invalid at
+	 *   any time after that point (due to a wakeup from another source).
+	 */
+	list_del_init(&waiter->list);
+	tsk = waiter->task;
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+	mutex->__owner = tsk;
+#endif
+	mb();
+	waiter->task = NULL;
+	wake_up_process(tsk);
+	put_task_struct(tsk);
+}
+
+EXPORT_SYMBOL(__up);
+
+/*
+ * report an up() that doesn't match a down()
+ */
+#ifdef CONFIG_DEBUG_MUTEX_OWNER
+void __up_bad(struct mutex *mutex)
+{
+	if (!mutex->__owner) {
+		printk(KERN_ERR
+		       "BUG: process %d [%s] releasing unowned mutex\n",
+		       current->pid,
+		       current->comm);
+	}
+	else {
+		printk(KERN_ERR
+		       "BUG: process %d [%s] releasing mutex owned by %d [%s]\n",
+		       current->pid,
+		       current->comm,
+		       mutex->__owner->pid,
+		       mutex->__owner->comm);
+	}
+}
+
+EXPORT_SYMBOL(__up_bad);
+#endif
diff -uNrp /warthog/kernels/linux-2.6.15-rc5/lib/semaphore-sleepers.c linux-2.6.15-rc5-mutex/lib/semaphore-sleepers.c
--- /warthog/kernels/linux-2.6.15-rc5/lib/semaphore-sleepers.c	2005-11-01 13:19:22.000000000 +0000
+++ linux-2.6.15-rc5-mutex/lib/semaphore-sleepers.c	2005-12-12 17:58:35.000000000 +0000
@@ -49,12 +49,12 @@
  *    we cannot lose wakeup events.
  */
 
-fastcall void __up(struct semaphore *sem)
+fastcall void __up_sem(struct semaphore *sem)
 {
 	wake_up(&sem->wait);
 }
 
-fastcall void __sched __down(struct semaphore * sem)
+fastcall void __sched __down_sem(struct semaphore * sem)
 {
 	struct task_struct *tsk = current;
 	DECLARE_WAITQUEUE(wait, tsk);
@@ -91,7 +91,7 @@ fastcall void __sched __down(struct sema
 	tsk->state = TASK_RUNNING;
 }
 
-fastcall int __sched __down_interruptible(struct semaphore * sem)
+fastcall int __sched __down_sem_interruptible(struct semaphore * sem)
 {
 	int retval = 0;
 	struct task_struct *tsk = current;
@@ -154,7 +154,7 @@ fastcall int __sched __down_interruptibl
  * single "cmpxchg" without failure cases,
  * but then it wouldn't work on a 386.
  */
-fastcall int __down_trylock(struct semaphore * sem)
+fastcall int __down_sem_trylock(struct semaphore * sem)
 {
 	int sleepers;
 	unsigned long flags;

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
@ 2005-12-13  0:13 ` Nick Piggin
  2005-12-13  0:19 ` Nick Piggin
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-13  0:13 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:
> The attached patch introduces a simple mutex implementation as an alternative
> to the usual semaphore implementation where simple mutex functionality is all
> that is required.
> 
> This is useful in two ways:
> 
>  (1) A number of archs only provide very simple atomic instructions (such as
>      XCHG on i386, TAS on M68K, SWAP on FRV) which aren't sufficient to
>      implement full semaphore support directly. Instead spinlocks must be
>      employed to implement fuller functionality.
> 

We have atomic_cmpxchg. Can you use that for a sufficient generic
implementation?
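
For illustration only (not part of the original message): a minimal
userspace sketch of the difference between an xchg-style and a
cmpxchg-style trylock. It assumes C11 <stdatomic.h> rather than the
kernel's xchg()/atomic_cmpxchg(), and struct sketch_mutex and the
sketch_* helpers are hypothetical names that model only the state word
(0 = unlocked, 1 = locked), not the wait list or the sleeping slow path.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's struct mutex: state word only. */
struct sketch_mutex {
	atomic_int state;	/* 0 = unlocked, 1 = locked */
};

/*
 * cmpxchg flavour: writes 1 only if the current value is 0.
 * Returns true if we took the mutex, false if it was already held.
 */
static inline bool sketch_mutex_trylock_cmpxchg(struct sketch_mutex *m)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&m->state, &expected, 1);
}

/*
 * xchg flavour, as in the patch's mutex_grab(): always writes 1 and
 * succeeds if the previous value was 0.
 */
static inline bool sketch_mutex_trylock_xchg(struct sketch_mutex *m)
{
	return atomic_exchange(&m->state, 1) == 0;
}

/* Release with no waiters: just clear the state word. */
static inline void sketch_mutex_release(struct sketch_mutex *m)
{
	atomic_store(&m->state, 0);
}
```

Both variants give the same answer for a two-state lock; the sketch says
nothing about which is cheaper on a given architecture.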

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
  2005-12-13  0:13 ` Nick Piggin
@ 2005-12-13  0:19 ` Nick Piggin
  2005-12-13  0:19 ` Andrew Morton
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-13  0:19 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:

> +	/* set up my own style of waitqueue */
> +	waiter.task = tsk;
> +

Any reason why you're setting up your own style of waitqueue in
mutex-simple.c instead of just using the kernel's style of waitqueue?

> +
> +/*
> + * release a single token back to a mutex
> + * - entered with lock held and interrupts disabled
> + * - the queue will not be empty
> + */
> +void __up(struct mutex *mutex)
> +{
> +	struct mutex_waiter *waiter;
> +	struct task_struct *tsk;
> +
> +	/* grant the token to the process at the front of the queue */
> +	waiter = list_entry(mutex->wait_list.next, struct mutex_waiter, list);
> +
> +	/* we must be careful not to touch 'waiter' after we set ->task = NULL.
> +	 * - it is allocated on the waiter's stack and may become invalid at
> +	 *   any time after that point (due to a wakeup from another source).
> +	 */
> +	list_del_init(&waiter->list);
> +	tsk = waiter->task;
> +#ifdef CONFIG_DEBUG_MUTEX_OWNER
> +	mutex->__owner = tsk;
> +#endif
> +	mb();

This should be smp_mb(), I think.

> +	waiter->task = NULL;
> +	wake_up_process(tsk);
> +	put_task_struct(tsk);
> +}

Thanks,
Nick

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
  2005-12-13  0:13 ` Nick Piggin
  2005-12-13  0:19 ` Nick Piggin
@ 2005-12-13  0:19 ` Andrew Morton
  2005-12-13  7:54   ` Ingo Molnar
  2005-12-13  0:30 ` Arnd Bergmann
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 239+ messages in thread
From: Andrew Morton @ 2005-12-13  0:19 UTC (permalink / raw)
  To: David Howells; +Cc: torvalds, hch, arjan, matthew, linux-kernel, linux-arch

David Howells <dhowells@redhat.com> wrote:
>
> The attached patch introduces a simple mutex implementation as an alternative
> to the usual semaphore implementation where simple mutex functionality is all
> that is required.
> 
> This is useful in two ways:
> 
>  (1) A number of archs only provide very simple atomic instructions (such as
>      XCHG on i386, TAS on M68K, SWAP on FRV) which aren't sufficient to
>      implement full semaphore support directly. Instead spinlocks must be
>      employed to implement fuller functionality.
> 
>  (2) This makes it obvious in what way the semaphore is being used: whether
>      it's being used as a mutex or being used as a counter.
> 
> This patch set does the following:
> 
>  (1) Provides a simple xchg() based semaphore as a default for all
>      architectures that don't wish to override it and provide their own.
> 
>      Overriding is possible by setting CONFIG_ARCH_IMPLEMENTS_MUTEX and
>      supplying asm/mutex.h
> 
>      Partial overriding is possible by #defining mutex_grab(), mutex_release()
>      and is_mutex_locked() to perform the appropriate optimised functions.
> 
>  (2) Provides linux/mutex.h as a common include for gaining access to mutex
>      semaphores.
> 
>  (3) Provides linux/semaphore.h as a common include for gaining access to all
>      the different types of semaphore that may be used from within the kernel.
> 
>  (4) Renames down*() to down_sem*() and up() to up_sem() for the traditional
>      semaphores, and removes init_MUTEX*() and DECLARE_MUTEX*() from
>      asm/semaphore.h
> 
>  (5) Redirects the following to apply to the new mutexes rather than the
>      traditional semaphores:
> 
> 	down()
> 	down_trylock()
> 	down_interruptible()
> 	up()
> 	init_MUTEX()
>      	init_MUTEX_LOCKED()
> 	DECLARE_MUTEX()
> 	DECLARE_MUTEX_LOCKED()
> 
>      On the basis that most usages of semaphores are as mutexes, this makes
>      sense for in most cases it's just then a matter of changing the type from
>      struct semaphore to struct mutex. In some cases, sema_init() has to be
>      changed to init_MUTEX*() also.
> 
>  (6) Generally include linux/semaphore.h in place of asm/semaphore.h.
> 
>  (7) Provides a debugging config option CONFIG_DEBUG_MUTEX_OWNER by which the
>      mutex owner can be tracked and by which over-upping can be detected.

Maybe I'm not understanding all this, but...

I'd have thought that the way to do this is to simply reimplement down(),
up(), down_trylock(), etc using the new xchg-based code and to then hunt
down those few parts of the kernel which actually use the old semaphore's
counting feature and convert them to use down_sem(), up_sem(), etc.  And
rename all the old semaphore code: s/down/down_sem/etc.

So after such a transformation, this new "mutex" thingy would not exist.

>  include/linux/mutex.h        |   32 +++++++

But it does.

> +#define mutex_grab(mutex)	(xchg(&(mutex)->state, 1) == 0)

mutex_trylock(), please.

> +#define is_mutex_locked(mutex)	((mutex)->state)

Let's keep the namespace consistent.  mutex_is_locked().

> +static inline void down(struct mutex *mutex)
> +{
> +	if (mutex_grab(mutex)) {

likely()

> +#ifdef CONFIG_DEBUG_MUTEX_OWNER
> +		mutex->__owner = current;
> +#endif
> +	}
> +	else {
> +		__down(mutex);
> +	}
> +}
> +
> +/*
> + * sleep interruptibly until we get the mutex
> + * - return 0 if successful, -EINTR if interrupted
> + */
> +static inline int down_interruptible(struct mutex *mutex)
> +{
> +	if (mutex_grab(mutex)) {

likely()

> +static inline int down_trylock(struct mutex *mutex)
> +{
> +	if (mutex_grab(mutex)) {

etc.

You could just put likely() into mutex_trylock().  err, mutex_grab().
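
For illustration only (not part of the original message): a rough
userspace sketch of folding the branch hint into the trylock primitive
so that callers do not each need their own likely(). It assumes
GCC/Clang's __builtin_expect and C11 <stdatomic.h> in place of the
kernel's likely() and xchg(); every sketch_* name is hypothetical.

```c
#include <stdatomic.h>

/* Same shape as the kernel's likely(): a GCC/Clang branch hint. */
#define sketch_likely(x)	__builtin_expect(!!(x), 1)

/* Hypothetical stand-in for struct mutex: state word only. */
struct sketch_mutex {
	atomic_int state;	/* 0 = unlocked, 1 = locked */
};

/* The hint lives in the primitive, so every caller gets it for free. */
#define sketch_mutex_trylock(m) \
	sketch_likely(atomic_exchange(&(m)->state, 1) == 0)

/* Consistent mutex_ prefix, as suggested for is_mutex_locked(). */
#define sketch_mutex_is_locked(m)	(atomic_load(&(m)->state) != 0)

/* Mirrors down_trylock()'s convention: 0 on success, 1 on failure. */
static inline int sketch_down_trylock(struct sketch_mutex *m)
{
	if (sketch_mutex_trylock(m))
		return 0;
	return 1;
}
```

The hint only affects code layout; the return values are the same with
or without it.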

> +/*
> + * release the mutex
> + */
> +static inline void up(struct mutex *mutex)
> +{
> +	unsigned long flags;
> +
> +#ifdef CONFIG_DEBUG_MUTEX_OWNER
> +	if (mutex->__owner != current)
> +		__up_bad(mutex);
> +	mutex->__owner = NULL;
> +#endif
> +
> +	/* must prevent a race */
> +	spin_lock_irqsave(&mutex->wait_lock, flags);
> +	if (!list_empty(&mutex->wait_list))
> +		__up(mutex);
> +	else
> +		mutex_release(mutex);
> +	spin_unlock_irqrestore(&mutex->wait_lock, flags);
> +}

This is too large to inline.

It's also significantly slower than the existing up()?


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  0:19 ` Andrew Morton
@ 2005-12-13  7:54   ` Ingo Molnar
  2005-12-13  7:58     ` Andi Kleen
                       ` (3 more replies)
  0 siblings, 4 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  7:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Howells, torvalds, hch, arjan, matthew, linux-kernel, linux-arch


* Andrew Morton <akpm@osdl.org> wrote:

> I'd have thought that the way to do this is to simply reimplement 
> down(), up(), down_trylock(), etc using the new xchg-based code and to 
> then hunt down those few parts of the kernel which actually use the 
> old semaphore's counting feature and convert them to use down_sem(), 
> up_sem(), etc.  And rename all the old semaphore code: 
> s/down/down_sem/etc.

even better than that, why not use the solution that we've implemented 
for the -rt patchset, more than a year ago?

the solution i took was this:

- i did not touch the 'struct semaphore' namespace, but introduced a
  'struct compat_semaphore'.

- i introduced a 'type-sensitive' macro wrapper that switches down() 
  (and the other APIs) either to the assembly variant (if the 
  variable's type is struct compat_semaphore) or to the new generic 
  mutex (if the type is struct semaphore), at build-time. There is no 
  runtime overhead due to this build-time-switching.

- for many months we worked with upstream maintainers to convert dozens
  of mutex users over to struct completion, where this was appropriate.

all this simplified the 'compatibility conversion' to the patch below.  
No other non-generic changes are needed.
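
For illustration only (not part of the original message, and not the
-rt patchset's actual macros): a minimal sketch of build-time dispatch
on the argument's type. It assumes C11 _Generic, which is a substitute
technique chosen here for brevity; the struct bodies and the sketch_*
functions are hypothetical stand-ins that only record which variant was
selected and do not block or sleep.

```c
/* Hypothetical stand-ins for the two semaphore flavours. */
struct semaphore {
	int locked;		/* new generic mutex: 0 or 1 */
};

struct compat_semaphore {
	int count;		/* old counting semaphore */
};

/* Stand-in for the generic mutex path. */
static inline void sketch_mutex_down(struct semaphore *sem)
{
	sem->locked = 1;
}

/* Stand-in for the old (assembly) counting path. */
static inline void sketch_compat_down(struct compat_semaphore *sem)
{
	sem->count--;
}

/*
 * Type-sensitive wrapper: the compiler picks the callee from the
 * pointer type of 'sem', so no runtime check is emitted.
 */
#define sketch_down(sem) _Generic((sem),			\
	struct semaphore *:		sketch_mutex_down,	\
	struct compat_semaphore *:	sketch_compat_down)(sem)
```

With a wrapper of this shape, changing a variable's declared type from
struct semaphore to struct compat_semaphore is enough to move all of its
call sites to the other implementation.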

	Ingo

----
convert the remaining users of 'full Linux semaphore semantics' over to 
compat_semaphore.

 drivers/acpi/osl.c                        |   12 ++++++------
 drivers/ieee1394/ieee1394_types.h         |    2 +-
 drivers/ieee1394/nodemgr.c                |    2 +-
 drivers/ieee1394/raw1394-private.h        |    2 +-
 drivers/media/dvb/dvb-core/dvb_frontend.c |    2 +-
 drivers/media/dvb/dvb-core/dvb_frontend.h |    2 +-
 drivers/net/3c527.c                       |    2 +-
 drivers/net/hamradio/6pack.c              |    2 +-
 drivers/net/hamradio/mkiss.c              |    2 +-
 drivers/net/plip.c                        |    5 ++++-
 drivers/net/ppp_async.c                   |    2 +-
 drivers/net/ppp_synctty.c                 |    2 +-
 drivers/pci/hotplug/cpci_hotplug_core.c   |    4 ++--
 drivers/pci/hotplug/cpqphp_ctrl.c         |    4 ++--
 drivers/pci/hotplug/ibmphp_hpc.c          |    2 +-
 drivers/pci/hotplug/pciehp_ctrl.c         |    4 ++--
 drivers/pci/hotplug/shpchp_ctrl.c         |    4 ++--
 drivers/scsi/aacraid/aacraid.h            |    4 ++--
 drivers/scsi/aic7xxx/aic79xx_osm.h        |    2 +-
 drivers/scsi/aic7xxx/aic7xxx_osm.h        |    2 +-
 drivers/scsi/qla2xxx/qla_def.h            |    2 +-
 drivers/usb/storage/usb.h                 |    2 +-
 fs/xfs/linux-2.6/mutex.h                  |    2 +-
 fs/xfs/linux-2.6/sema.h                   |    2 +-
 fs/xfs/linux-2.6/xfs_buf.h                |    4 ++--
 include/linux/jffs2_fs_i.h                |   10 +++++++++-
 include/linux/jffs2_fs_sb.h               |    6 +++---
 include/linux/parport.h                   |    2 +-
 include/pcmcia/ss.h                       |    2 +-
 include/scsi/scsi_transport_spi.h         |    2 +-
 30 files changed, 54 insertions(+), 43 deletions(-)

Index: linux/drivers/acpi/osl.c
===================================================================
--- linux.orig/drivers/acpi/osl.c
+++ linux/drivers/acpi/osl.c
@@ -728,14 +728,14 @@ void acpi_os_delete_lock(acpi_handle han
 acpi_status
 acpi_os_create_semaphore(u32 max_units, u32 initial_units, acpi_handle * handle)
 {
-	struct semaphore *sem = NULL;
+	struct compat_semaphore *sem = NULL;
 
 	ACPI_FUNCTION_TRACE("os_create_semaphore");
 
-	sem = acpi_os_allocate(sizeof(struct semaphore));
+	sem = acpi_os_allocate(sizeof(struct compat_semaphore));
 	if (!sem)
 		return_ACPI_STATUS(AE_NO_MEMORY);
-	memset(sem, 0, sizeof(struct semaphore));
+	memset(sem, 0, sizeof(struct compat_semaphore));
 
 	sema_init(sem, initial_units);
 
@@ -758,7 +758,7 @@ EXPORT_SYMBOL(acpi_os_create_semaphore);
 
 acpi_status acpi_os_delete_semaphore(acpi_handle handle)
 {
-	struct semaphore *sem = (struct semaphore *)handle;
+	struct compat_semaphore *sem = (struct compat_semaphore *)handle;
 
 	ACPI_FUNCTION_TRACE("os_delete_semaphore");
 
@@ -787,7 +787,7 @@ EXPORT_SYMBOL(acpi_os_delete_semaphore);
 acpi_status acpi_os_wait_semaphore(acpi_handle handle, u32 units, u16 timeout)
 {
 	acpi_status status = AE_OK;
-	struct semaphore *sem = (struct semaphore *)handle;
+	struct compat_semaphore *sem = (struct compat_semaphore *)handle;
 	int ret = 0;
 
 	ACPI_FUNCTION_TRACE("os_wait_semaphore");
@@ -868,7 +868,7 @@ EXPORT_SYMBOL(acpi_os_wait_semaphore);
  */
 acpi_status acpi_os_signal_semaphore(acpi_handle handle, u32 units)
 {
-	struct semaphore *sem = (struct semaphore *)handle;
+	struct compat_semaphore *sem = (struct compat_semaphore *)handle;
 
 	ACPI_FUNCTION_TRACE("os_signal_semaphore");
 
Index: linux/drivers/ieee1394/ieee1394_types.h
===================================================================
--- linux.orig/drivers/ieee1394/ieee1394_types.h
+++ linux/drivers/ieee1394/ieee1394_types.h
@@ -19,7 +19,7 @@ struct hpsb_tlabel_pool {
 	spinlock_t lock;
 	u8 next;
 	u32 allocations;
-	struct semaphore count;
+	struct compat_semaphore count;
 };
 
 #define HPSB_TPOOL_INIT(_tp)			\
Index: linux/drivers/ieee1394/nodemgr.c
===================================================================
--- linux.orig/drivers/ieee1394/nodemgr.c
+++ linux/drivers/ieee1394/nodemgr.c
@@ -114,7 +114,7 @@ struct host_info {
 	struct hpsb_host *host;
 	struct list_head list;
 	struct completion exited;
-	struct semaphore reset_sem;
+	struct compat_semaphore reset_sem;
 	int pid;
 	char daemon_name[15];
 	int kill_me;
Index: linux/drivers/ieee1394/raw1394-private.h
===================================================================
--- linux.orig/drivers/ieee1394/raw1394-private.h
+++ linux/drivers/ieee1394/raw1394-private.h
@@ -29,7 +29,7 @@ struct file_info {
 
         struct list_head req_pending;
         struct list_head req_complete;
-        struct semaphore complete_sem;
+        struct compat_semaphore complete_sem;
         spinlock_t reqlists_lock;
         wait_queue_head_t poll_wait_complete;
 
Index: linux/drivers/media/dvb/dvb-core/dvb_frontend.c
===================================================================
--- linux.orig/drivers/media/dvb/dvb-core/dvb_frontend.c
+++ linux/drivers/media/dvb/dvb-core/dvb_frontend.c
@@ -95,7 +95,7 @@ struct dvb_frontend_private {
 	struct dvb_device *dvbdev;
 	struct dvb_frontend_parameters parameters;
 	struct dvb_fe_events events;
-	struct semaphore sem;
+	struct compat_semaphore sem;
 	struct list_head list_head;
 	wait_queue_head_t wait_queue;
 	pid_t thread_pid;
Index: linux/drivers/media/dvb/dvb-core/dvb_frontend.h
===================================================================
--- linux.orig/drivers/media/dvb/dvb-core/dvb_frontend.h
+++ linux/drivers/media/dvb/dvb-core/dvb_frontend.h
@@ -86,7 +86,7 @@ struct dvb_fe_events {
 	int			  eventr;
 	int			  overflow;
 	wait_queue_head_t	  wait_queue;
-	struct semaphore	  sem;
+	struct compat_semaphore	  sem;
 };
 
 struct dvb_frontend {
Index: linux/drivers/net/3c527.c
===================================================================
--- linux.orig/drivers/net/3c527.c
+++ linux/drivers/net/3c527.c
@@ -182,7 +182,7 @@ struct mc32_local 
 
 	u16 rx_ring_tail;       /* index to rx de-queue end */ 
 
-	struct semaphore cmd_mutex;    /* Serialises issuing of execute commands */
+	struct compat_semaphore cmd_mutex;    /* Serialises issuing of execute commands */
         struct completion execution_cmd; /* Card has completed an execute command */
 	struct completion xceiver_cmd;   /* Card has completed a tx or rx command */
 };
Index: linux/drivers/net/hamradio/6pack.c
===================================================================
--- linux.orig/drivers/net/hamradio/6pack.c
+++ linux/drivers/net/hamradio/6pack.c
@@ -124,7 +124,7 @@ struct sixpack {
 	struct timer_list	tx_t;
 	struct timer_list	resync_t;
 	atomic_t		refcnt;
-	struct semaphore	dead_sem;
+	struct compat_semaphore	dead_sem;
 	spinlock_t		lock;
 };
 
Index: linux/drivers/net/hamradio/mkiss.c
===================================================================
--- linux.orig/drivers/net/hamradio/mkiss.c
+++ linux/drivers/net/hamradio/mkiss.c
@@ -85,7 +85,7 @@ struct mkiss {
 #define CRC_MODE_SMACK_TEST	4
 
 	atomic_t		refcnt;
-	struct semaphore	dead_sem;
+	struct compat_semaphore	dead_sem;
 };
 
 /*---------------------------------------------------------------------------*/
Index: linux/drivers/net/plip.c
===================================================================
--- linux.orig/drivers/net/plip.c
+++ linux/drivers/net/plip.c
@@ -229,7 +229,10 @@ struct net_local {
 	                              struct hh_cache *hh);
 	spinlock_t lock;
 	atomic_t kill_timer;
-	struct semaphore killed_timer_sem;
+	/*
+	 * PREEMPT_RT: this isn't a mutex, it should be struct completion.
+	 */
+	struct compat_semaphore killed_timer_sem;
 };
 
 static inline void enable_parport_interrupts (struct net_device *dev)
Index: linux/drivers/net/ppp_async.c
===================================================================
--- linux.orig/drivers/net/ppp_async.c
+++ linux/drivers/net/ppp_async.c
@@ -66,7 +66,7 @@ struct asyncppp {
 	struct tasklet_struct tsk;
 
 	atomic_t	refcnt;
-	struct semaphore dead_sem;
+	struct compat_semaphore dead_sem;
 	struct ppp_channel chan;	/* interface to generic ppp layer */
 	unsigned char	obuf[OBUFSIZE];
 };
Index: linux/drivers/net/ppp_synctty.c
===================================================================
--- linux.orig/drivers/net/ppp_synctty.c
+++ linux/drivers/net/ppp_synctty.c
@@ -70,7 +70,7 @@ struct syncppp {
 	struct tasklet_struct tsk;
 
 	atomic_t	refcnt;
-	struct semaphore dead_sem;
+	struct compat_semaphore dead_sem;
 	struct ppp_channel chan;	/* interface to generic ppp layer */
 };
 
Index: linux/drivers/pci/hotplug/cpci_hotplug_core.c
===================================================================
--- linux.orig/drivers/pci/hotplug/cpci_hotplug_core.c
+++ linux/drivers/pci/hotplug/cpci_hotplug_core.c
@@ -60,8 +60,8 @@ static int slots;
 static atomic_t extracting;
 int cpci_debug;
 static struct cpci_hp_controller *controller;
-static struct semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
-static struct semaphore thread_exit;		/* guard ensure thread has exited before calling it quits */
+static struct compat_semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
+static struct compat_semaphore thread_exit;		/* guard ensure thread has exited before calling it quits */
 static int thread_finished = 1;
 
 static int enable_slot(struct hotplug_slot *slot);
Index: linux/drivers/pci/hotplug/cpqphp_ctrl.c
===================================================================
--- linux.orig/drivers/pci/hotplug/cpqphp_ctrl.c
+++ linux/drivers/pci/hotplug/cpqphp_ctrl.c
@@ -45,8 +45,8 @@ static int configure_new_function(struct
 			u8 behind_bridge, struct resource_lists *resources);
 static void interrupt_event_handler(struct controller *ctrl);
 
-static struct semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
-static struct semaphore event_exit;		/* guard ensure thread has exited before calling it quits */
+static struct compat_semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
+static struct compat_semaphore event_exit;		/* guard ensure thread has exited before calling it quits */
 static int event_finished;
 static unsigned long pushbutton_pending;	/* = 0 */
 
Index: linux/drivers/pci/hotplug/ibmphp_hpc.c
===================================================================
--- linux.orig/drivers/pci/hotplug/ibmphp_hpc.c
+++ linux/drivers/pci/hotplug/ibmphp_hpc.c
@@ -104,7 +104,7 @@ static int tid_poll;
 static struct semaphore sem_hpcaccess;	// lock access to HPC
 static struct semaphore semOperations;	// lock all operations and
 					// access to data structures
-static struct semaphore sem_exit;	// make sure polling thread goes away
+static struct compat_semaphore sem_exit;	// make sure polling thread goes away
 //----------------------------------------------------------------------------
 // local function prototypes
 //----------------------------------------------------------------------------
Index: linux/drivers/pci/hotplug/pciehp_ctrl.c
===================================================================
--- linux.orig/drivers/pci/hotplug/pciehp_ctrl.c
+++ linux/drivers/pci/hotplug/pciehp_ctrl.c
@@ -37,8 +37,8 @@
 
 static void interrupt_event_handler(struct controller *ctrl);
 
-static struct semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
-static struct semaphore event_exit;		/* guard ensure thread has exited before calling it quits */
+static struct compat_semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
+static struct compat_semaphore event_exit;		/* guard ensure thread has exited before calling it quits */
 static int event_finished;
 static unsigned long pushbutton_pending;	/* = 0 */
 static unsigned long surprise_rm_pending;	/* = 0 */
Index: linux/drivers/pci/hotplug/shpchp_ctrl.c
===================================================================
--- linux.orig/drivers/pci/hotplug/shpchp_ctrl.c
+++ linux/drivers/pci/hotplug/shpchp_ctrl.c
@@ -37,8 +37,8 @@
 
 static void interrupt_event_handler(struct controller *ctrl);
 
-static struct semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
-static struct semaphore event_exit;		/* guard ensure thread has exited before calling it quits */
+static struct compat_semaphore event_semaphore;	/* mutex for process loop (up if something to process) */
+static struct compat_semaphore event_exit;		/* guard ensure thread has exited before calling it quits */
 static int event_finished;
 static unsigned long pushbutton_pending;	/* = 0 */
 
Index: linux/drivers/scsi/aacraid/aacraid.h
===================================================================
--- linux.orig/drivers/scsi/aacraid/aacraid.h
+++ linux/drivers/scsi/aacraid/aacraid.h
@@ -735,7 +735,7 @@ struct aac_fib_context {
 	u32			unique;		// unique value representing this context
 	ulong			jiffies;	// used for cleanup - dmb changed to ulong
 	struct list_head	next;		// used to link context's into a linked list
-	struct semaphore 	wait_sem;	// this is used to wait for the next fib to arrive.
+	struct compat_semaphore	wait_sem;	// this is used to wait for the next fib to arrive.
 	int			wait;		// Set to true when thread is in WaitForSingleObject
 	unsigned long		count;		// total number of FIBs on FibList
 	struct list_head	fib_list;	// this holds fibs and their attachd hw_fibs
@@ -804,7 +804,7 @@ struct fib {
 	 *	This is the event the sendfib routine will wait on if the
 	 *	caller did not pass one and this is synch io.
 	 */
-	struct semaphore 	event_wait;
+	struct compat_semaphore	event_wait;
 	spinlock_t		event_lock;
 
 	u32			done;	/* gets set to 1 when fib is complete */
Index: linux/drivers/scsi/aic7xxx/aic79xx_osm.h
===================================================================
--- linux.orig/drivers/scsi/aic7xxx/aic79xx_osm.h
+++ linux/drivers/scsi/aic7xxx/aic79xx_osm.h
@@ -390,7 +390,7 @@ struct ahd_platform_data {
 	spinlock_t		 spin_lock;
 	u_int			 qfrozen;
 	struct timer_list	 reset_timer;
-	struct semaphore	 eh_sem;
+	struct compat_semaphore	 eh_sem;
 	struct Scsi_Host        *host;		/* pointer to scsi host */
 #define AHD_LINUX_NOIRQ	((uint32_t)~0)
 	uint32_t		 irq;		/* IRQ for this adapter */
Index: linux/drivers/scsi/aic7xxx/aic7xxx_osm.h
===================================================================
--- linux.orig/drivers/scsi/aic7xxx/aic7xxx_osm.h
+++ linux/drivers/scsi/aic7xxx/aic7xxx_osm.h
@@ -394,7 +394,7 @@ struct ahc_platform_data {
 	spinlock_t		 spin_lock;
 	u_int			 qfrozen;
 	struct timer_list	 reset_timer;
-	struct semaphore	 eh_sem;
+	struct compat_semaphore	 eh_sem;
 	struct Scsi_Host        *host;		/* pointer to scsi host */
 #define AHC_LINUX_NOIRQ	((uint32_t)~0)
 	uint32_t		 irq;		/* IRQ for this adapter */
Index: linux/drivers/scsi/qla2xxx/qla_def.h
===================================================================
--- linux.orig/drivers/scsi/qla2xxx/qla_def.h
+++ linux/drivers/scsi/qla2xxx/qla_def.h
@@ -2411,7 +2411,7 @@ typedef struct scsi_qla_host {
 	spinlock_t	mbx_reg_lock;   /* Mbx Cmd Register Lock */
 
 	struct semaphore mbx_cmd_sem;	/* Serialialize mbx access */
-	struct semaphore mbx_intr_sem;  /* Used for completion notification */
+	struct compat_semaphore mbx_intr_sem;  /* Used for completion notification */
 
 	uint32_t	mbx_flags;
 #define  MBX_IN_PROGRESS	BIT_0
Index: linux/drivers/usb/storage/usb.h
===================================================================
--- linux.orig/drivers/usb/storage/usb.h
+++ linux/drivers/usb/storage/usb.h
@@ -171,7 +171,7 @@ struct us_data {
 	dma_addr_t		iobuf_dma;
 
 	/* mutual exclusion and synchronization structures */
-	struct semaphore	sema;		 /* to sleep thread on	    */
+	struct compat_semaphore	sema;		 /* to sleep thread on	    */
 	struct completion	notify;		 /* thread begin/end	    */
 	wait_queue_head_t	delay_wait;	 /* wait during scan, reset */
 
Index: linux/fs/xfs/linux-2.6/mutex.h
===================================================================
--- linux.orig/fs/xfs/linux-2.6/mutex.h
+++ linux/fs/xfs/linux-2.6/mutex.h
@@ -28,7 +28,7 @@
  * callers.
  */
 #define MUTEX_DEFAULT		0x0
-typedef struct semaphore	mutex_t;
+typedef struct compat_semaphore	mutex_t;
 
 #define mutex_init(lock, type, name)		sema_init(lock, 1)
 #define mutex_destroy(lock)			sema_init(lock, -99)
Index: linux/fs/xfs/linux-2.6/sema.h
===================================================================
--- linux.orig/fs/xfs/linux-2.6/sema.h
+++ linux/fs/xfs/linux-2.6/sema.h
@@ -27,7 +27,7 @@
  * sema_t structure just maps to struct semaphore in Linux kernel.
  */
 
-typedef struct semaphore sema_t;
+typedef struct compat_semaphore sema_t;
 
 #define init_sema(sp, val, c, d)	sema_init(sp, val)
 #define initsema(sp, val)		sema_init(sp, val)
Index: linux/fs/xfs/linux-2.6/xfs_buf.h
===================================================================
--- linux.orig/fs/xfs/linux-2.6/xfs_buf.h
+++ linux/fs/xfs/linux-2.6/xfs_buf.h
@@ -114,7 +114,7 @@ typedef int (*page_buf_bdstrat_t)(struct
 #define PB_PAGES	2
 
 typedef struct xfs_buf {
-	struct semaphore	pb_sema;	/* semaphore for lockables  */
+	struct compat_semaphore	pb_sema;	/* semaphore for lockables  */
 	unsigned long		pb_queuetime;	/* time buffer was queued   */
 	atomic_t		pb_pin_count;	/* pin count		    */
 	wait_queue_head_t	pb_waiters;	/* unpin waiters	    */
@@ -134,7 +134,7 @@ typedef struct xfs_buf {
 	page_buf_iodone_t	pb_iodone;	/* I/O completion function */
 	page_buf_relse_t	pb_relse;	/* releasing function */
 	page_buf_bdstrat_t	pb_strat;	/* pre-write function */
-	struct semaphore	pb_iodonesema;	/* Semaphore for I/O waiters */
+	struct compat_semaphore	pb_iodonesema;	/* Semaphore for I/O waiters */
 	void			*pb_fspriv;
 	void			*pb_fspriv2;
 	void			*pb_fspriv3;
Index: linux/include/linux/jffs2_fs_i.h
===================================================================
--- linux.orig/include/linux/jffs2_fs_i.h
+++ linux/include/linux/jffs2_fs_i.h
@@ -14,7 +14,15 @@ struct jffs2_inode_info {
 	   before letting GC proceed. Or we'd have to put ugliness
 	   into the GC code so it didn't attempt to obtain the i_sem
 	   for the inode(s) which are already locked */
-	struct semaphore sem;
+	/*
+	 * (On PREEMPT_RT: while use of ei->sem is mostly mutex-alike, the
+	 * SLAB cache keeps the semaphore locked, which breaks the strict
+	 * "owner must exist" properties of rt_mutexes. Fix it the easy
+	 * way: by going to a compat_semaphore. But the real fix would be
+	 * to cache inodes in an unlocked state and lock them when
+	 * allocating a new inode.)
+	 */
+	struct compat_semaphore sem;
 
 	/* The highest (datanode) version number used for this ino */
 	uint32_t highest_version;
Index: linux/include/linux/jffs2_fs_sb.h
===================================================================
--- linux.orig/include/linux/jffs2_fs_sb.h
+++ linux/include/linux/jffs2_fs_sb.h
@@ -35,7 +35,7 @@ struct jffs2_sb_info {
 	struct completion gc_thread_start; /* GC thread start completion */
 	struct completion gc_thread_exit; /* GC thread exit completion port */
 
-	struct semaphore alloc_sem;	/* Used to protect all the following
+	struct compat_semaphore alloc_sem; /* Used to protect all the following
 					   fields, and also to protect against
 					   out-of-order writing of nodes. And GC. */
 	uint32_t cleanmarker_size;	/* Size of an _inline_ CLEANMARKER
@@ -93,7 +93,7 @@ struct jffs2_sb_info {
 	/* Sem to allow jffs2_garbage_collect_deletion_dirent to
 	   drop the erase_completion_lock while it's holding a pointer
 	   to an obsoleted node. I don't like this. Alternatives welcomed. */
-	struct semaphore erase_free_sem;
+	struct compat_semaphore erase_free_sem;
 
 	uint32_t wbuf_pagesize; /* 0 for NOR and other flashes with no wbuf */
 
@@ -104,7 +104,7 @@ struct jffs2_sb_info {
 	uint32_t wbuf_len;
 	struct jffs2_inodirty *wbuf_inodes;
 
-	struct rw_semaphore wbuf_sem;	/* Protects the write buffer */
+	struct compat_rw_semaphore wbuf_sem;	/* Protects the write buffer */
 
 	/* Information about out-of-band area usage... */
 	struct nand_oobinfo *oobinfo;
Index: linux/include/linux/parport.h
===================================================================
--- linux.orig/include/linux/parport.h
+++ linux/include/linux/parport.h
@@ -254,7 +254,7 @@ enum ieee1284_phase {
 struct ieee1284_info {
 	int mode;
 	volatile enum ieee1284_phase phase;
-	struct semaphore irq;
+	struct compat_semaphore irq;
 };
 
 /* A parallel port */
Index: linux/include/pcmcia/ss.h
===================================================================
--- linux.orig/include/pcmcia/ss.h
+++ linux/include/pcmcia/ss.h
@@ -243,7 +243,7 @@ struct pcmcia_socket {
 #endif
 
 	/* state thread */
-	struct semaphore		skt_sem;	/* protects socket h/w state */
+	struct compat_semaphore		skt_sem;	/* protects socket h/w state */
 
 	struct task_struct		*thread;
 	struct completion		thread_done;
Index: linux/include/scsi/scsi_transport_spi.h
===================================================================
--- linux.orig/include/scsi/scsi_transport_spi.h
+++ linux/include/scsi/scsi_transport_spi.h
@@ -51,7 +51,7 @@ struct spi_transport_attrs {
 	unsigned int support_qas; /* supports quick arbitration and selection */
 	/* Private Fields */
 	unsigned int dv_pending:1; /* Internal flag */
-	struct semaphore dv_sem; /* semaphore to serialise dv */
+	struct compat_semaphore dv_sem; /* semaphore to serialise dv */
 };
 
 enum spi_signal_type {

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  7:54   ` Ingo Molnar
@ 2005-12-13  7:58     ` Andi Kleen
  2005-12-13  8:42       ` Andrew Morton
  2005-12-13  8:00     ` Arjan van de Ven
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 239+ messages in thread
From: Andi Kleen @ 2005-12-13  7:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, David Howells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

> - i introduced a 'type-sensitive' macro wrapper that switches down()
>   (and the other APIs) either to the assembly variant (if the
>   variable's type is struct compat_semaphore), or to the new
>   generic mutex (if the type is struct semaphore), at build-time. There
>   is no runtime overhead due to this build-time-switching.

Didn't that drop compatibility with 2.95?  The necessary builtins
are only in 3.x. 
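
For illustration, a minimal sketch of how such a build-time switch can be
written with the GCC builtins __builtin_types_compatible_p and
__builtin_choose_expr. The struct layouts, the helper names compat_down()
and mutex_down(), and the returned tags are assumptions made up for this
sketch; they are not the code from the patch:

```c
#include <assert.h>

/* Hypothetical stand-ins for the two semaphore flavours discussed above. */
struct compat_semaphore { int count; };
struct semaphore        { int count; };

/* Each backend returns a tag so a caller can see which one was chosen. */
static int compat_down(struct compat_semaphore *sem) { sem->count--; return 1; }
static int mutex_down(struct semaphore *sem)         { sem->count--; return 2; }

/*
 * Pick the backend from the static type of *sem.  Both builtins are
 * resolved by the compiler, so no runtime branch is emitted.  The casts
 * keep the arm that is not selected type-correct.
 */
#define down(sem) \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(*(sem)), \
					     struct compat_semaphore), \
		compat_down((struct compat_semaphore *)(sem)), \
		mutex_down((struct semaphore *)(sem)))
```

With this, down(&x) on a struct compat_semaphore expands to the
compat_down() call and on a struct semaphore to the mutex_down() call,
with the choice made entirely at compile time.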

Not that I'm not in favour - I would like to use C99 everywhere
and it would get rid of the ugly spinlock workaround for i386,
and x86-64 doesn't support earlier compilers anyway -
but others might not agree.

-Andi

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  7:58     ` Andi Kleen
@ 2005-12-13  8:42       ` Andrew Morton
  2005-12-13  8:49         ` Andi Kleen
  2005-12-13  9:03         ` Christoph Hellwig
  0 siblings, 2 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-13  8:42 UTC (permalink / raw)
  To: Andi Kleen
  Cc: mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel, linux-arch

Andi Kleen <ak@suse.de> wrote:
>
> > - i introduced a 'type-sensitive' macro wrapper that switches down()
> >   (and the other APIs) either to the assembly variant (if the
> >   variable's type is struct compat_semaphore), or to the new
> >   generic mutex (if the type is struct semaphore), at build-time. There
> >   is no runtime overhead due to this build-time-switching.
> 
> Didn't that drop compatibility with 2.95?  The necessary builtins
> are only in 3.x. 
> 
> Not that I'm not in favour - I would like to use C99 everywhere
> and it would get rid of the ugly spinlock workaround for i386,
> and x86-64 doesn't support earlier compilers anyway -
> but others might not agree.
> 

2.95.x is basically buggered at present.  There's one scsi driver which
doesn't compile due to weird __VA_ARGS__ tricks and the rather useful
scsi/sd.c is currently getting an ICE.  None of the new SAS code compiles,
due to extensive use of anonymous unions.  The V4L guys are very good at
exploiting the gcc-2.95.x macro expansion bug (_why_ does each driver need
to implement its own debug macros?) and various people keep on sneaking in
anonymous unions.
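
For readers unfamiliar with the construct, a minimal sketch of an
anonymous union. It was standardised in C11 and is accepted as an
extension by newer GCC releases. The struct and field names below are
hypothetical and not taken from any of the drivers mentioned:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
struct demo_reg {
	int kind;
	union {			/* anonymous: the union has no member name */
		uint32_t word;
		uint8_t  bytes[4];
	};
};

static uint32_t demo_reg_word(void)
{
	struct demo_reg r = { .kind = 0 };

	r.word = 0x01020304u;	/* accessed as r.word, not r.u.word */
	return r.word;
}
```

The union members are reached directly through the enclosing struct,
which is what makes the construct convenient and also what a compiler
without support for it rejects.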

It's time to give up on it and just drink more coffee or play more tetris
or something, I'm afraid.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  8:42       ` Andrew Morton
@ 2005-12-13  8:49         ` Andi Kleen
  2005-12-13  9:01             ` Andrew Morton
                             ` (3 more replies)
  2005-12-13  9:03         ` Christoph Hellwig
  1 sibling, 4 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13  8:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

> It's time to give up on it and just drink more coffee or play more tetris
> or something, I'm afraid.

Or start using icecream (http://wiki.kde.org/icecream) 

Anyway, cool.  Congratulations. Can you please apply the following patch then?

Remove -Wdeclaration-after-statement

Now that gcc 2.95 is not supported anymore it's ok to use C99
style mixed declarations everywhere.

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/Makefile
===================================================================
--- linux/Makefile
+++ linux/Makefile
@@ -535,9 +535,6 @@ include $(srctree)/arch/$(ARCH)/Makefile
 NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
 CHECKFLAGS     += $(NOSTDINC_FLAGS)
 
-# warn about C99 declaration after statement
-CFLAGS += $(call cc-option,-Wdeclaration-after-statement,)
-
 # disable pointer signedness warnings in gcc 4.0
 CFLAGS += $(call cc-option,-Wno-pointer-sign,)
 

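As a small illustration of what the warning flags, a sketch of C99-style
mixed declarations and statements. The function name and its behaviour
are made up for this example:

```c
#include <assert.h>

/* Hypothetical helper: sums the first n elements of v. */
static int sum_first(const int *v, int n)
{
	if (n <= 0)
		return 0;

	int total = 0;			/* declared after a statement */

	for (int i = 0; i < n; i++)	/* loop-scoped declaration */
		total += v[i];
	return total;
}
```

With -Wdeclaration-after-statement enabled, GCC warns about the
declaration of total because it follows the if statement.
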
^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  8:49         ` Andi Kleen
@ 2005-12-13  9:01             ` Andrew Morton
  2005-12-13  9:04           ` Christoph Hellwig
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-13  9:01 UTC (permalink / raw)
  To: Andi Kleen
  Cc: ak, mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel,
	linux-arch

Andi Kleen <ak@suse.de> wrote:
>
> Can you please apply the following patch then? 
> 
>  Remove -Wdeclaration-after-statement

OK.

Thus far I have this:


From: Andrew Morton <akpm@osdl.org>

There's one scsi driver which doesn't compile due to weird __VA_ARGS__ tricks
and the rather useful scsi/sd.c is currently getting an ICE.  None of the new
SAS code compiles, due to extensive use of anonymous unions.  The V4L guys are
very good at exploiting the gcc-2.95.x macro expansion bug (_why_ does each
driver need to implement its own debug macros?) and various people keep on
sneaking in anonymous unions.

Plus anonymous unions are rather useful.

Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 dev/null                 |   29 -----------------------------
 include/linux/compiler.h |    2 --
 init/main.c              |    7 +------
 3 files changed, 1 insertion(+), 37 deletions(-)

diff -puN init/main.c~abandon-gcc-295x init/main.c
--- devel/init/main.c~abandon-gcc-295x	2005-12-13 00:48:17.000000000 -0800
+++ devel-akpm/init/main.c	2005-12-13 00:48:17.000000000 -0800
@@ -58,11 +58,6 @@
  * This is one of the first .c files built. Error out early
  * if we have compiler trouble..
  */
-#if __GNUC__ == 2 && __GNUC_MINOR__ == 96
-#ifdef CONFIG_FRAME_POINTER
-#error This compiler cannot compile correctly with frame pointers enabled
-#endif
-#endif
 
 #ifdef CONFIG_X86_LOCAL_APIC
 #include <asm/smp.h>
@@ -74,7 +69,7 @@
  * To avoid associated bogus bug reports, we flatly refuse to compile
  * with a gcc that is known to be too old from the very beginning.
  */
-#if __GNUC__ < 2 || (__GNUC__ == 2 && __GNUC_MINOR__ < 95)
+#if __GNUC__ < 3
 #error Sorry, your GCC is too old. It builds incorrect kernels.
 #endif
 
diff -L include/linux/compiler-gcc2.h -puN include/linux/compiler-gcc2.h~abandon-gcc-295x /dev/null
--- devel/include/linux/compiler-gcc2.h
+++ /dev/null	2003-09-15 06:40:47.000000000 -0700
@@ -1,29 +0,0 @@
-/* Never include this file directly.  Include <linux/compiler.h> instead.  */
-
-/* These definitions are for GCC v2.x.  */
-
-/* Somewhere in the middle of the GCC 2.96 development cycle, we implemented
-   a mechanism by which the user can annotate likely branch directions and
-   expect the blocks to be reordered appropriately.  Define __builtin_expect
-   to nothing for earlier compilers.  */
-#include <linux/compiler-gcc.h>
-
-#if __GNUC_MINOR__ < 96
-# define __builtin_expect(x, expected_value) (x)
-#endif
-
-#define __attribute_used__	__attribute__((__unused__))
-
-/*
- * The attribute `pure' is not implemented in GCC versions earlier
- * than 2.96.
- */
-#if __GNUC_MINOR__ >= 96
-# define __attribute_pure__	__attribute__((pure))
-# define __attribute_const__	__attribute__((__const__))
-#endif
-
-/* GCC 2.95.x/2.96 recognize __va_copy, but not va_copy. Actually later GCC's
- * define both va_copy and __va_copy, but the latter may go away, so limit this
- * to this header */
-#define va_copy			__va_copy
diff -puN include/linux/compiler.h~abandon-gcc-295x include/linux/compiler.h
--- devel/include/linux/compiler.h~abandon-gcc-295x	2005-12-13 00:48:17.000000000 -0800
+++ devel-akpm/include/linux/compiler.h	2005-12-13 00:48:17.000000000 -0800
@@ -42,8 +42,6 @@ extern void __chk_io_ptr(void __iomem *)
 # include <linux/compiler-gcc4.h>
 #elif __GNUC__ == 3
 # include <linux/compiler-gcc3.h>
-#elif __GNUC__ == 2
-# include <linux/compiler-gcc2.h>
 #else
 # error Sorry, your compiler is too old/not recognized.
 #endif
_


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:01             ` Andrew Morton
  (?)
@ 2005-12-13  9:02             ` Andrew Morton
  2005-12-13 10:07               ` Jakub Jelinek
  2005-12-14 10:46               ` Russell King
  -1 siblings, 2 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-13  9:02 UTC (permalink / raw)
  To: ak, mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel,
	linux-arch

Andrew Morton <akpm@osdl.org> wrote:
>
> Thus far I have this:
>

And this:


From: Andrew Morton <akpm@osdl.org>

Remove various things which were checking for gcc-1.x and gcc-2.x compilers.


Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 arch/arm/kernel/asm-offsets.c     |    7 +------
 arch/arm26/kernel/asm-offsets.c   |    7 -------
 arch/ia64/kernel/head.S           |    2 +-
 arch/ia64/kernel/ia64_ksyms.c     |    2 +-
 arch/ia64/oprofile/backtrace.c    |    2 +-
 drivers/md/raid0.c                |    6 ------
 fs/xfs/xfs_log.h                  |    8 +-------
 include/asm-um/rwsem.h            |    4 ----
 include/asm-v850/unistd.h         |   18 ------------------
 include/linux/seccomp.h           |    6 +-----
 include/linux/spinlock_types_up.h |   14 --------------
 11 files changed, 6 insertions(+), 70 deletions(-)

diff -puN drivers/md/raid0.c~remove-gcc2-checks drivers/md/raid0.c
--- devel/drivers/md/raid0.c~remove-gcc2-checks	2005-12-13 00:51:14.000000000 -0800
+++ devel-akpm/drivers/md/raid0.c	2005-12-13 00:51:35.000000000 -0800
@@ -307,9 +307,6 @@ static int raid0_run (mddev_t *mddev)
 	printk("raid0 : conf->hash_spacing is %llu blocks.\n",
 		(unsigned long long)conf->hash_spacing);
 	{
-#if __GNUC__ < 3
-		volatile
-#endif
 		sector_t s = mddev->array_size;
 		sector_t space = conf->hash_spacing;
 		int round;
@@ -440,9 +437,6 @@ static int raid0_make_request (request_q
  
 
 	{
-#if __GNUC__ < 3
-		volatile
-#endif
 		sector_t x = block >> conf->preshift;
 		sector_div(x, (u32)conf->hash_spacing);
 		zone = conf->hash_table[x];
diff -puN fs/xfs/xfs_log.h~remove-gcc2-checks fs/xfs/xfs_log.h
--- devel/fs/xfs/xfs_log.h~remove-gcc2-checks	2005-12-13 00:51:14.000000000 -0800
+++ devel-akpm/fs/xfs/xfs_log.h	2005-12-13 00:52:10.000000000 -0800
@@ -30,13 +30,7 @@
  * By comparing each compnent, we don't have to worry about extra
  * endian issues in treating two 32 bit numbers as one 64 bit number
  */
-static
-#if defined(__GNUC__) && (__GNUC__ == 2) && ( (__GNUC_MINOR__ == 95) || (__GNUC_MINOR__ == 96))
-__attribute__((unused))	/* gcc 2.95, 2.96 miscompile this when inlined */
-#else
-__inline__
-#endif
-xfs_lsn_t	_lsn_cmp(xfs_lsn_t lsn1, xfs_lsn_t lsn2)
+static inline xfs_lsn_t	_lsn_cmp(xfs_lsn_t lsn1, xfs_lsn_t lsn2)
 {
 	if (CYCLE_LSN(lsn1) != CYCLE_LSN(lsn2))
 		return (CYCLE_LSN(lsn1)<CYCLE_LSN(lsn2))? -999 : 999;
diff -puN arch/arm/kernel/asm-offsets.c~remove-gcc2-checks arch/arm/kernel/asm-offsets.c
--- devel/arch/arm/kernel/asm-offsets.c~remove-gcc2-checks	2005-12-13 00:51:14.000000000 -0800
+++ devel-akpm/arch/arm/kernel/asm-offsets.c	2005-12-13 00:53:27.000000000 -0800
@@ -23,18 +23,13 @@
 #error Sorry, your compiler targets APCS-26 but this kernel requires APCS-32
 #endif
 /*
- * GCC 2.95.1, 2.95.2: ignores register clobber list in asm().
  * GCC 3.0, 3.1: general bad code generation.
  * GCC 3.2.0: incorrect function argument offset calculation.
  * GCC 3.2.x: miscompiles NEW_AUX_ENT in fs/binfmt_elf.c
  *            (http://gcc.gnu.org/PR8896) and incorrect structure
  *	      initialisation in fs/jffs2/erase.c
  */
-#if __GNUC__ < 2 || \
-   (__GNUC__ == 2 && __GNUC_MINOR__ < 95) || \
-   (__GNUC__ == 2 && __GNUC_MINOR__ == 95 && __GNUC_PATCHLEVEL__ != 0 && \
-					     __GNUC_PATCHLEVEL__ < 3) || \
-   (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
+#if __GNUC__ < 2 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
 #error Your compiler is too buggy; it is known to miscompile kernels.
 #error    Known good compilers: 2.95.3, 2.95.4, 2.96, 3.3
 #endif
diff -puN arch/arm26/kernel/asm-offsets.c~remove-gcc2-checks arch/arm26/kernel/asm-offsets.c
--- devel/arch/arm26/kernel/asm-offsets.c~remove-gcc2-checks	2005-12-13 00:51:14.000000000 -0800
+++ devel-akpm/arch/arm26/kernel/asm-offsets.c	2005-12-13 00:53:47.000000000 -0800
@@ -25,13 +25,6 @@
 #if defined(__APCS_32__) && defined(CONFIG_CPU_26)
 #error Sorry, your compiler targets APCS-32 but this kernel requires APCS-26
 #endif
-#if __GNUC__ < 2 || (__GNUC__ == 2 && __GNUC_MINOR__ < 95)
-#error Sorry, your compiler is known to miscompile kernels.  Only use gcc 2.95.3 and later.
-#endif
-#if __GNUC__ == 2 && __GNUC_MINOR__ == 95
-/* shame we can't detect the .1 or .2 releases */
-#warning GCC 2.95.2 and earlier miscompiles kernels.
-#endif
 
 /* Use marker if you need to separate the values later */
 
diff -puN arch/ia64/kernel/ia64_ksyms.c~remove-gcc2-checks arch/ia64/kernel/ia64_ksyms.c
--- devel/arch/ia64/kernel/ia64_ksyms.c~remove-gcc2-checks	2005-12-13 00:51:14.000000000 -0800
+++ devel-akpm/arch/ia64/kernel/ia64_ksyms.c	2005-12-13 00:54:02.000000000 -0800
@@ -103,7 +103,7 @@ EXPORT_SYMBOL(unw_init_running);
 
 #ifdef ASM_SUPPORTED
 # ifdef CONFIG_SMP
-#  if __GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
+#  if (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
 /*
  * This is not a normal routine and we don't want a function descriptor for it, so we use
  * a fake declaration here.
diff -puN arch/ia64/kernel/head.S~remove-gcc2-checks arch/ia64/kernel/head.S
--- devel/arch/ia64/kernel/head.S~remove-gcc2-checks	2005-12-13 00:51:15.000000000 -0800
+++ devel-akpm/arch/ia64/kernel/head.S	2005-12-13 00:54:10.000000000 -0800
@@ -1060,7 +1060,7 @@ SET_REG(b5);
 	 * the clobber lists for spin_lock() in include/asm-ia64/spinlock.h.
 	 */
 
-#if __GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
+#if (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
 
 GLOBAL_ENTRY(ia64_spinlock_contention_pre3_4)
 	.prologue
diff -puN arch/ia64/oprofile/backtrace.c~remove-gcc2-checks arch/ia64/oprofile/backtrace.c
--- devel/arch/ia64/oprofile/backtrace.c~remove-gcc2-checks	2005-12-13 00:51:15.000000000 -0800
+++ devel-akpm/arch/ia64/oprofile/backtrace.c	2005-12-13 00:54:16.000000000 -0800
@@ -32,7 +32,7 @@ typedef struct
 	u64 *prev_pfs_loc;	/* state for WAR for old spinlock ool code */
 } ia64_backtrace_t;
 
-#if __GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
+#if (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
 /*
  * Returns non-zero if the PC is in the spinlock contention out-of-line code
  * with non-standard calling sequence (on older compilers).
diff -puN include/linux/spinlock_types_up.h~remove-gcc2-checks include/linux/spinlock_types_up.h
--- devel/include/linux/spinlock_types_up.h~remove-gcc2-checks	2005-12-13 00:51:15.000000000 -0800
+++ devel-akpm/include/linux/spinlock_types_up.h	2005-12-13 00:55:14.000000000 -0800
@@ -22,30 +22,16 @@ typedef struct {
 
 #else
 
-/*
- * All gcc 2.95 versions and early versions of 2.96 have a nasty bug
- * with empty initializers.
- */
-#if (__GNUC__ > 2)
 typedef struct { } raw_spinlock_t;
 
 #define __RAW_SPIN_LOCK_UNLOCKED { }
-#else
-typedef struct { int gcc_is_buggy; } raw_spinlock_t;
-#define __RAW_SPIN_LOCK_UNLOCKED (raw_spinlock_t) { 0 }
-#endif
 
 #endif
 
-#if (__GNUC__ > 2)
 typedef struct {
 	/* no debug version on UP */
 } raw_rwlock_t;
 
 #define __RAW_RW_LOCK_UNLOCKED { }
-#else
-typedef struct { int gcc_is_buggy; } raw_rwlock_t;
-#define __RAW_RW_LOCK_UNLOCKED (raw_rwlock_t) { 0 }
-#endif
 
 #endif /* __LINUX_SPINLOCK_TYPES_UP_H */
diff -puN include/linux/seccomp.h~remove-gcc2-checks include/linux/seccomp.h
--- devel/include/linux/seccomp.h~remove-gcc2-checks	2005-12-13 00:51:15.000000000 -0800
+++ devel-akpm/include/linux/seccomp.h	2005-12-13 00:55:25.000000000 -0800
@@ -26,11 +26,7 @@ static inline int has_secure_computing(s
 
 #else /* CONFIG_SECCOMP */
 
-#if (__GNUC__ > 2)
-  typedef struct { } seccomp_t;
-#else
-  typedef struct { int gcc_is_buggy; } seccomp_t;
-#endif
+typedef struct { } seccomp_t;
 
 #define secure_computing(x) do { } while (0)
 /* static inline to preserve typechecking */
diff -puN include/asm-um/rwsem.h~remove-gcc2-checks include/asm-um/rwsem.h
--- devel/include/asm-um/rwsem.h~remove-gcc2-checks	2005-12-13 00:51:15.000000000 -0800
+++ devel-akpm/include/asm-um/rwsem.h	2005-12-13 00:55:41.000000000 -0800
@@ -1,10 +1,6 @@
 #ifndef __UM_RWSEM_H__
 #define __UM_RWSEM_H__
 
-#if __GNUC__ < 2 || (__GNUC__ == 2 && __GNUC_MINOR__ < 96)
-#define __builtin_expect(exp,c) (exp)
-#endif
-
 #include "asm/arch/rwsem.h"
 
 #endif
diff -puN include/asm-v850/unistd.h~remove-gcc2-checks include/asm-v850/unistd.h
--- devel/include/asm-v850/unistd.h~remove-gcc2-checks	2005-12-13 00:51:15.000000000 -0800
+++ devel-akpm/include/asm-v850/unistd.h	2005-12-13 00:56:07.000000000 -0800
@@ -241,9 +241,6 @@
 /* User programs sometimes end up including this header file
    (indirectly, via uClibc header files), so I'm a bit nervous just
    including <linux/compiler.h>.  */
-#if !defined(__builtin_expect) && __GNUC__ == 2 && __GNUC_MINOR__ < 96
-#define __builtin_expect(x, expected_value) (x)
-#endif
 
 #define __syscall_return(type, res)					      \
   do {									      \
@@ -346,20 +343,6 @@ type name (atype a, btype b, ctype c, dt
   __syscall_return (type, __ret);					      \
 }
 
-#if __GNUC__ < 3
-/* In older versions of gcc, `asm' statements with more than 10
-   input/output arguments produce a fatal error.  To work around this
-   problem, we use two versions, one for gcc-3.x and one for earlier
-   versions of gcc (the `earlier gcc' version doesn't work with gcc-3.x
-   because gcc-3.x doesn't allow clobbers to also be input arguments).  */
-#define __SYSCALL6_TRAP(syscall, ret, a, b, c, d, e, f)			      \
-  __asm__ __volatile__ ("trap " SYSCALL_LONG_TRAP			      \
-			: "=r" (ret), "=r" (syscall)			      \
-			: "1" (syscall),				      \
-			"r" (a), "r" (b), "r" (c), "r" (d),		      \
- 			"r" (e), "r" (f)				      \
-			: SYSCALL_CLOBBERS, SYSCALL_ARG4, SYSCALL_ARG5);
-#else /* __GNUC__ >= 3 */
 #define __SYSCALL6_TRAP(syscall, ret, a, b, c, d, e, f)			      \
   __asm__ __volatile__ ("trap " SYSCALL_LONG_TRAP			      \
 			: "=r" (ret), "=r" (syscall),			      \
@@ -368,7 +351,6 @@ type name (atype a, btype b, ctype c, dt
 			"r" (a), "r" (b), "r" (c), "r" (d),		      \
 			"2" (e), "3" (f)				      \
 			: SYSCALL_CLOBBERS);
-#endif
 
 #define _syscall6(type, name, atype, a, btype, b, ctype, c, dtype, d, etype, e, ftype, f) \
 type name (atype a, btype b, ctype c, dtype d, etype e, ftype f)	      \
_


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:02             ` Andrew Morton
@ 2005-12-13 10:07               ` Jakub Jelinek
  2005-12-13 10:11                 ` Andi Kleen
  2005-12-14 10:46               ` Russell King
  1 sibling, 1 reply; 239+ messages in thread
From: Jakub Jelinek @ 2005-12-13 10:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: ak, mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel,
	linux-arch

On Tue, Dec 13, 2005 at 01:02:33AM -0800, Andrew Morton wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > Thus far I have this:
> >
> 
> And this:
> 
> 
> From: Andrew Morton <akpm@osdl.org>
> 
> Remove various things which were checking for gcc-1.x and gcc-2.x compilers.

> --- devel/arch/arm/kernel/asm-offsets.c~remove-gcc2-checks	2005-12-13 00:51:14.000000000 -0800
> +++ devel-akpm/arch/arm/kernel/asm-offsets.c	2005-12-13 00:53:27.000000000 -0800
> @@ -23,18 +23,13 @@
>  #error Sorry, your compiler targets APCS-26 but this kernel requires APCS-32
>  #endif
>  /*
> - * GCC 2.95.1, 2.95.2: ignores register clobber list in asm().
>   * GCC 3.0, 3.1: general bad code generation.
>   * GCC 3.2.0: incorrect function argument offset calculation.
>   * GCC 3.2.x: miscompiles NEW_AUX_ENT in fs/binfmt_elf.c
>   *            (http://gcc.gnu.org/PR8896) and incorrect structure
>   *	      initialisation in fs/jffs2/erase.c
>   */
> -#if __GNUC__ < 2 || \
> -   (__GNUC__ == 2 && __GNUC_MINOR__ < 95) || \
> -   (__GNUC__ == 2 && __GNUC_MINOR__ == 95 && __GNUC_PATCHLEVEL__ != 0 && \
> -					     __GNUC_PATCHLEVEL__ < 3) || \
> -   (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
> +#if __GNUC__ < 2 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
>  #error Your compiler is too buggy; it is known to miscompile kernels.
>  #error    Known good compilers: 2.95.3, 2.95.4, 2.96, 3.3
>  #endif

Guess

#if __GNUC__ == 3 && __GNUC_MINOR__ < 3
#error Your compiler is too buggy; it is known to miscompile kernels.
#error    Known good compilers: 3.3, 3.4, 4.0
#endif

would be better.  __GNUC__ < 2 will certainly be rejected with an error in
other places, and it is bad to suggest compilers that are no longer
supported as known good ones.
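
The same range test can be sketched as an ordinary function, which makes
it easy to see which versions it rejects. DEMO_GCC_VERSION() and
demo_gcc_rejected() are hypothetical names used only for this sketch:

```c
#include <assert.h>

/* Hypothetical helper: fold major.minor into one comparable number. */
#define DEMO_GCC_VERSION(major, minor)	((major) * 100 + (minor))

/* Mirrors the suggested preprocessor test: reject only GCC 3.0 to 3.2. */
static int demo_gcc_rejected(int major, int minor)
{
	return DEMO_GCC_VERSION(major, minor) >= DEMO_GCC_VERSION(3, 0) &&
	       DEMO_GCC_VERSION(major, minor) <  DEMO_GCC_VERSION(3, 3);
}
```

Note that 2.95 passes this particular check; it relies on the generic
"too old" test elsewhere to reject 2.x compilers.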

	Jakub

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:07               ` Jakub Jelinek
@ 2005-12-13 10:11                 ` Andi Kleen
  2005-12-13 10:15                   ` Jakub Jelinek
  2005-12-13 10:25                     ` Andrew Morton
  0 siblings, 2 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13 10:11 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Andrew Morton, ak, mingo, dhowells, torvalds, hch, arjan,
	matthew, linux-kernel, linux-arch

> Guess
> 
> #if __GNUC__ == 3 && __GNUC_MINOR__ < 3
> #error Your compiler is too buggy; it is known to miscompile kernels.
> #error    Known good compilers: 3.3, 3.4, 4.0
> #endif
> 
> would be better.  __GNUC__ < 2 will certainly be errored about in other
> places and it is bad to suggest compilers that are no longer supported
> as known good ones.

Are there really any known serious miscompilations with 3.1/3.2?
(I knew it used to miscompile some loops on x86-64, but I think I worked
around all that) 

Preventing SLES9 and RHEL3 users from easily compiling new kernels
isn't good.

-Andi

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:11                 ` Andi Kleen
@ 2005-12-13 10:15                   ` Jakub Jelinek
  2005-12-13 10:25                     ` Andrew Morton
  1 sibling, 0 replies; 239+ messages in thread
From: Jakub Jelinek @ 2005-12-13 10:15 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 11:11:53AM +0100, Andi Kleen wrote:
> > Guess
> > 
> > #if __GNUC__ == 3 && __GNUC_MINOR__ < 3
> > #error Your compiler is too buggy; it is known to miscompile kernels.
> > #error    Known good compilers: 3.3, 3.4, 4.0
> > #endif
> > 
> > would be better.  __GNUC__ < 2 will certainly be errored about in other
> > places and it is bad to suggest compilers that are no longer supported
> > as known good ones.
> 
> Are there really any known serious miscompilations with 3.1/3.2?
> (I knew it used to miscompile some loops on x86-64, but I think I worked
> around all that) 
> 
> Preventing SLES9 and RHEL3 users from easily compiling new kernels
> isn't good.

The above is solely for ARM; the comment there mentions an ARM postreload bug
that was only fixed in 3.3+.
I'd say 3.2 should be generally supported for the time being on arches
where there weren't significant problems with it.

	Jakub

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:11                 ` Andi Kleen
@ 2005-12-13 10:25                     ` Andrew Morton
  2005-12-13 10:25                     ` Andrew Morton
  1 sibling, 0 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-13 10:25 UTC (permalink / raw)
  To: Andi Kleen
  Cc: jakub, ak, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

Andi Kleen <ak@suse.de> wrote:
>
> > Guess
> > 
> > #if __GNUC__ == 3 && __GNUC_MINOR__ < 3
> > #error Your compiler is too buggy; it is known to miscompile kernels.
> > #error    Known good compilers: 3.3, 3.4, 4.0
> > #endif
> > 
> > would be better.  __GNUC__ < 2 will certainly be errored about in other
> > places and it is bad to suggest compilers that are no longer supported
> > as known good ones.
> 
> Are there really any known serious miscompilations with 3.1/3.2?
> (I knew it used to miscompile some loops on x86-64, but I think I worked
> around all that) 
> 
> Preventing SLES9 and RHEL3 users from easily compiling new kernels
> isn't good.
> 

3.2.1 works OK on x86.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:02             ` Andrew Morton
  2005-12-13 10:07               ` Jakub Jelinek
@ 2005-12-14 10:46               ` Russell King
  1 sibling, 0 replies; 239+ messages in thread
From: Russell King @ 2005-12-14 10:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: ak, mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel,
	linux-arch

On Tue, Dec 13, 2005 at 01:02:33AM -0800, Andrew Morton wrote:
> diff -puN arch/arm/kernel/asm-offsets.c~remove-gcc2-checks arch/arm/kernel/asm-offsets.c
> --- devel/arch/arm/kernel/asm-offsets.c~remove-gcc2-checks	2005-12-13 00:51:14.000000000 -0800
> +++ devel-akpm/arch/arm/kernel/asm-offsets.c	2005-12-13 00:53:27.000000000 -0800
> @@ -23,18 +23,13 @@
>  #error Sorry, your compiler targets APCS-26 but this kernel requires APCS-32
>  #endif
>  /*
> - * GCC 2.95.1, 2.95.2: ignores register clobber list in asm().
>   * GCC 3.0, 3.1: general bad code generation.
>   * GCC 3.2.0: incorrect function argument offset calculation.
>   * GCC 3.2.x: miscompiles NEW_AUX_ENT in fs/binfmt_elf.c
>   *            (http://gcc.gnu.org/PR8896) and incorrect structure
>   *	      initialisation in fs/jffs2/erase.c
>   */
> -#if __GNUC__ < 2 || \
> -   (__GNUC__ == 2 && __GNUC_MINOR__ < 95) || \
> -   (__GNUC__ == 2 && __GNUC_MINOR__ == 95 && __GNUC_PATCHLEVEL__ != 0 && \
> -					     __GNUC_PATCHLEVEL__ < 3) || \
> -   (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
> +#if __GNUC__ < 2 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)

Shouldn't this be:

+#if __GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)

?

>  #error Your compiler is too buggy; it is known to miscompile kernels.
>  #error    Known good compilers: 2.95.3, 2.95.4, 2.96, 3.3

And this should also have the 2.95 and 2.96 stuff edited out.
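
To make the difference concrete, here is a rough sketch that restates the
two preprocessor conditions as ordinary C functions.  The function names
are made up for illustration and are not part of the patch:

```c
#include <stdbool.h>

/* The patch as posted:
 * #if __GNUC__ < 2 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
 */
static bool rejected_by_patch(int major, int minor)
{
	return major < 2 || (major == 3 && minor < 3);
}

/* The suggested form:
 * #if __GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
 */
static bool rejected_by_suggestion(int major, int minor)
{
	return major < 3 || (major == 3 && minor < 3);
}
```

With these, 2.95 slips through the condition in the patch but is rejected
by the suggested one, while 3.3 and 4.0 pass both.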

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:01             ` Andrew Morton
  (?)
  (?)
@ 2005-12-13  9:05             ` Andi Kleen
  2005-12-13  9:15                 ` Andrew Morton
  2005-12-13 22:18               ` Adrian Bunk
  -1 siblings, 2 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13  9:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 01:01:26AM -0800, Andrew Morton wrote:
> Andi Kleen <ak@suse.de> wrote:
> >
> > Can you please apply the following patch then? 
> > 
> >  Remove -Wdeclaration-after-statement
> 
> OK.
> 
> Thus far I have this:

Would it be possible to drop support for gcc 3.0 too? 
AFAIK it has never been widely used. If we assume 3.1+ minimum it has the 
advantage that named assembly arguments work, which make
the inline assembly often a lot easier to read and maintain.
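
For illustration, a minimal sketch of what named operands look like,
assuming an x86 target and gcc 3.1 or later.  The function and operand
names are hypothetical, and other architectures fall back to plain C:

```c
/* Add b to a; on x86 this uses inline asm with named operands. */
static inline unsigned int add_named(unsigned int a, unsigned int b)
{
#if defined(__i386__) || defined(__x86_64__)
	/* %[sum] and %[addend] take the place of positional %0 and %1. */
	__asm__("addl %[addend], %[sum]"
		: [sum] "+r" (a)
		: [addend] "r" (b)
		: "cc");
	return a;
#else
	/* Hypothetical fallback so the sketch still builds elsewhere. */
	return a + b;
#endif
}
```

With positional operands the template would read "addl %1, %0", and the
numbers have to be recounted whenever an operand is added or removed.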

-Andi


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:05             ` Andi Kleen
@ 2005-12-13  9:15                 ` Andrew Morton
  2005-12-13 22:18               ` Adrian Bunk
  1 sibling, 0 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-13  9:15 UTC (permalink / raw)
  To: Andi Kleen
  Cc: ak, mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel,
	linux-arch

Andi Kleen <ak@suse.de> wrote:
>
> On Tue, Dec 13, 2005 at 01:01:26AM -0800, Andrew Morton wrote:
> > Andi Kleen <ak@suse.de> wrote:
> > >
> > > Can you please apply the following patch then? 
> > > 
> > >  Remove -Wdeclaration-after-statement
> > 
> > OK.
> > 
> > Thus far I have this:
> 
> Would it be possible to drop support for gcc 3.0 too? 

Spose so - I don't know what people are using out there.

> AFAIK it has never been widely used. If we assume 3.1+ minimum it has the 
> advantage that named assembly arguments work, which make
> the inline assembly often a lot easier to read and maintain.

There are a few places in the tree which refuse to compile with 3.1 and 3.2.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:15                 ` Andrew Morton
  (?)
@ 2005-12-13  9:24                 ` Andi Kleen
  2005-12-13  9:44                     ` Andrew Morton
                                     ` (2 more replies)
  -1 siblings, 3 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13  9:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

> Spose so - I don't know what people are using out there.

I don't think it was shipped in major distros at least (AFAIK) 
They all went from 2.95 to 3.1/3.2 

Perhaps stick in an error for 3.0 and wait to see whether people complain?

> 
> > AFAIK it has never been widely used. If we assume 3.1+ minimum it has the 
> > advantage that named assembly arguments work, which make
> > the inline assembly often a lot easier to read and maintain.
> 
> There are a few places in the tree which refuse to compile with 3.1 and 3.2.

Really? Which ones? 

Haven't seen that and I still use 3.2 occasionally (it's the default
compiler on SLES9 and I believe on RHEL3 too)  

-Andi


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:24                 ` Andi Kleen
@ 2005-12-13  9:44                     ` Andrew Morton
  2005-12-13 10:28                   ` Andreas Schwab
  2005-12-13 12:33                   ` Matthew Wilcox
  2 siblings, 0 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-13  9:44 UTC (permalink / raw)
  To: Andi Kleen
  Cc: ak, mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel,
	linux-arch

Andi Kleen <ak@suse.de> wrote:
>
> > There are a few places in the tree which refuse to compile with 3.1 and 3.2.
> 
>  Really? Which ones? 

grep for __GNUC_MINOR__

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:44                     ` Andrew Morton
  (?)
@ 2005-12-13  9:49                     ` Andi Kleen
  -1 siblings, 0 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13  9:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 01:44:37AM -0800, Andrew Morton wrote:
> Andi Kleen <ak@suse.de> wrote:
> >
> > > There are a few places in the tree which refuse to compile with 3.1 and 3.2.
> > 
> >  Really? Which ones? 
> 
> grep for __GNUC_MINOR__

I reviewed them and I didn't find any that refused 3.2 or 3.3.

Some architectures have special code for old gccs, but nothing
generic.

-Andi

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:24                 ` Andi Kleen
  2005-12-13  9:44                     ` Andrew Morton
@ 2005-12-13 10:28                   ` Andreas Schwab
  2005-12-13 10:30                     ` Andi Kleen
  2005-12-13 12:33                   ` Matthew Wilcox
  2 siblings, 1 reply; 239+ messages in thread
From: Andreas Schwab @ 2005-12-13 10:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

Andi Kleen <ak@suse.de> writes:

> Haven't seen that and I still use 3.2 occasionally (it's the default
> compiler on SLES9 and I believe on RHEL3 too)  

SLES9 has 3.3-hammer.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:28                   ` Andreas Schwab
@ 2005-12-13 10:30                     ` Andi Kleen
  0 siblings, 0 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13 10:30 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Andi Kleen, Andrew Morton, mingo, dhowells, torvalds, hch, arjan,
	matthew, linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 11:28:41AM +0100, Andreas Schwab wrote:
> Andi Kleen <ak@suse.de> writes:
> 
> > Haven't seen that and I still use 3.2 occasionally (it's the default
> > compiler on SLES9 and I believe on RHEL3 too)  
> 
> SLES9 has 3.3-hammer.

You're right - I meant to write SLES8, where 3.2 was the default.

-Andi

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:24                 ` Andi Kleen
  2005-12-13  9:44                     ` Andrew Morton
  2005-12-13 10:28                   ` Andreas Schwab
@ 2005-12-13 12:33                   ` Matthew Wilcox
  2 siblings, 0 replies; 239+ messages in thread
From: Matthew Wilcox @ 2005-12-13 12:33 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, mingo, dhowells, torvalds, hch, arjan,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 10:24:37AM +0100, Andi Kleen wrote:
> > Spose so - I don't know what people are using out there.
> 
> I don't think it was shipped in major distros at least (AFAIK) 
> They all went from 2.95 to 3.1/3.2 

Debian Woody (3.0) shipped a mess of compilers -- 2.95 for most, 2.96
for ia64 and 3.0 for parisc.  That was released in July 2002.  Sarge
(3.1) shipped in June 2005 and uses GCC 3.3 on all architectures.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:05             ` Andi Kleen
  2005-12-13  9:15                 ` Andrew Morton
@ 2005-12-13 22:18               ` Adrian Bunk
  2005-12-13 22:25                 ` Andi Kleen
  1 sibling, 1 reply; 239+ messages in thread
From: Adrian Bunk @ 2005-12-13 22:18 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 10:05:18AM +0100, Andi Kleen wrote:
> On Tue, Dec 13, 2005 at 01:01:26AM -0800, Andrew Morton wrote:
> > Andi Kleen <ak@suse.de> wrote:
> > >
> > > Can you please apply the following patch then? 
> > > 
> > >  Remove -Wdeclaration-after-statement
> > 
> > OK.
> > 
> > Thus far I have this:
> 
> Would it be possible to drop support for gcc 3.0 too? 
> AFAIK it has never been widely used. If we assume 3.1+ minimum it has the 
> advantage that named assembly arguments work, which make
> the inline assembly often a lot easier to read and maintain.

3.2+ would be better than 3.1+

Remember that 3.2 would have been named 3.1.2 if there wasn't the C++
ABI change, and I don't remember any big Linux distribution actually 
using gcc 3.1 as default compiler.

And since gcc 3.2 was released one and a half years before kernel 2.6.0, 
I doubt there's any distribution both supporting kernel 2.6 and not 
shipping any gcc >= 3.2.

> -Andi

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 22:18               ` Adrian Bunk
@ 2005-12-13 22:25                 ` Andi Kleen
  2005-12-13 22:32                   ` Adrian Bunk
  0 siblings, 1 reply; 239+ messages in thread
From: Andi Kleen @ 2005-12-13 22:25 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Andi Kleen, Andrew Morton, mingo, dhowells, torvalds, hch, arjan,
	matthew, linux-kernel, linux-arch

> 3.2+ would be better than 3.1+
> 
> Remember that 3.2 would have been named 3.1.2 if there wasn't the C++
> ABI change, and I don't remember any big Linux distribution actually 
> using gcc 3.1 as default compiler.

Yes, but the kernel doesn't use C++ and afaik other than that there were only
a few minor bugfixes between 3.1 and 3.2. So it doesn't make any
difference for this special case.

-Andi

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 22:25                 ` Andi Kleen
@ 2005-12-13 22:32                   ` Adrian Bunk
  0 siblings, 0 replies; 239+ messages in thread
From: Adrian Bunk @ 2005-12-13 22:32 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 11:25:43PM +0100, Andi Kleen wrote:
> > 3.2+ would be better than 3.1+
> > 
> > Remember that 3.2 would have been named 3.1.2 if there wasn't the C++
> > ABI change, and I don't remember any big Linux distribution actually 
> > using gcc 3.1 as default compiler.
> 
> Yes, but the kernel doesn't use C++ and afaik other than that there were only
> a few minor bugfixes between 3.1 and 3.2. So it doesn't make any
> difference for this special case.

gcc 3.2.3 is four bugfix releases and nine months later than 3.1.1, and 
there are virtually no gcc 3.1 users.

It's not a strong opinion, but if the question is whether to draw the 
line before or after gcc 3.1 I'd vote for dropping gcc 3.1 support.

> -Andi

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:01             ` Andrew Morton
                               ` (2 preceding siblings ...)
  (?)
@ 2005-12-13  9:11             ` Ingo Molnar
  -1 siblings, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch


* Andrew Morton <akpm@osdl.org> wrote:

> Andi Kleen <ak@suse.de> wrote:
> >
> > Can you please apply the following patch then? 
> > 
> >  Remove -Wdeclaration-after-statement
> 
> OK.
> 
> Thus far I have this:
> 
> 
> From: Andrew Morton <akpm@osdl.org>

hurray!!

This-Move-Is-Emphatically-Supported-by: Ingo Molnar <mingo@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  8:49         ` Andi Kleen
  2005-12-13  9:01             ` Andrew Morton
@ 2005-12-13  9:04           ` Christoph Hellwig
  2005-12-13  9:13             ` Ingo Molnar
  2005-12-13 10:11             ` Jakub Jelinek
  2005-12-13  9:09           ` Ingo Molnar
  2005-12-13 16:16           ` Linus Torvalds
  3 siblings, 2 replies; 239+ messages in thread
From: Christoph Hellwig @ 2005-12-13  9:04 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

> 
> Remove -Wdeclaration-after-statement
> 
> Now that gcc 2.95 is not supported anymore it's ok to use C99
> style mixed declarations everywhere.

Nack.  This code style is pure obfuscation and we should disallow it forever.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:04           ` Christoph Hellwig
@ 2005-12-13  9:13             ` Ingo Molnar
  2005-12-13 10:11             ` Jakub Jelinek
  1 sibling, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:13 UTC (permalink / raw)
  To: Christoph Hellwig, Andi Kleen, Andrew Morton, dhowells, torvalds,
	arjan, matthew, linux-kernel, linux-arch


* Christoph Hellwig <hch@infradead.org> wrote:

> > 
> > Remove -Wdeclaration-after-statement
> > 
> > Now that gcc 2.95 is not supported anymore it's ok to use C99
> > style mixed declarations everywhere.
> 
> Nack.  This code style is pure obfuscation and we should disallow it 
> forever.

agreed. I often get quilt mismerges uncovered by that warning. If 
someone wants to start a new section of code that is too large, it 
should go into a separate function.

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:04           ` Christoph Hellwig
  2005-12-13  9:13             ` Ingo Molnar
@ 2005-12-13 10:11             ` Jakub Jelinek
  2005-12-13 10:19               ` Christoph Hellwig
  1 sibling, 1 reply; 239+ messages in thread
From: Jakub Jelinek @ 2005-12-13 10:11 UTC (permalink / raw)
  To: Christoph Hellwig, Andi Kleen, Andrew Morton, mingo, dhowells,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 09:04:29AM +0000, Christoph Hellwig wrote:
> > 
> > Remove -Wdeclaration-after-statement
> > 
> > Now that gcc 2.95 is not supported anymore it's ok to use C99
> > style mixed declarations everywhere.
> 
> Nack.  This code style is pure obfuscation and we should disallow it forever.

Why?  It greatly increases readability when variable declarations can be
moved close to their actual uses.  glibc changed a lot of its codebase
this way and from my experience it really helps.

	Jakub

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:11             ` Jakub Jelinek
@ 2005-12-13 10:19               ` Christoph Hellwig
  2005-12-13 10:27                 ` Ingo Molnar
  2005-12-15  4:53                 ` Miles Bader
  0 siblings, 2 replies; 239+ messages in thread
From: Christoph Hellwig @ 2005-12-13 10:19 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Christoph Hellwig, Andi Kleen, Andrew Morton, mingo, dhowells,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 05:11:41AM -0500, Jakub Jelinek wrote:
> On Tue, Dec 13, 2005 at 09:04:29AM +0000, Christoph Hellwig wrote:
> > > 
> > > Remove -Wdeclaration-after-statement
> > > 
> > > Now that gcc 2.95 is not supported anymore it's ok to use C99
> > > style mixed declarations everywhere.
> > 
> > > Nack.  This code style is pure obfuscation and we should disallow it forever.
> 
> Why?  It greatly increases readability when variable declarations can be
> moved close to their actual uses.  glibc changed a lot of its codebase
> this way and from my experience it really helps.

mentioning glibc and readability in the same sentence disqualifies you here,
sorry ;-)

But seriously, having to look all over the source instead of just at the
beginning of a block decreases code readability a lot.  And having to scroll
more than a page to reach the beginning of the block on an 80x24 terminal
means the code needs refactoring anyway.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:19               ` Christoph Hellwig
@ 2005-12-13 10:27                 ` Ingo Molnar
  2005-12-15  4:53                 ` Miles Bader
  1 sibling, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13 10:27 UTC (permalink / raw)
  To: Christoph Hellwig, Jakub Jelinek, Andi Kleen, Andrew Morton,
	dhowells, torvalds, arjan, matthew, linux-kernel, linux-arch


* Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Dec 13, 2005 at 05:11:41AM -0500, Jakub Jelinek wrote:
> > On Tue, Dec 13, 2005 at 09:04:29AM +0000, Christoph Hellwig wrote:
> > > > 
> > > > Remove -Wdeclaration-after-statement
> > > > 
> > > > Now that gcc 2.95 is not supported anymore it's ok to use C99
> > > > style mixed declarations everywhere.
> > > 
> > > > Nack.  This code style is pure obfuscation and we should disallow it forever.
> > 
> > Why?  It greatly increases readability when variable declarations can be
> > moved close to their actual uses.  glibc changed a lot of its codebase
> > this way and from my experience it really helps.
> 
> mentioning glibc and readability in the same sentence disqualifies you 
> here, sorry ;-)

it's a different coding style, but otherwise i find glibc highly 
readable and well-maintained. It is also a more mature piece of code 
than say the kernel, e.g. API-wise, so we could indeed learn a few 
things. Just consider the fact that glibc has 10 times more APIs than 
the kernel, and still it is breaking apps less often than the kernel.  
But i digress :-)

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:19               ` Christoph Hellwig
  2005-12-13 10:27                 ` Ingo Molnar
@ 2005-12-15  4:53                 ` Miles Bader
  2005-12-15  5:05                   ` Nick Piggin
  1 sibling, 1 reply; 239+ messages in thread
From: Miles Bader @ 2005-12-15  4:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jakub Jelinek, Andi Kleen, Andrew Morton, mingo, dhowells,
	torvalds, arjan, matthew, linux-kernel, linux-arch

Christoph Hellwig <hch@infradead.org> writes:
> But seriously, having to look all over the source instead of just a block
> beginning decreases code readability a lot.

My experience is quite the opposite.

Being forced to put declarations at the beginning of the block in
practice means that people simply separate declarations from the first
assignment.  That uglifies and bloats the code, and seems to often cause
bugs as well (because people seem to often not pay attention to what
happens to a variable between the declaration and first assignment;
having it simply _not exist_ before the first assignment helps quite a
bit).
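
A minimal sketch of the two styles side by side, assuming a compiler that
accepts C99 mixed declarations.  The function names are made up for
illustration:

```c
#include <stddef.h>

/* Top-of-block style: 'count' exists, unassigned, across the early return. */
static size_t count_nonzero_top(const int *v, size_t n)
{
	size_t i;
	size_t count;

	if (!v)
		return 0;
	count = 0;
	for (i = 0; i < n; i++)
		if (v[i])
			count++;
	return count;
}

/* Mixed style (C99): 'count' does not exist until it has a value. */
static size_t count_nonzero_mixed(const int *v, size_t n)
{
	if (!v)
		return 0;

	size_t count = 0;
	for (size_t i = 0; i < n; i++)
		if (v[i])
			count++;
	return count;
}
```

Both functions return the same result; the only difference is how long
'count' is visible before it holds a meaningful value.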

-Miles
-- 
Run away!  Run away!

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15  4:53                 ` Miles Bader
@ 2005-12-15  5:05                   ` Nick Piggin
  0 siblings, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-15  5:05 UTC (permalink / raw)
  To: Miles Bader
  Cc: Christoph Hellwig, Jakub Jelinek, Andi Kleen, Andrew Morton,
	mingo, dhowells, torvalds, arjan, matthew, linux-kernel,
	linux-arch

Miles Bader wrote:
> Christoph Hellwig <hch@infradead.org> writes:
> 
>>But seriously, having to look all over the source instead of just a block
>>beginning decreases code readability a lot.
> 
> 
> My experience is quite the opposite.
> 
> Being forced to put declarations at the beginning of the block in
> practice means that people simply separate declarations from the first
> assignment.  That uglifies and bloats the code, and seems to often cause
> bugs as well (because people seem to often not pay attention to what
> happens to a variable between the declaration and first assignment;
> having it simply _not exist_ before the first assignment helps quite a
> bit).
> 

If your blocks are so big that you lose track of variables like
this... then they are too big and/or complex.

And the argument about having it simply _not exist_ before the
first assignment isn't convincing to me, because you cannot
undeclare variables after you finish with them (do you also see
code where people cause bugs by forgetting about the variable after
its last use?).

IMO, the system of declaring all variables at the top of the block
and they all disappear at the end is nice and symmetric... although
I probably agree with Linus on the 'for (int i = 0;' feature.

Nick

-- 
SUSE Labs, Novell Inc.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  8:49         ` Andi Kleen
  2005-12-13  9:01             ` Andrew Morton
  2005-12-13  9:04           ` Christoph Hellwig
@ 2005-12-13  9:09           ` Ingo Molnar
  2005-12-13  9:21             ` Andi Kleen
  2005-12-13 16:16           ` Linus Torvalds
  3 siblings, 1 reply; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:09 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch


* Andi Kleen <ak@suse.de> wrote:

> > It's time to give up on it and just drink more coffee or play more tetris
> > or something, I'm afraid.
> 
> Or start using icecream (http://wiki.kde.org/icecream)

distcc is pretty good too. I have a minimal kernel build done in 19 
seconds, a fuller build (1.5MB bzImage that boots on all my testboxes) 
done in 45 seconds, using gcc 4.0.2.

with the default settings, distcc wasn't saturating my boxes; the key was 
to start distccd with a longer queue size (/etc/sysconfig/distccd):

 OPTIONS="--nice 5 --jobs 128"

and to get the DISTCC_HOSTS tuning right:

 export DISTCC_HOSTS='j/16 n/120 v/40 s/13 e2/7'

in fact my distcc builds are almost as fast as a fully cached ccache 
build coming straight out of RAM ...

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:09           ` Ingo Molnar
@ 2005-12-13  9:21             ` Andi Kleen
  0 siblings, 0 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13  9:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andi Kleen, Andrew Morton, dhowells, torvalds, hch, arjan,
	matthew, linux-kernel, linux-arch

> > Or start using icecream (http://wiki.kde.org/icecream)
> 
> distcc is pretty good too. I have a minimal kernel build done in 19 
> seconds, a fuller build (1.5MB bzImage that boots on all my testboxes) 
> done in 45 seconds, using gcc 4.0.2.

icecream is better though - it reacts dynamically to your network
and it handles different installed compiler versions and cross compilation
nicely.

-Andi

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  8:49         ` Andi Kleen
                             ` (2 preceding siblings ...)
  2005-12-13  9:09           ` Ingo Molnar
@ 2005-12-13 16:16           ` Linus Torvalds
  3 siblings, 0 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-13 16:16 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, mingo, dhowells, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, 13 Dec 2005, Andi Kleen wrote:
> 
> Remove -Wdeclaration-after-statement

Please don't.

It's a coding style issue. We put our variable declarations where people 
can _find_ them, not in random places in the code.

Putting variables in the middle of code only improves readability when you 
have messy code. 

Now, one feature that _may_ be worth it is the loop counter thing:

	for (int i = 10; i; i--)
		...

kind of syntax actually makes sense and is a real feature (it makes "i" 
local to the loop, and can actually help people avoid bugs - you can't use 
"i" by mistake after the loop).

But I think you need "--std=c99" for gcc to take that.

			Linus
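
A minimal sketch of the loop-counter scoping described above. It is not
taken from the thread: the function name countdown_sum is hypothetical,
and it assumes the compiler runs in C99 mode or later (e.g. gcc -std=c99).

```c
/* Sums 10 + 9 + ... + 1 using a counter declared in the for statement. */
static int countdown_sum(void)
{
	int total = 0;

	for (int i = 10; i; i--)
		total += i;

	/* 'i' is out of scope here; using it would be a compile error,
	 * which is the accidental-reuse bug the syntax guards against. */
	return total;
}
```

Under a pre-C99 dialect such as -std=gnu89, gcc rejects the declaration
inside the for statement.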

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  8:42       ` Andrew Morton
  2005-12-13  8:49         ` Andi Kleen
@ 2005-12-13  9:03         ` Christoph Hellwig
  2005-12-13  9:14             ` Andrew Morton
  1 sibling, 1 reply; 239+ messages in thread
From: Christoph Hellwig @ 2005-12-13  9:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, mingo, dhowells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 12:42:57AM -0800, Andrew Morton wrote:
> scsi/sd.c is currently getting an ICE.  None of the new SAS code compiles,
> due to extensive use of anonymous unions.

This is just the headers in the luben code which need redoing completely
because they're doing other stupid things like using bitfields for
on-the-wire structures.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:03         ` Christoph Hellwig
@ 2005-12-13  9:14             ` Andrew Morton
  0 siblings, 0 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-13  9:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: ak, mingo, dhowells, torvalds, hch, arjan, matthew, linux-kernel,
	linux-arch

Christoph Hellwig <hch@infradead.org> wrote:
>
> On Tue, Dec 13, 2005 at 12:42:57AM -0800, Andrew Morton wrote:
> > scsi/sd.c is currently getting an ICE.  None of the new SAS code compiles,
> > due to extensive use of anonymous unions.
> 
> This is just the headers in the luben code which need redoing completely
> because they're doing other stupid things like using bitfields for on the
> wire structures.

Don't think so (you're referring to Jeff's git-sas-jg.patch?).  It dies
with current -linus tree.


drivers/scsi/sd.c: In function `sd_read_capacity':
drivers/scsi/sd.c:1301: internal error--unrecognizable insn:
(insn 1274 1273 1797 (parallel[ 
            (set (reg:SI 0 %eax)
                (asm_operands ("") ("=a") 0[ 
                        (reg:DI 1 %edx)
                    ] 
                    [ 
                        (asm_input:DI ("A"))
                    ]  ("drivers/scsi/sd.c") 1282))
            (set (reg:SI 1 %edx)
                (asm_operands ("") ("=d") 1[ 
                        (reg:DI 1 %edx)
                    ] 
                    [ 
                        (asm_input:DI ("A"))
                    ]  ("drivers/scsi/sd.c") 1282))
        ] ) -1 (insn_list 1269 (nil))
    (nil))

It'll be workable aroundable of course, but it's a hassle.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:14             ` Andrew Morton
  (?)
@ 2005-12-13  9:21             ` Christoph Hellwig
  -1 siblings, 0 replies; 239+ messages in thread
From: Christoph Hellwig @ 2005-12-13  9:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, ak, mingo, dhowells, torvalds, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 01:14:13AM -0800, Andrew Morton wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Tue, Dec 13, 2005 at 12:42:57AM -0800, Andrew Morton wrote:
> > > scsi/sd.c is currently getting an ICE.  None of the new SAS code compiles,
> > > due to extensive use of anonymous unions.
> > 
> > This is just the headers in the luben code which need redoing completely
> > because they're doing other stupid things like using bitfields for on the
> > wire structures.
> 
> Don't think so (you're referring to Jeff's git-sas-jg.patch?).  It dies
> with current -linus tree.

I didn't mean sd.c but the anonymous union usage.  Everything that's stuffed
into include/scsi/sas/ in -mm is far from mergeable.  It's really badly done
headers that need to be redone.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  7:54   ` Ingo Molnar
  2005-12-13  7:58     ` Andi Kleen
@ 2005-12-13  8:00     ` Arjan van de Ven
  2005-12-13  9:03       ` Ingo Molnar
  2005-12-13  9:02     ` Christoph Hellwig
  2005-12-13  9:55     ` Ingo Molnar
  3 siblings, 1 reply; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-13  8:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, David Howells, torvalds, hch, matthew,
	linux-kernel, linux-arch

On Tue, 2005-12-13 at 08:54 +0100, Ingo Molnar wrote:
> * Andrew Morton <akpm@osdl.org> wrote:
> 
> > I'd have thought that the way to do this is to simply reimplement 
> > down(), up(), down_trylock(), etc using the new xchg-based code and to 
> > then hunt down those few parts of the kernel which actually use the 
> > old semaphore's counting feature and convert them to use down_sem(), 
> > up_sem(), etc.  And rename all the old semaphore code: 
> > s/down/down_sem/etc.
> 
> even better than that, why not use the solution that we've implemented 
> for the -rt patchset, more than a year ago?
> 
> the solution i took was this:
> 
> - i did not touch the 'struct semaphore' namespace, but introduced a
>   'struct compat_semaphore'.

which I think is wrong. This naming sucks. Sure, doing a full sed on the
tree is not pretty but it's also not THAT painful. And the pain of wrong
names is something the kernel needs to carry around for years.
> 
> - i introduced a 'type-sensitive' macro wrapper that switches down() 
>   (and the other APIs) to either to the assembly variant (if the 
>   variable's type is struct compat_semaphore), or switches it to the new 
>   generic mutex (if the type is struct semaphore), at build-time. There 
>   is no runtime overhead due to this build-time-switching.

while this is a smart trick, I'd rather have separate functions, just so
that people are "aware" which they use. Since 99% of the users are
mutexes anyway, the new names are only used in a few special cases.
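
For readers unfamiliar with the idea, here is a rough sketch of what a
type-sensitive down() wrapper can look like. It is an illustration only,
not the -rt code: it uses C11 _Generic, which postdates this thread, and
compat_down(), mutex_down() and the struct layouts are hypothetical toy
stand-ins that do not block.

```c
/* Toy stand-ins for the two types; the real kernel structures differ. */
struct compat_semaphore { int count; };	/* counting semaphore */
struct semaphore { int locked; };	/* treated as a mutex */

/* Hypothetical back ends; neither sleeps, they only record state. */
static void compat_down(struct compat_semaphore *s) { s->count--; }
static void mutex_down(struct semaphore *m) { m->locked = 1; }

/* Select the back end from the static type of the argument.  The choice
 * is made by the compiler, so no runtime branch is generated. */
#define down(lock) _Generic((lock),			\
	struct compat_semaphore *: compat_down,		\
	struct semaphore *: mutex_down)(lock)
```

With this, down(&sem) and down(&csem) look identical at the call site,
which is exactly the property being argued over: the caller cannot tell
from the name which primitive is in use.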



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  8:00     ` Arjan van de Ven
@ 2005-12-13  9:03       ` Ingo Molnar
  2005-12-13  9:09         ` Andi Kleen
  2005-12-13  9:19         ` Arjan van de Ven
  0 siblings, 2 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:03 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Andrew Morton, David Howells, torvalds, hch, matthew,
	linux-kernel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 1051 bytes --]


* Arjan van de Ven <arjan@infradead.org> wrote:

> > even better than that, why not use the solution that we've implemented 
> > for the -rt patchset, more than a year ago?
> > 
> > the solution i took was this:
> > 
> > - i did not touch the 'struct semaphore' namespace, but introduced a
> >   'struct compat_semaphore'.
> 
> which I think is wrong. THis naming sucks. Sure doing a full sed on 
> the tree is not pretty but it's also not THAT painful. And the pain of 
> wrong names is something the kernel needs to carry around for years.

well, i'm all for renaming struct semaphore to struct mutex, but don't 
the same arguments apply as to 'struct timer_list'?

just to see the scope, i've attached semaphore-to-mutex.patch, which 
just dumbly converts all 'struct semaphore' occurrences to 'struct 
mutex', against Linus-git-curr:

 405 files changed, 568 insertions(+), 568 deletions(-)

it's not _that_ bad, if done overnight. It does not touch any of the 
down/up APIs. Touching those would create a monster patch and monster 
impact.

	Ingo

[-- Attachment #2: semaphore-to-mutex.patch.bz2 --]
[-- Type: application/x-bzip2, Size: 40881 bytes --]

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:03       ` Ingo Molnar
@ 2005-12-13  9:09         ` Andi Kleen
  2005-12-13  9:34           ` Ingo Molnar
  2005-12-13  9:37           ` Ingo Molnar
  2005-12-13  9:19         ` Arjan van de Ven
  1 sibling, 2 replies; 239+ messages in thread
From: Andi Kleen @ 2005-12-13  9:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arjan van de Ven, Andrew Morton, David Howells, torvalds, hch,
	matthew, linux-kernel, linux-arch

> it's not _that_ bad, if done overnight. It does not touch any of the 
> down/up APIs. Touching those would create a monster patch and monster 
> impact.

One argument for a full rename (and abandoning the old "struct semaphore"
name completely) would be that it would offer a clean break for
out-of-tree code, no silent breakage. 

-Andi


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:09         ` Andi Kleen
@ 2005-12-13  9:34           ` Ingo Molnar
  2005-12-13 14:33             ` Mark Lord
  2005-12-13  9:37           ` Ingo Molnar
  1 sibling, 1 reply; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:34 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Arjan van de Ven, Andrew Morton, David Howells, torvalds, hch,
	matthew, linux-kernel, linux-arch

* Andi Kleen <ak@suse.de> wrote:

> > it's not _that_ bad, if done overnight. It does not touch any of the 
> > down/up APIs. Touching those would create a monster patch and monster 
> > impact.
> 
> One argument for a full rename (and abandoning the old "struct 
> semaphore" name completely) would be that it would offer a clean break 
> for out tree code, no silent breakage.

yeah. Another way to handle it would be to keep 'struct semaphore' for 
the traditional semaphore type (together with the APIs), and to mark 
them deprecated. I.e. we'd have 3 separate types and 3 separate sets of 
APIs:

 'struct mutex' & APIs
 'struct semaphore' & APIs
 'struct compat_semaphore' & APIs

phase #1: we do an overnight rename to 'struct mutex' and to
          'struct compat_semaphore', based on the info that has been 
          mapped by the -rt tree. We mark 'struct semaphore' deprecated.

phase #2: we let out-of-tree code still work that uses struct 
          semaphore, but for new code applied, it must not be used.

phase #3: we remove 'struct semaphore' and APIs.

the problem with this approach is that it touches the semaphore APIs 
too, which increases the impact of the rename by a _factor of 10_. Right 
now we have ~600 places that use 'struct semaphore', but we have over 
7000 places that use the APIs! I don't think it's realistic to do an 
overnight change of all the APIs, we'd break every out-of-kernel tree in 
a massive way. (the type change alone is much more manageable)

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:34           ` Ingo Molnar
@ 2005-12-13 14:33             ` Mark Lord
  2005-12-13 14:45               ` Arjan van de Ven
  0 siblings, 1 reply; 239+ messages in thread
From: Mark Lord @ 2005-12-13 14:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andi Kleen, Arjan van de Ven, Andrew Morton, David Howells,
	torvalds, hch, matthew, linux-kernel, linux-arch

 >'struct compat_semaphore'

I really think this data type needs a better name,
one that reflects what it does.

Something like 'struct binary_semaphore' or something.

Cheers

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 14:33             ` Mark Lord
@ 2005-12-13 14:45               ` Arjan van de Ven
  0 siblings, 0 replies; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-13 14:45 UTC (permalink / raw)
  To: Mark Lord
  Cc: Ingo Molnar, Andi Kleen, Andrew Morton, David Howells, torvalds,
	hch, matthew, linux-kernel, linux-arch

On Tue, 2005-12-13 at 09:33 -0500, Mark Lord wrote:
>  >'struct compat_semaphore'
> 
> I really think this data type needs a better name,
> one that reflects what it does.
> 
> Something like 'struct binary_semaphore' or something.

see the thing is.. this is the counting one ;)
the -rt naming is just too confusing (but done to keep patch maintenance
reasonable, which is fair enough for that purpose, but not good enough
for kernel.org)


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:09         ` Andi Kleen
  2005-12-13  9:34           ` Ingo Molnar
@ 2005-12-13  9:37           ` Ingo Molnar
  1 sibling, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:37 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Arjan van de Ven, Andrew Morton, David Howells, torvalds, hch,
	matthew, linux-kernel, linux-arch


* Andi Kleen <ak@suse.de> wrote:

> > it's not _that_ bad, if done overnight. It does not touch any of the 
> > down/up APIs. Touching those would create a monster patch and monster 
> > impact.
> 
> One argument for a full rename (and abandoning the old "struct 
> semaphore" name completely) would be that it would offer a clean break 
> for out tree code, no silent breakage.

btw., in the -rt tree we rarely had 'silent breakage' - roughly 80% of 
the cases were caught build-time: we eliminated DECLARE_MUTEX_LOCKED, 
which is a clear sign for non-mutex semaphore usage. Another 19% was 
caught by runtime checks: 'does owner unlock the mutex'. The remaining 
1% was breakage that was not found quickly.

	Ingo
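
A rough sketch of the kind of runtime check mentioned above ('does owner
unlock the mutex'). It is an assumption-laden illustration, not the -rt
implementation: struct task, struct debug_mutex and both functions are
hypothetical, and the lock operation fails instead of sleeping.

```c
#include <stddef.h>

struct task { int id; };			/* stand-in for a task */
struct debug_mutex { struct task *owner; };	/* NULL when unlocked */

/* Returns 0 on success, -1 if already held (a real mutex would sleep). */
static int debug_mutex_trylock(struct debug_mutex *m, struct task *self)
{
	if (m->owner != NULL)
		return -1;
	m->owner = self;
	return 0;
}

/* Returns 0 on success, -1 if 'self' does not hold the mutex. */
static int debug_mutex_unlock(struct debug_mutex *m, struct task *self)
{
	if (m->owner != self)
		return -1;
	m->owner = NULL;
	return 0;
}
```

A semaphore that is legitimately released by a different context than
the one that acquired it trips the owner test, which is how such a check
flags non-mutex usage.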

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:03       ` Ingo Molnar
  2005-12-13  9:09         ` Andi Kleen
@ 2005-12-13  9:19         ` Arjan van de Ven
  1 sibling, 0 replies; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-13  9:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, David Howells, torvalds, hch, matthew,
	linux-kernel, linux-arch

On Tue, 2005-12-13 at 10:03 +0100, Ingo Molnar wrote:
> * Arjan van de Ven <arjan@infradead.org> wrote:
> 
> > > even better than that, why not use the solution that we've implemented 
> > > for the -rt patchset, more than a year ago?
> > > 
> > > the solution i took was this:
> > > 
> > > - i did not touch the 'struct semaphore' namespace, but introduced a
> > >   'struct compat_semaphore'.
> > 
> > which I think is wrong. THis naming sucks. Sure doing a full sed on 
> > the tree is not pretty but it's also not THAT painful. And the pain of 
> > wrong names is something the kernel needs to carry around for years.
> 
> well, i'm all for renaming struct semaphore to struct mutex, but dont 
> the same arguments apply as to 'struct timer_list'?

don't think so; this is not a "lets do them one by one over the year",
this is a "do them all right now at once" move.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  7:54   ` Ingo Molnar
  2005-12-13  7:58     ` Andi Kleen
  2005-12-13  8:00     ` Arjan van de Ven
@ 2005-12-13  9:02     ` Christoph Hellwig
  2005-12-13  9:39       ` Ingo Molnar
  2005-12-13  9:55     ` Ingo Molnar
  3 siblings, 1 reply; 239+ messages in thread
From: Christoph Hellwig @ 2005-12-13  9:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, David Howells, torvalds, hch, arjan, matthew,
	linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 08:54:41AM +0100, Ingo Molnar wrote:
> - i did not touch the 'struct semaphore' namespace, but introduced a
>   'struct compat_semaphore'.

Because it's totally braindead.  Your compat_semaphore is a real semaphore
and your semaphore is a mutex.  So name them as such.

> - i introduced a 'type-sensitive' macro wrapper that switches down() 
>   (and the other APIs) to either to the assembly variant (if the 
>   variable's type is struct compat_semaphore), or switches it to the new 
>   generic mutex (if the type is struct semaphore), at build-time. There 
>   is no runtime overhead due to this build-time-switching.

And this one is probably a great help in winning obfuscated C contests,
but is otherwise just harmful.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:02     ` Christoph Hellwig
@ 2005-12-13  9:39       ` Ingo Molnar
  2005-12-13 10:00         ` Ingo Molnar
  0 siblings, 1 reply; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:39 UTC (permalink / raw)
  To: Christoph Hellwig, Andrew Morton, David Howells, torvalds, arjan,
	matthew, linux-kernel, linux-arch


* Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Dec 13, 2005 at 08:54:41AM +0100, Ingo Molnar wrote:
> > - i did not touch the 'struct semaphore' namespace, but introduced a
> >   'struct compat_semaphore'.
> 
> Because it's totally braindead.  Your compat_semaphore is a real 
> semaphore and your semaphore is a mutex.  So name them as such.

well, i had the choice between a 30K patch, a 300K patch and a 3000K 
patch. I went for the 30K patch ;-)

> > - i introduced a 'type-sensitive' macro wrapper that switches down() 
> >   (and the other APIs) to either to the assembly variant (if the 
> >   variable's type is struct compat_semaphore), or switches it to the new 
> >   generic mutex (if the type is struct semaphore), at build-time. There 
> >   is no runtime overhead due to this build-time-switching.
> 
> And this one is probably is a great help to win the obsfucated C 
> contests, but otherwise just harmfull.

i never found it to be harmful in any way, and we've now got a year of 
experience with them. Could you elaborate?

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:39       ` Ingo Molnar
@ 2005-12-13 10:00         ` Ingo Molnar
  2005-12-13 17:40           ` Paul Jackson
  2005-12-13 18:34           ` David Howells
  0 siblings, 2 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13 10:00 UTC (permalink / raw)
  To: Christoph Hellwig, Andrew Morton, David Howells, torvalds, arjan,
	matthew, linux-kernel, linux-arch


* Ingo Molnar <mingo@elte.hu> wrote:

> > On Tue, Dec 13, 2005 at 08:54:41AM +0100, Ingo Molnar wrote:
> > > - i did not touch the 'struct semaphore' namespace, but introduced a
> > >   'struct compat_semaphore'.
> > 
> > Because it's totally braindead.  Your compat_semaphore is a real 
> > semaphore and your semaphore is a mutex.  So name them as such.
> 
> well, i had the choice between a 30K patch, a 300K patch and a 3000K 
> patch. I went for the 30K patch ;-)

in that sense i'm all for going for the 300K patch, which is roughly the 
direction David is heading into: rename to 'struct mutex' but keep the 
down/up APIs, and introduce sem_down()/sem_up() for the cases that need 
full semaphores.

i don't think the 3000K patch (full API rename, introduction of 
mutex_down()/mutex_up()) is realistic.

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:00         ` Ingo Molnar
@ 2005-12-13 17:40           ` Paul Jackson
  2005-12-13 18:34           ` David Howells
  1 sibling, 0 replies; 239+ messages in thread
From: Paul Jackson @ 2005-12-13 17:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: hch, akpm, dhowells, torvalds, arjan, matthew, linux-kernel, linux-arch

If we are doing global rename patches, could we make one of the
deliverables a sed or perl script that exactly produces the patch,
suitable for running one-time on out-of-kernel trees?  Add the script
in the kernel scripts directory.

It is usually too easy to produce a nearly correct script, and too
difficult to produce an exactly right one, for all but serious sed or
perl regex hackers.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:00         ` Ingo Molnar
  2005-12-13 17:40           ` Paul Jackson
@ 2005-12-13 18:34           ` David Howells
  2005-12-13 22:31               ` Paul Jackson
                               ` (2 more replies)
  1 sibling, 3 replies; 239+ messages in thread
From: David Howells @ 2005-12-13 18:34 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Ingo Molnar, hch, akpm, dhowells, torvalds, arjan, matthew,
	linux-kernel, linux-arch

Paul Jackson <pj@sgi.com> wrote:

> It is usually too easy to produce a nearly correct script, and too
> difficult to produce an exactly right one, for all but serious sed or
> perl regex hackers.

I'd be especially impressed if you can get it to also analyse the context in
which the semaphore is used and determine whether or not it should be a
counting semaphore, a mutex or a completion. You can probably do this sort of
thing with Perl regexes... they seem to be terrifically[*] powerful.

 [*] and I mean that in the proper sense:-)

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 18:34           ` David Howells
@ 2005-12-13 22:31               ` Paul Jackson
  2005-12-14 11:02             ` David Howells
  2005-12-14 11:12             ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: Paul Jackson @ 2005-12-13 22:31 UTC (permalink / raw)
  To: David Howells
  Cc: mingo, hch, akpm, dhowells, torvalds, arjan, matthew,
	linux-kernel, linux-arch

> I'd be especially impressed if you can get it to also analyse the context in
> which the semaphore is used and determine whether or not it should be a
> counting semaphore, a mutex or a completion

That would impress me too, if I could do that.

I think that is well beyond my humble capabilities.

The sed/perl script to make the textual change should be practical.
Indeed, I would claim that the initial big patch -should- be done
that way.  Keep refining a sed script until manual inspection and
trial builds of all arch's, allconfig, show that it seems to be right.
Each time you find an error doing this, don't manually edit the
kernel source; rather refine the script and try applying it again.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 18:34           ` David Howells
  2005-12-13 22:31               ` Paul Jackson
@ 2005-12-14 11:02             ` David Howells
  2005-12-14 11:12             ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-14 11:02 UTC (permalink / raw)
  To: Paul Jackson
  Cc: David Howells, mingo, hch, akpm, torvalds, arjan, matthew,
	linux-kernel, linux-arch

Paul Jackson <pj@sgi.com> wrote:

> The sed/perl script to make the textual change should be practical.
> Indeed, I would claim that the initial big patch -should- be done
> that way.  Keep refining a sed script until manual inspection and
> trial builds of all arch's, allconfig, show that it seems to be right.
> Each time you find an error doing this, don't manually edit the
> kernel source; rather refine the script and try applying it again.

Actually, you may have a point.

If the order of patches is:

 (1) Create new mutex as struct mutex/up_mutex/down_mutex, say.

 (2) Make counting semaphore implementation struct semaphore/up_sem/down_sem.

 (3) Convert uses of semaphores that should be completions into completions.

 (4) Convert uses of semaphores that should be counting semaphores to use
     up_sem/down_sem.

 (5) Mass convert by script all the remaining ups and downs into up_mutex and
     down_mutex.

 (6) Make wrappers for up/down that map to counting semaphores with the
     deprecation attribute set.

That might work, and would be a lot easier; except for the humongous patch
generated at step 5 - which could be regenerated by script. I think I can make
a simple perl script to do that.

Note that I am assuming above that down == down/down_trylock/down_interruptible
for clarity.

David
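
Step (6) above could look roughly like the following. This is a sketch
under assumptions, not code from any posted patch: the struct layout is
a toy, the operations do not block, and only GCC's deprecated function
attribute is real.

```c
/* Toy counting semaphore; the real one sleeps when count reaches zero. */
struct semaphore { int count; };

static inline void down_sem(struct semaphore *sem) { sem->count--; }
static inline void up_sem(struct semaphore *sem) { sem->count++; }

/* Old names kept as thin wrappers.  Every remaining caller now gets a
 * compile-time deprecation warning pointing at the call site. */
static inline void __attribute__((deprecated))
down(struct semaphore *sem)
{
	down_sem(sem);
}

static inline void __attribute__((deprecated))
up(struct semaphore *sem)
{
	up_sem(sem);
}
```

Out-of-tree code calling down()/up() keeps building and behaving as a
counting semaphore, but is warned until the wrappers are removed.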

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 18:34           ` David Howells
  2005-12-13 22:31               ` Paul Jackson
  2005-12-14 11:02             ` David Howells
@ 2005-12-14 11:12             ` David Howells
  2005-12-14 11:18               ` Alan Cox
  2005-12-14 12:35                 ` David Howells
  2 siblings, 2 replies; 239+ messages in thread
From: David Howells @ 2005-12-14 11:12 UTC (permalink / raw)
  To: David Howells
  Cc: Paul Jackson, mingo, hch, akpm, torvalds, arjan, matthew,
	linux-kernel, linux-arch

David Howells <dhowells@redhat.com> wrote:

> 
>  (6) Make wrappers for up/down that map to counting semaphores with the
>      deprecation attribute set.

 (7) After a couple of months, remove up and down entirely.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:12             ` David Howells
@ 2005-12-14 11:18               ` Alan Cox
  2005-12-14 12:35                 ` David Howells
  1 sibling, 0 replies; 239+ messages in thread
From: Alan Cox @ 2005-12-14 11:18 UTC (permalink / raw)
  To: David Howells
  Cc: Paul Jackson, mingo, hch, akpm, torvalds, arjan, matthew,
	linux-kernel, linux-arch

On Mer, 2005-12-14 at 11:12 +0000, David Howells wrote:
> David Howells <dhowells@redhat.com> wrote:
> 
> > 
> >  (6) Make wrappers for up/down that map to counting semaphores with the
> >      deprecation attribute set.
> 
>  (7) After a couple of months, remove up and down entirely.

Why bother? As has already been discussed, up and down are the natural
and normal names for counting semaphores. You don't need to obsolete the
old API, that's just silly; you need to add a new one and wait for people
to use it.

The old API is still very useful for some applications that want
counting semaphores.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:12             ` David Howells
@ 2005-12-14 12:35                 ` David Howells
  2005-12-14 12:35                 ` David Howells
  1 sibling, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-14 12:35 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Howells, Paul Jackson, mingo, hch, akpm, torvalds, arjan,
	matthew, linux-kernel, linux-arch

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> Why bother. As has already been discussed up and down are the natural
> and normal names for counting semaphores. You don't need to obsolete the
> old API thats just silly, you need to add a new one and wait for people
> to use it.

The vast majority of ups and downs are actually mutex related not semaphore
related, so by majority share, up/down perhaps ought to be repurposed to
mutexes: they _are_ the preeminent uses.

From my modified tree, I see:

	semaphore	up	down	down_in	down_try
	Counting	41	59	1	0
	Mutex		4405	2824	362	107

> The old API is still very useful for some applications that want
> counting semaphores.

Whilst that is true, they're in a small minority, and it'd be easier to change
them.

David
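
To put a number on "small minority", the table above can be totalled as
follows. The figures are copied from the table; the function names are
hypothetical, and reading down_in and down_try as down_interruptible and
down_trylock is an assumption.

```c
/* Call-site counts from the table: up, down, down_in, down_try. */
static const int counting_sites[4] = { 41, 59, 1, 0 };
static const int mutex_sites[4] = { 4405, 2824, 362, 107 };

static int sum4(const int v[4])
{
	return v[0] + v[1] + v[2] + v[3];
}

/* Mutex share of all call sites, in parts per thousand (rounded down). */
static int mutex_share_permille(void)
{
	int counting = sum4(counting_sites);	/* 101 */
	int mutex = sum4(mutex_sites);		/* 7698 */

	return (mutex * 1000) / (counting + mutex);
}
```

That is 101 counting call sites against 7698 mutex ones, i.e. roughly
98.7% of the up/down-style calls are mutex uses.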

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 12:35                 ` David Howells
  (?)
@ 2005-12-14 13:58                 ` Thomas Gleixner
  2005-12-14 23:40                   ` Mark Lord
  -1 siblings, 1 reply; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-14 13:58 UTC (permalink / raw)
  To: David Howells
  Cc: Alan Cox, Paul Jackson, mingo, hch, akpm, torvalds, arjan,
	matthew, linux-kernel, linux-arch

On Wed, 2005-12-14 at 12:35 +0000, David Howells wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> 
> > Why bother? As has already been discussed, up and down are the natural
> > and normal names for counting semaphores. You don't need to obsolete the
> > old API, that's just silly; you need to add a new one and wait for people
> > to use it.
> 
> The vast majority of ups and downs are actually mutex related not semaphore
> related, so by majority share, up/down perhaps ought to be repurposed to
> mutexes: they _are_ the preeminent uses.
> 
> From my modified tree, I see:
> 
> 	semaphore	up	down	down_in	down_try
> 	Counting	41	59	1	0
> 	Mutex		4405	2824	362	107
> 
> > The old API is still very useful for some applications that want
> > counting semaphores.
> 
> Whilst that is true, they're in a small minority, and it'd be easier to change
> them.

You can do a full scripted rename of up/down to the mutex API and then
fix up the 100 places used by semaphores manually.

	tglx



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 13:58                 ` Thomas Gleixner
@ 2005-12-14 23:40                   ` Mark Lord
  2005-12-14 23:54                     ` Andrew Morton
                                       ` (2 more replies)
  0 siblings, 3 replies; 239+ messages in thread
From: Mark Lord @ 2005-12-14 23:40 UTC (permalink / raw)
  To: tglx
  Cc: David Howells, Alan Cox, Paul Jackson, mingo, hch, akpm,
	torvalds, arjan, matthew, linux-kernel, linux-arch

Thomas Gleixner wrote:
>
> You can do a full scripted rename of up/down to the mutex API and then
> fix up the 100 places used by semaphores manually.

Again, folks, this only works for current in-tree kernel code.

There are huge amounts of kernel code out-of-tree that still use
up/down as (or potentially as) counting semaphores.

Yes, some of that code is closed-source, but most of it is open-source
stuff in people's "queues", such as the network patch-o-matic queue
and other stuff.  Lots of open-source out-of-tree drivers, too.

Re-using the existing up()/down() names for a new purpose is
a very very Bad Idea.  Removing up()/down() entirely is not quite so bad,
because at least then people will eventually notice the change.

Leaving up()/down() as-is is really the most sensible option.

Cheers

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 23:40                   ` Mark Lord
@ 2005-12-14 23:54                     ` Andrew Morton
  2005-12-15 13:41                       ` Nikita Danilov
  2005-12-15 14:41                       ` Steven Rostedt
  2005-12-14 23:57                     ` Thomas Gleixner
  2005-12-15 15:37                     ` David Howells
  2 siblings, 2 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-14 23:54 UTC (permalink / raw)
  To: Mark Lord
  Cc: tglx, dhowells, alan, pj, mingo, hch, torvalds, arjan, matthew,
	linux-kernel, linux-arch

Mark Lord <lkml@rtr.ca> wrote:
>
> Leaving up()/down() as-is is really the most sensible option.
>

Absolutely.

I must say that my interest in this stuff is down in
needs-an-electron-microscope-to-locate territory.  down() and up() work
just fine and they're small, efficient, well-debugged and well-understood. 
We need a damn good reason for taking on tree-wide churn or incompatible
renames or addition of risk.  What's the damn good reason here?

Please.  Go fix some bugs.  We're not short of them.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 23:54                     ` Andrew Morton
@ 2005-12-15 13:41                       ` Nikita Danilov
  2005-12-15 14:56                         ` Alan Cox
  2005-12-15 14:41                       ` Steven Rostedt
  1 sibling, 1 reply; 239+ messages in thread
From: Nikita Danilov @ 2005-12-15 13:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: tglx, dhowells, alan, pj, mingo, hch, torvalds, arjan, matthew,
	linux-kernel, linux-arch

Andrew Morton writes:
 > Mark Lord <lkml@rtr.ca> wrote:
 > >
 > > Leaving up()/down() as-is is really the most sensible option.
 > >
 > 
 > Absolutely.
 > 
 > I must say that my interest in this stuff is down in
 > needs-an-electron-microscope-to-locate territory.  down() and up() work
 > just fine and they're small, efficient, well-debugged and well-understood. 
 > We need a damn good reason for taking on tree-wide churn or incompatible
 > renames or addition of risk.  What's the damn good reason here?
 > 
 > Please.  Go fix some bugs.  We're not short of them.

But this change is about fixing bugs: mutex assumes that

 - only owner can unlock, and

 - owner cannot lock (immediate self-deadlock).

This can be checked by the debugging code, and yes, these kinds of
errors do happen.

Not to mention that by looking at

        struct foo_bar_baz {
                struct mutex fbb_mutex;
                ...
        };

one can instantly infer that ->fbb_mutex is used to serialize something
rather than serving as some fancy signaling mechanism.

Nikita.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 13:41                       ` Nikita Danilov
@ 2005-12-15 14:56                         ` Alan Cox
  2005-12-15 15:52                           ` Nikita Danilov
  2005-12-15 15:55                           ` David Howells
  0 siblings, 2 replies; 239+ messages in thread
From: Alan Cox @ 2005-12-15 14:56 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Andrew Morton, tglx, dhowells, pj, mingo, hch, torvalds, arjan,
	matthew, linux-kernel, linux-arch

On Iau, 2005-12-15 at 16:41 +0300, Nikita Danilov wrote:
> But this change is about fixing bugs: mutex assumes that
> 
>  - only owner can unlock, and
> 
>  - owner cannot lock (immediate self-deadlock).

So add mutex_up/mutex_down that use the same semaphores but do extra
checks if lock debugging is enabled. All you need is an owner field for
debugging.

Now generate a trace dump on up when up and to check for sleeping on a
lock you already hold (for both sem and mutex).

Alan


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 14:56                         ` Alan Cox
@ 2005-12-15 15:52                           ` Nikita Danilov
  2005-12-15 16:50                             ` Christopher Friesen
  2005-12-15 15:55                           ` David Howells
  1 sibling, 1 reply; 239+ messages in thread
From: Nikita Danilov @ 2005-12-15 15:52 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, tglx, dhowells, pj, mingo, hch, torvalds, arjan,
	matthew, linux-kernel, linux-arch

Alan Cox writes:
 > On Iau, 2005-12-15 at 16:41 +0300, Nikita Danilov wrote:
 > > But this change is about fixing bugs: mutex assumes that
 > > 
 > >  - only owner can unlock, and
 > > 
 > >  - owner cannot lock (immediate self-deadlock).
 > 
 > So add mutex_up/mutex_down that use the same semaphores but do extra
 > checks if lock debugging is enabled. All you need is an owner field for
 > debugging.

And to convert almost all calls to down/up to mutex_{down,up}. At which
point, it no longer makes sense to share the same data-type for
semaphore and mutex.

Also (as was already mentioned several times), having a separate data-type
for mutex makes code easier to understand, as it specifies the intended
usage.

To avoid duplicating code, mutex can be implemented on top of semaphore,
like

struct mutex {
        struct semaphore sema;
#ifdef DEBUG_MUTEX
        void *owner;
#endif
};

or something similar.
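
A minimal user-space sketch of how such a wrapper could enforce the two
rules (only the owner unlocks; the owner never locks twice) follows. The
toy struct semaphore, the status codes, and the mutex_*_checked() names
are assumptions made up for illustration, not anything from the posted
patches. The owner field is always present here, where the struct above
guards it with DEBUG_MUTEX, and the lock is trylock-style so the example
never blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's counting semaphore (assumption). */
struct semaphore {
	int count;
};

struct mutex {
	struct semaphore sema;
	const void *owner;	/* debug field: NULL while unlocked */
};

enum mutex_status {
	MUTEX_OK,
	MUTEX_BUSY,		/* held by somebody else */
	MUTEX_SELF_DEADLOCK,	/* the owner tried to lock again */
	MUTEX_NOT_OWNER,	/* unlock by a task that does not hold it */
};

static void mutex_init(struct mutex *m)
{
	m->sema.count = 1;
	m->owner = NULL;
}

/* Non-blocking on purpose so the sketch can run in user space. */
static enum mutex_status mutex_trylock_checked(struct mutex *m,
					       const void *task)
{
	assert(task != NULL);	/* NULL is reserved for "no owner" */
	if (m->owner == task)
		return MUTEX_SELF_DEADLOCK;
	if (m->sema.count <= 0)
		return MUTEX_BUSY;
	m->sema.count--;
	m->owner = task;
	return MUTEX_OK;
}

static enum mutex_status mutex_unlock_checked(struct mutex *m,
					      const void *task)
{
	assert(task != NULL);
	if (m->owner != task)
		return MUTEX_NOT_OWNER;
	m->owner = NULL;
	m->sema.count++;
	return MUTEX_OK;
}
```

A real implementation would presumably report the violation (a stack dump,
say) rather than return a status code. The point is only that one extra
pointer is enough to detect both kinds of misuse.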

 > 
 > Now generate a trace dump on up when up and to check for sleeping on a
 > lock you already hold (for both sem and mutex).

Sleeping on a semaphore "held" by the current thread is perfectly
reasonable usage of a generic counting semaphore, as it can be upped by
another thread.

 > 
 > Alan

Nikita.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 15:52                           ` Nikita Danilov
@ 2005-12-15 16:50                             ` Christopher Friesen
  2005-12-15 20:53                               ` Steven Rostedt
  0 siblings, 1 reply; 239+ messages in thread
From: Christopher Friesen @ 2005-12-15 16:50 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Alan Cox, Andrew Morton, tglx, dhowells, pj, mingo, hch,
	torvalds, arjan, matthew, linux-kernel, linux-arch

Nikita Danilov wrote:

> And to convert almost all calls to down/up to mutex_{down,up}. At which
> point, it no longer makes sense to share the same data-type for
> semaphore and mutex.

If we're going to call it a mutex, it would make sense to use familiar 
terminology and call it lock/unlock rather than down/up.

Chris


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:50                             ` Christopher Friesen
@ 2005-12-15 20:53                               ` Steven Rostedt
  0 siblings, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-15 20:53 UTC (permalink / raw)
  To: Christopher Friesen
  Cc: linux-arch, linux-kernel, matthew, arjan, torvalds, hch, mingo,
	pj, dhowells, tglx, Andrew Morton, Alan Cox, Nikita Danilov

On Thu, 2005-12-15 at 10:50 -0600, Christopher Friesen wrote:
> Nikita Danilov wrote:
> 
> > And to convert almost all calls to down/up to mutex_{down,up}. At which
> > point, it no longer makes sense to share the same data-type for
> > semaphore and mutex.
> 
> If we're going to call it a mutex, it would make sense to use familiar 
> terminology and call it lock/unlock rather than down/up.

ACK!

-- Steve


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 14:56                         ` Alan Cox
  2005-12-15 15:52                           ` Nikita Danilov
@ 2005-12-15 15:55                           ` David Howells
  2005-12-15 16:22                               ` linux-os (Dick Johnson)
                                               ` (4 more replies)
  1 sibling, 5 replies; 239+ messages in thread
From: David Howells @ 2005-12-15 15:55 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Alan Cox, Andrew Morton, tglx, dhowells, pj, mingo, hch,
	torvalds, arjan, matthew, linux-kernel, linux-arch

Nikita Danilov <nikita@clusterfs.com> wrote:

> And to convert almost all calls to down/up to mutex_{down,up}. At which
> point, it no longer makes sense to share the same data-type for
> semaphore and mutex.

But what to do about DECLARE_MUTEX? :-/

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 15:55                           ` David Howells
@ 2005-12-15 16:22                               ` linux-os (Dick Johnson)
  2005-12-15 16:28                             ` Linus Torvalds
                                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 239+ messages in thread
From: linux-os (Dick Johnson) @ 2005-12-15 16:22 UTC (permalink / raw)
  To: David Howells
  Cc: Nikita Danilov, Alan Cox, Andrew Morton, tglx, pj, mingo, hch,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Thu, 15 Dec 2005, David Howells wrote:

> Nikita Danilov <nikita@clusterfs.com> wrote:
>
>> And to convert almost all calls to down/up to mutex_{down,up}. At which
>> point, it no longer makes sense to share the same data-type for
>> semaphore and mutex.
>
> But what to do about DECLARE_MUTEX? :-/
>
> David

Isn't "struct semaphore" already an opaque type? Nobody, except
the optimizer wizards, should even care what's in them. They are
already manipulated with init_MUTEX, up, down, etc. There shouldn't
be any code changes if the actual internal workings are changed.

If some code is peeking into the internal workings, then it's
broken. Don't break the whole kernel by a name-change. Sharing
the same data-type, as long as there are no alignment problems,
has no negative impact at all. If there is stuff inside those
structures that is not used for a particular instance, who cares?
Somebody doing debugging? If they are doing kernel debugging,
they should know what they are doing, you don't dumb-down the
kernel to the lowest common denominator because there may be
different structure members used for different purposes!

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.56 BogoMips).
Warning : 98.36% of all statistics are fiction.
.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 15:55                           ` David Howells
  2005-12-15 16:22                               ` linux-os (Dick Johnson)
@ 2005-12-15 16:28                             ` Linus Torvalds
  2005-12-15 17:04                               ` Thomas Gleixner
                                                 ` (2 more replies)
  2005-12-15 16:51                             ` David Howells
                                               ` (2 subsequent siblings)
  4 siblings, 3 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-15 16:28 UTC (permalink / raw)
  To: David Howells
  Cc: Nikita Danilov, Alan Cox, Andrew Morton, tglx, pj, mingo, hch,
	arjan, matthew, linux-kernel, linux-arch

On Thu, 15 Dec 2005, David Howells wrote:
> 
> But what to do about DECLARE_MUTEX? :-/

It's correctly named right now (it _does_ declare a mutex, despite the 
insane noise from the sidelines).

I would suggest that if you create a new "mutex" type, you just keep the 
lower-case name. Don't re-use the DECLARE_MUTEX format, just do

	struct mutex my_mutex = UNLOCKED_MUTEX;

for new code that uses the new stuff.

Think about it a bit. We don't have DECLARE_SPINLOCK either. Why?

Hint: we have DECLARE_MUTEX exactly because it's also DOCUMENTATION that 
we use a semaphore as a pure binary mutex. Not because we need it.

If you create a real "struct mutex", then something like the current 
DECLARE_MUTEX() is simply not relevant for the new type.

			Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:28                             ` Linus Torvalds
@ 2005-12-15 17:04                               ` Thomas Gleixner
  2005-12-15 17:09                               ` Paul Jackson
  2005-12-15 17:17                               ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-15 17:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Howells, Nikita Danilov, Alan Cox, Andrew Morton, pj,
	Ingo Molnar, hch, arjan, matthew, linux-kernel, linux-arch

On Thu, 2005-12-15 at 08:28 -0800, Linus Torvalds wrote:
> I would suggest that if you create a new "mutex" type, you just keep the 
> lower-case name. Don't re-use the DECLARE_MUTEX format, just do
> 
> 	struct mutex my_mutex = UNLOCKED_MUTEX;
> 
> for new code that uses the new stuff.
> 
> Think about it a bit. We don't have DECLARE_SPINLOCK either. Why?

Well, we have DEFINE_SPINLOCK() and we should have a matching one for
mutexes, DEFINE_MUTEX().

The reason is that you can implement complex initialization for
debugging or extensions which can't be done by a var = INITIALIZER,
because you don't have a reference to var.
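
To make that concrete, here is a rough sketch of a definition macro whose
initializer needs the variable itself: once for a self-linked wait list
and once for a debug name. The struct layout, MUTEX_INITIALIZER(), and
mutex_debug_check_unlocked() are hypothetical names chosen for the
example; only DEFINE_MUTEX() is the name proposed above.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical layout, for illustration only. */
struct mutex {
	int count;			/* 1 = unlocked, 0 = locked */
	struct list_head wait_list;	/* an empty list points at itself */
	const char *name;		/* debug: name of the variable */
};

/* The initializer takes the variable so it can use its address and name. */
#define MUTEX_INITIALIZER(var) \
	{ 1, { &(var).wait_list, &(var).wait_list }, #var }

#define DEFINE_MUTEX(var) \
	struct mutex var = MUTEX_INITIALIZER(var)

DEFINE_MUTEX(my_mutex);

/* Debug helper: a freshly defined mutex is unlocked with no waiters. */
static void mutex_debug_check_unlocked(const struct mutex *m)
{
	assert(m->count == 1);
	assert(m->wait_list.next == &m->wait_list);
	assert(m->wait_list.prev == &m->wait_list);
}
```

A plain "= UNLOCKED_MUTEX" initializer has no way to produce the
self-referencing pointers or the name string, which is the limitation
described above.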

	tglx



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:28                             ` Linus Torvalds
  2005-12-15 17:04                               ` Thomas Gleixner
@ 2005-12-15 17:09                               ` Paul Jackson
  2005-12-15 17:17                               ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: Paul Jackson @ 2005-12-15 17:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: dhowells, nikita, alan, akpm, tglx, mingo, hch, arjan, matthew,
	linux-kernel, linux-arch

Linus wrote:
> Hint: we have DECLARE_MUTEX exactly because it's also DOCUMENTATION that 
> we use a semaphore as a pure binary mutex. Not because we need it.

That's insane ... 

This is stealth documentation at its finest.  Who besides Linus even
knew that's what this spelling of the DECLARE macro was telling us?

  Paul "Hand me that chain saw, Billy Jo.  This limb is coming -down-" Jackson

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 16:28                             ` Linus Torvalds
  2005-12-15 17:04                               ` Thomas Gleixner
  2005-12-15 17:09                               ` Paul Jackson
@ 2005-12-15 17:17                               ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-15 17:17 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Linus Torvalds, dhowells, nikita, alan, akpm, tglx, mingo, hch,
	arjan, matthew, linux-kernel, linux-arch

Paul Jackson <pj@sgi.com> wrote:

> > Hint: we have DECLARE_MUTEX exactly because it's also DOCUMENTATION that 
> > we use a semaphore as a pure binary mutex. Not because we need it.
> 
> That's insane ... 

And abused/misused...

> This is stealth documentation at its finest.  Who besides Linus even
> knew that's what this spelling of the DECLARE macro was telling us?
> 
>   Paul "Hand me that chain saw, Billy Jo.  This limb is coming -down-" Jackson

I hope you're talking about trees...

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 15:55                           ` David Howells
  2005-12-15 16:22                               ` linux-os (Dick Johnson)
  2005-12-15 16:28                             ` Linus Torvalds
@ 2005-12-15 16:51                             ` David Howells
  2005-12-15 16:56                               ` Paul Jackson
  2005-12-15 17:28                             ` David Howells
  4 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-15 16:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Howells, Nikita Danilov, Alan Cox, Andrew Morton, tglx, pj,
	mingo, hch, arjan, matthew, linux-kernel, linux-arch

Linus Torvalds <torvalds@osdl.org> wrote:

> Think about it a bit. We don't have DECLARE_SPINLOCK either. Why?

I thought it was something to do with initialising struct list_heads, which
spinlocks don't have.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 15:55                           ` David Howells
@ 2005-12-15 16:56                               ` Paul Jackson
  2005-12-15 16:28                             ` Linus Torvalds
                                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 239+ messages in thread
From: Paul Jackson @ 2005-12-15 16:56 UTC (permalink / raw)
  To: David Howells
  Cc: nikita, alan, akpm, tglx, dhowells, mingo, hch, torvalds, arjan,
	matthew, linux-kernel, linux-arch

> But what to do about DECLARE_MUTEX? :-/

A phased change of just the renames:
	DECLARE_MUTEX ==> DECLARE_SEM
	init_MUTEX ==> init_SEM
	DECLARE_MUTEX_LOCKED ==> DECLARE_SEM_LOCKED
	init_MUTEX_LOCKED ==> init_SEM_LOCKED

seems doable.  A scripted replacement, so long as it specifies whole
word replacement only, seems to be a very robust replacement for these
four symbols, unlike "up"/"down", which are scary at best to consider
wholesale replacement.

Add the new *_SEM in one release as aliases for the current *_MUTEX,
do the wholesale replacement of the above names, leaving the old as
aliases in a second release, remove the old *_MUTEX aliases in a third
release, and then restore them as new 'real mutex' methods in a fourth
release.  Be sure that the new *_MUTEX versions will generate a compile
error if handed the old counting semaphore type.
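
One possible way to get that compile error, sketched under the assumption
that the mutex is a distinct struct type and that a C11 compiler is
available (the kernel of this era is not built as C11, so treat this
purely as an illustration). IS_MUTEX() and this mutex_lock() macro are
made-up names.

```c
#include <assert.h>	/* static_assert (C11) */

struct semaphore { int count; };	/* old counting type (toy) */
struct mutex { int locked; };		/* new, distinct type (toy) */

/* Evaluates to 1 only for a pointer to struct mutex. */
#define IS_MUTEX(p) _Generic((p), struct mutex *: 1, default: 0)

/* Refuses to compile when handed anything but a struct mutex pointer. */
#define mutex_lock(p)							\
	do {								\
		static_assert(IS_MUTEX(p),				\
			      "mutex_lock() needs a struct mutex *");	\
		(p)->locked = 1;					\
	} while (0)
```

With these definitions, mutex_lock(&some_semaphore) stops the build at the
static_assert, while mutex_lock(&some_mutex) compiles as usual. A plain
function taking a struct mutex * gives a similar diagnostic for a
mismatched pointer, though older compilers only warn about it.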

I'm a stickler for names ... at least until Linus/Andrew show me
the foolishness of my ways, I could find such a change appealing.

Of course, they're the ones with all the sweat equity on the line,
not me.

... I'd better duck and get back to bug fixing, before I get hit ...

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 15:55                           ` David Howells
                                               ` (3 preceding siblings ...)
  2005-12-15 16:56                               ` Paul Jackson
@ 2005-12-15 17:28                             ` David Howells
  2005-12-15 17:48                               ` Linus Torvalds
  4 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-15 17:28 UTC (permalink / raw)
  To: Paul Jackson
  Cc: David Howells, nikita, alan, akpm, tglx, mingo, hch, torvalds,
	arjan, matthew, linux-kernel, linux-arch

Paul Jackson <pj@sgi.com> wrote:

> 
> A phased change of just the renames:
> 	DECLARE_MUTEX ==> DECLARE_SEM
> 	init_MUTEX ==> init_SEM
> 	DECLARE_MUTEX_LOCKED ==> DECLARE_SEM_LOCKED
> 	init_MUTEX_LOCKED ==> init_SEM_LOCKED

I'd prefer:

	FROM				TO
	==============================	=========================
	DECLARE_MUTEX			DECLARE_SEM_MUTEX
	DECLARE_MUTEX_LOCKED		DECLARE_SEM_MUTEX_LOCKED
	Proper counting semaphore	DECLARE_SEM

That way people can show their intent and can be seen more easily when
violating it.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 17:28                             ` David Howells
@ 2005-12-15 17:48                               ` Linus Torvalds
  2005-12-15 18:20                                 ` Nikita Danilov
  2005-12-15 19:21                                 ` Andrew Morton
  0 siblings, 2 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-15 17:48 UTC (permalink / raw)
  To: David Howells
  Cc: Paul Jackson, nikita, alan, akpm, tglx, mingo, hch, arjan,
	matthew, linux-kernel, linux-arch

On Thu, 15 Dec 2005, David Howells wrote:
> 
> 	FROM				TO
> 	==============================	=========================
> 	DECLARE_MUTEX			DECLARE_SEM_MUTEX
> 	DECLARE_MUTEX_LOCKED		DECLARE_SEM_MUTEX_LOCKED
> 	Proper counting semaphore	DECLARE_SEM

That sounds fine. I wouldn't be adverse to doing that - but it would have 
to be independently of any other changes, and it would need to simmer for 
a while for out-of-tree drivers etc to notice (ie you should _not_ just 
introduce a new "DECLARE_MUTEX()" immediately to confuse things).

The patch could probably be fairly trivially generated with some trivial 
sed-script. Not that I'll take it at this point, but after the next 
release..

		Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 17:48                               ` Linus Torvalds
@ 2005-12-15 18:20                                 ` Nikita Danilov
  2005-12-15 20:58                                   ` Steven Rostedt
  2005-12-15 19:21                                 ` Andrew Morton
  1 sibling, 1 reply; 239+ messages in thread
From: Nikita Danilov @ 2005-12-15 18:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Howells, Paul Jackson, alan, akpm, tglx, mingo, hch, arjan,
	matthew, linux-kernel, linux-arch

Linus Torvalds writes:
 > 
 > 
 > On Thu, 15 Dec 2005, David Howells wrote:
 > > 
 > > 	FROM				TO
 > > 	==============================	=========================
 > > 	DECLARE_MUTEX			DECLARE_SEM_MUTEX
 > > 	DECLARE_MUTEX_LOCKED		DECLARE_SEM_MUTEX_LOCKED
 > > 	Proper counting semaphore	DECLARE_SEM
 > 
 > That sounds fine. I wouldn't be adverse to doing that - but it would have 
 > to be independently of any other changes, and it would need to simmer for 
 > a while for out-of-tree drivers etc to notice (ie you should _not_ just 
 > introduce a new "DECLARE_MUTEX()" immediately to confuse things).

Going off at a tangent (or tangle, rather), why do we need DECLARE_FOO()
macros at all? They

 - do not look like C variable declarations, hide variable type, and
 hence are confusing,

 - contrary to their naming actually _define_ rather than _declare_ an
 object.

In most cases 

        type var = INIT_FOO;

is much better (more readable and easier to understand) than

        DECLARE_FOO(var); /* what is the type of var? */

In the cases where initializer needs an address of object being
initialized

        type var = INIT_FOO(var);

can be used.

Nikita.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 18:20                                 ` Nikita Danilov
@ 2005-12-15 20:58                                   ` Steven Rostedt
  0 siblings, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-15 20:58 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: linux-arch, linux-kernel, matthew, arjan, hch, mingo, tglx, akpm,
	alan, Paul Jackson, David Howells, Linus Torvalds

On Thu, 2005-12-15 at 21:20 +0300, Nikita Danilov wrote:

> Going off at a tangent (or tangle, rather), why do we need DECLARE_FOO()
> macros at all? They
> 
>  - do not look like C variable declarations, hide variable type, and
>  hence are confusing,
> 
>  - contrary to their naming actually _define_ rather than _declare_ an
>  object.
> 
> In most cases 
> 
>         type var = INIT_FOO;
> 
> is much better (more readable and easier to understand) than
> 
>         DECLARE_FOO(var); /* what is the type of var? */
> 
> In the cases where the initializer needs the address of the object being
> initialized
> 
>         type var = INIT_FOO(var);
> 
> can be used.

That's just error prone.  In the RT patch we had several bugs caused by
cut and paste errors like:

type foo = INIT_TYPE(foo);
type bar = INIT_TYPE(foo);

These are not always easy to find.

-- Steve
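
A minimal sketch of the cut-and-paste failure described above, using
hypothetical userspace stand-ins (struct foo, INIT_FOO, DEFINE_FOO and
foo_list_is_self are assumptions for illustration, not kernel code). The
mistyped initializer compiles without any diagnostic, because the
address of another object of the same type is a perfectly valid
initializer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, for illustration only. */
struct list_head {
	struct list_head *next, *prev;
};

struct foo {
	int count;
	struct list_head wait;
};

#define INIT_FOO(name) \
	{ .count = 1, .wait = { &(name).wait, &(name).wait } }

/* The wrapper names the object once, so the two uses cannot diverge. */
#define DEFINE_FOO(name) \
	struct foo name = INIT_FOO(name)

struct foo first = INIT_FOO(first);
struct foo second = INIT_FOO(first);	/* pasted line, argument not updated */
DEFINE_FOO(third);

/* A correctly initialized empty list points back at its own head. */
static int foo_list_is_self(const struct foo *f)
{
	assert(f != NULL);
	return f->wait.next == &f->wait && f->wait.prev == &f->wait;
}
```

Here "second" silently shares the list head of "first", which is the
kind of bug that only shows up much later at run time.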




* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 17:48                               ` Linus Torvalds
  2005-12-15 18:20                                 ` Nikita Danilov
@ 2005-12-15 19:21                                 ` Andrew Morton
  2005-12-15 19:38                                   ` Linus Torvalds
  2005-12-15 20:28                                   ` Steven Rostedt
  1 sibling, 2 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-15 19:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: dhowells, pj, nikita, alan, tglx, mingo, hch, arjan, matthew,
	linux-kernel, linux-arch

Linus Torvalds <torvalds@osdl.org> wrote:
>
> On Thu, 15 Dec 2005, David Howells wrote:
>  > 
>  > 	FROM				TO
>  > 	==============================	=========================
>  > 	DECLARE_MUTEX			DECLARE_SEM_MUTEX
>  > 	DECLARE_MUTEX_LOCKED		DECLARE_SEM_MUTEX_LOCKED
>  > 	Proper counting semaphore	DECLARE_SEM
> 
>  That sounds fine.

They should be renamed to DEFINE_* while we're there.  A "declaration" is
"this thing is defined somewhere else".  A "definition" is "this thing is
defined here".

> I wouldn't be adverse to doing that

argh.


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 19:21                                 ` Andrew Morton
@ 2005-12-15 19:38                                   ` Linus Torvalds
  2005-12-15 20:28                                   ` Steven Rostedt
  1 sibling, 0 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-15 19:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: dhowells, pj, nikita, alan, tglx, mingo, hch, arjan, matthew,
	linux-kernel, linux-arch



On Thu, 15 Dec 2005, Andrew Morton wrote:
> 
> They should be renamed to DEFINE_* while we're there.  A "declaration" is
> "this thing is defined somewhere else".  A "definition" is "this thing is
> defined here".

Yeah, I confuse the two. Although by now I've gotten so used to DECLARE_ 
that at least me personally I like it. 

> > I wouldn't be adverse to doing that
> 
> argh.

Heh. At least there's only 310 DECLARE_MUTEX* references in the whole 
kernel. So we're not actually talking about a huge patch. 

It's also fairly simple to work with in out-of-tree drivers, since it's 
always bound to be a #define, so you can do things like

	#ifndef DECLARE_SEM_MUTEX
	#define DECLARE_SEM_MUTEX(x) DECLARE_MUTEX(x)
	#endif

or something.

		Linus


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 19:21                                 ` Andrew Morton
  2005-12-15 19:38                                   ` Linus Torvalds
@ 2005-12-15 20:28                                   ` Steven Rostedt
  2005-12-15 20:32                                     ` Geert Uytterhoeven
  1 sibling, 1 reply; 239+ messages in thread
From: Steven Rostedt @ 2005-12-15 20:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-arch, linux-kernel, matthew, arjan, hch, mingo, tglx, alan,
	nikita, pj, dhowells, Linus Torvalds

On Thu, 2005-12-15 at 11:21 -0800, Andrew Morton wrote:
> Linus Torvalds <torvalds@osdl.org> wrote:
> >
> > On Thu, 15 Dec 2005, David Howells wrote:
> >  > 
> >  > 	FROM				TO
> >  > 	==============================	=========================
> >  > 	DECLARE_MUTEX			DECLARE_SEM_MUTEX
> >  > 	DECLARE_MUTEX_LOCKED		DECLARE_SEM_MUTEX_LOCKED
> >  > 	Proper counting semaphore	DECLARE_SEM
> > 
> >  That sounds fine.
> 
> They should be renamed to DEFINE_* while we're there.  A "declaration" is
> "this thing is defined somewhere else".  A "definition" is "this thing is
> defined here".

Why have the "MUTEX" part in there?  Shouldn't that just be DECLARE_SEM
(oops, I mean DEFINE_SEM).  Especially that MUTEX_LOCKED! What is that?
How does a MUTEX start off as locked?  It can't, since a mutex must
always have an owner (which, by the way, helped us in the -rt patch to
find our "compat_semaphores").  So who's the owner of a
DEFINE_SEM_MUTEX_LOCKED?

-- Steve




* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 20:28                                   ` Steven Rostedt
@ 2005-12-15 20:32                                     ` Geert Uytterhoeven
  2005-12-16 21:41                                       ` Thomas Gleixner
  0 siblings, 1 reply; 239+ messages in thread
From: Geert Uytterhoeven @ 2005-12-15 20:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, linux-arch, Linux Kernel Development, matthew,
	arjan, Christoph Hellwig, mingo, tglx, Alan Cox, nikita, pj,
	dhowells, Linus Torvalds

On Thu, 15 Dec 2005, Steven Rostedt wrote:
> On Thu, 2005-12-15 at 11:21 -0800, Andrew Morton wrote:
> > Linus Torvalds <torvalds@osdl.org> wrote:
> > >
> > > On Thu, 15 Dec 2005, David Howells wrote:
> > >  > 
> > >  > 	FROM				TO
> > >  > 	==============================	=========================
> > >  > 	DECLARE_MUTEX			DECLARE_SEM_MUTEX
> > >  > 	DECLARE_MUTEX_LOCKED		DECLARE_SEM_MUTEX_LOCKED
> > >  > 	Proper counting semaphore	DECLARE_SEM
> > > 
> > >  That sounds fine.
> > 
> > They should be renamed to DEFINE_* while we're there.  A "declaration" is
> > "this thing is defined somewhere else".  A "definition" is "this thing is
> > defined here".
> 
> Why have the "MUTEX" part in there?  Shouldn't that just be DECLARE_SEM
> (oops, I mean DEFINE_SEM).  Especially that MUTEX_LOCKED! What is that?
> How does a MUTEX start off as locked?  It can't, since a mutex must
> always have an owner (which, by the way, helped us in the -rt patch to
> find our "compat_semaphores").  So who's the owner of a
> DEFINE_SEM_MUTEX_LOCKED?

No one. It's not really a mutex, but a completion.

Gr{oetje,eeting}s,

						Geert

P.S. Long live the common vocabulary ;-)
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 20:32                                     ` Geert Uytterhoeven
@ 2005-12-16 21:41                                       ` Thomas Gleixner
  2005-12-16 21:41                                         ` Linus Torvalds
  0 siblings, 1 reply; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-16 21:41 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Steven Rostedt, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells, Linus Torvalds

On Thu, 2005-12-15 at 21:32 +0100, Geert Uytterhoeven wrote:
> > Why have the "MUTEX" part in there?  Shouldn't that just be DECLARE_SEM
> > (oops, I mean DEFINE_SEM).  Especially that MUTEX_LOCKED! What is that?
> > How does a MUTEX start off as locked?  It can't, since a mutex must
> > always have an owner (which, by the way, helped us in the -rt patch to
> > find our "compat_semaphores").  So who's the owner of a
> > DEFINE_SEM_MUTEX_LOCKED?
> 
> No one. It's not really a mutex, but a completion.

Well, then let us use a completion and not some semantically wrong
workaround.

	tglx




* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 21:41                                       ` Thomas Gleixner
@ 2005-12-16 21:41                                         ` Linus Torvalds
  2005-12-16 22:06                                           ` Thomas Gleixner
  0 siblings, 1 reply; 239+ messages in thread
From: Linus Torvalds @ 2005-12-16 21:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Geert Uytterhoeven, Steven Rostedt, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells

On Fri, 16 Dec 2005, Thomas Gleixner wrote:

> On Thu, 2005-12-15 at 21:32 +0100, Geert Uytterhoeven wrote:
> > > Why have the "MUTEX" part in there?  Shouldn't that just be DECLARE_SEM
> > > (oops, I mean DEFINE_SEM).  Especially that MUTEX_LOCKED! What is that?
> > > How does a MUTEX start off as locked?  It can't, since a mutex must
> > > always have an owner (which, by the way, helped us in the -rt patch to
> > > find our "compat_semaphores").  So who's the owner of a
> > > DEFINE_SEM_MUTEX_LOCKED?
> > 
> > No one. It's not really a mutex, but a completion.
> 
> Well, then let us use a completion and not some semantically wrong
> workaround.

It is _not_ wrong to have a semaphore start out in locked state.

For example, it makes perfect sense if the data structures that the 
semaphore needs need initialization. The way you _should_ handle that is 
to make the semaphore come up as locked, and the data structures in some 
"don't matter" state, and then the thing that initializes stuff can do so 
properly and then release the semaphore.

Yes, in some cases such a locked semaphore is only used once, and ends up 
being a "completion", but that doesn't invalidate the fact that this is 
a perfectly fine way to handle a real issue.

		Linus
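
The start-locked pattern described above can be sketched roughly as
follows. This is a toy, non-blocking userspace model built on C11
atomics, not the kernel semaphore API: toy_sem, toy_down_trylock, toy_up
and the table_* helpers are all hypothetical names, and where a real
down() would sleep this sketch simply reports failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy counting semaphore (hypothetical, never sleeps). */
struct toy_sem {
	atomic_int count;
};

/* Try to take one unit; false means a real down() would have blocked. */
static bool toy_down_trylock(struct toy_sem *s)
{
	int c = atomic_load(&s->count);

	while (c > 0) {
		/* On failure, c is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&s->count, &c, c - 1))
			return true;
	}
	return false;
}

static void toy_up(struct toy_sem *s)
{
	atomic_fetch_add(&s->count, 1);
}

/* Count 0: the semaphore comes up locked, the data in a "don't matter" state. */
static struct toy_sem table_sem = { 0 };
static int table[4];

/* The initializer fills in the data and only then releases the semaphore. */
static void table_init(void)
{
	for (int i = 0; i < 4; i++)
		table[i] = i * i;
	toy_up(&table_sem);
}

/* Returns the entry, or -1 if the table is not available yet. */
static int table_lookup(int i)
{
	int v;

	assert(i >= 0 && i < 4);
	if (!toy_down_trylock(&table_sem))
		return -1;
	v = table[i];
	toy_up(&table_sem);
	return v;
}
```

Note that nothing records who "holds" the semaphore while it starts at
zero, which is exactly the point the rest of the thread argues about.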


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 21:41                                         ` Linus Torvalds
@ 2005-12-16 22:06                                           ` Thomas Gleixner
  2005-12-16 22:19                                             ` Linus Torvalds
  0 siblings, 1 reply; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-16 22:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Geert Uytterhoeven, Steven Rostedt, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells

On Fri, 2005-12-16 at 13:41 -0800, Linus Torvalds wrote:
> 
> > > No one. It's not really a mutex, but a completion.
> > 
> > Well, then let us use a completion and not some semantically wrong
> > workaround.
> 
> It is _not_ wrong to have a semaphore start out in locked state.
> 
> For example, it makes perfect sense if the data structures that the 
> semaphore needs need initialization. The way you _should_ handle that is 
> to make the semaphore come up as locked, and the data structures in some 
> "don't matter" state, and then the thing that initializes stuff can do so 
> properly and then release the semaphore.
> 
> Yes, in some cases such a locked semaphore is only used once, and ends up 
> being a "completion", but that doesn't invalidate the fact that this is 
> a perfectly fine way to handle a real issue.

Well, in case of a semaphore it is a semantically correct use case. In
case of a mutex it is not.

Geert was talking about a mutex. The fact that a mutex is implemented on
top of (or actually with the same) mechanism as a semaphore - for whatever
reason - does not change the semantic difference between semaphores
and mutexes.

	tglx




* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:06                                           ` Thomas Gleixner
@ 2005-12-16 22:19                                             ` Linus Torvalds
  2005-12-16 22:32                                               ` Steven Rostedt
  2005-12-16 22:42                                               ` Thomas Gleixner
  0 siblings, 2 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-16 22:19 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Geert Uytterhoeven, Steven Rostedt, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells



On Fri, 16 Dec 2005, Thomas Gleixner wrote:
> 
> Well, in case of a semaphore it is a semantically correct use case. In
> case of a mutex it is not.

I disagree.

Think of "initialization" as a user. The system starts out initializing 
stuff, and as such the mutex should start out being held. It's that 
simple. It _is_ mutual exclusion, with one user being the early bootup 
state.

		Linus


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:19                                             ` Linus Torvalds
@ 2005-12-16 22:32                                               ` Steven Rostedt
  2005-12-16 22:42                                               ` Thomas Gleixner
  1 sibling, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-16 22:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Geert Uytterhoeven, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells

On Fri, 2005-12-16 at 14:19 -0800, Linus Torvalds wrote:
> 
> On Fri, 16 Dec 2005, Thomas Gleixner wrote:
> > 
> > Well, in case of a semaphore it is a semantically correct use case. In
> > case of a mutex it is not.
> 
> I disagree.
> 
> Think of "initialization" as a user. The system starts out initializing 
> stuff, and as such the mutex should start out being held. It's that 
> simple. It _is_ mutual exclusion, with one user being the early bootup 
> state.

That's stretching it quite a bit.  So you are saying that the owner is
the first swapper task, from the booting CPU?  Well, you better have
that same process unlock that mutex, since a mutex has an owner and the
owner _must_ be the one to unlock it.  And in lots of these cases, it's
some other thread that releases the lock.

With mutexes, the owner is not a state, but a task.

-- Steve




* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:19                                             ` Linus Torvalds
  2005-12-16 22:32                                               ` Steven Rostedt
@ 2005-12-16 22:42                                               ` Thomas Gleixner
  2005-12-16 22:41                                                 ` Linus Torvalds
  1 sibling, 1 reply; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-16 22:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Geert Uytterhoeven, Steven Rostedt, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells

On Fri, 2005-12-16 at 14:19 -0800, Linus Torvalds wrote:
> 
> On Fri, 16 Dec 2005, Thomas Gleixner wrote:
> > 
> > Well, in case of a semaphore it is a semantically correct use case. In
> > case of a mutex it is not.
> 
> I disagree.
> 
> Think of "initialization" as a user. The system starts out initializing 
> stuff, and as such the mutex should start out being held. It's that 
> simple. It _is_ mutual exclusion, with one user being the early bootup 
> state.

Mutual exclusion is available with various semantic characteristics.
If you want a particular semantic functionality you have to
choose a variant which fits that need. Arguing that the underlying
mechanism (implementation) can handle your request is broken by
definition. It can, but it still is semantically wrong.

Mutexes have a well defined semantic of lock ownership, i.e. the thread
which locked a mutex has to unlock it. Semaphores do not have this
semantical requirement.

Therefore, if you want to handle that "init protection" scenario, do not
use a mutex, because the owner cannot be defined at compile or
allocation time.

You can still implement a mutex (i.e. choose a mechanism) on top of a
semaphore or, in the absence of priority inheritance or debugging, with
exactly the same mechanism as a semaphore, but this does not change the
semantic difference at all.

	tglx
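
The ownership rule stated above can be sketched with a toy model. All
names here (toy_task, toy_mutex, toy_sem and their helpers) are
hypothetical, the code is single-threaded and non-atomic, and it is only
meant to show the semantic difference: the mutex records which task
locked it and refuses an unlock from anyone else, while the semaphore is
just a count that any task may raise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; only its identity matters here. */
struct toy_task {
	const char *name;
};

struct toy_mutex {
	const struct toy_task *owner;	/* NULL means unlocked */
};

struct toy_sem {
	int count;			/* no owner is recorded at all */
};

static int toy_mutex_trylock(struct toy_mutex *m, const struct toy_task *t)
{
	assert(t != NULL);
	if (m->owner)
		return -EBUSY;
	m->owner = t;
	return 0;
}

/* Only the task that locked the mutex may unlock it. */
static int toy_mutex_unlock(struct toy_mutex *m, const struct toy_task *t)
{
	assert(t != NULL);
	if (m->owner != t)
		return -EPERM;
	m->owner = NULL;
	return 0;
}

static int toy_sem_trydown(struct toy_sem *s)
{
	if (s->count <= 0)
		return -EBUSY;
	s->count--;
	return 0;
}

/* Any task may release a semaphore, including one it never took. */
static void toy_sem_up(struct toy_sem *s)
{
	s->count++;
}

static struct toy_task boot = { "boot" }, worker = { "worker" };
static struct toy_mutex m;	/* unlocked: there is no task to name yet */
static struct toy_sem s;	/* count 0: "locked", yet nobody owns it */
```

A statically initialized "locked" toy_mutex would need a task pointer to
put in the owner field, which is the difficulty raised above.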


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:42                                               ` Thomas Gleixner
@ 2005-12-16 22:41                                                 ` Linus Torvalds
  2005-12-16 22:49                                                   ` Steven Rostedt
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-16 22:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Geert Uytterhoeven, Steven Rostedt, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells



On Fri, 16 Dec 2005, Thomas Gleixner wrote:
> 
> Therefore, if you want to handle that "init protection" scenario, do not
> use a mutex, because the owner cannot be defined at compile or
> allocation time.

Sure it could. We certainly have "init_task", for example. It may or may 
not be the right thing to use, of course. Depends on what the situation 
is.

> You can still implement a mutex (i.e. choose a mechanism) on top of a
> semaphore or, in the absence of priority inheritance or debugging, with
> exactly the same mechanism as a semaphore, but this does not change the
> semantic difference at all.

"Friends don't let friends use priority inheritance".

Just don't do it. If you really need it, your system is broken anyway.

		Linus


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:41                                                 ` Linus Torvalds
@ 2005-12-16 22:49                                                   ` Steven Rostedt
  2005-12-16 23:29                                                   ` Thomas Gleixner
  2005-12-17  0:29                                                   ` Joe Korty
  2 siblings, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-16 22:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Geert Uytterhoeven, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells



On Fri, 16 Dec 2005, Linus Torvalds wrote:

>
> "Friends don't let friends use priority inheritance".
>
> Just don't do it. If you really need it, your system is broken anyway.

You've been hanging around Victor Yodaiken too much ;)

-- Steve


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:41                                                 ` Linus Torvalds
  2005-12-16 22:49                                                   ` Steven Rostedt
@ 2005-12-16 23:29                                                   ` Thomas Gleixner
  2005-12-17  0:29                                                   ` Joe Korty
  2 siblings, 0 replies; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-16 23:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Geert Uytterhoeven, Steven Rostedt, Andrew Morton, linux-arch,
	Linux Kernel Development, matthew, arjan, Christoph Hellwig,
	mingo, Alan Cox, nikita, pj, dhowells

On Fri, 2005-12-16 at 14:41 -0800, Linus Torvalds wrote:

> > You can still implement a mutex (i.e. choose a mechanism) on top of a
> > semaphore or, in the absence of priority inheritance or debugging, with
> > exactly the same mechanism as a semaphore, but this does not change the
> > semantic difference at all.
> 
> "Friends don't let friends use priority inheritance".
> 
> Just don't do it. If you really need it, your system is broken anyway.

We are not talking about priority inheritance and its usefulness at all.

The fact is that you can implement two semantically different concurrency
controls with or on top of the same mechanism under given circumstances
(no debugging, no ...). But the reverse attempt is wrong by definition.


	tglx




* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:41                                                 ` Linus Torvalds
  2005-12-16 22:49                                                   ` Steven Rostedt
  2005-12-16 23:29                                                   ` Thomas Gleixner
@ 2005-12-17  0:29                                                   ` Joe Korty
  2005-12-17  1:00                                                     ` Linus Torvalds
  2 siblings, 1 reply; 239+ messages in thread
From: Joe Korty @ 2005-12-17  0:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Geert Uytterhoeven, Steven Rostedt,
	Andrew Morton, linux-arch, Linux Kernel Development, matthew,
	arjan, Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells

On Fri, Dec 16, 2005 at 02:41:16PM -0800, Linus Torvalds wrote:

> "Friends don't let friends use priority inheritance".
> 
> Just don't do it. If you really need it, your system is broken anyway.

The Mars Pathfinder incident is sufficient proof that some solution to
the priority inversion problem is required in real systems.

	http://www.cs.cmu.edu/afs/cs/user/raj/www/mars.html

Regards,
Joe
--
"All the revision in the world will not save a bad first draft, for the
architecture of the thing comes, or fails to come, in the first conception,
and revision only affects the detail and ornament." -- T.E. Lawrence
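
For readers unfamiliar with the terms, priority inversion and the
inheritance remedy debated in this thread can be sketched as a toy
model. Everything here is an assumption made for illustration: the names
(toy_task, toy_pi_lock, pi_lock_*, pick_next), the convention that a
lower number means higher priority, and the simplification that a task
holds at most one lock. It is not how any particular kernel or RTOS
implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy task: lower number means higher priority (an assumption). */
struct toy_task {
	int base_prio;	/* priority the task was given */
	int prio;	/* effective priority, possibly boosted */
};

struct toy_pi_lock {
	struct toy_task *owner;	/* NULL means unlocked */
};

static void pi_lock_acquire(struct toy_pi_lock *l, struct toy_task *t)
{
	assert(l->owner == NULL);
	l->owner = t;
}

/* A waiter found the lock held: lend its priority to the owner. */
static void pi_lock_block_on(struct toy_pi_lock *l,
			     const struct toy_task *waiter)
{
	assert(l->owner != NULL);
	if (waiter->prio < l->owner->prio)
		l->owner->prio = waiter->prio;
}

/* Releasing drops any boost (assumes the task holds no other lock). */
static void pi_lock_release(struct toy_pi_lock *l)
{
	assert(l->owner != NULL);
	l->owner->prio = l->owner->base_prio;
	l->owner = NULL;
}

/* Stand-in scheduler: of two runnable tasks, run the more urgent one. */
static const struct toy_task *pick_next(const struct toy_task *a,
					const struct toy_task *b)
{
	return (b->prio < a->prio) ? b : a;
}

static struct toy_task low = { 30, 30 };
static struct toy_task medium = { 20, 20 };
static struct toy_task high = { 10, 10 };
static struct toy_pi_lock bus_lock;
```

Without the boost, "medium" keeps preempting "low" while "high" waits
for the lock that "low" holds; with the boost, "low" runs ahead of
"medium" until it releases the lock.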


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  0:29                                                   ` Joe Korty
@ 2005-12-17  1:00                                                     ` Linus Torvalds
  2005-12-17  3:13                                                       ` Steven Rostedt
  2005-12-19 23:46                                                       ` Keith Owens
  0 siblings, 2 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-17  1:00 UTC (permalink / raw)
  To: Joe Korty
  Cc: Thomas Gleixner, Geert Uytterhoeven, Steven Rostedt,
	Andrew Morton, linux-arch, Linux Kernel Development, matthew,
	arjan, Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells

On Fri, 16 Dec 2005, Joe Korty wrote:
> 
> The Mars Pathfinder incident is sufficient proof that some solution to
> the priority inversion problem is required in real systems.

Ehh. 

The Mars Pathfinder is just about the worst case "real system", and if I 
recall correctly, the reason it was able to continue was _not_ because it 
handled priority inversion, but because it reset itself every 24 hours or 
something like that, and had debugging facilities.

The _real_ lesson you should take away from it is not that priority 
inheritance is a good solution to priority inversion, but that having a 
failsafe switch when everything goes wrong is critical. You don't know 
_what_ bug you'll encounter.

The bug itself could have been solved without priority inheritance, 
although I think in this case enabling that in VxWorks was the particular 
solution to the problem as being the least invasive.

Personally, I don't care what user space does. If some app wants to use 
priority inheritance to solve its bugs, that's fine. But it's like 
recursive locks: it's generally a _bandaid_ for bad locking. I definitely 
don't want the kernel depending on either.

So put a watchdog on your critical systems, and make sure you can debug 
them. Especially if they're on Mars.

			Linus


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  1:00                                                     ` Linus Torvalds
@ 2005-12-17  3:13                                                       ` Steven Rostedt
  2005-12-17  7:34                                                         ` Linus Torvalds
  2005-12-19 23:46                                                       ` Keith Owens
  1 sibling, 1 reply; 239+ messages in thread
From: Steven Rostedt @ 2005-12-17  3:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Joe Korty, Thomas Gleixner, Geert Uytterhoeven, Andrew Morton,
	linux-arch, Linux Kernel Development, matthew, arjan,
	Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells


On Fri, 16 Dec 2005, Linus Torvalds wrote:
>
> On Fri, 16 Dec 2005, Joe Korty wrote:
> >
> > The Mars Pathfinder incident is sufficient proof that some solution to
> > the priority inversion problem is required in real systems.
>
> Ehh.
>
> The Mars Pathfinder is just about the worst case "real system", and if I
> recall correctly, the reason it was able to continue was _not_ because it
> handled priority inversion, but because it reset itself every 24 hours or
> something like that, and had debugging facilities.
>
> The _real_ lesson you should take away from it is not that priority
> inheritance is a good solution to priority inversion, but that having a
> failsafe switch when everything goes wrong is critical. You don't know
> _what_ bug you'll encounter.
>
> The bug itself could have been solved without priority inheritance,
> although I think in this case enabling that in VxWorks was the particular
> solution to the problem as being the least invasive.
>
> Personally, I don't care what user space does. If some app wants to use
> priority inheritance to solve its bugs, that's fine. But it's like
> recursive locks: it's generally a _bandaid_ for bad locking. I definitely
> don't want the kernel depending on either.

So how does one handle real-time tasks that must contend for locks within
the kernel that are shared with low priority tasks?  Do you prefer the RTAI
approach?

-- Steve



* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  3:13                                                       ` Steven Rostedt
@ 2005-12-17  7:34                                                         ` Linus Torvalds
  2005-12-17 23:43                                                           ` Matthew Wilcox
                                                                             ` (2 more replies)
  0 siblings, 3 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-17  7:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Joe Korty, Thomas Gleixner, Geert Uytterhoeven, Andrew Morton,
	linux-arch, Linux Kernel Development, matthew, arjan,
	Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells

On Fri, 16 Dec 2005, Steven Rostedt wrote:
> 
> So how does one handle real-time tasks that must contend for locks within
> the kernel that are shared with low priority tasks?  Do you prefer the RTAI
> approach?

If you want hard real-time, either that, or just make sure you don't get 
locks that might be slow (for one reason or another). Finer granularities 
help there.

For example, to make things really concrete, please just name a semaphore 
that is relevant to a real-time task and that isn't fine enough grain that 
a careful and controlled environment can't avoid it being a bottle-neck 
for a real-time task.

The real problems often end up happening in things like memory management, 
and waiting for IO, where it's not about the locking at all, it's about 
event scheduling. And you just have to avoid those (through pre-allocation 
and buffering) in those kinds of real-time situations.

I really can't think of any blocking kernel lock where priority 
inheritance would make _any_ sense at all. Please give me an example. 

			Linus


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  7:34                                                         ` Linus Torvalds
@ 2005-12-17 23:43                                                           ` Matthew Wilcox
  2005-12-18  0:05                                                             ` Lee Revell
  2005-12-22 12:27                                                             ` Bill Huey
  2005-12-19 16:08                                                           ` Ingo Molnar
  2005-12-22 12:40                                                           ` Bill Huey
  2 siblings, 2 replies; 239+ messages in thread
From: Matthew Wilcox @ 2005-12-17 23:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Joe Korty, Thomas Gleixner, Geert Uytterhoeven,
	Andrew Morton, linux-arch, Linux Kernel Development, arjan,
	Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells

On Fri, Dec 16, 2005 at 11:34:03PM -0800, Linus Torvalds wrote:
> I really can't think of any blocking kernel lock where priority 
> inheritance would make _any_ sense at all. Please give me an example. 

I have a better example of something we currently get wrong that I
haven't heard any RT person worry about yet.  If two tasks are sleeping
on the same semaphore, the one to be woken up will be the first one to
wait for it, not the highest-priority task.

Obviously, this was introduced by the wake-one semantics.  But how to
fix it?  Should we scan the entire queue looking for the best task to
wake?  Should we try to maintain the wait list in priority order?  Or
should we just not care?  Should we document that we don't care?  ;-)
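
One of the options raised above, keeping the wait list in priority
order, can be sketched as follows. This is a toy userspace list with
hypothetical names (toy_waiter, toy_waitq, waitq_add_sorted,
waitq_wake_one), assuming that a lower number means a more urgent task;
it is not the kernel's wait-queue code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy waiter: lower prio value means more urgent (an assumption). */
struct toy_waiter {
	int prio;
	struct toy_waiter *next;
};

struct toy_waitq {
	struct toy_waiter *head;
};

/*
 * Queue a sleeper in priority order. Waiters of equal priority stay in
 * arrival (FIFO) order. The cost is an O(n) walk on every sleep.
 */
static void waitq_add_sorted(struct toy_waitq *q, struct toy_waiter *w)
{
	struct toy_waiter **pp = &q->head;

	assert(w != NULL);
	while (*pp && (*pp)->prio <= w->prio)
		pp = &(*pp)->next;
	w->next = *pp;
	*pp = w;
}

/* Wake-one stays O(1): the most urgent waiter is always at the head. */
static struct toy_waiter *waitq_wake_one(struct toy_waitq *q)
{
	struct toy_waiter *w = q->head;

	if (w) {
		q->head = w->next;
		w->next = NULL;
	}
	return w;
}

static struct toy_waitq q;
static struct toy_waiter w_low = { .prio = 30 };
static struct toy_waiter w_high = { .prio = 10 };
static struct toy_waiter w_high2 = { .prio = 10 };
```

The alternative of scanning the whole queue at wake-up time moves the
O(n) cost from the sleeper to the waker; which side should pay is the
open question.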


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17 23:43                                                           ` Matthew Wilcox
@ 2005-12-18  0:05                                                             ` Lee Revell
  2005-12-18  0:21                                                               ` Matthew Wilcox
  2005-12-22 12:27                                                             ` Bill Huey
  1 sibling, 1 reply; 239+ messages in thread
From: Lee Revell @ 2005-12-18  0:05 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linus Torvalds, Steven Rostedt, Joe Korty, Thomas Gleixner,
	Geert Uytterhoeven, Andrew Morton, linux-arch,
	Linux Kernel Development, arjan, Christoph Hellwig, mingo,
	Alan Cox, nikita, pj, dhowells

On Sat, 2005-12-17 at 16:43 -0700, Matthew Wilcox wrote:
> On Fri, Dec 16, 2005 at 11:34:03PM -0800, Linus Torvalds wrote:
> > I really can't think of any blocking kernel lock where priority 
> > inheritance would make _any_ sense at all. Please give me an example. 
> 
> I have a better example of something we currently get wrong that I
> haven't heard any RT person worry about yet.  If two tasks are sleeping
> on the same semaphore, the one to be woken up will be the first one to
> wait for it, not the highest-priority task.
> 
> Obviously, this was introduced by the wake-one semantics.  But how to
> fix it?  Should we scan the entire queue looking for the best task to
> wake?  Should we try to maintain the wait list in priority order?  Or
> should we just not care?  Should we document that we don't care?  ;-)

It's well known that this is a problem:

http://developer.osdl.org/dev/robustmutexes/src/fusyn.hg/Documentation/fusyn/fusyn-why.txt

Lee



* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-18  0:05                                                             ` Lee Revell
@ 2005-12-18  0:21                                                               ` Matthew Wilcox
  2005-12-18  1:25                                                                 ` Lee Revell
  0 siblings, 1 reply; 239+ messages in thread
From: Matthew Wilcox @ 2005-12-18  0:21 UTC (permalink / raw)
  To: Lee Revell
  Cc: Linus Torvalds, Steven Rostedt, Joe Korty, Thomas Gleixner,
	Geert Uytterhoeven, Andrew Morton, linux-arch,
	Linux Kernel Development, arjan, Christoph Hellwig, mingo,
	Alan Cox, nikita, pj, dhowells

On Sat, Dec 17, 2005 at 07:05:21PM -0500, Lee Revell wrote:
> On Sat, 2005-12-17 at 16:43 -0700, Matthew Wilcox wrote:
> > I have a better example of something we currently get wrong that I
> > haven't heard any RT person worry about yet.  If two tasks are sleeping
> > on the same semaphore, the one to be woken up will be the first one to
> > wait for it, not the highest-priority task.
> > 
> > Obviously, this was introduced by the wake-one semantics.  But how to
> > fix it?  Should we scan the entire queue looking for the best task to
> > wake?  Should we try to maintain the wait list in priority order?  Or
> > should we just not care?  Should we document that we don't care?  ;-)
> 
> It's well known that this is a problem:
> 
> http://developer.osdl.org/dev/robustmutexes/src/fusyn.hg/Documentation/fusyn/fusyn-why.txt

Erm.  That paper is talking about user-space semaphores based on futexes.
I'm talking about kernel semaphores.  At a first glance, fixing futexes
would be a very different job from fixing semaphores.

BTW, fuqueues?  HAHAHAHA.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-18  0:21                                                               ` Matthew Wilcox
@ 2005-12-18  1:25                                                                 ` Lee Revell
  0 siblings, 0 replies; 239+ messages in thread
From: Lee Revell @ 2005-12-18  1:25 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linus Torvalds, Steven Rostedt, Joe Korty, Thomas Gleixner,
	Geert Uytterhoeven, Andrew Morton, linux-arch,
	Linux Kernel Development, arjan, Christoph Hellwig, mingo,
	Alan Cox, nikita, pj, dhowells

On Sat, 2005-12-17 at 17:21 -0700, Matthew Wilcox wrote:
> On Sat, Dec 17, 2005 at 07:05:21PM -0500, Lee Revell wrote:
> > On Sat, 2005-12-17 at 16:43 -0700, Matthew Wilcox wrote:
> > > I have a better example of something we currently get wrong that I
> > > haven't heard any RT person worry about yet.  If two tasks are sleeping
> > > on the same semaphore, the one to be woken up will be the first one to
> > > wait for it, not the highest-priority task.
> > > 
> > > Obviously, this was introduced by the wake-one semantics.  But how to
> > > fix it?  Should we scan the entire queue looking for the best task to
> > > wake?  Should we try to maintain the wait list in priority order?  Or
> > > should we just not care?  Should we document that we don't care?  ;-)
> > 
> > It's well known that this is a problem:
> > 
> > http://developer.osdl.org/dev/robustmutexes/src/fusyn.hg/Documentation/fusyn/fusyn-why.txt
> 
> Erm.  That paper is talking about user-space semaphores based on futexes.
> I'm talking about kernel semaphores.  At a first glance, fixing futexes
> would be a very different job from fixing semaphores.
> 
> BTW, fuqueues?  HAHAHAHA.
> 

Hmm, interesting, so in fact the scheduler does not always run the
highest priority runnable process?  Do you have a test case where
userspace would experience priority inversion due to this?

Maybe it has not been a problem as all the PI cases would involve two RT
processes that make system calls which end up blocking on a semaphore in
the kernel, which is bad RT design anyway - normally you would separate
the RT parts of the app which carefully avoid possibly blocking system
calls from the non RT parts and communicate via lock free ringbuffers or
a similar mechanism.

Lee


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17 23:43                                                           ` Matthew Wilcox
  2005-12-18  0:05                                                             ` Lee Revell
@ 2005-12-22 12:27                                                             ` Bill Huey
  1 sibling, 0 replies; 239+ messages in thread
From: Bill Huey @ 2005-12-22 12:27 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linus Torvalds, Steven Rostedt, Joe Korty, Thomas Gleixner,
	Geert Uytterhoeven, Andrew Morton, linux-arch,
	Linux Kernel Development, arjan, Christoph Hellwig, mingo,
	Alan Cox, nikita, pj, dhowells

On Sat, Dec 17, 2005 at 04:43:05PM -0700, Matthew Wilcox wrote:
> I have a better example of something we currently get wrong that I
> haven't heard any RT person worry about yet.  If two tasks are sleeping
> on the same semaphore, the one to be woken up will be the first one to
> wait for it, not the highest-priority task.
> 
> Obviously, this was introduced by the wake-one semantics.  But how to
> fix it?  Should we scan the entire queue looking for the best task to
> wake?  Should we try to maintain the wait list in priority order?  Or
> should we just not care?  Should we document that we don't care?  ;-)

-rt deals with this using a priority-sorted wait queue and direct ownership
hand-off to the woken thread. It's working fine for now, but things like
wake-all and company should probably be explored for various uses.
Strictly general-purpose and RT uses of the Linux kernel have different
performance characteristics, and mutex selection at compile time should
address that precisely.
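
To make the priority-sorted queue plus direct hand-off idea concrete, here
is a minimal single-threaded sketch. It is not the -rt code: struct pi_lock,
struct waiter, waiter_enqueue() and lock_release() are hypothetical names
invented for illustration, and a lower prio value is assumed to mean higher
priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter record; a lower prio value means higher priority. */
struct waiter {
	int prio;
	int task_id;
	struct waiter *next;
};

/* Toy lock, illustration only (no real blocking, not thread-safe). */
struct pi_lock {
	int owner_id;            /* -1 when the lock is free */
	struct waiter *waiters;  /* kept sorted, best priority first */
};

/* Insert in priority order; waiters of equal priority stay FIFO. */
static void waiter_enqueue(struct pi_lock *lock, struct waiter *w)
{
	struct waiter **pp;

	assert(lock != NULL && w != NULL);
	pp = &lock->waiters;
	while (*pp && (*pp)->prio <= w->prio)
		pp = &(*pp)->next;
	w->next = *pp;
	*pp = w;
}

/*
 * Release the lock. If anyone is waiting, ownership goes straight to the
 * head of the sorted queue instead of letting the waiters race for it.
 * Returns the new owner's id, or -1 if the lock became free.
 */
static int lock_release(struct pi_lock *lock)
{
	struct waiter *top = lock->waiters;

	if (!top) {
		lock->owner_id = -1;
		return -1;
	}
	lock->waiters = top->next;
	top->next = NULL;
	lock->owner_id = top->task_id;
	return lock->owner_id;
}
```

With a sorted queue the wake-up side is O(1) and the cost moves to the
enqueue side, which is one answer to the "scan the queue or keep it sorted"
question quoted above.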

bill


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  7:34                                                         ` Linus Torvalds
  2005-12-17 23:43                                                           ` Matthew Wilcox
@ 2005-12-19 16:08                                                           ` Ingo Molnar
  2005-12-22 12:40                                                           ` Bill Huey
  2 siblings, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-19 16:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Joe Korty, Thomas Gleixner, Geert Uytterhoeven,
	Andrew Morton, linux-arch, Linux Kernel Development, matthew,
	arjan, Christoph Hellwig, Alan Cox, nikita, pj, dhowells

* Linus Torvalds <torvalds@osdl.org> wrote:

> I really can't think of any blocking kernel lock where priority 
> inheritance would make _any_ sense at all. Please give me an example.

you are completely right, most of the blocking kernel locks we have in 
Linux are an extremely poor candidate for priority inheritance. Most of 
them are totally noncritical, or cover so long codepaths that 'latency 
guarantees' and 'priority inheritance' make little sense for them.

the reason why we still have transformed most of the stock semaphore 
users to generic mutexes in the -rt kernel is mostly because i wanted to 
have _one_ central lock implementation. Firstly, it makes a lot of sense 
to consolidate code on embedded platforms anyway. Secondly, it was much 
easier to implement (and validate) one robust locking primitive, than to 
implement 5 separate primitives. The fact that this also made semaphores 
PI-able is just a side-effect, with little to no practical relevance.

maybe ->i_sem is one notable exception: it is often held in critical 
codepaths, and it's really hard for even the most-well-controlled RT app 
to achieve _total_ isolation from the 'unprivileged' filesystem space.  
So occasional 'resource sharing' in form of hitting an i_sem of another 
task may still occur, and PI can at least reduce the worst-case cost 
somehow. (even though i_sem codepaths are by no means deterministic!)

so in the -rt kernel, every semaphore, rw-semaphore, spinlock, rwlock 
and seqlock [and in the latest -rt patches, every futex too] is 
abstracted off a central generic mutex type, which mutex is blocking and 
supports priority queueing and priority inheritance. It also has an 
extensive debugging framework, which we didn't want to duplicate for 
every separate lock object.

we also have a facility in the -rt kernel that traces the worst-case 
latency path of critical applications (the 'latency tracer'), when mixed 
with non-critical workloads, so we have quite good experience about what 
kind of locks make a difference and what kind of locks make no 
difference.

we also definitely know that priority inheritance done over _all_ these 
lock objects makes a very real quality difference in practice, resulting 
in real-time apps experiencing much lower worst-case latencies than 
under the stock kernel. It is true that you could live without priority 
inheritance, in cases where the RT app can be rewritten to have totally 
separated resources, but in practice that is only possible for the 
simplest applications.

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  7:34                                                         ` Linus Torvalds
  2005-12-17 23:43                                                           ` Matthew Wilcox
  2005-12-19 16:08                                                           ` Ingo Molnar
@ 2005-12-22 12:40                                                           ` Bill Huey
  2005-12-22 12:45                                                             ` Bill Huey
  2 siblings, 1 reply; 239+ messages in thread
From: Bill Huey @ 2005-12-22 12:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Joe Korty, Thomas Gleixner, Geert Uytterhoeven,
	Andrew Morton, linux-arch, Linux Kernel Development, matthew,
	arjan, Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells

On Fri, Dec 16, 2005 at 11:34:03PM -0800, Linus Torvalds wrote:
> For example, to make things really concrete, please just name a semaphore 
> that is relevant to a real-time task and that isn't fine enough grain that 
> a careful and controlled environment can't avoid it being a bottle-neck 
> for a real-time task.

The problem here is that ownership tracking with a semaphore is extremely
difficult to do without serializing the entire thing with a single spinlock.
You lose parallelism and possibly create other problems, since the
contention window around critical sections that use it is larger.

> The real problems often end up happening in things like memory management, 
> and waiting for IO, where it's not about the locking at all, it's about 
> event scheduling. And you just have to avoid those (through pre-allocation 
> and buffering) in those kinds of real-time situations.
> 
> I really can't think of any blocking kernel lock where priority 
> inheritance would make _any_ sense at all. Please give me an example. 

The current kernel, which mostly uses traditional spinlocks, doesn't have
locking complicated enough to warrant it. However, the -rt patch creates a
circumstance where a task in a fully preemptible kernel may sleep with
mutexes held, creating a need to resolve the priority inversions that
result from it. That's of course assuming that priority is something that
needs to be strictly obeyed in this variant of the kernel, with
consideration given to priority inheritance.

bill

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-22 12:40                                                           ` Bill Huey
@ 2005-12-22 12:45                                                             ` Bill Huey
  0 siblings, 0 replies; 239+ messages in thread
From: Bill Huey @ 2005-12-22 12:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Joe Korty, Thomas Gleixner, Geert Uytterhoeven,
	Andrew Morton, linux-arch, Linux Kernel Development, matthew,
	arjan, Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells

On Thu, Dec 22, 2005 at 04:40:27AM -0800, Bill Huey wrote:
> The current kernel mostly using traditional spinlocks doesn't have locking
> complicated enough to warrant it. However, the -rt patch does create[s] a
> circumstance where a fully preemptible [kernel] may sleep task with mutexes held create[ing]
> [-and needs] [a need to] resolve priority inversions that results from it. That's of

With corrections...

Sorry, I meant a fully preemptive kernel has priority inversion as an
inherent property, which needs to be resolved using some kind of priority
inheritance.

> course assuming that priority is something that needs to be strictly
> obeyed in this variant of the kernel with consideration to priority
> inheritance.

bill


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  1:00                                                     ` Linus Torvalds
  2005-12-17  3:13                                                       ` Steven Rostedt
@ 2005-12-19 23:46                                                       ` Keith Owens
  1 sibling, 0 replies; 239+ messages in thread
From: Keith Owens @ 2005-12-19 23:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Joe Korty, Thomas Gleixner, Geert Uytterhoeven, Steven Rostedt,
	Andrew Morton, linux-arch, Linux Kernel Development, matthew,
	arjan, Christoph Hellwig, mingo, Alan Cox, nikita, pj, dhowells

On Fri, 16 Dec 2005 17:00:09 -0800 (PST),
Linus Torvalds <torvalds@osdl.org> wrote:
>The Mars Pathfinder is just about the worst case "real system", and if I
>recall correctly, the reason it was able to continue was _not_ because it
>handled priority inversion, but because it reset itself every 24 hours or
>something like that, and had debugging facilities..
>...
>So put a watchdog on your critical systems, and make sure you can debug
>them. Especially if they're on Mars.

Who are you and what have you done with the real[1]

  Linus "I'm a sick and twisted person, and I trust people who write code
  without debuggers a lot more than I trust those who don't" Torvalds :-)

[1] http://www.ussg.iu.edu/hypermail/linux/kernel/9510/0103.html


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 23:54                     ` Andrew Morton
  2005-12-15 13:41                       ` Nikita Danilov
@ 2005-12-15 14:41                       ` Steven Rostedt
  1 sibling, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-15 14:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-arch, linux-kernel, matthew, arjan, torvalds, hch, mingo,
	pj, alan, dhowells, tglx, Mark Lord

On Wed, 2005-12-14 at 15:54 -0800, Andrew Morton wrote:
> Mark Lord <lkml@rtr.ca> wrote:
> >
> > Leaving up()/down() as-is is really the most sensible option.
> >
> 
> Absolutely.
> 
> I must say that my interest in this stuff is down in
> needs-an-electron-microscope-to-locate territory.  down() and up() work
> just fine and they're small, efficient, well-debugged and well-understood. 
> We need a damn good reason for taking on tree-wide churn or incompatible
> renames or addition of risk.  What's the damn good reason here?
> 

****
> Please.  Go fix some bugs.  We're not short of them.
****

I'd give that the quote of the day!

-- Steve



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 23:40                   ` Mark Lord
  2005-12-14 23:54                     ` Andrew Morton
@ 2005-12-14 23:57                     ` Thomas Gleixner
  2005-12-14 23:57                       ` Mark Lord
  2005-12-15 15:37                     ` David Howells
  2 siblings, 1 reply; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-14 23:57 UTC (permalink / raw)
  To: Mark Lord
  Cc: David Howells, Alan Cox, Paul Jackson, mingo, hch, akpm,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Wed, 2005-12-14 at 18:40 -0500, Mark Lord wrote:
> Thomas Gleixner wrote:
> >
> > You can do a full scripted rename of up/down to the mutex API and then
> > fix up the 100 places used by semaphores manually.
> 
> Again, folks, this only works for current in-tree kernel code.
> 
> There are huge amounts of kernel code out-of-tree that still use
> up/down as (or potentially as) counting semaphores.
> 
> Yes, some of that code is closed-source, but most of it is open-source
> stuff in people's "queues", such as the network patch-o-matic queue
> and other stuff.  Lots of open-source out-of-tree drivers, too.
> 
> Re-using the existing up()/down() names for a new purpose is
> a very very Bad Idea. 

Ack.

>  Removing up()/down() entirely is not quite so bad,
> because at least then people will eventually notice the change.
> 
> Leaving up()/down() as-is is really the most sensible option.

Not at all.

Doing a s/down/lock_mutex/ s/up/unlock_mutex/ - or whatever naming
convention we want to use - all over the place for mutexes while keeping
the up/down for counting semaphores is a one-time issue.

After the conversion, any code that tries to do up/down(mutex_type)
breaks at compile time.

So the out-of-tree drivers have a clear indication of what to fix. This is
also a one-time issue.
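
A rough sketch of why the build breaks once the two primitives have
distinct types. The struct layouts and function bodies here are toy
stand-ins, not the kernel's definitions; only the names down(), up(),
lock_mutex() and unlock_mutex() come from this thread.

```c
#include <assert.h>

/* Toy stand-ins for the two lock types (assumed layouts). */
struct semaphore { int count; };
struct mutex     { int locked; };

/* Counting semaphore API: keeps the old names. */
static void down(struct semaphore *s)
{
	assert(s->count > 0);  /* a real down() would sleep here instead */
	s->count--;
}

static void up(struct semaphore *s)
{
	s->count++;
}

/* Mutex API: new names, new type. */
static void lock_mutex(struct mutex *m)
{
	assert(!m->locked);    /* a real lock would sleep here instead */
	m->locked = 1;
}

static void unlock_mutex(struct mutex *m)
{
	assert(m->locked);
	m->locked = 0;
}

/*
 * An unconverted caller that still writes
 *
 *	down(&some_mutex);
 *
 * passes a 'struct mutex *' where a 'struct semaphore *' is expected.
 * The compiler diagnoses the incompatible pointer type, so the stale
 * call site is found at build time rather than at run time.
 */
```

Whether that diagnostic is a warning or a hard error depends on the
compiler and flags in use, so treat "breaks at compile time" as "cannot be
missed in the build log" in the weaker case.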

So where is the problem - except for fixing "huge" amounts of
out-of-tree code once?


	tglx



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 23:57                     ` Thomas Gleixner
@ 2005-12-14 23:57                       ` Mark Lord
  2005-12-15  0:10                         ` Thomas Gleixner
  0 siblings, 1 reply; 239+ messages in thread
From: Mark Lord @ 2005-12-14 23:57 UTC (permalink / raw)
  To: tglx
  Cc: David Howells, Alan Cox, Paul Jackson, mingo, hch, akpm,
	torvalds, arjan, matthew, linux-kernel, linux-arch

Thomas Gleixner wrote:
> On Wed, 2005-12-14 at 18:40 -0500, Mark Lord wrote:
...
>>Leaving up()/down() as-is is really the most sensible option.
> 
...
>Doing a s/down/lock_mutex/ s/up/unlock_mutex/ - or whatever naming
> convention we want to use - all over the place for mutexes while keeping
> the up/down for counting semaphores is an one time issue.
> 
> After the conversion every code breaks at compile time which tries to do
> up/down(mutex_type).
> 
> So the out of tree drivers have a clear indication what to fix. This is
> also a one time issue.
> 
> So where is the problem - except for fixing "huge" amounts of out of
> kernel code once ?

Pointless API breakage.  The same functions continue to exist,
the old names CANNOT be reused for some (longish) time,
so there's no point in renaming them.  It just breaks an API
for no good reason whatsoever.

Cheers

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 23:57                       ` Mark Lord
@ 2005-12-15  0:10                         ` Thomas Gleixner
  2005-12-15  2:46                           ` Linus Torvalds
  2005-12-15 15:53                           ` David Howells
  0 siblings, 2 replies; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-15  0:10 UTC (permalink / raw)
  To: Mark Lord
  Cc: David Howells, Alan Cox, Paul Jackson, mingo, hch, akpm,
	torvalds, arjan, matthew, linux-kernel, linux-arch

On Wed, 2005-12-14 at 18:57 -0500, Mark Lord wrote:
> >>Leaving up()/down() as-is is really the most sensible option.
> > 
> ...
> >Doing a s/down/lock_mutex/ s/up/unlock_mutex/ - or whatever naming
> > convention we want to use - all over the place for mutexes while keeping
> > the up/down for counting semaphores is an one time issue.
> > 
> > After the conversion every code breaks at compile time which tries to do
> > up/down(mutex_type).
> > 
> > So the out of tree drivers have a clear indication what to fix. This is
> > also a one time issue.
> > 
> > So where is the problem - except for fixing "huge" amounts of out of
> > kernel code once ?
> 
> Pointless API breakage.  The same functions continue to exist,
> the old names CANNOT be reused for some (longish) time,
> so there's no point in renaming them.  It just breaks an API
> for no good reason whatsoever.

Well, depends on the POV. A counting semaphore is a different beast than
a mutex. At least as far as my limited knowledge of concurrency controls
goes.

The API breakage was introduced by using up/down for mutexes and not by
correcting this to a sane API.

	tglx



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15  0:10                         ` Thomas Gleixner
@ 2005-12-15  2:46                           ` Linus Torvalds
  2005-12-15 15:53                           ` David Howells
  1 sibling, 0 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-15  2:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mark Lord, David Howells, Alan Cox, Paul Jackson, mingo, hch,
	akpm, arjan, matthew, linux-kernel, linux-arch

On Thu, 15 Dec 2005, Thomas Gleixner wrote:
> 
> Well, depends on the POV. A counting semaphore is a different beast than
> a mutex. At least as far as my limited knowledge of concurrency controls
> goes.

A real semaphore is counting. 

Dammit, unless the pure mutex has a _huge_ performance advantage on major 
architectures, we're not changing it. There's absolutely zero point. A 
counting semaphore is a perfectly fine mutex - the fact that it can _also_ 
be used to allow more than 1 user into a critical region and generally do 
other things is totally immaterial.

It's _extra_ stupid to re-use the names "down()" and "up()" on a 
non-counting mutex, since then the names make zero sense at all. Use 
"lock_mutex()" and "unlock_mutex()" or something, and don't break existing 
code for no measurable gain.

			Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15  0:10                         ` Thomas Gleixner
  2005-12-15  2:46                           ` Linus Torvalds
@ 2005-12-15 15:53                           ` David Howells
  1 sibling, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-15 15:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Mark Lord, David Howells, Alan Cox,
	Paul Jackson, mingo, hch, akpm, arjan, matthew, linux-kernel,
	linux-arch

Linus Torvalds <torvalds@osdl.org> wrote:

> Dammit, unless the pure mutex has a _huge_ performance advantage on major 
> architectures, we're not changing it.

Whilst it's true that the major archs are generally where the least advantage
will be seen, consider the following points:

 (1) The major archs are generally the ones where consuming a few extra bytes
     of kernel code so as to hold the slow paths for the mutexes would matter
     least.

 (2) The minor archs are where the performance gain would be most noticeable
     because many of them only have unconditional state substitution
     capabilities (XCHG/TAS/SWAP/BSET), and no matter how much you may not
     care for them, they do matter.

     Having to use spinlocks and interrupt disablement in lieu of conditional
     state substitution (such as CMPXCHG) can cost quite a bit.

 (3) Mutex performance should in no way be slower on any arch than counting
     semaphores being used to do the same job. Now, admittedly, my first
     attempt was suboptimal for archs that have better-than-XCHG capabilities,
     but I've amended that with Ingo's help, just not released it yet.

> There's absolutely zero point. A counting semaphore is a perfectly fine
> mutex

But that isn't so in one particular case: debugging. A mutex would balk at a
double-release, but a counting semaphore will just silently let things go
wrong, because that's the nature of the beast.

> - the fact that it can _also_ be used to allow more than 1 user into a
> critical region and generally do other things is totally immaterial.

There are about a dozen such uses of counting semaphores in the kernel, and
they're mainly used as token/message counters.

> It's _extra_ stupid to re-use the names "down()" and "up()" on a 
> non-counting mutex, since then the names make zero sense at all. Use 
> "lock_mutex()" and "unlock_mutex()" or something, and don't break existing 
> code for no measurable gain.

Okay. Repurposing up(), down(), DECLARE_MUTEX() and init_MUTEX() had the major
benefit that the kernel required relatively few changes. The biggest problem
with doing a whole new mutex type with a new and different API is that
DECLARE_MUTEX and init_MUTEX are already taken... :-/

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 23:40                   ` Mark Lord
  2005-12-14 23:54                     ` Andrew Morton
  2005-12-14 23:57                     ` Thomas Gleixner
@ 2005-12-15 15:37                     ` David Howells
  2005-12-15 19:28                         ` Andrew Morton
  2 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-15 15:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mark Lord, tglx, dhowells, alan, pj, mingo, hch, torvalds, arjan,
	matthew, linux-kernel, linux-arch

Andrew Morton <akpm@osdl.org> wrote:

> I must say that my interest in this stuff is down in
> needs-an-electron-microscope-to-locate territory.  down() and up() work
> just fine and they're small, efficient, well-debugged and well-understood. 
> We need a damn good reason for taking on tree-wide churn or incompatible
> renames or addition of risk.  What's the damn good reason here?

Well...

 (1) On some platforms counting semaphores _can't_ be implemented all that
     efficiently because the only atomic op you've got is something very
     simple that can only unconditionally exchange one state for another
     (XCHG/TAS/SWAP). In such cases counting semaphores have to be
     implemented by disabling interrupts and taking spinlocks.

     Okay, spinlocks are null ops when CONFIG_SMP and CONFIG_DEBUG_SPINLOCK
     are both disabled, but you still have to disable interrupts, and that
     slows things down, sometimes quite appreciably. It is, for example,
     something I really want to avoid doing on FRV as it takes a *lot* of
     cycles.

 (2) I think Ingo has some RT requirements, but he's probably better to speak
     about them.

 (3) As a slight aside, in a number of cases counting semaphores and their
     operators are being misused: there are, for example, places where
     completions should be used instead and places where *_MUTEX_LOCKED are
     used to initialise counting semaphores. There are also cases in there
     that seem unsure as to whether they're using counting semaphores or
     mutexes.

     Whilst this is not an argument for a galaxy wide churn, in and of itself,
     it does show that a good review is needed and at the very least these
     cases need to be fixed.

 (4) Various people want a mutex for which the semantics are tighter: in
     particular requiring that mutexes must be released in their owner's
     context. This makes debugging easier.

 (5) Mutexes can catch a double-release, which counting semaphores by their
     very nature can't.
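
To illustrate point (1): a two-state lock needs nothing more than an
unconditional exchange for its fast path, whereas a counter has to be
updated conditionally on its old value. The sketch below uses C11 atomics
in user space as a stand-in for an arch's XCHG instruction. It is not the
code from the patch, and xchg_mutex and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* 0 = unlocked, 1 = locked. Only these two states ever exist. */
struct xchg_mutex {
	atomic_int state;
};

/*
 * Fast-path acquire: unconditionally write 1 and look at what was there.
 * If the old value was 0 we now own the lock; if it was 1 someone else
 * does, and writing 1 over 1 changed nothing.
 */
static bool xchg_mutex_trylock(struct xchg_mutex *m)
{
	return atomic_exchange_explicit(&m->state, 1,
					memory_order_acquire) == 0;
}

/* Release: plain store of 0. Waking sleepers is left out of this sketch. */
static void xchg_mutex_unlock(struct xchg_mutex *m)
{
	assert(atomic_load_explicit(&m->state, memory_order_relaxed) == 1);
	atomic_store_explicit(&m->state, 0, memory_order_release);
}
```

The same trick does not work for a counting semaphore, because "decrement
only if greater than zero" cannot be expressed as a blind swap. That is
where the spinlock and interrupt disabling come in.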

So... Would you then object to an implementation of a mutex appearing in the
tree which semaphores that are being used as strict mutexes can be migrated
over to as the opportunity arises?

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 15:37                     ` David Howells
@ 2005-12-15 19:28                         ` Andrew Morton
  0 siblings, 0 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-15 19:28 UTC (permalink / raw)
  To: David Howells
  Cc: lkml, tglx, dhowells, alan, pj, mingo, hch, torvalds, arjan,
	matthew, linux-kernel, linux-arch

David Howells <dhowells@redhat.com> wrote:
>
> So... Would you then object to an implementation of a mutex appearing in the
>  tree which semaphores that are being used as strict mutexes can be migrated
>  over to as the opportunity arises?

That would be sane.  The semaphore->completion migration didn't hurt.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 19:28                         ` Andrew Morton
  (?)
@ 2005-12-15 20:18                         ` Andrew Morton
  2005-12-15 21:28                           ` Steven Rostedt
  2005-12-16 22:02                           ` Thomas Gleixner
  -1 siblings, 2 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-15 20:18 UTC (permalink / raw)
  To: dhowells, lkml, tglx, alan, pj, mingo, hch, torvalds, arjan,
	matthew, linux-kernel, linux-arch

Andrew Morton <akpm@osdl.org> wrote:
>
> David Howells <dhowells@redhat.com> wrote:
> >
> > So... Would you then object to an implementation of a mutex appearing in the
> >  tree which semaphores that are being used as strict mutexes can be migrated
> >  over to as the opportunity arises?
> 
> That would be sane.
>

But not very.

Look at it from the POV of major architectures: there's no way the new
mutex code will be faster than down() and up(), so we're adding a bunch of
new tricky locking code which bloats the kernel and has to be understood
and debugged for no gain.

And I don't buy the debuggability argument really.  It'd be pretty simple
to add debug code to the existing semaphore code to trap non-mutex usages. 
Then go through the few valid non-mutex users and do:

#if debug
	sem->this_is_not_a_mutex = 1;
#endif

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 20:18                         ` Andrew Morton
@ 2005-12-15 21:28                           ` Steven Rostedt
  2005-12-16 22:02                           ` Thomas Gleixner
  1 sibling, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-15 21:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-arch, linux-kernel, matthew, arjan, torvalds, hch, mingo,
	pj, alan, tglx, lkml, dhowells

On Thu, 2005-12-15 at 12:18 -0800, Andrew Morton wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > David Howells <dhowells@redhat.com> wrote:
> > >
> > > So... Would you then object to an implementation of a mutex appearing in the
> > >  tree which semaphores that are being used as strict mutexes can be migrated
> > >  over to as the opportunity arises?
> > 
> > That would be sane.
> >
> 
> But not very.
> 
> Look at it from the POV of major architectures: there's no way the new
> mutex code will be faster than down() and up(), so we're adding a bunch of
> new tricky locking code which bloats the kernel and has to be understood
> and debugged for no gain.

I see it as a stepping stone for RT ;)

> 
> And I don't buy the debuggability argument really.  It'd be pretty simple
> to add debug code to the existing semaphore code to trap non-mutex usages. 
> Then go through the few valid non-mutex users and do:
> 
> #if debug
> 	sem->this_is_not_a_mutex = 1;
> #endif

That just looks plain ugly.  Still, if you want to keep the major archs
unchanged (at least until RT is in!) then just add the following:

#define mutex_lock(x) down(x)
#define mutex_unlock(x) up(x)
#define mutex_trylock(x) (!down_trylock(x))  /* see previous email! */

Then you can add your ugly patch ;) where, on debug, the semaphores
declared with DEFINE_SEM(x) get this_is_not_a_mutex = 1 set.

-- Steve




^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 20:18                         ` Andrew Morton
  2005-12-15 21:28                           ` Steven Rostedt
@ 2005-12-16 22:02                           ` Thomas Gleixner
  1 sibling, 0 replies; 239+ messages in thread
From: Thomas Gleixner @ 2005-12-16 22:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: dhowells, lkml, alan, pj, mingo, hch, torvalds, arjan, matthew,
	linux-kernel, linux-arch

On Thu, 2005-12-15 at 12:18 -0800, Andrew Morton wrote:

> Look at it from the POV of major architectures: there's no way the new
> mutex code will be faster than down() and up(), so we're adding a bunch of
> new tricky locking code which bloats the kernel and has to be understood
> and debugged for no gain.

Look at it from the semantic POV first, which is the most important
one.

semaphores are semantically different from mutexes, so they require
different APIs.

When you have semantically different APIs, you can still implement them
for whatever (e.g. performance) reason on top of the same mechanism, but
you cannot make this work the other way round.

	tglx
	



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-15 19:28                         ` Andrew Morton
  (?)
  (?)
@ 2005-12-16 10:45                         ` David Howells
  -1 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-16 10:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: dhowells, lkml, tglx, alan, pj, mingo, hch, torvalds, arjan,
	matthew, linux-kernel, linux-arch

Andrew Morton <akpm@osdl.org> wrote:

> Look at it from the POV of major architectures: there's no way the new
> mutex code will be faster than down() and up()

I'm thinking of making the default implementation of mutexes a straight
wrapper around down() and up(). That way it'll be exactly the same as counting
semaphores, just with extra constraints when the debugging is enabled _and_
effectively extra inline documentation.

But! for archs where it does matter (and we have several - you might not care,
but others do), it can be overridden with something faster.

David
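
A rough userspace sketch of what such a wrapper might look like, under stated assumptions: the toy_* names and the TOY_MUTEX_DEBUG switch are hypothetical, the semaphore is a single-threaded model, and the owner check is just one example of an "extra constraint" that a debug build could enforce. It is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MUTEX_DEBUG 1	/* hypothetical stand-in for a debug config option */

/* Toy counting semaphore: single-threaded model, no sleeping, no atomics. */
struct toy_sem {
	int count;
};

/* Returns 0 if acquired, 1 if not, like down_trylock(). */
static int toy_down_trylock(struct toy_sem *sem)
{
	if (sem->count > 0) {
		sem->count--;
		return 0;
	}
	return 1;
}

static void toy_up(struct toy_sem *sem)
{
	sem->count++;
}

/* The mutex is the same mechanism plus optional debug state. */
struct toy_mutex {
	struct toy_sem sem;
#if TOY_MUTEX_DEBUG
	const void *owner;	/* token of the current holder, NULL if free */
#endif
};

static void toy_mutex_init(struct toy_mutex *m)
{
	assert(m != NULL);
	m->sem.count = 1;
#if TOY_MUTEX_DEBUG
	m->owner = NULL;
#endif
}

/* Returns 1 if acquired, 0 if not (the inverse of down_trylock). */
static int toy_mutex_trylock(struct toy_mutex *m, const void *who)
{
	if (toy_down_trylock(&m->sem))
		return 0;
#if TOY_MUTEX_DEBUG
	m->owner = who;
#else
	(void)who;
#endif
	return 1;
}

/* Returns 0 on success, -1 if the debug check rejects the caller. */
static int toy_mutex_unlock(struct toy_mutex *m, const void *who)
{
#if TOY_MUTEX_DEBUG
	if (m->owner != who)
		return -1;	/* not the holder, or not locked at all */
	m->owner = NULL;
#else
	(void)who;
#endif
	toy_up(&m->sem);
	return 0;
}
```

With the debug switch off the wrapper compiles down to the plain semaphore calls, which is the "exactly the same as counting semaphores" property described above.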

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  7:54   ` Ingo Molnar
                       ` (2 preceding siblings ...)
  2005-12-13  9:02     ` Christoph Hellwig
@ 2005-12-13  9:55     ` Ingo Molnar
  3 siblings, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13  9:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Howells, torvalds, hch, arjan, matthew, linux-kernel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 356 bytes --]


* Ingo Molnar <mingo@elte.hu> wrote:

> all this simplified the 'compatibility conversion' to the patch below.  
> No other non-generic changes are needed.

there were 3 more patches needed, which convert some semaphores to 
completions:

 sx8-sem2completions.patch
 cpu5wdt-sem2completions.patch
 ide-gendev-sem-to-completion.patch

all attached.

	Ingo

[-- Attachment #2: sx8-sem2completions.patch --]
[-- Type: text/plain, Size: 1526 bytes --]

 drivers/block/sx8.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

Index: linux/drivers/block/sx8.c
===================================================================
--- linux.orig/drivers/block/sx8.c
+++ linux/drivers/block/sx8.c
@@ -27,6 +27,7 @@
 #include <linux/time.h>
 #include <linux/hdreg.h>
 #include <linux/dma-mapping.h>
+#include <linux/completion.h>
 #include <asm/io.h>
 #include <asm/semaphore.h>
 #include <asm/uaccess.h>
@@ -303,7 +304,7 @@ struct carm_host {
 
 	struct work_struct		fsm_task;
 
-	struct semaphore		probe_sem;
+	struct completion		probe_comp;
 };
 
 struct carm_response {
@@ -1365,7 +1366,7 @@ static void carm_fsm_task (void *_data)
 	}
 
 	case HST_PROBE_FINISHED:
-		up(&host->probe_sem);
+		complete(&host->probe_comp);
 		break;
 
 	case HST_ERROR:
@@ -1641,7 +1642,7 @@ static int carm_init_one (struct pci_dev
 	host->flags = pci_dac ? FL_DAC : 0;
 	spin_lock_init(&host->lock);
 	INIT_WORK(&host->fsm_task, carm_fsm_task, host);
-	init_MUTEX_LOCKED(&host->probe_sem);
+	init_completion(&host->probe_comp);
 
 	for (i = 0; i < ARRAY_SIZE(host->req); i++)
 		host->req[i].tag = i;
@@ -1710,8 +1711,8 @@ static int carm_init_one (struct pci_dev
 	if (rc)
 		goto err_out_free_irq;
 
-	DPRINTK("waiting for probe_sem\n");
-	down(&host->probe_sem);
+	DPRINTK("waiting for probe_comp\n");
+	wait_for_completion(&host->probe_comp);
 
 	printk(KERN_INFO "%s: pci %s, ports %d, io %lx, irq %u, major %d\n",
 	       host->name, pci_name(pdev), (int) CARM_MAX_PORTS,

[-- Attachment #3: cpu5wdt-sem2completions.patch --]
[-- Type: text/plain, Size: 1419 bytes --]

 drivers/char/watchdog/cpu5wdt.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

Index: linux/drivers/char/watchdog/cpu5wdt.c
===================================================================
--- linux.orig/drivers/char/watchdog/cpu5wdt.c
+++ linux/drivers/char/watchdog/cpu5wdt.c
@@ -28,6 +28,7 @@
 #include <linux/init.h>
 #include <linux/ioport.h>
 #include <linux/timer.h>
+#include <linux/completion.h>
 #include <linux/jiffies.h>
 #include <asm/io.h>
 #include <asm/uaccess.h>
@@ -57,7 +58,7 @@ static int ticks = 10000;
 /* some device data */
 
 static struct {
-	struct semaphore stop;
+	struct completion stop;
 	volatile int running;
 	struct timer_list timer;
 	volatile int queue;
@@ -85,7 +86,7 @@ static void cpu5wdt_trigger(unsigned lon
 	}
 	else {
 		/* ticks doesn't matter anyway */
-		up(&cpu5wdt_device.stop);
+		complete(&cpu5wdt_device.stop);
 	}
 
 }
@@ -239,7 +240,7 @@ static int __devinit cpu5wdt_init(void)
 	if ( !val )
 		printk(KERN_INFO PFX "sorry, was my fault\n");
 
-	init_MUTEX_LOCKED(&cpu5wdt_device.stop);
+	init_completion(&cpu5wdt_device.stop);
 	cpu5wdt_device.queue = 0;
 
 	clear_bit(0, &cpu5wdt_device.inuse);
@@ -269,7 +270,7 @@ static void __devexit cpu5wdt_exit(void)
 {
 	if ( cpu5wdt_device.queue ) {
 		cpu5wdt_device.queue = 0;
-		down(&cpu5wdt_device.stop);
+		wait_for_completion(&cpu5wdt_device.stop);
 	}
 
 	misc_deregister(&cpu5wdt_misc);

[-- Attachment #4: ide-gendev-sem-to-completion.patch --]
[-- Type: text/plain, Size: 3535 bytes --]

The following patch is from Montavista.  I modified it slightly.
Semaphores are currently being used where it makes more sense for
completions.  This patch corrects that.

-- Steve

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Source: MontaVista Software, Inc.
Signed-off-by: Aleksey Makarov <amakarov@ru.mvista.com>
Description:
	The patch changes semaphores that are initialized as 
	locked to complete().

 drivers/ide/ide-probe.c |    4 ++--
 drivers/ide/ide.c       |    8 ++++----
 include/linux/ide.h     |    5 +++--
 3 files changed, 9 insertions(+), 8 deletions(-)

Index: linux/drivers/ide/ide-probe.c
===================================================================
--- linux.orig/drivers/ide/ide-probe.c
+++ linux/drivers/ide/ide-probe.c
@@ -655,7 +655,7 @@ static void hwif_release_dev (struct dev
 {
 	ide_hwif_t *hwif = container_of(dev, ide_hwif_t, gendev);
 
-	up(&hwif->gendev_rel_sem);
+	complete(&hwif->gendev_rel_comp);
 }
 
 static void hwif_register (ide_hwif_t *hwif)
@@ -1325,7 +1325,7 @@ static void drive_release_dev (struct de
 	drive->queue = NULL;
 	spin_unlock_irq(&ide_lock);
 
-	up(&drive->gendev_rel_sem);
+	complete(&drive->gendev_rel_comp);
 }
 
 /*
Index: linux/drivers/ide/ide.c
===================================================================
--- linux.orig/drivers/ide/ide.c
+++ linux/drivers/ide/ide.c
@@ -222,7 +222,7 @@ static void init_hwif_data(ide_hwif_t *h
 	hwif->mwdma_mask = 0x80;	/* disable all mwdma */
 	hwif->swdma_mask = 0x80;	/* disable all swdma */
 
-	sema_init(&hwif->gendev_rel_sem, 0);
+	init_completion(&hwif->gendev_rel_comp);
 
 	default_hwif_iops(hwif);
 	default_hwif_transport(hwif);
@@ -245,7 +245,7 @@ static void init_hwif_data(ide_hwif_t *h
 		drive->is_flash			= 0;
 		drive->vdma			= 0;
 		INIT_LIST_HEAD(&drive->list);
-		sema_init(&drive->gendev_rel_sem, 0);
+		init_completion(&drive->gendev_rel_comp);
 	}
 }
 
@@ -602,7 +602,7 @@ void ide_unregister(unsigned int index)
 		}
 		spin_unlock_irq(&ide_lock);
 		device_unregister(&drive->gendev);
-		down(&drive->gendev_rel_sem);
+		wait_for_completion(&drive->gendev_rel_comp);
 		spin_lock_irq(&ide_lock);
 	}
 	hwif->present = 0;
@@ -662,7 +662,7 @@ void ide_unregister(unsigned int index)
 	/* More messed up locking ... */
 	spin_unlock_irq(&ide_lock);
 	device_unregister(&hwif->gendev);
-	down(&hwif->gendev_rel_sem);
+	wait_for_completion(&hwif->gendev_rel_comp);
 
 	/*
 	 * Remove us from the kernel's knowledge
Index: linux/include/linux/ide.h
===================================================================
--- linux.orig/include/linux/ide.h
+++ linux/include/linux/ide.h
@@ -18,6 +18,7 @@
 #include <linux/bio.h>
 #include <linux/device.h>
 #include <linux/pci.h>
+#include <linux/completion.h>
 #include <asm/byteorder.h>
 #include <asm/system.h>
 #include <asm/io.h>
@@ -759,7 +760,7 @@ typedef struct ide_drive_s {
 	int		crc_count;	/* crc counter to reduce drive speed */
 	struct list_head list;
 	struct device	gendev;
-	struct semaphore gendev_rel_sem;	/* to deal with device release() */
+	struct completion gendev_rel_comp;	/* to deal with device release() */
 } ide_drive_t;
 
 #define to_ide_device(dev)container_of(dev, ide_drive_t, gendev)
@@ -915,7 +916,7 @@ typedef struct hwif_s {
 	unsigned	sg_mapped  : 1;	/* sg_table and sg_nents are ready */
 
 	struct device	gendev;
-	struct semaphore gendev_rel_sem; /* To deal with device release() */
+	struct completion gendev_rel_comp; /* To deal with device release() */
 
 	void		*hwif_data;	/* extra hwif data */
 

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (2 preceding siblings ...)
  2005-12-13  0:19 ` Andrew Morton
@ 2005-12-13  0:30 ` Arnd Bergmann
  2005-12-13  0:57 ` Daniel Walker
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 239+ messages in thread
From: Arnd Bergmann @ 2005-12-13  0:30 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

Am Dienstag 13 Dezember 2005 00:45 schrieb David Howells:
>  (7) Provides a debugging config option CONFIG_DEBUG_MUTEX_OWNER by which
> the mutex owner can be tracked and by which over-upping can be detected.

I can't see how your code actually detects the over-upping, although it's 
fairly obvious how it would be done. Did you miss one patch for this?

	Arnd <><

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (3 preceding siblings ...)
  2005-12-13  0:30 ` Arnd Bergmann
@ 2005-12-13  0:57 ` Daniel Walker
  2005-12-13  3:23   ` Steven Rostedt
  2005-12-13  2:57 ` Mark Lord
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 239+ messages in thread
From: Daniel Walker @ 2005-12-13  0:57 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

On Mon, 2005-12-12 at 23:45 +0000, David Howells wrote:

>  (1) Provides a simple xchg() based semaphore as a default for all
>      architectures that don't wish to override it and provide their own.
> 
>      Overriding is possible by setting CONFIG_ARCH_IMPLEMENTS_MUTEX and
>      supplying asm/mutex.h
> 
>      Partial overriding is possible by #defining mutex_grab(), mutex_release()
>      and is_mutex_locked() to perform the appropriate optimised functions.

Your code is really similar to the RT mutex, which does at least everything
that your mutex does. Assuming you've reviewed the RT mutex, why
would we want to use yours over it?

Daniel


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  0:57 ` Daniel Walker
@ 2005-12-13  3:23   ` Steven Rostedt
  0 siblings, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-13  3:23 UTC (permalink / raw)
  To: Daniel Walker
  Cc: linux-arch, linux-kernel, matthew, arjan, hch, akpm, torvalds,
	David Howells, Ingo Molnar

On Mon, 2005-12-12 at 16:57 -0800, Daniel Walker wrote:
> On Mon, 2005-12-12 at 23:45 +0000, David Howells wrote:
> 
> >  (1) Provides a simple xchg() based semaphore as a default for all
> >      architectures that don't wish to override it and provide their own.
> > 
> >      Overriding is possible by setting CONFIG_ARCH_IMPLEMENTS_MUTEX and
> >      supplying asm/mutex.h
> > 
> >      Partial overriding is possible by #defining mutex_grab(), mutex_release()
> >      and is_mutex_locked() to perform the appropriate optimised functions.
> 
> Your code is really similar to the RT mutex, which does at least everything
> that your mutex does. Assuming you've reviewed the RT mutex, why
> would we want to use yours over it?

Maybe this would be the better !PREEMPT_RT version.  But the true mutex
that Ingo is making would be used for the PREEMPT_RT side.

This code at least brings down the overhead of semaphores where they
are not really needed.  Looking at the code slightly (I must admit, I
spent maybe 30 seconds looking at it), it does seem a little similar to
Ingo's.  Could just be coincidence, since the methods are pretty much
what multiple people would come up with.  But you both work for RedHat,
hmm.

-- Steve



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (4 preceding siblings ...)
  2005-12-13  0:57 ` Daniel Walker
@ 2005-12-13  2:57 ` Mark Lord
  2005-12-13  3:17   ` Steven Rostedt
  2005-12-13  9:06   ` Christoph Hellwig
  2005-12-13  9:54 ` David Howells
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 239+ messages in thread
From: Mark Lord @ 2005-12-13  2:57 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

 > (5) Redirects the following to apply to the new mutexes rather than the
 >     traditional semaphores:
 >
 >	down()
 >	down_trylock()
 >	down_interruptible()
 >	up()

This will BREAK a lot of out-of-tree stuff if merged.

So please figure out some way to hang a HUGE banner out there
so that the external codebases know they need updating.

The simplest way would be to NOT re-use the up()/down() symbols,
but rather to either keep them as-is (counting semaphores),
or delete them entirely (so that external code *knows* of the change).

Cheers

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  2:57 ` Mark Lord
@ 2005-12-13  3:17   ` Steven Rostedt
  2005-12-13  9:06   ` Christoph Hellwig
  1 sibling, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-13  3:17 UTC (permalink / raw)
  To: Mark Lord
  Cc: linux-arch, linux-kernel, matthew, arjan, hch, akpm, torvalds,
	David Howells

On Mon, 2005-12-12 at 21:57 -0500, Mark Lord wrote:
>  > (5) Redirects the following to apply to the new mutexes rather than the
>  >     traditional semaphores:
>  >
>  >	down()
>  >	down_trylock()
>  >	down_interruptible()
>  >	up()
> 
> This will BREAK a lot of out-of-tree stuff if merged.
> 
> So please figure out some way to hang a HUGE banner out there
> so that the external codebases know they need updating.
> 
> The simplest way would be to NOT re-use the up()/down() symbols,
> but rather to either keep them as-is (counting semaphores),
> or delete them entirely (so that external code *knows* of the change).

Actually, up and down don't imply mutex at all.  So maybe it would be
better to keep up and down as normal semaphores, rename what you want to
mutex_lock / mutex_unlock which makes it obvious what it is, and then
you can go through and find all the semaphores that are being used as
mutexes (or is that mutices?) and make the change more incrementally.

-- Steve



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  2:57 ` Mark Lord
  2005-12-13  3:17   ` Steven Rostedt
@ 2005-12-13  9:06   ` Christoph Hellwig
  1 sibling, 0 replies; 239+ messages in thread
From: Christoph Hellwig @ 2005-12-13  9:06 UTC (permalink / raw)
  To: Mark Lord
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

On Mon, Dec 12, 2005 at 09:57:40PM -0500, Mark Lord wrote:
> This will BREAK a lot of out-of-tree stuff if merged.

Well, bad luck for them.

> The simplest way would be to NOT re-use the up()/down() symbols,
> but rather to either keep them as-is (counting semaphores),
> or delete them entirely (so that external code *knows* of the change).

That I agree with actually.  Keeping the semaphore interface as-is
would simplify in-kernel transition a lot as well and make it easier for
people to get the API read.  And the mutex symbols could get far more sensible
names like mutex_lock, mutex_unlock and mutex_trylock.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (5 preceding siblings ...)
  2005-12-13  2:57 ` Mark Lord
@ 2005-12-13  9:54 ` David Howells
  2005-12-13 10:13   ` Ingo Molnar
                     ` (2 more replies)
  2005-12-13 10:48 ` David Howells
                   ` (7 subsequent siblings)
  14 siblings, 3 replies; 239+ messages in thread
From: David Howells @ 2005-12-13  9:54 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> We have atomic_cmpxchg. Can you use that for a sufficiently generic
> implementation?

No. CMPXCHG/CAS is not as available as XCHG, and it's also unnecessary.

David
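
For illustration, a minimal userspace sketch of why an exchange alone is enough for the lock fastpath. It uses C11 atomics rather than the kernel's xchg(), the xchg_mutex names are hypothetical, and it leaves out the waiter handling a real sleeping mutex needs; it is not the code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct xchg_mutex {
	atomic_int state;	/* 0 = unlocked, 1 = locked */
};

static void xchg_mutex_init(struct xchg_mutex *m)
{
	assert(m != NULL);
	atomic_init(&m->state, 0);
}

/*
 * Unconditionally store 1 and look at the previous value.  If it was 0
 * we took the lock; if it was already 1 the store changed nothing and
 * somebody else still holds it.  No compare step is needed.
 */
static bool xchg_mutex_trylock(struct xchg_mutex *m)
{
	return atomic_exchange_explicit(&m->state, 1,
					memory_order_acquire) == 0;
}

static void xchg_mutex_unlock(struct xchg_mutex *m)
{
	atomic_store_explicit(&m->state, 0, memory_order_release);
}
```

The trick works because writing "locked" over "locked" is harmless; it stops being enough once the lock word has to carry more than two states.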

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:54 ` David Howells
@ 2005-12-13 10:13   ` Ingo Molnar
  2005-12-13 10:34     ` Ingo Molnar
  2005-12-14  1:00   ` Nick Piggin
  2005-12-14 10:54   ` David Howells
  2 siblings, 1 reply; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13 10:13 UTC (permalink / raw)
  To: David Howells
  Cc: Nick Piggin, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch


* David Howells <dhowells@redhat.com> wrote:

> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
> > We have atomic_cmpxchg. Can you use that for a sufficiently generic
> > implementation?
> 
> No. CMPXCHG/CAS is not as available as XCHG, and it's also unnecessary.

take a look at the PREEMPT_RT implementation of mutexes: it uses 
cmpxchg(), and thus both the down() and the up() fastpaths are lockless!  
(And that is a mutex type that does a lot more things, as it supports 
priority inheritance.)

architectures which don't have cmpxchg can use a spinlock just fine.

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:13   ` Ingo Molnar
@ 2005-12-13 10:34     ` Ingo Molnar
  2005-12-13 10:37       ` Ingo Molnar
  2005-12-13 12:47       ` Oliver Neukum
  0 siblings, 2 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13 10:34 UTC (permalink / raw)
  To: David Howells
  Cc: Nick Piggin, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

* Ingo Molnar <mingo@elte.hu> wrote:

> > Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > 
> > > We have atomic_cmpxchg. Can you use that for a sufficiently generic
> > > implementation?
> > 
> > No. CMPXCHG/CAS is not as available as XCHG, and it's also unnecessary.
> 
> take a look at the PREEMPT_RT implementation of mutexes: it uses 
> cmpxchg(), and thus both the down() and the up() fastpaths are lockless!  
> (And that is a mutex type that does a lot more things, as it supports 
> priority inheritance.)
> 
> architectures which don't have cmpxchg can use a spinlock just fine.

the cost of a spinlock-based generic_cmpxchg could be significantly 
reduced by adding a generic_cmpxchg() variant that also includes a 
'spinlock pointer' parameter.

Architectures that do not have the instruction can use the specified 
spinlock to do the cmpxchg. This means that there won't be one single 
global spinlock to emulate cmpxchg, but the mutex's own spinlock can be 
used for it.

Architectures that have the cmpxchg instruction would simply ignore the 
parameter, and would incur no overhead.

	Ingo
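
A hedged userspace sketch of the emulated variant, assuming a signature the mail does not spell out: toy_generic_cmpxchg() and the toy_spinlock type are hypothetical names, and the spinlock is a toy built on C11 atomics rather than the kernel's spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy spinlock standing in for the kernel's spinlock_t. */
struct toy_spinlock {
	atomic_bool held;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_exchange_explicit(&l->held, true, memory_order_acquire))
		;	/* spin until the previous value was "free" */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_store_explicit(&l->held, false, memory_order_release);
}

/*
 * Emulated compare-and-exchange for an architecture without the
 * instruction.  The caller passes the lock that guards *ptr (for
 * example the mutex's own spinlock), so no global lock is needed.
 * Returns the value found in *ptr; the swap happened iff that
 * value equals old.
 */
static long toy_generic_cmpxchg(long *ptr, long old, long new_val,
				struct toy_spinlock *lock)
{
	long seen;

	assert(ptr != NULL && lock != NULL);
	toy_spin_lock(lock);
	seen = *ptr;
	if (seen == old)
		*ptr = new_val;
	toy_spin_unlock(lock);
	return seen;
}
```

The emulation is only atomic with respect to code that takes the same lock before touching *ptr. An architecture with a native instruction would ignore the lock argument entirely.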

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:34     ` Ingo Molnar
@ 2005-12-13 10:37       ` Ingo Molnar
  2005-12-13 12:47       ` Oliver Neukum
  1 sibling, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13 10:37 UTC (permalink / raw)
  To: David Howells
  Cc: Nick Piggin, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch


* Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > > Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > > 
> > > > We have atomic_cmpxchg. Can you use that for a sufficiently generic
> > > > implementation?
> > > 
> > > No. CMPXCHG/CAS is not as available as XCHG, and it's also unnecessary.
> > 
> > take a look at the PREEMPT_RT implementation of mutexes: it uses 
> > cmpxchg(), and thus both the down() and the up() fastpaths are lockless!  
> > (And that is a mutex type that does a lot more things, as it supports 
> > priority inheritance.)
> > 
> > architectures which don't have cmpxchg can use a spinlock just fine.
> 
> the cost of a spinlock-based generic_cmpxchg could be significantly 
> reduced by adding a generic_cmpxchg() variant that also includes a 
> 'spinlock pointer' parameter.
> 
> Architectures that do not have the instruction can use the specified 
> spinlock to do the cmpxchg. This means that there won't be one single 
> global spinlock to emulate cmpxchg, but the mutex's own spinlock can 
> be used for it.
> 
> Architectures that have the cmpxchg instruction would simply ignore 
> the parameter, and would incur no overhead.

an additional twist: we could add generic_cmpxchg_lock(), which would 
return the spinlock locked if the cmpxchg fails. (this is what we want 
to do anyway) This way architectures that don't have CMPXCHG would take 
the spinlock unconditionally and do the cmp-xchg emulation, while the 
other architectures would take it only if the cmpxchg fails.

	Ingo
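
One possible reading of that twist, sketched in userspace C with both paths side by side. The function name follows the mail, but the toy_ prefix, the signature, the boolean return convention and the TOY_HAVE_CMPXCHG switch are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_HAVE_CMPXCHG 1	/* hypothetical: 0 selects the emulated path */

/* Toy spinlock standing in for the kernel's spinlock_t. */
struct toy_spinlock {
	atomic_bool held;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_exchange_explicit(&l->held, true, memory_order_acquire))
		;	/* spin */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_store_explicit(&l->held, false, memory_order_release);
}

/*
 * Returns true if *ptr went from old to new_val; the lock is then NOT
 * held on return.  Returns false if the compare failed; the lock IS
 * then held on return, so the caller can go straight to its slowpath.
 */
static bool toy_generic_cmpxchg_lock(atomic_long *ptr, long old, long new_val,
				     struct toy_spinlock *lock)
{
	assert(ptr != NULL && lock != NULL);
#if TOY_HAVE_CMPXCHG
	/* Native path: only touch the lock when the fastpath fails. */
	if (atomic_compare_exchange_strong(ptr, &old, new_val))
		return true;
	toy_spin_lock(lock);
	return false;
#else
	/* Emulated path: take the lock unconditionally. */
	toy_spin_lock(lock);
	if (atomic_load(ptr) == old) {
		atomic_store(ptr, new_val);
		toy_spin_unlock(lock);
		return true;
	}
	return false;
#endif
}
```

Either way the caller sees the same contract: success means lockless completion, failure means "you now hold the lock", and the caller must drop it with toy_spin_unlock() when its slowpath is done.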

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:34     ` Ingo Molnar
  2005-12-13 10:37       ` Ingo Molnar
@ 2005-12-13 12:47       ` Oliver Neukum
  2005-12-13 13:09         ` Alan Cox
  1 sibling, 1 reply; 239+ messages in thread
From: Oliver Neukum @ 2005-12-13 12:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Howells, Nick Piggin, torvalds, akpm, hch, arjan, matthew,
	linux-kernel, linux-arch

Am Dienstag, 13. Dezember 2005 11:34 schrieb Ingo Molnar:

> the cost of a spinlock-based generic_cmpxchg could be significantly 
> reduced by adding a generic_cmpxchg() variant that also includes a 
> 'spinlock pointer' parameter.
> 
> Architectures that do not have the instruction can use the specified 
> spinlock to do the cmpxchg. This means that there won't be one single 
> global spinlock to emulate cmpxchg, but the mutex's own spinlock can be 
> used for it.

Can't you use the pointer as a hash input?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 12:47       ` Oliver Neukum
@ 2005-12-13 13:09         ` Alan Cox
  2005-12-13 13:13           ` Matthew Wilcox
  2005-12-13 13:24           ` Oliver Neukum
  0 siblings, 2 replies; 239+ messages in thread
From: Alan Cox @ 2005-12-13 13:09 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Ingo Molnar, David Howells, Nick Piggin, torvalds, akpm, hch,
	arjan, matthew, linux-kernel, linux-arch

On Maw, 2005-12-13 at 13:47 +0100, Oliver Neukum wrote:
> > spinlock to do the cmpxchg. This means that there won't be one single 
> > global spinlock to emulate cmpxchg, but the mutex's own spinlock can be 
> > used for it.
> 
> Can't you use the pointer as a hash input?

Some platforms already do this for certain sets of operations like
atomic_t. The downside however is that you no longer control the lock
contention or cache line bouncing. It becomes a question of luck rather
than science as to how well it scales.
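
To make the trade-off concrete, a small userspace sketch of picking a lock by hashing the object's address. The table size, the cache-line shift and all toy_* names are assumptions chosen for illustration; this is not the scheme used by any particular architecture.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LOCK_HASH_SIZE	16	/* power of two, chosen arbitrarily */
#define TOY_CACHELINE_SHIFT	6	/* assumes 64-byte cache lines */

/* Toy spinlock standing in for the kernel's spinlock_t. */
struct toy_spinlock {
	atomic_bool held;
};

/* One shared table of locks instead of a lock embedded in each object. */
static struct toy_spinlock toy_lock_table[TOY_LOCK_HASH_SIZE];

/* Drop the offset within the cache line, then fold into the table. */
static size_t toy_lock_index(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;

	return (size_t)((a >> TOY_CACHELINE_SHIFT) & (TOY_LOCK_HASH_SIZE - 1));
}

static struct toy_spinlock *toy_lock_for(const void *addr)
{
	assert(addr != NULL);
	return &toy_lock_table[toy_lock_index(addr)];
}
```

With these constants, two objects whose addresses differ by a multiple of 1024 bytes map to the same slot, so unrelated objects can end up contending on one lock; which objects collide depends on where they happen to be allocated.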


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:09         ` Alan Cox
@ 2005-12-13 13:13           ` Matthew Wilcox
  2005-12-13 14:04             ` Alan Cox
  2005-12-13 13:24           ` Oliver Neukum
  1 sibling, 1 reply; 239+ messages in thread
From: Matthew Wilcox @ 2005-12-13 13:13 UTC (permalink / raw)
  To: Alan Cox
  Cc: Oliver Neukum, Ingo Molnar, David Howells, Nick Piggin, torvalds,
	akpm, hch, arjan, linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 01:09:31PM +0000, Alan Cox wrote:
> On Maw, 2005-12-13 at 13:47 +0100, Oliver Neukum wrote:
> > Can't you use the pointer as a hash input?
> 
> Some platforms already do this for certain sets of operations like
> atomic_t. The downside however is that you no longer control the lock
> contention or cache line bouncing. It becomes a question of luck rather
> than science as to how well it scales.

s/luck/statistics/

You can always increase the size of the hash table if you encounter
scaling problems.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:13           ` Matthew Wilcox
@ 2005-12-13 14:04             ` Alan Cox
  0 siblings, 0 replies; 239+ messages in thread
From: Alan Cox @ 2005-12-13 14:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Oliver Neukum, Ingo Molnar, David Howells, Nick Piggin, torvalds,
	akpm, hch, arjan, linux-kernel, linux-arch

On Maw, 2005-12-13 at 06:13 -0700, Matthew Wilcox wrote:
> > Some platforms already do this for certain sets of operations like
> > atomic_t. The downside however is that you no longer control the lock
> > contention or cache line bouncing. It becomes a question of luck rather
> > than science as to how well it scales.
> 
> s/luck/statistics/

Unfortunately not always. Statistical probability models generally
assume that samples are independent, as does just growing the hash
table. If there are correlations then how those correlations and the
hash function interact isn't simple statistics so growing the hash might
not work as well as would be hoped.

A second problem with the hash is it makes priority inversions
entertaining and unpredictable when using Ingo's -rt work. That isn't a
big problem with the atomic_t stuff in the parisc tree because atomic_t
is effectively the top of the lock ordering for the system.

Growing the hash, while it may improve the behaviour, isn't going to work
as well as embedding the lock in the object.

Alan

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:09         ` Alan Cox
  2005-12-13 13:13           ` Matthew Wilcox
@ 2005-12-13 13:24           ` Oliver Neukum
  1 sibling, 0 replies; 239+ messages in thread
From: Oliver Neukum @ 2005-12-13 13:24 UTC (permalink / raw)
  To: Alan Cox
  Cc: Ingo Molnar, David Howells, Nick Piggin, torvalds, akpm, hch,
	arjan, matthew, linux-kernel, linux-arch

Am Dienstag, 13. Dezember 2005 14:09 schrieb Alan Cox:
> On Maw, 2005-12-13 at 13:47 +0100, Oliver Neukum wrote:
> > > spinlock to do the cmpxchg. This means that there won't be one single 
> > > global spinlock to emulate cmpxchg, but the mutex's own spinlock can be 
> > > used for it.
> > 
> > Can't you use the pointer as a hash input?
> 
> Some platforms already do this for certain sets of operations like
> atomic_t. The downside however is that you no longer control the lock
> contention or cache line bouncing. It becomes a question of luck rather
> than science as to how well it scales.

On the other hand you don't control cache eviction either, do you?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:54 ` David Howells
  2005-12-13 10:13   ` Ingo Molnar
@ 2005-12-14  1:00   ` Nick Piggin
  2005-12-14 10:54   ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-14  1:00 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
> 
>>We have atomic_cmpxchg. Can you use that for a sufficiently generic
>>implementation?
> 
> 
> No. CMPXCHG/CAS is not as available as XCHG, and it's also unnecessary.
> 

atomic_cmpxchg should be available on all platforms.

While it may be strictly unnecessary, if it can be used to avoid
having a crappy default implementation that requires it to be
reimplemented in all architectures then that would be a good thing.

Any arguments about bad scalability or RT behaviour of the hashed
spinlock emulation atomic_t implementations are silly because they
are used by all atomic_ operations. It is an arch implementation
detail that generic code should not have to worry about.

-- 
SUSE Labs, Novell Inc.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13  9:54 ` David Howells
  2005-12-13 10:13   ` Ingo Molnar
  2005-12-14  1:00   ` Nick Piggin
@ 2005-12-14 10:54   ` David Howells
  2005-12-14 11:17     ` Nick Piggin
  2005-12-14 11:46     ` David Howells
  2 siblings, 2 replies; 239+ messages in thread
From: David Howells @ 2005-12-14 10:54 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> atomic_cmpxchg should be available on all platforms.

Two points:

 (1) If it's using spinlocks, then it's pointless to use atomic_cmpxchg.

 (2) atomic_t is a 32-bit type, and on a 64-bit platform I will want a 64-bit
     type so that I can stick the owner address in there (I've got a second
     variant not yet released).

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 10:54   ` David Howells
@ 2005-12-14 11:17     ` Nick Piggin
  2005-12-14 11:46     ` David Howells
  1 sibling, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-14 11:17 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
> 
>>atomic_cmpxchg should be available on all platforms.
> 
> 
> Two points:
> 
>  (1) If it's using spinlocks, then it's pointless to use atomic_cmpxchg.
> 

Why?

>  (2) atomic_t is a 32-bit type, and on a 64-bit platform I will want a 64-bit
>      type so that I can stick the owner address in there (I've got a second
>      variant not yet released).
> 

I'm sure you could use a separate field as it would be a debug
option, right?

But atomic longs are coming along and it is probably feasible to
do 64-bit atomic_cmpxchg on all 64-bit word architectures if you
really needed that.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 10:54   ` David Howells
  2005-12-14 11:17     ` Nick Piggin
@ 2005-12-14 11:46     ` David Howells
  2005-12-14 21:23       ` Nick Piggin
  2005-12-16 12:00       ` David Howells
  1 sibling, 2 replies; 239+ messages in thread
From: David Howells @ 2005-12-14 11:46 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> >  (1) If it's using spinlocks, then it's pointless to use atomic_cmpxchg.
> > 
> 
> Why?

Because you're going to end up with loops around the cmpxchg bit in certain
places, and if you do, then you've effectively got this:

	spin_lock_irqsave(mutexlock, flags);
	do {
		new = calc_state(orig, oldstate);
		spin_lock_irqsave(atomiclock, flags2);
		oldstate = __cmpxchg(&mutex->count, orig, new);
		spin_unlock_irqrestore(atomiclock, flags2);
	} while (oldstate != orig);
	spin_unlock_irqrestore(mutexlock, flags);

which is very bad. You _should_ have:

	spin_lock_irqsave(mutexlock, flags);
	oldstate = mutex->count;
	mutex->count = modify_state(mutex->count);
	spin_unlock_irqrestore(mutexlock, flags);

instead.

No. If you have XCHG/TAS/BSET/SWAP, but not CMPXCHG/CAS then you can do a lot
better by not pretending that cmpxchg exists. That way the fast paths don't
have to take any spinlocks at all.

And if you've got LLD/SCD or LDARX/STDCX or similar then you can probably do
better than CMPXCHG also.

If you want an illustration, then consider this:

	#define __mutex_trylock(mutex)					\
	({								\
		int oldstate;						\
									\
		asm volatile("swap%I0 %M0,%1"				\
			     : "+m"(mutex->state), "=r"(oldstate)	\
			     : "1"(1)					\
			     : "memory");				\
									\
		oldstate == 0;						\
	})

	static inline int down_trylock(struct mutex *mutex)
	{
		if (likely(__mutex_trylock(mutex))) {
			/* success */
			return 0;
		}

		/* failure */
		return 1;
	}

	void fastcall __sched down(struct mutex *mutex)
	{
		if (down_trylock(mutex) == 1)
			__down(mutex);
	}

	EXPORT_SYMBOL(down);

On FRV, this can be made to map to:

	setlos	0x1,gr4
	ori	gr4,0,gr5
	swap	@(gr8,gr0),gr5
	subicc	gr5,0,gr0,icc0
	beqlr	icc0,0x2	<-- probable-rated conditional return
	sethi.p	0xc01c,gr14
	setlo	0x9df0,gr14
	jmpl	@(gr14,gr0)

That's an out-of-line fast path of _5_ instructions. Attempting to emulate
CMPXCHG requires a lot more. On FRV, the case is alleviated somewhat since it
doesn't yet provide spinlocks and support SMP, but you'd be very hard pressed
to squeeze it down to just five instructions.
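
As a rough portable illustration of the same idea (a userspace C11 sketch, not
the kernel code and not the FRV implementation; the sketch_* names are made up
for this example), an unconditional exchange is all the trylock fast path
needs:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical two-state mutex word: 0 = unlocked, 1 = locked. */
struct sketch_mutex {
	atomic_int state;
};

/* Unconditionally swap in 1; the lock is ours only if the old value was 0. */
static inline bool sketch_mutex_trylock(struct sketch_mutex *m)
{
	return atomic_exchange_explicit(&m->state, 1,
					memory_order_acquire) == 0;
}

/* Uncontended release only; a real mutex must also deal with waiters. */
static inline void sketch_mutex_release(struct sketch_mutex *m)
{
	/* Releasing a mutex that is not held would be a caller bug. */
	assert(atomic_load_explicit(&m->state, memory_order_relaxed) == 1);
	atomic_store_explicit(&m->state, 0, memory_order_release);
}
```

Note that a failed trylock still writes 1 over a state that was already 1,
which is harmless for a two-state mutex and is what lets this work without any
compare step.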

> >  (2) atomic_t is a 32-bit type, and on a 64-bit platform I will want a
> >      64-bit type so that I can stick the owner address in there (I've got
> >      a second variant not yet released).
> > 
> 
> I'm sure you could use a separate field as it would be a debug
> option, right?

True. Ingo suggested this, and it seems reasonable. OTOH, shrinking the count
by 4 bytes would allow the whole structure to shrink by 8 on a 64-bit platform
with a 4-byte spinlock, which would be even better.

> But atomic longs are coming along and it is probably feasible to
> do 64-bit atomic_cmpxchg on all 64-bit word architectures if you
> really needed that.

That would be fine; except they don't yet exist. The way I'd do it is to
provide a default __mutex_cmpxchg() that the arch can override if it wants to.

> Send instant messages to your online friends http://au.messenger.yahoo.com 

No thanks.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:46     ` David Howells
@ 2005-12-14 21:23       ` Nick Piggin
  2005-12-16 12:00       ` David Howells
  1 sibling, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-14 21:23 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
> 
>>> (1) If it's using spinlocks, then it's pointless to use atomic_cmpxchg.
>>>
>>
>>Why?
> 
> 
> Because you're going to end up with loops around the cmpxchg bit in certain
> places, and if you do, then you've effectively got this:
> 
> 	spin_lock_irqsave(mutexlock, flags);
> 	do {
> 		new = calc_state(orig, oldstate);
> 		spin_lock_irqsave(atomiclock, flags2);
> 		oldstate = __cmpxchg(&mutex->count, orig, new);
> 		spin_unlock_irqrestore(atomiclock, flags2);
> 	} while (oldstate != orig);
> 	spin_unlock_irqrestore(mutexlock, flags);
> 
> which is very bad. You _should_ have:
> 
> 	spin_lock_irqsave(mutexlock, flags);
> 	oldstate = mutex->count;
> 	mutex->count = modify_state(mutex->count);
> 	spin_unlock_irqrestore(mutexlock, flags);
> 
> instead.

I was under the impression that with cmpxchg, you don't need the mutex lock.
If you do then sure, cmpxchg doesn't buy you anything (even if the arch does
natively support it).

> 
> No. If you have XCHG/TAS/BSET/SWAP, but not CMPXCHG/CAS then you can do a lot
> better by not pretending that cmpxchg exists. That way the fast paths don't
> have to take any spinlocks at all.
> 
> And if you've got LLD/SCD or LDARX/STDCX or similar then you can probably do
> better than CMPXCHG also.
> 
> If you want an illustration, then consider this:
> 
> 	#define __mutex_trylock(mutex)					\
> 	({								\
> 		int oldstate;						\
> 									\
> 		asm volatile("swap%I0 %M0,%1"				\
> 			     : "+m"(mutex->state), "=r"(oldstate)	\
> 			     : "1"(1)					\
> 			     : "memory");				\
> 									\
> 		oldstate == 0;						\
> 	})
> 
> 	static inline int down_trylock(struct mutex *mutex)
> 	{
> 		if (likely(__mutex_trylock(mutex))) {
> 			/* success */
> 			return 0;
> 		}
> 
> 		/* failure */
> 		return 1;
> 	}
> 
> 	void fastcall __sched down(struct mutex *mutex)
> 	{
> 		if (down_trylock(mutex) == 1)
> 			__down(mutex);
> 	}
> 
> 	EXPORT_SYMBOL(down);
> 
> On FRV, this can be made to map to:
> 
> 	setlos	0x1,gr4
> 	ori	gr4,0,gr5
> 	swap	@(gr8,gr0),gr5
> 	subicc	gr5,0,gr0,icc0
> 	beqlr	icc0,0x2	<-- probable-rated conditional return
> 	sethi.p	0xc01c,gr14
> 	setlo	0x9df0,gr14
> 	jmpl	@(gr14,gr0)
> 
> That's an out-of-line fast path of _5_ instructions. Attempting to emulate
> CMPXCHG requires a lot more. On FRV, the case is alleviated somewhat since it
> doesn't yet provide spinlocks and support SMP, but you'd be very hard pressed
> to squeeze it down to just five instructions.
> 

I think only parisc and sparc32 "emulate" cmpxchg with spinlocks.
For architectures like i386, x86_64, ppc, ia64, etc. the cmpxchg will
give good code.

Then if FRV was still unhappy, then you could override the mutex in that
architecture. This just seemed better to me than having a crappy simple
implementation that *everyone* will want to override (and I see FRV
overrides it as well, I don't see how you can complain about that).

But I guess that's moot if you can't do a lockless version using
cmpxchg.

> 
>>> (2) atomic_t is a 32-bit type, and on a 64-bit platform I will want a
>>>     64-bit type so that I can stick the owner address in there (I've got
>>>     a second variant not yet released).
>>>
>>
>>I'm sure you could use a separate field as it would be a debug
>>option, right?
> 
> 
> True. Ingo suggested this, and it seems reasonable. OTOH, shrinking the count
> by 4 bytes would allow the whole structure to shrink by 8 on a 64-bit platform
> with a 4-byte spinlock, which would be even better.
> 

I'm sure you'd manage. spinlocks can get pretty large with debugging on too.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:46     ` David Howells
  2005-12-14 21:23       ` Nick Piggin
@ 2005-12-16 12:00       ` David Howells
  2005-12-16 13:16         ` Nick Piggin
                           ` (2 more replies)
  1 sibling, 3 replies; 239+ messages in thread
From: David Howells @ 2005-12-16 12:00 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> I was under the impression that with cmpxchg, you don't need the mutex lock.
> If you do then sure, cmpxchg doesn't buy you anything (even if the arch does
> natively support it).

Consider the slow path...

Imagine the mutex has three states: 0 (unset), 1 (held), 3 (contention).

	MUTEX	PROCESS A	PROCESS B	PROCESS C
	======	==============	==============	==============
	1,B,-
	1,B,-	-->mutex_lock()
	1,B,-	cmpxchg(0,1) [failed]
	1,B,-	-->__mutex_lock()
	1,B,-			-->mutex_unlock()
	1,B,-			cmpxchg(1,0) [success]
	0,-,-			<--mutex_unlock()
	0,-,-	spin_lock
	0,-,A	cmpxchg(0,1) [success]
	1,A,A	spin_unlock
	1,A,-	<--__mutex_lock()
	1,A,-	<--mutex_lock()

Or:

	MUTEX	PROCESS A	PROCESS B	PROCESS C
	======	==============	==============	==============	
	1,B,-
	1,B,-	-->mutex_lock()
	1,B,-	cmpxchg(0,1) [failed]
	1,B,-	-->__mutex_lock()
	1,B,-	spin_lock
	1,B,A	cmpxchg(0,1) [failed]
	1,B,A			-->mutex_unlock()
	1,B,A			cmpxchg(1,0) [success]
	0,-,A			<--mutex_unlock()
	0,-,A	cmpxchg(1,3) [failed]
	0,-,A	cmpxchg(0,1) [success]
	1,A,A	spin_unlock
	1,A,-	<--__mutex_lock()
	1,A,-	<--mutex_lock()

Or:

	MUTEX	PROCESS A	PROCESS B	PROCESS C
	======	==============	==============	==============
	1,B,-
	1,B,-	-->mutex_lock()
	1,B,-	cmpxchg(0,1) [failed]
	1,B,-	-->__mutex_lock()
	1,B,-	spin_lock
	1,B,A	cmpxchg(0,1) [failed]
	1,B,A			-->mutex_unlock()
	1,B,A			cmpxchg(1,0) [success]
	0,-,A			<--mutex_unlock()
	0,-,A	cmpxchg(1,3) [failed]
	0,-,A					-->mutex_lock()
	0,-,A					cmpxchg(0,1) [success]
	1,C,A					<--mutex_lock()
	1,C,A	cmpxchg(0,1) [failed]
	1,C,A	cmpxchg(1,3) [success]
	3,A,A	spin_unlock
	3,A,-	<--__mutex_lock()
	3,A,-	<--mutex_lock()

See how many cmpxchgs you may end up doing? Now imagine that cmpxchg() is
implemented as:

	spin_lock_irqsave(&common_lock[N], flags);
	actual = *state;
	if (actual == old)
		*state = new;
	spin_unlock_irqrestore(&common_lock[N], flags);
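
For reference, a self-contained userspace sketch of that kind of emulation
might look like the following. This is only an illustration under stated
assumptions: the lock table size, the address hash and every sketch_* name are
invented here, the toy spinlock does not disable interrupts as the kernel
version would, and unlike the fragment above it returns the value it observed
so the caller can tell whether the swap happened.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_NR_LOCKS 16	/* hypothetical size of the hashed lock table */

/* Static atomics are zero-initialised, i.e. every lock starts out free. */
static atomic_int sketch_locks[SKETCH_NR_LOCKS];

/* Pick a lock by hashing the address of the word being operated on. */
static inline atomic_int *sketch_lock_for(const void *addr)
{
	return &sketch_locks[((uintptr_t)addr >> 4) % SKETCH_NR_LOCKS];
}

/* Emulated compare-and-exchange: returns the value found in *state. */
static inline int sketch_emulated_cmpxchg(int *state, int old, int newval)
{
	atomic_int *lock = sketch_lock_for(state);
	int actual;

	while (atomic_exchange_explicit(lock, 1, memory_order_acquire))
		;	/* spin until the hashed lock is free */

	actual = *state;
	if (actual == old)
		*state = newval;

	atomic_store_explicit(lock, 0, memory_order_release);

	return actual;	/* equals 'old' if and only if the store happened */
}
```

Every call pays for one lock/unlock pair, so a retry loop built on top of this
takes the hashed lock once per attempt.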

Now my point about using LL/SC is that:

	1,C,A	cmpxchg(0,1) [failed]
	1,C,A	cmpxchg(1,3) [success]
	3,C,A	...

Can be turned into:

	1,C,A	x = LL()
	1,C,A	x |= 2;
	1,C,A	SC(3) [success]
	3,C,A	...

On x86 you could use:

	1,C,A	LOCK OR (2)
	3,C,A	...

instead.
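
In portable C11 terms the difference can be sketched roughly as follows (a
userspace illustration only; the sketch_* names and the bit values are
assumptions chosen to match the 0/1/3 states used above, and what instruction
the compiler actually emits for atomic_fetch_or() is not guaranteed):

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_HELD		1	/* bit 0: mutex is held */
#define SKETCH_CONTENDED	2	/* bit 1: somebody is waiting */

/*
 * cmpxchg style: load a value to work on, then retry until no one changed
 * it in between. Returns the value that was replaced.
 */
static inline int sketch_set_contended_cas(atomic_int *state)
{
	int old = atomic_load(state);

	while (!atomic_compare_exchange_weak(state, &old,
					     old | SKETCH_CONTENDED))
		;	/* 'old' was refreshed by the failed attempt */

	return old;
}

/* Single read-modify-write: no guess about the old value, no retry loop. */
static inline void sketch_set_contended_or(atomic_int *state)
{
	(void)atomic_fetch_or(state, SKETCH_CONTENDED);
}
```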

Now, contention isn't very likely, so using CMPXCHG _may_ be good enough _if_
you have it. But if you have to emulate it by using spinlocks, you're far
better off just wrapping the entire thing in spinlocks and not pretending to use
atomic ops to access the counter; unless, of course, you have something that
can do a 1-bit XCHG or better...

	MUTEX	PROCESS A	PROCESS B	PROCESS C
	======	==============	==============	==============
	0,-,-			-->mutex_lock()
	0,-,-			xchg(1) == 0
	1,B,-			<--mutex_lock()
	1,B,-
	1,B,-	-->mutex_lock()
	1,B,-	xchg(1) == 1
	1,B,-	-->__mutex_lock()
	1,B,-			-->mutex_unlock()
	1,B,B			spin_lock
	1,B,B			set(0)
	0,-,B			spin_unlock
	0,-,-			<--mutex_unlock()
	0,-,-	spin_lock
	0,-,A	xchg(1) == 0
	1,A,A	spin_unlock
	1,A,-	<--__mutex_lock()
	1,A,-	<--mutex_lock()

mutex_unlock() should get the spinlock here before modifying the count,
because if there's anything on the queue, it should wake up the first waiter
rather than clearing the count.
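
A much simplified userspace sketch of that unlock rule, for illustration only:
a plain waiter count stands in for the real wait list, the spinlock is a toy
one, nothing is actually woken up, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct sketch_xmutex {
	atomic_int state;	/* 0 = unlocked, 1 = locked */
	atomic_int wait_lock;	/* toy spinlock guarding nr_waiters */
	int nr_waiters;		/* stand-in for the wait list */
};

/*
 * Returns 1 if ownership was handed to a waiter, 0 if the state was
 * cleared because nobody was queued.
 */
static inline int sketch_xmutex_unlock(struct sketch_xmutex *m)
{
	int handed_over = 0;

	while (atomic_exchange_explicit(&m->wait_lock, 1,
					memory_order_acquire))
		;	/* spin */

	if (m->nr_waiters > 0) {
		/* Leave state at 1: the first waiter now owns the mutex. */
		m->nr_waiters--;
		handed_over = 1;
	} else {
		atomic_store_explicit(&m->state, 0, memory_order_release);
	}

	atomic_store_explicit(&m->wait_lock, 0, memory_order_release);
	return handed_over;
}
```

Because the state stays at 1 across the hand-over, a newcomer doing xchg(1)
on the fast path cannot slip in ahead of the queued waiter.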

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 12:00       ` David Howells
@ 2005-12-16 13:16         ` Nick Piggin
  2005-12-16 15:53         ` David Howells
  2005-12-16 16:02         ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-16 13:16 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:

> See how many cmpxchgs you may end up doing? Now imagine that cmpxchg() is
> implemented as:
> 

Yes, 2 architectures do this and they would probably want to optimise it.

> 	spin_lock_irqsave(&common_lock[N], flags);
> 	actual = *state;
> 	if (actual == old)
> 		*state = new;
> 	spin_unlock_irqrestore(&common_lock[N], flags);
> 
> Now my point about using LL/SC is that:
> 
> 	1,C,A	cmpxchg(0,1) [failed]
> 	1,C,A	cmpxchg(1,3) [success]
> 	3,C,A	...
> 
> Can be turned into:
> 
> 	1,C,A	x = LL()
> 	1,C,A	x |= 2;
> 	1,C,A	SC(3) [success]
> 	3,C,A	...
> 
> On x86 you could use:
> 
> 	1,C,A	LOCK OR (2)
> 	3,C,A	...
> 
> instead.
> 
> Now, contention isn't very likely, so using CMPXCHG _may_ be good enough _if_
> you have it. But if you have to emulate it by using spinlocks, you're far
> better off just wrapping the entire thing in spinlocks and not pretending to use
> atomic ops to access the counter; unless, of course, you have something that

Yes, the architecture code knows whether or not it implements atomic ops
with spinlocks, so that architecture is in the position to decide to override
the mutex implementation. *generic* code shouldn't worry about that, it should
use the interfaces available, and if that isn't optimal on some architecture
then that architecture can override it.

It is not even clear that any ll/sc based architectures would need to override
an atomic_cmpxchg variant at all because you can assume an unlocked fastpath
and not do the additional initial load to prime the cmpxchg.

So I don't know why you're so worried about sparc32 and parisc while preferring
to introduce a worse default implementation that even your frv architecture wants
to override...?

However: considering everyone and their dog has already implemented their own
semaphore, the best mutex default I guess is to probably use that as you say.
So: disregard my suggestion :P

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 12:00       ` David Howells
  2005-12-16 13:16         ` Nick Piggin
@ 2005-12-16 15:53         ` David Howells
  2005-12-16 23:41           ` Nick Piggin
  2005-12-16 16:02         ` David Howells
  2 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-16 15:53 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> Yes, the architecture code knows whether or not it implements atomic ops
> with spinlocks, so that architecture is in the position to decide to override
> the mutex implementation. *generic* code shouldn't worry about that, it should
> use the interfaces available, and if that isn't optimal on some architecture
> then that architecture can override it.

However, a number of generic templates can be provided if it makes things
easier for the arches because all they need to do is:

	[arch/wibble/Kconfig]
	config MUTEX_TYPE_FOO
		bool
		default y

	[include/asm-wibble/system.h]
	#define __mutex_foo_this() { ... }
	#define __mutex_foo_that() { ... }

The unconditional two-state exchange I think will be a useful template for a
number of archs that don't have anything more advanced than XCHG/TAS/BSET/SWAP.

> It is not even clear that any ll/sc based architectures would need to override
> an atomic_cmpxchg variant at all because you can assume an unlocked fastpath

That's irrelevant. Any arch that has LL/SC almost certainly emulates CMPXCHG
with LL/SC.

> and not do the additional initial load to prime the cmpxchg.

Two points:

 (1) LL/SC does not require an additional initial load.

 (2) CMPXCHG does an implicit load; how else can it compare?

LL/SC can never be worse than CMPXCHG, if only because you're very unlikely to
have both, but it can be better.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 15:53         ` David Howells
@ 2005-12-16 23:41           ` Nick Piggin
  0 siblings, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-16 23:41 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:

>>It is not even clear that any ll/sc based architectures would need to override
>>an atomic_cmpxchg variant at all because you can assume an unlocked fastpath
> 
> 
> That's irrelevant. Any arch that has LL/SC almost certainly emulates CMPXCHG
> with LL/SC.
> 

It is not irrelevant because many architectures that would care are ll/sc
based and many others have a native cmpxchg ie. cmpxchg wouldn't be a bad
choice for default.

> 
>>and not do the additional initial load to prime the cmpxchg.
> 
> 
> Two points:
> 
>  (1) LL/SC does not require an additional initial load.
> 

?? I was only talking about cmpxchg

>  (2) CMPXCHG does an implicit load; how else can it compare?
> 

Read Russell's posts. He points out that most usages of cmpxchg
will require an additional load compared with an llsc in order to
find the value to work on.

cmpxchg(lock, UNLOCKED, LOCKED)

does not (although it may still require an extra branch).
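
To make the distinction concrete, here is a small userspace C11 sketch (an
illustration only; the sketch_* names are invented and this is not the kernel
atomic_cmpxchg() API). The trylock knows its expected value up front, while a
general update has to load the current value first:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { SKETCH_UNLOCKED = 0, SKETCH_LOCKED = 1 };

/* Expected value is a constant, so no prior load of *lock is needed. */
static inline bool sketch_cas_trylock(atomic_int *lock)
{
	int expected = SKETCH_UNLOCKED;

	return atomic_compare_exchange_strong_explicit(lock, &expected,
						       SKETCH_LOCKED,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* General case: the new value depends on the old one, so load it first. */
static inline int sketch_cas_increment(atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(v, &old, old + 1,
						      memory_order_acq_rel,
						      memory_order_relaxed))
		;	/* 'old' now holds the value that was observed */

	return old + 1;
}
```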

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 12:00       ` David Howells
  2005-12-16 13:16         ` Nick Piggin
  2005-12-16 15:53         ` David Howells
@ 2005-12-16 16:02         ` David Howells
  2 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-16 16:02 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> So I don't know why you're so worried about sparc32 and parisc while
> preferring to introduce a worse default implementation that even your frv
> architecture wants to override...?

I now think the base default should be a wrapper around the counting
semaphores, because that is the easiest path (they already exist) and it's
also the fastest path on some platforms.

But I want to be able to override the implementation on such as FRV because I
can do a better mutex than a counting semaphore there as I only have SWAP
available as an atomic op.

However, I would like to make the unconditional-exchange mutex a template that
can be overridden so that other archs can use it with one Kconfig option and a
few #defines in asm/system.h.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (6 preceding siblings ...)
  2005-12-13  9:54 ` David Howells
@ 2005-12-13 10:48 ` David Howells
  2005-12-13 12:39   ` Matthew Wilcox
  2005-12-13 10:54 ` Ingo Molnar
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-13 10:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Howells, torvalds, hch, arjan, matthew, mingo,
	linux-kernel, linux-arch

Andrew Morton <akpm@osdl.org> wrote:

> Maybe I'm not understanding all this, but...
> 
> I'd have thought that the way to do this is to simply reimplement down(),
> up(), down_trylock(), etc using the new xchg-based code

Which I did.

> and to then hunt down those few parts of the kernel which actually use the
> old semaphore's counting feature and convert them to use down_sem(),
> up_sem(), etc.

Done, I think. It's not always 100% obvious.

> And rename all the old semaphore code: s/down/down_sem/etc.

Done.

> So after such a transformation, this new "mutex" thingy would not exist.

Why not? I want to make them different types so that you can't use the wrong
operators by accident or mix them.

> >  include/linux/mutex.h        |   32 +++++++
> 
> But it does.

Well, I could fold this into each asm/semaphore.h.

> > +#define mutex_grab(mutex)	(xchg(&(mutex)->state, 1) == 0)
> 
> mutex_trylock(), please.

You're right.

> > +#define is_mutex_locked(mutex)	((mutex)->state)
> 
> Let's keep the namespace consistent.  mutex_is_locked().

But that's a poor name: it turns it from a question into a statement:-(

> > +static inline void down(struct mutex *mutex)
> > +{
> > +	if (mutex_grab(mutex)) {
> 
> likely()

No... down_trylock().

> > +static inline int down_interruptible(struct mutex *mutex)
> > +{
> > +	if (mutex_grab(mutex)) {
> 
> likely()

down_trylock() again.

> > +static inline int down_trylock(struct mutex *mutex)
> > +{
> > +	if (mutex_grab(mutex)) {
> 
> etc.

Yes.

> You could just put likely() into mutex_trylock().  err, mutex_grab().
> 
> > +/*
> > + * release the mutex
> > + */
> > +static inline void up(struct mutex *mutex)
> > +{
> > +	unsigned long flags;
> > +
> > +#ifdef CONFIG_DEBUG_MUTEX_OWNER
> > +	if (mutex->__owner != current)
> > +		__up_bad(mutex);
> > +	mutex->__owner = NULL;
> > +#endif
> > +
> > +	/* must prevent a race */
> > +	spin_lock_irqsave(&mutex->wait_lock, flags);
> > +	if (!list_empty(&mutex->wait_list))
> > +		__up(mutex);
> > +	else
> > +		mutex_release(mutex);
> > +	spin_unlock_irqrestore(&mutex->wait_lock, flags);
> > +}
> 
> This is too large to inline.

You're probably right.

> It's also significantly slower than the existing up()?

Hmmm... If you've only got two states available to you and/or you can only
exchange states, then there's a limit to what you can actually do. You can lose
the spinlock in the up() fastpath if you're willing to forgo fairness or resort
to waking up processes superfluously.

Ingo and Nick have a point about using CMPXCHG or equivalent if it's
available. This lets you modify the state you have, rather than swapping it for
a whole new state; in which case the state can be annotated to indicate that
there is waking up to be done, thus permitting the fast path to be much
faster. But this can only be done in the case where the state may be modified.

As I tried to make clear: this is the simplest I could come up with, but I have
made provision for overriding it with something better if that's possible.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 10:48 ` David Howells
@ 2005-12-13 12:39   ` Matthew Wilcox
  0 siblings, 0 replies; 239+ messages in thread
From: Matthew Wilcox @ 2005-12-13 12:39 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, torvalds, hch, arjan, mingo, linux-kernel, linux-arch

On Tue, Dec 13, 2005 at 10:48:19AM +0000, David Howells wrote:
> > > +#define is_mutex_locked(mutex)	((mutex)->state)
> > 
> > Let's keep the namespace consistent.  mutex_is_locked().
> 
> But that's a poor name: it turns it from a question into a statement:-(

Ah, but look at it in context of how it's used:

	if (is_mutex_locked())
That's grammatically incorrect!

	if (mutex_is_locked())
much better.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (7 preceding siblings ...)
  2005-12-13 10:48 ` David Howells
@ 2005-12-13 10:54 ` Ingo Molnar
  2005-12-13 11:23 ` David Howells
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13 10:54 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch


* David Howells <dhowells@redhat.com> wrote:

>      	init_MUTEX_LOCKED()
> 	DECLARE_MUTEX_LOCKED()

please kill these two in the simple mutex implementation - they are a 
sign of mutexes used as completions.

>  (7) Provides a debugging config option CONFIG_DEBUG_MUTEX_OWNER by which the
>      mutex owner can be tracked and by which over-upping can be detected.

another simplification: also enforce that only the owner can unlock the 
mutex. This is what we are doing in the -rt patch. (This rule also 
ensures that such mutexes can be used for priority inheritance.)

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (8 preceding siblings ...)
  2005-12-13 10:54 ` Ingo Molnar
@ 2005-12-13 11:23 ` David Howells
  2005-12-13 11:24 ` David Howells
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-13 11:23 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> Any reason why you're setting up your own style of waitqueue in
> mutex-simple.c instead of just using the kernel's style of waitqueue?

Because I can steal the code from FRV's semaphores or rw-semaphores, and this
way I can be sure of what I'm doing.

Note that the sleeping processes are generally dequeued and dispatched by the
up() function, which means they don't have to take the spinlock themselves.
This may be possible to do magically with the waitqueue stuff, but I'm not sure
how to do it; it's horribly complicated to read through the sources and there
isn't much documentation.

> > +	mb();
> 
> This should be smp_mb(), I think.

Yes.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (9 preceding siblings ...)
  2005-12-13 11:23 ` David Howells
@ 2005-12-13 11:24 ` David Howells
  2005-12-13 13:45   ` Ingo Molnar
  2005-12-13 11:34 ` David Howells
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-13 11:24 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Ingo Molnar <mingo@elte.hu> wrote:

> >      	init_MUTEX_LOCKED()
> > 	DECLARE_MUTEX_LOCKED()
> 
> please kill these two in the simple mutex implementation - they are a 
> sign of mutexes used as completions.

That can be done later. It's not necessary to do it in this particular patch
set.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 11:24 ` David Howells
@ 2005-12-13 13:45   ` Ingo Molnar
  0 siblings, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-13 13:45 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch


* David Howells <dhowells@redhat.com> wrote:

> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > >      	init_MUTEX_LOCKED()
> > > 	DECLARE_MUTEX_LOCKED()
> > 
> > please kill these two in the simple mutex implementation - they are a 
> > sign of mutexes used as completions.
> 
> That can be done later. It's not necessary to do it in this particular 
> patch set.

i disagree - it's necessary that we dont build complexities into the 
'simple' mutex type, or the whole game starts again. I.e. the 'owner 
unlocks the mutex' rule must be enforced - which makes 
DECLARE_MUTEX_LOCKED() meaningless.

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (10 preceding siblings ...)
  2005-12-13 11:24 ` David Howells
@ 2005-12-13 11:34 ` David Howells
  2005-12-13 13:05 ` Alan Cox
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-13 11:34 UTC (permalink / raw)
  To: David Howells
  Cc: Nick Piggin, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

David Howells <dhowells@redhat.com> wrote:

> > Any reason why you're setting up your own style of waitqueue in
> > mutex-simple.c instead of just using the kernel's style of waitqueue?
> 
> Because I can steal the code from FRV's semaphores or rw-semaphores, and this
> way I can be sure of what I'm doing.

And because:

	struct mutex {
		int			state;
		wait_queue_head_t	wait_queue;
	};

Wastes 8 more bytes of memory than:

	struct mutex {
		int			state;
		spinlock_t		wait_lock;
		struct list_head	wait_list;
	};

on a 64-bit machine if spinlock_t is 4 bytes. Both waste 4 bytes if spinlock_t
is 8 bytes.
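
The padding arithmetic can be checked with stand-in types (a sketch only:
these are not the kernel definitions, and it assumes wait_queue_head_t is a
spinlock followed by a list head, a 4-byte spinlock, and the usual natural
alignment rules of the ABI):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins: a 4-byte spinlock and a two-pointer list head. */
typedef struct { unsigned int slock; } sketch_spinlock_t;

struct sketch_list_head {
	struct sketch_list_head *next, *prev;
};

typedef struct {
	sketch_spinlock_t	lock;
	struct sketch_list_head	task_list;
} sketch_wait_queue_head_t;

/* Variant 1: embed a whole wait queue head. */
struct sketch_mutex_wq {
	int				state;
	sketch_wait_queue_head_t	wait_queue;
};

/* Variant 2: open-code the lock and the list next to the state word. */
struct sketch_mutex_list {
	int			state;
	sketch_spinlock_t	wait_lock;
	struct sketch_list_head	wait_list;
};

/* Bytes saved by variant 2 over variant 1 on the build target. */
static inline size_t sketch_bytes_saved(void)
{
	return sizeof(struct sketch_mutex_wq) -
	       sizeof(struct sketch_mutex_list);
}
```

With 8-byte pointers the first variant pads after both `state` and `lock`,
whereas the second lets the two 4-byte fields share one 8-byte slot, so the
sizes should come out as 32 versus 24 bytes on a typical 64-bit ABI.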

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (11 preceding siblings ...)
  2005-12-13 11:34 ` David Howells
@ 2005-12-13 13:05 ` Alan Cox
  2005-12-13 13:15   ` Alan Cox
  2005-12-13 13:32 ` David Howells
  2005-12-13 21:03 ` David Howells
  14 siblings, 1 reply; 239+ messages in thread
From: Alan Cox @ 2005-12-13 13:05 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

On Llu, 2005-12-12 at 23:45 +0000, David Howells wrote:
>  (5) Redirects the following to apply to the new mutexes rather than the
>      traditional semaphores:
> 
> 	down()
> 	down_trylock()
> 	down_interruptible()
> 	up()
> 	init_MUTEX()
>      	init_MUTEX_LOCKED()
> 	DECLARE_MUTEX()
> 	DECLARE_MUTEX_LOCKED()

And you've audited every occurrence ?

>      On the basis that most usages of semaphores are as mutexes, this makes
>      sense for in most cases it's just then a matter of changing the type from
>      struct semaphore to struct mutex. 

You propose to rename the existing up and down, which are counting
semaphores, documented and used that way everywhere with mutexes which
are not. Worse still up/down are, second to P/V, the usual forms of
referring to _counting_ semaphores.

It seems to me it would be far far saner to define something like

	sleep_lock(&foo)
	sleep_unlock(&foo)
	sleep_trylock(&foo)

given the new mutex interface is actually a sleeping interface with the
semantics of the spin_lock interface. Its then obvious what it does, you
don't randomly break other drivers you've not reviewed and the interface
is intuitive rather than obfuscated.

It won't take long for people to then change the name of the performance
critical cases and the others will catch up in time.

It also saves breaking every piece of out of tree kernel code for no
good reason.

Alan

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:05 ` Alan Cox
@ 2005-12-13 13:15   ` Alan Cox
  2005-12-13 23:21     ` Nikita Danilov
  0 siblings, 1 reply; 239+ messages in thread
From: Alan Cox @ 2005-12-13 13:15 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

Actually a PS to this while I think about it. spin_locks and mutex type
locks could both do with a macro for

	call_locked(&lock, foo(a,b,c,d))

to cut down on all the "error path forgot to release a lock" type errors.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:15   ` Alan Cox
@ 2005-12-13 23:21     ` Nikita Danilov
  0 siblings, 0 replies; 239+ messages in thread
From: Nikita Danilov @ 2005-12-13 23:21 UTC (permalink / raw)
  To: Alan Cox; +Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

Alan Cox writes:
 > Actually a PS to this while I think about it. spin_locks and mutex type
 > locks could both do with a macro for
 > 
 > 	call_locked(&lock, foo(a,b,c,d))

reiser4 code was publicly humiliated for such macros, but indeed they
are useful. The only problem is that one needs two macros: one for foo()
returning void and one for all other cases.
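
A sketch of the two-macro arrangement, under stated assumptions: it relies on
the GNU C statement-expression and __typeof__ extensions, and struct demo_lock
with its demo_lock()/demo_unlock() helpers is a hypothetical stand-in for a
real spinlock or mutex. Only the name call_locked() comes from the suggestion
above; call_locked_void(), add() and bump() are invented for illustration.

```c
#include <stdatomic.h>

/* Hypothetical stand-in lock built on a C11 atomic flag. */
struct demo_lock { atomic_flag held; };
#define DEMO_LOCK_INIT { ATOMIC_FLAG_INIT }

static inline void demo_lock(struct demo_lock *l)
{
	while (atomic_flag_test_and_set(&l->held))
		;	/* spin until the flag was previously clear */
}

static inline void demo_unlock(struct demo_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* For calls that return a value: evaluates to the call's result.
 * __typeof__ does not evaluate its operand, so the call runs once. */
#define call_locked(lock, call)			\
({						\
	__typeof__(call) ret_;			\
	demo_lock(lock);			\
	ret_ = (call);				\
	demo_unlock(lock);			\
	ret_;					\
})

/* For calls returning void: there is no result to carry out. */
#define call_locked_void(lock, call)		\
do {						\
	demo_lock(lock);			\
	(call);					\
	demo_unlock(lock);			\
} while (0)

/* Example callees, invented for the sketch. */
static int add(int a, int b) { return a + b; }
static void bump(int *p) { (*p)++; }
```

The lock is dropped on the single exit path of each macro, so the callee may
have as many return statements as it likes without leaking the lock.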

 > 
 > to cut down on all the "error path forgot to release a lock" type errors.
 > 

Nikita.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (12 preceding siblings ...)
  2005-12-13 13:05 ` Alan Cox
@ 2005-12-13 13:32 ` David Howells
  2005-12-13 14:00   ` Alan Cox
                     ` (4 more replies)
  2005-12-13 21:03 ` David Howells
  14 siblings, 5 replies; 239+ messages in thread
From: David Howells @ 2005-12-13 13:32 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> >  (5) Redirects the following to apply to the new mutexes rather than the
> >      traditional semaphores:
> > 
> > 	down()
> ...
> 
> And you've audited every occurrence?

Outside of the arch directories, yes; but I don't know that I've made the
correct decision in 100% of the cases.

I've changed some of the uses into completions, and found about a dozen or so
uses of counting semaphores; but the vast majority of occurrences seem to be
wanting mutex behaviour.

> It seems to me it would be far far saner to define something like
> 
> 	sleep_lock(&foo)
> 	sleep_unlock(&foo)
> 	sleep_trylock(&foo)

Which would be a _lot_ more work. It would involve about ten times as many
changes, I think, and thus be more prone to errors.

> It's then obvious what it does, you don't randomly break other drivers you've
> not reviewed and the interface is intuitive rather than obfuscated.

I've attempted to review everything in 2.6.15-rc5 outside of most of the archs.
I can't easily modify any driver not contained in that tarball, but at least
the compiler will barf and force a review.

> It won't take long for people to then change the name of the performance
> critical cases and the others will catch up in time.

It took about ten hours to go through the declarations of struct semaphore and
review them; I hate to think how long it'd take to go through all the ups and
downs too.

> It also saves breaking every piece of out-of-tree kernel code for no
> good reason.

But my patch means the changes required are in most cases minimal: just
changing struct semaphore to struct mutex is sufficient for the vast majority
of cases.

Your way requires a lot more work, both in the tree and out of it.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:32 ` David Howells
@ 2005-12-13 14:00   ` Alan Cox
  2005-12-13 14:35   ` Christopher Friesen
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 239+ messages in thread
From: Alan Cox @ 2005-12-13 14:00 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

On Maw, 2005-12-13 at 13:32 +0000, David Howells wrote:
> > It's then obvious what it does, you don't randomly break other drivers you've
> > not reviewed and the interface is intuitive rather than obfuscated.
> 
> I've attempted to review everything in 2.6.15-rc5 outside of most of the archs.
> I can't easily modify any driver not contained in that tarball, but at least
> the compiler will barf and force a review.


Is there a reason you didn't answer the comment about down/up being the
usual way computing refers to the operations on counting semaphores, but
just deleted it?


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:32 ` David Howells
  2005-12-13 14:00   ` Alan Cox
@ 2005-12-13 14:35   ` Christopher Friesen
  2005-12-13 14:44     ` Arjan van de Ven
  2005-12-13 15:23   ` David Howells
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 239+ messages in thread
From: Christopher Friesen @ 2005-12-13 14:35 UTC (permalink / raw)
  To: David Howells
  Cc: Alan Cox, torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

>>It seems to me it would be far far saner to define something like
>>
>>	sleep_lock(&foo)
>>	sleep_unlock(&foo)
>>	sleep_trylock(&foo)
> 
> Which would be a _lot_ more work. It would involve about ten times as many
> changes, I think, and thus be more prone to errors.

"lots of work" has never been a valid reason for not doing a kernel 
change...

In this case, introducing a new API means the changes can be made over time.

As time goes on you can convert more and more code to the mutex/sleep 
lock and any tricky code just stays with the older API until someone who 
understands it can vet it.

As Alan mentioned, the standard counting semaphore API is up/down. 
Making those refer to a sleeping mutex violates the principle of least 
surprise.
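
To make the difference concrete, here is a small user-space sketch of
counting-semaphore semantics using C11 atomics. The csem names are
hypothetical and this is not the kernel's implementation; it only shows that
a semaphore initialised to 2 admits two holders at once, which a mutex never
would. The return convention (0 on success) is assumed to mirror
down_trylock().

```c
#include <stdatomic.h>

/* Hypothetical counting semaphore: count is the number of free slots. */
struct csem {
	atomic_int count;
};

static void csem_init(struct csem *s, int n)
{
	atomic_init(&s->count, n);
}

/* Returns 0 if a slot was taken, 1 if none was free. */
static int csem_down_trylock(struct csem *s)
{
	int c = atomic_load(&s->count);

	while (c > 0) {
		/* On failure c is refreshed with the current count,
		 * so the loop re-checks it before trying again. */
		if (atomic_compare_exchange_weak(&s->count, &c, c - 1))
			return 0;
	}
	return 1;
}

/* Frees one slot; any context may call this, not just a holder. */
static void csem_up(struct csem *s)
{
	atomic_fetch_add(&s->count, 1);
}
```

Code written against these semantics would change behaviour if down() and
up() were quietly redirected to a mutex.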

Chris

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 14:35   ` Christopher Friesen
@ 2005-12-13 14:44     ` Arjan van de Ven
  2005-12-13 14:59       ` Christopher Friesen
  0 siblings, 1 reply; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-13 14:44 UTC (permalink / raw)
  To: Christopher Friesen
  Cc: David Howells, Alan Cox, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

On Tue, 2005-12-13 at 08:35 -0600, Christopher Friesen wrote:
> David Howells wrote:
> > Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> 
> >>It seems to me it would be far far saner to define something like
> >>
> >>	sleep_lock(&foo)
> >>	sleep_unlock(&foo)
> >>	sleep_trylock(&foo)
> > 
> > Which would be a _lot_ more work. It would involve about ten times as many
> > changes, I think, and thus be more prone to errors.
> 
> "lots of work" has never been a valid reason for not doing a kernel 
> change...
> 
> In this case, introducing a new API means the changes can be made over time.

In this case, I think making this change gradually is a mistake. We should
do all of the in-kernel code at least...


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 14:44     ` Arjan van de Ven
@ 2005-12-13 14:59       ` Christopher Friesen
  0 siblings, 0 replies; 239+ messages in thread
From: Christopher Friesen @ 2005-12-13 14:59 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: David Howells, Alan Cox, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

Arjan van de Ven wrote:
> On Tue, 2005-12-13 at 08:35 -0600, Christopher Friesen wrote:

>>In this case, introducing a new API means the changes can be made over time.
> 
> 
> In this case, I think making this change gradually is a mistake. We should
> do all of the in-kernel code at least...

This means verifying all the users before patch submission, which may be 
problematic.

I guess the point I'm trying to make is that if you create a new API you 
have the option of converting the obvious cases first, which should 
cover the majority of users.  Anywhere the behaviour is non-obvious can 
be left using the old API, and the out-of-tree users will continue to 
work correctly.

Chris

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:32 ` David Howells
  2005-12-13 14:00   ` Alan Cox
  2005-12-13 14:35   ` Christopher Friesen
@ 2005-12-13 15:23   ` David Howells
  2005-12-15  5:24     ` Miles Bader
  2005-12-13 15:39   ` David Howells
  2005-12-13 20:04   ` Steven Rostedt
  4 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-13 15:23 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> Is there a reason you didn't answer the comment about down/up being the
> usual way computing refers to the operations on counting semaphores, but
> just deleted it?

up/down is also used in conjunction with mutexes and R/W semaphores, so
counting semaphores do not have exclusive rights to the terminology.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 15:23   ` David Howells
@ 2005-12-15  5:24     ` Miles Bader
  0 siblings, 0 replies; 239+ messages in thread
From: Miles Bader @ 2005-12-15  5:24 UTC (permalink / raw)
  To: David Howells
  Cc: Alan Cox, torvalds, akpm, hch, arjan, matthew, linux-kernel, linux-arch

David Howells <dhowells@redhat.com> writes:
>> Is there a reason you didn't answer the comment about down/up being the
>> usual way computing refers to the operations on counting semaphores, but
>> just deleted it?
>
> up/down is also used in conjunction with mutexes and R/W semaphores, so
> counting semaphores do not have exclusive rights to the terminology.

I suspect that is only the case where the author of the mutex was
accustomed to using semaphores as mutexes before-hand; "up/down" are
rather poor names for mutex operations otherwise.  lock/unlock are much
better.

-Miles
-- 
Next to fried food, the South has suffered most from oratory.
  			-- Walter Hines Page

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:32 ` David Howells
                     ` (2 preceding siblings ...)
  2005-12-13 15:23   ` David Howells
@ 2005-12-13 15:39   ` David Howells
  2005-12-13 16:10     ` Alan Cox
  2005-12-14  8:31     ` Ingo Molnar
  2005-12-13 20:04   ` Steven Rostedt
  4 siblings, 2 replies; 239+ messages in thread
From: David Howells @ 2005-12-13 15:39 UTC (permalink / raw)
  To: Christopher Friesen
  Cc: David Howells, Alan Cox, torvalds, akpm, hch, arjan, matthew,
	linux-kernel, linux-arch

Christopher Friesen <cfriesen@nortel.com> wrote:

> > Which would be a _lot_ more work. It would involve about ten times as many
> > changes, I think, and thus be more prone to errors.
> 
> "lots of work" has never been a valid reason for not doing a kernel change...

There are a number of considerations:

 (1) If _I_ am going to be doing the work, then I'm quite happy to reduce the
     load by 90%. And I think it'd be at least that, probably more. Finding
     struct semaphore with grep is much easier than finding up/down with grep
     because of:

	(a) comments

	(b) other instances of up/down names, including rw_semaphores

     There are a lot fewer instances of struct semaphore than up and down.

 (2) It makes it easier for other people. In most cases, all they need do is
     change "struct semaphore" to "struct mutex". If they've used
     DECLARE_MUTEX() then they need do nothing at all, and if they've used
     init_MUTEX(), then they don't need to convert sema_init() either.

 (3) It forces people to reconsider how they want to use their semaphores.

I have no objection to making life easier for other people. I suspect most
other people don't care that their semaphores are now mutexes, and think of
them that way anyway.

I admit that there are downsides:

 (1) up and down now do something effectively different (though in most cases
     it's also exactly the same).

 (2) Users of counting semaphores have to change, but they're in the minority
     by quite a way.

 (3) Some people want mutexes to be:

     (a) only releasable in the same context as they were taken

     (b) not accessible in interrupt context, or that (a) applies here also

     (c) not initialisable to the locked state

     But this means that the current usages all have to be carefully audited,
     and sometimes that is not obvious.
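
The three restrictions in (3) could be enforced along the following lines. This
is a user-space sketch under stated assumptions, not the patch's code: struct
dbg_mutex and its functions are hypothetical, the caller passes in its own
nonzero task id, and the in_irq flag stands in for however the kernel would
detect interrupt context.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical checked mutex: owner is 0 when unlocked, otherwise the
 * (nonzero) id of the task holding it. */
struct dbg_mutex {
	atomic_int owner;
};

/* (c) The only initialiser leaves the mutex unlocked. */
static void dbg_mutex_init(struct dbg_mutex *m)
{
	atomic_init(&m->owner, 0);
}

/* (b) Refuses to operate from interrupt context.
 * Returns true if the caller now owns the mutex. */
static bool dbg_mutex_trylock(struct dbg_mutex *m, int self, bool in_irq)
{
	int expected = 0;

	if (in_irq)
		return false;
	return atomic_compare_exchange_strong(&m->owner, &expected, self);
}

/* (a) Only the task that took the mutex may release it.
 * Returns false, leaving the mutex held, if the caller is not the owner. */
static bool dbg_mutex_unlock(struct dbg_mutex *m, int self)
{
	int expected = self;

	return atomic_compare_exchange_strong(&m->owner, &expected, 0);
}
```

A semaphore user that calls up() from a different task than the one that
called down() would trip check (a), which is the kind of usage an audit has
to find.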

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 15:39   ` David Howells
@ 2005-12-13 16:10     ` Alan Cox
  2005-12-14 10:29       ` Arjan van de Ven
  2005-12-14  8:31     ` Ingo Molnar
  1 sibling, 1 reply; 239+ messages in thread
From: Alan Cox @ 2005-12-13 16:10 UTC (permalink / raw)
  To: David Howells
  Cc: Christopher Friesen, torvalds, akpm, hch, arjan, matthew,
	linux-kernel, linux-arch

On Maw, 2005-12-13 at 15:39 +0000, David Howells wrote:
>  (3) Some people want mutexes to be:
> 
>      (a) only releasable in the same context as they were taken
> 
>      (b) not accessible in interrupt context, or that (a) applies here also
> 
>      (c) not initialisable to the locked state
> 
>      But this means that the current usages all have to be carefully audited,
>      and sometimes that is not obvious.

Only if you insist on replacing them immediately. If you submit a
*small* patch which just adds the new mutexes then a series of small
patches can gradually convert code where mutexes are better. People will
naturally hit the hot and critical points first meaning that in a short
time the users of semaphores will be those who need it, and those who
are not critical to performance.

There is a problem with the init_MUTEX*/DECLARE_MUTEX naming being used for
semaphore struct init and I don't see a nice way to fix that either. I'd
rather see people just have to fix those as compiler errors (or a perl
-e regexp run to make them all init_SEM/DECLARE_SEM before any other
changes are made).



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 16:10     ` Alan Cox
@ 2005-12-14 10:29       ` Arjan van de Ven
  2005-12-14 11:03         ` Arjan van de Ven
  2005-12-14 11:03         ` Alan Cox
  0 siblings, 2 replies; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-14 10:29 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Howells, Christopher Friesen, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

On Tue, 2005-12-13 at 16:10 +0000, Alan Cox wrote:
> On Maw, 2005-12-13 at 15:39 +0000, David Howells wrote:
> >  (3) Some people want mutexes to be:
> > 
> >      (a) only releasable in the same context as they were taken
> > 
> >      (b) not accessible in interrupt context, or that (a) applies here also
> > 
> >      (c) not initialisable to the locked state
> > 
> >      But this means that the current usages all have to be carefully audited,
> >      and sometimes that is not obvious.
> 
> Only if you insist on replacing them immediately. If you submit a
> *small* patch which just adds the new mutexes then a series of small
> patches can gradually convert code where mutexes are better. 

this unfortunately is not very realistic in practice... 


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 10:29       ` Arjan van de Ven
@ 2005-12-14 11:03         ` Arjan van de Ven
  2005-12-14 11:03         ` Alan Cox
  1 sibling, 0 replies; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-14 11:03 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Howells, Christopher Friesen, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

On Wed, 2005-12-14 at 11:29 +0100, Arjan van de Ven wrote:
> On Tue, 2005-12-13 at 16:10 +0000, Alan Cox wrote:
> > On Maw, 2005-12-13 at 15:39 +0000, David Howells wrote:
> > >  (3) Some people want mutexes to be:
> > > 
> > >      (a) only releasable in the same context as they were taken
> > > 
> > >      (b) not accessible in interrupt context, or that (a) applies here also
> > > 
> > >      (c) not initialisable to the locked state
> > > 
> > >      But this means that the current usages all have to be carefully audited,
> > >      and sometimes that is not obvious.
> > 
> > Only if you insist on replacing them immediately. If you submit a
> > *small* patch which just adds the new mutexes then a series of small
> > patches can gradually convert code where mutexes are better. 
> 
> this unfortunately is not very realistic in practice... 

to expand on this; this kind of change no matter what needs a mass
change inside the kernel, or the point of it all is sort of moot. The
idea is to make the mutex type the most common one, since most users ARE
mutexes. To make that happen a one time "rather big" change is needed
and planned afaics.

What's remaining is
1) transition period for in kernel stuff
2) out of the kernel code compatibility
3) should a forgotten item be a compile time failure or be allowed to
work still 

1) is a matter of "do we do it all now" or in phases. I don't see a
reason to not do it all now, otherwise a 2 year process will happen

2) that is a semi moot issue; sure a big bang change will break this
compatibility, but so will a gradual switchover. A gradual switchover of
converting core semaphores into mutexes will need changes in external
modules regardless (think vfs but there's many more). The question is
doing it once or doing it multiple times over a period of 2 years.

3) history has shown that non-compiletime items keep lingering on
forever, since there is no incentive or even detection of "old" use. At
minimum a compiler warning is needed. Just look at the sleep_on_*() api;
more than half the users in 2.6 are *new code* in 2.6, even though it's
a deprecated api for ... how long? 

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 10:29       ` Arjan van de Ven
  2005-12-14 11:03         ` Arjan van de Ven
@ 2005-12-14 11:03         ` Alan Cox
  2005-12-14 11:08           ` Arjan van de Ven
  1 sibling, 1 reply; 239+ messages in thread
From: Alan Cox @ 2005-12-14 11:03 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: David Howells, Christopher Friesen, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

On Mer, 2005-12-14 at 11:29 +0100, Arjan van de Ven wrote:
> > >      But this means that the current usages all have to be carefully audited,
> > >      and sometimes that is not obvious.
> > 
> > Only if you insist on replacing them immediately. If you submit a
> > *small* patch which just adds the new mutexes then a series of small
> > patches can gradually convert code where mutexes are better. 
> 
> this unfortunately is not very realistic in practice... 

Strange because it is how most such work has been done in the past, from
the big kernel lock to the scsi core rewrite. You also forgot to attach
a reason you think it isn't realistic?

Alan

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:03         ` Alan Cox
@ 2005-12-14 11:08           ` Arjan van de Ven
  2005-12-14 11:24             ` Alan Cox
  0 siblings, 1 reply; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-14 11:08 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Howells, Christopher Friesen, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

On Wed, 2005-12-14 at 11:03 +0000, Alan Cox wrote:
> On Mer, 2005-12-14 at 11:29 +0100, Arjan van de Ven wrote:
> > > >      But this means that the current usages all have to be carefully audited,
> > > >      and sometimes that is not obvious.
> > > 
> > > Only if you insist on replacing them immediately. If you submit a
> > > *small* patch which just adds the new mutexes then a series of small
> > > patches can gradually convert code where mutexes are better. 
> > 
> > this unfortunately is not very realistic in practice... 
> 
> Strange because it is how most such work has been done in the past, from
> the big kernel lock to the scsi core rewrite.

1) the BKL change hasn't finished, and we're 5 years down the line. API
changes done gradually tend to take forever in practice, esp if there's no
"compile" incentive for people to fix things. 

2) the scsi rewrite was a major functional change. that's different from
a basically pure API change (split). Splitting functional changes to one
part of the kernel up into a sequence is very good. That's different
though: even in the scsi change, API changes that went outside the scsi
core into the drivers were mostly done in one bang (not all of them, the
ones that weren't ended up being rather painful)

>  You also forgot to attach
> a reason you think it isn't realistic?
> 
that was in a follow up email

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:08           ` Arjan van de Ven
@ 2005-12-14 11:24             ` Alan Cox
  2005-12-14 11:35               ` Andrew Morton
  2005-12-14 11:42               ` Arjan van de Ven
  0 siblings, 2 replies; 239+ messages in thread
From: Alan Cox @ 2005-12-14 11:24 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: David Howells, Christopher Friesen, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

On Mer, 2005-12-14 at 12:08 +0100, Arjan van de Ven wrote:
> 1) the BKL change hasn't finished, and we're 5 years down the line. API
> changes done gradually tend to take forever in practice, esp if there's no
> "compile" incentive for people to fix things. 

This isn't a "fix" however, it's merely a performance tweak. Drivers
using the old API are not a problem because

a) The old API is needed long term for true counting sem users
b) It's a minor performance hit at most

That's rather different to the BKL


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:24             ` Alan Cox
@ 2005-12-14 11:35               ` Andrew Morton
  2005-12-14 11:44                 ` Arjan van de Ven
                                   ` (2 more replies)
  2005-12-14 11:42               ` Arjan van de Ven
  1 sibling, 3 replies; 239+ messages in thread
From: Andrew Morton @ 2005-12-14 11:35 UTC (permalink / raw)
  To: Alan Cox
  Cc: arjan, dhowells, cfriesen, torvalds, hch, matthew, linux-kernel,
	linux-arch


Could someone please remind me why we're even discussing this, given that
mutex_down() is slightly more costly than current down(), and mutex_up() is
appreciably more costly than current up()?

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:35               ` Andrew Morton
@ 2005-12-14 11:44                 ` Arjan van de Ven
  2005-12-14 11:52                   ` Andi Kleen
  2005-12-14 11:57                 ` David Howells
  2005-12-14 12:17                 ` Christoph Hellwig
  2 siblings, 1 reply; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-14 11:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, dhowells, cfriesen, torvalds, hch, matthew,
	linux-kernel, linux-arch

On Wed, 2005-12-14 at 03:35 -0800, Andrew Morton wrote:
> Could someone please remind me why we're even discussing this,

* cleaner API
* more declarative in terms of intent

which in turn allow
* higher performance
* enhanced options like the -rt patch is doing, such as boosting
processes when a semaphore they're holding hits contention
* mutex use is a candidate for a "spinaphore" treatment (unlike counting
semaphores)

>  given that
> mutex_down() is slightly more costly than current down(), and mutex_up() is
> appreciably more costly than current up()?

that's an implementation flaw in the current implementation that is not
needed by any means and that Ingo has fixed in his version of this



^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:44                 ` Arjan van de Ven
@ 2005-12-14 11:52                   ` Andi Kleen
  2005-12-14 11:55                     ` Arjan van de Ven
  0 siblings, 1 reply; 239+ messages in thread
From: Andi Kleen @ 2005-12-14 11:52 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Andrew Morton, Alan Cox, dhowells, cfriesen, torvalds, hch,
	matthew, linux-kernel, linux-arch

> * mutex use is a candidate for a "spinaphore" treatment (unlike counting
> semaphores)

I think that would be an interesting experiment for page faults.
But they actually use rwsems, not normal semaphores.

-Andi

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:52                   ` Andi Kleen
@ 2005-12-14 11:55                     ` Arjan van de Ven
  0 siblings, 0 replies; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-14 11:55 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Alan Cox, dhowells, cfriesen, torvalds, hch,
	matthew, linux-kernel, linux-arch

On Wed, 2005-12-14 at 12:52 +0100, Andi Kleen wrote:
> > * mutex use is a candidate for a "spinaphore" treatment (unlike counting
> > semaphores)
> 
> I think that would be interesting experiment for page faults.
> But they actually use rwsems, not normal semaphores.

at least rwsems are only used as mutexes afaik... so those would just
end up being mutexes... and could thus do this too...

(I don't think anyone ever thought of doing a counting rwsem... at least
I sure hope so; the page fault one sure is a mutex)


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:35               ` Andrew Morton
  2005-12-14 11:44                 ` Arjan van de Ven
@ 2005-12-14 11:57                 ` David Howells
  2005-12-14 12:19                   ` Jakub Jelinek
                                     ` (3 more replies)
  2005-12-14 12:17                 ` Christoph Hellwig
  2 siblings, 4 replies; 239+ messages in thread
From: David Howells @ 2005-12-14 11:57 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Andrew Morton, Alan Cox, dhowells, cfriesen, torvalds, hch,
	matthew, linux-kernel, linux-arch

Arjan van de Ven <arjan@infradead.org> wrote:

> >  given that
> > mutex_down() is slightly more costly than current down(), and mutex_up() is
> > appreciably more costly than current up()?
> 
> that's an implementation flaw in the current implementation that is not
> needed by any means and that Ingo has fixed in his version of this

As have I. I wrote it yesterday with Ingo looking over my shoulder, as it were,
but I haven't released it yet.

What I provided was a base implementation that anything can use provided it
has an atomic op capable of exchanging between two states, and I suspect
everything that can do multiprocessing has - if you can do spinlocks, then you
can do this. I ALSO provided a mechanism by which it could be overridden if
there's something better available on that arch.

As I see it there are four classes of arch:

 (0) Those that have no atomic ops at all - in which case xchg is trivially
     implemented by disabling interrupts, and spinlocks must be null because
     they can't be implemented.

 (1) Those that only have a limited exchange functionality. Several archs do
     fall into this category: arm, frv, mn10300, 68000, i386.

 (2) Those that have CMPXCHG or equivalent: 68020, i486+, x86_64, ia64, sparc.

 (3) Those that have LL/SC or equivalent: mips (some), alpha, powerpc, arm6.

(This isn't an exhaustive list of archs)

Each higher class can emulate all the lower classes, but can probably do a
better implementation than the lower class because they have more flexibility.

For instance class (1) mutexes can only practically support two states, but
class (2) and (3) can support multiple states, and so can improve the up()
fastpath as well as the down() fastpaths.
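
A user-space sketch of that distinction, using C11 atomics as a stand-in for
the arch primitives. All names here are hypothetical and the three-state
encoding is only one plausible choice, not necessarily the one used in any
posted patch. With a bare exchange the state word can say only "locked" or
"unlocked", so up() cannot learn from it whether anyone is waiting; with
compare-and-exchange a third state can record waiters, which lets up() skip
the wakeup path when there are none.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Class (1): atomic exchange only. Two states: 0 unlocked, 1 locked. */
static bool xchg_trylock(atomic_int *state)
{
	/* We got the lock only if the previous value was "unlocked". */
	return atomic_exchange(state, 1) == 0;
}

/* Class (2)/(3): compare-and-exchange allows more than two states. */
enum { UNLOCKED = 0, LOCKED = 1, LOCKED_WAITERS = 2 };

/* down() fastpath: succeed only from the UNLOCKED state. */
static bool cas_trylock(atomic_int *state)
{
	int old = UNLOCKED;

	return atomic_compare_exchange_strong(state, &old, LOCKED);
}

/* A task about to sleep records that fact in the state word.
 * Returns false if the lock was released in the meantime. */
static bool cas_mark_waiter(atomic_int *state)
{
	int old = LOCKED;

	if (atomic_compare_exchange_strong(state, &old, LOCKED_WAITERS))
		return true;
	return old == LOCKED_WAITERS;	/* someone else already marked it */
}

/* up() fastpath: returns true if the lock was released with no waiters.
 * False means the caller must take the slow path and wake a waiter. */
static bool cas_unlock_fast(atomic_int *state)
{
	int old = LOCKED;

	return atomic_compare_exchange_strong(state, &old, UNLOCKED);
}
```

In the two-state version every up() presumably has to inspect the wait queue
under some other lock, which would account for the extra cost of mutex_up()
noted earlier in the thread.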

With some archs, such as FRV, it might be possible to emulate a higher class,
but it's not necessarily practical in all circumstances.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:57                 ` David Howells
@ 2005-12-14 12:19                   ` Jakub Jelinek
  2005-12-16  1:54                   ` Nick Piggin
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 239+ messages in thread
From: Jakub Jelinek @ 2005-12-14 12:19 UTC (permalink / raw)
  To: David Howells
  Cc: Arjan van de Ven, Andrew Morton, Alan Cox, cfriesen, torvalds,
	hch, matthew, linux-kernel, linux-arch

On Wed, Dec 14, 2005 at 11:57:12AM +0000, David Howells wrote:
>  (1) Those that only have a limited exchange functionality. Several archs do
>      fall into this category: arm, frv, mn10300, 68000, i386.

sparc (32-bit CPUs) falls into this category too.  V7 CPUs have just
atomic load byte and store 0xff, later CPUs have swap insn, which is like
ia32 xchg.

>  (2) Those that have CMPXCHG or equivalent: 68020, i486+, x86_64, ia64, sparc.

sparc64 here.

>  (3) Those that have LL/SC or equivalent: mips (some), alpha, powerpc, arm6.

	Jakub

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:57                 ` David Howells
  2005-12-14 12:19                   ` Jakub Jelinek
@ 2005-12-16  1:54                   ` Nick Piggin
  2005-12-16 11:02                   ` David Howells
  2005-12-16 11:30                   ` David Howells
  3 siblings, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-16  1:54 UTC (permalink / raw)
  To: David Howells
  Cc: Arjan van de Ven, Andrew Morton, Alan Cox, cfriesen, torvalds,
	hch, matthew, linux-kernel, linux-arch

David Howells wrote:
> Arjan van de Ven <arjan@infradead.org> wrote:
> 
> 
>>> given that
>>>mutex_down() is slightly more costly than current down(), and mutex_up() is
>>>appreciably more costly than current up()?
>>
>>that's an implementation flaw in the current implementation that is not
>>needed by any means and that Ingo has fixed in his version of this
> 
> 
> As do I. I wrote it yesterday with Ingo looking over my shoulder, as it were,
> but I haven't released it yet.
> 
> What I provided was a base implementation that anything can use provided it
> has an atomic op capable of exchanging between two states, and I suspect
> everything that can do multiprocessing has - if you can do spinlocks, then you
> can do this. I ALSO provided a mechanism by which it could be overridden if
> there's something better available on that arch.
> 
> As I see it there are four classes of arch:
> 
>  (0) Those that have no atomic ops at all - in which case xchg is trivially
>      implemented by disabling interrupts, and spinlocks must be null because
>      they can't be implemented.
> 
>  (1) Those that only have a limited exchange functionality. Several archs do
>      fall into this category: arm, frv, mn10300, 68000, i386.
> 
>  (2) Those that have CMPXCHG or equivalent: 68020, i486+, x86_64, ia64, sparc.
> 
>  (3) Those that have LL/SC or equivalent: mips (some), alpha, powerpc, arm6.
> 

cmpxchg is basically exactly equivalent to a store-conditional, so 2 and 3
are the same level.

I don't know why you don't implement a "good" default implementation with
atomic_cmpxchg.

-- 
SUSE Labs, Novell Inc.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:57                 ` David Howells
  2005-12-14 12:19                   ` Jakub Jelinek
  2005-12-16  1:54                   ` Nick Piggin
@ 2005-12-16 11:02                   ` David Howells
  2005-12-16 13:01                     ` Nick Piggin
  2005-12-16 16:28                     ` Linus Torvalds
  2005-12-16 11:30                   ` David Howells
  3 siblings, 2 replies; 239+ messages in thread
From: David Howells @ 2005-12-16 11:02 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, Arjan van de Ven, Andrew Morton, Alan Cox,
	cfriesen, torvalds, hch, matthew, linux-kernel, linux-arch

Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> >  (2) Those that have CMPXCHG or equivalent: 68020, i486+, x86_64, ia64,
> > sparc.
> >  (3) Those that have LL/SC or equivalent: mips (some), alpha, powerpc, arm6.
> > 
> 
> cmpxchg is basically exactly equivalent to a store-conditional, so 2 and 3
> are the same level.

No, they're not. LL/SC is more flexible than CMPXCHG because under some
circumstances, you can get away without doing the SC, and because sometimes
you can do one LL/SC in lieu of two CMPXCHG's because LL/SC allows you to
retrieve the value, consider it and then modify it if you want to. With
CMPXCHG you have to anticipate, and so you're more likely to get it wrong.

> I don't know why you don't implement a "good" default implementation with
> atomic_cmpxchg.

Because it wouldn't be a good default. I'm thinking the best default is simply
to wrap a counting semaphore. Where overriding this really matters is class 1
CPUs that don't have CMPXCHG, LL/SC, or in the x86 case, LOCK INC/DEC.

I've had a play with x86, and on there CMPXCHG, XCHG and XADD give worse
performance than INC/DEC for some reason. I assume this is something to do
with how the PPro CPU optimises itself. On PPro CPUs at least, counting
semaphores really are the most efficient way. CMPXCHG, whilst it ought to be
better, really isn't.
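
As a rough illustration of the counting-semaphore idea (a user-space C11
sketch, not the kernel code under discussion), a mutex can be treated as a
semaphore whose count starts at 1, with the fast path being a single atomic
decrement. The names sem_mutex_trylock() and sem_mutex_unlock() are
hypothetical, and the sleeping slow path a real semaphore needs is left out:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* count == 1: free; count <= 0: held, or a failed attempt is in flight */

/* Hypothetical helper: try to take the lock with one atomic decrement. */
static bool sem_mutex_trylock(atomic_int *count)
{
	/* atomic_fetch_sub() returns the value before the decrement */
	if (atomic_fetch_sub(count, 1) == 1)
		return true;	/* count was 1, so the lock was free */

	/* lost the race: undo our decrement; a real semaphore would sleep */
	atomic_fetch_add(count, 1);
	return false;
}

/* Hypothetical helper: release the lock by restoring the count. */
static void sem_mutex_unlock(atomic_int *count)
{
	atomic_fetch_add(count, 1);
}
```

A caller would start from "atomic_int count = 1". The trylock can fail
spuriously while another failed attempt is being undone, which is tolerable
for a try operation but is one reason the real thing needs a proper slow path.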

One thing I have noticed, though, is that the counting semaphore tends to be
quite uneven in its distribution across threads in a situation where a lot of
threads are all trying to thrash the semaphore at the same time:

	insmod /tmp/synchro-test.ko v=1 do_sched=1 sm=20 ism=1

gives:

	SEM: 2% 1% 2% 5% 4% 4% 3% 11% 2% 33% 1% 1% 5% 2% 2% 1% 2% 3% 3% 4%

on a dual 200MHz PPro.

Whereas my mutexes are much more even:

	MTX: 5% 5% 4% 4% 4% 5% 5% 5% 5% 4% 4% 4% 4% 4% 6% 5% 4% 4% 4% 6%

(See attached module).

Now, I don't think that this situation is very likely to crop up in ordinary
use, but it seems odd.

David

/* synchro-test.c: run some threads to test the synchronisation primitives
 *
 * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved.
 * Written by David Howells (dhowells@redhat.com)
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 *
 * run as something like:
 *
 *	insmod synchro-test.ko rd=2 wr=2
 *	insmod synchro-test.ko mx=1
 *	insmod synchro-test.ko sm=2 ism=1
 *	insmod synchro-test.ko sm=2 ism=2
 */

#include <linux/config.h>
#include <linux/module.h>
#include <linux/poll.h>
#include <linux/moduleparam.h>
#include <linux/stat.h>
#include <linux/init.h>
#include <asm/atomic.h>
#include <linux/personality.h>
#include <linux/smp_lock.h>
#include <linux/delay.h>
#include <linux/timer.h>
#include <linux/completion.h>
#include <linux/mutex.h>

#define VALIDATE_OPERATORS 0

static int nummx = 0;
static int numsm = 0, seminit = 4;
static int numrd = 0, numwr = 0, numdg = 0;
static int elapse = 5, load = 0, do_sched = 0;
static int verbose = 0;

MODULE_AUTHOR("David Howells");
MODULE_DESCRIPTION("Synchronisation primitive test demo");
MODULE_LICENSE("GPL");

module_param_named(v, verbose, int, 0);
MODULE_PARM_DESC(verbose, "Verbosity");

module_param_named(mx, nummx, int, 0);
MODULE_PARM_DESC(nummx, "Number of mutex threads");

module_param_named(sm, numsm, int, 0);
MODULE_PARM_DESC(numsm, "Number of semaphore threads");

module_param_named(ism, seminit, int, 0);
MODULE_PARM_DESC(seminit, "Initial semaphore value");

module_param_named(rd, numrd, int, 0);
MODULE_PARM_DESC(numrd, "Number of reader threads");

module_param_named(wr, numwr, int, 0);
MODULE_PARM_DESC(numwr, "Number of writer threads");

module_param_named(dg, numdg, int, 0);
MODULE_PARM_DESC(numdg, "Number of downgrader threads");

module_param(elapse, int, 0);
MODULE_PARM_DESC(elapse, "Number of seconds to run for");

module_param(load, int, 0);
MODULE_PARM_DESC(load, "Length of load in uS");

module_param(do_sched, int, 0);
MODULE_PARM_DESC(do_sched, "True if each thread should schedule regularly");

/* the semaphores under test */
static struct mutex ____cacheline_aligned mutex;
static struct semaphore ____cacheline_aligned sem;
static struct rw_semaphore ____cacheline_aligned rwsem;

static atomic_t ____cacheline_aligned do_stuff		= ATOMIC_INIT(0);

#if VALIDATE_OPERATORS
static atomic_t ____cacheline_aligned mutexes		= ATOMIC_INIT(0);
static atomic_t ____cacheline_aligned semaphores	= ATOMIC_INIT(0);
static atomic_t ____cacheline_aligned readers		= ATOMIC_INIT(0);
static atomic_t ____cacheline_aligned writers		= ATOMIC_INIT(0);
#endif

static unsigned int ____cacheline_aligned mutexes_taken[20];
static unsigned int ____cacheline_aligned semaphores_taken[20];
static unsigned int ____cacheline_aligned reads_taken[20];
static unsigned int ____cacheline_aligned writes_taken[20];
static unsigned int ____cacheline_aligned downgrades_taken[20];

static struct completion ____cacheline_aligned mx_comp[20];
static struct completion ____cacheline_aligned sm_comp[20];
static struct completion ____cacheline_aligned rd_comp[20];
static struct completion ____cacheline_aligned wr_comp[20];
static struct completion ____cacheline_aligned dg_comp[20];

static struct timer_list ____cacheline_aligned timer;

#define ACCOUNT(var, N) (var##_taken[N]++)

#if VALIDATE_OPERATORS
#define TRACK(var, dir) atomic_##dir(&(var))

#define CHECK(var, cond, val)						\
do {									\
	int x = atomic_read(&(var));					\
	if (unlikely(!(x cond (val))))					\
		printk("check [%s %s %d, == %d] failed in %s\n",	\
		       #var, #cond, (val), x, __func__);		\
} while (0)

#else
#define TRACK(var, dir)		do {} while(0)
#define CHECK(var, cond, val)	do {} while(0)
#endif

static inline void do_mutex_lock(unsigned int N)
{
	mutex_lock(&mutex);

	ACCOUNT(mutexes, N);
	TRACK(mutexes, inc);
	CHECK(mutexes, ==, 1);
}

static inline void do_mutex_unlock(unsigned int N)
{
	CHECK(mutexes, ==, 1);
	TRACK(mutexes, dec);

	mutex_unlock(&mutex);
}

static inline void do_down(unsigned int N)
{
	CHECK(semaphores, <, seminit);

	down(&sem);

	ACCOUNT(semaphores, N);
	TRACK(semaphores, inc);
}

static inline void do_up(unsigned int N)
{
	CHECK(semaphores, >, 0);
	TRACK(semaphores, dec);

	up(&sem);
}

static inline void do_down_read(unsigned int N)
{
	down_read(&rwsem);

	ACCOUNT(reads, N);
	TRACK(readers, inc);
	CHECK(readers, >, 0);
	CHECK(writers, ==, 0);
}

static inline void do_up_read(unsigned int N)
{
	CHECK(readers, >, 0);
	CHECK(writers, ==, 0);
	TRACK(readers, dec);

	up_read(&rwsem);
}

static inline void do_down_write(unsigned int N)
{
	down_write(&rwsem);

	ACCOUNT(writes, N);
	TRACK(writers, inc);
	CHECK(writers, ==, 1);
	CHECK(readers, ==, 0);
}

static inline void do_up_write(unsigned int N)
{
	CHECK(writers, ==, 1);
	CHECK(readers, ==, 0);
	TRACK(writers, dec);

	up_write(&rwsem);
}

static inline void do_downgrade_write(unsigned int N)
{
	CHECK(writers, ==, 1);
	CHECK(readers, ==, 0);
	TRACK(writers, dec);
	TRACK(readers, inc);

	downgrade_write(&rwsem);

	ACCOUNT(downgrades, N);
}

static inline void sched(void)
{
	if (do_sched)
		schedule();
}

int mutexer(void *arg)
{
	unsigned int N = (unsigned long) arg;

	daemonize("Mutex%u", N);

	while (atomic_read(&do_stuff)) {
		do_mutex_lock(N);
		if (load)
			udelay(load);
		do_mutex_unlock(N);
		sched();
	}

	if (verbose >= 2)
		printk("%s: done\n", current->comm);
	complete_and_exit(&mx_comp[N], 0);
}

int semaphorer(void *arg)
{
	unsigned int N = (unsigned long) arg;

	daemonize("Sem%u", N);

	while (atomic_read(&do_stuff)) {
		do_down(N);
		if (load)
			udelay(load);
		do_up(N);
		sched();
	}

	if (verbose >= 2)
		printk("%s: done\n", current->comm);
	complete_and_exit(&sm_comp[N], 0);
}

int reader(void *arg)
{
	unsigned int N = (unsigned long) arg;

	daemonize("Read%u", N);

	while (atomic_read(&do_stuff)) {
		do_down_read(N);
#ifdef LOAD_TEST
		if (load)
			udelay(load);
#endif
		do_up_read(N);
		sched();
	}

	if (verbose >= 2)
		printk("%s: done\n", current->comm);
	complete_and_exit(&rd_comp[N], 0);
}

int writer(void *arg)
{
	unsigned int N = (unsigned long) arg;

	daemonize("Write%u", N);

	while (atomic_read(&do_stuff)) {
		do_down_write(N);
#ifdef LOAD_TEST
		if (load)
			udelay(load);
#endif
		do_up_write(N);
		sched();
	}

	if (verbose >= 2)
		printk("%s: done\n", current->comm);
	complete_and_exit(&wr_comp[N], 0);
}

int downgrader(void *arg)
{
	unsigned int N = (unsigned long) arg;

	daemonize("Down%u", N);

	while (atomic_read(&do_stuff)) {
		do_down_write(N);
#ifdef LOAD_TEST
		if (load)
			udelay(load);
#endif
		do_downgrade_write(N);
#ifdef LOAD_TEST
		if (load)
			udelay(load);
#endif
		do_up_read(N);
		sched();
	}

	if (verbose >= 2)
		printk("%s: done\n", current->comm);
	complete_and_exit(&dg_comp[N], 0);
}

static void stop_test(unsigned long dummy)
{
	atomic_set(&do_stuff, 0);
}

static unsigned int total(const char *what, unsigned int counts[], int num)
{
	unsigned int tot = 0, max = 0, min = UINT_MAX, zeros = 0, cnt;
	int loop;

	for (loop = 0; loop < num; loop++) {
		cnt = counts[loop];

		if (cnt == 0) {
			zeros++;
			min = 0;
			continue;
		}

		tot += cnt;
		if (cnt > max)
			max = cnt;
		if (cnt < min)
			min = cnt;
	}

	if (verbose && tot > 0) {
		printk("%s:", what);

		for (loop = 0; loop < num; loop++) {
			cnt = counts[loop];

			if (cnt == 0)
				printk(" zzz");
			else
				printk(" %u%%", cnt * 100 / tot);
		}

		printk("\n");
	}

	return tot;
}

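/*****************************************************************************/
/*
 * run the tests: start the requested threads, let them run until the timer
 * fires after "elapse" seconds, then collect and print the per-thread counts
 */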
static int __init do_tests(void)
{
	unsigned long loop;
	unsigned int mutex_total, sem_total, rd_total, wr_total, dg_total;

	if (nummx < 0 || nummx > 20 ||
	    numsm < 0 || numsm > 20 ||
	    numrd < 0 || numrd > 20 ||
	    numwr < 0 || numwr > 20 ||
	    numdg < 0 || numdg > 20 ||
	    seminit < 1 ||
	    elapse < 1
	    ) {
		printk("Parameter out of range\n");
		return -ERANGE;
	}

	if ((nummx | numsm | numrd | numwr | numdg) == 0) {
		printk("Nothing to do\n");
		return -EINVAL;
	}

	if (verbose)
		printk("\nStarting synchronisation primitive tests...\n");

	mutex_init(&mutex);
	sema_init(&sem, seminit);
	init_rwsem(&rwsem);
	atomic_set(&do_stuff, 1);

	/* kick off all the children */
	for (loop = 0; loop < 20; loop++) {
		if (loop < nummx) {
			init_completion(&mx_comp[loop]);
			kernel_thread(mutexer, (void *) loop, 0);
		}

		if (loop < numsm) {
			init_completion(&sm_comp[loop]);
			kernel_thread(semaphorer, (void *) loop, 0);
		}

		if (loop < numrd) {
			init_completion(&rd_comp[loop]);
			kernel_thread(reader, (void *) loop, 0);
		}

		if (loop < numwr) {
			init_completion(&wr_comp[loop]);
			kernel_thread(writer, (void *) loop, 0);
		}

		if (loop < numdg) {
			init_completion(&dg_comp[loop]);
			kernel_thread(downgrader, (void *) loop, 0);
		}
	}

	/* set a stop timer */
	init_timer(&timer);
	timer.function = stop_test;
	timer.expires = jiffies + elapse * HZ;
	add_timer(&timer);

	/* now wait until it's all done */
	for (loop = 0; loop < nummx; loop++)
		wait_for_completion(&mx_comp[loop]);

	for (loop = 0; loop < numsm; loop++)
		wait_for_completion(&sm_comp[loop]);

	for (loop = 0; loop < numrd; loop++)
		wait_for_completion(&rd_comp[loop]);

	for (loop = 0; loop < numwr; loop++)
		wait_for_completion(&wr_comp[loop]);

	for (loop = 0; loop < numdg; loop++)
		wait_for_completion(&dg_comp[loop]);

	atomic_set(&do_stuff, 0);
	del_timer(&timer);

	if (mutex_is_locked(&mutex))
		printk(KERN_ERR "Mutex is still locked!\n");

	/* count up */
	mutex_total	= total("MTX", mutexes_taken, nummx);
	sem_total	= total("SEM", semaphores_taken, numsm);
	rd_total	= total("RD ", reads_taken, numrd);
	wr_total	= total("WR ", writes_taken, numwr);
	dg_total	= total("DG ", downgrades_taken, numdg);

	/* print the results */
	if (verbose) {
		printk("mutexes taken: %u\n", mutex_total);
		printk("semaphores taken: %u\n", sem_total);
		printk("reads taken: %u\n", rd_total);
		printk("writes taken: %u\n", wr_total);
		printk("downgrades taken: %u\n", dg_total);
	}
	else {
		printk("%3d %3d %3d %3d %3d %c %3d %9u %9u %9u %9u %9u\n",
		       nummx, numsm, numrd, numwr, numdg,
		       do_sched ? 's' : '-',
		       load,
		       mutex_total,
		       sem_total,
		       rd_total,
		       wr_total,
		       dg_total);
	}

	/* tell insmod to discard the module */
	if (verbose)
		printk("Tests complete\n");
	return -ENOANO;

} /* end do_tests() */

module_init(do_tests);

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 11:02                   ` David Howells
@ 2005-12-16 13:01                     ` Nick Piggin
  2005-12-16 13:21                       ` Russell King
  2005-12-17 15:57                       ` Nikita Danilov
  2005-12-16 16:28                     ` Linus Torvalds
  1 sibling, 2 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-16 13:01 UTC (permalink / raw)
  To: David Howells
  Cc: Arjan van de Ven, Andrew Morton, Alan Cox, cfriesen, torvalds,
	hch, matthew, linux-kernel, linux-arch

David Howells wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
> 
>>> (2) Those that have CMPXCHG or equivalent: 68020, i486+, x86_64, ia64,
>>>sparc.
>>> (3) Those that have LL/SC or equivalent: mips (some), alpha, powerpc, arm6.
>>>
>>
>>cmpxchg is basically exactly equivalent to a store-conditional, so 2 and 3
>>are the same level.
> 
> 
> No, they're not. LL/SC is more flexible than CMPXCHG because under some
> circumstances, you can get away without doing the SC, and because sometimes
> you can do one LL/SC in lieu of two CMPXCHG's because LL/SC allows you to
> retrieve the value, consider it and then modify it if you want to. With
> CMPXCHG you have to anticipate, and so you're more likely to get it wrong.
> 

I don't think that is more flexible, just different. For example with
cmpxchg you may not have to do the explicit load if you anticipate an
unlocked mutex as the fastpath.

My point is that they are of semantically equal strength.

> 
>>I don't know why you don't implement a "good" default implementation with
>>atomic_cmpxchg.
> 
> 
> Because it wouldn't be a good default.

You were proposing a worse default, which is the reason I suggested it.

> I'm thinking the best default is simply
> to wrap a counting semaphore.

Probably.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 13:01                     ` Nick Piggin
@ 2005-12-16 13:21                       ` Russell King
  2005-12-16 13:41                         ` Nick Piggin
  2005-12-16 13:46                         ` Linh Dang
  2005-12-17 15:57                       ` Nikita Danilov
  1 sibling, 2 replies; 239+ messages in thread
From: Russell King @ 2005-12-16 13:21 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, Arjan van de Ven, Andrew Morton, Alan Cox,
	cfriesen, torvalds, hch, matthew, linux-kernel, linux-arch

On Sat, Dec 17, 2005 at 12:01:27AM +1100, Nick Piggin wrote:
> You were proposing a worse default, which is the reason I suggested it.

I'd like to qualify that.  "for architectures with native cmpxchg".

For general consumption (not specifically related to mutex stuff)...

For architectures with llsc, sequences such as:

	load
	modify
	cmpxchg

are inefficient because they have to be implemented as:

	load
	modify
	load
	compare
	store conditional

Now, if we consider using llsc as the basis of atomic operations:

	load
	modify
	store conditional

and for cmpxchg-based architectures:

	load
	modify
	cmpxchg

Notice that the cmpxchg-based case does _not_ get any worse - in fact
it's exactly identical.  Note, however, that the llsc case becomes
more efficient.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 13:21                       ` Russell King
@ 2005-12-16 13:41                         ` Nick Piggin
  2005-12-16 13:46                         ` Linh Dang
  1 sibling, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-16 13:41 UTC (permalink / raw)
  To: Russell King
  Cc: David Howells, Arjan van de Ven, Andrew Morton, Alan Cox,
	cfriesen, torvalds, hch, matthew, linux-kernel, linux-arch

Russell King wrote:
> On Sat, Dec 17, 2005 at 12:01:27AM +1100, Nick Piggin wrote:
> 
>>You were proposing a worse default, which is the reason I suggested it.
> 
> 
> I'd like to qualify that.  "for architectures with native cmpxchg".
> 
> For general consumption (not specifically related to mutex stuff)...
> 
> For architectures with llsc, sequences such as:
> 
> 	load
> 	modify
> 	cmpxchg
> 
> are inefficient because they have to be implemented as:
> 
> 	load
> 	modify
> 	load
> 	compare
> 	store conditional
> 
> Now, if we consider using llsc as the basis of atomic operations:
> 
> 	load
> 	modify
> 	store conditional
> 
> and for cmpxchg-based architectures:
> 
> 	load
> 	modify
> 	cmpxchg
> 
> Notice that the cmpxchg-based case does _not_ get any worse - in fact
> it's exactly identical.  Note, however, that the llsc case becomes
> more efficient.
> 

True in many cases. However in a lock fastpath one could do the
atomic_cmpxchg without an initial load, assuming the lock is
unlocked.

atomic_cmpxchg(&lock, UNLOCKED, LOCKED)

which should basically end up as optimal code on both the cmpxchg and
ll/sc platforms (aside from other quirks David pointed out, like
cmpxchg being slower than lock inc on x86).

Ah - I see you pointed out "for general consumption", I missed that.
Indeed for general consumption one should still be careful using
atomic_cmpxchg.
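
For illustration, a minimal user-space C11 sketch of that fastpath (not the
kernel's atomic_cmpxchg(); fastpath_trylock() and fastpath_unlock() are
hypothetical names). The compare-exchange assumes the lock is UNLOCKED, so no
separate load is issued first:

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { UNLOCKED = 0, LOCKED = 1 };

/* Hypothetical helper: one compare-exchange, no prior load of the lock. */
static bool fastpath_trylock(atomic_int *lock)
{
	int expected = UNLOCKED;	/* anticipate the uncontended case */

	/* succeeds only if *lock still equals UNLOCKED */
	return atomic_compare_exchange_strong(lock, &expected, LOCKED);
}

/* Hypothetical helper: release by storing UNLOCKED. */
static void fastpath_unlock(atomic_int *lock)
{
	atomic_store(lock, UNLOCKED);
}
```

If the compare-exchange fails, a real mutex would fall through to a slow path
that queues and sleeps; that part is not shown here.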

Nick

-- 
SUSE Labs, Novell Inc.


^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 13:21                       ` Russell King
  2005-12-16 13:41                         ` Nick Piggin
@ 2005-12-16 13:46                         ` Linh Dang
  2005-12-16 14:31                           ` Russell King
  2005-12-16 15:46                           ` David Howells
  1 sibling, 2 replies; 239+ messages in thread
From: Linh Dang @ 2005-12-16 13:46 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, Arjan van de Ven, Andrew Morton, Alan Cox,
	Christopher Friesen, torvalds, hch, matthew, linux-kernel,
	linux-arch


Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> On Sat, Dec 17, 2005 at 12:01:27AM +1100, Nick Piggin wrote:
>> You were proposing a worse default, which is the reason I suggested
>> it.
>
> I'd like to qualify that.  "for architectures with native cmpxchg".
>
> For general consumption (not specifically related to mutex stuff)...
>
> For architectures with llsc, sequences such as:
>
> 	load
> 	modify
> 	cmpxchg
>
> are inefficient because they have to be implemented as:
>
> 	load
> 	modify
> 	load
> 	compare
> 	store conditional
>

I don't know what arch you have in mind, but for ppc it is:

        load-reserve
        modify
        store-conditional

and NOT the sequence you show.

-- 
Linh Dang

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 13:46                         ` Linh Dang
@ 2005-12-16 14:31                           ` Russell King
  2005-12-16 15:24                             ` Linh Dang
  2005-12-16 15:49                             ` Linh Dang
  2005-12-16 15:46                           ` David Howells
  1 sibling, 2 replies; 239+ messages in thread
From: Russell King @ 2005-12-16 14:31 UTC (permalink / raw)
  To: Linh Dang
  Cc: Nick Piggin, David Howells, Arjan van de Ven, Andrew Morton,
	Alan Cox, Christopher Friesen, torvalds, hch, matthew,
	linux-kernel, linux-arch

On Fri, Dec 16, 2005 at 08:46:44AM -0500, Linh Dang wrote:
> 
> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> 
> > On Sat, Dec 17, 2005 at 12:01:27AM +1100, Nick Piggin wrote:
> >> You were proposing a worse default, which is the reason I suggested
> >> it.
> >
> > I'd like to qualify that.  "for architectures with native cmpxchg".
> >
> > For general consumption (not specifically related to mutex stuff)...
> >
> > For architectures with llsc, sequences such as:
> >
> > 	load
> > 	modify
> > 	cmpxchg
> >
> > are inefficient because they have to be implemented as:
> >
> > 	load
> > 	modify
> > 	load
> > 	compare
> > 	store conditional
> >
> 
> I don't know what arch you have in mind, but for ppc it is:
> 
>         load-reserve
>         modify
>         store-conditional
> 
> and NOT the sequence you show.

Wrong - because you haven't understood what I'm getting at.  If you're
using "cmpxchg" as the low level generic atomic operation (as in the
atomic_cmpxchg() function) then atomic_cmpxchg _has_ to be implemented
on llsc as:

	load (reserve if you need this detail)
	compare
	store conditional

So, let's illustrate this.  Let's say you want to atomically multiply
a value by N.

	do {
		old = atomic_read(&foo);
		new = old * N;
	} while(atomic_cmpxchg(&foo, old, new) != old);

For an architecture supporting cmpxchg, this becomes:

loop:	load foo => old
	new = old * N
	cmpxchg ret, old, new, foo
	compare ret & old
	if not equal goto loop

And for architectures with llsc, this becomes:

loop:	load foo => old
	new = old * N
loop2:	load locked foo => ret
	compare ret & old
	if equal store conditional new in foo
		if store failed because we lost the lock, goto loop2
	compare ret & old
	if not equal goto loop

Do you now see what I mean?  (yup, ARM is a llsc architecture.)
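
The generic loop above can be tried out in user space with C11 atomics. This
is only a sketch under that assumption (atomic_mul() is a hypothetical name,
and C11's compare-exchange stands in for the kernel's atomic_cmpxchg()). The
weak form is allowed to fail spuriously, which is what lets a compiler map it
onto a single load-locked/store-conditional attempt:

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically multiply *v by n, return the new value. */
static int atomic_mul(atomic_int *v, int n)
{
	int old = atomic_load(v);
	int new_val;

	do {
		new_val = old * n;
		/* on failure, old is refreshed with the current *v */
	} while (!atomic_compare_exchange_weak(v, &old, new_val));

	return new_val;
}
```

Because a failed compare-exchange hands back the value it found, the loop
does not need to issue a separate reload before retrying.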

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 14:31                           ` Russell King
@ 2005-12-16 15:24                             ` Linh Dang
  2005-12-16 15:35                               ` Nick Piggin
  2005-12-16 15:40                               ` Kyle Moffett
  2005-12-16 15:49                             ` Linh Dang
  1 sibling, 2 replies; 239+ messages in thread
From: Linh Dang @ 2005-12-16 15:24 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, Arjan van de Ven, Andrew Morton, Alan Cox,
	Christopher Friesen, torvalds, hch, matthew, linux-kernel,
	linux-arch

Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> On Fri, Dec 16, 2005 at 08:46:44AM -0500, Linh Dang wrote:
>>
>> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
>>
>>> On Sat, Dec 17, 2005 at 12:01:27AM +1100, Nick Piggin wrote:
>>>> You were proposing a worse default, which is the reason I
>>>> suggested it.
>>>
>>> I'd like to qualify that.  "for architectures with native
>>> cmpxchg".
>>>
>>> For general consumption (not specifically related to mutex
>>> stuff)...
>>>
>>> For architectures with llsc, sequences such as:
>>>
>>> 	load
>>> 	modify
>>> 	cmpxchg
>>>
>>> are inefficient because they have to be implemented as:
>>>
>>> 	load
>>> 	modify
>>> 	load
>>> 	compare
>>> 	store conditional
>>>
>>
>> I don't know what arch you have in mind, but for ppc it is:
>>
>> load-reserve
>> modify
>> store-conditional
>>
>> and NOT the sequence you show.
>
> Wrong - because you haven't understood what I'm getting at.  If
> you're using "cmpxchg" as the low level generic atomic operation (as
> in the atomic_cmpxchg() function) then atomic_cmpxchg _has_ to be
> implemented on llsc as:
>
> 	load (reserve if you need this detail)
> 	compare
> 	store conditional
>
> So, let's illustrate this.  Let's say you want to atomically
> multiply a value by N.
>
> 	do {
> 		old = atomic_read(&foo);
> 		new = old * N;
> 	} while(atomic_cmpxchg(&foo, old, new) != old);
>
> For an architecture supporting cmpxchg, this becomes:
>
> loop:	load foo => old
> 	new = old * N
> 	cmpxchg ret, old, new, foo
> 	compare ret & old
> 	if not equal goto loop
>
> And for architectures with llsc, this becomes:
>
> loop:	load foo => old
> 	new = old * N
> loop2:	load locked foo => ret
> 	compare ret & old
> 	if equal store conditional new in foo
> 		if store failed because we lost the lock, goto loop2
> 	compare ret & old
> 	if not equal goto loop
>
> Do you now see what I mean?  (yup, ARM is a llsc architecture.)

Well, it may be true for ARM, but for ppc (I don't know exactly what llsc
means, but someone in the thread put ppc in the llsc group) it's:

   loop:
        load-reserve foo => old
        new = old * N
        store-conditional new => foo
        if failed goto loop     

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 15:24                             ` Linh Dang
@ 2005-12-16 15:35                               ` Nick Piggin
  2005-12-16 15:40                               ` Kyle Moffett
  1 sibling, 0 replies; 239+ messages in thread
From: Nick Piggin @ 2005-12-16 15:35 UTC (permalink / raw)
  To: Linh Dang
  Cc: David Howells, Arjan van de Ven, Andrew Morton, Alan Cox,
	Christopher Friesen, torvalds, hch, matthew, linux-kernel,
	linux-arch

Linh Dang wrote:

>>Do you now see what I mean?  (yup, ARM is a llsc architecture.)
> 
> 
> Well, it may be true for ARM, but for ppc (I don't know exactly what llsc
> means, but someone in the thread put ppc in the llsc group) it's:
> 

load locked or load with lock, IIRC.

>    loop:
>         load-reserve foo => old
>         new = old * N
>         store-conditional new => foo
>         if failed goto loop     
> 

The point is that the typical use case for a cmpxchg is less optimal
if cmpxchg is simulated with llsc than if the same functionality were
directly implemented with llsc instructions.

-- 
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 15:24                             ` Linh Dang
  2005-12-16 15:35                               ` Nick Piggin
@ 2005-12-16 15:40                               ` Kyle Moffett
  1 sibling, 0 replies; 239+ messages in thread
From: Kyle Moffett @ 2005-12-16 15:40 UTC (permalink / raw)
  To: Linh Dang
  Cc: Nick Piggin, David Howells, Arjan van de Ven, Andrew Morton,
	Alan Cox, Christopher Friesen, torvalds, hch, matthew,
	linux-kernel, linux-arch

On Dec 16, 2005, at 10:24, Linh Dang wrote:
> Well, it may be true for ARM, but for ppc (I don't know exactly what llsc
> means, but someone in the thread put ppc in the llsc group) it's:
>
>    loop:
>         load-reserve foo => old
>         new = old * N
>         store-conditional new => foo
>         if failed goto loop

LLSC == Load-Locked/Store-Conditional.  It's a slightly different  
name for your Load-Reserve/Store-Conditional

You still miss his point.  That is _GOOD_ code.  Russell's point is  
that if somebody does this in generic code:

do {
	old = atomic_read(&foo);
	new = old * 2;
} while (atomic_cmpxchg(&foo, old, new) != old);

On PPC or ARM or another LLSC architecture it does not end up looking  
like the good code, it looks like this (which is clearly inefficient):

>> And for architectures with llsc, this becomes:
>>
>> loop:	load foo => old
>> 	new = old * N
>> loop2:	load locked foo => ret
>> 	compare ret & old
>> 	if equal store conditional new in foo
>> 		if store failed because we lost the lock, goto loop2
>> 	compare ret & old
>> 	if not equal goto loop

Cheers,
Kyle Moffett





^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 14:31                           ` Russell King
  2005-12-16 15:24                             ` Linh Dang
@ 2005-12-16 15:49                             ` Linh Dang
  1 sibling, 0 replies; 239+ messages in thread
From: Linh Dang @ 2005-12-16 15:49 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Howells, Arjan van de Ven, Andrew Morton, Alan Cox,
	Christopher Friesen, torvalds, hch, matthew, linux-kernel,
	linux-arch

Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> Do you now see what I mean?  (yup, ARM is a llsc architecture.)

Oh, I do see your point now! Sorry for all the newbie noise!

-- 
Linh Dang

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 13:46                         ` Linh Dang
  2005-12-16 14:31                           ` Russell King
@ 2005-12-16 15:46                           ` David Howells
  2005-12-16 15:58                             ` Russell King
  1 sibling, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-16 15:46 UTC (permalink / raw)
  To: Russell King
  Cc: Linh Dang, Nick Piggin, David Howells, Arjan van de Ven,
	Andrew Morton, Alan Cox, Christopher Friesen, torvalds, hch,
	matthew, linux-kernel, linux-arch

Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> Do you now see what I mean?  (yup, ARM is a llsc architecture.)

Out of interest, at what point did ARM become so? ARM6?

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 15:46                           ` David Howells
@ 2005-12-16 15:58                             ` Russell King
  0 siblings, 0 replies; 239+ messages in thread
From: Russell King @ 2005-12-16 15:58 UTC (permalink / raw)
  To: David Howells
  Cc: Linh Dang, Nick Piggin, Arjan van de Ven, Andrew Morton,
	Alan Cox, Christopher Friesen, torvalds, hch, matthew,
	linux-kernel, linux-arch

On Fri, Dec 16, 2005 at 03:46:41PM +0000, David Howells wrote:
> Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> 
> > Do you now see what I mean?  (yup, ARM is a llsc architecture.)
> 
> Out of interest, at what point did ARM become so? ARM6?

Yes, ARM architecture version 6.

See the ldrex (load exclusive) / strex (store exclusive) instructions.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 13:01                     ` Nick Piggin
  2005-12-16 13:21                       ` Russell King
@ 2005-12-17 15:57                       ` Nikita Danilov
  1 sibling, 0 replies; 239+ messages in thread
From: Nikita Danilov @ 2005-12-17 15:57 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Arjan van de Ven, Andrew Morton, Alan Cox, cfriesen, torvalds,
	hch, matthew, linux-kernel, linux-arch

Nick Piggin writes:
 > David Howells wrote:
 > > Nick Piggin <nickpiggin@yahoo.com.au> wrote:
 > > 
 > > 
 > >>> (2) Those that have CMPXCHG or equivalent: 68020, i486+, x86_64, ia64,
 > >>>sparc.
 > >>> (3) Those that have LL/SC or equivalent: mips (some), alpha, powerpc, arm6.
 > >>>
 > >>
 > >>cmpxchg is basically exactly equivalent to a store-conditional, so 2 and 3
 > >>are the same level.
 > > 
 > > 
 > > No, they're not. LL/SC is more flexible than CMPXCHG because under some
 > > circumstances, you can get away without doing the SC, and because sometimes
 > > you can do one LL/SC in lieu of two CMPXCHG's because LL/SC allows you to
 > > retrieve the value, consider it and then modify it if you want to. With
 > > CMPXCHG you have to anticipate, and so you're more likely to get it wrong.
 > > 
 > 
 > I don't think that is more flexible, just different. For example with
 > cmpxchg you may not have to do the explicit load if you anticipate an
 > unlocked mutex as the fastpath.
 > 
 > My point is that they are of semantically equal strength.

In the context of implementing mutex they most likely are. But not
generally: LL/SC fails when _any_ write was made into monitored
location, whereas CAS fails only when value stored in that location
changes. As a result, CAS has to deal with "ABA problem" when value
(e.g., first element in a queue) is changed from A to B (head of the
queue is removed) and then back to A (old head is inserted back).
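
To make the A-B-A case concrete, here is a minimal single-threaded
sketch using C11 atomics. The function name and the use of <stdatomic.h>
are assumptions made for illustration only; the intervening writes that
another CPU would perform are simply simulated inline.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical illustration: returns true if a compare-and-swap that
 * expects 'a' still succeeds after the location went a -> b -> a.
 */
bool cas_misses_aba(int a, int b)
{
	atomic_int loc;
	int expected;

	atomic_init(&loc, a);
	expected = atomic_load(&loc);	/* observer samples A */

	atomic_store(&loc, b);		/* "someone else": A -> B */
	atomic_store(&loc, a);		/* ... and back:    B -> A */

	/* CAS compares values only, so the two writes above are invisible. */
	return atomic_compare_exchange_strong(&loc, &expected, a + 1);
}
```

The CAS succeeds here even though the location was written twice in
between, whereas a store-conditional paired with a load-locked taken at
the point of the first read would be expected to fail.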

Nikita.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 11:02                   ` David Howells
  2005-12-16 13:01                     ` Nick Piggin
@ 2005-12-16 16:28                     ` Linus Torvalds
  1 sibling, 0 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-16 16:28 UTC (permalink / raw)
  To: David Howells
  Cc: Nick Piggin, Arjan van de Ven, Andrew Morton, Alan Cox, cfriesen,
	hch, matthew, linux-kernel, linux-arch

On Fri, 16 Dec 2005, David Howells wrote:
> 
> No, they're not. LL/SC is more flexible than CMPXCHG because under some
> circumstances, you can get away without doing the SC, and because sometimes
> you can do one LL/SC in lieu of two CMPXCHG's because LL/SC allows you to
> retrieve the value, consider it and then modify it if you want to. With
> CMPXCHG you have to anticipate, and so you're more likely to get it wrong.

You can think of LL/SC as directly translating into LD/CMPXCHG, so in that 
sense CMPXCHG is no less flexible. LL/SC still has other advantages, 
though. See later.

> I've had a play with x86, and on there CMPXCHG, XCHG and XADD give worse
> performance than INC/DEC for some reason. I assume this is something to do
> with how the PPro CPU optimises itself. On PPro CPUs at least, counting
> semaphores really are the most efficient way. CMPXCHG, whilst it ought to be
> better, really isn't.

The notion that CMPXCHG "ought to be better" is a load of bull.

There are two advantages of "lock inc/dec" over "ld/cmpxchg": one is the 
obvious one that the CPU core just has a much easier time with the 
unconditional one, and never has to worry about things like conditional 
branches or waste cycles on multiple instructions. Just compare the 
sequences:

	lock inc mem

vs

   back:
	load mem,reg1
	reg2 = reg1+1
	cmpxchg mem,reg1,reg2
	jne forward		# get branch prediction right
	return
   forward:
	jmp back

guess which one is faster?
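
For anyone who wants to look at the generated code, the two sequences
can be sketched in C11 roughly as below. The function names are made up
for illustration, and what the compiler actually emits for each one
depends on the target and compiler, so treat this as a starting point
for an experiment rather than a statement about any particular CPU.

```c
#include <stdatomic.h>

/* Unconditional read-modify-write: a single atomic add. */
int inc_simple(atomic_int *v)
{
	return atomic_fetch_add(v, 1) + 1;
}

/* The load + compare-and-swap retry loop from the second sequence. */
int inc_cas(atomic_int *v)
{
	int old = atomic_load(v);

	/* On failure 'old' is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak(v, &old, old + 1))
		;
	return old + 1;
}
```

Both return the incremented value, so they are interchangeable from the
caller's point of view; the difference is only in the instruction
sequence and the cache traffic behind it.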

The other one depends on cache coherency: the "lock inc" can just get the 
cacheline for exclusive use immediately ("read with intent to write"). In 
contrast, the ld/cmpxchg first gets the cacheline for reading, and then 
has to turn it into an exclusive one. IOW, there may literally be lots of 
extra bus traffic from doing a load first.

In other words, there are several advantages to just using the simple 
instructions. 

(Of course, some CPU's have "get cacheline for write" instructions, so you 
can then make the second sequence even longer by using that).

Using "xadd" should be fine, although for all I know, even then 
microarchitectural issues may make it cheaper to use the simpler "lock 
add" whenever possible.

In LL/SC, I _think_ LL generally does its read with intent to write. 

			Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:57                 ` David Howells
                                     ` (2 preceding siblings ...)
  2005-12-16 11:02                   ` David Howells
@ 2005-12-16 11:30                   ` David Howells
  2005-12-16 16:33                       ` Linus Torvalds
  3 siblings, 1 reply; 239+ messages in thread
From: David Howells @ 2005-12-16 11:30 UTC (permalink / raw)
  To: David Howells
  Cc: Nick Piggin, Arjan van de Ven, Andrew Morton, Alan Cox, cfriesen,
	torvalds, hch, matthew, linux-kernel, linux-arch

David Howells <dhowells@redhat.com> wrote:

> No, they're not. LL/SC is more flexible than CMPXCHG because under some
> circumstances, you can get away without doing the SC,

Of course, CMPXCHG doesn't have to store either, though it still performs a
locked-write-cycle on x86 if I remember correctly.

David

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 11:30                   ` David Howells
@ 2005-12-16 16:33                       ` Linus Torvalds
  0 siblings, 0 replies; 239+ messages in thread
From: Linus Torvalds @ 2005-12-16 16:33 UTC (permalink / raw)
  To: David Howells
  Cc: Nick Piggin, Arjan van de Ven, Andrew Morton, Alan Cox, cfriesen,
	hch, matthew, linux-kernel, linux-arch



On Fri, 16 Dec 2005, David Howells wrote:
> 
> Of course, CMPXCHG doesn't have to store either, though it still performs a
> locked-write-cycle on x86 if I remember correctly.

It does so on any sane architecture (side note: you don't do locked 
memory cycles on the bus these days. You do cache coherency protocols).

From a bus standpoint you _have_ to do the initial read with intent to
write, nothing else makes any sense. You'll just waste bus cycles 
otherwise. Sure, the write may never come, but it just isn't sensible to 
optimize for the case where the compare will fail. If that's the common 
case, then software is doing something wrong (it should do just a much 
cheaper "load + compare" first if it knows it's probably going to fail).
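
A minimal sketch of that "load + compare first" pattern, assuming C11
atomics and a made-up lock word where 0 means free and 1 means held:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical trylock: only attempt the CAS when it can plausibly win. */
bool trylock_checked(atomic_int *lock)
{
	int expected = 0;

	/* Cheap read-only check; no need to own the cacheline for this. */
	if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
		return false;

	/* Looked free a moment ago: now do the real compare-and-swap. */
	return atomic_compare_exchange_strong(lock, &expected, 1);
}
```

The plain load only pays off when failure is the expected outcome; in
the uncontended case it is just an extra instruction in front of the
CAS.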

		Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 16:33                       ` Linus Torvalds
  (?)
@ 2005-12-16 22:23                       ` David S. Miller
  2005-12-16 22:38                         ` Linus Torvalds
  -1 siblings, 1 reply; 239+ messages in thread
From: David S. Miller @ 2005-12-16 22:23 UTC (permalink / raw)
  To: torvalds
  Cc: dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch, matthew,
	linux-kernel, linux-arch

From: Linus Torvalds <torvalds@osdl.org>
Date: Fri, 16 Dec 2005 08:33:10 -0800 (PST)

> From a bus standpoint you _have_ to do the initial read with intent to 
> write, nothing else makes any sense. You'll just waste bus cycles 
> otherwise. Sure, the write may never come, but it just isn't sensible to 
> optimize for the case where the compare will fail. If that's the common 
> case, then software is doing something wrong (it should do just a much 
> cheaper "load + compare" first if it knows it's probably going to fail).

Actually, this points out a problem with "compare and swap".  The
typical loop is of the form:

	LOAD [MEM], REG1
	OP   REG1, X, REG2
	CAS  [MEM], REG1, REG2

That first LOAD instruction, if it misses in the L2, causes the cache
line to be requested for sharing.  Then the CAS instruction will need
to issue another cache coherency transaction to get the cache line
into owned state.

Basically, this guarantees that you'll have 2 cache coherency
transactions, a huge waste, every time an atomic update sequence
executes for a data item not in cache already.

(Are there any CPUs that peek forward and look for the CAS
 instruction to decide to issue the more appropriate request
 for the cache line in Owned state?  That would be cool...)

At least with "load locked / store conditional" the cpu is being told
that we intend to write to that cache line, so it can request sole
ownership on the bus when the load misses.

The only workaround I can come up with is to do a prefetch for write
right before the LOAD.  I've been tempted to add this on sparc64 for a
long time.
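
In C the workaround might look roughly like the sketch below. It assumes
the GCC/Clang __builtin_prefetch() builtin (second argument 1 meaning
"prepare for write"); whether that turns into a real prefetch-for-write
on a given CPU, and whether the hardware honours it, is entirely
target-dependent. The function name is hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical atomic add: hint for write ownership before the load. */
int add_with_write_prefetch(atomic_int *v, int x)
{
	int old;

	/* Only a hint; the CPU is free to drop it. */
	__builtin_prefetch((const void *)v, 1);

	old = atomic_load_explicit(v, memory_order_relaxed);	/* LOAD */
	while (!atomic_compare_exchange_weak(v, &old, old + x))	/* OP + CAS */
		;
	return old + x;
}
```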

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:23                       ` David S. Miller
@ 2005-12-16 22:38                         ` Linus Torvalds
  2005-12-16 22:53                           ` David S. Miller
  0 siblings, 1 reply; 239+ messages in thread
From: Linus Torvalds @ 2005-12-16 22:38 UTC (permalink / raw)
  To: David S. Miller
  Cc: dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch, matthew,
	linux-kernel, linux-arch

On Fri, 16 Dec 2005, David S. Miller wrote:
> 
> Actually, this points out a problem with "compare and swap".  The
> typical loop is of the form:
> 
> 	LOAD [MEM], REG1
> 	OP   REG1, X, REG2
> 	CAS  [MEM], REG1, REG2
> 
> That first LOAD instruction, if it misses in the L2, causes the cache
> line to be requested for sharing.  Then the CAS instruction will need
> to issue another cache coherency transaction to get the cache line
> into owned state.

A number of architectures have a "prefetch for write ownership" 
instruction that you can use for this. Exactly because "ld+cas" should 
not get a shared line initially.

I thought sparc had an ASI to do the same? No?

> (Are there any CPUs that peek forward and look for the CAS
>  instruction to decide to issue the more appropriate request
>  for the cache line in Owned state?  That would be cool...)

I don't think anybody does, although it wouldn't be impossible. Any OoO 
processor would _tend_ to have enough visibility that they could see any 
stores that are "close" (not just a cas) and might be clever enough to 
modify a memory read to be a read-with-intent-to-write op.

Of course, if it was in memory (as opposed to somebody else's caches), I
think most cache protocols will start it up in exclusive state anyway. 
Same may or may not happen when you have a dirty hit on another CPU that 
requires a write-back (ie the other CPU would always invalidate on 
writeback).

It would seem to be the obvious thing to do for better lock performance, 
and I'd assume that locks are some of the most common cases of real cache 
interactions, so maybe the shared case only effectively happens if two 
CPU's are reading at the same time.

Somebody who looks at cache protocol diagrams could check. I'm too lazy.

		Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:38                         ` Linus Torvalds
@ 2005-12-16 22:53                           ` David S. Miller
  2005-12-17  0:41                             ` Jesse Barnes
  2005-12-17 22:38                             ` Richard Henderson
  0 siblings, 2 replies; 239+ messages in thread
From: David S. Miller @ 2005-12-16 22:53 UTC (permalink / raw)
  To: torvalds
  Cc: dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch, matthew,
	linux-kernel, linux-arch

From: Linus Torvalds <torvalds@osdl.org>
Date: Fri, 16 Dec 2005 14:38:47 -0800 (PST)

> A number of architectures have a "prefetch for write ownership" 
> instruction that you can use for this. Exactly because "ld+cas" should 
> not get a shared line initially.
> 
> I thought sparc had an ASI to do the same? No?

No, no special ASI exists to do that, although it would be nice. :-)
I'd have to use a prefetch for write.

BTW, it is interesting that you can use CAS to get a cache line into
the local processor in Owned state with 100% certainty (unlike
prefetch for write which might get cancelled) by doing something like:

	CAS	[MEM], ZERO, ZERO

and you can do this to any valid memory address without changing the
contents.  This is useful for doing things like resetting parity bits
while doing memory error recovery.
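
The "does not change the contents" part can be checked with a small C11
sketch (function name made up; this only demonstrates the value-level
behaviour, it says nothing about which cache state the line ends up in
on any given machine):

```c
#include <stdatomic.h>

/* Hypothetical helper: compare-and-swap zero with zero. */
void cas_touch(atomic_int *mem)
{
	int zero = 0;

	/*
	 * If *mem is 0 the CAS stores 0 again; otherwise it fails and
	 * stores nothing.  Either way the contents are unchanged.
	 */
	(void)atomic_compare_exchange_strong(mem, &zero, 0);
}
```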

> It would seem to be the obvious thing to do for better lock performance, 
> and I'd assume that locks are some of the most common cases of real cache 
> interactions, so maybe the shared case only effectively happens if two 
> CPU's are reading at the same time.
> 
> Somebody who looks at cache protocol diagrams could check. I'm too lazy.

For both MOESI and MOSI cache coherency protocols, misses on loads
result in a Shared state cache line when another processor has the
data in its cache too, regardless of whether that line in the other
cpu is dirty or not.

When the write comes along, the next transaction occurs to kick it
out of the other cpu(s)' caches and then the local line is placed into
Owned state.

I'll have to add "put write prefetch in CAS sequences" onto my sparc64
TODO list :-)

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:53                           ` David S. Miller
@ 2005-12-17  0:41                             ` Jesse Barnes
  2005-12-17  7:10                               ` David S. Miller
  2005-12-17 22:38                             ` Richard Henderson
  1 sibling, 1 reply; 239+ messages in thread
From: Jesse Barnes @ 2005-12-17  0:41 UTC (permalink / raw)
  To: David S. Miller
  Cc: torvalds, dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch,
	matthew, linux-kernel, linux-arch

On Friday, December 16, 2005 2:53 pm, David S. Miller wrote:
> When the write comes along, the next transaction occurs to kick it
> out of the other cpu(s)' caches and then the local line is placed into
> Owned state.
>
> I'll have to add "put write prefetch in CAS sequences" onto my sparc64
> TODO list :-)

Note that under contention prefetching with a write bias can cause a lot 
more cache line bouncing than a regular load into shared state (assuming 
you do a load and test before you try the CAS).  We actually saw this on 
large Altix machines, 
http://lia64.bkbits.net:8080/to-linus-2.5/cset%403f2082b3xCvMG9OSeNu3aWhoe6jnOg?nav=index.html|src/.|src/include|src/include/asm-ia64|related/include/asm-ia64/spinlock.h
fixed things up for us.

Jesse

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  0:41                             ` Jesse Barnes
@ 2005-12-17  7:10                               ` David S. Miller
  2005-12-17  7:40                                 ` Linus Torvalds
  2005-12-17 17:19                                 ` Jesse Barnes
  0 siblings, 2 replies; 239+ messages in thread
From: David S. Miller @ 2005-12-17  7:10 UTC (permalink / raw)
  To: jbarnes
  Cc: torvalds, dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch,
	matthew, linux-kernel, linux-arch

From: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Fri, 16 Dec 2005 16:41:49 -0800

> Note that under contention prefetching with a write bias can cause a lot 
> more cache line bouncing than a regular load into shared state (assuming 
> you do a load and test before you try the CAS).

If there is some test guarding the CAS, yes.

But if there isn't, for things like atomic increment and
decrement, where the CAS is unconditional, you'll always
eat the two bus transactions without the prefetch for write.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  7:10                               ` David S. Miller
@ 2005-12-17  7:40                                 ` Linus Torvalds
  2005-12-17 17:22                                   ` Jesse Barnes
  2005-12-17 17:19                                 ` Jesse Barnes
  1 sibling, 1 reply; 239+ messages in thread
From: Linus Torvalds @ 2005-12-17  7:40 UTC (permalink / raw)
  To: David S. Miller
  Cc: jbarnes, dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch,
	matthew, linux-kernel, linux-arch

On Fri, 16 Dec 2005, David S. Miller wrote:
> 
> If there is some test guarding the CAS, yes.
> 
> But if there isn't, for things like atomic increment and
> decrement, where the CAS is unconditional, you'll always
> eat the two bus transactions without the prefetch for write.

Side note: there may be hardware cache protocol _scheduling_ reasons why 
some particular hw platform might prefer to go through the "Shared" state 
in their cache protocol.

For example, you might have hardware that otherwise ends up being very 
unfair, where the two-stage lock acquire might actually allow another node
to come in at all. Fairness and balance often come at a cost, both in hw
and in sw.

Arguably such hardware sounds pretty broken, but the point is that these 
things can certainly depend on the platform around the CPU as well as on 
what the CPU itself does.

I'm not saying that that is necessarily what Jesse was arguing about, but 
lock contention behaviour can be "interesting".

			Linus

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  7:40                                 ` Linus Torvalds
@ 2005-12-17 17:22                                   ` Jesse Barnes
  0 siblings, 0 replies; 239+ messages in thread
From: Jesse Barnes @ 2005-12-17 17:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David S. Miller, dhowells, nickpiggin, arjan, akpm, alan,
	cfriesen, hch, matthew, linux-kernel, linux-arch

On Friday, December 16, 2005 11:40 pm, Linus Torvalds wrote:
> Side note: there may be hardware cache protocol _scheduling_ reasons
> why some particular hw platform might prefer to go through the
> "Shared" state in their cache protocol.
>
> For example, you might have hardware that otherwise ends up being
> very unfair, where the two-stage lock acquire might actually allow
> another node to come in at all. Fairness and balance often come at a
> cost, both in hw and in sw.
>
> Arguably such hardware sounds pretty broken, but the point is that
> these things can certainly depend on the platform around the CPU as
> well as on what the CPU itself does.
>
> I'm not saying that that is necessarily what Jesse was arguing about,
> but lock contention behaviour can be "interesting".

Yeah, that's a good point.  Getting lock behavior 'just right' can get 
pretty platform specific.  For instance, on a CMP type machine, 
bouncing a lock between CPUs can be nearly free, whereas on a large 
directory based machine like the Altix, pretty much any cache line 
write (or fake write like ia64's ld.bias) is an expensive operation 
since it involves lots of relatively slow network activity.

Jesse

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17  7:10                               ` David S. Miller
  2005-12-17  7:40                                 ` Linus Torvalds
@ 2005-12-17 17:19                                 ` Jesse Barnes
  1 sibling, 0 replies; 239+ messages in thread
From: Jesse Barnes @ 2005-12-17 17:19 UTC (permalink / raw)
  To: David S. Miller
  Cc: torvalds, dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch,
	matthew, linux-kernel, linux-arch

On Friday, December 16, 2005 11:10 pm, David S. Miller wrote:
> From: Jesse Barnes <jbarnes@virtuousgeek.org>
> Date: Fri, 16 Dec 2005 16:41:49 -0800
>
> > Note that under contention prefetching with a write bias can cause
> > a lot more cache line bouncing than a regular load into shared
> > state (assuming you do a load and test before you try the CAS).
>
> If there is some test guarding the CAS, yes.

Yeah, I was only referring to that particular case (the ia64 code does 
test then CAS, so removing the write bias on the load avoided a lot of 
thrashing for locks under contention).

> But if there isn't, for things like atomic increment and
> decrement, where the CAS is unconditional, you'll always
> eat the two bus transactions without the prefetch for write.

Right, in that case, biasing for read might make sense, as long as some 
other CPU doesn't cause the line to go back to shared before you 
actually get to the CAS.

Jesse

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-16 22:53                           ` David S. Miller
  2005-12-17  0:41                             ` Jesse Barnes
@ 2005-12-17 22:38                             ` Richard Henderson
  2005-12-17 23:05                               ` David S. Miller
  1 sibling, 1 reply; 239+ messages in thread
From: Richard Henderson @ 2005-12-17 22:38 UTC (permalink / raw)
  To: David S. Miller
  Cc: torvalds, dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch,
	matthew, linux-kernel, linux-arch

On Fri, Dec 16, 2005 at 02:53:06PM -0800, David S. Miller wrote:
> I'll have to add "put write prefetch in CAS sequences" onto my sparc64
> TODO list :-)

You might consider just beginning your loops like

	mov	zero, old
	cas	[mem], zero, old

to do the initial read, since old will now contain the 
contents of the memory, and we haven't changed the memory.
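
In C11 terms the idea comes out roughly as follows. This is a sketch
with a made-up function name; the C compare-exchange writes the observed
value back into 'old' on failure, which matches what cas does with its
destination register.

```c
#include <stdatomic.h>

/* Hypothetical increment whose initial read is itself a CAS. */
int inc_cas_first(atomic_int *mem)
{
	int old = 0;

	/*
	 * cas [mem], zero, old: if *mem is 0 we store 0 back, otherwise
	 * nothing is stored.  Either way 'old' now holds the contents.
	 */
	atomic_compare_exchange_strong(mem, &old, 0);

	/* Normal retry loop from here on. */
	while (!atomic_compare_exchange_weak(mem, &old, old + 1))
		;
	return old + 1;
}
```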


r~

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-17 22:38                             ` Richard Henderson
@ 2005-12-17 23:05                               ` David S. Miller
  0 siblings, 0 replies; 239+ messages in thread
From: David S. Miller @ 2005-12-17 23:05 UTC (permalink / raw)
  To: rth
  Cc: torvalds, dhowells, nickpiggin, arjan, akpm, alan, cfriesen, hch,
	matthew, linux-kernel, linux-arch

From: Richard Henderson <rth@twiddle.net>
Date: Sat, 17 Dec 2005 14:38:24 -0800

> You might consider just beginning your loops like
> 
> 	mov	zero, old
> 	cas	[mem], zero, old
> 
> to do the initial read, since old will now contain the 
> contents of the memory, and we haven't changed the memory.

CAS is 32 cycles minimum on sparc64 even on a cache hit, so I think
the prefetch+load will be faster :-)  But it deserves checking out,
that's for sure.

Either way, that is a clever use of CAS :)

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:35               ` Andrew Morton
  2005-12-14 11:44                 ` Arjan van de Ven
  2005-12-14 11:57                 ` David Howells
@ 2005-12-14 12:17                 ` Christoph Hellwig
  2 siblings, 0 replies; 239+ messages in thread
From: Christoph Hellwig @ 2005-12-14 12:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, arjan, dhowells, cfriesen, torvalds, hch, matthew,
	linux-kernel, linux-arch

On Wed, Dec 14, 2005 at 03:35:36AM -0800, Andrew Morton wrote:
> 
> Could someone please remind me why we're even discussing this, given that
> mutex_down() is slightly more costly than current down(), and mutex_up() is
> appreciably more costly than current up()?

That's a good question.  The new mutex implementation here is a big regression
from what we have right now.  What I had in mind when brainstorming something
like this would be to have a slow-path pure C semaphore implementation that
is cross-platform, and keep the current semaphore code as mutex.  Once that
is done the mutex code could be optimized further because it doesn't need to
deal with the broader uses of the semaphore, and we could add lots of useful
debugging.

The current patchkit is far from that.

What might be more useful as a start is to implement a mutex type on top
of the current semaphore that has lots of additional checks for the DEBUG
build so we have nice diagnostics.  Once we have all users of mutex semantics
using that API we can change the underlying implementation to whatever we want.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-14 11:24             ` Alan Cox
  2005-12-14 11:35               ` Andrew Morton
@ 2005-12-14 11:42               ` Arjan van de Ven
  1 sibling, 0 replies; 239+ messages in thread
From: Arjan van de Ven @ 2005-12-14 11:42 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Howells, Christopher Friesen, torvalds, akpm, hch, matthew,
	linux-kernel, linux-arch

On Wed, 2005-12-14 at 11:24 +0000, Alan Cox wrote:
> On Mer, 2005-12-14 at 12:08 +0100, Arjan van de Ven wrote:
> > 1) the BKL change hasn't finished, and we're 5 years down the line. API
> > changes done gradually tend to take forever in practice, esp if there's no
> > "compile" incentive for people to fix things. 
> 
> This isn't a "fix" however, it's merely a performance tweak.

it's a conceptual API split that, if nothing else, declares intent and
usage pattern more specifically. Performance is just one of the angles.
Other angles are that it's possible to treat mutex users differently (like
Ingo is doing in -rt, where you can temporarily boost a mutex owner if the
mutex gets contended; other uses are better hold time metrics etc etc)

>  Drivers
> using the old API are not a problem because
> 
> a) The old API is needed long term for true counting sem users

this is skipping one bridge ;)
A counting semaphore is needed long term. API is up for debate in the
sense that it's not clear that a non-compile-time thing is the right
solution.

> That's rather different to the BKL

BKL is different in that it's more work to do a conversion (eg the BKL
semantics are rather complex compared to normal spinlock / semaphore /
mutex semantics). So yes BKL is harder, and not really possible to do in
one go. Unlike these...
For BKL there was no choice. Here there is.

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 15:39   ` David Howells
  2005-12-13 16:10     ` Alan Cox
@ 2005-12-14  8:31     ` Ingo Molnar
  1 sibling, 0 replies; 239+ messages in thread
From: Ingo Molnar @ 2005-12-14  8:31 UTC (permalink / raw)
  To: David Howells
  Cc: Christopher Friesen, Alan Cox, torvalds, akpm, hch, arjan,
	matthew, linux-kernel, linux-arch


* David Howells <dhowells@redhat.com> wrote:

>  (3) Some people want mutexes to be:
> 
>      (a) only releasable in the same context as they were taken
> 
>      (b) not accessible in interrupt context, or that (a) applies here also
> 
>      (c) not initialisable to the locked state
> 
>      But this means that the current usages all have to be carefully audited,
>      and sometimes that's unobvious.

(a) and (c) are not a big problem, as they are essentially the
constraints of -rt mutexes. As long as there's good debugging code, it's
very much doable. We don't want to change semantics _yet again_, later
down the line.

	Ingo

^ permalink raw reply	[flat|nested] 239+ messages in thread

* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-13 13:32 ` David Howells
                     ` (3 preceding siblings ...)
  2005-12-13 15:39   ` David Howells
@ 2005-12-13 20:04   ` Steven Rostedt
  4 siblings, 0 replies; 239+ messages in thread
From: Steven Rostedt @ 2005-12-13 20:04 UTC (permalink / raw)
  To: David Howells
  Cc: Ingo Molnar, linux-arch, linux-kernel, matthew, arjan, hch, akpm,
	torvalds, Alan Cox

On Tue, 2005-12-13 at 13:32 +0000, David Howells wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> 
> > >  (5) Redirects the following to apply to the new mutexes rather than the
> > >      traditional semaphores:
> > > 
> > > 	down()
> > ...
> > 
> > And you've audited every occurrence?
> 
> Outside of the arch directories, yes; but I don't know that I've made the
> correct decision in 100% of the cases.

I'm in the crowd that thinks that the mutex downs and ups should be
converted to mutex_lock/mutex_unlock.  Simply because that is basically
what a mutex is doing.  I'd rather not have another "historical" API in
the kernel.

> 
> I've changed some of the uses into completions, and found about a dozen or so
> uses of counting semaphores; but the vast majority of occurrences seem to be
> wanting mutex behaviour.

And we can take our time in looking at this on a case by case basis.

> 
> > It seems to me it would be far far saner to define something like
> > 
> > 	sleep_lock(&foo)
> > 	sleep_unlock(&foo)
> > 	sleep_trylock(&foo)
> 
> Which would be a _lot_ more work. It would involve about ten times as many
> changes, I think, and thus be more prone to errors.

I don't think this should be a one shot patch.  Your patch (and what you
would be responsible for) would just introduce the use of the mutex.
Let others go around and find the places where a semaphore is used where
a mutex should be.  Yes, there are a lot more mutexes than true
semaphores, and that is why we really should look at this on a case by
case basis.  One big global change is more likely to miss a case
that should be a semaphore.

> 
> > Its then obvious what it does, you don't randomly break other drivers you've
> > not reviewed and the interface is intuitive rather than obfuscated.
> 
> I've attempted to review everything in 2.6.15-rc5 outside of most of the archs.
> I can't easily modify any driver not contained in that tarball, but at least
> the compiler will barf and force a review.
> 
> > It won't take long for people to then change the name of the performance
> > critical cases and the others will catch up in time.
> 
> It took about ten hours to go through the declarations of struct semaphore and
> review them; I hate to think how long it'd take to go through all the ups and
> downs too.

That's why this should be a step by step integration.

> 
> > It also saves breaking every piece of out of tree kernel code for no
> > good reason.
> 
> But my patch means the changes required are in the most cases minimal: just
> changing struct semaphore to struct mutex is sufficient for the vast majority
> of cases.

But not every case.

> 
> Your way requires a lot more work, both in the tree and out of it.

Not really.  Over time this would be all cleaned up, but introducing a
new API should be the first step, then we can go to each and every spot
to find where a semaphore should be a mutex.  You'll get a lot more
people helping you in that method than you globally changing it, and
people only help when it breaks.

I'm sure I'm not the only one that would be happy to send patches in to
convert semaphores to mutexes where I find them.  But I'd be more
confused if something that used to work suddenly broke, and I then had
to discover that "Oh, this was a semaphore that mistakenly became a mutex!".

-- Steve


* Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
  2005-12-12 23:45 David Howells
                   ` (13 preceding siblings ...)
  2005-12-13 13:32 ` David Howells
@ 2005-12-13 21:03 ` David Howells
  14 siblings, 0 replies; 239+ messages in thread
From: David Howells @ 2005-12-13 21:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Howells, torvalds, akpm, hch, arjan, matthew, linux-kernel,
	linux-arch

Arnd Bergmann <arnd@arndb.de> wrote:

> I can't see how your code actually detects the over-upping, although it's 
> fairly obvious how it would be done. Did you miss one patch for this?

If owner is NULL, then you've probably upped twice.
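
As a rough illustration of that check (a minimal user-space sketch, not the
code from the patch; struct task, struct dbg_mutex and both functions are
hypothetical names invented here), the unlock path can treat a NULL owner
as a sign that the mutex was released twice:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's notion of a task. */
struct task {
	const char *name;
};

/* Hypothetical debug mutex: remembers who holds it while locked. */
struct dbg_mutex {
	int locked;
	struct task *owner;	/* NULL whenever the mutex is not held */
};

/* Non-sleeping acquire: returns 0 on success, -1 if already held. */
static int dbg_mutex_trylock(struct dbg_mutex *m, struct task *self)
{
	if (m->locked)
		return -1;
	m->locked = 1;
	m->owner = self;
	return 0;
}

/* Release: returns 0 on success, -1 if nobody owned the mutex. */
static int dbg_mutex_unlock(struct dbg_mutex *m)
{
	if (m->owner == NULL) {
		/* No recorded owner: most likely a second "up". */
		fprintf(stderr, "mutex %p: unlock without owner\n", (void *)m);
		return -1;
	}
	m->owner = NULL;
	m->locked = 0;
	return 0;
}
```

A real implementation would also need atomic state updates and a wait queue;
the sketch only shows how clearing the owner on unlock makes a second unlock
detectable.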

David


end of thread, other threads:[~2005-12-22 12:48 UTC | newest]

Thread overview: 239+ messages (download: mbox.gz / follow: Atom feed)
2005-12-15 17:45 [PATCH 1/19] MUTEX: Introduce simple mutex implementation Luck, Tony
2005-12-15 17:45 ` Luck, Tony
2005-12-15 18:00 ` David Howells
2005-12-15 18:48 ` James Bottomley
2005-12-15 20:38 ` Jeff Dike
2005-12-15 23:45   ` Stephen Rothwell
  -- strict thread matches above, loose matches on Subject: below --
2005-12-16 12:49 linux
2005-12-16 15:24 ` David Howells
2005-12-16 18:03   ` linux
2005-12-15 13:58 linux
2005-12-15 16:15 ` Linus Torvalds
2005-12-15 16:52   ` Erik Mouw
2005-12-15 17:23     ` Dick Streefland
2005-12-16 12:17     ` Erik Mouw
2005-12-17 10:59       ` Sander
2005-12-17 14:14         ` Douglas McNaught
2005-12-17 15:09           ` Sander
2005-12-19 10:44         ` Erik Mouw
2005-12-15 19:02   ` Nikita Danilov
2005-12-15 19:09   ` linux
2005-12-15 19:52     ` Linus Torvalds
2005-12-16  1:33       ` linux
2005-12-15 21:18     ` Steven Rostedt
2005-12-15 20:52   ` Steven Rostedt
2005-12-12 23:45 David Howells
2005-12-13  0:13 ` Nick Piggin
2005-12-13  0:19 ` Nick Piggin
2005-12-13  0:19 ` Andrew Morton
2005-12-13  7:54   ` Ingo Molnar
2005-12-13  7:58     ` Andi Kleen
2005-12-13  8:42       ` Andrew Morton
2005-12-13  8:49         ` Andi Kleen
2005-12-13  9:01           ` Andrew Morton
2005-12-13  9:01             ` Andrew Morton
2005-12-13  9:02             ` Andrew Morton
2005-12-13 10:07               ` Jakub Jelinek
2005-12-13 10:11                 ` Andi Kleen
2005-12-13 10:15                   ` Jakub Jelinek
2005-12-13 10:25                   ` Andrew Morton
2005-12-13 10:25                     ` Andrew Morton
2005-12-14 10:46               ` Russell King
2005-12-13  9:05             ` Andi Kleen
2005-12-13  9:15               ` Andrew Morton
2005-12-13  9:15                 ` Andrew Morton
2005-12-13  9:24                 ` Andi Kleen
2005-12-13  9:44                   ` Andrew Morton
2005-12-13  9:44                     ` Andrew Morton
2005-12-13  9:49                     ` Andi Kleen
2005-12-13 10:28                   ` Andreas Schwab
2005-12-13 10:30                     ` Andi Kleen
2005-12-13 12:33                   ` Matthew Wilcox
2005-12-13 22:18               ` Adrian Bunk
2005-12-13 22:25                 ` Andi Kleen
2005-12-13 22:32                   ` Adrian Bunk
2005-12-13  9:11             ` Ingo Molnar
2005-12-13  9:04           ` Christoph Hellwig
2005-12-13  9:13             ` Ingo Molnar
2005-12-13 10:11             ` Jakub Jelinek
2005-12-13 10:19               ` Christoph Hellwig
2005-12-13 10:27                 ` Ingo Molnar
2005-12-15  4:53                 ` Miles Bader
2005-12-15  5:05                   ` Nick Piggin
2005-12-13  9:09           ` Ingo Molnar
2005-12-13  9:21             ` Andi Kleen
2005-12-13 16:16           ` Linus Torvalds
2005-12-13  9:03         ` Christoph Hellwig
2005-12-13  9:14           ` Andrew Morton
2005-12-13  9:14             ` Andrew Morton
2005-12-13  9:21             ` Christoph Hellwig
2005-12-13  8:00     ` Arjan van de Ven
2005-12-13  9:03       ` Ingo Molnar
2005-12-13  9:09         ` Andi Kleen
2005-12-13  9:34           ` Ingo Molnar
2005-12-13 14:33             ` Mark Lord
2005-12-13 14:45               ` Arjan van de Ven
2005-12-13  9:37           ` Ingo Molnar
2005-12-13  9:19         ` Arjan van de Ven
2005-12-13  9:02     ` Christoph Hellwig
2005-12-13  9:39       ` Ingo Molnar
2005-12-13 10:00         ` Ingo Molnar
2005-12-13 17:40           ` Paul Jackson
2005-12-13 18:34           ` David Howells
2005-12-13 22:31             ` Paul Jackson
2005-12-13 22:31               ` Paul Jackson
2005-12-14 11:02             ` David Howells
2005-12-14 11:12             ` David Howells
2005-12-14 11:18               ` Alan Cox
2005-12-14 12:35               ` David Howells
2005-12-14 12:35                 ` David Howells
2005-12-14 13:58                 ` Thomas Gleixner
2005-12-14 23:40                   ` Mark Lord
2005-12-14 23:54                     ` Andrew Morton
2005-12-15 13:41                       ` Nikita Danilov
2005-12-15 14:56                         ` Alan Cox
2005-12-15 15:52                           ` Nikita Danilov
2005-12-15 16:50                             ` Christopher Friesen
2005-12-15 20:53                               ` Steven Rostedt
2005-12-15 15:55                           ` David Howells
2005-12-15 16:22                             ` linux-os (Dick Johnson)
2005-12-15 16:22                               ` linux-os (Dick Johnson)
2005-12-15 16:28                             ` Linus Torvalds
2005-12-15 17:04                               ` Thomas Gleixner
2005-12-15 17:09                               ` Paul Jackson
2005-12-15 17:17                               ` David Howells
2005-12-15 16:51                             ` David Howells
2005-12-15 16:56                             ` Paul Jackson
2005-12-15 16:56                               ` Paul Jackson
2005-12-15 17:28                             ` David Howells
2005-12-15 17:48                               ` Linus Torvalds
2005-12-15 18:20                                 ` Nikita Danilov
2005-12-15 20:58                                   ` Steven Rostedt
2005-12-15 19:21                                 ` Andrew Morton
2005-12-15 19:38                                   ` Linus Torvalds
2005-12-15 20:28                                   ` Steven Rostedt
2005-12-15 20:32                                     ` Geert Uytterhoeven
2005-12-16 21:41                                       ` Thomas Gleixner
2005-12-16 21:41                                         ` Linus Torvalds
2005-12-16 22:06                                           ` Thomas Gleixner
2005-12-16 22:19                                             ` Linus Torvalds
2005-12-16 22:32                                               ` Steven Rostedt
2005-12-16 22:42                                               ` Thomas Gleixner
2005-12-16 22:41                                                 ` Linus Torvalds
2005-12-16 22:49                                                   ` Steven Rostedt
2005-12-16 23:29                                                   ` Thomas Gleixner
2005-12-17  0:29                                                   ` Joe Korty
2005-12-17  1:00                                                     ` Linus Torvalds
2005-12-17  3:13                                                       ` Steven Rostedt
2005-12-17  7:34                                                         ` Linus Torvalds
2005-12-17 23:43                                                           ` Matthew Wilcox
2005-12-18  0:05                                                             ` Lee Revell
2005-12-18  0:21                                                               ` Matthew Wilcox
2005-12-18  1:25                                                                 ` Lee Revell
2005-12-22 12:27                                                             ` Bill Huey
2005-12-19 16:08                                                           ` Ingo Molnar
2005-12-22 12:40                                                           ` Bill Huey
2005-12-22 12:45                                                             ` Bill Huey
2005-12-19 23:46                                                       ` Keith Owens
2005-12-15 14:41                       ` Steven Rostedt
2005-12-14 23:57                     ` Thomas Gleixner
2005-12-14 23:57                       ` Mark Lord
2005-12-15  0:10                         ` Thomas Gleixner
2005-12-15  2:46                           ` Linus Torvalds
2005-12-15 15:53                           ` David Howells
2005-12-15 15:37                     ` David Howells
2005-12-15 19:28                       ` Andrew Morton
2005-12-15 19:28                         ` Andrew Morton
2005-12-15 20:18                         ` Andrew Morton
2005-12-15 21:28                           ` Steven Rostedt
2005-12-16 22:02                           ` Thomas Gleixner
2005-12-16 10:45                         ` David Howells
2005-12-13  9:55     ` Ingo Molnar
2005-12-13  0:30 ` Arnd Bergmann
2005-12-13  0:57 ` Daniel Walker
2005-12-13  3:23   ` Steven Rostedt
2005-12-13  2:57 ` Mark Lord
2005-12-13  3:17   ` Steven Rostedt
2005-12-13  9:06   ` Christoph Hellwig
2005-12-13  9:54 ` David Howells
2005-12-13 10:13   ` Ingo Molnar
2005-12-13 10:34     ` Ingo Molnar
2005-12-13 10:37       ` Ingo Molnar
2005-12-13 12:47       ` Oliver Neukum
2005-12-13 13:09         ` Alan Cox
2005-12-13 13:13           ` Matthew Wilcox
2005-12-13 14:04             ` Alan Cox
2005-12-13 13:24           ` Oliver Neukum
2005-12-14  1:00   ` Nick Piggin
2005-12-14 10:54   ` David Howells
2005-12-14 11:17     ` Nick Piggin
2005-12-14 11:46     ` David Howells
2005-12-14 21:23       ` Nick Piggin
2005-12-16 12:00       ` David Howells
2005-12-16 13:16         ` Nick Piggin
2005-12-16 15:53         ` David Howells
2005-12-16 23:41           ` Nick Piggin
2005-12-16 16:02         ` David Howells
2005-12-13 10:48 ` David Howells
2005-12-13 12:39   ` Matthew Wilcox
2005-12-13 10:54 ` Ingo Molnar
2005-12-13 11:23 ` David Howells
2005-12-13 11:24 ` David Howells
2005-12-13 13:45   ` Ingo Molnar
2005-12-13 11:34 ` David Howells
2005-12-13 13:05 ` Alan Cox
2005-12-13 13:15   ` Alan Cox
2005-12-13 23:21     ` Nikita Danilov
2005-12-13 13:32 ` David Howells
2005-12-13 14:00   ` Alan Cox
2005-12-13 14:35   ` Christopher Friesen
2005-12-13 14:44     ` Arjan van de Ven
2005-12-13 14:59       ` Christopher Friesen
2005-12-13 15:23   ` David Howells
2005-12-15  5:24     ` Miles Bader
2005-12-13 15:39   ` David Howells
2005-12-13 16:10     ` Alan Cox
2005-12-14 10:29       ` Arjan van de Ven
2005-12-14 11:03         ` Arjan van de Ven
2005-12-14 11:03         ` Alan Cox
2005-12-14 11:08           ` Arjan van de Ven
2005-12-14 11:24             ` Alan Cox
2005-12-14 11:35               ` Andrew Morton
2005-12-14 11:44                 ` Arjan van de Ven
2005-12-14 11:52                   ` Andi Kleen
2005-12-14 11:55                     ` Arjan van de Ven
2005-12-14 11:57                 ` David Howells
2005-12-14 12:19                   ` Jakub Jelinek
2005-12-16  1:54                   ` Nick Piggin
2005-12-16 11:02                   ` David Howells
2005-12-16 13:01                     ` Nick Piggin
2005-12-16 13:21                       ` Russell King
2005-12-16 13:41                         ` Nick Piggin
2005-12-16 13:46                         ` Linh Dang
2005-12-16 14:31                           ` Russell King
2005-12-16 15:24                             ` Linh Dang
2005-12-16 15:35                               ` Nick Piggin
2005-12-16 15:40                               ` Kyle Moffett
2005-12-16 15:49                             ` Linh Dang
2005-12-16 15:46                           ` David Howells
2005-12-16 15:58                             ` Russell King
2005-12-17 15:57                       ` Nikita Danilov
2005-12-16 16:28                     ` Linus Torvalds
2005-12-16 11:30                   ` David Howells
2005-12-16 16:33                     ` Linus Torvalds
2005-12-16 16:33                       ` Linus Torvalds
2005-12-16 22:23                       ` David S. Miller
2005-12-16 22:38                         ` Linus Torvalds
2005-12-16 22:53                           ` David S. Miller
2005-12-17  0:41                             ` Jesse Barnes
2005-12-17  7:10                               ` David S. Miller
2005-12-17  7:40                                 ` Linus Torvalds
2005-12-17 17:22                                   ` Jesse Barnes
2005-12-17 17:19                                 ` Jesse Barnes
2005-12-17 22:38                             ` Richard Henderson
2005-12-17 23:05                               ` David S. Miller
2005-12-14 12:17                 ` Christoph Hellwig
2005-12-14 11:42               ` Arjan van de Ven
2005-12-14  8:31     ` Ingo Molnar
2005-12-13 20:04   ` Steven Rostedt
2005-12-13 21:03 ` David Howells