All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: linux-ia64@vger.kernel.org
Subject: Re: [git pull] ia64 changes
Date: Sun, 27 Sep 2009 00:00:52 +0000	[thread overview]
Message-ID: <alpine.LFD.2.01.0909261636570.3303@localhost.localdomain> (raw)
In-Reply-To: <1FE6DD409037234FAB833C420AA843EC0122AEB1@orsmsx424.amr.corp.intel.com>



On Sat, 26 Sep 2009, Matthew Wilcox wrote:

> On Sat, Sep 26, 2009 at 01:39:12PM -0700, Linus Torvalds wrote:
> > Ok, so x86 has obviously done ticket locks for a while now, and I think 
> > it's good that ia64 does too, but I wonder why you made the ticket lock be 
> > a full 64-bit thing?
> 
> This got brought up earlier, and there isn't a fetchadd4, only a fetchadd8.

Hmm. There's a fetchadd4, but it has horrible semantics (very limited 
constants).

You'd want a fetchadd4 that updates the high bits, and have the atomic 
counter in the high 16 bits, and the "current owner" in the low bits. And 
fetchadd4 can't do that, due to its idiotic "only does a few constants" 
thing.

BTW, I also think that the memory ordering is not sufficient in the 
current ia64 ticket locks. It does just a simple

        do {
                cpu_relax();
        } while (ACCESS_ONCE(*p) != turn);

to wait to get the spinlock, and even if the 'fetchadd' it did earlier was 
done with 'acq', the reads in ACCESS_ONCE() are just normal reads, so the 
code that is protected by that spinlock could easily have other reads or 
writes that escape to _before_ the ACCESS_ONCE().

Of course, with an in-order implementation, I guess it will never happen 
in practice, so what do I know. But I do suspect that at least in theory 
that thing is literally buggy as-is.

However, I will ignore that issue, and look at the "4 bytes versus 8 
bytes for the lock" thing. And I think fetchadd4 is still usable, despite 
it's limitations. We make the rule be:

 - the low 15 bits are the "I want to own counter"

   This allows 32768 different CPU's.

 - the high 15 bits are the "this is the current owner" counter

 - the 2 bits in the middle are "don't care" bits, and in particular, we 
   will do non-atomic stores to the upper 16 bits that always clear bit#16 
   in the 32-bit word. Overflow etc may still set it, but we don't care, 
   because the overflow count is still limited to 32767, which means that 
   it's enough that we clear bit #16 more often than that, and we know 
   that the fetchadd will never overflow into the high 15 bits.

and now you have:

 - spinlock:

	fetchadd4.acq r1=[p],1
	target = (r1 & 32767);
	if (target = (r1>>17))
		return; /* The 'acq' was enough */

	while ((ACCESS_ONCE(*p) >> 17) != target)
		cpu_relax();
	/* We need an smp_mb() because the 'acq' was long ago */
	smb_mb();

 - spinunlock

	/*
	 * High bits of the lock - we don't care about atomicity
	 * here, the 'fetchadd4' can't change the bits we care
	 * about.
	 */
	unsigned short *mem = 1+(unsigned short)lock;

	/*
	 * Make sure to clear bit #16 (the ~1 part) when we
	 * store the new value, add add one to bits 17..31 in
	 * the full 4-byte word (it's "*mem+2") to set the
	 * new ticket owner.
	 */
	unsigned short value = (*mem + 2) & ~1;

	/*
	 * Do the final unlock as a 16-bit store with release
	 * semantics
	 */
	st2.rel.nta value,[mem]

and that should work.

Of course, there may be some reason I don't know about why this is a 
violation of the Itanium memory ordering semantics (accessing that thing 
with two different memory access sizes, or maybe 'st2' doesn't have a 
'rel' thing that works or whatever).

Anyway, somebody who knows the Itanium memory ordering should really look 
at this, and even if you don't want to try to do a 4-byte spinlock, at 
least somebody should check the memory ordering with that whole read-loop 
for waiting. I really do believe that you need a memory barrier to make 
sure that the new ticket spinlocks actually work, because the 'fetchadd4' 
you use now doesn't seem to do that whole 'acquire' ordering, and even if 
it does, I don't think it helps for the case where you then have to do 
loads and wait for the value to change.

Hmm?

		Linus

  parent reply	other threads:[~2009-09-27  0:00 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-18 16:34 [git pull] ia64 changes Luck, Tony
2008-04-22 23:36 ` Luck, Tony
2008-04-29 23:36 ` Luck, Tony
2008-05-01 22:57 ` Luck, Tony
2008-05-15 20:46 ` Luck, Tony
2008-05-28 19:52 ` Luck, Tony
2008-06-16 17:28 ` Luck, Tony
2008-06-24 21:49 ` Luck, Tony
2008-06-30 23:52 ` Luck, Tony
2008-07-17 20:31 ` Luck, Tony
2008-07-25 20:30 ` Luck, Tony
2008-08-06 18:32 ` Luck, Tony
2008-08-12 21:55 ` Luck, Tony
2008-08-18 23:46 ` Luck, Tony
2008-08-26 16:59 ` Luck, Tony
2008-09-10 21:17 ` Luck, Tony
2008-09-23 16:59 ` Luck, Tony
2008-09-30 16:18 ` Luck, Tony
2008-10-21 18:01 ` Luck, Tony
2008-11-05  0:43 ` Luck, Tony
2008-11-07 18:00 ` Luck, Tony
2008-11-20 22:47 ` Luck, Tony
2008-12-09 22:28 ` Luck, Tony
2009-01-15 19:45 ` Luck, Tony
2009-02-19 21:02 ` Luck, Tony
2009-02-25 22:44 ` Luck, Tony
2009-03-06 22:17 ` Luck, Tony
2009-03-27 17:46 ` Luck, Tony
2009-04-01 20:20 ` Luck, Tony
2009-04-08 22:33 ` Luck, Tony
2009-04-20 17:32 ` Luck, Tony
2009-05-05 22:42 ` Luck, Tony
2009-06-15 17:20 ` Luck, Tony
2009-07-17 18:11 ` Fenghua Yu
2009-08-11 23:40 ` Fenghua Yu
2009-09-02 17:28 ` Luck, Tony
2009-09-17 17:11 ` Luck, Tony
2009-09-26 19:57 ` Luck, Tony
2009-09-26 20:39 ` Linus Torvalds
2009-09-26 23:16 ` Matthew Wilcox
2009-09-27  0:00 ` Linus Torvalds [this message]
2009-09-27  0:08 ` Linus Torvalds
2009-09-27  4:54 ` Luck, Tony
2009-09-27  5:18 ` Linus Torvalds
2009-09-27  5:20 ` Luck, Tony
2009-09-28 19:02 ` Boehm, Hans
2009-09-28 22:35 ` Luck, Tony
2009-09-28 22:54 ` Linus Torvalds
2009-09-28 23:01 ` Linus Torvalds
2009-09-28 23:01 ` Luck, Tony
2009-09-29  0:03 ` Rick Jones
2009-09-29  0:14 ` Linus Torvalds
2009-09-29 17:53 ` Luck, Tony
2009-09-29 18:07 ` Linus Torvalds
2009-09-30  0:54 ` Robin Holt
2009-09-30  1:24 ` Linus Torvalds
2009-09-30  1:30 ` Linus Torvalds
2009-09-30  2:46 ` Robin Holt
2009-09-30  2:56 ` Linus Torvalds
2009-09-30 18:00 ` Rick Jones
2009-11-02 18:07 ` Luck, Tony
2009-12-15 22:33 ` Luck, Tony
2010-01-08 17:20 ` Luck, Tony
2010-01-08 17:35 ` Linus Torvalds
2010-01-08 17:49 ` Luck, Tony
2010-01-08 19:24 ` Luck, Tony
2010-02-16 19:43 ` Luck, Tony
2010-02-24 17:32 ` Luck, Tony
2010-03-01 18:11 ` Luck, Tony
2010-05-18 17:11 ` Luck, Tony
2010-05-18 20:43 ` Robin Holt
2010-05-18 22:04 ` Luck, Tony
2010-07-01 15:22 ` Luck, Tony
2010-08-04 16:09 ` Luck, Tony
2010-08-14  4:04 ` Luck, Tony
2010-08-18 20:06 ` Luck, Tony
2010-09-13 18:33 ` Luck, Tony
2010-09-14  6:06 ` Petr Tesarik
2010-09-14  7:02 ` Petr Tesarik
2010-09-14 17:37 ` Luck, Tony
2010-09-14 20:38 ` Petr Tesarik
2010-09-14 20:57 ` Tony Luck
2010-09-16 15:57 ` Luck, Tony
2010-10-21 16:00 ` Luck, Tony
2010-10-21 21:33 ` Linus Torvalds
2011-01-10 18:01 ` Luck, Tony
2011-01-13 18:08 ` Luck, Tony
2011-01-13 23:10 ` Luck, Tony
2011-01-14 19:33 ` Luck, Tony
2011-03-30 19:09 ` Luck, Tony
2011-05-31 20:20 ` Luck, Tony
2011-08-01 16:57 ` Luck, Tony
2011-11-02 21:06 ` Luck, Tony
2012-01-05 17:42 ` Luck, Tony
2012-03-23 17:23 ` [GIT PULL] " Luck, Tony
2008-04-24 23:51 [git pull] " Luck, Tony
2008-04-24 23:51 ` Luck, Tony
2008-04-25  0:16 ` Adrian Bunk
2008-04-25  0:16   ` Adrian Bunk
2008-04-25  0:25   ` Luck, Tony
2008-04-25  0:25     ` Luck, Tony
2008-04-25  0:41     ` [2.6 patch] ia64: let NUMA select SMP Adrian Bunk
2008-04-25  0:41       ` Adrian Bunk
2008-04-25  1:01       ` Luck, Tony
2008-04-25  1:01         ` Luck, Tony
2008-04-25 12:50       ` Mike Travis
2008-04-25 12:50         ` Mike Travis
2010-07-21 16:40 [git pull] ia64 changes Luck, Tony
2011-03-24 16:29 Luck, Tony
2011-03-24 16:29 ` Luck, Tony

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.01.0909261636570.3303@localhost.localdomain \
    --to=torvalds@linux-foundation.org \
    --cc=linux-ia64@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.