From: Boqun Feng <boqun.feng@gmail.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linuxppc-dev@lists.ozlabs.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Will Deacon <will.deacon@arm.com>
Subject: Re: [PATCH v3] powerpc: spinlock: Fix spin_unlock_wait()
Date: Fri, 10 Jun 2016 01:25:03 +0800
Message-ID: <20160609172503.GB26274@insomnia>
In-Reply-To: <1465475008.16363.1.camel@ellerman.id.au>


On Thu, Jun 09, 2016 at 10:23:28PM +1000, Michael Ellerman wrote:
> On Wed, 2016-06-08 at 15:59 +0200, Peter Zijlstra wrote:
> > On Wed, Jun 08, 2016 at 11:49:20PM +1000, Michael Ellerman wrote:
> >
> > > > Ok; what tree does this go in? I have this dependent series which I'd
> > > > like to get sorted and merged somewhere.
> > > 
> > > Ah sorry, I didn't realise. I was going to put it in my next (which doesn't
> > > exist yet but hopefully will early next week).
> > > 
> > > I'll make a topic branch with just that commit based on rc2 or rc3?
> > 
> > Works for me; thanks!
>  
> Unfortunately the patch isn't 100%.
> 
> It's causing some of my machines to lock up hard, which isn't surprising when
> you look at the generated code for the non-atomic spin loop:
> 
>   c00000000009af48:	7c 21 0b 78 	mr      r1,r1					# HMT_LOW
>   c00000000009af4c:	40 9e ff fc 	bne     cr7,c00000000009af48 <.do_exit+0x6d8>
> 

There isn't even any code checking SHARED_PROCESSOR here, so I assume
your config is !PPC_SPLPAR.

> Which is a spin loop waiting for a result in cr7, but with no comparison.
> 
> The problem seems to be that we did:
> 
> @@ -184,7 +184,7 @@ static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
>  	if (arch_spin_value_unlocked(lock_val))
>  		goto out;
>  
> -	while (lock->slock) {
> +	while (!arch_spin_value_unlocked(*lock)) {
>  		HMT_low();
>  		if (SHARED_PROCESSOR)
>  			__spin_yield(lock);
> 

And since I also did a consolidation in this patch, both configs now
share the same arch_spin_unlock_wait(), so with !PPC_SPLPAR the previous
loop became:

	while (!arch_spin_value_unlocked(*lock)) {
		HMT_low();
	}

HMT_low() is not a compiler barrier, so the compiler may read the lock
once before the loop and never reload it inside the loop.
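
As a rough userspace illustration of that transformation (a sketch only,
not kernel code): toy_lock, toy_value_unlocked() and the max_spins bound
are hypothetical names added so the example terminates, and the lock
word is deliberately a plain, non-volatile member. From a single
thread's point of view the two functions are indistinguishable, which is
why the optimizer is allowed to turn the first into the second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for arch_spinlock_t; the member is NOT volatile. */
struct toy_lock {
	unsigned int slock;
};

/* Takes the lock by value, in the style of arch_spin_value_unlocked(). */
static inline bool toy_value_unlocked(struct toy_lock val)
{
	return val.slock == 0;
}

/* The loop as written: nothing tells the compiler that *lock can change. */
static long wait_as_written(const struct toy_lock *lock, long max_spins)
{
	long spins = 0;

	assert(lock != NULL);
	while (!toy_value_unlocked(*lock)) {
		if (++spins >= max_spins)
			return -1;	/* gave up; bound exists only for the demo */
	}
	return spins;
}

/* What the optimizer may make of it: one load, then spin on the cached result. */
static long wait_as_hoisted(const struct toy_lock *lock, long max_spins)
{
	long spins = 0;
	bool unlocked;

	assert(lock != NULL);
	unlocked = toy_value_unlocked(*lock);	/* the only read of *lock */
	while (!unlocked) {
		if (++spins >= max_spins)
			return -1;
	}
	return spins;
}
```

Without the artificial bound, the second form spins forever once it has
seen the lock held, no matter what another CPU later stores to it.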

> Which seems to be hiding the fact that lock->slock is volatile from the
> compiler, even though arch_spin_value_unlocked() is inline. Not sure if that's
> our bug or gcc's.
> 

I think the volatile qualifier is lost because
arch_spin_value_unlocked() takes the value of the lock rather than the
address of the lock as its parameter: *lock is copied first and the
function only sees that copy, which makes it a pure function.

To fix this we can add READ_ONCE() around the read of the lock value,
like this:

	while (!arch_spin_value_unlocked(READ_ONCE(*lock))) {
		HMT_low();
		...

Or would you prefer to simply use lock->slock, which is already
volatile?

Or maybe we can refactor the code a little like this:

static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
{
       arch_spinlock_t lock_val;

       smp_mb();

       /*
        * Atomically load and store back the lock value (unchanged).  This
        * ensures that our observation of the lock value is ordered with
        * respect to other lock operations.
        */
       __asm__ __volatile__(
"1:    " PPC_LWARX(%0, 0, %2, 0) "\n"
"      stwcx. %0, 0, %2\n"
"      bne- 1b\n"
       : "=&r" (lock_val), "+m" (*lock)
       : "r" (lock)
       : "cr0", "xer");

       while (!arch_spin_value_unlocked(lock_val)) {
               HMT_low();
               if (SHARED_PROCESSOR)
                       __spin_yield(lock);

               lock_val = READ_ONCE(*lock);
       }
       HMT_medium();

       smp_mb();
}
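
The shape of that loop (load once, test the local copy, reload with a
forced read on every iteration) can be tried outside the kernel. The
following is only a userspace sketch under stated assumptions:
demo_lock, demo_read_once() and the max_spins bound are hypothetical, a
volatile-qualified load stands in for both the lwarx/stwcx. pair and
READ_ONCE(), and atomic_thread_fence() stands in for smp_mb(). It does
not reproduce the ordering guarantees of the real powerpc code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for arch_spinlock_t. */
struct demo_lock {
	unsigned int slock;
};

/* By-value check, in the style of arch_spin_value_unlocked(). */
static inline bool demo_value_unlocked(struct demo_lock val)
{
	return val.slock == 0;
}

/* Approximation of READ_ONCE(*lock): a volatile load of the lock word. */
static inline struct demo_lock demo_read_once(const struct demo_lock *lock)
{
	struct demo_lock val;

	val.slock = *(const volatile unsigned int *)&lock->slock;
	return val;
}

/*
 * Returns the number of spins before the lock was seen unlocked,
 * or -1 after max_spins attempts (bound added only for the demo).
 */
static long demo_unlock_wait(const struct demo_lock *lock, long max_spins)
{
	struct demo_lock lock_val;
	long spins = 0;

	assert(lock != NULL);
	atomic_thread_fence(memory_order_seq_cst);	/* in place of smp_mb() */

	lock_val = demo_read_once(lock);		/* initial observation */
	while (!demo_value_unlocked(lock_val)) {
		if (++spins >= max_spins)
			return -1;
		lock_val = demo_read_once(lock);	/* forced reload each pass */
	}

	atomic_thread_fence(memory_order_seq_cst);
	return spins;
}
```

Because every iteration goes through the volatile load, the compiler
cannot cache the first observation the way it can with a plain by-value
copy of *lock.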

> Will sleep on it.
> 

Bed time for me too. I will run more tests on the three proposals above
tomorrow and see how they do.

Regards,
Boqun

> cheers
> 

