From: Peter Zijlstra <peterz@infradead.org>
To: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"will.deacon@arm.com" <will.deacon@arm.com>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"linux-snps-arc@lists.infradead.org" 
	<linux-snps-arc@lists.infradead.org>,
	Alexey Brodkin <Alexey.Brodkin@synopsys.com>,
	"yamada.masahiro@socionext.com" <yamada.masahiro@socionext.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: __clear_bit_lock to use atomic clear_bit (was Re: Patch "asm-generic/bitops/lock.h)
Date: Fri, 31 Aug 2018 09:24:44 +0200	[thread overview]
Message-ID: <20180831072444.GD24124@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <C2D7FE5348E1B147BCA15975FBA23075012B090FE0@us01wembx1.internal.synopsys.com>

On Fri, Aug 31, 2018 at 12:29:27AM +0000, Vineet Gupta wrote:
> On 08/30/2018 02:44 AM, Peter Zijlstra wrote:
> >> Back in 2016, Peter had fixed this file due to a problem I reported on ARC. See
> >> commit f75d48644c56a ("bitops: Do not default to __clear_bit() for
> >> __clear_bit_unlock()")
> >> That made __clear_bit_unlock() use the atomic clear_bit() vs. non-atomic
> >> __clear_bit(), effectively making clear_bit_unlock() and __clear_bit_unlock() the same.
> >>
> >> This patch undoes that which could explain the issues you see. @Peter, @Will ?
> > Right, so the thinking is that on platforms that suffer that issue,
> > atomic_set*() should DTRT. And if you look at your spinlock based atomic
> > implementation, you'll note that atomic_set() does indeed do the right
> > thing.
> >
> > arch/arc/include/asm/atomic.h:108
> 
> For !LLSC atomics, ARC has always had atomic_set() DTRT, even in the 2016-era
> git revision. The problem was not in the atomics, but in the asymmetric way
> the slub bit lock etc. worked (I haven't checked whether this has changed), i.e.
> 
>      slab_lock() -> bit_spin_lock() -> test_and_set_bit()    # atomic
>      slab_unlock() -> __bit_spin_unlock() -> __clear_bit()    # non-atomic
> 
> And with v4.19-rc1, we have essentially reverted f75d48644c56a due to 84c6591103db
> ("locking/atomics, asm-generic/bitops/lock.h: Rewrite using atomic_fetch_*()")
> 
> So what we have with 4.19-rc1 is
> 
>    static inline void __clear_bit_unlock(unsigned int nr, volatile unsigned long *p)
>    {
>      unsigned long old;
>      p += ((nr) / 32);
>      old = // some typecheck magic on *p
>      old &= ~(1UL << ((nr) % 32));
>      atomic_long_set_release((atomic_long_t *)p, old);
>    }
> 
> So @p is being r-m-w'ed non-atomically. The lock variant uses an atomic op...
> 
>    int test_and_set_bit_lock(unsigned int nr, volatile unsigned long *p)
>    { 
>       ...
>       old = atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p);
>       ....
>    }
> 
> Now I don't know why we don't see the issue with LLSC atomics; perhaps the
> race window shrinks because the code itself is less verbose, etc.
> 
> Am I missing something still ?

Yes :-) So there are 2 things to consider:

 1) this whole test_and_set_bit() + __clear_bit() combo only works if we
    have the guarantee that no other bit will change while we have our
    'lock' bit set.

    This means that @old is invariant.

 2) atomic ops and stores work as 'expected' -- which is true for all
    hardware LL/SC or CAS implementations, but not for spinlock based
    atomics.
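
Concretely, for 2), the spinlock-based case can interleave like this
(a schematic only; the CPU roles and names are illustrative):

	CPU0: test_and_set_bit()           CPU1: __bit_spin_unlock()
	  spin_lock(hash(p));
	  old = *p;                          /* lock bit still set */
	                                     *p &= ~LOCK_BIT; /* plain store */
	  *p = old | LOCK_BIT;               /* rewrites the set bit; CPU1's
	  spin_unlock(hash(p));                 clear is lost and the bit
	                                        never becomes 0 again */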

The bug addressed in f75d48644c56a was the atomic test_and_set losing
the __clear_bit() store.

With LL/SC this cannot happen: the competing store (__clear_bit) causes
the SC to fail, we retry, and the second LL observes the new value.
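
The same retry logic, modeled as a C11 CAS loop (a sketch only; LL/SC
is the actual hardware primitive, but the fail-and-retry behaviour is
the relevant part):

	#include <stdatomic.h>

	static unsigned long fetch_or_retry(_Atomic unsigned long *p,
					    unsigned long mask)
	{
		unsigned long old = atomic_load(p);

		/*
		 * A competing store between the load and the CAS makes
		 * the CAS fail and refreshes @old; the retry then works
		 * on the new value, so the competing store is never
		 * overwritten.
		 */
		while (!atomic_compare_exchange_weak(p, &old, old | mask))
			;

		return old;
	}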

So the main point is that test_and_set must not lose a store. Using
atomic_fetch_or() together with atomic_set() ensures this.
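
A minimal user-space model of such spinlock-based atomics (a pthread
mutex stands in for the arch's hashed spinlock; all names here are
illustrative, not the kernel's):

	#include <pthread.h>

	static pthread_mutex_t atomic_lock = PTHREAD_MUTEX_INITIALIZER;

	static unsigned long model_fetch_or(unsigned long *p, unsigned long mask)
	{
		unsigned long old;

		pthread_mutex_lock(&atomic_lock);
		old = *p;
		*p = old | mask;
		pthread_mutex_unlock(&atomic_lock);
		return old;
	}

	/*
	 * atomic_set() takes the same lock, so its store cannot land
	 * between the load and the store of model_fetch_or() and get
	 * overwritten.
	 */
	static void model_atomic_set(unsigned long *p, unsigned long v)
	{
		pthread_mutex_lock(&atomic_lock);
		*p = v;
		pthread_mutex_unlock(&atomic_lock);
	}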


NOTE: another possible solution for spinlock-based bitops is making
test_and_set 'smarter':

	spin_lock();
	val = READ_ONCE(word);
	if (!(val & bit)) {
		val |= bit;
		WRITE_ONCE(word, val);
	}
	spin_unlock();

But that is not something that works in general (it does not cover the
other atomic ops), and therefore atomic_set() is required to take the
spinlock too, which also cures the problem.
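
As a self-contained sketch (reusing the illustrative atomic_lock from
the model above), that 'smarter' variant would look like:

	static int model_test_and_set_bit(unsigned long *word, unsigned long bit)
	{
		unsigned long val;
		int was_set;

		pthread_mutex_lock(&atomic_lock);
		val = *word;
		was_set = !!(val & bit);
		if (!was_set)
			*word = val | bit;	/* store only when changing */
		pthread_mutex_unlock(&atomic_lock);
		return was_set;
	}

By skipping the store when the bit is already set, a stale write-back
can no longer clobber a concurrent plain __clear_bit() store; but as
noted, this trick does not generalize to the other atomic ops.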

Thread overview:
2018-08-29 18:33 Patch "asm-generic/bitops/lock.h: Rewrite using atomic_fetch_" causes kernel crash Eugeniy Paltsev
2018-08-29 21:16 ` Vineet Gupta
2018-08-30  9:35   ` Will Deacon
2018-08-30  9:44   ` Peter Zijlstra
2018-08-30  9:51     ` Will Deacon
2018-08-30 11:53       ` Eugeniy Paltsev
2018-08-30 13:57         ` Will Deacon
2018-08-30 14:17         ` Peter Zijlstra
2018-08-30 14:23           ` Will Deacon
2018-08-30 14:29             ` Peter Zijlstra
2018-08-30 14:43               ` Peter Zijlstra
2020-04-14  1:19                 ` Vineet Gupta
2018-08-30 20:31               ` Vineet Gupta
2018-08-30 20:45                 ` Peter Zijlstra
2018-08-31  0:30                   ` Vineet Gupta
2018-08-31  9:53                     ` Will Deacon
2018-08-30 14:46           ` Eugeniy Paltsev
2018-08-30 17:15             ` Peter Zijlstra
2018-08-31  0:42       ` Vineet Gupta
2018-08-31  0:29     ` __clear_bit_lock to use atomic clear_bit (was Re: Patch "asm-generic/bitops/lock.h) Vineet Gupta
2018-08-31  7:24       ` Peter Zijlstra [this message]
