linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: slub: Ensure that slab_unlock() is atomic
@ 2016-03-08 14:30 Vineet Gupta
  2016-03-08 15:00 ` Christoph Lameter
  2016-03-08 15:32 ` Vlastimil Babka
  0 siblings, 2 replies; 19+ messages in thread
From: Vineet Gupta @ 2016-03-08 14:30 UTC (permalink / raw)
  To: linux-mm
  Cc: Vineet Gupta, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Noam Camus, stable, linux-kernel,
	linux-snps-arc

We observed livelocks on an ARC SMP setup when running hackbench with SLUB.
This hardware configuration lacks atomic instructions (LLOCK/SCOND), so the
kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops
such as test_and_set_bit().

The spinlock itself is implemented using Atomic [EX]change instruction
which is always available.

The race happened when both cores tried to slab_lock() the same page.

   c1		    c0
-----------	-----------
slab_lock
		slab_lock
slab_unlock
		Not observing the unlock

This in turn happened because slab_unlock() doesn't serialize properly
(it doesn't use an atomic clear) with a concurrently running
slab_lock()->test_and_set_bit().
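
The mismatch can be replayed deterministically in user space (a hypothetical
model, not kernel code), which shows how the plain clear gets overwritten:

/*
 * Deterministic replay of the race above: CPU1's non-LLSC test_and_set_bit()
 * is a load / or / store done under a separate @smp_bitops_lock, so CPU0's
 * plain __clear_bit() store can land between the load and the write-back and
 * then be overwritten.
 */
#include <stdio.h>

int main(void)
{
	unsigned long flags = 1UL << 0;		/* PG_locked currently held by CPU0 */

	/* CPU1: slab_lock() -> test_and_set_bit(PG_locked), step by step */
	unsigned long old = flags;		/* (A) sees PG_locked set */

	/* CPU0: slab_unlock() -> __clear_bit(PG_locked): plain, unguarded store */
	flags &= ~(1UL << 0);			/* (B) the unlock */

	flags = old | (1UL << 0);		/* (C) write-back overwrites the unlock */

	printf("old=%#lx flags=%#lx: CPU1 saw the bit set, and the unlock is gone\n",
	       old, flags);
	return 0;				/* CPU1 now spins on a bit nobody will clear */
}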

Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Noam Camus <noamc@ezchip.com>
Cc: <stable@vger.kernel.org>
Cc: <linux-mm@kvack.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: <linux-snps-arc@lists.infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 mm/slub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index d8fbd4a6ed59..b7d345a508dc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page)
 static __always_inline void slab_unlock(struct page *page)
 {
 	VM_BUG_ON_PAGE(PageTail(page), page);
-	__bit_spin_unlock(PG_locked, &page->flags);
+	bit_spin_unlock(PG_locked, &page->flags);
 }
 
 static inline void set_page_slub_counters(struct page *page, unsigned long counters_new)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-08 14:30 [PATCH] mm: slub: Ensure that slab_unlock() is atomic Vineet Gupta
@ 2016-03-08 15:00 ` Christoph Lameter
  2016-03-08 15:46   ` Vineet Gupta
  2016-03-08 15:32 ` Vlastimil Babka
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Lameter @ 2016-03-08 15:00 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: linux-mm, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Noam Camus, stable, linux-kernel, linux-snps-arc

On Tue, 8 Mar 2016, Vineet Gupta wrote:

> This in turn happened because slab_unlock() doesn't serialize properly
> (it doesn't use an atomic clear) with a concurrently running
> slab_lock()->test_and_set_bit().

This is intentional because of the increased latency of atomic
instructions. Why would the unlock need to be atomic? This patch will
cause regressions.

Guess this is an architecture specific issue of modified
cachelines not becoming visible to other processors?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-08 14:30 [PATCH] mm: slub: Ensure that slab_unlock() is atomic Vineet Gupta
  2016-03-08 15:00 ` Christoph Lameter
@ 2016-03-08 15:32 ` Vlastimil Babka
  1 sibling, 0 replies; 19+ messages in thread
From: Vlastimil Babka @ 2016-03-08 15:32 UTC (permalink / raw)
  To: Vineet Gupta, linux-mm
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Noam Camus, stable, linux-kernel, linux-snps-arc

On 03/08/2016 03:30 PM, Vineet Gupta wrote:
> We observed livelocks on an ARC SMP setup when running hackbench with SLUB.
> This hardware configuration lacks atomic instructions (LLOCK/SCOND), so the
> kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops
> such as test_and_set_bit().

Sounds like this architecture should then redefine __clear_bit_unlock
and perhaps other non-atomic __X_bit() variants to be atomic, and not
defer this requirement to places that use the API?
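
Roughly, such an override would just route the clear through the same lock
that already guards the R-M-W ops on a !LLSC configuration. A user-space
sketch (hypothetical names, not actual kernel code):

#include <pthread.h>

static pthread_spinlock_t smp_bitops_lock;	/* stand-in for the central bitops lock;
						 * pthread_spin_init() omitted for brevity */

/* Clear bit @nr, serialized against the lock-based test_and_set_bit()
 * by taking the very same lock; the unlock provides the release. */
static void arch___clear_bit_unlock(int nr, unsigned long *addr)
{
	pthread_spin_lock(&smp_bitops_lock);
	*addr &= ~(1UL << nr);
	pthread_spin_unlock(&smp_bitops_lock);
}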

> The spinlock itself is implemented using Atomic [EX]change instruction
> which is always available.
> 
> The race happened when both cores tried to slab_lock() the same page.
> 
>    c1		    c0
> -----------	-----------
> slab_lock
> 		slab_lock
> slab_unlock
> 		Not observing the unlock
> 
> This in turn happened because slab_unlock() doesn't serialize properly
> (it doesn't use an atomic clear) with a concurrently running
> slab_lock()->test_and_set_bit().
> 
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Noam Camus <noamc@ezchip.com>
> Cc: <stable@vger.kernel.org>
> Cc: <linux-mm@kvack.org>
> Cc: <linux-kernel@vger.kernel.org>
> Cc: <linux-snps-arc@lists.infradead.org>
> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
> ---
>  mm/slub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index d8fbd4a6ed59..b7d345a508dc 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page)
>  static __always_inline void slab_unlock(struct page *page)
>  {
>  	VM_BUG_ON_PAGE(PageTail(page), page);
> -	__bit_spin_unlock(PG_locked, &page->flags);
> +	bit_spin_unlock(PG_locked, &page->flags);
>  }
>  
>  static inline void set_page_slub_counters(struct page *page, unsigned long counters_new)
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-08 15:00 ` Christoph Lameter
@ 2016-03-08 15:46   ` Vineet Gupta
  2016-03-08 20:40     ` Christoph Lameter
  0 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2016-03-08 15:46 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-mm, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Noam Camus, stable, linux-kernel, linux-snps-arc

On Tuesday 08 March 2016 08:30 PM, Christoph Lameter wrote:
> On Tue, 8 Mar 2016, Vineet Gupta wrote:
> 
>> This in turn happened because slab_unlock() doesn't serialize properly
>> (it doesn't use an atomic clear) with a concurrently running
>> slab_lock()->test_and_set_bit().
> 
> This is intentional because of the increased latency of atomic
> instructions. Why would the unlock need to be atomic? This patch will
> cause regressions.
> 
> Guess this is an architecture specific issue of modified
> cachelines not becoming visible to other processors?

Absolutely not - we verified with hardware coherency tracing that there was no
foul play there. And I would dare not point a finger at code that was last updated
in 2011 without being absolutely sure.

Let me explain this in a bit more detail. As I mentioned in the commit log, this
config of ARC doesn't have exclusive load/stores (LLOCK/SCOND), so atomic ops are
implemented using a "central" spinlock. That spinlock itself is implemented using
the EX instruction (an atomic exchange, i.e. an atomic R-W).
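
A user-space stand-in for that EX-based lock (hypothetical code, not the actual
ARC implementation) is roughly:

#include <stdatomic.h>

static _Atomic unsigned int smp_bitops_lock;	/* 0 = free, 1 = taken */

static void bitops_lock(void)
{
	/* spin with an atomic exchange until the old value shows the lock was free */
	while (atomic_exchange_explicit(&smp_bitops_lock, 1, memory_order_acquire))
		;
}

static void bitops_unlock(void)
{
	atomic_store_explicit(&smp_bitops_lock, 0, memory_order_release);
}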

Generated code for slab_lock() - essentially bit_spin_lock() - is below (I've
removed the code generated for CONFIG_PREEMPT for simplicity):

80543b0c <slab_lock>:
80543b0c:	push_s     blink
...
80543b3a:	mov_s      r15,0x809de168   <-- @smp_bitops_lock
80543b40:	mov_s      r17,1
80543b46:	mov_s      r16,0

# spin lock() inside test_and_set_bit() - see arc bitops.h (!LLSC code)
80543b78:	clri       r4
80543b7c:	dmb        3
80543b80:	mov_s      r2,r17
80543b82:	ex         r2,[r15]
80543b86:	breq       r2,1,80543b82
80543b8a:	dmb        3

# set the bit
80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)

# spin_unlock
80543b96:	dmb        3
80543b9a:	mov_s      r3,r16
80543b9c:	ex         r3,[r15]
80543ba0:	dmb        3
80543ba4:	seti       r4

# check the old bit
80543ba8:	bbit0      r2,0,80543bb8   <--- bit was set, branch not taken
80543bac:	b_s        80543b68        <--- enter the test_bit() loop

   80543b68:	ld_s       r2,[r13,0]	   <-- re-reads the bit, set by SELF at (C)
   80543b6a:	bbit1      r2,0,80543b68       spins infinitely

...


Now, using hardware coherency tracing (and the cycle timestamps), we verified
(A) and (B).

The thing is, with exclusive load/stores this race simply can't happen: the
intervening ST causes the ST in (C) to not commit, and the LD/ST sequence is
retried.

And there will be very few production systems that are SMP but lack exclusive
load/stores.

Are you convinced now !

-Vineet

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-08 15:46   ` Vineet Gupta
@ 2016-03-08 20:40     ` Christoph Lameter
  2016-03-09  6:43       ` Vineet Gupta
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Lameter @ 2016-03-08 20:40 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: linux-mm, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Noam Camus, stable, linux-kernel, linux-snps-arc

On Tue, 8 Mar 2016, Vineet Gupta wrote:

> # set the bit
> 80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
> 80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
> 80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)

Duh. Guess you  need to take the spinlock also in the arch specific
implementation of __bit_spin_unlock(). This is certainly not the only case
in which we use the __ op to unlock.

You need a true atomic op or you need to take the "spinlock" in all
cases where you modify the bit. If you take the lock in __bit_spin_unlock
then the race cannot happen.

> Are you convinced now !

Yes, please fix your arch specific code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-08 20:40     ` Christoph Lameter
@ 2016-03-09  6:43       ` Vineet Gupta
  2016-03-09 10:13         ` Peter Zijlstra
  0 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2016-03-09  6:43 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-mm, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Noam Camus, stable, linux-kernel, linux-snps-arc,
	linux-parisc, Peter Zijlstra, James E.J. Bottomley, Helge Deller,
	linux-arch

+CC linux-arch, parisc folks, PeterZ

On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote:
> On Tue, 8 Mar 2016, Vineet Gupta wrote:
> 
>> # set the bit
>> 80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
>> 80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
>> 80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)
> 
> Duh. Guess you  need to take the spinlock also in the arch specific
> implementation of __bit_spin_unlock(). This is certainly not the only case
> in which we use the __ op to unlock.

__bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is
- so I don't think we need a spinlock there.

There is clearly a problem in the slub code: it pairs a test_and_set_bit()
with a __clear_bit(). The latter can obviously clobber the former unless they are
each a single instruction (as on x86), or unless they use LLOCK/SCOND-style
instructions, where the interim store from the other core is detected and causes
the whole LLOCK/SCOND sequence to be retried.

BTW, ARC is not the only arch which suffers from this - others potentially do
as well. AFAIK PARISC also doesn't have atomic R-M-W instructions and likewise
uses a set of external hashed spinlocks to protect its R-M-W sequences.

https://lkml.org/lkml/2014/6/1/178

So the same race exists there as well, because the outer spinlock is not taken for
slab_unlock() -> __bit_spin_unlock() -> __clear_bit().

Arguably I could fix the ARC !LLSC variant of test_and_set_bit() to not set the bit
unconditionally but only when it was clear (PARISC does the same; see the sketch
below). That would be a slight micro-optimization, as we wouldn't need another snoop
transaction to make the line writable, and it would also elide this problem. But I
think there is a fundamental problem here in slub, which is pairing an atomic op
with a non-atomic one for performance reasons. That doesn't work on all arches
and/or configurations.
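
A rough user-space sketch of that conditional-store variant (hypothetical code,
only meant to illustrate the idea):

#include <pthread.h>
#include <stdbool.h>

static pthread_spinlock_t smp_bitops_lock;	/* stand-in for the central bitops lock;
						 * pthread_spin_init() omitted for brevity */

static bool test_and_set_bit_if_clear(int nr, unsigned long *addr)
{
	unsigned long mask = 1UL << nr, old;

	pthread_spin_lock(&smp_bitops_lock);
	old = *addr;
	if (!(old & mask))		/* store only when the bit was clear; an unlock */
		*addr = old | mask;	/* can only race while the bit is set, and in   */
					/* that case we no longer write anything back   */
	pthread_spin_unlock(&smp_bitops_lock);

	return old & mask;		/* true if the bit was already set */
}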

> You need a true atomic op or you need to take the "spinlock" in all
> cases where you modify the bit.

No we don't in __bit_spin_lock and we already do in bit_spin_lock.

> If you take the lock in __bit_spin_unlock
> then the race cannot happen.

Of course it won't, but that means we penalize all non-atomic callers of the API
with a superfluous spinlock which is not required in the first place, given the
definition of the API.


>> Are you convinced now !
> 
> Yes, please fix your arch specific code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09  6:43       ` Vineet Gupta
@ 2016-03-09 10:13         ` Peter Zijlstra
  2016-03-09 10:31           ` Peter Zijlstra
                             ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Peter Zijlstra @ 2016-03-09 10:13 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Christoph Lameter, linux-mm, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Noam Camus, stable, linux-kernel,
	linux-snps-arc, linux-parisc, James E.J. Bottomley, Helge Deller,
	linux-arch

On Wed, Mar 09, 2016 at 12:13:16PM +0530, Vineet Gupta wrote:
> +CC linux-arch, parisc folks, PeterZ
> 
> On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote:
> > On Tue, 8 Mar 2016, Vineet Gupta wrote:
> > 
> >> # set the bit
> >> 80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
> >> 80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
> >> 80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)
> > 
> > Duh. Guess you  need to take the spinlock also in the arch specific
> > implementation of __bit_spin_unlock(). This is certainly not the only case
> > in which we use the __ op to unlock.
> 
> __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is
> - so I don't think we need a spinlock there.

Agreed. The double underscore prefixed instructions are not required to
be atomic in any way shape or form.

> There is clearly a problem in slub code that it is pairing a test_and_set_bit()
> with a __clear_bit(). Latter can obviously clobber former if they are not a single
> instruction each unlike x86 or they use llock/scond kind of instructions where the
> interim store from other core is detected and causes a retry of whole llock/scond
> sequence.

Yes, test_and_set_bit() + __clear_bit() is broken.

> > If you take the lock in __bit_spin_unlock
> > then the race cannot happen.
> 
> Of course it won't, but that means we penalize all non-atomic callers of the API
> with a superfluous spinlock which is not required in the first place, given the
> definition of the API.

Quite. _However_, your arch is still broken, but not by your fault. It's
the asm-generic code that is wrong.

The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which
defaults to __clear_bit(). Which is wrong.

---
Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()

__clear_bit_unlock() is a special little snowflake. While it carries the
non-atomic '__' prefix, it is specifically documented to pair with
test_and_set_bit() and therefore should be 'somewhat' atomic.

Therefore the generic implementation of __clear_bit_unlock() cannot use
the fully non-atomic __clear_bit() as a default.

If an arch is able to do better, it must provide an implementation of
__clear_bit_unlock() itself.

Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/bitops/lock.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index c30266e94806..8ef0ccbf8167 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -29,16 +29,16 @@ do {					\
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
- * This operation is like clear_bit_unlock, however it is not atomic.
- * It does provide release barrier semantics so it can be used to unlock
- * a bit lock, however it would only be used if no other CPU can modify
- * any bits in the memory until the lock is released (a good example is
- * if the bit lock itself protects access to the other bits in the word).
+ * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
+ * the bits in the word are protected by this lock some archs can use weaker
+ * ops to safely unlock.
+ *
+ * See for example x86's implementation.
  */
 #define __clear_bit_unlock(nr, addr)	\
 do {					\
-	smp_mb();			\
-	__clear_bit(nr, addr);		\
+	smp_mb__before_atomic();	\
+	clear_bit(nr, addr);		\
 } while (0)
 
 #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */
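
For illustration only, the kind of override the changelog alludes to (a
hypothetical sketch, not the actual x86 header): on a strongly ordered
architecture whose lock side is a single atomic RMW, a compiler barrier plus a
plain store is already a valid release, so such an arch can keep the cheap unlock:

/* Hypothetical arch override (user-space stand-in): valid only where the
 * hardware orders the preceding plain stores before this one, and where all
 * other bits in the word are protected by the lock, so nothing can be lost. */
#include <stdatomic.h>

static inline void arch___clear_bit_unlock(unsigned int nr, unsigned long *addr)
{
	atomic_signal_fence(memory_order_release);	/* compiler barrier only */
	*addr &= ~(1UL << nr);				/* plain, non-atomic clear */
}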

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 10:13         ` Peter Zijlstra
@ 2016-03-09 10:31           ` Peter Zijlstra
  2016-03-09 11:12             ` Vineet Gupta
  2016-03-09 11:00           ` Vineet Gupta
  2016-03-09 13:22           ` [PATCH] mm: slub: Ensure that slab_unlock() is atomic Vineet Gupta
  2 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2016-03-09 10:31 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Christoph Lameter, linux-mm, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Noam Camus, stable, linux-kernel,
	linux-snps-arc, linux-parisc, James E.J. Bottomley, Helge Deller,
	linux-arch

On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote:
> ---
> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()
> 
> __clear_bit_unlock() is a special little snowflake. While it carries the
> non-atomic '__' prefix, it is specifically documented to pair with
> test_and_set_bit() and therefore should be 'somewhat' atomic.
> 
> Therefore the generic implementation of __clear_bit_unlock() cannot use
> the fully non-atomic __clear_bit() as a default.
> 
> If an arch is able to do better, it must provide an implementation of
> __clear_bit_unlock() itself.
> 

FWIW, we could probably undo this if we unified all the spinlock based
atomic ops implementations (there's a whole bunch doing essentially the
same), and special cased __clear_bit_unlock() for that.

Collapsing them is probably a good idea anyway, just a fair bit of
non-trivial work to figure out all the differences and if they matter
etc..

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 10:13         ` Peter Zijlstra
  2016-03-09 10:31           ` Peter Zijlstra
@ 2016-03-09 11:00           ` Vineet Gupta
  2016-03-09 11:40             ` Peter Zijlstra
  2016-03-09 13:22           ` [PATCH] mm: slub: Ensure that slab_unlock() is atomic Vineet Gupta
  2 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2016-03-09 11:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-parisc, Andrew Morton, Helge Deller,
	linux-kernel, stable, James E.J. Bottomley, Pekka Enberg,
	linux-mm, Noam Camus, David Rientjes, Christoph Lameter,
	linux-snps-arc, Joonsoo Kim

On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote:
>>> If you take the lock in __bit_spin_unlock
>>> then the race cannot happen.
>>
>> Of course it won't, but that means we penalize all non-atomic callers of the API
>> with a superfluous spinlock which is not required in the first place, given the
>> definition of the API.
> 
> Quite. _However_, your arch is still broken, but not by your fault. It's
> the asm-generic code that is wrong.
> 
> The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which
> defaults to __clear_bit(). Which is wrong.
> 
> ---
> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()
> 
> __clear_bit_unlock() is a special little snowflake. While it carries the
> non-atomic '__' prefix, it is specifically documented to pair with
> test_and_set_bit() and therefore should be 'somewhat' atomic.
> 
> Therefore the generic implementation of __clear_bit_unlock() cannot use
> the fully non-atomic __clear_bit() as a default.
> 
> If an arch is able to do better, it must provide an implementation of
> __clear_bit_unlock() itself.
> 
> Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>

This needs to be CCed stable as it fixes a real bug for ARC.

> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>

FWIW, could we add some background to the commit log, specifically what prompted
this? Something like below...

---->8------
This came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP
+ SLUB + !LLSC.

The issue was incorrect pairing of atomic ops.

slab_lock() -> bit_spin_lock() -> test_and_set_bit()
slab_unlock() -> __bit_spin_unlock() -> __clear_bit()

The non serializing __clear_bit() was getting "lost"

80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)

Fixes ARC STAR 9000817404
---->8------

> ---
>  include/asm-generic/bitops/lock.h | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
> index c30266e94806..8ef0ccbf8167 100644
> --- a/include/asm-generic/bitops/lock.h
> +++ b/include/asm-generic/bitops/lock.h
> @@ -29,16 +29,16 @@ do {					\
>   * @nr: the bit to set
>   * @addr: the address to start counting from
>   *
> - * This operation is like clear_bit_unlock, however it is not atomic.
> - * It does provide release barrier semantics so it can be used to unlock
> - * a bit lock, however it would only be used if no other CPU can modify
> - * any bits in the memory until the lock is released (a good example is
> - * if the bit lock itself protects access to the other bits in the word).
> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
> + * the bits in the word are protected by this lock some archs can use weaker
> + * ops to safely unlock.
> + *
> + * See for example x86's implementation.
>   */

To be able to override this / use the generic, don't we need an #ifndef ...?

>  #define __clear_bit_unlock(nr, addr)	\
>  do {					\
> -	smp_mb();			\
> -	__clear_bit(nr, addr);		\
> +	smp_mb__before_atomic();	\
> +	clear_bit(nr, addr);		\
>  } while (0)
>  
>  #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 10:31           ` Peter Zijlstra
@ 2016-03-09 11:12             ` Vineet Gupta
  0 siblings, 0 replies; 19+ messages in thread
From: Vineet Gupta @ 2016-03-09 11:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Lameter, linux-mm, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Noam Camus, stable, linux-kernel,
	linux-snps-arc, linux-parisc, James E.J. Bottomley, Helge Deller,
	linux-arch

On Wednesday 09 March 2016 04:01 PM, Peter Zijlstra wrote:
> On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote:
>> ---
>> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()
>>
>> __clear_bit_unlock() is a special little snowflake. While it carries the
>> non-atomic '__' prefix, it is specifically documented to pair with
>> test_and_set_bit() and therefore should be 'somewhat' atomic.
>>
>> Therefore the generic implementation of __clear_bit_unlock() cannot use
>> the fully non-atomic __clear_bit() as a default.
>>
>> If an arch is able to do better, it must provide an implementation of
>> __clear_bit_unlock() itself.
>>
> 
> FWIW, we could probably undo this if we unified all the spinlock based
> atomic ops implementations (there's a whole bunch doing essentially the
> same), and special cased __clear_bit_unlock() for that.
> 
> Collapsing them is probably a good idea anyway, just a fair bit of
> non-trivial work to figure out all the differences and if they matter
> etc..

Indeed, I thought about this when we first did the SMP port. The only issue was
somewhat more generated code with the hashed spinlocks (vs. my dumb 2 spinlocks,
which as I see now will also cause false sharing, since they likely end up in the
same cache line), but I was more of a micro-optimization freak then than I am now :-)

So yeah, I agree!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 11:00           ` Vineet Gupta
@ 2016-03-09 11:40             ` Peter Zijlstra
  2016-03-09 11:53               ` Vineet Gupta
                                 ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Peter Zijlstra @ 2016-03-09 11:40 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: linux-arch, linux-parisc, Andrew Morton, Helge Deller,
	linux-kernel, stable, James E.J. Bottomley, Pekka Enberg,
	linux-mm, Noam Camus, David Rientjes, Christoph Lameter,
	linux-snps-arc, Joonsoo Kim

On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote:
> FWIW, could we add some background to commit log, specifically what prompted this.
> Something like below...

Sure.. find below.

> > +++ b/include/asm-generic/bitops/lock.h
> > @@ -29,16 +29,16 @@ do {					\
> >   * @nr: the bit to set
> >   * @addr: the address to start counting from
> >   *
> > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
> > + * the bits in the word are protected by this lock some archs can use weaker
> > + * ops to safely unlock.
> > + *
> > + * See for example x86's implementation.
> >   */
> 
> To be able to override/use-generic don't we need #ifndef ....

I did not follow through the maze; I think the few archs implementing
this simply do not include this file at all.

I'll let the first person who cares about this worry about it :-)

---
Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()

__clear_bit_unlock() is a special little snowflake. While it carries the
non-atomic '__' prefix, it is specifically documented to pair with
test_and_set_bit() and therefore should be 'somewhat' atomic.

Therefore the generic implementation of __clear_bit_unlock() cannot use
the fully non-atomic __clear_bit() as a default.

If an arch is able to do better, it must provide an implementation of
__clear_bit_unlock() itself.

Specifically, this came up as a result of hackbench livelock'ing in
slab_lock() on ARC with SMP + SLUB + !LLSC.

The issue was incorrect pairing of atomic ops.

slab_lock() -> bit_spin_lock() -> test_and_set_bit()
slab_unlock() -> __bit_spin_unlock() -> __clear_bit()

The non serializing __clear_bit() was getting "lost"

80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)

Fixes ARC STAR 9000817404 (and probably more).

Cc: stable@vger.kernel.org
Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/bitops/lock.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index c30266e94806..8ef0ccbf8167 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -29,16 +29,16 @@ do {					\
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
- * This operation is like clear_bit_unlock, however it is not atomic.
- * It does provide release barrier semantics so it can be used to unlock
- * a bit lock, however it would only be used if no other CPU can modify
- * any bits in the memory until the lock is released (a good example is
- * if the bit lock itself protects access to the other bits in the word).
+ * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
+ * the bits in the word are protected by this lock some archs can use weaker
+ * ops to safely unlock.
+ *
+ * See for example x86's implementation.
  */
 #define __clear_bit_unlock(nr, addr)	\
 do {					\
-	smp_mb();			\
-	__clear_bit(nr, addr);		\
+	smp_mb__before_atomic();	\
+	clear_bit(nr, addr);		\
 } while (0)
 
 #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 11:40             ` Peter Zijlstra
@ 2016-03-09 11:53               ` Vineet Gupta
  2016-03-09 12:22                 ` Peter Zijlstra
  2016-03-14  8:05               ` Vineet Gupta
  2016-03-21 11:16               ` [tip:locking/urgent] bitops: Do not default to __clear_bit() for __clear_bit_unlock() tip-bot for Peter Zijlstra
  2 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2016-03-09 11:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-parisc, Helge Deller, linux-kernel, stable,
	James E.J. Bottomley, Pekka Enberg, linux-mm, Noam Camus,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-snps-arc,
	Christoph Lameter

On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote:
> On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote:
>> FWIW, could we add some background to commit log, specifically what prompted this.
>> Something like below...
> 
> Sure.. find below.
> 
>>> +++ b/include/asm-generic/bitops/lock.h
>>> @@ -29,16 +29,16 @@ do {					\
>>>   * @nr: the bit to set
>>>   * @addr: the address to start counting from
>>>   *
>>> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
>>> + * the bits in the word are protected by this lock some archs can use weaker
>>> + * ops to safely unlock.
>>> + *
>>> + * See for example x86's implementation.
>>>   */
>>
>> To be able to override/use-generic don't we need #ifndef ....
> 
> I did not follow through the maze, I think the few archs implementing
> this simply do not include this file at all.
> 
> I'll let the first person that cares about this worry about that :-)

Ok - that'd be me :-) although I really don't see much gain in the ARC LLSC case.

For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND, at least in terms of
cache coherency transactions!

> 
> ---
> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()
> 
> __clear_bit_unlock() is a special little snowflake. While it carries the
> non-atomic '__' prefix, it is specifically documented to pair with
> test_and_set_bit() and therefore should be 'somewhat' atomic.
> 
> Therefore the generic implementation of __clear_bit_unlock() cannot use
> the fully non-atomic __clear_bit() as a default.
> 
> If an arch is able to do better, it must provide an implementation of
> __clear_bit_unlock() itself.
> 
> Specifically, this came up as a result of hackbench livelock'ing in
> slab_lock() on ARC with SMP + SLUB + !LLSC.
> 
> The issue was incorrect pairing of atomic ops.
> 
> slab_lock() -> bit_spin_lock() -> test_and_set_bit()
> slab_unlock() -> __bit_spin_unlock() -> __clear_bit()
> 
> The non serializing __clear_bit() was getting "lost"
> 
> 80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
> 80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
> 80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)
> 
> Fixes ARC STAR 9000817404 (and probably more).
> 
> Cc: stable@vger.kernel.org
> Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
> Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

LGTM. Thx a bunch Peter !

-Vineet

> ---
>  include/asm-generic/bitops/lock.h | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
> index c30266e94806..8ef0ccbf8167 100644
> --- a/include/asm-generic/bitops/lock.h
> +++ b/include/asm-generic/bitops/lock.h
> @@ -29,16 +29,16 @@ do {					\
>   * @nr: the bit to set
>   * @addr: the address to start counting from
>   *
> - * This operation is like clear_bit_unlock, however it is not atomic.
> - * It does provide release barrier semantics so it can be used to unlock
> - * a bit lock, however it would only be used if no other CPU can modify
> - * any bits in the memory until the lock is released (a good example is
> - * if the bit lock itself protects access to the other bits in the word).
> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
> + * the bits in the word are protected by this lock some archs can use weaker
> + * ops to safely unlock.
> + *
> + * See for example x86's implementation.
>   */
>  #define __clear_bit_unlock(nr, addr)	\
>  do {					\
> -	smp_mb();			\
> -	__clear_bit(nr, addr);		\
> +	smp_mb__before_atomic();	\
> +	clear_bit(nr, addr);		\
>  } while (0)
>  
>  #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 11:53               ` Vineet Gupta
@ 2016-03-09 12:22                 ` Peter Zijlstra
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2016-03-09 12:22 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: linux-arch, linux-parisc, Helge Deller, linux-kernel, stable,
	James E.J. Bottomley, Pekka Enberg, linux-mm, Noam Camus,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-snps-arc,
	Christoph Lameter

On Wed, Mar 09, 2016 at 05:23:26PM +0530, Vineet Gupta wrote:
> > I did not follow through the maze, I think the few archs implementing
> > this simply do not include this file at all.
> > 
> > I'll let the first person that cares about this worry about that :-)
> 
> Ok - that'd be me :-) although I really don't see much gain in the ARC LLSC case.
> 
> For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND, at least in terms of
> cache coherency transactions!

The win would be in not having to ever retry the SCOND. Although in this
case, the contending CPU would be doing reads -- which I assume will not
cause a SCOND to fail, so it might indeed not make any difference.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 10:13         ` Peter Zijlstra
  2016-03-09 10:31           ` Peter Zijlstra
  2016-03-09 11:00           ` Vineet Gupta
@ 2016-03-09 13:22           ` Vineet Gupta
  2016-03-09 14:51             ` Peter Zijlstra
  2 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2016-03-09 13:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-parisc, Andrew Morton, Helge Deller,
	linux-kernel, stable, James E.J. Bottomley, Pekka Enberg,
	linux-mm, Noam Camus, David Rientjes, Christoph Lameter,
	linux-snps-arc, Joonsoo Kim

On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote:
>> There is clearly a problem in slub code that it is pairing a test_and_set_bit()
>> with a __clear_bit(). Latter can obviously clobber former if they are not a single
>> instruction each unlike x86 or they use llock/scond kind of instructions where the
>> interim store from other core is detected and causes a retry of whole llock/scond
>> sequence.
> 
> Yes, test_and_set_bit() + __clear_bit() is broken.

But in SLUB, bit_spin_lock() + __bit_spin_unlock() is acceptable? How so
(ignoring the performance angle for discussion's sake, which is a side effect of
this implementation)?

So despite the comment below from bit_spinlock.h, I don't quite comprehend how this
is allowable. And if, say by deduction, this is fine for the LLSC or LOCK-prefixed
cases, then isn't this true in general for a lot more cases in the kernel, i.e.
pairing an atomic lock with a non-atomic unlock? I'm missing something!

| /*
|  *  bit-based spin_unlock()
|  *  non-atomic version, which can be used eg. if the bit lock itself is
|  *  protecting the rest of the flags in the word.
|  */
| static inline void __bit_spin_unlock(int bitnum, unsigned long *addr)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 13:22           ` [PATCH] mm: slub: Ensure that slab_unlock() is atomic Vineet Gupta
@ 2016-03-09 14:51             ` Peter Zijlstra
  2016-03-10  5:51               ` Vineet Gupta
  0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2016-03-09 14:51 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: linux-arch, linux-parisc, Andrew Morton, Helge Deller,
	linux-kernel, stable, James E.J. Bottomley, Pekka Enberg,
	linux-mm, Noam Camus, David Rientjes, Christoph Lameter,
	linux-snps-arc, Joonsoo Kim

On Wed, Mar 09, 2016 at 06:52:45PM +0530, Vineet Gupta wrote:
> On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote:
> >> There is clearly a problem in slub code that it is pairing a test_and_set_bit()
> >> with a __clear_bit(). Latter can obviously clobber former if they are not a single
> >> instruction each unlike x86 or they use llock/scond kind of instructions where the
> >> interim store from other core is detected and causes a retry of whole llock/scond
> >> sequence.
> > 
> > Yes, test_and_set_bit() + __clear_bit() is broken.
> 
> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so
> (ignoring the performance thing for discussion sake, which is a side effect of
> this implementation).

The short answer is: per definition. They are defined to work together,
which is what makes __clear_bit_unlock() such a special function.

> So despite the comment below in bit_spinlock.h I don't quite comprehend how this
> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed
> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing
> atomic lock with non-atomic unlock ? I'm missing something !

x86 (and others) do in fact use non-atomic instructions for
spin_unlock(). But as this is all arch specific, we can make these
assumptions. Its just that generic code cannot rely on it.

So let me try and explain.


The problem as identified is:

CPU0						CPU1

bit_spin_lock()					__bit_spin_unlock()
1:
	/* fetch_or, r1 holds the old value */
	spin_lock
	load	r1, addr
						load	r1, addr
						bclr	r2, r1, 1
						store	r2, addr
	or	r2, r1, 1
	store	r2, addr	/* lost the store from CPU1 */
	spin_unlock

	and	r1, 1
	bnz	2	/* it was set, go wait */
	ret

2:
	load	r1, addr
	and	r1, 1
	bnz	2	/* wait until its not set */

	b	1	/* try again */



For LL/SC we replace:

	spin_lock
	load	r1, addr

	...

	store	r2, addr
	spin_unlock

With the (obvious):

1:
	load-locked	r1, addr

	...

	store-cond	r2, addr
	bnz		1 /* or whatever branch instruction is required to retry */


In this case the failure cannot happen, because the store from CPU1
would have invalidated the lock from CPU0 and caused the
store-cond to fail and retry the loop, observing the new value.
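
The same thing in user-space C, with a compare-and-exchange loop standing in
for LL/SC (a hypothetical sketch, not kernel code):

#include <stdatomic.h>
#include <stdbool.h>

static bool test_and_set_bit_llsc_like(int nr, _Atomic unsigned long *addr)
{
	unsigned long mask = 1UL << nr;
	unsigned long old = atomic_load_explicit(addr, memory_order_relaxed);

	/* If the unlocker's store lands between our load and the exchange,
	 * the exchange fails, 'old' is refreshed, and the whole RMW retries
	 * against the new value; the concurrent store can never be lost. */
	while (!atomic_compare_exchange_weak_explicit(addr, &old, old | mask,
						      memory_order_acquire,
						      memory_order_relaxed))
		;

	return old & mask;
}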

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 14:51             ` Peter Zijlstra
@ 2016-03-10  5:51               ` Vineet Gupta
  2016-03-10  9:10                 ` Peter Zijlstra
  0 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2016-03-10  5:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-parisc, Andrew Morton, Helge Deller,
	linux-kernel, stable, James E.J. Bottomley, Pekka Enberg,
	linux-mm, Noam Camus, David Rientjes, Christoph Lameter,
	linux-snps-arc, Joonsoo Kim

On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote:
>> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so
>> (ignoring the performance thing for discussion sake, which is a side effect of
>> this implementation).
> 
> The short answer is: per definition. They are defined to work together,
> which is what makes __clear_bit_unlock() such a special function.
> 
>> So despite the comment below in bit_spinlock.h I don't quite comprehend how this
>> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed
>> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing
>> atomic lock with non-atomic unlock ? I'm missing something !
> 
> x86 (and others) do in fact use non-atomic instructions for
> spin_unlock(). But as this is all arch specific, we can make these
> assumptions. Its just that generic code cannot rely on it.

OK, despite it being obvious now, I was not seeing the similarity between
spin_*lock() and bit_spin_*lock() :-(

ARC also uses a standard ST for spin_unlock(), so by analogy __bit_spin_unlock()
(for the LLSC case) would be correctly paired with bit_spin_lock().

But then why would anyone need bit_spin_unlock() at all? Especially after this
patch from you, which tightens __bit_spin_unlock() even more for the general case.

Thing is, if the API exists, the majority of people will use the more
conservative version without understanding all these nuances. Can we pursue the
path of moving bit_spin_unlock() over to __bit_spin_unlock(): first changing the
backend only, and, if that proves stable, replacing the call sites themselves?

> 
> So let me try and explain.
> 
> 
> The problem as identified is:
> 
> CPU0						CPU1
> 
> bit_spin_lock()					__bit_spin_unlock()
> 1:
> 	/* fetch_or, r1 holds the old value */
> 	spin_lock
> 	load	r1, addr
> 						load	r1, addr
> 						bclr	r2, r1, 1
> 						store	r2, addr
> 	or	r2, r1, 1
> 	store	r2, addr	/* lost the store from CPU1 */
> 	spin_unlock
> 
> 	and	r1, 1
> 	bnz	2	/* it was set, go wait */
> 	ret
> 
> 2:
> 	load	r1, addr
> 	and	r1, 1
> 	bnz	2	/* wait until its not set */
> 
> 	b	1	/* try again */
> 
> 
> 
> For LL/SC we replace:
> 
> 	spin_lock
> 	load	r1, addr
> 
> 	...
> 
> 	store	r2, addr
> 	spin_unlock
> 
> With the (obvious):
> 
> 1:
> 	load-locked	r1, addr
> 
> 	...
> 
> 	store-cond	r2, addr
> 	bnz		1 /* or whatever branch instruction is required to retry */
> 
> 
> In this case the failure cannot happen, because the store from CPU1
> would have invalidated the lock from CPU0 and caused the
> store-cond to fail and retry the loop, observing the new value. 

You did it again, A picture is worth thousand words !

Thx,
-Vineet

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-10  5:51               ` Vineet Gupta
@ 2016-03-10  9:10                 ` Peter Zijlstra
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2016-03-10  9:10 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: linux-arch, linux-parisc, Andrew Morton, Helge Deller,
	linux-kernel, stable, James E.J. Bottomley, Pekka Enberg,
	linux-mm, Noam Camus, David Rientjes, Christoph Lameter,
	linux-snps-arc, Joonsoo Kim

On Thu, Mar 10, 2016 at 11:21:21AM +0530, Vineet Gupta wrote:
> On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote:
> >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so
> >> (ignoring the performance thing for discussion sake, which is a side effect of
> >> this implementation).
> > 
> > The short answer is: per definition. They are defined to work together,
> > which is what makes __clear_bit_unlock() such a special function.
> > 
> >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this
> >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed
> >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing
> >> atomic lock with non-atomic unlock ? I'm missing something !
> > 
> > x86 (and others) do in fact use non-atomic instructions for
> > spin_unlock(). But as this is all arch specific, we can make these
> > assumptions. Its just that generic code cannot rely on it.
> 
> OK despite being obvious now, I was not seeing the similarity between spin_*lock()
> and bit_spin*lock() :-(
> 
> ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for
> LLSC case) would be correctly paired with bit_spin_lock().
> 
> But then why would anyone need bit_spin_unlock() at all. Specially after this
> patch from you which tightens __bit_spin_lock() even more for the general case.
> 
> Thing is if the API exists majority of people would would use the more
> conservative version w/o understand all these nuances. Can we pursue the path of
> moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend
> only and if proven stable replacing the call-sites themselves.

So the thing is, __bit_spin_unlock() is not safe if other bits in that
word can have concurrent modifications.

Only if the bitlock locks the whole word (or something else ensures no
other bits will change) can you use __bit_spin_unlock() to clear the
lock bit.
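
To make that concrete, two user-space stand-ins (hypothetical code, made-up names):

#include <stdatomic.h>

/* Word where bit 0 is the lock and every other bit is only touched while the
 * lock is held: recomputing the word and doing one release store is a valid
 * unlock, since nothing else can change the word underneath us. */
static void unlock_word_owned_by_lock(_Atomic unsigned long *word)
{
	unsigned long v = atomic_load_explicit(word, memory_order_relaxed);

	atomic_store_explicit(word, v & ~1UL, memory_order_release);
}

/* Word where other bits are flipped concurrently with atomic RMW ops (think
 * page->flags): the unlock itself must be an atomic RMW, or a concurrent
 * set_bit() between our load and store would be lost. */
static void unlock_word_with_shared_bits(_Atomic unsigned long *word)
{
	atomic_fetch_and_explicit(word, ~1UL, memory_order_release);
}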

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
  2016-03-09 11:40             ` Peter Zijlstra
  2016-03-09 11:53               ` Vineet Gupta
@ 2016-03-14  8:05               ` Vineet Gupta
  2016-03-21 11:16               ` [tip:locking/urgent] bitops: Do not default to __clear_bit() for __clear_bit_unlock() tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 19+ messages in thread
From: Vineet Gupta @ 2016-03-14  8:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-parisc, Helge Deller, linux-kernel, stable,
	James E.J. Bottomley, Pekka Enberg, linux-mm, Noam Camus,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-snps-arc,
	Christoph Lameter

On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote:

> ---
> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()
> 
> __clear_bit_unlock() is a special little snowflake. While it carries the
> non-atomic '__' prefix, it is specifically documented to pair with
> test_and_set_bit() and therefore should be 'somewhat' atomic.
> 
> Therefore the generic implementation of __clear_bit_unlock() cannot use
> the fully non-atomic __clear_bit() as a default.
> 
> If an arch is able to do better, it must provide an implementation of
> __clear_bit_unlock() itself.
> 
> Specifically, this came up as a result of hackbench livelock'ing in
> slab_lock() on ARC with SMP + SLUB + !LLSC.
> 
> The issue was incorrect pairing of atomic ops.
> 
> slab_lock() -> bit_spin_lock() -> test_and_set_bit()
> slab_unlock() -> __bit_spin_unlock() -> __clear_bit()
> 
> The non serializing __clear_bit() was getting "lost"
> 
> 80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
> 80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
> 80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)
> 
> Fixes ARC STAR 9000817404 (and probably more).
> 
> Cc: stable@vger.kernel.org
> Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
> Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Peter, I don't see this in linux-next yet. I'm hoping you will send it Linus' way
for 4.6-rc1.

Thx,
-Vineet

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [tip:locking/urgent] bitops: Do not default to __clear_bit() for __clear_bit_unlock()
  2016-03-09 11:40             ` Peter Zijlstra
  2016-03-09 11:53               ` Vineet Gupta
  2016-03-14  8:05               ` Vineet Gupta
@ 2016-03-21 11:16               ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 19+ messages in thread
From: tip-bot for Peter Zijlstra @ 2016-03-21 11:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: rientjes, jejb, mingo, deller, cl, akpm, linux-kernel,
	Vineet.Gupta1, peterz, iamjoonsoo.kim, hpa, tglx, penberg,
	torvalds, paulmck, noamc

Commit-ID:  f75d48644c56a31731d17fa693c8175328957e1d
Gitweb:     http://git.kernel.org/tip/f75d48644c56a31731d17fa693c8175328957e1d
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 9 Mar 2016 12:40:54 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 21 Mar 2016 10:50:48 +0100

bitops: Do not default to __clear_bit() for __clear_bit_unlock()

__clear_bit_unlock() is a special little snowflake. While it carries the
non-atomic '__' prefix, it is specifically documented to pair with
test_and_set_bit() and therefore should be 'somewhat' atomic.

Therefore the generic implementation of __clear_bit_unlock() cannot use
the fully non-atomic __clear_bit() as a default.

If an arch is able to do better, it must provide an implementation of
__clear_bit_unlock() itself.

Specifically, this came up as a result of hackbench livelock'ing in
slab_lock() on ARC with SMP + SLUB + !LLSC.

The issue was incorrect pairing of atomic ops.

 slab_lock() -> bit_spin_lock() -> test_and_set_bit()
 slab_unlock() -> __bit_spin_unlock() -> __clear_bit()

The non serializing __clear_bit() was getting "lost"

 80543b8e:	ld_s       r2,[r13,0] <--- (A) Finds PG_locked is set
 80543b90:	or         r3,r2,1    <--- (B) other core unlocks right here
 80543b94:	st_s       r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)

Fixes ARC STAR 9000817404 (and probably more).

Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160309114054.GJ6356@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/asm-generic/bitops/lock.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index c30266e..8ef0ccb 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -29,16 +29,16 @@ do {					\
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
- * This operation is like clear_bit_unlock, however it is not atomic.
- * It does provide release barrier semantics so it can be used to unlock
- * a bit lock, however it would only be used if no other CPU can modify
- * any bits in the memory until the lock is released (a good example is
- * if the bit lock itself protects access to the other bits in the word).
+ * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
+ * the bits in the word are protected by this lock some archs can use weaker
+ * ops to safely unlock.
+ *
+ * See for example x86's implementation.
  */
 #define __clear_bit_unlock(nr, addr)	\
 do {					\
-	smp_mb();			\
-	__clear_bit(nr, addr);		\
+	smp_mb__before_atomic();	\
+	clear_bit(nr, addr);		\
 } while (0)
 
 #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */

^ permalink raw reply related	[flat|nested] 19+ messages in thread

Thread overview: 19+ messages
2016-03-08 14:30 [PATCH] mm: slub: Ensure that slab_unlock() is atomic Vineet Gupta
2016-03-08 15:00 ` Christoph Lameter
2016-03-08 15:46   ` Vineet Gupta
2016-03-08 20:40     ` Christoph Lameter
2016-03-09  6:43       ` Vineet Gupta
2016-03-09 10:13         ` Peter Zijlstra
2016-03-09 10:31           ` Peter Zijlstra
2016-03-09 11:12             ` Vineet Gupta
2016-03-09 11:00           ` Vineet Gupta
2016-03-09 11:40             ` Peter Zijlstra
2016-03-09 11:53               ` Vineet Gupta
2016-03-09 12:22                 ` Peter Zijlstra
2016-03-14  8:05               ` Vineet Gupta
2016-03-21 11:16               ` [tip:locking/urgent] bitops: Do not default to __clear_bit() for __clear_bit_unlock() tip-bot for Peter Zijlstra
2016-03-09 13:22           ` [PATCH] mm: slub: Ensure that slab_unlock() is atomic Vineet Gupta
2016-03-09 14:51             ` Peter Zijlstra
2016-03-10  5:51               ` Vineet Gupta
2016-03-10  9:10                 ` Peter Zijlstra
2016-03-08 15:32 ` Vlastimil Babka
