linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	dja@axtens.net, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, christophe.leroy@c-s.fr,
	linux-arch@vger.kernel.org, Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Christian Borntraeger <borntraeger@de.ibm.com>
Subject: Re: READ_ONCE() + STACKPROTECTOR_STRONG == :/ (was Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops))
Date: Thu, 12 Dec 2019 09:01:05 +0100	[thread overview]
Message-ID: <20191212080105.GV2844@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <875zimp0ay.fsf@mpe.ellerman.id.au>

On Thu, Dec 12, 2019 at 04:42:13PM +1100, Michael Ellerman wrote:
> [ trimmed CC a bit ]
> 
> Peter Zijlstra <peterz@infradead.org> writes:
> > On Fri, Dec 06, 2019 at 11:46:11PM +1100, Michael Ellerman wrote:
> ...
> > you write:
> >
> >   "Currently bitops-instrumented.h assumes that the architecture provides
> > atomic, non-atomic and locking bitops (e.g. both set_bit and __set_bit).
> > This is true on x86 and s390, but is not always true: there is a
> > generic bitops/non-atomic.h header that provides generic non-atomic
> > operations, and also a generic bitops/lock.h for locking operations."
> >
> > Is there any actual benefit for PPC to using their own atomic bitops
> > over bitops/lock.h ? I'm thinking that the generic code is fairly
> > optimal for most LL/SC architectures.
> 
> Yes and no :)
> 
> Some of the generic versions don't generate good code compared to our
> versions, but that's because READ_ONCE() is triggering stack protector
> to be enabled.

Bah, there's never anything simple, is there :/

> For example, comparing an out-of-line copy of the generic and ppc
> versions of test_and_set_bit_lock():
> 
>    1 <generic_test_and_set_bit_lock>:           1 <ppc_test_and_set_bit_lock>:
>    2         addis   r2,r12,361
>    3         addi    r2,r2,-4240
>    4         stdu    r1,-48(r1)
>    5         rlwinm  r8,r3,29,3,28
>    6         clrlwi  r10,r3,26                   2         rldicl  r10,r3,58,6
>    7         ld      r9,3320(r13)
>    8         std     r9,40(r1)
>    9         li      r9,0
>   10         li      r9,1                        3         li      r9,1
>                                                  4         clrlwi  r3,r3,26
>                                                  5         rldicr  r10,r10,3,60
>   11         sld     r9,r9,r10                   6         sld     r3,r9,r3
>   12         add     r10,r4,r8                   7         add     r4,r4,r10
>   13         ldx     r8,r4,r8
>   14         and.    r8,r9,r8
>   15         bne     34f
>   16         ldarx   r7,0,r10                    8         ldarx   r9,0,r4,1
>   17         or      r8,r9,r7                    9         or      r10,r9,r3
>   18         stdcx.  r8,0,r10                   10         stdcx.  r10,0,r4
>   19         bne-    16b                        11         bne-    8b
>   20         isync                              12         isync
>   21         and     r9,r7,r9                   13         and     r3,r3,r9
>   22         addic   r7,r9,-1                   14         addic   r9,r3,-1
>   23         subfe   r7,r7,r9                   15         subfe   r3,r9,r3
>   24         ld      r9,40(r1)
>   25         ld      r10,3320(r13)
>   26         xor.    r9,r9,r10
>   27         li      r10,0
>   28         mr      r3,r7
>   29         bne     36f
>   30         addi    r1,r1,48
>   31         blr                                16         blr
>   32         nop
>   33         nop
>   34         li      r7,1
>   35         b       24b
>   36         mflr    r0
>   37         std     r0,64(r1)
>   38         bl      <__stack_chk_fail+0x8>
> 
> 
> If you squint, the generated code for the actual logic is pretty similar, but
> the stack protector gunk makes a big mess. It's particularly bad here
> because the ppc version doesn't even need a stack frame.
> 
> I've also confirmed that even when test_and_set_bit_lock() is inlined
> into an actual call site the stack protector logic still triggers.

> If I change the READ_ONCE() in test_and_set_bit_lock():
> 
> 	if (READ_ONCE(*p) & mask)
> 		return 1;
> 
> to a regular pointer access:
> 
> 	if (*p & mask)
> 		return 1;
> 
> Then the generated code looks more or less the same, except for the extra early
> return in the generic version of test_and_set_bit_lock(), and different handling
> of the return code by the compiler.

So given that the function signature is:

static inline int test_and_set_bit_lock(unsigned int nr,
					volatile unsigned long *p)

@p already carries the required volatile qualifier, so READ_ONCE() does
not add anything here (except for easier to read code and poor code
generation).

So your proposed change _should_ be fine. Will, I'm assuming you never
saw this on your ARGH64 builds when you did this code ?

---
diff --git a/include/asm-generic/bitops/atomic.h b/include/asm-generic/bitops/atomic.h
index dd90c9792909..10264e8808f8 100644
--- a/include/asm-generic/bitops/atomic.h
+++ b/include/asm-generic/bitops/atomic.h
@@ -35,7 +35,7 @@ static inline int test_and_set_bit(unsigned int nr, volatile unsigned long *p)
 	unsigned long mask = BIT_MASK(nr);
 
 	p += BIT_WORD(nr);
-	if (READ_ONCE(*p) & mask)
+	if (*p & mask)
 		return 1;
 
 	old = atomic_long_fetch_or(mask, (atomic_long_t *)p);
@@ -48,7 +48,7 @@ static inline int test_and_clear_bit(unsigned int nr, volatile unsigned long *p)
 	unsigned long mask = BIT_MASK(nr);
 
 	p += BIT_WORD(nr);
-	if (!(READ_ONCE(*p) & mask))
+	if (!(*p & mask))
 		return 0;
 
 	old = atomic_long_fetch_andnot(mask, (atomic_long_t *)p);
diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index 3ae021368f48..9baf0a0055b8 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -22,7 +22,7 @@ static inline int test_and_set_bit_lock(unsigned int nr,
 	unsigned long mask = BIT_MASK(nr);
 
 	p += BIT_WORD(nr);
-	if (READ_ONCE(*p) & mask)
+	if (*p & mask)
 		return 1;
 
 	old = atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p);
@@ -60,7 +60,7 @@ static inline void __clear_bit_unlock(unsigned int nr,
 	unsigned long old;
 
 	p += BIT_WORD(nr);
-	old = READ_ONCE(*p);
+	old = *p;
 	old &= ~BIT_MASK(nr);
 	atomic_long_set_release((atomic_long_t *)p, old);
 }

  reply	other threads:[~2019-12-12  8:01 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-06 12:46 [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops) Michael Ellerman
2019-12-06 13:16 ` Peter Zijlstra
2019-12-10  5:38   ` Michael Ellerman
2019-12-10 10:15     ` Peter Zijlstra
2019-12-11  0:29       ` Michael Ellerman
2019-12-12  5:42   ` READ_ONCE() + STACKPROTECTOR_STRONG == :/ (was Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops)) Michael Ellerman
2019-12-12  8:01     ` Peter Zijlstra [this message]
2019-12-12 10:07       ` Will Deacon
2019-12-12 10:46         ` Peter Zijlstra
2019-12-12 17:04           ` Will Deacon
2019-12-12 17:16             ` Will Deacon
2019-12-12 17:41           ` Linus Torvalds
2019-12-12 17:50             ` Segher Boessenkool
2019-12-12 18:06             ` Will Deacon
2019-12-12 18:29               ` Christian Borntraeger
2019-12-12 18:43               ` Linus Torvalds
2019-12-12 19:34                 ` Will Deacon
2019-12-12 20:21                   ` Peter Zijlstra
2019-12-12 20:53                     ` Peter Zijlstra
2019-12-13 10:47                       ` Luc Van Oostenryck
2019-12-13 12:56                         ` Peter Zijlstra
2019-12-13 14:28                           ` Luc Van Oostenryck
2019-12-12 20:49                   ` Linus Torvalds
2019-12-13 13:17                     ` Arnd Bergmann
2019-12-13 21:32                       ` Arnd Bergmann
2019-12-13 22:01                         ` Linus Torvalds
2019-12-16 10:28                       ` Will Deacon
2019-12-16 11:47                         ` Peter Zijlstra
2019-12-16 12:06                         ` Arnd Bergmann
2019-12-17 17:07                     ` Will Deacon
2019-12-17 18:04                       ` Linus Torvalds
2019-12-17 18:05                         ` Linus Torvalds
2019-12-17 18:31                           ` Will Deacon
2019-12-17 18:32                         ` Linus Torvalds
2019-12-18 12:17                           ` Michael Ellerman
2019-12-19 12:11                           ` Will Deacon
2019-12-18 10:22                     ` Christian Borntraeger
2019-12-18 10:35                       ` Will Deacon
2019-12-13 12:07           ` Michael Ellerman
2019-12-13 13:53             ` Segher Boessenkool
2019-12-13 21:06               ` Michael Ellerman
2019-12-12 15:10     ` Segher Boessenkool
2019-12-06 22:15 ` [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops) pr-tracker-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191212080105.GV2844@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=arnd@arndb.de \
    --cc=borntraeger@de.ibm.com \
    --cc=christophe.leroy@c-s.fr \
    --cc=dja@axtens.net \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mark.rutland@arm.com \
    --cc=mpe@ellerman.id.au \
    --cc=segher@kernel.crashing.org \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).