From: Peter Zijlstra <peterz@infradead.org> To: Michael Ellerman <mpe@ellerman.id.au> Cc: Linus Torvalds <torvalds@linux-foundation.org>, dja@axtens.net, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, christophe.leroy@c-s.fr, linux-arch@vger.kernel.org, Will Deacon <will@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Segher Boessenkool <segher@kernel.crashing.org>, Arnd Bergmann <arnd@arndb.de>, Christian Borntraeger <borntraeger@de.ibm.com> Subject: Re: READ_ONCE() + STACKPROTECTOR_STRONG == :/ (was Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops)) Date: Thu, 12 Dec 2019 09:01:05 +0100 [thread overview] Message-ID: <20191212080105.GV2844@hirez.programming.kicks-ass.net> (raw) In-Reply-To: <875zimp0ay.fsf@mpe.ellerman.id.au> On Thu, Dec 12, 2019 at 04:42:13PM +1100, Michael Ellerman wrote: > [ trimmed CC a bit ] > > Peter Zijlstra <peterz@infradead.org> writes: > > On Fri, Dec 06, 2019 at 11:46:11PM +1100, Michael Ellerman wrote: > ... > > you write: > > > > "Currently bitops-instrumented.h assumes that the architecture provides > > atomic, non-atomic and locking bitops (e.g. both set_bit and __set_bit). > > This is true on x86 and s390, but is not always true: there is a > > generic bitops/non-atomic.h header that provides generic non-atomic > > operations, and also a generic bitops/lock.h for locking operations." > > > > Is there any actual benefit for PPC to using their own atomic bitops > > over bitops/lock.h ? I'm thinking that the generic code is fairly > > optimal for most LL/SC architectures. > > Yes and no :) > > Some of the generic versions don't generate good code compared to our > versions, but that's because READ_ONCE() is triggering stack protector > to be enabled. Bah, there's never anything simple, is there :/ > For example, comparing an out-of-line copy of the generic and ppc > versions of test_and_set_bit_lock(): > > 1 <generic_test_and_set_bit_lock>: 1 <ppc_test_and_set_bit_lock>: > 2 addis r2,r12,361 > 3 addi r2,r2,-4240 > 4 stdu r1,-48(r1) > 5 rlwinm r8,r3,29,3,28 > 6 clrlwi r10,r3,26 2 rldicl r10,r3,58,6 > 7 ld r9,3320(r13) > 8 std r9,40(r1) > 9 li r9,0 > 10 li r9,1 3 li r9,1 > 4 clrlwi r3,r3,26 > 5 rldicr r10,r10,3,60 > 11 sld r9,r9,r10 6 sld r3,r9,r3 > 12 add r10,r4,r8 7 add r4,r4,r10 > 13 ldx r8,r4,r8 > 14 and. r8,r9,r8 > 15 bne 34f > 16 ldarx r7,0,r10 8 ldarx r9,0,r4,1 > 17 or r8,r9,r7 9 or r10,r9,r3 > 18 stdcx. r8,0,r10 10 stdcx. r10,0,r4 > 19 bne- 16b 11 bne- 8b > 20 isync 12 isync > 21 and r9,r7,r9 13 and r3,r3,r9 > 22 addic r7,r9,-1 14 addic r9,r3,-1 > 23 subfe r7,r7,r9 15 subfe r3,r9,r3 > 24 ld r9,40(r1) > 25 ld r10,3320(r13) > 26 xor. r9,r9,r10 > 27 li r10,0 > 28 mr r3,r7 > 29 bne 36f > 30 addi r1,r1,48 > 31 blr 16 blr > 32 nop > 33 nop > 34 li r7,1 > 35 b 24b > 36 mflr r0 > 37 std r0,64(r1) > 38 bl <__stack_chk_fail+0x8> > > > If you squint, the generated code for the actual logic is pretty similar, but > the stack protector gunk makes a big mess. It's particularly bad here > because the ppc version doesn't even need a stack frame. > > I've also confirmed that even when test_and_set_bit_lock() is inlined > into an actual call site the stack protector logic still triggers. > If I change the READ_ONCE() in test_and_set_bit_lock(): > > if (READ_ONCE(*p) & mask) > return 1; > > to a regular pointer access: > > if (*p & mask) > return 1; > > Then the generated code looks more or less the same, except for the extra early > return in the generic version of test_and_set_bit_lock(), and different handling > of the return code by the compiler. So given that the function signature is: static inline int test_and_set_bit_lock(unsigned int nr, volatile unsigned long *p) @p already carries the required volatile qualifier, so READ_ONCE() does not add anything here (except for easier to read code and poor code generation). So your proposed change _should_ be fine. Will, I'm assuming you never saw this on your ARGH64 builds when you did this code ? --- diff --git a/include/asm-generic/bitops/atomic.h b/include/asm-generic/bitops/atomic.h index dd90c9792909..10264e8808f8 100644 --- a/include/asm-generic/bitops/atomic.h +++ b/include/asm-generic/bitops/atomic.h @@ -35,7 +35,7 @@ static inline int test_and_set_bit(unsigned int nr, volatile unsigned long *p) unsigned long mask = BIT_MASK(nr); p += BIT_WORD(nr); - if (READ_ONCE(*p) & mask) + if (*p & mask) return 1; old = atomic_long_fetch_or(mask, (atomic_long_t *)p); @@ -48,7 +48,7 @@ static inline int test_and_clear_bit(unsigned int nr, volatile unsigned long *p) unsigned long mask = BIT_MASK(nr); p += BIT_WORD(nr); - if (!(READ_ONCE(*p) & mask)) + if (!(*p & mask)) return 0; old = atomic_long_fetch_andnot(mask, (atomic_long_t *)p); diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index 3ae021368f48..9baf0a0055b8 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -22,7 +22,7 @@ static inline int test_and_set_bit_lock(unsigned int nr, unsigned long mask = BIT_MASK(nr); p += BIT_WORD(nr); - if (READ_ONCE(*p) & mask) + if (*p & mask) return 1; old = atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p); @@ -60,7 +60,7 @@ static inline void __clear_bit_unlock(unsigned int nr, unsigned long old; p += BIT_WORD(nr); - old = READ_ONCE(*p); + old = *p; old &= ~BIT_MASK(nr); atomic_long_set_release((atomic_long_t *)p, old); }
WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org> To: Michael Ellerman <mpe@ellerman.id.au> Cc: linux-arch@vger.kernel.org, Will Deacon <will@kernel.org>, Linus Torvalds <torvalds@linux-foundation.org>, linux-kernel@vger.kernel.org, Christian Borntraeger <borntraeger@de.ibm.com>, Arnd Bergmann <arnd@arndb.de>, Mark Rutland <mark.rutland@arm.com>, linuxppc-dev@lists.ozlabs.org, dja@axtens.net Subject: Re: READ_ONCE() + STACKPROTECTOR_STRONG == :/ (was Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops)) Date: Thu, 12 Dec 2019 09:01:05 +0100 [thread overview] Message-ID: <20191212080105.GV2844@hirez.programming.kicks-ass.net> (raw) In-Reply-To: <875zimp0ay.fsf@mpe.ellerman.id.au> On Thu, Dec 12, 2019 at 04:42:13PM +1100, Michael Ellerman wrote: > [ trimmed CC a bit ] > > Peter Zijlstra <peterz@infradead.org> writes: > > On Fri, Dec 06, 2019 at 11:46:11PM +1100, Michael Ellerman wrote: > ... > > you write: > > > > "Currently bitops-instrumented.h assumes that the architecture provides > > atomic, non-atomic and locking bitops (e.g. both set_bit and __set_bit). > > This is true on x86 and s390, but is not always true: there is a > > generic bitops/non-atomic.h header that provides generic non-atomic > > operations, and also a generic bitops/lock.h for locking operations." > > > > Is there any actual benefit for PPC to using their own atomic bitops > > over bitops/lock.h ? I'm thinking that the generic code is fairly > > optimal for most LL/SC architectures. > > Yes and no :) > > Some of the generic versions don't generate good code compared to our > versions, but that's because READ_ONCE() is triggering stack protector > to be enabled. Bah, there's never anything simple, is there :/ > For example, comparing an out-of-line copy of the generic and ppc > versions of test_and_set_bit_lock(): > > 1 <generic_test_and_set_bit_lock>: 1 <ppc_test_and_set_bit_lock>: > 2 addis r2,r12,361 > 3 addi r2,r2,-4240 > 4 stdu r1,-48(r1) > 5 rlwinm r8,r3,29,3,28 > 6 clrlwi r10,r3,26 2 rldicl r10,r3,58,6 > 7 ld r9,3320(r13) > 8 std r9,40(r1) > 9 li r9,0 > 10 li r9,1 3 li r9,1 > 4 clrlwi r3,r3,26 > 5 rldicr r10,r10,3,60 > 11 sld r9,r9,r10 6 sld r3,r9,r3 > 12 add r10,r4,r8 7 add r4,r4,r10 > 13 ldx r8,r4,r8 > 14 and. r8,r9,r8 > 15 bne 34f > 16 ldarx r7,0,r10 8 ldarx r9,0,r4,1 > 17 or r8,r9,r7 9 or r10,r9,r3 > 18 stdcx. r8,0,r10 10 stdcx. r10,0,r4 > 19 bne- 16b 11 bne- 8b > 20 isync 12 isync > 21 and r9,r7,r9 13 and r3,r3,r9 > 22 addic r7,r9,-1 14 addic r9,r3,-1 > 23 subfe r7,r7,r9 15 subfe r3,r9,r3 > 24 ld r9,40(r1) > 25 ld r10,3320(r13) > 26 xor. r9,r9,r10 > 27 li r10,0 > 28 mr r3,r7 > 29 bne 36f > 30 addi r1,r1,48 > 31 blr 16 blr > 32 nop > 33 nop > 34 li r7,1 > 35 b 24b > 36 mflr r0 > 37 std r0,64(r1) > 38 bl <__stack_chk_fail+0x8> > > > If you squint, the generated code for the actual logic is pretty similar, but > the stack protector gunk makes a big mess. It's particularly bad here > because the ppc version doesn't even need a stack frame. > > I've also confirmed that even when test_and_set_bit_lock() is inlined > into an actual call site the stack protector logic still triggers. > If I change the READ_ONCE() in test_and_set_bit_lock(): > > if (READ_ONCE(*p) & mask) > return 1; > > to a regular pointer access: > > if (*p & mask) > return 1; > > Then the generated code looks more or less the same, except for the extra early > return in the generic version of test_and_set_bit_lock(), and different handling > of the return code by the compiler. So given that the function signature is: static inline int test_and_set_bit_lock(unsigned int nr, volatile unsigned long *p) @p already carries the required volatile qualifier, so READ_ONCE() does not add anything here (except for easier to read code and poor code generation). So your proposed change _should_ be fine. Will, I'm assuming you never saw this on your ARGH64 builds when you did this code ? --- diff --git a/include/asm-generic/bitops/atomic.h b/include/asm-generic/bitops/atomic.h index dd90c9792909..10264e8808f8 100644 --- a/include/asm-generic/bitops/atomic.h +++ b/include/asm-generic/bitops/atomic.h @@ -35,7 +35,7 @@ static inline int test_and_set_bit(unsigned int nr, volatile unsigned long *p) unsigned long mask = BIT_MASK(nr); p += BIT_WORD(nr); - if (READ_ONCE(*p) & mask) + if (*p & mask) return 1; old = atomic_long_fetch_or(mask, (atomic_long_t *)p); @@ -48,7 +48,7 @@ static inline int test_and_clear_bit(unsigned int nr, volatile unsigned long *p) unsigned long mask = BIT_MASK(nr); p += BIT_WORD(nr); - if (!(READ_ONCE(*p) & mask)) + if (!(*p & mask)) return 0; old = atomic_long_fetch_andnot(mask, (atomic_long_t *)p); diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index 3ae021368f48..9baf0a0055b8 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -22,7 +22,7 @@ static inline int test_and_set_bit_lock(unsigned int nr, unsigned long mask = BIT_MASK(nr); p += BIT_WORD(nr); - if (READ_ONCE(*p) & mask) + if (*p & mask) return 1; old = atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p); @@ -60,7 +60,7 @@ static inline void __clear_bit_unlock(unsigned int nr, unsigned long old; p += BIT_WORD(nr); - old = READ_ONCE(*p); + old = *p; old &= ~BIT_MASK(nr); atomic_long_set_release((atomic_long_t *)p, old); }
next prev parent reply other threads:[~2019-12-12 8:01 UTC|newest] Thread overview: 86+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-12-06 12:46 [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops) Michael Ellerman 2019-12-06 12:46 ` Michael Ellerman 2019-12-06 13:16 ` Peter Zijlstra 2019-12-06 13:16 ` Peter Zijlstra 2019-12-10 5:38 ` Michael Ellerman 2019-12-10 5:38 ` Michael Ellerman 2019-12-10 10:15 ` Peter Zijlstra 2019-12-10 10:15 ` Peter Zijlstra 2019-12-11 0:29 ` Michael Ellerman 2019-12-11 0:29 ` Michael Ellerman 2019-12-12 5:42 ` READ_ONCE() + STACKPROTECTOR_STRONG == :/ (was Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops)) Michael Ellerman 2019-12-12 5:42 ` Michael Ellerman 2019-12-12 8:01 ` Peter Zijlstra [this message] 2019-12-12 8:01 ` Peter Zijlstra 2019-12-12 10:07 ` Will Deacon 2019-12-12 10:07 ` Will Deacon 2019-12-12 10:46 ` Peter Zijlstra 2019-12-12 10:46 ` Peter Zijlstra 2019-12-12 17:04 ` Will Deacon 2019-12-12 17:04 ` Will Deacon 2019-12-12 17:16 ` Will Deacon 2019-12-12 17:16 ` Will Deacon 2019-12-12 17:41 ` Linus Torvalds 2019-12-12 17:41 ` Linus Torvalds 2019-12-12 17:50 ` Segher Boessenkool 2019-12-12 17:50 ` Segher Boessenkool 2019-12-12 18:06 ` Will Deacon 2019-12-12 18:06 ` Will Deacon 2019-12-12 18:29 ` Christian Borntraeger 2019-12-12 18:29 ` Christian Borntraeger 2019-12-12 18:43 ` Linus Torvalds 2019-12-12 18:43 ` Linus Torvalds 2019-12-12 19:34 ` Will Deacon 2019-12-12 19:34 ` Will Deacon 2019-12-12 20:21 ` Peter Zijlstra 2019-12-12 20:21 ` Peter Zijlstra 2019-12-12 20:53 ` Peter Zijlstra 2019-12-12 20:53 ` Peter Zijlstra 2019-12-13 10:47 ` Luc Van Oostenryck 2019-12-13 10:47 ` Luc Van Oostenryck 2019-12-13 12:56 ` Peter Zijlstra 2019-12-13 12:56 ` Peter Zijlstra 2019-12-13 14:28 ` Luc Van Oostenryck 2019-12-13 14:28 ` Luc Van Oostenryck 2019-12-12 20:49 ` Linus Torvalds 2019-12-12 20:49 ` Linus Torvalds 2019-12-13 13:17 ` Arnd Bergmann 2019-12-13 13:17 ` Arnd Bergmann 2019-12-13 21:32 ` Arnd Bergmann 2019-12-13 21:32 ` Arnd Bergmann 2019-12-13 22:01 ` Linus Torvalds 2019-12-13 22:01 ` Linus Torvalds 2019-12-16 10:28 ` Will Deacon 2019-12-16 10:28 ` Will Deacon 2019-12-16 11:47 ` Peter Zijlstra 2019-12-16 11:47 ` Peter Zijlstra 2019-12-16 12:06 ` Arnd Bergmann 2019-12-16 12:06 ` Arnd Bergmann 2019-12-17 17:07 ` Will Deacon 2019-12-17 17:07 ` Will Deacon 2019-12-17 18:04 ` Linus Torvalds 2019-12-17 18:04 ` Linus Torvalds 2019-12-17 18:05 ` Linus Torvalds 2019-12-17 18:05 ` Linus Torvalds 2019-12-17 18:31 ` Will Deacon 2019-12-17 18:31 ` Will Deacon 2019-12-17 18:32 ` Linus Torvalds 2019-12-17 18:32 ` Linus Torvalds 2019-12-18 12:17 ` Michael Ellerman 2019-12-18 12:17 ` Michael Ellerman 2019-12-19 12:11 ` Will Deacon 2019-12-19 12:11 ` Will Deacon 2019-12-18 10:22 ` Christian Borntraeger 2019-12-18 10:22 ` Christian Borntraeger 2019-12-18 10:35 ` Will Deacon 2019-12-18 10:35 ` Will Deacon 2019-12-13 12:07 ` Michael Ellerman 2019-12-13 12:07 ` Michael Ellerman 2019-12-13 13:53 ` Segher Boessenkool 2019-12-13 13:53 ` Segher Boessenkool 2019-12-13 21:06 ` Michael Ellerman 2019-12-13 21:06 ` Michael Ellerman 2019-12-12 15:10 ` Segher Boessenkool 2019-12-12 15:10 ` Segher Boessenkool 2019-12-06 22:15 ` [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops) pr-tracker-bot 2019-12-06 22:15 ` pr-tracker-bot
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20191212080105.GV2844@hirez.programming.kicks-ass.net \ --to=peterz@infradead.org \ --cc=arnd@arndb.de \ --cc=borntraeger@de.ibm.com \ --cc=christophe.leroy@c-s.fr \ --cc=dja@axtens.net \ --cc=linux-arch@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linuxppc-dev@lists.ozlabs.org \ --cc=mark.rutland@arm.com \ --cc=mpe@ellerman.id.au \ --cc=segher@kernel.crashing.org \ --cc=torvalds@linux-foundation.org \ --cc=will@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.