From: Segher Boessenkool <segher@kernel.crashing.org> To: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Mackerras <paulus@samba.org>, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible Date: Mon, 20 Sep 2021 16:23:04 -0500 [thread overview] Message-ID: <20210920212303.GZ1583@gate.crashing.org> (raw) In-Reply-To: <db9d01d5c543c5add4b2beadb03d39e99c7ada2c.1632126669.git.christophe.leroy@csgroup.eu> Hi! On Mon, Sep 20, 2021 at 10:31:17AM +0200, Christophe Leroy wrote: > Today we get the following code generation for bitops like > set or clear bit: > > c0009fe0: 39 40 08 00 li r10,2048 > c0009fe4: 7c e0 40 28 lwarx r7,0,r8 > c0009fe8: 7c e7 53 78 or r7,r7,r10 > c0009fec: 7c e0 41 2d stwcx. r7,0,r8 > > c000d568: 39 00 18 00 li r8,6144 > c000d56c: 7c c0 38 28 lwarx r6,0,r7 > c000d570: 7c c6 40 78 andc r6,r6,r8 > c000d574: 7c c0 39 2d stwcx. r6,0,r7 > > Most set bits are constant on lower 16 bits, so it can easily > be replaced by the "immediate" version of the operation. Allow > GCC to choose between the normal or immediate form. You can also handle the second sixteen bits (the "shifted" half), by using oris etc. The "%eN" output modifier prints an "s" for this: /* If the low 16 bits are 0, but some other bit is set, write 's'. */ But this doesn't handle non-constant arguments, so you're likely better off using what you have noe. > For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for > when all bits to be cleared are consecutive. Or when all you want to keep are consecutive (you do handle that now :-) ) > On 64 bits we don't have any equivalent single operation for clearing, > single bits or a few bits, we'd need two 'rldicl' so it is not > worth it, the li/andc sequence is doing the same. You can use rlwinm whenever you want to clear all top 32 bits. A sometimes nice idiom is ori x,x,N ; xori x,x,N to clear the bits N (or oris/xoris). But it's two insns no matter what (but no spare register is needed). > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> > +static inline unsigned long test_and_clear_bits(unsigned long mask, volatile unsigned long *_p) > +{ > + unsigned long old, t; > + unsigned long *p = (unsigned long *)_p; > + > + if (IS_ENABLED(CONFIG_PPC32) && > + __builtin_constant_p(mask) && is_rlwinm_mask_valid(mask)) { is_rlwinm_mask_valid(~mask)? So that test_and_clear_bits(0, ...) will work with rlwinm, and test_and_clear_bits(0xffffffff, ...) will not make gas scream bloody murder ("illegal bitmask"). Tha mask you pass to the instruction is ~mask after all. Looks great except that one nit. Thanks :-) Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Segher
WARNING: multiple messages have this Message-ID
From: Segher Boessenkool <segher@kernel.crashing.org> To: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>, Paul Mackerras <paulus@samba.org>, Michael Ellerman <mpe@ellerman.id.au>, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible Date: Mon, 20 Sep 2021 16:23:04 -0500 [thread overview] Message-ID: <20210920212303.GZ1583@gate.crashing.org> (raw) In-Reply-To: <db9d01d5c543c5add4b2beadb03d39e99c7ada2c.1632126669.git.christophe.leroy@csgroup.eu> Hi! On Mon, Sep 20, 2021 at 10:31:17AM +0200, Christophe Leroy wrote: > Today we get the following code generation for bitops like > set or clear bit: > > c0009fe0: 39 40 08 00 li r10,2048 > c0009fe4: 7c e0 40 28 lwarx r7,0,r8 > c0009fe8: 7c e7 53 78 or r7,r7,r10 > c0009fec: 7c e0 41 2d stwcx. r7,0,r8 > > c000d568: 39 00 18 00 li r8,6144 > c000d56c: 7c c0 38 28 lwarx r6,0,r7 > c000d570: 7c c6 40 78 andc r6,r6,r8 > c000d574: 7c c0 39 2d stwcx. r6,0,r7 > > Most set bits are constant on lower 16 bits, so it can easily > be replaced by the "immediate" version of the operation. Allow > GCC to choose between the normal or immediate form. You can also handle the second sixteen bits (the "shifted" half), by using oris etc. The "%eN" output modifier prints an "s" for this: /* If the low 16 bits are 0, but some other bit is set, write 's'. */ But this doesn't handle non-constant arguments, so you're likely better off using what you have noe. > For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for > when all bits to be cleared are consecutive. Or when all you want to keep are consecutive (you do handle that now :-) ) > On 64 bits we don't have any equivalent single operation for clearing, > single bits or a few bits, we'd need two 'rldicl' so it is not > worth it, the li/andc sequence is doing the same. You can use rlwinm whenever you want to clear all top 32 bits. A sometimes nice idiom is ori x,x,N ; xori x,x,N to clear the bits N (or oris/xoris). But it's two insns no matter what (but no spare register is needed). > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> > +static inline unsigned long test_and_clear_bits(unsigned long mask, volatile unsigned long *_p) > +{ > + unsigned long old, t; > + unsigned long *p = (unsigned long *)_p; > + > + if (IS_ENABLED(CONFIG_PPC32) && > + __builtin_constant_p(mask) && is_rlwinm_mask_valid(mask)) { is_rlwinm_mask_valid(~mask)? So that test_and_clear_bits(0, ...) will work with rlwinm, and test_and_clear_bits(0xffffffff, ...) will not make gas scream bloody murder ("illegal bitmask"). Tha mask you pass to the instruction is ~mask after all. Looks great except that one nit. Thanks :-) Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Segher
next prev parent reply other threads:[~2021-09-20 21:25 UTC|newest] Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-09-20 8:31 Christophe Leroy 2021-09-20 8:31 ` Christophe Leroy 2021-09-20 8:31 ` [PATCH v4 2/3] powerpc/atomics: " Christophe Leroy 2021-09-20 8:31 ` Christophe Leroy 2021-09-20 8:31 ` [PATCH v4 3/3] powerpc/atomics: Remove atomic_inc()/atomic_dec() and friends Christophe Leroy 2021-09-20 8:31 ` Christophe Leroy 2021-09-20 21:23 ` Segher Boessenkool [this message] 2021-09-20 21:23 ` [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible Segher Boessenkool 2021-09-21 15:15 ` Christophe Leroy 2021-09-21 15:15 ` Christophe Leroy
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210920212303.GZ1583@gate.crashing.org \ --to=segher@kernel.crashing.org \ --cc=christophe.leroy@csgroup.eu \ --cc=linux-kernel@vger.kernel.org \ --cc=linuxppc-dev@lists.ozlabs.org \ --cc=paulus@samba.org \ --subject='Re: [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.