linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Tom Rix <trix@redhat.com>,
	linux-kernel@vger.kernel.org, llvm@lists.linux.dev,
	Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Subject: [PATCH 1/2] x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant expressions
Date: Tue, 10 May 2022 23:25:49 +0900	[thread overview]
Message-ID: <20220510142550.1686866-2-mailhol.vincent@wanadoo.fr> (raw)
In-Reply-To: <20220510142550.1686866-1-mailhol.vincent@wanadoo.fr>

For x86_64, the current ffs() implementation does not produce
optimized code when called with a constant expression. On the
contrary, the __builtin_ffs() function of both GCC and clang is able
to simplify the expression into a single instruction.

* Example *

Let's consider two dummy functions foo() and bar() as below:

| #include <linux/bitops.h>
| #define CONST 0x01000000
|
| unsigned int foo(void)
| {
| 	return ffs(CONST);
| }
|
| unsigned int bar(void)
| {
| 	return __builtin_ffs(CONST);
| }

GCC would produce below assembly code:

| 0000000000000000 <foo>:
|    0:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
|    5:	0f bc c7             	bsf    %edi,%eax
|    8:	83 c0 01             	add    $0x1,%eax
|    b:	c3                   	ret
|    c:	0f 1f 40 00          	nopl   0x0(%rax)
|
| 0000000000000010 <bar>:
|   10:	b8 19 00 00 00       	mov    $0x19,%eax
|   15:	c3                   	ret

And clang would produce:

| 0000000000000000 <foo>:
|    0:	55                   	push   %rbp
|    1:	48 89 e5             	mov    %rsp,%rbp
|    4:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
|    9:	0f bc 05 00 00 00 00 	bsf    0x0(%rip),%eax        # 10 <foo+0x10>
|   10:	ff c0                	inc    %eax
|   12:	5d                   	pop    %rbp
|   13:	c3                   	ret
|   14:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
|   1b:	00 00 00
|   1e:	66 90                	xchg   %ax,%ax
|
| 0000000000000020 <bar>:
|   20:	55                   	push   %rbp
|   21:	48 89 e5             	mov    %rsp,%rbp
|   24:	b8 19 00 00 00       	mov    $0x19,%eax
|   29:	5d                   	pop    %rbp
|   2a:	c3                   	ret

For both examples, we clearly see the benefit of using __builtin_ffs()
instead of the kernel's asm implementation for constant expressions.

However, for non constant expressions, the ffs() asm version of the
kernel remains better for x86_64 because, contrary to GCC, it doesn't
emit the CMOV assembly instruction, c.f. [1] (noticeably, clang is
able optimize out the CMOV call).

This patch uses the __builtin_constant_p() to select between the
kernel's ffs() and the __builtin_ffs() depending on whether the
argument is constant or not.


As a side benefit, this patch also removes below -Wshadow warning:

| ./arch/x86/include/asm/bitops.h:283:28: warning: declaration of 'ffs' shadows a built-in function [-Wshadow]
|   283 | static __always_inline int ffs(int x)

[1] commit ca3d30cc02f7 ("x86_64, asm: Optimise fls(), ffs() and fls64()")
http://lkml.kernel.org/r/20111213145654.14362.39868.stgit@warthog.procyon.org.uk


Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
---
 arch/x86/include/asm/bitops.h | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index a288ecd230ab..535a7a358c14 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -269,18 +269,7 @@ static __always_inline unsigned long __fls(unsigned long word)
 #undef ADDR
 
 #ifdef __KERNEL__
-/**
- * ffs - find first set bit in word
- * @x: the word to search
- *
- * This is defined the same way as the libc and compiler builtin ffs
- * routines, therefore differs in spirit from the other bitops.
- *
- * ffs(value) returns 0 if value is 0 or the position of the first
- * set bit if value is nonzero. The first (least significant) bit
- * is at position 1.
- */
-static __always_inline int ffs(int x)
+static __always_inline int __ffs_asm(int x)
 {
 	int r;
 
@@ -310,6 +299,22 @@ static __always_inline int ffs(int x)
 	return r + 1;
 }
 
+/**
+ * ffs - find first set bit in word
+ * @x: the word to search
+ *
+ * This is defined the same way as the libc and compiler builtin ffs
+ * routines, therefore differs in spirit from the other bitops.
+ *
+ * ffs(value) returns 0 if value is 0 or the position of the first
+ * set bit if value is nonzero. The first (least significant) bit
+ * is at position 1.
+ */
+#define ffs(x)					\
+	 (__builtin_constant_p(x) ?		\
+	  __builtin_ffs(x) :			\
+	  __ffs_asm(x))
+
 /**
  * fls - find last set bit in word
  * @x: the word to search
-- 
2.35.1


  reply	other threads:[~2022-05-10 15:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-10 14:25 [PATCH 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions Vincent Mailhol
2022-05-10 14:25 ` Vincent Mailhol [this message]
2022-05-10 22:29   ` [PATCH 1/2] x86/asm/bitops: ffs: use __builtin_ffs to evaluate " Nick Desaulniers
2022-05-10 23:54     ` Vincent MAILHOL
2022-05-11 15:38       ` Vincent MAILHOL
2022-05-10 14:25 ` [PATCH 2/2] x86/asm/bitops: __ffs,ffz: use __builtin_ctzl " Vincent Mailhol
2022-05-10 22:14 ` [PATCH 0/2] x86/asm/bitops: optimize ff{s,z} functions for " Nick Desaulniers
2022-05-10 23:24   ` Vincent MAILHOL
2022-05-11 14:46     ` Vincent MAILHOL

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220510142550.1686866-2-mailhol.vincent@wanadoo.fr \
    --to=mailhol.vincent@wanadoo.fr \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=mingo@redhat.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=tglx@linutronix.de \
    --cc=trix@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).