From: Brian Gerst <brgerst@gmail.com>
To: Borislav Petkov <bp@suse.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	Andi Kleen <andi@firstfloor.org>,
	zengzhaoxiu@163.com, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC PATCH] x86/hweight: Get rid of the special calling convention
Date: Wed, 4 May 2016 15:31:11 -0400
Message-ID: <CAMzpN2h5AroNfCYcmdRCTBQuqBc_r+i+yMqg2aDFQjwYKWZGOA@mail.gmail.com>
In-Reply-To: <20160504184612.GC23257@pd.tnic>

On Wed, May 4, 2016 at 2:46 PM, Borislav Petkov <bp@suse.de> wrote:
> On Thu, Apr 07, 2016 at 11:43:33AM +0200, Borislav Petkov wrote:
>> I guess we can do something like this:
>>
>>        if (likely(static_cpu_has(X86_FEATURE_POPCNT)))
>>                asm volatile(POPCNT32
>>                             : "="REG_OUT (res)
>>                             : REG_IN (w));
>>        else
>>                res = __sw_hweight32(w);
>>
>> and get rid of the custom calling convention.
>>
>> Along with some numbers showing that the change doesn't cause any
>> noticeable slowdown...
>
> Ok, here's something which seems to build and boot in kvm.
>
> I like how we don't need the special calling conventions anymore and we
> can actually say "popcnt .." and gcc selects registers.
>
> The include files hackery is kinda nasty but I had to do it because I
> needed to be able to use static_cpu_has() in a header and including
> asm/cpufeature.h pulls in all kinds of nasty dependencies. I'm certainly
> open for better ideas...
>
> ---
> From: Borislav Petkov <bp@suse.de>
> Date: Wed, 4 May 2016 18:52:09 +0200
> Subject: [PATCH] x86/hweight: Get rid of the special calling convention
>
> People complained about ARCH_HWEIGHT_CFLAGS and how it throws a wrench
> into kcov, lto, etc, experimentation.
>
> And it's not like we absolutely need it so let's get rid of it and
> streamline it a bit. I had to do some carving out of facilities so
> that the include hell doesn't swallow me but other than that, the new
> __arch_hweight*() versions look much more palatable and gcc is more free
> to select registers than us hardcoding them in the insn bytes.
>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  arch/x86/Kconfig                      |   5 --
>  arch/x86/include/asm/arch_hweight.h   |  43 ++++---------
>  arch/x86/include/asm/cpufeature.h     | 112 +-------------------------------
>  arch/x86/include/asm/cpuinfo.h        |  65 +++++++++++++++++++
>  arch/x86/include/asm/processor.h      |  63 +-----------------
>  arch/x86/include/asm/static_cpu_has.h | 116 ++++++++++++++++++++++++++++++++++
>  lib/Makefile                          |   5 --
>  7 files changed, 197 insertions(+), 212 deletions(-)
>  create mode 100644 arch/x86/include/asm/cpuinfo.h
>  create mode 100644 arch/x86/include/asm/static_cpu_has.h
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 7bb15747fea2..79e0bcd61cb1 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -292,11 +292,6 @@ config X86_32_LAZY_GS
>         def_bool y
>         depends on X86_32 && !CC_STACKPROTECTOR
>
> -config ARCH_HWEIGHT_CFLAGS
> -       string
> -       default "-fcall-saved-ecx -fcall-saved-edx" if X86_32
> -       default "-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11" if X86_64
> -
>  config ARCH_SUPPORTS_UPROBES
>         def_bool y
>
> diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
> index 02e799fa43d1..6c1a2d500c4c 100644
> --- a/arch/x86/include/asm/arch_hweight.h
> +++ b/arch/x86/include/asm/arch_hweight.h
> @@ -2,36 +2,18 @@
>  #define _ASM_X86_HWEIGHT_H
>
>  #include <asm/cpufeatures.h>
> +#include <asm/static_cpu_has.h>
>
> -#ifdef CONFIG_64BIT
> -/* popcnt %edi, %eax -- redundant REX prefix for alignment */
> -#define POPCNT32 ".byte 0xf3,0x40,0x0f,0xb8,0xc7"
> -/* popcnt %rdi, %rax */
> -#define POPCNT64 ".byte 0xf3,0x48,0x0f,0xb8,0xc7"
> -#define REG_IN "D"
> -#define REG_OUT "a"
> -#else
> -/* popcnt %eax, %eax */
> -#define POPCNT32 ".byte 0xf3,0x0f,0xb8,0xc0"
> -#define REG_IN "a"
> -#define REG_OUT "a"
> -#endif
> -
> -/*
> - * __sw_hweightXX are called from within the alternatives below
> - * and callee-clobbered registers need to be taken care of. See
> - * ARCH_HWEIGHT_CFLAGS in <arch/x86/Kconfig> for the respective
> - * compiler switches.
> - */
>  static __always_inline unsigned int __arch_hweight32(unsigned int w)
>  {
> -       unsigned int res = 0;
> +       unsigned int res;
>
> -       asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
> -                    : "="REG_OUT (res)
> -                    : REG_IN (w));
> +       if (likely(static_cpu_has(X86_FEATURE_POPCNT))) {
> +               asm volatile("popcnt %[w], %[res]" : [res] "=r" (res) : [w] "r" (w));

Do all supported versions of the assembler know the popcnt
instruction?  That's why it was open-coded before.  The problem is
that Intel and AMD are constantly adding new instructions, and it
takes a long time for users' assemblers to catch up.

--
Brian Gerst
