linux-kernel.vger.kernel.org archive mirror
From: Alexey Dobriyan <adobriyan@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: x86@kernel.org, tglx@linutronix.de, mingo@redhat.com,
	hpa@zytor.com, Alexey Dobriyan <adobriyan@gmail.com>
Subject: [PATCH v0 1/5] x86_64: march=native support
Date: Fri,  8 Dec 2017 01:41:50 +0300
Message-ID: <20171207224154.4687-1-adobriyan@gmail.com>

As a Gentoo user, part of me died every time I compiled the kernel
with plain -O2 while userspace had been running with "-march=native -O2"
for years.

This patch implements building the kernel with "-march=native", at last.
So far the resulting kernel is good enough to boot in a VM.

Benchmarks:
	No serious benchmarking has been done yet. :-(

Random microbenchmarking indicates that a) SHA-1 built with SHLX et al can
be ~10% faster than the regular one, as there are no carry flag dependencies,
and b) a REP STOSB clear_page() can be ~15% faster than the REP STOSQ one
where fast REP STOSB is advertised. This is actually important because
clear_page()/copy_page() are regularly seen at the top of kernel profiles.

Code size:
SHLX et al bloat the kernel quite a lot, as these new instructions live in
the extended opcode space. However, this is offset by telling gcc to use
REP STOSB/MOVSB; gcc loves to unroll memset/memcpy to ungodly amounts.

These 2 effects roughly cancel each other out: shifts and memset/memcpy
are everywhere.

Regardless, code size is not the objective of this patch, performance is.

Support status:
	x86_64 only (haven't run 386 for a long time)
	Intel only (never owned an AMD box)

TODO:
	foolproof protection
	SSE2/AVX/AVX2/AVX-512 disabling (-mno-...)
	.config injection
	BMI2 for %08x/%016lx
	faster clear_user()
	RAID functions (ungodly unrolling, requires lots of courage)
	BPF JIT
	and of course more instructions which the kernel is forced to ignore
	because of generic kernels

If you want to try it out:
* make sure this kernel is only used on the machine it was compiled on
* grab a gcc with "-march=native" support (modern ones have it)
* select CONFIG_MARCH_NATIVE in the CPU choice menu
* add "unexpected options" to scripts/march-native.sh until the checks pass
* verify the CONFIG_MARCH_NATIVE options in .config, include/config/auto.conf
  and include/generated/autoconf.h
* cross your fingers, recompile and reboot
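
Once the generated options are in include/generated/autoconf.h, code can
branch on them at compile time. The following is a minimal sketch of that
pattern, not code from this series: sketch_hweight64() is a hypothetical
helper, and the macro name CONFIG_MARCH_NATIVE_POPCNT is assumed to be what
the script emits for -mpopcnt. The macro is defined by hand here only so the
example builds standalone.

```c
#include <stdint.h>

/*
 * Assumption: normally this would come from include/generated/autoconf.h.
 * It is defined here only so the sketch compiles on its own.
 */
#ifndef CONFIG_MARCH_NATIVE_POPCNT
#define CONFIG_MARCH_NATIVE_POPCNT 1
#endif

/* Hypothetical helper: count the set bits in a 64-bit word. */
static inline unsigned int sketch_hweight64(uint64_t w)
{
#ifdef CONFIG_MARCH_NATIVE_POPCNT
	/*
	 * With -mpopcnt (implied by -march=native on capable CPUs) gcc can
	 * emit a single POPCNT here; without it, it falls back to a
	 * library routine, so the result is the same either way.
	 */
	return __builtin_popcountll(w);
#else
	/* Generic bit-twiddling fallback for CPUs without POPCNT. */
	w -= (w >> 1) & 0x5555555555555555ULL;
	w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
	w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	return (w * 0x0101010101010101ULL) >> 56;
#endif
}
```

Because the choice is made by the preprocessor, no runtime patching or
feature check is needed, which is only safe if the kernel never runs on
another machine.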

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---
 Makefile                   |  4 ++
 arch/x86/Kconfig.cpu       |  8 ++++
 scripts/kconfig/.gitignore |  1 +
 scripts/kconfig/Makefile   |  9 ++++-
 scripts/kconfig/cpuid.c    | 76 ++++++++++++++++++++++++++++++++++++
 scripts/march-native.sh    | 96 ++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 193 insertions(+), 1 deletion(-)
 create mode 100644 scripts/kconfig/cpuid.c
 create mode 100755 scripts/march-native.sh

diff --git a/Makefile b/Makefile
index 86bb80540cbd..c1cc730b81a8 100644
--- a/Makefile
+++ b/Makefile
@@ -587,6 +587,10 @@ ifeq ($(dot-config),1)
 # Read in config
 -include include/config/auto.conf
 
+ifdef CONFIG_MARCH_NATIVE
+KBUILD_CFLAGS += -march=native
+endif
+
 ifeq ($(KBUILD_EXTMOD),)
 # Read in dependencies to all Kconfig* files, make sure to run
 # oldconfig if changes are detected.
diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 4493e8c5d1ea..2e4750b6b891 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -287,6 +287,12 @@ config GENERIC_CPU
 	  Generic x86-64 CPU.
 	  Run equally well on all x86-64 CPUs.
 
+config MARCH_NATIVE
+	bool "-march=native"
+	depends on X86_64
+	---help---
+	  -march=native support.
+
 endchoice
 
 config X86_GENERIC
@@ -307,6 +313,7 @@ config X86_INTERNODE_CACHE_SHIFT
 	int
 	default "12" if X86_VSMP
 	default X86_L1_CACHE_SHIFT
+	depends on !MARCH_NATIVE
 
 config X86_L1_CACHE_SHIFT
 	int
@@ -314,6 +321,7 @@ config X86_L1_CACHE_SHIFT
 	default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
 	default "4" if MELAN || M486 || MGEODEGX1
 	default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
+	depends on !MARCH_NATIVE
 
 config X86_PPRO_FENCE
 	bool "PentiumPro memory ordering errata workaround"
diff --git a/scripts/kconfig/.gitignore b/scripts/kconfig/.gitignore
index 51f1c877b543..73ebca4b1888 100644
--- a/scripts/kconfig/.gitignore
+++ b/scripts/kconfig/.gitignore
@@ -14,6 +14,7 @@ gconf.glade.h
 # configuration programs
 #
 conf
+cpuid
 mconf
 nconf
 qconf
diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
index 297c1bf35140..7b43b66d4efa 100644
--- a/scripts/kconfig/Makefile
+++ b/scripts/kconfig/Makefile
@@ -21,24 +21,30 @@ unexport CONFIG_
 
 xconfig: $(obj)/qconf
 	$< $(silent) $(Kconfig)
+	$(Q)$(srctree)/scripts/march-native.sh $(CC) $(obj)/cpuid
 
 gconfig: $(obj)/gconf
 	$< $(silent) $(Kconfig)
+	$(Q)$(srctree)/scripts/march-native.sh $(CC) $(obj)/cpuid
 
 menuconfig: $(obj)/mconf
 	$< $(silent) $(Kconfig)
+	$(Q)$(srctree)/scripts/march-native.sh $(CC) $(obj)/cpuid
 
 config: $(obj)/conf
 	$< $(silent) --oldaskconfig $(Kconfig)
+	$(Q)$(srctree)/scripts/march-native.sh $(CC) $(obj)/cpuid
 
 nconfig: $(obj)/nconf
 	$< $(silent) $(Kconfig)
+	$(Q)$(srctree)/scripts/march-native.sh $(CC) $(obj)/cpuid
 
-silentoldconfig: $(obj)/conf
+silentoldconfig: $(obj)/conf $(obj)/cpuid
 	$(Q)mkdir -p include/config include/generated
 	$(Q)test -e include/generated/autoksyms.h || \
 	    touch   include/generated/autoksyms.h
 	$< $(silent) --$@ $(Kconfig)
+	$(Q)$(srctree)/scripts/march-native.sh $(CC) $(obj)/cpuid
 
 localyesconfig localmodconfig: $(obj)/streamline_config.pl $(obj)/conf
 	$(Q)mkdir -p include/config include/generated
@@ -190,6 +196,7 @@ qconf-objs	:= zconf.tab.o
 gconf-objs	:= gconf.o zconf.tab.o
 
 hostprogs-y := conf nconf mconf kxgettext qconf gconf
+hostprogs-y += cpuid
 
 clean-files	:= qconf.moc .tmp_qtcheck .tmp_gtkcheck
 clean-files	+= zconf.tab.c zconf.lex.c gconf.glade.h
diff --git a/scripts/kconfig/cpuid.c b/scripts/kconfig/cpuid.c
new file mode 100644
index 000000000000..f1983027fe2b
--- /dev/null
+++ b/scripts/kconfig/cpuid.c
@@ -0,0 +1,76 @@
+/*
+ * Copyright (c) 2017 Alexey Dobriyan <adobriyan@gmail.com>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+static inline bool streq(const char *s1, const char *s2)
+{
+	return strcmp(s1, s2) == 0;
+}
+
+static inline void cpuid(uint32_t eax0, uint32_t *eax, uint32_t *ecx, uint32_t *edx, uint32_t *ebx)
+{
+	asm volatile (
+		"cpuid"
+		: "=a" (*eax), "=c" (*ecx), "=d" (*edx), "=b" (*ebx)
+		: "0" (eax0)
+	);
+}
+
+static inline void cpuid2(uint32_t eax0, uint32_t ecx0, uint32_t *eax, uint32_t *ecx, uint32_t *edx, uint32_t *ebx)
+{
+	asm volatile (
+		"cpuid"
+		: "=a" (*eax), "=c" (*ecx), "=d" (*edx), "=b" (*ebx)
+		: "0" (eax0), "1" (ecx0)
+	);
+}
+
+static uint32_t eax0_max;
+
+static void intel(void)
+{
+	uint32_t eax, ecx, edx, ebx;
+
+	if (eax0_max >= 1) {
+		cpuid(1, &eax, &ecx, &edx, &ebx);
+//		printf("%08x %08x %08x %08x\n", eax, ecx, edx, ebx);
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	const char *opt = argv[1];
+	uint32_t eax, ecx, edx, ebx;
+
+	if (argc != 2)
+		return EXIT_FAILURE;
+
+	cpuid(0, &eax, &ecx, &edx, &ebx);
+//	printf("%08x %08x %08x %08x\n", eax, ecx, edx, ebx);
+	eax0_max = eax;
+
+	if (ecx == 0x6c65746e && edx == 0x49656e69 && ebx == 0x756e6547)
+		intel();
+
+#define _(x)	if (streq(opt, #x)) return x ? EXIT_SUCCESS : EXIT_FAILURE
+#undef _
+
+	return EXIT_FAILURE;
+}
diff --git a/scripts/march-native.sh b/scripts/march-native.sh
new file mode 100755
index 000000000000..4f0fc82f7722
--- /dev/null
+++ b/scripts/march-native.sh
@@ -0,0 +1,96 @@
+#!/bin/sh
+# Copyright (c) 2017 Alexey Dobriyan <adobriyan@gmail.com>
+if test "$(uname -m)" != x86_64; then
+	exit 0
+fi
+
+CC="$1"
+CPUID="$2"
+AUTOCONF1="include/config/auto.conf"
+AUTOCONF2="include/generated/autoconf.h"
+
+if ! grep -q -e '^CONFIG_MARCH_NATIVE=y$' .config; then
+	sed -i -e '/CONFIG_MARCH_NATIVE/d' "$AUTOCONF1" "$AUTOCONF2" >/dev/null 2>&1
+	exit 0
+fi
+
+if ! "$CC" -march=native -x c -c -o /dev/null /dev/null >/dev/null 2>&1; then
+	echo "error: unsupported '-march=native' compiler option" >&2
+	exit 1
+fi
+
+_option() {
+	echo "$1=$2"		>>"$AUTOCONF1"
+	echo "#define $1 $2"	>>"$AUTOCONF2"
+}
+
+option() {
+	echo "$1=y"		>>"$AUTOCONF1"
+	echo "#define $1 1"	>>"$AUTOCONF2"
+}
+
+if test ! -f "$AUTOCONF1" -o ! -f "$AUTOCONF2"; then
+	exit 0
+fi
+
+COLLECT_GCC_OPTIONS=$("$CC" -march=native -v -E -x c -c /dev/null 2>&1 | sed -ne '/^COLLECT_GCC_OPTIONS=/{n;p}')
+echo $COLLECT_GCC_OPTIONS
+
+for i in $COLLECT_GCC_OPTIONS; do
+	case $i in
+		*/cc1|-E|-quiet|-v|/dev/null|--param|-fstack-protector*|-mno-*)
+			;;
+
+		# FIXME
+		l1-cache-size=*);;
+		l2-cache-size=*);;
+
+		l1-cache-line-size=64)
+			_option "CONFIG_X86_L1_CACHE_SHIFT"		6
+			_option "CONFIG_X86_INTERNODE_CACHE_SHIFT"	6
+			;;
+
+		-march=broadwell);;
+		-mtune=broadwell);;
+		-march=nehalem);;
+		-mtune=nehalem);;
+
+		-mabm)		option "CONFIG_MARCH_NATIVE_ABM"	;;
+		-madx)		option "CONFIG_MARCH_NATIVE_ADX"	;;
+		-maes)		option "CONFIG_MARCH_NATIVE_AES"	;;
+		-mavx)		option "CONFIG_MARCH_NATIVE_AVX"	;;
+		-mavx2)		option "CONFIG_MARCH_NATIVE_AVX2"	;;
+		-mbmi)		option "CONFIG_MARCH_NATIVE_BMI"	;;
+		-mbmi2)		option "CONFIG_MARCH_NATIVE_BMI2"	;;
+		-mcx16)		option "CONFIG_MARCH_NATIVE_CX16"	;;
+		-mf16c)		option "CONFIG_MARCH_NATIVE_F16C"	;;
+		-mfsgsbase)	option "CONFIG_MARCH_NATIVE_FSGSBASE"	;;
+		-mfma)		option "CONFIG_MARCH_NATIVE_FMA"	;;
+		-mfxsr)		option "CONFIG_MARCH_NATIVE_FXSR"	;;
+		-mhle)		option "CONFIG_MARCH_NATIVE_HLE"	;;
+		-mlzcnt)	option "CONFIG_MARCH_NATIVE_LZCNT"	;;
+		-mmmx)		option "CONFIG_MARCH_NATIVE_MMX"	;;
+		-mmovbe)	option "CONFIG_MARCH_NATIVE_MOVBE"	;;
+		-mpclmul)	option "CONFIG_MARCH_NATIVE_PCLMUL"	;;
+		-mpopcnt)	option "CONFIG_MARCH_NATIVE_POPCNT"	;;
+		-mprfchw)	option "CONFIG_MARCH_NATIVE_PREFETCHW"	;;
+		-mrdrnd)	option "CONFIG_MARCH_NATIVE_RDRND"	;;
+		-mrdseed)	option "CONFIG_MARCH_NATIVE_RDSEED"	;;
+		-mrtm)		option "CONFIG_MARCH_NATIVE_RTM"	;;
+		-msahf)		option "CONFIG_MARCH_NATIVE_SAHF"	;;
+		-msse)		option "CONFIG_MARCH_NATIVE_SSE"	;;
+		-msse2)		option "CONFIG_MARCH_NATIVE_SSE2"	;;
+		-msse3)		option "CONFIG_MARCH_NATIVE_SSE3"	;;
+		-msse4.1)	option "CONFIG_MARCH_NATIVE_SSE4_1"	;;
+		-msse4.2)	option "CONFIG_MARCH_NATIVE_SSE4_2"	;;
+		-mssse3)	option "CONFIG_MARCH_NATIVE_SSSE3"	;;
+		-mxsave)	option "CONFIG_MARCH_NATIVE_XSAVE"	;;
+		-mxsaveopt)	option "CONFIG_MARCH_NATIVE_XSAVEOPT"	;;
+
+		*)
+			echo >&2
+			echo "Unexpected -march=native option: '$i'" >&2
+			echo >&2
+			exit 1
+	esac
+done
-- 
2.13.6

Thread overview: 10+ messages
2017-12-07 22:41 Alexey Dobriyan [this message]
2017-12-07 22:41 ` [PATCH 2/5] -march=native: POPCNT support Alexey Dobriyan
2017-12-07 23:07   ` H. Peter Anvin
2017-12-08 10:09     ` Alexey Dobriyan
2017-12-07 22:41 ` [PATCH 3/5] -march=native: REP MOVSB support Alexey Dobriyan
2017-12-07 22:41 ` [PATCH 4/5] -march=native: REP STOSB Alexey Dobriyan
2017-12-08 19:08   ` Andi Kleen
2017-12-07 22:41 ` [PATCH 5/5] -march=native: MOVBE support Alexey Dobriyan
2017-12-07 23:32 ` [PATCH v0 1/5] x86_64: march=native support H. Peter Anvin
2017-12-08  9:57   ` Alexey Dobriyan
