* [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE)
@ 2022-12-26  8:44 Alexander Kanavin
  2022-12-26  8:44 ` [RFC PATCH 2/3] qemux86-64: build for x86-64-v3 (2013 Haswell and later) rather than Core 2 from 2006 Alexander Kanavin
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Alexander Kanavin @ 2022-12-26  8:44 UTC (permalink / raw)
  To: openembedded-core; +Cc: Alexander Kanavin

Qemu 7.2 finally allows us to move beyond building for original Core 2/Core i7 era hardware,
and this patch adds support for the newer generations. But first, a bit of
background:

Recently toolchains gained support for specifying x86-64 'levels' of
instruction set support; v3 corresponds to 2013-era Haswell CPUs
(and later), with AVX, AVX2 and a few other instructions that
were introduced in that generation. I believe this is preferable
to picking a specific CPU model as the baseline.

Here's Phoronix's feature article that explains the feature and the available levels:

"Both LLVM Clang 12 and GCC 11 are ready to go in offering the new x86-64-v2, x86-64-v3, and x86-64-v4 targets.

These x86_64 micro-architecture feature levels have been about coming up with a few "classes" of Intel/AMD CPU processor support rather than continuing to rely on just the x86_64 baseline or targeting a
specific CPU family for optimizations. These new levels make it easier to raise the base requirements around Linux x86-64 whether it be for a Linux distribution or a particular software application where
the developer/ISV may be wanting to compile with greater instruction set extensions enabled in catering to more recent Intel/AMD CPUs."

https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels

Here are the GCC docs for it:
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

And here's the formal specification (click on the pdf link):
https://gitlab.com/x86-psABIs/x86-64-ABI

The actual tune file was created by copying corei7 tunes and doing
search/replace on them. Qemu options were dropped as unnecessary.

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
---
 .../machine/include/x86/tune-x86-64-v3.inc    | 35 +++++++++++++++++++
 1 file changed, 35 insertions(+)
 create mode 100644 meta/conf/machine/include/x86/tune-x86-64-v3.inc

diff --git a/meta/conf/machine/include/x86/tune-x86-64-v3.inc b/meta/conf/machine/include/x86/tune-x86-64-v3.inc
new file mode 100644
index 0000000000..365e23b49b
--- /dev/null
+++ b/meta/conf/machine/include/x86/tune-x86-64-v3.inc
@@ -0,0 +1,35 @@
+# Settings for the GCC(1) cpu-type "x86-64-v3":
+#
+#     CPUs with AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE.
+#     (but not AVX512).
+#     See https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels for details.
+#
+# This tune is recommended for Intel Haswell/AMD Excavator CPUs (and later).
+#
+DEFAULTTUNE ?= "x86-64-v3-64"
+
+# Include the previous tune to pull in PACKAGE_EXTRA_ARCHS
+require conf/machine/include/x86/tune-corei7.inc
+
+# Extra tune features
+TUNEVALID[x86-64-v3] = "Enable x86-64-v3 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'x86-64-v3', ' -march=x86-64-v3', '', d)}"
+
+# Extra tune selections
+AVAILTUNES += "x86-64-v3-32"
+TUNE_FEATURES:tune-x86-64-v3-32 = "${TUNE_FEATURES:tune-x86} x86-64-v3"
+BASE_LIB:tune-x86-64-v3-32 = "lib"
+TUNE_PKGARCH:tune-x86-64-v3-32 = "x86-64-v3-32"
+PACKAGE_EXTRA_ARCHS:tune-x86-64-v3-32 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-32} x86-64-v3-32"
+
+AVAILTUNES += "x86-64-v3-64"
+TUNE_FEATURES:tune-x86-64-v3-64 = "${TUNE_FEATURES:tune-x86-64} x86-64-v3"
+BASE_LIB:tune-x86-64-v3-64 = "lib64"
+TUNE_PKGARCH:tune-x86-64-v3-64 = "x86-64-v3-64"
+PACKAGE_EXTRA_ARCHS:tune-x86-64-v3-64 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-64} x86-64-v3-64"
+
+AVAILTUNES += "x86-64-v3-64-x32"
+TUNE_FEATURES:tune-x86-64-v3-64-x32 = "${TUNE_FEATURES:tune-x86-64-x32} x86-64-v3"
+BASE_LIB:tune-x86-64-v3-64-x32 = "libx32"
+TUNE_PKGARCH:tune-x86-64-v3-64-x32 = "x86-64-v3-64-x32"
+PACKAGE_EXTRA_ARCHS:tune-x86-64-v3-64-x32 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-64-x32} x86-64-v3-64-x32"
-- 
2.30.2




* [RFC PATCH 2/3] qemux86-64: build for x86-64-v3 (2013 Haswell and later) rather than Core 2 from 2006
  2022-12-26  8:44 [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE) Alexander Kanavin
@ 2022-12-26  8:44 ` Alexander Kanavin
  2022-12-26  8:44 ` [RFC PATCH 3/3] valgrind: disable tests that started failing after switching to x86-64-v3 target Alexander Kanavin
  2022-12-28 14:32 ` [OE-core] [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE) Richard Purdie
  2 siblings, 0 replies; 5+ messages in thread
From: Alexander Kanavin @ 2022-12-26  8:44 UTC (permalink / raw)
  To: openembedded-core; +Cc: Alexander Kanavin

This allows us to:
- test those more recent instruction sets (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE)
- benefit from improved performance across the stack, both in KVM-driven system emulation and when
  running on real silicon. For example, glibc:
  https://www.phoronix.com/news/Glibc-strcasecmp-AVX2-EVEX

The v4 level adds AVX-512, which is far less established: Intel has famously backtracked
from supporting it in Alder Lake/Raptor Lake client CPUs, and AMD has only implemented it in the very recent Zen 4 products:
https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
---
 meta/conf/machine/include/x86/qemuboot-x86.inc | 4 ++--
 meta/conf/machine/qemux86-64.conf              | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/meta/conf/machine/include/x86/qemuboot-x86.inc b/meta/conf/machine/include/x86/qemuboot-x86.inc
index 3953679366..31db1b2a61 100644
--- a/meta/conf/machine/include/x86/qemuboot-x86.inc
+++ b/meta/conf/machine/include/x86/qemuboot-x86.inc
@@ -4,8 +4,8 @@ QB_SMP = "-smp 4"
 QB_CPU:x86 = "-cpu IvyBridge -machine q35,i8042=off"
 QB_CPU_KVM:x86 = "-cpu IvyBridge -machine q35,i8042=off"
 
-QB_CPU:x86-64 = "-cpu IvyBridge -machine q35,i8042=off"
-QB_CPU_KVM:x86-64 = "-cpu IvyBridge -machine q35,i8042=off"
+QB_CPU:x86-64 = "-cpu Skylake-Client -machine q35,i8042=off"
+QB_CPU_KVM:x86-64 = "-cpu Skylake-Client -machine q35,i8042=off"
 
 QB_AUDIO_DRV = "alsa"
 QB_AUDIO_OPT = "-device AC97"
diff --git a/meta/conf/machine/qemux86-64.conf b/meta/conf/machine/qemux86-64.conf
index 8640867911..ba8ed21707 100644
--- a/meta/conf/machine/qemux86-64.conf
+++ b/meta/conf/machine/qemux86-64.conf
@@ -9,8 +9,8 @@ PREFERRED_PROVIDER_virtual/libgles2 ?= "mesa"
 PREFERRED_PROVIDER_virtual/libgles3 ?= "mesa"
 
 require conf/machine/include/qemu.inc
-DEFAULTTUNE ?= "core2-64"
-require conf/machine/include/x86/tune-corei7.inc
+DEFAULTTUNE ?= "x86-64-v3-64"
+require conf/machine/include/x86/tune-x86-64-v3.inc
 require conf/machine/include/x86/qemuboot-x86.inc
 
 UBOOT_MACHINE ?= "qemu-x86_64_defconfig"
-- 
2.30.2




* [RFC PATCH 3/3] valgrind: disable tests that started failing after switching to x86-64-v3 target
  2022-12-26  8:44 [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE) Alexander Kanavin
  2022-12-26  8:44 ` [RFC PATCH 2/3] qemux86-64: build for x86-64-v3 (2013 Haswell and later) rather than Core 2 from 2006 Alexander Kanavin
@ 2022-12-26  8:44 ` Alexander Kanavin
  2022-12-28 14:32 ` [OE-core] [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE) Richard Purdie
  2 siblings, 0 replies; 5+ messages in thread
From: Alexander Kanavin @ 2022-12-26  8:44 UTC (permalink / raw)
  To: openembedded-core; +Cc: Alexander Kanavin

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
---
 meta/recipes-devtools/valgrind/valgrind_3.20.0.bb | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/meta/recipes-devtools/valgrind/valgrind_3.20.0.bb b/meta/recipes-devtools/valgrind/valgrind_3.20.0.bb
index cd9c4d9fe9..1e1f0ccdd3 100644
--- a/meta/recipes-devtools/valgrind/valgrind_3.20.0.bb
+++ b/meta/recipes-devtools/valgrind/valgrind_3.20.0.bb
@@ -242,6 +242,15 @@ do_install_ptest() {
     install ${S}/none/tests/tls.c ${D}/usr/src/debug/${PN}/${EXTENDPE}${PV}-${PR}/none/tests/
 }
 
+do_install_ptest:append:x86-64 () {
+    # https://bugs.kde.org/show_bug.cgi?id=463456
+    rm ${D}${PTEST_PATH}/memcheck/tests/origin6-fp.vgtest
+    # https://bugs.kde.org/show_bug.cgi?id=463458
+    rm ${D}${PTEST_PATH}/memcheck/tests/vcpu_fnfns.vgtest
+    # https://bugs.kde.org/show_bug.cgi?id=463463
+    rm ${D}${PTEST_PATH}/none/tests/amd64/fma.vgtest
+}
+
 # avoid stripping some generated binaries otherwise some of the tests will fail
 # run-strip-reloc.sh, run-strip-strmerge.sh and so on will fail
 INHIBIT_PACKAGE_STRIP_FILES += "\
-- 
2.30.2




* Re: [OE-core] [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE)
  2022-12-26  8:44 [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE) Alexander Kanavin
  2022-12-26  8:44 ` [RFC PATCH 2/3] qemux86-64: build for x86-64-v3 (2013 Haswell and later) rather than Core 2 from 2006 Alexander Kanavin
  2022-12-26  8:44 ` [RFC PATCH 3/3] valgrind: disable tests that started failing after switching to x86-64-v3 target Alexander Kanavin
@ 2022-12-28 14:32 ` Richard Purdie
  2022-12-28 17:32   ` Alexander Kanavin
  2 siblings, 1 reply; 5+ messages in thread
From: Richard Purdie @ 2022-12-28 14:32 UTC (permalink / raw)
  To: Alexander Kanavin, openembedded-core; +Cc: Alexander Kanavin

On Mon, 2022-12-26 at 09:44 +0100, Alexander Kanavin wrote:
> Qemu 7.2 finally allows us to move beyond building for original Core 2/Core i7 era hardware,
> and this patch adds support for the newer generations. But first, a bit of
> background:
> 
> Recently toolchains gained support for specifying x86-64 'levels' of
> instruction set support; v3 corresponds to 2013-era Haswell CPUs
> (and later), with AVX, AVX2 and a few other instructions that
> were introduced in that generation. I believe this is preferable
> to picking a specific CPU model as the baseline.
> 
> Here's Phoronix's feature article that explains the feature and the available levels:
> 
> "Both LLVM Clang 12 and GCC 11 are ready to go in offering the new x86-64-v2, x86-64-v3, and x86-64-v4 targets.
> 
> These x86_64 micro-architecture feature levels have been about coming up with a few "classes" of Intel/AMD CPU processor support rather than continuing to rely on just the x86_64 baseline or targeting a
> specific CPU family for optimizations. These new levels make it easier to raise the base requirements around Linux x86-64 whether it be for a Linux distribution or a particular software application where
> the developer/ISV may be wanting to compile with greater instruction set extensions enabled in catering to more recent Intel/AMD CPUs."
> 
> https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels
> 
> Here are the GCC docs for it:
> https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
> 
> And here's the formal specification (click on the pdf link):
> https://gitlab.com/x86-psABIs/x86-64-ABI
> 
> The actual tune file was created by copying corei7 tunes and doing
> search/replace on them. Qemu options were dropped as unnecessary.
> 
> Signed-off-by: Alexander Kanavin <alex@linutronix.de>
> ---
>  .../machine/include/x86/tune-x86-64-v3.inc    | 35 +++++++++++++++++++
>  1 file changed, 35 insertions(+)
>  create mode 100644 meta/conf/machine/include/x86/tune-x86-64-v3.inc
> 
> diff --git a/meta/conf/machine/include/x86/tune-x86-64-v3.inc b/meta/conf/machine/include/x86/tune-x86-64-v3.inc
> new file mode 100644
> index 0000000000..365e23b49b
> --- /dev/null
> +++ b/meta/conf/machine/include/x86/tune-x86-64-v3.inc
> @@ -0,0 +1,35 @@
> +# Settings for the GCC(1) cpu-type "x86-64-v3":
> +#
> +#     CPUs with AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE.
> +#     (but not AVX512).
> +#     See https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels for details.
> +#
> +# This tune is recommended for Intel Haswell/AMD Excavator CPUs (and later).
> +#
> +DEFAULTTUNE ?= "x86-64-v3-64"
> +
> +# Include the previous tune to pull in PACKAGE_EXTRA_ARCHS
> +require conf/machine/include/x86/tune-corei7.inc
> +
> +# Extra tune features
> +TUNEVALID[x86-64-v3] = "Enable x86-64-v3 specific processor optimizations"
> +TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'x86-64-v3', ' -march=x86-64-v3', '', d)}"
> +
> +# Extra tune selections
> +AVAILTUNES += "x86-64-v3-32"
> +TUNE_FEATURES:tune-x86-64-v3-32 = "${TUNE_FEATURES:tune-x86} x86-64-v3"
> +BASE_LIB:tune-x86-64-v3-32 = "lib"
> +TUNE_PKGARCH:tune-x86-64-v3-32 = "x86-64-v3-32"
> +PACKAGE_EXTRA_ARCHS:tune-x86-64-v3-32 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-32} x86-64-v3-32"
> +
> +AVAILTUNES += "x86-64-v3-64"
> +TUNE_FEATURES:tune-x86-64-v3-64 = "${TUNE_FEATURES:tune-x86-64} x86-64-v3"
> +BASE_LIB:tune-x86-64-v3-64 = "lib64"
> +TUNE_PKGARCH:tune-x86-64-v3-64 = "x86-64-v3-64"
> +PACKAGE_EXTRA_ARCHS:tune-x86-64-v3-64 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-64} x86-64-v3-64"
> +
> +AVAILTUNES += "x86-64-v3-64-x32"
> +TUNE_FEATURES:tune-x86-64-v3-64-x32 = "${TUNE_FEATURES:tune-x86-64-x32} x86-64-v3"
> +BASE_LIB:tune-x86-64-v3-64-x32 = "libx32"
> +TUNE_PKGARCH:tune-x86-64-v3-64-x32 = "x86-64-v3-64-x32"
> +PACKAGE_EXTRA_ARCHS:tune-x86-64-v3-64-x32 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-64-x32} x86-64-v3-64-x32"

I suspect we may want to call the x86-64-v3-64 tune simply "x86-64-v3"?

Also, does a 32 bit version of the tune make sense? Is that useful to
anyone? I appreciate the x32 case is marginal as well but at least
there it is something designed for 64 bit processors.

Cheers,

Richard







* Re: [OE-core] [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE)
  2022-12-28 14:32 ` [OE-core] [RFC PATCH 1/3] conf/machine/include: add x86-64-v3 tunes (AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE) Richard Purdie
@ 2022-12-28 17:32   ` Alexander Kanavin
  0 siblings, 0 replies; 5+ messages in thread
From: Alexander Kanavin @ 2022-12-28 17:32 UTC (permalink / raw)
  To: Richard Purdie; +Cc: openembedded-core, Alexander Kanavin

On Wed, 28 Dec 2022 at 15:32, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> I suspect we may want to call the x86-64-v3-64 tune simply "x86-64-v3"?
>
> Also, does a 32 bit version of the tune make sense? Is that useful to
> anyone? I appreciate the x32 case is marginal as well but at least
> there it is something designed for 64 bit processors.

Right, I can drop the -64 suffix, and the 32 bit option. There's no
chip that supports these instructions but is otherwise 32 bit only.
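
A minimal sketch of what the reworked tune selections might look like,
assuming the 64 bit tune is renamed to plain "x86-64-v3" and the 32 bit
entry is removed. The "x86-64-v3-x32" name for the x32 variant is only an
assumption for illustration; nothing in this thread settles it. The
TUNEVALID and TUNE_CCARGS lines and the corei7 include would stay as in
the RFC:

```
DEFAULTTUNE ?= "x86-64-v3"

# 64 bit tune, without the -64 suffix
AVAILTUNES += "x86-64-v3"
TUNE_FEATURES:tune-x86-64-v3 = "${TUNE_FEATURES:tune-x86-64} x86-64-v3"
BASE_LIB:tune-x86-64-v3 = "lib64"
TUNE_PKGARCH:tune-x86-64-v3 = "x86-64-v3"
PACKAGE_EXTRA_ARCHS:tune-x86-64-v3 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-64} x86-64-v3"

# x32 tune (name is hypothetical)
AVAILTUNES += "x86-64-v3-x32"
TUNE_FEATURES:tune-x86-64-v3-x32 = "${TUNE_FEATURES:tune-x86-64-x32} x86-64-v3"
BASE_LIB:tune-x86-64-v3-x32 = "libx32"
TUNE_PKGARCH:tune-x86-64-v3-x32 = "x86-64-v3-x32"
PACKAGE_EXTRA_ARCHS:tune-x86-64-v3-x32 = "${PACKAGE_EXTRA_ARCHS:tune-corei7-64-x32} x86-64-v3-x32"
```

With such a rename, the DEFAULTTUNE in qemux86-64.conf from patch 2/3
would need to become "x86-64-v3" as well.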

Alex

