* [meta-oe][PATCH 0/5] ARMv8 Tune reorg
@ 2020-09-14 15:13 Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 1/5] arch-armv8-2a.inc: Add Cortex-A55 tunings Jon Mason
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Jon Mason @ 2020-09-14 15:13 UTC (permalink / raw)
  To: openembedded-core

There are a large number of Arm tune files located in
meta/conf/machine/include/, and more are needed to support the current
and upcoming Arm cores.  Adding more files will only make it harder for
an OE/YP developer or user to find the relevant ones.  There are also
problems with stale and erroneous configs (see my previous series),
which having more files will only exacerbate.

I am proposing a reorganization of the existing tune files by merging
them into the generic family include files.  For example, the contents
of tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and the
contents of tune-cortexa57.inc would be moved into arch-armv8a.inc.
This reduces the number of files from 12 to 2 for ARMv8a, and that is
excluding the 13 I am adding in this series that would otherwise be
unique files.

To use a tune, set DEFAULTTUNE before requiring the family include file
in the machine configuration:
...
DEFAULTTUNE ?= "neoversen1"
require conf/machine/include/arm/arch-armv8-2a.inc
...

Setting DEFAULTTUNE explicitly is arguably what should be done anyway
(instead of relying on the default from the tune include file).
See the qemuarm64 patch in the series for a working example.

Of course, removing the existing tune files will break the
configurations of current users.  A simple script using sed (or
similar) can replace the relevant lines for the users that would be
affected (at least for those that are in the layer index and update
regularly).
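
A minimal sketch of such a script follows, written in Python rather than
sed so that the mapping is easy to read.  It is an illustration, not
part of the series: the names FAMILY_INCLUDE and migrate() are made up
for this example, the tune names are the DEFAULTTUNE defaults from the
files removed here, and it only handles plain "require" lines for those
files (anything else is left untouched).

```python
import re

ARMV8A = "conf/machine/include/arm/arch-armv8a.inc"
ARMV8_2A = "conf/machine/include/arm/arch-armv8-2a.inc"

# Removed tune-<name>.inc file -> family include that now carries the tune.
# <name> is also the DEFAULTTUNE the removed file used to set by default.
FAMILY_INCLUDE = {
    "cortexa32": ARMV8A,
    "cortexa35": ARMV8A,
    "cortexa53": ARMV8A,
    "cortexa57": ARMV8A,
    "cortexa57-cortexa53": ARMV8A,
    "cortexa72": ARMV8A,
    "cortexa72-cortexa53": ARMV8A,
    "cortexa73-cortexa53": ARMV8A,
    "thunderx": ARMV8A,
    "cortexa55": ARMV8_2A,
}

# Matches a whole "require conf/machine/include/tune-<name>.inc" line.
REQUIRE_RE = re.compile(
    r"^(?P<indent>[ \t]*)require[ \t]+"
    r"conf/machine/include/tune-(?P<tune>[\w-]+)\.inc[ \t]*$",
    re.MULTILINE,
)


def migrate(text: str) -> str:
    """Rewrite requires of removed tune files in a machine config."""

    def repl(match: re.Match) -> str:
        tune = match.group("tune")
        family = FAMILY_INCLUDE.get(tune)
        if family is None:
            # Not one of the files removed by this series; keep the line.
            return match.group(0)
        indent = match.group("indent")
        # Preserve the old default tune, then pull in the family file.
        return f'{indent}DEFAULTTUNE ?= "{tune}"\n{indent}require {family}'

    return REQUIRE_RE.sub(repl, text)
```

Calling migrate() on the contents of a machine .conf file and writing
the result back would be enough for the simple cases.  Because the
inserted assignment uses "?=", a DEFAULTTUNE the machine already set
earlier in the file still takes precedence.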

Thanks,
Jon

---

Originally sent as an RFC in
https://lists.openembedded.org/g/openembedded-core/message/142324

Given the generally positive feedback, I am sending this as a patch
series.  I am keeping the "hard fail" of the file removal (per Richard's
comment in
https://lists.openembedded.org/g/openembedded-core/message/142356).
The only difference of note is the removal of the "arm64: set BASE_LIB
to lib64" patch, as more investigation is needed (see
https://lists.openembedded.org/g/openembedded-core/message/142414).


Jon Mason (5):
  arch-armv8-2a.inc: Add Cortex-A55 tunings
  arch-armv8a.inc: Add existing tunings
  qemuarm64: change tuning
  arch-armv8a.inc: Add tunes for supported ARMv8a cores
  arch-armv8-2a.inc: Add tunes for supported ARMv8.2a cores

 .../machine/include/arm/arch-armv8-2a.inc     | 175 ++++++++++++++-
 meta/conf/machine/include/arm/arch-armv8a.inc | 206 +++++++++++++++++-
 meta/conf/machine/include/tune-cortexa32.inc  |  18 --
 meta/conf/machine/include/tune-cortexa35.inc  |  17 --
 meta/conf/machine/include/tune-cortexa53.inc  |  18 --
 meta/conf/machine/include/tune-cortexa55.inc  |  13 --
 .../include/tune-cortexa57-cortexa53.inc      |  15 --
 meta/conf/machine/include/tune-cortexa57.inc  |  17 --
 .../include/tune-cortexa72-cortexa53.inc      |  20 --
 meta/conf/machine/include/tune-cortexa72.inc  |  13 --
 .../include/tune-cortexa73-cortexa53.inc      |  20 --
 meta/conf/machine/include/tune-thunderx.inc   |  19 --
 meta/conf/machine/qemuarm64.conf              |   3 +-
 13 files changed, 371 insertions(+), 183 deletions(-)
 delete mode 100644 meta/conf/machine/include/tune-cortexa32.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa35.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa55.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa57-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa57.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa72-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa72.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa73-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-thunderx.inc

-- 
2.20.1



* [meta-oe][PATCH 1/5] arch-armv8-2a.inc: Add Cortex-A55 tunings
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
@ 2020-09-14 15:13 ` Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 2/5] arch-armv8a.inc: Add existing tunings Jon Mason
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jon Mason @ 2020-09-14 15:13 UTC (permalink / raw)
  To: openembedded-core

Migrate the settings in tune-cortexa55.inc to arch-armv8-2a.inc.  This
will allow for a single file to include all of the tunings of a family
of processors.  This will reduce the proliferation of unique files per
processor currently occurring in conf/machine/include/.

Signed-off-by: Jon Mason <jon.mason@arm.com>
---
 .../machine/include/arm/arch-armv8-2a.inc     | 36 +++++++++++++------
 meta/conf/machine/include/tune-cortexa55.inc  | 13 -------
 2 files changed, 26 insertions(+), 23 deletions(-)
 delete mode 100644 meta/conf/machine/include/tune-cortexa55.inc

diff --git a/meta/conf/machine/include/arm/arch-armv8-2a.inc b/meta/conf/machine/include/arm/arch-armv8-2a.inc
index 1c095256d185..3fc9658400a3 100644
--- a/meta/conf/machine/include/arm/arch-armv8-2a.inc
+++ b/meta/conf/machine/include/arm/arch-armv8-2a.inc
@@ -1,4 +1,20 @@
-DEFAULTTUNE ?= "armv8-2a"
+#
+# Tune Settings for Cortex-A55
+#
+TUNEVALID[cortexa55] = "Enable Cortex-A55 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa55', ' -mcpu=cortex-a55', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "cortexa55"
+ARMPKGARCH_tune-cortexa55                           = "cortexa55"
+TUNE_FEATURES_tune-cortexa55                        = "aarch64 cortexa55 crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa55                  = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa55"
+BASE_LIB_tune-cortexa55                             = "lib64"
+
+#
+# Defaults for ARMv8.2-a
+#
+DEFAULTTUNE                                        ?= "armv8-2a"
 
 TUNEVALID[armv8-2a] = "Enable instructions for ARMv8-a"
 TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'armv8-2a', ' -march=armv8.2-a', '', d)}"
@@ -8,12 +24,12 @@ MACHINEOVERRIDES =. "${@bb.utils.contains('TUNE_FEATURES', 'armv8-2a', 'armv8-2a
 require conf/machine/include/arm/arch-armv8a.inc
 
 # Little Endian base configs
-AVAILTUNES += "armv8-2a armv8-2a-crypto"
-ARMPKGARCH_tune-armv8-2a                    ?= "armv8-2a"
-ARMPKGARCH_tune-armv8-2a-crypto             ?= "armv8-2a"
-TUNE_FEATURES_tune-armv8-2a                  = "aarch64 armv8-2a"
-TUNE_FEATURES_tune-armv8-2a-crypto           = "${TUNE_FEATURES_tune-armv8-2a} crypto"
-PACKAGE_EXTRA_ARCHS_tune-armv8-2a            = "${PACKAGE_EXTRA_ARCHS_tune-armv8a} armv8-2a"
-PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto     = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a} armv8-2a-crypto"
-BASE_LIB_tune-armv8-2a                       = "lib64"
-BASE_LIB_tune-armv8-2a-crypto                = "lib64"
+AVAILTUNES                                         += "armv8-2a armv8-2a-crypto"
+ARMPKGARCH_tune-armv8-2a                           ?= "armv8-2a"
+ARMPKGARCH_tune-armv8-2a-crypto                    ?= "armv8-2a"
+TUNE_FEATURES_tune-armv8-2a                         = "aarch64 armv8-2a"
+TUNE_FEATURES_tune-armv8-2a-crypto                  = "${TUNE_FEATURES_tune-armv8-2a} crypto"
+PACKAGE_EXTRA_ARCHS_tune-armv8-2a                   = "${PACKAGE_EXTRA_ARCHS_tune-armv8a} armv8-2a"
+PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto            = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a} armv8-2a-crypto"
+BASE_LIB_tune-armv8-2a                              = "lib64"
+BASE_LIB_tune-armv8-2a-crypto                       = "lib64"
diff --git a/meta/conf/machine/include/tune-cortexa55.inc b/meta/conf/machine/include/tune-cortexa55.inc
deleted file mode 100644
index 66a5d0c437ee..000000000000
--- a/meta/conf/machine/include/tune-cortexa55.inc
+++ /dev/null
@@ -1,13 +0,0 @@
-DEFAULTTUNE ?= "cortexa55"
-
-TUNEVALID[cortexa55] = "Enable Cortex-A55 specific processor optimizations"
-TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa55', ' -mcpu=cortex-a55', '', d)}"
-
-require conf/machine/include/arm/arch-armv8-2a.inc
-
-# Little Endian base configs
-AVAILTUNES += "cortexa55"
-ARMPKGARCH_tune-cortexa55             = "cortexa55"
-TUNE_FEATURES_tune-cortexa55          = "aarch64 cortexa55 crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa55    = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa55"
-BASE_LIB_tune-cortexa55               = "lib64"
-- 
2.20.1



* [meta-oe][PATCH 2/5] arch-armv8a.inc: Add existing tunings
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 1/5] arch-armv8-2a.inc: Add Cortex-A55 tunings Jon Mason
@ 2020-09-14 15:13 ` Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 3/5] qemuarm64: change tuning Jon Mason
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jon Mason @ 2020-09-14 15:13 UTC (permalink / raw)
  To: openembedded-core

Migrate the existing tune settings to arch-armv8a.inc.  This will allow
for a single file to include all of the tunings of a family of
processors.  This will reduce the proliferation of unique files per
processor currently occurring in conf/machine/include/.

Signed-off-by: Jon Mason <jon.mason@arm.com>
---
 meta/conf/machine/include/arm/arch-armv8a.inc | 157 +++++++++++++++++-
 meta/conf/machine/include/tune-cortexa32.inc  |  18 --
 meta/conf/machine/include/tune-cortexa35.inc  |  17 --
 meta/conf/machine/include/tune-cortexa53.inc  |  18 --
 .../include/tune-cortexa57-cortexa53.inc      |  15 --
 meta/conf/machine/include/tune-cortexa57.inc  |  17 --
 .../include/tune-cortexa72-cortexa53.inc      |  20 ---
 meta/conf/machine/include/tune-cortexa72.inc  |  13 --
 .../include/tune-cortexa73-cortexa53.inc      |  20 ---
 meta/conf/machine/include/tune-thunderx.inc   |  19 ---
 10 files changed, 155 insertions(+), 159 deletions(-)
 delete mode 100644 meta/conf/machine/include/tune-cortexa32.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa35.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa57-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa57.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa72-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa72.inc
 delete mode 100644 meta/conf/machine/include/tune-cortexa73-cortexa53.inc
 delete mode 100644 meta/conf/machine/include/tune-thunderx.inc

diff --git a/meta/conf/machine/include/arm/arch-armv8a.inc b/meta/conf/machine/include/arm/arch-armv8a.inc
index f810a1e8fc98..eaf601216a3d 100644
--- a/meta/conf/machine/include/arm/arch-armv8a.inc
+++ b/meta/conf/machine/include/arm/arch-armv8a.inc
@@ -1,4 +1,155 @@
-DEFAULTTUNE ?= "armv8a-crc"
+#
+# Tune Settings for Cortex-A32 (32bit only)
+#
+TUNEVALID[cortexa32] = "Enable Cortex-A32 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa32', ' -mcpu=cortex-a32', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa32 cortexa32-crypto"
+ARMPKGARCH_tune-cortexa32                  = "cortexa32"
+ARMPKGARCH_tune-cortexa32-crypto           = "cortexa32"
+TUNE_FEATURES_tune-cortexa32               = "armv8a cortexa32 crc"
+TUNE_FEATURES_tune-cortexa32-crypto        = "${TUNE_FEATURES_tune-cortexa32} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa32         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa32"
+PACKAGE_EXTRA_ARCHS_tune-cortexa32-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa32 cortexa32-crypto"
+BASE_LIB_tune-cortexa32                    = "lib"
+BASE_LIB_tune-cortexa32-crypto             = "lib"
+
+#
+# Tune Settings for Cortex-A35
+#
+TUNEVALID[cortexa35] = "Enable Cortex-A35 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa35', ' -mcpu=cortex-a35', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa35 cortexa35-crypto"
+ARMPKGARCH_tune-cortexa35                  = "cortexa35"
+ARMPKGARCH_tune-cortexa35-crypto           = "cortexa35"
+TUNE_FEATURES_tune-cortexa35               = "aarch64 cortexa35 crc"
+TUNE_FEATURES_tune-cortexa35-crypto        = "${TUNE_FEATURES_tune-cortexa35} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa35         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa35"
+PACKAGE_EXTRA_ARCHS_tune-cortexa35-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa35 cortexa35-crypto"
+BASE_LIB_tune-cortexa35                    = "lib64"
+BASE_LIB_tune-cortexa35-crypto             = "lib64"
+
+#
+# Tune Settings for Cortex-A53
+#
+TUNEVALID[cortexa53] = "Enable Cortex-A53 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa53', ' -mcpu=cortex-a53', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa53 cortexa53-crypto"
+ARMPKGARCH_tune-cortexa53                  = "cortexa53"
+ARMPKGARCH_tune-cortexa53-crypto           = "cortexa53-crypto"
+TUNE_FEATURES_tune-cortexa53               = "aarch64 cortexa53 crc"
+TUNE_FEATURES_tune-cortexa53-crypto        = "${TUNE_FEATURES_tune-cortexa53} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa53         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa53"
+PACKAGE_EXTRA_ARCHS_tune-cortexa53-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa53 cortexa53-crypto"
+BASE_LIB_tune-cortexa53                    = "lib64"
+BASE_LIB_tune-cortexa53-crypto             = "lib64"
+
+#
+# Tune Settings for Cortex-A57
+#
+TUNEVALID[cortexa57] = "Enable Cortex-A57 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa57', ' -mcpu=cortex-a57', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa57 cortexa57-crypto"
+ARMPKGARCH_tune-cortexa57                  = "cortexa57"
+ARMPKGARCH_tune-cortexa57-crypto           = "cortexa57-crypto"
+TUNE_FEATURES_tune-cortexa57               = "aarch64 cortexa57 crc"
+TUNE_FEATURES_tune-cortexa57-crypto        = "${TUNE_FEATURES_tune-cortexa57} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa57         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa57"
+PACKAGE_EXTRA_ARCHS_tune-cortexa57-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa57 cortexa57-crypto"
+BASE_LIB_tune-cortexa57                    = "lib64"
+BASE_LIB_tune-cortexa57-crypto             = "lib64"
+
+#
+# Tune Settings for Cortex-A72
+#
+TUNEVALID[cortexa72] = "Enable Cortex-A72 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa72', ' -mcpu=cortex-a72', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa72"
+ARMPKGARCH_tune-cortexa72                  = "cortexa72"
+TUNE_FEATURES_tune-cortexa72               = "aarch64 cortexa72 crc crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa72         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa72"
+BASE_LIB_tune-cortexa72                    = "lib64"
+BASE_LIB_tune-cortexa72-crypto             = "lib64"
+
+#
+# Tune Settings for ThunderX
+#
+TUNEVALID[thunderx] = "Enable instructions for Cavium ThunderX"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'thunderx', ' -mcpu=thunderx', '',d)}"
+
+AVAILTUNES                                += "thunderx thunderx_be"
+ARMPKGARCH_tune-thunderx                  ?= "thunderx"
+ARMPKGARCH_tune-thunderx_be               ?= "thunderx_be"
+TUNE_FEATURES_tune-thunderx                = "${TUNE_FEATURES_tune-aarch64} thunderx"
+TUNE_FEATURES_tune-thunderx_be             = "${TUNE_FEATURES_tune-thunderx} bigendian"
+PACKAGE_EXTRA_ARCHS_tune-thunderx          = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} thunderx"
+PACKAGE_EXTRA_ARCHS_tune-thunderx_be       = "aarch64_be thunderx_be"
+BASE_LIB_tune-thunderx                     = "lib64"
+BASE_LIB_tune-thunderx_be                  = "lib64"
+
+#
+# Tune Settings for big.LITTLE Cortex-A57 - Cortex-A53
+#
+TUNEVALID[cortexa57-cortexa53] = "Enable big.LITTLE Cortex-A57.Cortex-A53 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa57-cortexa53", " -mcpu=cortex-a57.cortex-a53", "", d)}"
+MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa57-cortexa53", "cortexa57-cortexa53:", "" ,d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa57-cortexa53"
+ARMPKGARCH_tune-cortexa57-cortexa53        = "cortexa57-cortexa53"
+TUNE_FEATURES_tune-cortexa57-cortexa53     = "aarch64 crc cortexa57-cortexa53"
+PACKAGE_EXTRA_ARCHS_tune-cortexa57-cortexa53 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa57-cortexa53"
+BASE_LIB_tune-cortexa57-cortexa53          = "lib64"
+
+#
+# Tune Settings for big.LITTLE Cortex-A72 - Cortex-A53
+#
+TUNEVALID[cortexa72-cortexa53] = "Enable big.LITTLE Cortex-A72.Cortex-A53 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa72-cortexa53", " -mcpu=cortex-a72.cortex-a53", "", d)}"
+MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa72-cortexa53", "cortexa72-cortexa53:", "" ,d)}"
+
+# cortexa72.cortexa53 implies crc support
+AVAILTUNES                                += "cortexa72-cortexa53 cortexa72-cortexa53-crypto"
+ARMPKGARCH_tune-cortexa72-cortexa53        = "cortexa72-cortexa53"
+ARMPKGARCH_tune-cortexa72-cortexa53-crypto = "cortexa72-cortexa53-crypto"
+TUNE_FEATURES_tune-cortexa72-cortexa53     = "aarch64 crc cortexa72-cortexa53"
+TUNE_FEATURES_tune-cortexa72-cortexa53-crypto = "${TUNE_FEATURES_tune-cortexa72-cortexa53} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa72-cortexa53  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa72-cortexa53"
+PACKAGE_EXTRA_ARCHS_tune-cortexa72-cortexa53-crypto = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa72-cortexa53 cortexa72-cortexa53-crypto"
+BASE_LIB_tune-cortexa72-cortexa53          = "lib64"
+BASE_LIB_tune-cortexa72-cortexa53-crypto   = "lib64"
+
+#
+# Tune Settings for big.LITTLE Cortex-A73 - Cortex-A53
+#
+TUNEVALID[cortexa73-cortexa53] = "Enable big.LITTLE Cortex-A73.Cortex-A53 specific processor optimizations"
+MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa73-cortexa53", "cortexa73-cortexa53:", "" ,d)}"
+TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa73-cortexa53", " -mcpu=cortex-a73.cortex-a53", "", d)}"
+
+# cortexa73.cortexa53 implies crc support
+AVAILTUNES                                += "cortexa73-cortexa53 cortexa73-cortexa53-crypto"
+ARMPKGARCH_tune-cortexa73-cortexa53        = "cortexa73-cortexa53"
+ARMPKGARCH_tune-cortexa73-cortexa53-crypto = "cortexa73-cortexa53-crypto"
+TUNE_FEATURES_tune-cortexa73-cortexa53     = "aarch64 crc cortexa73-cortexa53"
+TUNE_FEATURES_tune-cortexa73-cortexa53-crypto = "${TUNE_FEATURES_tune-cortexa73-cortexa53} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa73-cortexa53  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa73-cortexa53"
+PACKAGE_EXTRA_ARCHS_tune-cortexa73-cortexa53-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa73-cortexa53 cortexa73-cortexa53-crypto"
+BASE_LIB_tune-cortexa73-cortexa53          = "lib64"
+BASE_LIB_tune-cortexa73-cortexa53-crypto   = "lib64"
+
+#
+# Defaults for ARMv8-a
+#
+DEFAULTTUNE                               ?= "armv8a-crc"
 
 TUNEVALID[armv8a] = "Enable instructions for ARMv8-a"
 TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'armv8a', ' -march=armv8-a', '', d)}"
@@ -8,10 +159,12 @@ TUNEVALID[crypto] = "Enable instructions for ARMv8-a cryptographic"
 TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'crypto', '+crypto', '', d)}"
 MACHINEOVERRIDES =. "${@bb.utils.contains('TUNE_FEATURES', 'armv8a', 'armv8a:', '' ,d)}"
 
+TUNECONFLICTS[aarch64] = "armv4 armv5 armv6 armv7 armv7a"
+
 require conf/machine/include/arm/arch-arm64.inc
 
 # Little Endian base configs
-AVAILTUNES += "armv8a armv8a-crc armv8a-crc-crypto armv8a-crypto"
+AVAILTUNES                                += "armv8a armv8a-crc armv8a-crc-crypto armv8a-crypto"
 ARMPKGARCH_tune-armv8a                    ?= "armv8a"
 ARMPKGARCH_tune-armv8a-crc                ?= "armv8a"
 ARMPKGARCH_tune-armv8a-crypto             ?= "armv8a"
diff --git a/meta/conf/machine/include/tune-cortexa32.inc b/meta/conf/machine/include/tune-cortexa32.inc
deleted file mode 100644
index 0ffb3e068855..000000000000
--- a/meta/conf/machine/include/tune-cortexa32.inc
+++ /dev/null
@@ -1,18 +0,0 @@
-DEFAULTTUNE ?= "cortexa32"
-
-
-TUNEVALID[cortexa32] = "Enable Cortex-A32 specific processor optimizations"
-TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa32', ' -mcpu=cortex-a32', '', d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# Little Endian base configs
-AVAILTUNES += "cortexa32 cortexa32-crypto"
-ARMPKGARCH_tune-cortexa32             = "cortexa32"
-ARMPKGARCH_tune-cortexa32-crypto      = "cortexa32"
-TUNE_FEATURES_tune-cortexa32          = "armv8a cortexa32 crc"
-TUNE_FEATURES_tune-cortexa32-crypto   = "${TUNE_FEATURES_tune-cortexa32} crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa32             = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa32"
-PACKAGE_EXTRA_ARCHS_tune-cortexa32-crypto      = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa32 cortexa32-crypto"
-BASE_LIB_tune-cortexa32               = "lib"
-BASE_LIB_tune-cortexa32-crypto        = "lib"
diff --git a/meta/conf/machine/include/tune-cortexa35.inc b/meta/conf/machine/include/tune-cortexa35.inc
deleted file mode 100644
index 61696da540cc..000000000000
--- a/meta/conf/machine/include/tune-cortexa35.inc
+++ /dev/null
@@ -1,17 +0,0 @@
-DEFAULTTUNE ?= "cortexa35"
-
-TUNEVALID[cortexa35] = "Enable Cortex-A35 specific processor optimizations"
-TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa35', ' -mcpu=cortex-a35', '', d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# Little Endian base configs
-AVAILTUNES += "cortexa35 cortexa35-crypto"
-ARMPKGARCH_tune-cortexa35             = "cortexa35"
-ARMPKGARCH_tune-cortexa35-crypto      = "cortexa35"
-TUNE_FEATURES_tune-cortexa35          = "aarch64 cortexa35 crc"
-TUNE_FEATURES_tune-cortexa35-crypto   = "${TUNE_FEATURES_tune-cortexa35} crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa35             = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa35"
-PACKAGE_EXTRA_ARCHS_tune-cortexa35-crypto      = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa35 cortexa35-crypto"
-BASE_LIB_tune-cortexa35               = "lib64"
-BASE_LIB_tune-cortexa35-crypto        = "lib64"
diff --git a/meta/conf/machine/include/tune-cortexa53.inc b/meta/conf/machine/include/tune-cortexa53.inc
deleted file mode 100644
index 79ce7c4b1c21..000000000000
--- a/meta/conf/machine/include/tune-cortexa53.inc
+++ /dev/null
@@ -1,18 +0,0 @@
-DEFAULTTUNE ?= "cortexa53"
-
-TUNEVALID[cortexa53] = "Enable Cortex-A53 specific processor optimizations"
-TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa53', ' -mcpu=cortex-a53', '', d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# Little Endian base configs
-AVAILTUNES += "cortexa53 cortexa53-crypto"
-ARMPKGARCH_tune-cortexa53             = "cortexa53"
-ARMPKGARCH_tune-cortexa53-crypto      = "cortexa53-crypto"
-TUNE_FEATURES_tune-cortexa53          = "aarch64 cortexa53 crc"
-TUNE_FEATURES_tune-cortexa53-crypto   = "${TUNE_FEATURES_tune-cortexa53} crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa53             = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa53"
-PACKAGE_EXTRA_ARCHS_tune-cortexa53-crypto      = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa53 cortexa53-crypto"
-
-BASE_LIB_tune-cortexa53               = "lib64"
-BASE_LIB_tune-cortexa53-crypto        = "lib64"
diff --git a/meta/conf/machine/include/tune-cortexa57-cortexa53.inc b/meta/conf/machine/include/tune-cortexa57-cortexa53.inc
deleted file mode 100644
index 5880bf203231..000000000000
--- a/meta/conf/machine/include/tune-cortexa57-cortexa53.inc
+++ /dev/null
@@ -1,15 +0,0 @@
-DEFAULTTUNE ?= "cortexa57-cortexa53"
-
-TUNEVALID[cortexa57-cortexa53] = "Enable big.LITTLE Cortex-A57.Cortex-A53 specific processor optimizations"
-TUNECONFLICTS[aarch64] = "armv4 armv5 armv6 armv7 armv7a"
-TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa57-cortexa53", " -mcpu=cortex-a57.cortex-a53", "", d)}"
-MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa57-cortexa53", "cortexa57-cortexa53:", "" ,d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# Little Endian base configs
-AVAILTUNES += "cortexa57-cortexa53"
-ARMPKGARCH_tune-cortexa57-cortexa53 = "cortexa57-cortexa53"
-TUNE_FEATURES_tune-cortexa57-cortexa53 = "aarch64 crc cortexa57-cortexa53"
-PACKAGE_EXTRA_ARCHS_tune-cortexa57-cortexa53 = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa57-cortexa53"
-BASE_LIB_tune-cortexa57-cortexa53 = "lib64"
diff --git a/meta/conf/machine/include/tune-cortexa57.inc b/meta/conf/machine/include/tune-cortexa57.inc
deleted file mode 100644
index 3206ce75a6b6..000000000000
--- a/meta/conf/machine/include/tune-cortexa57.inc
+++ /dev/null
@@ -1,17 +0,0 @@
-DEFAULTTUNE ?= "cortexa57"
-
-TUNEVALID[cortexa57] = "Enable Cortex-A57 specific processor optimizations"
-TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa57', ' -mcpu=cortex-a57', '', d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# Little Endian base configs
-AVAILTUNES += "cortexa57 cortexa57-crypto"
-ARMPKGARCH_tune-cortexa57             = "cortexa57"
-ARMPKGARCH_tune-cortexa57-crypto      = "cortexa57-crypto"
-TUNE_FEATURES_tune-cortexa57          = "aarch64 cortexa57 crc"
-TUNE_FEATURES_tune-cortexa57-crypto   = "${TUNE_FEATURES_tune-cortexa57} crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa57             = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa57"
-PACKAGE_EXTRA_ARCHS_tune-cortexa57-crypto      = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa57 cortexa57-crypto"
-BASE_LIB_tune-cortexa57               = "lib64"
-BASE_LIB_tune-cortexa57-crypto        = "lib64"
diff --git a/meta/conf/machine/include/tune-cortexa72-cortexa53.inc b/meta/conf/machine/include/tune-cortexa72-cortexa53.inc
deleted file mode 100644
index feb1df5c178d..000000000000
--- a/meta/conf/machine/include/tune-cortexa72-cortexa53.inc
+++ /dev/null
@@ -1,20 +0,0 @@
-DEFAULTTUNE ?= "cortexa72-cortexa53"
-
-TUNEVALID[cortexa72-cortexa53] = "Enable big.LITTLE Cortex-A72.Cortex-A53 specific processor optimizations"
-TUNECONFLICTS[aarch64] = "armv4 armv5 armv6 armv7 armv7a"
-TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa72-cortexa53", " -mcpu=cortex-a72.cortex-a53", "", d)}"
-MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa72-cortexa53", "cortexa72-cortexa53:", "" ,d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# cortexa72.cortexa53 implies crc support
-AVAILTUNES += "cortexa72-cortexa53 cortexa72-cortexa53-crypto"
-ARMPKGARCH_tune-cortexa72-cortexa53                  = "cortexa72-cortexa53"
-ARMPKGARCH_tune-cortexa72-cortexa53-crypto           = "cortexa72-cortexa53-crypto"
-TUNE_FEATURES_tune-cortexa72-cortexa53               = "aarch64 crc cortexa72-cortexa53"
-TUNE_FEATURES_tune-cortexa72-cortexa53-crypto        = "${TUNE_FEATURES_tune-cortexa72-cortexa53} crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa72-cortexa53         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc}        cortexa72-cortexa53"
-PACKAGE_EXTRA_ARCHS_tune-cortexa72-cortexa53-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa72-cortexa53 cortexa72-cortexa53-crypto"
-BASE_LIB_tune-cortexa72-cortexa53                    = "lib64"
-BASE_LIB_tune-cortexa72-cortexa53-crypto             = "lib64"
-
diff --git a/meta/conf/machine/include/tune-cortexa72.inc b/meta/conf/machine/include/tune-cortexa72.inc
deleted file mode 100644
index 00f7745a22fd..000000000000
--- a/meta/conf/machine/include/tune-cortexa72.inc
+++ /dev/null
@@ -1,13 +0,0 @@
-DEFAULTTUNE ?= "cortexa72"
-
-TUNEVALID[cortexa72] = "Enable Cortex-A72 specific processor optimizations"
-TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa72', ' -mcpu=cortex-a72', '', d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# Little Endian base configs
-AVAILTUNES += "cortexa72"
-ARMPKGARCH_tune-cortexa72             = "cortexa72"
-TUNE_FEATURES_tune-cortexa72          = "aarch64 cortexa72 crc crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa72    = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa72"
-BASE_LIB_tune-cortexa72               = "lib64"
diff --git a/meta/conf/machine/include/tune-cortexa73-cortexa53.inc b/meta/conf/machine/include/tune-cortexa73-cortexa53.inc
deleted file mode 100644
index 1c221999f408..000000000000
--- a/meta/conf/machine/include/tune-cortexa73-cortexa53.inc
+++ /dev/null
@@ -1,20 +0,0 @@
-DEFAULTTUNE ?= "cortexa73-cortexa53"
-
-TUNEVALID[cortexa73-cortexa53] = "Enable big.LITTLE Cortex-A73.Cortex-A53 specific processor optimizations"
-TUNECONFLICTS[aarch64] = "armv4 armv5 armv6 armv7 armv7a"
-MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa73-cortexa53", "cortexa73-cortexa53:", "" ,d)}"
-TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa73-cortexa53", " -mcpu=cortex-a73.cortex-a53", "", d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-# cortexa73.cortexa53 implies crc support
-AVAILTUNES += "cortexa73-cortexa53 cortexa73-cortexa53-crypto"
-ARMPKGARCH_tune-cortexa73-cortexa53                  = "cortexa73-cortexa53"
-ARMPKGARCH_tune-cortexa73-cortexa53-crypto           = "cortexa73-cortexa53-crypto"
-TUNE_FEATURES_tune-cortexa73-cortexa53               = "aarch64 crc cortexa73-cortexa53"
-TUNE_FEATURES_tune-cortexa73-cortexa53-crypto        = "${TUNE_FEATURES_tune-cortexa73-cortexa53} crypto"
-PACKAGE_EXTRA_ARCHS_tune-cortexa73-cortexa53         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc}        cortexa73-cortexa53"
-PACKAGE_EXTRA_ARCHS_tune-cortexa73-cortexa53-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa73-cortexa53 cortexa73-cortexa53-crypto"
-BASE_LIB_tune-cortexa73-cortexa53                    = "lib64"
-BASE_LIB_tune-cortexa73-cortexa53-crypto             = "lib64"
-
diff --git a/meta/conf/machine/include/tune-thunderx.inc b/meta/conf/machine/include/tune-thunderx.inc
deleted file mode 100644
index aa4d0501d400..000000000000
--- a/meta/conf/machine/include/tune-thunderx.inc
+++ /dev/null
@@ -1,19 +0,0 @@
-DEFAULTTUNE ?= "thunderx"
-AVAILTUNES += "thunderx thunderx_be"
-
-TUNEVALID[thunderx] = "Enable instructions for Cavium ThunderX"
-
-TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'thunderx', ' -mcpu=thunderx', '',d)}"
-
-require conf/machine/include/arm/arch-armv8a.inc
-
-ARMPKGARCH_tune-thunderx ?= "thunderx"
-ARMPKGARCH_tune-thunderx_be ?= "thunderx_be"
-
-TUNE_FEATURES_tune-thunderx = "${TUNE_FEATURES_tune-aarch64} thunderx"
-TUNE_FEATURES_tune-thunderx_be = "${TUNE_FEATURES_tune-thunderx} bigendian"
-BASE_LIB_tune-thunderx = "lib64"
-BASE_LIB_tune-thunderx_be = "lib64"
-
-PACKAGE_EXTRA_ARCHS_tune-thunderx = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} thunderx"
-PACKAGE_EXTRA_ARCHS_tune-thunderx_be = "aarch64_be thunderx_be"
-- 
2.20.1



* [meta-oe][PATCH 3/5] qemuarm64: change tuning
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 1/5] arch-armv8-2a.inc: Add Cortex-A55 tunings Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 2/5] arch-armv8a.inc: Add existing tunings Jon Mason
@ 2020-09-14 15:13 ` Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 4/5] arch-armv8a.inc: Add tunes for supported ARMv8a cores Jon Mason
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jon Mason @ 2020-09-14 15:13 UTC (permalink / raw)
  To: openembedded-core

The previous patch caused the tuning file referenced here to be removed.
Use the new one with the new DEFAULTTUNE.

Signed-off-by: Jon Mason <jon.mason@arm.com>
---
 meta/conf/machine/qemuarm64.conf | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/meta/conf/machine/qemuarm64.conf b/meta/conf/machine/qemuarm64.conf
index fdd464d708be..9367f6ccb8d0 100644
--- a/meta/conf/machine/qemuarm64.conf
+++ b/meta/conf/machine/qemuarm64.conf
@@ -2,7 +2,8 @@
 #@NAME: QEMU ARMv8 machine
 #@DESCRIPTION: Machine configuration for running an ARMv8 system on QEMU
 
-require conf/machine/include/tune-cortexa57.inc
+DEFAULTTUNE ?= "cortexa57"
+require conf/machine/include/arm/arch-armv8a.inc
 require conf/machine/include/qemu.inc
 
 KERNEL_IMAGETYPE = "Image"
-- 
2.20.1



* [meta-oe][PATCH 4/5] arch-armv8a.inc: Add tunes for supported ARMv8a cores
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
                   ` (2 preceding siblings ...)
  2020-09-14 15:13 ` [meta-oe][PATCH 3/5] qemuarm64: change tuning Jon Mason
@ 2020-09-14 15:13 ` Jon Mason
  2020-09-14 15:13 ` [meta-oe][PATCH 5/5] arch-armv8-2a.inc: Add tunes for supported ARMv8.2a cores Jon Mason
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jon Mason @ 2020-09-14 15:13 UTC (permalink / raw)
  To: openembedded-core

Add tunes for all the ARMv8a cores currently supported in GCC.  These
are: Cortex-A34, Cortex-A73, and Cortex-A73-Cortex-A35.

Signed-off-by: Jon Mason <jon.mason@arm.com>
---
 meta/conf/machine/include/arm/arch-armv8a.inc | 49 +++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/meta/conf/machine/include/arm/arch-armv8a.inc b/meta/conf/machine/include/arm/arch-armv8a.inc
index eaf601216a3d..3febbcb6a6f6 100644
--- a/meta/conf/machine/include/arm/arch-armv8a.inc
+++ b/meta/conf/machine/include/arm/arch-armv8a.inc
@@ -15,6 +15,23 @@ PACKAGE_EXTRA_ARCHS_tune-cortexa32-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-
 BASE_LIB_tune-cortexa32                    = "lib"
 BASE_LIB_tune-cortexa32-crypto             = "lib"
 
+#
+# Tune Settings for Cortex-A34
+#
+TUNEVALID[cortexa34] = "Enable Cortex-A34 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa34', ' -mcpu=cortex-a34', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa34 cortexa34-crypto"
+ARMPKGARCH_tune-cortexa34                  = "cortexa34"
+ARMPKGARCH_tune-cortexa34-crypto           = "cortexa34"
+TUNE_FEATURES_tune-cortexa34               = "aarch64 cortexa34 crc"
+TUNE_FEATURES_tune-cortexa34-crypto        = "${TUNE_FEATURES_tune-cortexa34} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa34         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa34"
+PACKAGE_EXTRA_ARCHS_tune-cortexa34-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa34 cortexa34-crypto"
+BASE_LIB_tune-cortexa34                    = "lib64"
+BASE_LIB_tune-cortexa34-crypto             = "lib64"
+
 #
 # Tune Settings for Cortex-A35
 #
@@ -80,6 +97,20 @@ PACKAGE_EXTRA_ARCHS_tune-cortexa72         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-
 BASE_LIB_tune-cortexa72                    = "lib64"
 BASE_LIB_tune-cortexa72-crypto             = "lib64"
 
+#
+# Tune Settings for Cortex-A73
+#
+TUNEVALID[cortexa73] = "Enable Cortex-A73 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa73', ' -mcpu=cortex-a73', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                += "cortexa73"
+ARMPKGARCH_tune-cortexa73                  = "cortexa73"
+TUNE_FEATURES_tune-cortexa73               = "aarch64 cortexa73 crc crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa73         = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa73"
+BASE_LIB_tune-cortexa73                    = "lib64"
+BASE_LIB_tune-cortexa73-crypto             = "lib64"
+
 #
 # Tune Settings for ThunderX
 #
@@ -128,6 +159,24 @@ PACKAGE_EXTRA_ARCHS_tune-cortexa72-cortexa53-crypto = "${PACKAGE_EXTRA_ARCHS_tun
 BASE_LIB_tune-cortexa72-cortexa53          = "lib64"
 BASE_LIB_tune-cortexa72-cortexa53-crypto   = "lib64"
 
+#
+# Tune Settings for big.LITTLE Cortex-A73 - Cortex-A35
+#
+TUNEVALID[cortexa73-cortexa35] = "Enable big.LITTLE Cortex-A73.Cortex-A35 specific processor optimizations"
+MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa73-cortexa35", "cortexa73-cortexa35:", "" ,d)}"
+TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa73-cortexa35", " -mcpu=cortex-a73.cortex-a35", "", d)}"
+
+# cortexa73.cortexa35 implies crc support
+AVAILTUNES                                += "cortexa73-cortexa35 cortexa73-cortexa35-crypto"
+ARMPKGARCH_tune-cortexa73-cortexa35        = "cortexa73-cortexa35"
+ARMPKGARCH_tune-cortexa73-cortexa35-crypto = "cortexa73-cortexa35-crypto"
+TUNE_FEATURES_tune-cortexa73-cortexa35     = "aarch64 crc cortexa73-cortexa35"
+TUNE_FEATURES_tune-cortexa73-cortexa35-crypto = "${TUNE_FEATURES_tune-cortexa73-cortexa35} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa73-cortexa35  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc} cortexa73-cortexa35"
+PACKAGE_EXTRA_ARCHS_tune-cortexa73-cortexa35-crypto  = "${PACKAGE_EXTRA_ARCHS_tune-armv8a-crc-crypto} cortexa73-cortexa35 cortexa73-cortexa35-crypto"
+BASE_LIB_tune-cortexa73-cortexa35          = "lib64"
+BASE_LIB_tune-cortexa73-cortexa35-crypto   = "lib64"
+
 #
 # Tune Settings for big.LITTLE Cortex-A73 - Cortex-A53
 #
-- 
2.20.1



* [meta-oe][PATCH 5/5] arch-armv8-2a.inc: Add tunes for supported ARMv8.2a cores
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
                   ` (3 preceding siblings ...)
  2020-09-14 15:13 ` [meta-oe][PATCH 4/5] arch-armv8a.inc: Add tunes for supported ARMv8a cores Jon Mason
@ 2020-09-14 15:13 ` Jon Mason
  2020-09-14 15:32 ` [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg Martin Jansa
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jon Mason @ 2020-09-14 15:13 UTC (permalink / raw)
  To: openembedded-core

Add tunes for all the ARMv8.2a cores currently supported in GCC.  These
are: Cortex-A65, Cortex-A65AE, Cortex-A75, Cortex-A76, Cortex-A76AE,
Cortex-A77, Neoverse-E1, Neoverse-N1, Cortex-A75-Cortex-A55, and
Cortex-A76-Cortex-A55.

Signed-off-by: Jon Mason <jon.mason@arm.com>
---
 .../machine/include/arm/arch-armv8-2a.inc     | 139 ++++++++++++++++++
 1 file changed, 139 insertions(+)

diff --git a/meta/conf/machine/include/arm/arch-armv8-2a.inc b/meta/conf/machine/include/arm/arch-armv8-2a.inc
index 3fc9658400a3..f620eafa013b 100644
--- a/meta/conf/machine/include/arm/arch-armv8-2a.inc
+++ b/meta/conf/machine/include/arm/arch-armv8-2a.inc
@@ -11,6 +11,145 @@ TUNE_FEATURES_tune-cortexa55                        = "aarch64 cortexa55 crypto"
 PACKAGE_EXTRA_ARCHS_tune-cortexa55                  = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa55"
 BASE_LIB_tune-cortexa55                             = "lib64"
 
+#
+# Tune Settings for Cortex-A65
+#
+TUNEVALID[cortexa65] = "Enable Cortex-A65 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa65', ' -mcpu=cortex-a65', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "cortexa65"
+ARMPKGARCH_tune-cortexa65                           = "cortexa65"
+TUNE_FEATURES_tune-cortexa65                        = "aarch64 cortexa65 crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa65                  = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa65"
+BASE_LIB_tune-cortexa65                             = "lib64"
+
+#
+# Tune Settings for Cortex-A65AE
+#
+TUNEVALID[cortexa65ae] = "Enable Cortex-A65AE specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa65ae', ' -mcpu=cortex-a65ae', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "cortexa65ae"
+ARMPKGARCH_tune-cortexa65ae                         = "cortexa65ae"
+TUNE_FEATURES_tune-cortexa65ae                      = "aarch64 cortexa65ae crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa65ae                = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa65ae"
+BASE_LIB_tune-cortexa65ae                           = "lib64"
+
+#
+# Tune Settings for Cortex-A75
+#
+TUNEVALID[cortexa75] = "Enable Cortex-A75 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa75', ' -mcpu=cortex-a75', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "cortexa75"
+ARMPKGARCH_tune-cortexa75                           = "cortexa75"
+TUNE_FEATURES_tune-cortexa75                        = "aarch64 cortexa75 crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa75                  = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa75"
+BASE_LIB_tune-cortexa75                             = "lib64"
+
+#
+# Tune Settings for Cortex-A76
+#
+TUNEVALID[cortexa76] = "Enable Cortex-A76 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa76', ' -mcpu=cortex-a76', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "cortexa76"
+ARMPKGARCH_tune-cortexa76                           = "cortexa76"
+TUNE_FEATURES_tune-cortexa76                        = "aarch64 cortexa76 crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa76                  = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa76"
+BASE_LIB_tune-cortexa76                             = "lib64"
+
+#
+# Tune Settings for Cortex-A76AE
+#
+TUNEVALID[cortexa76ae] = "Enable Cortex-A76AE specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa76ae', ' -mcpu=cortex-a76ae', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "cortexa76ae"
+ARMPKGARCH_tune-cortexa76ae                         = "cortexa76ae"
+TUNE_FEATURES_tune-cortexa76ae                      = "aarch64 cortexa76ae crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa76ae                = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa76ae"
+BASE_LIB_tune-cortexa76ae                           = "lib64"
+
+#
+# Tune Settings for Cortex-A77
+#
+TUNEVALID[cortexa77] = "Enable Cortex-A77 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa77', ' -mcpu=cortex-a77', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "cortexa77"
+ARMPKGARCH_tune-cortexa77                           = "cortexa77"
+TUNE_FEATURES_tune-cortexa77                        = "aarch64 cortexa77 crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa77                  = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa77"
+BASE_LIB_tune-cortexa77                             = "lib64"
+
+#
+# Tune Settings for Neoverse-E1
+#
+TUNEVALID[neoversee1] = "Enable Neoverse-E1 specific processor optimizations"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'neoversee1', ' -mcpu=neoverse-e1', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "neoversee1"
+ARMPKGARCH_tune-neoversee1                          = "neoversee1"
+TUNE_FEATURES_tune-neoversee1                       = "aarch64 neoversee1 crypto"
+PACKAGE_EXTRA_ARCHS_tune-neoversee1                 = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} neoversee1"
+BASE_LIB_tune-neoversee1                            = "lib64"
+
+#
+# Tune Settings for Neoverse-N1
+#
+TUNEVALID[neoversen1] = "Enable Neoverse-N1 specific processor optimizations"
+# Note: Neoverse was called Ares, and GCC will accept "ares" in place of "neoverse-n1"
+TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'neoversen1', ' -mcpu=neoverse-n1', '', d)}"
+
+# Little Endian base configs
+AVAILTUNES                                         += "neoversen1"
+ARMPKGARCH_tune-neoversen1                          = "neoversen1"
+TUNE_FEATURES_tune-neoversen1                       = "aarch64 neoversen1 crypto"
+PACKAGE_EXTRA_ARCHS_tune-neoversen1                 = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} neoversen1"
+BASE_LIB_tune-neoversen1                            = "lib64"
+
+#
+# Tune Settings for big.LITTLE Cortex-A75 - Cortex-A55
+#
+TUNEVALID[cortexa75-cortexa55] = "Enable big.LITTLE Cortex-A75.Cortex-A55 specific processor optimizations"
+MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa75-cortexa55", "cortexa75-cortexa55:", "" ,d)}"
+TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa75-cortexa55", " -mcpu=cortex-a75.cortex-a55", "", d)}"
+
+AVAILTUNES                                         += "cortexa75-cortexa55 cortexa75-cortexa55-crypto"
+ARMPKGARCH_tune-cortexa75-cortexa55                 = "cortexa75-cortexa55"
+ARMPKGARCH_tune-cortexa75-cortexa55-crypto          = "cortexa75-cortexa55-crypto"
+TUNE_FEATURES_tune-cortexa75-cortexa55              = "aarch64 cortexa75-cortexa55"
+TUNE_FEATURES_tune-cortexa75-cortexa55-crypto       = "${TUNE_FEATURES_tune-cortexa75-cortexa55} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa75-cortexa55        = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a} cortexa75-cortexa55"
+PACKAGE_EXTRA_ARCHS_tune-cortexa75-cortexa55-crypto = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa75-cortexa55 cortexa75-cortexa55-crypto"
+BASE_LIB_tune-cortexa75-cortexa55                    = "lib64"
+BASE_LIB_tune-cortexa75-cortexa55-crypto             = "lib64"
+
+#
+# Tune Settings for big.LITTLE Cortex-A76 - Cortex-A55
+#
+TUNEVALID[cortexa76-cortexa55] = "Enable big.LITTLE Cortex-A76.Cortex-A55 specific processor optimizations"
+MACHINEOVERRIDES =. "${@bb.utils.contains("TUNE_FEATURES", "cortexa76-cortexa55", "cortexa76-cortexa55:", "" ,d)}"
+TUNE_CCARGS .= "${@bb.utils.contains("TUNE_FEATURES", "cortexa76-cortexa55", " -mcpu=cortex-a76.cortex-a55", "", d)}"
+
+AVAILTUNES                                         += "cortexa76-cortexa55 cortexa76-cortexa55-crypto"
+ARMPKGARCH_tune-cortexa76-cortexa55                 = "cortexa76-cortexa55"
+ARMPKGARCH_tune-cortexa76-cortexa55-crypto          = "cortexa76-cortexa55-crypto"
+TUNE_FEATURES_tune-cortexa76-cortexa55              = "aarch64 cortexa76-cortexa55"
+TUNE_FEATURES_tune-cortexa76-cortexa55-crypto       = "${TUNE_FEATURES_tune-cortexa76-cortexa55} crypto"
+PACKAGE_EXTRA_ARCHS_tune-cortexa76-cortexa55        = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a} cortexa76-cortexa55"
+PACKAGE_EXTRA_ARCHS_tune-cortexa76-cortexa55-crypto = "${PACKAGE_EXTRA_ARCHS_tune-armv8-2a-crypto} cortexa76-cortexa55 cortexa76-cortexa55-crypto"
+BASE_LIB_tune-cortexa76-cortexa55                    = "lib64"
+BASE_LIB_tune-cortexa76-cortexa55-crypto             = "lib64"
+
 #
 # Defaults for ARMv8-a
 #
-- 
2.20.1



* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
                   ` (4 preceding siblings ...)
  2020-09-14 15:13 ` [meta-oe][PATCH 5/5] arch-armv8-2a.inc: Add tunes for supported ARMv8.2a cores Jon Mason
@ 2020-09-14 15:32 ` Martin Jansa
  2020-09-14 22:54   ` Jon Mason
  2020-09-15  7:09 ` Robert Berger
  2020-09-16 13:26 ` Richard Purdie
  7 siblings, 1 reply; 19+ messages in thread
From: Martin Jansa @ 2020-09-14 15:32 UTC (permalink / raw)
  To: Jon Mason; +Cc: Patches and discussions about the oe-core layer

> This reduces the number of files from 12 to 2 for ARMv8a, and that is
> excluding the 13 I am adding in this series that would otherwise be unique
> files.

I don't have a strong opinion on this anymore, but is the number of the
include files the issue here?

I think the issue is the number of possible combinations and using the same
TUNE_PKGARCH for different tunes (unlike 32bit arm tune files which use
different). Bundling all these combinations in fewer include files doesn't
IMHO improve it.

With 32bit arm include files it was useful to be able to compare various
include files to check that they follow the same "structure" (or to add new
cortexa* one by copying existing and regex-replace).

Also AVAILTUNES will now contain all possible tunes from given family
(which makes it almost useless for aarch64 tunes). So instead of BSP
including .inc file corresponding with the core used in the MACHINE and
getting some "sane" default DEFAULTTUNE (plus some other possible options
listed in AVAILTUNES) we now force every BSP to set more
specific DEFAULTTUNE and then let user figure out what other DEFAULTTUNEs
might be compatible with it, e.g. when someone is doing multi MACHINE
builds and wants to share TUNE_PKGARCH between them (to save build time,
package feed size etc).

Cheers,

On Mon, Sep 14, 2020 at 5:14 PM Jon Mason <jdmason@kudzu.us> wrote:

> There is a large number of Arm Tune files located in
> meta/conf/machine/include/, and to support the current and upcoming Arm
> cores, more are needed.  Adding more files is simply going to make it
> harder to find the relevant ones for an OE/YP developer/user.  Also,
> there are problems with stale and erroneous configs (see my previous
> series), which will only be exacerbated by having more files.
>
> I am proposing a reorganization of the existing tune files by including
> them in the generic family include file.  For example, the
> tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and
> tune-cortexa57.inc would be moved into arch-armv8a.inc.  This reduces
> the number of files from 12 to 2 for ARMv8a, and that is excluding the
> 13 I am adding in this series that would otherwise be unique files.
>
> To use, simply add
> ...
> DEFAULTTUNE ?= "neoversen1"
> require conf/machine/include/arm/arch-armv8-2a.inc
> ...
>
> Which is arguably what should be done anyway (instead of taking the
> default of the tune include file).
> See the qemuarm64 patch in the series for a working example.
>
> Of course, by removing the existing tune files, current users are going
> to break.  A simple script can be written to use sed (or similar) to
> replace the relevant parts for those users that would be affected (at
> least for those that are in the layer index and update regularly).
>
> Thanks,
> Jon
>
> ---
>
> Originally sent as a RFC in
> https://lists.openembedded.org/g/openembedded-core/message/142324
>
> Given the generally positive feedback, sending as a patch series.
> Keeping the "hard fail" of the file removal (per Richards comment in
> https://lists.openembedded.org/g/openembedded-core/message/142356).
> Only difference of note is the removal of the "arm64: set BASE_LIB to
> lib64", as there needs to be more investigation (see
> https://lists.openembedded.org/g/openembedded-core/message/142414).
>
>
> Jon Mason (5):
>   arch-armv8-2a.inc: Add Cortex-A55 tunings
>   arch-armv8a.inc: Add existing tunings
>   qemuarm64: change tuning
>   arch-armv8a.inc: Add tunes for supported ARMv8a cores
>   arch-armv8-2a.inc: Add tunes for supported ARMv8.2a cores
>
>  .../machine/include/arm/arch-armv8-2a.inc     | 175 ++++++++++++++-
>  meta/conf/machine/include/arm/arch-armv8a.inc | 206 +++++++++++++++++-
>  meta/conf/machine/include/tune-cortexa32.inc  |  18 --
>  meta/conf/machine/include/tune-cortexa35.inc  |  17 --
>  meta/conf/machine/include/tune-cortexa53.inc  |  18 --
>  meta/conf/machine/include/tune-cortexa55.inc  |  13 --
>  .../include/tune-cortexa57-cortexa53.inc      |  15 --
>  meta/conf/machine/include/tune-cortexa57.inc  |  17 --
>  .../include/tune-cortexa72-cortexa53.inc      |  20 --
>  meta/conf/machine/include/tune-cortexa72.inc  |  13 --
>  .../include/tune-cortexa73-cortexa53.inc      |  20 --
>  meta/conf/machine/include/tune-thunderx.inc   |  19 --
>  meta/conf/machine/qemuarm64.conf              |   3 +-
>  13 files changed, 371 insertions(+), 183 deletions(-)
>  delete mode 100644 meta/conf/machine/include/tune-cortexa32.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa35.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa53.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa55.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa57-cortexa53.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa57.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa72-cortexa53.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa72.inc
>  delete mode 100644 meta/conf/machine/include/tune-cortexa73-cortexa53.inc
>  delete mode 100644 meta/conf/machine/include/tune-thunderx.inc
>
> --
> 2.20.1
>
> 
>


* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-14 15:32 ` [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg Martin Jansa
@ 2020-09-14 22:54   ` Jon Mason
  2020-09-15 14:38     ` Martin Jansa
  0 siblings, 1 reply; 19+ messages in thread
From: Jon Mason @ 2020-09-14 22:54 UTC (permalink / raw)
  To: Martin Jansa; +Cc: Patches and discussions about the oe-core layer

On Mon, Sep 14, 2020 at 11:32 AM Martin Jansa <martin.jansa@gmail.com> wrote:
>
> > This reduces the number of files from 12 to 2 for ARMv8a, and that is excluding the 13 I am adding in this series that would otherwise be unique files.
>
> I don't have a strong opinion on this anymore, but is the number of the include files the issue here?

This is the reason why I'm doing it.  Arm needs to support all the
cortex-a and cortex-m cores, which would mean the majority of the tune
files are for Arm cores only.  We could move them into
meta/conf/machine/include/arm so that the directory above it is not so
messy, but I think this is cleaner.

> I think the issue is the number of possible combinations and using the same TUNE_PKGARCH for different tunes (unlike 32bit arm tune files which use different). Bundling all these combinations in fewer include files doesn't IMHO improve it.

The multitude of combinations and redundant code entries makes it
very error prone.  By putting them all in the same file, it makes it
much easier to see the differences in entries and catch errors.
Longer term, I'd like to see something that removes the redundancies
completely, but it will take some work.

> With 32bit arm include files it was useful to be able to compare various include files to check that they follow the same "structure" (or to add new cortexa* one by copying existing and regex-replace).
>
> Also AVAILTUNES will now contain all possible tunes from given family (which makes it almost useless for aarch64 tunes). So instead of BSP including .inc file corresponding with the core used in the MACHINE and getting some "sane" default DEFAULTTUNE (plus some other possible options listed in AVAILTUNES) we now force every BSP to set more specific DEFAULTTUNE and then let user figure out what other DEFAULTTUNEs might be compatible with it, e.g. when someone is doing multi MACHINE builds and wants to share TUNE_PKGARCH between them (to save build time, package feed size etc).

You can still set a generic default for all the machines of a given
family.  By including arch-armv8a.inc, you still get 'DEFAULTTUNE ?=
"armv8a-crc"', which gives a generic default for an entire family.
I'm not seeing the issue.
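
The fallback described above rests on BitBake's `?=` operator, which assigns only when the variable is still unset at that point in parsing. Below is a minimal Python sketch of that ordering, assuming (as stated above) that the family include carries `DEFAULTTUNE ?= "armv8a-crc"`. The `weak_set` helper and the plain dict are illustrative stand-ins, not BitBake API.

```python
def weak_set(datastore, name, value):
    """Model BitBake's `?=`: assign only if the variable is not set yet."""
    datastore.setdefault(name, value)


def parse_machine_conf(machine_default=None):
    """Return DEFAULTTUNE after a machine conf requires the family include."""
    d = {}
    if machine_default is not None:
        # Machine conf line placed before the require, as in the qemuarm64 patch.
        weak_set(d, "DEFAULTTUNE", machine_default)
    # Line assumed to live in the family include (arch-armv8a.inc).
    weak_set(d, "DEFAULTTUNE", "armv8a-crc")
    return d["DEFAULTTUNE"]


print(parse_machine_conf("cortexa57"))  # machine-specific tune is kept
print(parse_machine_conf())             # family-wide generic default applies
```

In this model a BSP that sets nothing ends up with the family default, while one that sets DEFAULTTUNE before the require keeps its own value.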

Thanks,
Jon




* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
                   ` (5 preceding siblings ...)
  2020-09-14 15:32 ` [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg Martin Jansa
@ 2020-09-15  7:09 ` Robert Berger
  2020-09-16 13:38   ` Jon Mason
  2020-09-16 13:26 ` Richard Purdie
  7 siblings, 1 reply; 19+ messages in thread
From: Robert Berger @ 2020-09-15  7:09 UTC (permalink / raw)
  To: Jon Mason, openembedded-core

Hi Jon,

This is not really a comment on the reorganization of compiler tunes, 
but more a question: do they actually do something meaningful?

I posted some benchmarks here[1], and at least with the benchmarks and 
chips I tried there is no obvious impact.

i.MX6Q:

TUNE_FEATURES        = "arm armv7a vfp thumb callconvention-hard"
TARGET_FPU           = "hard"

vs.

TUNE_FEATURES        = "arm vfp cortexa9 neon thumb callconvention-hard"
TARGET_FPU           = "hard"


i.MX8MM:

TUNE_FEATURES        = "aarch64 cortexa53 crc crypto"
TARGET_FPU           = ""

vs.

TUNE_FEATURES        = "aarch64 armv8a crc crypto"
TARGET_FPU           = ""


[1] 
https://yoctoproject.blogspot.com/2020/09/compiler-tunes-benchmarks-with-yocto.html

Should we expect to see differences?

If so, can you suggest benchmarks that show those differences?

Regards,

Robert


* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-14 22:54   ` Jon Mason
@ 2020-09-15 14:38     ` Martin Jansa
  2020-09-15 16:59       ` Mark Hatle
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Jansa @ 2020-09-15 14:38 UTC (permalink / raw)
  To: Jon Mason; +Cc: Patches and discussions about the oe-core layer

On Mon, Sep 14, 2020 at 06:54:14PM -0400, Jon Mason wrote:
> On Mon, Sep 14, 2020 at 11:32 AM Martin Jansa <martin.jansa@gmail.com> wrote:
> >
> > > This reduces the number of files from 12 to 2 for ARMv8a, and that is excluding the 13 I am adding in this series that would otherwise be unique files.
> >
> > I don't have a strong opinion on this anymore, but is the number of the include files the issue here?
> 
> This is the reason why I'm doing it.  Arm needs to support all the
> cortex-a and cortex-m cores.  This would make the majority of the tune
> files for only Arm cores.  We could move them into
> meta/conf/machine/include/arm and not make the level down directory so
> messy, but I think this is cleaner.
> 
> > I think the issue is the number of possible combinations and using the same TUNE_PKGARCH for different tunes (unlike 32bit arm tune files which use different). Bundling all these combinations in fewer include files doesn't IMHO improve it.
> 
> The multitude of combinations and redundant code entries makes it
> very error prone.  By putting them all in the same file, it makes it
> much easier to see the differences in entries and catch errors.

Does it make it easier? In my experience it's more difficult to compare
very similar long sections in different parts of the same file than to
compare very similar files with diff.
When working on
https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/tune-test
https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/tune2-test
https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/tune3
https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/optdefaulttune
a long time ago, I found it really convenient to compare all cortex*.inc
files just by replacing the number with some placeholder and comparing
the resulting files (which were in most cases identical).  Doing the
same inside a single file just makes it more complicated, probably by
first having to split the various sections back into separate files.
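
The placeholder comparison described above can be sketched in Python. This is only an illustration: the `normalize` and `structural_diff` helpers are hypothetical names, the sample lines are copied from the Cortex-A75 block in patch 5/5, and the "modified" variant is invented to show what a deviation looks like.

```python
import difflib
import re

# Matches "cortexa75", "cortex-a75" and "Cortex-A75" alike.
CORE_NUMBER = re.compile(r"(cortex-?a)\d+", re.IGNORECASE)


def normalize(section):
    """Mask the core number and collapse whitespace, line by line."""
    return [" ".join(CORE_NUMBER.sub(r"\1NN", line).split())
            for line in section.strip().splitlines()]


def structural_diff(a, b):
    """Return the lines that still differ after normalization."""
    return [line for line in difflib.ndiff(normalize(a), normalize(b))
            if line[:2] in ("- ", "+ ")]


A75 = """
TUNEVALID[cortexa75] = "Enable Cortex-A75 specific processor optimizations"
TUNE_CCARGS .= "${@bb.utils.contains('TUNE_FEATURES', 'cortexa75', ' -mcpu=cortex-a75', '', d)}"
AVAILTUNES                                         += "cortexa75"
TUNE_FEATURES_tune-cortexa75                        = "aarch64 cortexa75 crypto"
BASE_LIB_tune-cortexa75                             = "lib64"
"""

# The Cortex-A76 block in the patch is the A75 block with the number swapped.
A76 = A75.replace("75", "76")

# Hypothetical deviation, not taken from the patch: an extra feature.
A76_MODIFIED = A76.replace("cortexa76 crypto", "cortexa76 crc crypto")

print(structural_diff(A75, A76))
for line in structural_diff(A75, A76_MODIFIED):
    print(line)
```

With everything in one file, the sections would first have to be cut out by hand or by comment markers before a helper like this could be applied, which is the extra step mentioned above.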

> Longer term, I'd like to see something that removes the redundancies
> completely, but it will take some work.

So now all BSPs need to adapt (by changing from the more specific .inc
file corresponding to the actual CPU in MACHINE they implement to
generic family .inc while setting one of the possible DEFAULTTUNE,
instead of leaving the default from .inc file), but all this only
for the dubious benefit of having fewer .inc files in oe-core, right?
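
For reference, the per-BSP adaptation described above is mechanical, and the cover letter suggests scripting it. A rough Python sketch of such a rewrite follows. The `migrate` function and the `MIGRATION` table are hypothetical, only mappings that can be read off this thread are listed, and the DEFAULTTUNE values are assumed to match the defaults of the removed files.

```python
import re

# Old per-core include -> (family include, DEFAULTTUNE).
# Assumption: each DEFAULTTUNE equals the default of the removed file.
MIGRATION = {
    "tune-cortexa57.inc": ("arm/arch-armv8a.inc", "cortexa57"),
    "tune-cortexa72.inc": ("arm/arch-armv8a.inc", "cortexa72"),
    "tune-cortexa55.inc": ("arm/arch-armv8-2a.inc", "cortexa55"),
}

REQUIRE = re.compile(
    r"^(require|include)\s+conf/machine/include/(tune-[\w-]+\.inc)\s*$")


def migrate(conf_text):
    """Rewrite known tune includes; leave every other line untouched."""
    out = []
    for line in conf_text.splitlines():
        m = REQUIRE.match(line)
        if m and m.group(2) in MIGRATION:
            family, tune = MIGRATION[m.group(2)]
            # DEFAULTTUNE must come before the include so `?=` takes effect.
            out.append(f'DEFAULTTUNE ?= "{tune}"')
            out.append(f"{m.group(1)} conf/machine/include/{family}")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


OLD = ("require conf/machine/include/tune-cortexa57.inc\n"
       "require conf/machine/include/qemu.inc\n")
print(migrate(OLD), end="")
```

Applied to the two require lines of the old qemuarm64.conf, this reproduces the change made in patch 3/5. Includes that are not in the table pass through unchanged, so a BSP using them would still hit the hard failure.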

> > With 32bit arm include files it was useful to be able to compare various include files to check that they follow the same "structure" (or to add new cortexa* one by copying existing and regex-replace).
> >
> > Also AVAILTUNES will now contain all possible tunes from given family (which makes it almost useless for aarch64 tunes). So instead of BSP including .inc file corresponding with the core used in the MACHINE and getting some "sane" default DEFAULTTUNE (plus some other possible options listed in AVAILTUNES) we now force every BSP to set more specific DEFAULTTUNE and then let user figure out what other DEFAULTTUNEs might be compatible with it, e.g. when someone is doing multi MACHINE builds and wants to share TUNE_PKGARCH between them (to save build time, package feed size etc).
> 
> You can still set a generic default for all the machines of a given
> family.  By including armv8a.inc, you still get 'DEFAULTTUNE ?=
> "armv8a-crc"'.  This will allow for a generic for an entire family.
> I'm not seeing the issue.

OK, with e.g. raspberrypi4-64.conf from
https://github.com/agherzan/meta-raspberrypi/blob/a4c8118676ba8002edab29fc81b4e4edd9fad1f1/conf/machine/raspberrypi4-64.conf
which just has
require conf/machine/include/tune-cortexa72.inc

Without your patches:
DEFAULTTUNE="cortexa72"
TUNE_PKGARCH="cortexa72"
TUNE_CCARGS=" -mcpu=cortex-a72+crc+crypto"
AVAILTUNES=" armv4 armv4t armv4b armv4tb armv5 armv5t armv5-vfp
armv5t-vfp armv5hf-vfp armv5thf-vfp armv5b armv5tb armv5b-vfp
armv5tb-vfp armv5hfb-vfp armv5thfb-vfp armv5e armv5te armv5e-vfp
armv5te-vfp armv5ehf-vfp armv5tehf-vfp armv5eb armv5teb armv5eb-vfp
armv5teb-vfp armv5ehfb-vfp armv5tehfb-vfp armv6-novfp armv6t-novfp armv6
armv6t armv6hf armv6thf armv6b-novfp armv6tb-novfp armv6b armv6tb
armv6hfb armv6thfb armv7a armv7at armv7a-vfpv3d16 armv7at-vfpv3d16
armv7a-vfpv3 armv7at-vfpv3 armv7a-vfpv4d16 armv7at-vfpv4d16 armv7a-neon
armv7at-neon armv7a-neon-vfpv4 armv7at-neon-vfpv4 armv7ahf armv7athf
armv7ahf-vfpv3d16 armv7athf-vfpv3d16 armv7ahf-vfpv3 armv7athf-vfpv3
armv7ahf-vfpv4d16 armv7athf-vfpv4d16 armv7ahf-neon armv7athf-neon
armv7ahf-neon-vfpv4 armv7athf-neon-vfpv4 armv7ab armv7atb
armv7ab-vfpv3d16 armv7atb-vfpv3d16 armv7ab-vfpv3 armv7atb-vfpv3
armv7ab-vfpv4d16 armv7atb-vfpv4d16 armv7ab-neon armv7atb-neon
armv7ab-neon-vfpv4 armv7atb-neon-vfpv4 armv7ahfb armv7athfb
armv7ahfb-vfpv3d16 armv7athfb-vfpv3d16 armv7ahfb-vfpv3 armv7athfb-vfpv3
armv7ahfb-vfpv4d16 armv7athfb-vfpv4d16 armv7ahfb-neon armv7athfb-neon
armv7ahfb-neon-vfpv4 armv7athfb-neon-vfpv4 armv7ve armv7vet
armv7ve-vfpv3d16 armv7vet-vfpv3d16 armv7ve-vfpv3 armv7vet-vfpv3
armv7ve-vfpv4d16 armv7vet-vfpv4d16 armv7ve-neon armv7vet-neon
armv7ve-neon-vfpv4 armv7vet-neon-vfpv4 armv7vehf armv7vethf
armv7vehf-vfpv3d16 armv7vethf-vfpv3d16 armv7vehf-vfpv3 armv7vethf-vfpv3
armv7vehf-vfpv4d16 armv7vethf-vfpv4d16 armv7vehf-neon armv7vethf-neon
armv7vehf-neon-vfpv4 armv7vethf-neon-vfpv4 armv7veb armv7vetb
armv7veb-vfpv3d16 armv7vetb-vfpv3d16 armv7veb-vfpv3 armv7vetb-vfpv3
armv7veb-vfpv4d16 armv7vetb-vfpv4d16 armv7veb-neon armv7vetb-neon
armv7veb-neon-vfpv4 armv7vetb-neon-vfpv4 armv7vehfb armv7vethfb
armv7vehfb-vfpv3d16 armv7vethfb-vfpv3d16 armv7vehfb-vfpv3
armv7vethfb-vfpv3 armv7vehfb-vfpv4d16 armv7vethfb-vfpv4d16
armv7vehfb-neon armv7vethfb-neon armv7vehfb-neon-vfpv4
armv7vethfb-neon-vfpv4 aarch64 aarch64_be armv8a armv8a-crc
armv8a-crc-crypto armv8a-crypto cortexa72"

With your patches:
ERROR: ParseError at /OE/build/oe-core/meta-raspberrypi/conf/machine/raspberrypi4-64.conf:13: Could not include required file conf/machine/include/tune-cortexa72.inc

After changing the require to conf/machine/include/arm/arch-armv8a.inc:
DEFAULTTUNE="armv8a-crc"
TUNE_PKGARCH="armv8a"
TUNE_CCARGS=" -march=armv8-a+crc"
AVAILTUNES=" cortexa32 cortexa32-crypto cortexa34 cortexa34-crypto
cortexa35 cortexa35-crypto cortexa53 cortexa53-crypto cortexa57
cortexa57-crypto cortexa72 cortexa73 thunderx thunderx_be
cortexa57-cortexa53 cortexa72-cortexa53 cortexa72-cortexa53-crypto
cortexa73-cortexa35 cortexa73-cortexa35-crypto cortexa73-cortexa53
cortexa73-cortexa53-crypto armv4 armv4t armv4b armv4tb armv5 armv5t
armv5-vfp armv5t-vfp armv5hf-vfp armv5thf-vfp armv5b armv5tb armv5b-vfp
armv5tb-vfp armv5hfb-vfp armv5thfb-vfp armv5e armv5te armv5e-vfp
armv5te-vfp armv5ehf-vfp armv5tehf-vfp armv5eb armv5teb armv5eb-vfp
armv5teb-vfp armv5ehfb-vfp armv5tehfb-vfp armv6-novfp armv6t-novfp armv6
armv6t armv6hf armv6thf armv6b-novfp armv6tb-novfp armv6b armv6tb
armv6hfb armv6thfb armv7a armv7at armv7a-vfpv3d16 armv7at-vfpv3d16
armv7a-vfpv3 armv7at-vfpv3 armv7a-vfpv4d16 armv7at-vfpv4d16 armv7a-neon
armv7at-neon armv7a-neon-vfpv4 armv7at-neon-vfpv4 armv7ahf armv7athf
armv7ahf-vfpv3d16 armv7athf-vfpv3d16 armv7ahf-vfpv3 armv7athf-vfpv3
armv7ahf-vfpv4d16 armv7athf-vfpv4d16 armv7ahf-neon armv7athf-neon
armv7ahf-neon-vfpv4 armv7athf-neon-vfpv4 armv7ab armv7atb
armv7ab-vfpv3d16 armv7atb-vfpv3d16 armv7ab-vfpv3 armv7atb-vfpv3
armv7ab-vfpv4d16 armv7atb-vfpv4d16 armv7ab-neon armv7atb-neon
armv7ab-neon-vfpv4 armv7atb-neon-vfpv4 armv7ahfb armv7athfb
armv7ahfb-vfpv3d16 armv7athfb-vfpv3d16 armv7ahfb-vfpv3 armv7athfb-vfpv3
armv7ahfb-vfpv4d16 armv7athfb-vfpv4d16 armv7ahfb-neon armv7athfb-neon
armv7ahfb-neon-vfpv4 armv7athfb-neon-vfpv4 armv7ve armv7vet
armv7ve-vfpv3d16 armv7vet-vfpv3d16 armv7ve-vfpv3 armv7vet-vfpv3
armv7ve-vfpv4d16 armv7vet-vfpv4d16 armv7ve-neon armv7vet-neon
armv7ve-neon-vfpv4 armv7vet-neon-vfpv4 armv7vehf armv7vethf
armv7vehf-vfpv3d16 armv7vethf-vfpv3d16 armv7vehf-vfpv3 armv7vethf-vfpv3
armv7vehf-vfpv4d16 armv7vethf-vfpv4d16 armv7vehf-neon armv7vethf-neon
armv7vehf-neon-vfpv4 armv7vethf-neon-vfpv4 armv7veb armv7vetb
armv7veb-vfpv3d16 armv7vetb-vfpv3d16 armv7veb-vfpv3 armv7vetb-vfpv3
armv7veb-vfpv4d16 armv7vetb-vfpv4d16 armv7veb-neon armv7vetb-neon
armv7veb-neon-vfpv4 armv7vetb-neon-vfpv4 armv7vehfb armv7vethfb
armv7vehfb-vfpv3d16 armv7vethfb-vfpv3d16 armv7vehfb-vfpv3
armv7vethfb-vfpv3 armv7vehfb-vfpv4d16 armv7vethfb-vfpv4d16
armv7vehfb-neon armv7vethfb-neon armv7vehfb-neon-vfpv4
armv7vethfb-neon-vfpv4 aarch64 aarch64_be armv8a armv8a-crc
armv8a-crc-crypto armv8a-crypto"

Should raspberrypi4-64 users really see e.g. cortexa73-cortexa35-crypto
in AVAILTUNES even when it doesn't make sense for their HW?
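
To make the difference concrete, the two AVAILTUNES values above can be
compared as sets. A minimal Python sketch, with the armv4 through armv7
entries left out because they are identical in both listings (the
function name tune_delta is made up for this example):

```python
# aarch64-related part of AVAILTUNES with tune-cortexa72.inc (before the series).
BEFORE = (
    "aarch64 aarch64_be armv8a armv8a-crc armv8a-crc-crypto "
    "armv8a-crypto cortexa72"
)

# aarch64-related part of AVAILTUNES with arch-armv8a.inc (after the series).
AFTER = (
    "cortexa32 cortexa32-crypto cortexa34 cortexa34-crypto "
    "cortexa35 cortexa35-crypto cortexa53 cortexa53-crypto cortexa57 "
    "cortexa57-crypto cortexa72 cortexa73 thunderx thunderx_be "
    "cortexa57-cortexa53 cortexa72-cortexa53 cortexa72-cortexa53-crypto "
    "cortexa73-cortexa35 cortexa73-cortexa35-crypto cortexa73-cortexa53 "
    "cortexa73-cortexa53-crypto aarch64 aarch64_be armv8a armv8a-crc "
    "armv8a-crc-crypto armv8a-crypto"
)


def tune_delta(before, after):
    """Return (added, removed) tune names as sorted lists."""
    old, new = set(before.split()), set(after.split())
    return sorted(new - old), sorted(old - new)


added, removed = tune_delta(BEFORE, AFTER)
print(len(added), "tunes added:", " ".join(added))
print("tunes removed:", " ".join(removed) or "none")
```

Nothing is removed, but every tune in the added list is offered to a
Cortex-A72 BSP regardless of whether it suits that hardware.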


* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-15 14:38     ` Martin Jansa
@ 2020-09-15 16:59       ` Mark Hatle
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Hatle @ 2020-09-15 16:59 UTC (permalink / raw)
  To: openembedded-core



On 9/15/20 9:38 AM, Martin Jansa wrote:
> On Mon, Sep 14, 2020 at 06:54:14PM -0400, Jon Mason wrote:
>> On Mon, Sep 14, 2020 at 11:32 AM Martin Jansa <martin.jansa@gmail.com> wrote:
>>>
>>>> This reduces the number of files from 12 to 2 for ARMv8a, and that is excluding the 13 I am adding in this series that would otherwise be unique files.
>>>
>>> I don't have a strong opinion on this anymore, but is the number of the include files the issue here?
>>
>> This is the reason why I'm doing it.  Arm needs to support all the
>> cortex-a and cortex-m cores.  This would make the majority of the tune
>> files for only Arm cores.  We could move them into
>> meta/conf/machine/include/arm and not make the level down directory so
>> messy, but I think this is cleaner.
>>
>>> I think the issue is the number of possible combinations and using the same TUNE_PKGARCH for different tunes (unlike 32bit arm tune files which use different). Bundling all these combinations in fewer include files doesn't IMHO improve it.
>>
>> The multitude of combinations and redundancy code entries makes it
>> very error prone.  By putting them all in the same file, it makes it
>> much easier to see the differences in entries and catch errors.
> 
> Does it make it easier? In my experience it's more difficult to compare

For some of my use cases, it makes it significantly easier.

I need to build multilib configurations that have a variety of tunes in them.
Before, I had to include 5-7 different .inc files and then ignore the warning
about the same files being included multiple times.

With this approach, I can include just one file, the proper superset of them
all, and then do my multilib configurations together.

(Yes, I realize this isn't the typical Linux use-case, but this seems to be
fairly common to do things like this when working with baremetal configurations
or trying to use the Yocto Project to build toolchains that are shared with others.)

> very similar long sections in different parts of the same file than to
> compare very similar files with diff.
> When working on
> https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/tune-test
> https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/tune2-test
> https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/tune3
> https://git.openembedded.org/openembedded-core-contrib/log/?h=jansa/optdefaulttune
> a long time ago, I found it really convenient to compare all cortex*.inc
> files just by replacing the number with some placeholder and comparing
> the resulting files (which were in most cases identical). Doing the
> same inside a single file is more complicated: you would probably first
> have to split the various sections back into separate files.
> 
>> Longer term, I'd like to see something that removes the redundancies
>> completely, but it will take some work.
> 
> So now all BSPs need to adapt (by switching from the specific .inc
> file for the actual CPU their MACHINE uses to the generic family .inc
> and setting one of the possible DEFAULTTUNE values, instead of taking
> the default from the .inc file), and all this only for the dubious
> benefit of having fewer .inc files in oe-core, right?

I suggested how a transition could be done, but this was rejected in favor of
making people move their BSPs to the new format.

--Mark

>>> With 32bit arm include files it was useful to be able to compare various include files to check that they follow the same "structure" (or to add new cortexa* one by copying existing and regex-replace).
>>>
>>> Also AVAILTUNES will now contain all possible tunes from given family (which makes it almost useless for aarch64 tunes). So instead of BSP including .inc file corresponding with the core used in the MACHINE and getting some "sane" default DEFAULTTUNE (plus some other possible options listed in AVAILTUNES) we now force every BSP to set more specific DEFAULTTUNE and then let user figure out what other DEFAULTTUNEs might be compatible with it, e.g. when someone is doing multi MACHINE builds and wants to share TUNE_PKGARCH between them (to save build time, package feed size etc).
>>
>> You can still set a generic default for all the machines of a given
>> family.  By including armv8a.inc, you still get 'DEFAULTTUNE ?=
>> "armv8a-crc"'.  This will allow for a generic for an entire family.
>> I'm not seeing the issue.
> 
> OK, with e.g. raspberrypi4-64.conf from
> https://github.com/agherzan/meta-raspberrypi/blob/a4c8118676ba8002edab29fc81b4e4edd9fad1f1/conf/machine/raspberrypi4-64.conf
> which just has
> require conf/machine/include/tune-cortexa72.inc
> 
> Without your patches:
> DEFAULTTUNE="cortexa72"
> TUNE_PKGARCH="cortexa72"
> TUNE_CCARGS=" -mcpu=cortex-a72+crc+crypto"
> AVAILTUNES=" armv4 armv4t armv4b armv4tb armv5 armv5t armv5-vfp
> armv5t-vfp armv5hf-vfp armv5thf-vfp armv5b armv5tb armv5b-vfp
> armv5tb-vfp armv5hfb-vfp armv5thfb-vfp armv5e armv5te armv5e-vfp
> armv5te-vfp armv5ehf-vfp armv5tehf-vfp armv5eb armv5teb armv5eb-vfp
> armv5teb-vfp armv5ehfb-vfp armv5tehfb-vfp armv6-novfp armv6t-novfp armv6
> armv6t armv6hf armv6thf armv6b-novfp armv6tb-novfp armv6b armv6tb
> armv6hfb armv6thfb armv7a armv7at armv7a-vfpv3d16 armv7at-vfpv3d16
> armv7a-vfpv3 armv7at-vfpv3 armv7a-vfpv4d16 armv7at-vfpv4d16 armv7a-neon
> armv7at-neon armv7a-neon-vfpv4 armv7at-neon-vfpv4 armv7ahf armv7athf
> armv7ahf-vfpv3d16 armv7athf-vfpv3d16 armv7ahf-vfpv3 armv7athf-vfpv3
> armv7ahf-vfpv4d16 armv7athf-vfpv4d16 armv7ahf-neon armv7athf-neon
> armv7ahf-neon-vfpv4 armv7athf-neon-vfpv4 armv7ab armv7atb
> armv7ab-vfpv3d16 armv7atb-vfpv3d16 armv7ab-vfpv3 armv7atb-vfpv3
> armv7ab-vfpv4d16 armv7atb-vfpv4d16 armv7ab-neon armv7atb-neon
> armv7ab-neon-vfpv4 armv7atb-neon-vfpv4 armv7ahfb armv7athfb
> armv7ahfb-vfpv3d16 armv7athfb-vfpv3d16 armv7ahfb-vfpv3 armv7athfb-vfpv3
> armv7ahfb-vfpv4d16 armv7athfb-vfpv4d16 armv7ahfb-neon armv7athfb-neon
> armv7ahfb-neon-vfpv4 armv7athfb-neon-vfpv4 armv7ve armv7vet
> armv7ve-vfpv3d16 armv7vet-vfpv3d16 armv7ve-vfpv3 armv7vet-vfpv3
> armv7ve-vfpv4d16 armv7vet-vfpv4d16 armv7ve-neon armv7vet-neon
> armv7ve-neon-vfpv4 armv7vet-neon-vfpv4 armv7vehf armv7vethf
> armv7vehf-vfpv3d16 armv7vethf-vfpv3d16 armv7vehf-vfpv3 armv7vethf-vfpv3
> armv7vehf-vfpv4d16 armv7vethf-vfpv4d16 armv7vehf-neon armv7vethf-neon
> armv7vehf-neon-vfpv4 armv7vethf-neon-vfpv4 armv7veb armv7vetb
> armv7veb-vfpv3d16 armv7vetb-vfpv3d16 armv7veb-vfpv3 armv7vetb-vfpv3
> armv7veb-vfpv4d16 armv7vetb-vfpv4d16 armv7veb-neon armv7vetb-neon
> armv7veb-neon-vfpv4 armv7vetb-neon-vfpv4 armv7vehfb armv7vethfb
> armv7vehfb-vfpv3d16 armv7vethfb-vfpv3d16 armv7vehfb-vfpv3
> armv7vethfb-vfpv3 armv7vehfb-vfpv4d16 armv7vethfb-vfpv4d16
> armv7vehfb-neon armv7vethfb-neon armv7vehfb-neon-vfpv4
> armv7vethfb-neon-vfpv4 aarch64 aarch64_be armv8a armv8a-crc
> armv8a-crc-crypto armv8a-crypto cortexa72"
> 
> With your patches:
> ERROR: ParseError at /OE/build/oe-core/meta-raspberrypi/conf/machine/raspberrypi4-64.conf:13: Could not include required file conf/machine/include/tune-cortexa72.inc
> 
> After changing the require to conf/machine/include/arm/arch-armv8a.inc:
> DEFAULTTUNE="armv8a-crc"
> TUNE_PKGARCH="armv8a"
> TUNE_CCARGS=" -march=armv8-a+crc"
> AVAILTUNES=" cortexa32 cortexa32-crypto cortexa34 cortexa34-crypto
> cortexa35 cortexa35-crypto cortexa53 cortexa53-crypto cortexa57
> cortexa57-crypto cortexa72 cortexa73 thunderx thunderx_be
> cortexa57-cortexa53 cortexa72-cortexa53 cortexa72-cortexa53-crypto
> cortexa73-cortexa35 cortexa73-cortexa35-crypto cortexa73-cortexa53
> cortexa73-cortexa53-crypto armv4 armv4t armv4b armv4tb armv5 armv5t
> armv5-vfp armv5t-vfp armv5hf-vfp armv5thf-vfp armv5b armv5tb armv5b-vfp
> armv5tb-vfp armv5hfb-vfp armv5thfb-vfp armv5e armv5te armv5e-vfp
> armv5te-vfp armv5ehf-vfp armv5tehf-vfp armv5eb armv5teb armv5eb-vfp
> armv5teb-vfp armv5ehfb-vfp armv5tehfb-vfp armv6-novfp armv6t-novfp armv6
> armv6t armv6hf armv6thf armv6b-novfp armv6tb-novfp armv6b armv6tb
> armv6hfb armv6thfb armv7a armv7at armv7a-vfpv3d16 armv7at-vfpv3d16
> armv7a-vfpv3 armv7at-vfpv3 armv7a-vfpv4d16 armv7at-vfpv4d16 armv7a-neon
> armv7at-neon armv7a-neon-vfpv4 armv7at-neon-vfpv4 armv7ahf armv7athf
> armv7ahf-vfpv3d16 armv7athf-vfpv3d16 armv7ahf-vfpv3 armv7athf-vfpv3
> armv7ahf-vfpv4d16 armv7athf-vfpv4d16 armv7ahf-neon armv7athf-neon
> armv7ahf-neon-vfpv4 armv7athf-neon-vfpv4 armv7ab armv7atb
> armv7ab-vfpv3d16 armv7atb-vfpv3d16 armv7ab-vfpv3 armv7atb-vfpv3
> armv7ab-vfpv4d16 armv7atb-vfpv4d16 armv7ab-neon armv7atb-neon
> armv7ab-neon-vfpv4 armv7atb-neon-vfpv4 armv7ahfb armv7athfb
> armv7ahfb-vfpv3d16 armv7athfb-vfpv3d16 armv7ahfb-vfpv3 armv7athfb-vfpv3
> armv7ahfb-vfpv4d16 armv7athfb-vfpv4d16 armv7ahfb-neon armv7athfb-neon
> armv7ahfb-neon-vfpv4 armv7athfb-neon-vfpv4 armv7ve armv7vet
> armv7ve-vfpv3d16 armv7vet-vfpv3d16 armv7ve-vfpv3 armv7vet-vfpv3
> armv7ve-vfpv4d16 armv7vet-vfpv4d16 armv7ve-neon armv7vet-neon
> armv7ve-neon-vfpv4 armv7vet-neon-vfpv4 armv7vehf armv7vethf
> armv7vehf-vfpv3d16 armv7vethf-vfpv3d16 armv7vehf-vfpv3 armv7vethf-vfpv3
> armv7vehf-vfpv4d16 armv7vethf-vfpv4d16 armv7vehf-neon armv7vethf-neon
> armv7vehf-neon-vfpv4 armv7vethf-neon-vfpv4 armv7veb armv7vetb
> armv7veb-vfpv3d16 armv7vetb-vfpv3d16 armv7veb-vfpv3 armv7vetb-vfpv3
> armv7veb-vfpv4d16 armv7vetb-vfpv4d16 armv7veb-neon armv7vetb-neon
> armv7veb-neon-vfpv4 armv7vetb-neon-vfpv4 armv7vehfb armv7vethfb
> armv7vehfb-vfpv3d16 armv7vethfb-vfpv3d16 armv7vehfb-vfpv3
> armv7vethfb-vfpv3 armv7vehfb-vfpv4d16 armv7vethfb-vfpv4d16
> armv7vehfb-neon armv7vethfb-neon armv7vehfb-neon-vfpv4
> armv7vethfb-neon-vfpv4 aarch64 aarch64_be armv8a armv8a-crc
> armv8a-crc-crypto armv8a-crypto"
> 
> Should raspberrypi4-64 users really see e.g. cortexa73-cortexa35-crypto
> in AVAILTUNES even when it doesn't make sense for their HW?
> 

* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
                   ` (6 preceding siblings ...)
  2020-09-15  7:09 ` Robert Berger
@ 2020-09-16 13:26 ` Richard Purdie
  2020-09-16 13:45   ` Jon Mason
  7 siblings, 1 reply; 19+ messages in thread
From: Richard Purdie @ 2020-09-16 13:26 UTC (permalink / raw)
  To: Jon Mason, openembedded-core

On Mon, 2020-09-14 at 11:13 -0400, Jon Mason wrote:
> There is a large number of Arm Tune files located in
> meta/conf/machine/include/, and to support the current and upcoming Arm
> cores, more are needed.  Adding more files is simply going to make it
> harder to find the relevant ones for an OE/YP developer/user.  Also,
> there are problems with stale and erroneous configs (see my previous
> series), which will only be exacerbated by having more files.
> 
> I am proposing a reorganization of the existing tune files by including
> them in the generic family include file.  For example, the
> tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and
> tune-cortexa57.inc would be moved into arch-armv8a.inc.  This reduces
> the number of files from 12 to 2 for ARMv8a, and that is excluding the
> 13 I am adding in this series that would otherwise be unique files.
> 
> To use, simply add
> ...
> DEFAULTTUNE ?= "neoversen1"
> require conf/machine/include/arm/arch-armv8-2a.inc
> ...
> 
> Which is arguably what should be done anyway (instead of taking the
> default of the tune include file).
> See the qemuarm64 patch in the series for a working example.
> 
> Of course, by removing the existing tune files, current users are going
> to break.  A simple script can be written to use sed (or similar) to
> replace the relevant parts for those users that would be affected (at
> least for those that are in the layer index and update regularly).

I've just looked at this in a bit more detail and I'm worried.

The intent is that the BSP includes the processor/core tune file that it's
based upon. The BSP should usually know which one is present.

That tune then presents the possible options, which the BSP selects
from but the end user or distro can override.

With this change, you get all tunes and you don't know which are
compatible with a given core/processor. I'm not sure that is an
improvement.

I appreciate not wanting "lots of files" but we need to consider the
usability too.

As others have mentioned, the errors are fairly obvious with diff
between the files (or a GUI like meld).

What are the advantages this brings other than fewer files? Am I
missing something?

Cheers,

Richard




* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-15  7:09 ` Robert Berger
@ 2020-09-16 13:38   ` Jon Mason
  2020-09-17 12:05     ` Robert Berger
  0 siblings, 1 reply; 19+ messages in thread
From: Jon Mason @ 2020-09-16 13:38 UTC (permalink / raw)
  To: Robert Berger; +Cc: Patches and discussions about the oe-core layer

On Tue, Sep 15, 2020 at 3:09 AM Robert Berger
<oecore.mailinglist@gmail.com> wrote:
>
> Hi Jon,
>
> That's not really a comment on the reorganization of compiler tunes, but
> more like "Do they actually do something meaningful?"
>
> I posted here[1] some benchmarks and at least with the benchmarks I
> tried on the chips I tried there is no obvious impact.
>
> i.mx6q:
>
> TUNE_FEATURES        = "arm armv7a vfp thumb callconvention-hard"
> TARGET_FPU           = "hard"
>
> vs.
>
> TUNE_FEATURES        = "arm vfp cortexa9 neon thumb callconvention-hard"
> TARGET_FPU           = "hard"
>
>
> i.m8mm:
>
> TUNE_FEATURES        = "aarch64 cortexa53 crc crypto"
> TARGET_FPU           = ""
>
> vs.
>
> TUNE_FEATURES        = "aarch64 armv8a crc crypto"
> TARGET_FPU           = ""
>
>
> [1]
> https://yoctoproject.blogspot.com/2020/09/compiler-tunes-benchmarks-with-yocto.html
>
> Should we expect to see differences?

There are more things at play here than simply performance.  But
speaking of performance, there might not be much of a difference between
a generic ARMv8.0 tune and an ARMv8-based core (like A53), but I do expect
to see a performance bump for an ARMv8.2-based core (like A76).  The
delta in the former case is much smaller than in the latter.

The differences are not only in performance.  Tuning for A76 versus a
more generic armv8a allows security features like
branch-protection to be enabled (as it isn't supported in older
versions).  You get these kinds of things "by default" when tuning for
the specific model.

Thanks,
Jon

> If so can you suggest benchmarks which show those differences?
>
>
> Regards,
>
> Robert

* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-16 13:26 ` Richard Purdie
@ 2020-09-16 13:45   ` Jon Mason
  2020-09-16 13:49     ` Richard Purdie
  0 siblings, 1 reply; 19+ messages in thread
From: Jon Mason @ 2020-09-16 13:45 UTC (permalink / raw)
  To: Richard Purdie; +Cc: Patches and discussions about the oe-core layer

On Wed, Sep 16, 2020 at 9:26 AM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> On Mon, 2020-09-14 at 11:13 -0400, Jon Mason wrote:
> > There is a large number of Arm Tune files located in
> > meta/conf/machine/include/, and to support the current and upcoming Arm
> > cores, more are needed.  Adding more files is simply going to make it
> > harder to find the relevant ones for an OE/YP developer/user.  Also,
> > there are problems with stale and erroneous configs (see my previous
> > series), which will only be exacerbated by having more files.
> >
> > I am proposing a reorganization of the existing tune files by including
> > them in the generic family include file.  For example, the
> > tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and
> > tune-cortexa57.inc would be moved into arch-armv8a.inc.  This reduces
> > the number of files from 12 to 2 for ARMv8a, and that is excluding the
> > 13 I am adding in this series that would otherwise be unique files.
> >
> > To use, simply add
> > ...
> > DEFAULTTUNE ?= "neoversen1"
> > require conf/machine/include/arm/arch-armv8-2a.inc
> > ...
> >
> > Which is arguably what should be done anyway (instead of taking the
> > default of the tune include file).
> > See the qemuarm64 patch in the series for a working example.
> >
> > Of course, by removing the existing tune files, current users are going
> > to break.  A simple script can be written to use sed (or similar) to
> > replace the relevant parts for those users that would be affected (at
> > least for those that are in the layer index and update regularly).
>
> I've just looked at this in a bit more detail and I'm worried.
>
> The intent is that the BSP includes the processor/core tune file that it's
> based upon. The BSP should usually know which one is present.
>
> That tune then presents the possible options, which the BSP selects
> from but the end user or distro can override.
>
> With this change, you get all tunes and you don't know which are
> compatible with a given core/processor. I'm not sure that is an
> improvement.

Before you were selecting an A75 inc file (for example), now you are
specifying an ARMv8.2 inc file and saying it is an A75 from the
available list.  Either way, you needed to know which core you were
running.

In addition to this, you have the ability to run the generic default
for the family, which will work for all of them.  And for ARMv8, even
the more generic armv8 will work for all of them.  So, there isn't an
incompatibility problem.

> I appreciate not wanting "lots of files" but we need to consider the
> usability too.
>
> As others have mentioned, the errors are fairly obvious with diff
> between the files (or a GUI like meld).
>
> What are the advantages this brings other than fewer files? Am I
> missing something?

This is the main benefit of it, TBH.  If I add 25 more arm tunes, it's
going to get ugly.  And if Arm keeps adding cores at this same pace
every year, it's going to get even uglier.  However, I'm fine to send
out those patches if it only bothers me to have so many files.

Thanks,
Jon

>
> Cheers,
>
> Richard
>
>
>


* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-16 13:45   ` Jon Mason
@ 2020-09-16 13:49     ` Richard Purdie
  2020-09-16 14:25       ` Jon Mason
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Purdie @ 2020-09-16 13:49 UTC (permalink / raw)
  To: Jon Mason; +Cc: Patches and discussions about the oe-core layer

On Wed, 2020-09-16 at 09:45 -0400, Jon Mason wrote:
> On Wed, Sep 16, 2020 at 9:26 AM Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> > On Mon, 2020-09-14 at 11:13 -0400, Jon Mason wrote:
> > > There is a large number of Arm Tune files located in
> > > meta/conf/machine/include/, and to support the current and
> > > upcoming Arm
> > > cores, more are needed.  Adding more files is simply going to
> > > make it
> > > harder to find the relevant ones for an OE/YP
> > > developer/user.  Also,
> > > there are problems with stale and erroneous configs (see my
> > > previous
> > > series), which will only be exacerbated by having more files.
> > > 
> > > I am proposing a reorganization of the existing tune files by
> > > including
> > > them in the generic family include file.  For example, the
> > > tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and
> > > tune-cortexa57.inc would be moved into arch-armv8a.inc.  This
> > > reduces
> > > the number of files from 12 to 2 for ARMv8a, and that is
> > > excluding the
> > > 13 I am adding in this series that would otherwise be unique
> > > files.
> > > 
> > > To use, simply add
> > > ...
> > > DEFAULTTUNE ?= "neoversen1"
> > > require conf/machine/include/arm/arch-armv8-2a.inc
> > > ...
> > > 
> > > Which is arguably what should be done anyway (instead of taking
> > > the
> > > default of the tune include file).
> > > See the qemuarm64 patch in the series for a working example.
> > > 
> > > Of course, by removing the existing tune files, current users are
> > > going
> > > to break.  A simple script can be written to use sed (or similar)
> > > to
> > > replace the relevant parts for those users that would be affected
> > > (at
> > > least for those that are in the layer index and update
> > > regularly).
> > 
> > I've just looked at this in a bit more detail and I'm worried.
> > 
> > The intent is the BSP includes the processor/core tune file that
> > it's based upon. The BSP should usually know which one is present.
> > 
> > That tune then presents the possible options which the BSP selects
> > from but the end user or distro can override.
> > 
> > With this change, you get all tunes and you don't know which are
> > compatible with a given core/processor. I'm not sure that is an
> > improvement.
> 
> Before you were selecting an A75 inc file (for example), now you are
> specifying an ARMv8.2 inc file and saying it is an A75 from the
> available list.  Either way, you needed to know which core you were
> running.
> 
> In addition to this, you have the ability to run the generic default
> for the family, which will work for all of them.  And for ARMv8, even
> the more generic armv8 will work for all of them.  So, there isn't an
> incompatibility problem.

You should have the ability to access the generic tune with either
approach?

> > I appreciate not wanting "lots of files" but we need to consider
> > the
> > usability too.
> > 
> > As others have mentioned, the errors are fairly obvious with diff
> > between the files (or a GUI like meld).
> > 
> > What are the advantages this brings other than fewer files? Am I
> > missing something?
> 
> This is the main benefit of it, TBH.  If I add 25 more arm tunes,
> it's going to get ugly.  And if Arm keeps adding cores at this same
> pace every year, it's going to get even uglier.  However, I'm fine to
> send out those patches if it only bothers me to have so many files.

My worry is that the current system:

a) tells BSP developers to select their processor, not an arch
b) only shows them tunes that work on that processor

So we're changing the model, but only for armv8XXX. This is going to be
confusing. I do like only showing people things they can use too.

How about we split the difference and add the new tune files into
subdirs by architecture?

Cheers,

Richard



* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-16 13:49     ` Richard Purdie
@ 2020-09-16 14:25       ` Jon Mason
  2020-09-16 14:30         ` Martin Jansa
  2020-09-16 14:46         ` Richard Purdie
  0 siblings, 2 replies; 19+ messages in thread
From: Jon Mason @ 2020-09-16 14:25 UTC (permalink / raw)
  To: Richard Purdie; +Cc: Patches and discussions about the oe-core layer

On Wed, Sep 16, 2020 at 9:49 AM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> On Wed, 2020-09-16 at 09:45 -0400, Jon Mason wrote:
> > On Wed, Sep 16, 2020 at 9:26 AM Richard Purdie
> > <richard.purdie@linuxfoundation.org> wrote:
> > > On Mon, 2020-09-14 at 11:13 -0400, Jon Mason wrote:
> > > > There is a large number of Arm Tune files located in
> > > > meta/conf/machine/include/, and to support the current and
> > > > upcoming Arm
> > > > cores, more are needed.  Adding more files is simply going to
> > > > make it
> > > > harder to find the relevant ones for an OE/YP
> > > > developer/user.  Also,
> > > > there are problems with stale and erroneous configs (see my
> > > > previous
> > > > series), which will only be exacerbated by having more files.
> > > >
> > > > I am proposing a reorganization of the existing tune files by
> > > > including
> > > > them in the generic family include file.  For example, the
> > > > tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and
> > > > tune-cortexa57.inc would be moved into arch-armv8a.inc.  This
> > > > reduces
> > > > the number of files from 12 to 2 for ARMv8a, and that is
> > > > excluding the
> > > > 13 I am adding in this series that would otherwise be unique
> > > > files.
> > > >
> > > > To use, simply add
> > > > ...
> > > > DEFAULTTUNE ?= "neoversen1"
> > > > require conf/machine/include/arm/arch-armv8-2a.inc
> > > > ...
> > > >
> > > > Which is arguably what should be done anyway (instead of taking
> > > > the
> > > > default of the tune include file).
> > > > See the qemuarm64 patch in the series for a working example.
> > > >
> > > > Of course, by removing the existing tune files, current users are
> > > > going
> > > > to break.  A simple script can be written to use sed (or similar)
> > > > to
> > > > replace the relevant parts for those users that would be affected
> > > > (at
> > > > least for those that are in the layer index and update
> > > > regularly).
> > >
> > > I've just looked at this in a bit more detail and I'm worried.
> > >
> > > The intent is the BSP includes the processor/core tune file that
> > > it's based upon. The BSP should usually know which one is present.
> > >
> > > That tune then presents the possible options which the BSP selects
> > > from but the end user or distro can override.
> > >
> > > With this change, you get all tunes and you don't know which are
> > > compatible with a given core/processor. I'm not sure that is an
> > > improvement.
> >
> > Before you were selecting an A75 inc file (for example), now you are
> > specifying an ARMv8.2 inc file and saying it is an A75 from the
> > available list.  Either way, you needed to know which core you were
> > running.
> >
> > In addition to this, you have the ability to run the generic default
> > for the family, which will work for all of them.  And for ARMv8, even
> > the more generic armv8 will work for all of them.  So, there isn't an
> > incompatibility problem.
>
> You should have the ability to access the generic tune with either
> approach?
>
> > > I appreciate not wanting "lots of files" but we need to consider
> > > the
> > > usability too.
> > >
> > > As others have mentioned, the errors are fairly obvious with diff
> > > between the files (or a GUI like meld).
> > >
> > > What are the advantages this brings other than fewer files? Am I
> > > missing something?
> >
> > This is the main benefit of it, TBH.  If I add 25 more arm tunes,
> > it's going to get ugly.  And if Arm keeps adding cores at this same
> > pace every year, it's going to get even uglier.  However, I'm fine to
> > send out those patches if it only bothers me to have so many files.
>
> My worry is that the current system:
>
> a) tells BSP developers to select their processor, not an arch
> b) only shows them tunes that work on that processor
>
> So we're changing the model, but only for armv8XXX. This is going to be
> confusing. I do like only showing people things they can use too.
>
> How about we split the difference and add the new tune files into
> subdirs by architecture?

So it would look something like
meta/conf/machine/arm/armv8.0/tune-cortexa57.inc
meta/conf/machine/arm/armv8.2/tune-cortexa75.inc

And inside of those, it would reference
meta/conf/machine/include/arm/arch-armv8a.inc or
meta/conf/machine/include/arm/arch-armv8-2a.inc (just as it does now).
Correct?

Assuming so, does it make sense to try and match this with all the
other CPU architectures (e.g., x86, ppc, mips, armv7, etc)?

Thanks,
Jon


> Cheers,
>
> Richard
>


* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-16 14:25       ` Jon Mason
@ 2020-09-16 14:30         ` Martin Jansa
  2020-09-16 14:46         ` Richard Purdie
  1 sibling, 0 replies; 19+ messages in thread
From: Martin Jansa @ 2020-09-16 14:30 UTC (permalink / raw)
  To: Jon Mason; +Cc: Richard Purdie, Patches and discussions about the oe-core layer


On Wed, Sep 16, 2020 at 10:25:49AM -0400, Jon Mason wrote:
> On Wed, Sep 16, 2020 at 9:49 AM Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> >
> > On Wed, 2020-09-16 at 09:45 -0400, Jon Mason wrote:
> > > On Wed, Sep 16, 2020 at 9:26 AM Richard Purdie
> > > <richard.purdie@linuxfoundation.org> wrote:
> > > > On Mon, 2020-09-14 at 11:13 -0400, Jon Mason wrote:
> > > > > There is a large number of Arm Tune files located in
> > > > > meta/conf/machine/include/, and to support the current and
> > > > > upcoming Arm
> > > > > cores, more are needed.  Adding more files is simply going to
> > > > > make it
> > > > > harder to find the relevant ones for an OE/YP
> > > > > developer/user.  Also,
> > > > > there are problems with stale and erroneous configs (see my
> > > > > previous
> > > > > series), which will only be exacerbated by having more files.
> > > > >
> > > > > I am proposing a reorganization of the existing tune files by
> > > > > including
> > > > > them in the generic family include file.  For example, the
> > > > > tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and
> > > > > tune-cortexa57.inc would be moved into arch-armv8a.inc.  This
> > > > > reduces
> > > > > the number of files from 12 to 2 for ARMv8a, and that is
> > > > > excluding the
> > > > > 13 I am adding in this series that would otherwise be unique
> > > > > files.
> > > > >
> > > > > To use, simply add
> > > > > ...
> > > > > DEFAULTTUNE ?= "neoversen1"
> > > > > require conf/machine/include/arm/arch-armv8-2a.inc
> > > > > ...
> > > > >
> > > > > Which is arguably what should be done anyway (instead of taking
> > > > > the
> > > > > default of the tune include file).
> > > > > See the qemuarm64 patch in the series for a working example.
> > > > >
> > > > > Of course, by removing the existing tune files, current users are
> > > > > going
> > > > > to break.  A simple script can be written to use sed (or similar)
> > > > > to
> > > > > replace the relevant parts for those users that would be affected
> > > > > (at
> > > > > least for those that are in the layer index and update
> > > > > regularly).
> > > >
> > > > I've just looked at this in a bit more detail and I'm worried.
> > > >
> > > > The intent is the BSP includes the processor/core tune file that
> > > > it's based upon. The BSP should usually know which one is present.
> > > >
> > > > That tune then presents the possible options which the BSP selects
> > > > from but the end user or distro can override.
> > > >
> > > > With this change, you get all tunes and you don't know which are
> > > > compatible with a given core/processor. I'm not sure that is an
> > > > improvement.
> > >
> > > Before you were selecting an A75 inc file (for example), now you are
> > > specifying an ARMv8.2 inc file and saying it is an A75 from the
> > > available list.  Either way, you needed to know which core you were
> > > running.
> > >
> > > In addition to this, you have the ability to run the generic default
> > > for the family, which will work for all of them.  And for ARMv8, even
> > > the more generic armv8 will work for all of them.  So, there isn't an
> > > incompatibility problem.
> >
> > You should have the ability to access the generic tune with either
> > approach?
> >
> > > > I appreciate not wanting "lots of files" but we need to consider
> > > > the
> > > > usability too.
> > > >
> > > > As others have mentioned, the errors are fairly obvious with diff
> > > > between the files (or a GUI like meld).
> > > >
> > > > What are the advantages this brings other than fewer files? Am I
> > > > missing something?
> > >
> > > This is the main benefit of it, TBH.  If I add 25 more arm tunes,
> > > it's going to get ugly.  And if Arm keeps adding cores at this same
> > > pace every year, it's going to get even uglier.  However, I'm fine to
> > > send out those patches if it only bothers me to have so many files.
> >
> > My worry is that the current system:
> >
> > a) tells BSP developers to select their processor, not an arch
> > b) only shows them tunes that work on that processor
> >
> > So we're changing the model, but only for armv8XXX. This is going to be
> > confusing. I do like only showing people things they can use too.
> >
> > How about we split the difference and add the new tune files into
> > subdirs by architecture?
> 
> So it would look something like
> meta/conf/machine/arm/armv8.0/tune-cortexa57.inc
> meta/conf/machine/arm/armv8.2/tune-cortexa75.inc

This additional level will still force all BSPs to update the
include/require line in MACHINE configs.
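
To give an idea of what that update could look like, here is a rough
sketch. It assumes MACHINE configs currently pull the tune from the
flat conf/machine/include/ path and that the new location follows the
layout quoted above; migrate_tune_path is a made-up helper name, and
neither path is a settled decision.

```shell
# Hypothetical helper: rewrite include/require lines that name a tune
# file at the old flat path so they point at the per-arch subdir.
# Both paths are assumptions based on the layout discussed in this thread.
migrate_tune_path() {
    core="$1"      # e.g. cortexa75
    subdir="$2"    # e.g. armv8.2
    shift 2
    for conf in "$@"; do
        tmp="${conf}.tmp.$$"
        # Only lines starting with include/require are touched;
        # everything else (DEFAULTTUNE, comments, ...) is left alone.
        sed -E "s#^((include|require)[[:space:]]+)conf/machine/include/tune-${core}\.inc#\1conf/machine/arm/${subdir}/tune-${core}.inc#" \
            "$conf" > "$tmp" && mv "$tmp" "$conf"
    done
}
```

Something like "migrate_tune_path cortexa75 armv8.2 conf/machine/*.conf"
would then be run once per core in each BSP layer, so every layer still
has to carry a change.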

> And inside of those, it would reference
> meta/conf/machine/include/arm/arch-armv8.inc or
> meta/conf/machine/include/arm/arch-armv8-2a.inc (just as it does now).
> Correct?

Isn't this enough to see which tune-cortex* file belongs to which
family? (e.g. git grep arch-armv8-2a.inc to see all available tune files
from the armv8.2a family, instead of looking in an armv8.2 subdirectory)
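
A minimal sketch of that lookup, for illustration only. It assumes the
tune files sit under meta/conf/machine/include/ and each one names its
arch include in a require line; list_tunes_for_arch is a made-up helper
name. Plain grep -r is used so it also works outside a git checkout;
inside one, git grep -lF covers the tracked files the same way.

```shell
# Hypothetical helper: list the tune-*.inc files that reference a given
# arch include. The default directory is an assumption about the layout.
list_tunes_for_arch() {
    arch_inc="$1"                               # e.g. arch-armv8-2a.inc
    dir="${2:-meta/conf/machine/include}"
    # -r recurse, -l print file names only, -F match a fixed string
    grep -rlF -- "$arch_inc" "$dir" | grep '/tune-' | sort
}
```

Run as "list_tunes_for_arch arch-armv8-2a.inc" from the top of the
layer, this prints one tune file per line for that family.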

> Assuming so, does it make sense to try and match this with all the
> other CPU architectures (e.g., x86, ppc, mips, armv7, etc)?
> 
> Thanks,
> Jon
> 
> 
> > Cheers,
> >
> > Richard
> >

> 




* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-16 14:25       ` Jon Mason
  2020-09-16 14:30         ` Martin Jansa
@ 2020-09-16 14:46         ` Richard Purdie
  1 sibling, 0 replies; 19+ messages in thread
From: Richard Purdie @ 2020-09-16 14:46 UTC (permalink / raw)
  To: Jon Mason; +Cc: Patches and discussions about the oe-core layer

On Wed, 2020-09-16 at 10:25 -0400, Jon Mason wrote:
> On Wed, Sep 16, 2020 at 9:49 AM Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> > On Wed, 2020-09-16 at 09:45 -0400, Jon Mason wrote:
> > > On Wed, Sep 16, 2020 at 9:26 AM Richard Purdie
> > > <richard.purdie@linuxfoundation.org> wrote:
> > > > On Mon, 2020-09-14 at 11:13 -0400, Jon Mason wrote:
> > > > > There is a large number of Arm Tune files located in
> > > > > meta/conf/machine/include/, and to support the current and
> > > > > upcoming Arm
> > > > > cores, more are needed.  Adding more files is simply going to
> > > > > make it
> > > > > harder to find the relevant ones for an OE/YP
> > > > > developer/user.  Also,
> > > > > there are problems with stale and erroneous configs (see my
> > > > > previous
> > > > > series), which will only be exacerbated by having more files.
> > > > > 
> > > > > I am proposing a reorganization of the existing tune files by
> > > > > including
> > > > > them in the generic family include file.  For example, the
> > > > > tune-cortexa55.inc would be moved into arch-armv8-2a.inc, and
> > > > > tune-cortexa57.inc would be moved into arch-armv8a.inc.  This
> > > > > reduces
> > > > > the number of files from 12 to 2 for ARMv8a, and that is
> > > > > excluding the
> > > > > 13 I am adding in this series that would otherwise be unique
> > > > > files.
> > > > > 
> > > > > To use, simply add
> > > > > ...
> > > > > DEFAULTTUNE ?= "neoversen1"
> > > > > require conf/machine/include/arm/arch-armv8-2a.inc
> > > > > ...
> > > > > 
> > > > > Which is arguably what should be done anyway (instead of
> > > > > taking
> > > > > the
> > > > > default of the tune include file).
> > > > > See the qemuarm64 patch in the series for a working example.
> > > > > 
> > > > > Of course, by removing the existing tune files, current users
> > > > > are
> > > > > going
> > > > > to break.  A simple script can be written to use sed (or
> > > > > similar)
> > > > > to
> > > > > replace the relevant parts for those users that would be
> > > > > affected
> > > > > (at
> > > > > least for those that are in the layer index and update
> > > > > regularly).
> > > > 
> > > > I've just looked at this in a bit more detail and I'm worried.
> > > > 
> > > > The intent is the BSP includes the processor/core tune file
> > > > that it's based upon. The BSP should usually know which one is
> > > > present.
> > > > 
> > > > That tune then presents the possible options which the BSP
> > > > selects from but the end user or distro can override.
> > > > 
> > > > With this change, you get all tunes and you don't know which
> > > > are
> > > > compatible with a given core/processor. I'm not sure that is an
> > > > improvement.
> > > 
> > > Before you were selecting an A75 inc file (for example), now you
> > > are
> > > specifying an ARMv8.2 inc file and saying it is an A75 from the
> > > available list.  Either way, you needed to know which core you
> > > were
> > > running.
> > > 
> > > In addition to this, you have the ability to run the generic
> > > default
> > > for the family, which will work for all of them.  And for ARMv8,
> > > even
> > > the more generic armv8 will work for all of them.  So, there
> > > isn't an
> > > incompatibility problem.
> > 
> > You should have the ability to access the generic tune with either
> > approach?
> > 
> > > > I appreciate not wanting "lots of files" but we need to
> > > > consider
> > > > the
> > > > usability too.
> > > > 
> > > > As others have mentioned, the errors are fairly obvious with
> > > > diff
> > > > between the files (or a GUI like meld).
> > > > 
> > > > What are the advantages this brings other than fewer files? Am
> > > > I
> > > > missing something?
> > > 
> > > This is the main benefit of it, TBH.  If I add 25 more arm tunes,
> > > it's going to get ugly.  And if Arm keeps adding cores at this
> > > same
> > > pace every year, it's going to get even uglier.  However, I'm
> > > fine to
> > > send out those patches if it only bothers me to have so many
> > > files.
> > 
> > My worry is that the current system:
> > 
> > a) tells BSP developers to select their processor, not an arch
> > b) only shows them tunes that work on that processor
> > 
> > So we're changing the model, but only for armv8XXX. This is going
> > to be
> > confusing. I do like only showing people things they can use too.
> > 
> > How about we split the difference and add the new tune files into
> > subdirs by architecture?
> 
> So it would look something like
> meta/conf/machine/arm/armv8.0/tune-cortexa57.inc
> meta/conf/machine/arm/armv8.2/tune-cortexa75.inc

Yes. With an open question on whether we move any existing files. Maybe
just the armv8.0+ ones?

> And inside of those, it would reference
> meta/conf/machine/include/arm/arch-armv8.inc or
> meta/conf/machine/include/arm/arch-armv8-2a.inc (just as it does
> now). Correct?

Yes.

> Assuming so, does it make sense to try and match this with all the
> other CPU architectures (e.g., x86, ppc, mips, armv7, etc)?

I'd say not worth it, just do this either for new ones or the armv8*
onwards ones.

Cheers,

Richard



* Re: [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg
  2020-09-16 13:38   ` Jon Mason
@ 2020-09-17 12:05     ` Robert Berger
  0 siblings, 0 replies; 19+ messages in thread
From: Robert Berger @ 2020-09-17 12:05 UTC (permalink / raw)
  To: Jon Mason; +Cc: Patches and discussions about the oe-core layer

Hi,

Please see my comments in-line.

On 16/09/2020 16:38, Jon Mason wrote:
>> [1]
>> https://yoctoproject.blogspot.com/2020/09/compiler-tunes-benchmarks-with-yocto.html
>>
>> Should we expect to see differences?
> 
> There are more things at play here than simply performance.  

Hmm, interesting. This did not cross my mind at all ;)

> But
> speaking of performance, there might not be much of a benefit between
> a generic ARMv8.0 and an ARMv8 based core (like A53), but I do expect
> to see a performance bump for an ARMv8.2 based core (like A76).  The
> delta between the former is much smaller than the latter.

I guess you mean between ARMv8.0/A53 and ARMv8.2/A76. I'll need some 
toys to run tests/benchmarks there;)

> 
> The differences are not only performance.  Tuning for A76 versus a
> more generic armv8a allows for security features like
> branch-protection to be enabled (as it isn't supported in older
> versions).  You get these kinds of things "by default" when tuning for
> the specific model.

OK - This is a very good point I completely ignored so far. I guess this 
will be tricky to test.

> 
> Thanks,
> Jon
> 
>> If so can you suggest benchmarks which show those differences?
>>
>>
>> Regards,
>>
>> Robert

Thanks - I updated my Blog post.

Regards,

Robert

end of thread

Thread overview: 19+ messages
2020-09-14 15:13 [meta-oe][PATCH 0/5] ARMv8 Tune reorg Jon Mason
2020-09-14 15:13 ` [meta-oe][PATCH 1/5] arch-armv8-2a.inc: Add Cortex-A55 tunings Jon Mason
2020-09-14 15:13 ` [meta-oe][PATCH 2/5] arch-armv8a.inc: Add existing tunings Jon Mason
2020-09-14 15:13 ` [meta-oe][PATCH 3/5] qemuarm64: change tuning Jon Mason
2020-09-14 15:13 ` [meta-oe][PATCH 4/5] arch-armv8a.inc: Add tunes for supported ARMv8a cores Jon Mason
2020-09-14 15:13 ` [meta-oe][PATCH 5/5] arch-armv8-2a.inc: Add tunes for supported ARMv8.2a cores Jon Mason
2020-09-14 15:32 ` [OE-core] [meta-oe][PATCH 0/5] ARMv8 Tune reorg Martin Jansa
2020-09-14 22:54   ` Jon Mason
2020-09-15 14:38     ` Martin Jansa
2020-09-15 16:59       ` Mark Hatle
2020-09-15  7:09 ` Robert Berger
2020-09-16 13:38   ` Jon Mason
2020-09-17 12:05     ` Robert Berger
2020-09-16 13:26 ` Richard Purdie
2020-09-16 13:45   ` Jon Mason
2020-09-16 13:49     ` Richard Purdie
2020-09-16 14:25       ` Jon Mason
2020-09-16 14:30         ` Martin Jansa
2020-09-16 14:46         ` Richard Purdie
