All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] target/arm: Implement an IMPDEF pauth algorithm
@ 2020-08-12  6:53 Richard Henderson
  2020-08-12  6:53 ` [PATCH 1/2] target/arm: Add cpu property to control pauth Richard Henderson
  2020-08-12  6:53 ` [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm Richard Henderson
  0 siblings, 2 replies; 13+ messages in thread
From: Richard Henderson @ 2020-08-12  6:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.rutland, peter.maydell, alex.bennee

With recent linux kernels, which are built with pauth
enabled, boot times have been significantly impacted.
It turns out the architected pac algorithm is expensive
to implement without hardware accel.

I tried replacing this with AES128, which most hosts
have in hardware.  I did manage to make this perform
fairly well, but it *really* depends on the quality of
the crypto library shipped by the OS:

(0) For reference, pauth_computepac_architected consumes
    about 80% of the total runtime for a debug-enabled
    kernel build.  That's a 400% slowdown.

(1) libgcrypt performed well, with 15% of the total runtime,
    or about 18% slowdown.

(2a) libnettle, as shipped by ubuntu 18, does not have
     the x86_64 aes instruction set enabled, and so is not
     using the available hw accel.  That took about 40% of
     the total runtime or 170% slowdown.

(2b) Rebuilding libnettle locally, with --enable-fat, and
     using LD_LIBRARY_PATH to replace the system version,
     worked fairly well, with about 20% of the total runtime
     or 25% slowdown.

(2c) libnettle doesn't have support for armv8, ppc or s390.
     Those hosts should *really* be using libgcrypt.

But, silly me, I had used --target-list=aarch64-softmmu for
this testing, in order to speed up the builds.  When I went
back to building aarch64-linux-user, I was reminded that we
don't link linux-user binaries against the crypto libraries.
And those crypto libraries are usually only distributed as
shared libraries, which would cause problems for --static.

I very briefly looked into doing my own aes implementation,
with host cpu detection.  But aside from the ugliness of that,
it begs the question of what to do if there's no host accel.

I settled on a fast non-cryptographic hash with about 10% overhead.

I added a -cpu max,pauth={off,impdef,arch} property to choose
between the different algorithms.  The default remains arch,
since that's what 5.0 and 5.1 shipped.  We can discuss whether
it makes sense to change the default for "max".


r~


Richard Henderson (2):
  target/arm: Add cpu property to control pauth
  target/arm: Implement an IMPDEF pauth algorithm

 target/arm/cpu64.c        | 64 +++++++++++++++++++++++++++++++++
 target/arm/pauth_helper.c | 75 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 134 insertions(+), 5 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-12  6:53 [PATCH 0/2] target/arm: Implement an IMPDEF pauth algorithm Richard Henderson
@ 2020-08-12  6:53 ` Richard Henderson
  2020-08-12 11:00   ` Andrew Jones
  2020-08-12  6:53 ` [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm Richard Henderson
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2020-08-12  6:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.rutland, peter.maydell, alex.bennee

The crypto overhead of emulating pauth can be significant for
some workloads.  Add an enumeration property that allows the
feature to be turned off, on with the architected algorithm,
or on with an implementation defined algorithm.

The architected algorithm is quite expensive to emulate;
using another algorithm may allow hardware acceleration.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu64.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index dd696183df..3181d0e2f8 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -572,6 +572,69 @@ void aarch64_add_sve_properties(Object *obj)
     }
 }
 
+static const char * const pauth_names[] = {
+    "off", "impdef", "arch"
+};
+
+static const QEnumLookup pauth_lookup = {
+    .array = pauth_names,
+    .size = ARRAY_SIZE(pauth_names)
+};
+
+static int cpu_arm_get_pauth(Object *obj, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    int value;
+
+    /* We will always set GPA+APA and GPI+API to the same value. */
+    if (FIELD_EX64(cpu->isar.id_aa64isar1, ID_AA64ISAR1, APA)) {
+        value = 2;
+    } else if (FIELD_EX64(cpu->isar.id_aa64isar1, ID_AA64ISAR1, API)) {
+        value = 1;
+    } else {
+        value = 0;
+    }
+    return value;
+}
+
+static void cpu_arm_set_pauth(Object *obj, int value, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+    uint64_t t;
+
+    /* TODO: Handle HaveEnhancedPAC, HaveEnhancedPAC2, HaveFPAC. */
+    t = cpu->isar.id_aa64isar1;
+    switch (value) {
+    case 0:
+        t = FIELD_DP64(t, ID_AA64ISAR1, APA, 0);
+        t = FIELD_DP64(t, ID_AA64ISAR1, API, 0);
+        t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 0);
+        t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 0);
+        break;
+    case 1:
+        t = FIELD_DP64(t, ID_AA64ISAR1, APA, 0);
+        t = FIELD_DP64(t, ID_AA64ISAR1, API, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 0);
+        t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 1);
+        break;
+    case 2:
+        t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR1, API, 0);
+        t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 0);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    cpu->isar.id_aa64isar1 = t;
+}
+
+static void aarch64_add_pauth_properties(Object *obj)
+{
+    object_property_add_enum(obj, "pauth", "ARMPauthKind", &pauth_lookup,
+                             cpu_arm_get_pauth, cpu_arm_set_pauth);
+}
+
 /* -cpu max: if KVM is enabled, like -cpu host (best possible with this host);
  * otherwise, a CPU with as many features enabled as our emulation supports.
  * The version of '-cpu max' for qemu-system-arm is defined in cpu.c;
@@ -720,6 +783,7 @@ static void aarch64_max_initfn(Object *obj)
 #endif
     }
 
+    aarch64_add_pauth_properties(obj);
     aarch64_add_sve_properties(obj);
     object_property_add(obj, "sve-max-vq", "uint32", cpu_max_get_sve_max_vq,
                         cpu_max_set_sve_max_vq, NULL, NULL);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm
  2020-08-12  6:53 [PATCH 0/2] target/arm: Implement an IMPDEF pauth algorithm Richard Henderson
  2020-08-12  6:53 ` [PATCH 1/2] target/arm: Add cpu property to control pauth Richard Henderson
@ 2020-08-12  6:53 ` Richard Henderson
  2020-08-12  9:49   ` Alex Bennée
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2020-08-12  6:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.rutland, peter.maydell, alex.bennee

Without hardware acceleration, a cryptographically strong
algorithm is too expensive for pauth_computepac.

Even with hardware accel, we are not currently expecting
to link the linux-user binaries to any crypto libraries,
and doing so would generally make the --static build fail.

So choose XXH64 as a reasonably quick and decent hash.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/pauth_helper.c | 75 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 70 insertions(+), 5 deletions(-)

diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index 6dbab03768..f1a4389465 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -207,8 +207,8 @@ static uint64_t tweak_inv_shuffle(uint64_t i)
     return o;
 }
 
-static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
-                                 ARMPACKey key)
+static uint64_t __attribute__((noinline))
+pauth_computepac_architected(uint64_t data, uint64_t modifier, ARMPACKey key)
 {
     static const uint64_t RC[5] = {
         0x0000000000000000ull,
@@ -272,6 +272,71 @@ static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
     return workingval;
 }
 
+/*
+ * The XXH64 algorithm from
+ * https://github.com/Cyan4973/xxHash/blob/v0.8.0/xxhash.h
+ */
+#define PRIME64_1   0x9E3779B185EBCA87ULL
+#define PRIME64_2   0xC2B2AE3D27D4EB4FULL
+#define PRIME64_3   0x165667B19E3779F9ULL
+#define PRIME64_4   0x85EBCA77C2B2AE63ULL
+#define PRIME64_5   0x27D4EB2F165667C5ULL
+
+static inline uint64_t XXH64_round(uint64_t acc, uint64_t input)
+{
+    return rol64(acc + input * PRIME64_2, 31) * PRIME64_1;
+}
+
+static inline uint64_t XXH64_mergeround(uint64_t acc, uint64_t val)
+{
+    return (acc ^ XXH64_round(0, val)) * PRIME64_1 + PRIME64_4;
+}
+
+static inline uint64_t XXH64_avalanche(uint64_t h64)
+{
+    h64 ^= h64 >> 33;
+    h64 *= PRIME64_2;
+    h64 ^= h64 >> 29;
+    h64 *= PRIME64_3;
+    /* h64 ^= h64 >> 32; -- does not affect high 64 for pauth */
+    return h64;
+}
+
+static uint64_t __attribute__((noinline))
+pauth_computepac_impdef(uint64_t data, uint64_t modifier, ARMPACKey key)
+{
+    uint64_t v1 = 1 + PRIME64_1 + PRIME64_2;
+    uint64_t v2 = 1 + PRIME64_2;
+    uint64_t v3 = 1 + 0;
+    uint64_t v4 = 1 - PRIME64_1;
+    uint64_t h64;
+
+    v1 = XXH64_round(v1, data);
+    v2 = XXH64_round(v2, modifier);
+    v3 = XXH64_round(v3, key.lo);
+    v4 = XXH64_round(v4, key.hi);
+
+    h64 = rol64(v1, 1) + rol64(v2, 7) + rol64(v3, 12) + rol64(v4, 18);
+    h64 = XXH64_mergeround(h64, v1);
+    h64 = XXH64_mergeround(h64, v2);
+    h64 = XXH64_mergeround(h64, v3);
+    h64 = XXH64_mergeround(h64, v4);
+
+    return XXH64_avalanche(h64);
+}
+
+static uint64_t pauth_computepac(CPUARMState *env, uint64_t data,
+                                 uint64_t modifier, ARMPACKey key)
+{
+    ARMCPU *cpu = env_archcpu(env);
+
+    if (FIELD_EX64(cpu->isar.id_aa64isar1, ID_AA64ISAR1, APA)) {
+        return pauth_computepac_architected(data, modifier, key);
+    } else {
+        return pauth_computepac_impdef(data, modifier, key);
+    }
+}
+
 static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
                              ARMPACKey *key, bool data)
 {
@@ -292,7 +357,7 @@ static uint64_t pauth_addpac(CPUARMState *env, uint64_t ptr, uint64_t modifier,
     bot_bit = 64 - param.tsz;
     ext_ptr = deposit64(ptr, bot_bit, top_bit - bot_bit, ext);
 
-    pac = pauth_computepac(ext_ptr, modifier, *key);
+    pac = pauth_computepac(env, ext_ptr, modifier, *key);
 
     /*
      * Check if the ptr has good extension bits and corrupt the
@@ -341,7 +406,7 @@ static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
     uint64_t pac, orig_ptr, test;
 
     orig_ptr = pauth_original_ptr(ptr, param);
-    pac = pauth_computepac(orig_ptr, modifier, *key);
+    pac = pauth_computepac(env, orig_ptr, modifier, *key);
     bot_bit = 64 - param.tsz;
     top_bit = 64 - 8 * param.tbi;
 
@@ -442,7 +507,7 @@ uint64_t HELPER(pacga)(CPUARMState *env, uint64_t x, uint64_t y)
     uint64_t pac;
 
     pauth_check_trap(env, arm_current_el(env), GETPC());
-    pac = pauth_computepac(x, y, env->keys.apga);
+    pac = pauth_computepac(env, x, y, env->keys.apga);
 
     return pac & 0xffffffff00000000ull;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm
  2020-08-12  6:53 ` [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm Richard Henderson
@ 2020-08-12  9:49   ` Alex Bennée
  2020-08-12 15:13     ` Richard Henderson
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Bennée @ 2020-08-12  9:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.rutland, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Without hardware acceleration, a cryptographically strong
> algorithm is too expensive for pauth_computepac.
>
> Even with hardware accel, we are not currently expecting
> to link the linux-user binaries to any crypto libraries,
> and doing so would generally make the --static build fail.
>
> So choose XXH64 as a reasonably quick and decent hash.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/pauth_helper.c | 75 ++++++++++++++++++++++++++++++++++++---
>  1 file changed, 70 insertions(+), 5 deletions(-)
>
> diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
> index 6dbab03768..f1a4389465 100644
> --- a/target/arm/pauth_helper.c
> +++ b/target/arm/pauth_helper.c
> @@ -207,8 +207,8 @@ static uint64_t tweak_inv_shuffle(uint64_t i)
>      return o;
>  }
>  
> -static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
> -                                 ARMPACKey key)
> +static uint64_t __attribute__((noinline))
> +pauth_computepac_architected(uint64_t data, uint64_t modifier, ARMPACKey key)
>  {
>      static const uint64_t RC[5] = {
>          0x0000000000000000ull,
> @@ -272,6 +272,71 @@ static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
>      return workingval;
>  }
>  
> +/*
> + * The XXH64 algorithm from
> + * https://github.com/Cyan4973/xxHash/blob/v0.8.0/xxhash.h
> + */
> +#define PRIME64_1   0x9E3779B185EBCA87ULL
> +#define PRIME64_2   0xC2B2AE3D27D4EB4FULL
> +#define PRIME64_3   0x165667B19E3779F9ULL
> +#define PRIME64_4   0x85EBCA77C2B2AE63ULL
> +#define PRIME64_5   0x27D4EB2F165667C5ULL
> +
> +static inline uint64_t XXH64_round(uint64_t acc, uint64_t input)
> +{
> +    return rol64(acc + input * PRIME64_2, 31) * PRIME64_1;
> +}
> +
> +static inline uint64_t XXH64_mergeround(uint64_t acc, uint64_t val)
> +{
> +    return (acc ^ XXH64_round(0, val)) * PRIME64_1 + PRIME64_4;
> +}
> +
> +static inline uint64_t XXH64_avalanche(uint64_t h64)
> +{
> +    h64 ^= h64 >> 33;
> +    h64 *= PRIME64_2;
> +    h64 ^= h64 >> 29;
> +    h64 *= PRIME64_3;
> +    /* h64 ^= h64 >> 32; -- does not affect high 64 for pauth */
> +    return h64;
> +}
> +
> +static uint64_t __attribute__((noinline))
> +pauth_computepac_impdef(uint64_t data, uint64_t modifier, ARMPACKey key)
> +{
> +    uint64_t v1 = 1 + PRIME64_1 + PRIME64_2;
> +    uint64_t v2 = 1 + PRIME64_2;
> +    uint64_t v3 = 1 + 0;
> +    uint64_t v4 = 1 - PRIME64_1;
> +    uint64_t h64;
> +
> +    v1 = XXH64_round(v1, data);
> +    v2 = XXH64_round(v2, modifier);
> +    v3 = XXH64_round(v3, key.lo);
> +    v4 = XXH64_round(v4, key.hi);
> +
> +    h64 = rol64(v1, 1) + rol64(v2, 7) + rol64(v3, 12) + rol64(v4, 18);
> +    h64 = XXH64_mergeround(h64, v1);
> +    h64 = XXH64_mergeround(h64, v2);
> +    h64 = XXH64_mergeround(h64, v3);
> +    h64 = XXH64_mergeround(h64, v4);
> +
> +    return XXH64_avalanche(h64);
> +}

You might find it easier to #include "qemu/xxhash.h" which we use for tb
hashing amongst other things.  

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-12  6:53 ` [PATCH 1/2] target/arm: Add cpu property to control pauth Richard Henderson
@ 2020-08-12 11:00   ` Andrew Jones
  2020-08-12 15:10     ` Richard Henderson
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Jones @ 2020-08-12 11:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.rutland, peter.maydell, alex.bennee, qemu-devel

On Tue, Aug 11, 2020 at 11:53:38PM -0700, Richard Henderson wrote:
> The crypto overhead of emulating pauth can be significant for
> some workloads.  Add an enumeration property that allows the
> feature to be turned off, on with the architected algorithm,
> or on with an implementation defined algorithm.
> 
> The architected algorithm is quite expensive to emulate;
> using another algorithm may allow hardware acceleration.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpu64.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
> 
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index dd696183df..3181d0e2f8 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -572,6 +572,69 @@ void aarch64_add_sve_properties(Object *obj)
>      }
>  }
>  
> +static const char * const pauth_names[] = {
> +    "off", "impdef", "arch"
> +};

Hi Richard,

Please add three boolean properties, rather than one enum:

pauth:            enable support of the pauth feature
pauth-fast:       enable QEMU's fast non-cryptographic hash for pauth
                  (pauth must be enabled)
pauth-arch:       enable the architected algorithm for pauth
                  (pauth must be enabled)

These booleans can then be added to the cpu feature probing list used by
qmp_query_cpu_model_expansion()

> +
> +static const QEnumLookup pauth_lookup = {
> +    .array = pauth_names,
> +    .size = ARRAY_SIZE(pauth_names)
> +};
> +
> +static int cpu_arm_get_pauth(Object *obj, Error **errp)
> +{
> +    ARMCPU *cpu = ARM_CPU(obj);
> +    int value;
> +
> +    /* We will always set GPA+APA and GPI+API to the same value. */
> +    if (FIELD_EX64(cpu->isar.id_aa64isar1, ID_AA64ISAR1, APA)) {
> +        value = 2;
> +    } else if (FIELD_EX64(cpu->isar.id_aa64isar1, ID_AA64ISAR1, API)) {
> +        value = 1;
> +    } else {
> +        value = 0;
> +    }
> +    return value;
> +}
> +
> +static void cpu_arm_set_pauth(Object *obj, int value, Error **errp)
> +{
> +    ARMCPU *cpu = ARM_CPU(obj);
> +    uint64_t t;
> +
> +    /* TODO: Handle HaveEnhancedPAC, HaveEnhancedPAC2, HaveFPAC. */
> +    t = cpu->isar.id_aa64isar1;
> +    switch (value) {
> +    case 0:
> +        t = FIELD_DP64(t, ID_AA64ISAR1, APA, 0);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, API, 0);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 0);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 0);
> +        break;
> +    case 1:
> +        t = FIELD_DP64(t, ID_AA64ISAR1, APA, 0);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, API, 1);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 0);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 1);
> +        break;
> +    case 2:
> +        t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, API, 0);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 1);
> +        t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 0);
> +        break;
> +    default:
> +        g_assert_not_reached();
> +    }
> +    cpu->isar.id_aa64isar1 = t;
> +}
> +
> +static void aarch64_add_pauth_properties(Object *obj)
> +{
> +    object_property_add_enum(obj, "pauth", "ARMPauthKind", &pauth_lookup,
> +                             cpu_arm_get_pauth, cpu_arm_set_pauth);
> +}
> +
>  /* -cpu max: if KVM is enabled, like -cpu host (best possible with this host);
>   * otherwise, a CPU with as many features enabled as our emulation supports.
>   * The version of '-cpu max' for qemu-system-arm is defined in cpu.c;
> @@ -720,6 +783,7 @@ static void aarch64_max_initfn(Object *obj)
>  #endif
>      }
>  
> +    aarch64_add_pauth_properties(obj);

We don't yet support enabling pauth on KVM accelerated guests, so this
call should be in the !kvm_enabled() part of this function.

>      aarch64_add_sve_properties(obj);
>      object_property_add(obj, "sve-max-vq", "uint32", cpu_max_get_sve_max_vq,
>                          cpu_max_set_sve_max_vq, NULL, NULL);
> -- 
> 2.25.1
> 
>

Thanks,
drew



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-12 11:00   ` Andrew Jones
@ 2020-08-12 15:10     ` Richard Henderson
  2020-08-12 16:31       ` Andrew Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2020-08-12 15:10 UTC (permalink / raw)
  To: Andrew Jones; +Cc: mark.rutland, peter.maydell, alex.bennee, qemu-devel

On 8/12/20 4:00 AM, Andrew Jones wrote:
> On Tue, Aug 11, 2020 at 11:53:38PM -0700, Richard Henderson wrote:
>> The crypto overhead of emulating pauth can be significant for
>> some workloads.  Add an enumeration property that allows the
>> feature to be turned off, on with the architected algorithm,
>> or on with an implementation defined algorithm.
>>
>> The architected algorithm is quite expensive to emulate;
>> using another algorithm may allow hardware acceleration.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  target/arm/cpu64.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 64 insertions(+)
>>
>> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
>> index dd696183df..3181d0e2f8 100644
>> --- a/target/arm/cpu64.c
>> +++ b/target/arm/cpu64.c
>> @@ -572,6 +572,69 @@ void aarch64_add_sve_properties(Object *obj)
>>      }
>>  }
>>  
>> +static const char * const pauth_names[] = {
>> +    "off", "impdef", "arch"
>> +};
> 
> Hi Richard,
> 
> Please add three boolean properties, rather than one enum:
> 
> pauth:            enable support of the pauth feature
> pauth-fast:       enable QEMU's fast non-cryptographic hash for pauth
>                   (pauth must be enabled)
> pauth-arch:       enable the architected algorithm for pauth
>                   (pauth must be enabled)
> 
> These booleans can then be added to the cpu feature probing list used by
> qmp_query_cpu_model_expansion()

Why are 3 booleans better than one enum?

I'd forgotten about qmp_query_cpu_model_expansion(); can it not take anything
but booleans?


r~


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm
  2020-08-12  9:49   ` Alex Bennée
@ 2020-08-12 15:13     ` Richard Henderson
  2020-08-12 17:13       ` Alex Bennée
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2020-08-12 15:13 UTC (permalink / raw)
  To: Alex Bennée; +Cc: mark.rutland, peter.maydell, qemu-devel

On 8/12/20 2:49 AM, Alex Bennée wrote:
> 
> Richard Henderson <richard.henderson@linaro.org> writes:
> 
>> Without hardware acceleration, a cryptographically strong
>> algorithm is too expensive for pauth_computepac.
>>
>> Even with hardware accel, we are not currently expecting
>> to link the linux-user binaries to any crypto libraries,
>> and doing so would generally make the --static build fail.
>>
>> So choose XXH64 as a reasonably quick and decent hash.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  target/arm/pauth_helper.c | 75 ++++++++++++++++++++++++++++++++++++---
>>  1 file changed, 70 insertions(+), 5 deletions(-)
>>
>> diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
>> index 6dbab03768..f1a4389465 100644
>> --- a/target/arm/pauth_helper.c
>> +++ b/target/arm/pauth_helper.c
>> @@ -207,8 +207,8 @@ static uint64_t tweak_inv_shuffle(uint64_t i)
>>      return o;
>>  }
>>  
>> -static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
>> -                                 ARMPACKey key)
>> +static uint64_t __attribute__((noinline))
>> +pauth_computepac_architected(uint64_t data, uint64_t modifier, ARMPACKey key)
>>  {
>>      static const uint64_t RC[5] = {
>>          0x0000000000000000ull,
>> @@ -272,6 +272,71 @@ static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
>>      return workingval;
>>  }
>>  
>> +/*
>> + * The XXH64 algorithm from
>> + * https://github.com/Cyan4973/xxHash/blob/v0.8.0/xxhash.h
>> + */
>> +#define PRIME64_1   0x9E3779B185EBCA87ULL
>> +#define PRIME64_2   0xC2B2AE3D27D4EB4FULL
>> +#define PRIME64_3   0x165667B19E3779F9ULL
>> +#define PRIME64_4   0x85EBCA77C2B2AE63ULL
>> +#define PRIME64_5   0x27D4EB2F165667C5ULL
>> +
>> +static inline uint64_t XXH64_round(uint64_t acc, uint64_t input)
>> +{
>> +    return rol64(acc + input * PRIME64_2, 31) * PRIME64_1;
>> +}
>> +
>> +static inline uint64_t XXH64_mergeround(uint64_t acc, uint64_t val)
>> +{
>> +    return (acc ^ XXH64_round(0, val)) * PRIME64_1 + PRIME64_4;
>> +}
>> +
>> +static inline uint64_t XXH64_avalanche(uint64_t h64)
>> +{
>> +    h64 ^= h64 >> 33;
>> +    h64 *= PRIME64_2;
>> +    h64 ^= h64 >> 29;
>> +    h64 *= PRIME64_3;
>> +    /* h64 ^= h64 >> 32; -- does not affect high 64 for pauth */
>> +    return h64;
>> +}
>> +
>> +static uint64_t __attribute__((noinline))
>> +pauth_computepac_impdef(uint64_t data, uint64_t modifier, ARMPACKey key)
>> +{
>> +    uint64_t v1 = 1 + PRIME64_1 + PRIME64_2;
>> +    uint64_t v2 = 1 + PRIME64_2;
>> +    uint64_t v3 = 1 + 0;
>> +    uint64_t v4 = 1 - PRIME64_1;
>> +    uint64_t h64;
>> +
>> +    v1 = XXH64_round(v1, data);
>> +    v2 = XXH64_round(v2, modifier);
>> +    v3 = XXH64_round(v3, key.lo);
>> +    v4 = XXH64_round(v4, key.hi);
>> +
>> +    h64 = rol64(v1, 1) + rol64(v2, 7) + rol64(v3, 12) + rol64(v4, 18);
>> +    h64 = XXH64_mergeround(h64, v1);
>> +    h64 = XXH64_mergeround(h64, v2);
>> +    h64 = XXH64_mergeround(h64, v3);
>> +    h64 = XXH64_mergeround(h64, v4);
>> +
>> +    return XXH64_avalanche(h64);
>> +}
> 
> You might find it easier to #include "qemu/xxhash.h" which we use for tb
> hashing amongst other things.  

First, that's the 32-bit version, XXH32.
Second, we define xxhash7 there; we would need xxhash8 here.


r~


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-12 15:10     ` Richard Henderson
@ 2020-08-12 16:31       ` Andrew Jones
  2020-08-13  6:03         ` Andrew Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Jones @ 2020-08-12 16:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.rutland, peter.maydell, alex.bennee, qemu-devel

On Wed, Aug 12, 2020 at 08:10:47AM -0700, Richard Henderson wrote:
> On 8/12/20 4:00 AM, Andrew Jones wrote:
> > On Tue, Aug 11, 2020 at 11:53:38PM -0700, Richard Henderson wrote:
> >> The crypto overhead of emulating pauth can be significant for
> >> some workloads.  Add an enumeration property that allows the
> >> feature to be turned off, on with the architected algorithm,
> >> or on with an implementation defined algorithm.
> >>
> >> The architected algorithm is quite expensive to emulate;
> >> using another algorithm may allow hardware acceleration.
> >>
> >> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> >> ---
> >>  target/arm/cpu64.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 64 insertions(+)
> >>
> >> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> >> index dd696183df..3181d0e2f8 100644
> >> --- a/target/arm/cpu64.c
> >> +++ b/target/arm/cpu64.c
> >> @@ -572,6 +572,69 @@ void aarch64_add_sve_properties(Object *obj)
> >>      }
> >>  }
> >>  
> >> +static const char * const pauth_names[] = {
> >> +    "off", "impdef", "arch"
> >> +};
> > 
> > Hi Richard,
> > 
> > Please add three boolean properties, rather than one enum:
> > 
> > pauth:            enable support of the pauth feature
> > pauth-fast:       enable QEMU's fast non-cryptographic hash for pauth
> >                   (pauth must be enabled)
> > pauth-arch:       enable the architected algorithm for pauth
> >                   (pauth must be enabled)
> > 
> > These booleans can then be added to the cpu feature probing list used by
> > qmp_query_cpu_model_expansion()
> 
> Why are 3 booleans better than one enum?
> 
> I'd forgotten about qmp_query_cpu_model_expansion(); can it not take anything
> but booleans?

Right. The probing works by getting a list of possible CPU features, which
are all boolean properties. That way the prober can try enabling/disabling
them without having to know about each property's set of valid values. We
could implement each as an enum and a second level of probing, but that
would complicate the probing, and I'm not sure enums gain us much over
multiple properties.

In this case, since pauth-fast and pauth-arch are mutually exclusive and
we want a pauth=on/off too, then we'll need a finalize function like SVE
has in order to support the following selections:

 # Default (pauth-arch), explicitly selected or not
 -cpu max[,pauth=on]
 -cpu max[,pauth=on][,pauth-fast=off],pauth-arch=on

 # Select pauth-fast
 -cpu max[,pauth=on][,pauth-arch=off],pauth-fast=on

 # Disable
 -cpu max,pauth=off
 -cpu max[,pauth=off],pauth-arch=off,pauth-fast=off

 # Mutual exclusion errors
 -cpu max,pauth=off,pauth-{arch,fast}=on
 -cpu max,pauth=on,pauth-arch=off,pauth-fast=off
 -cpu max[,pauth=on],pauth-arch=on,pauth-fast=on

 # Errors because we don't want to guess what the user means
 -cpu max[,pauth=on],pauth-arch=off
 -cpu max[,pauth=on],pauth-fast=off
 

Thanks,
drew



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm
  2020-08-12 15:13     ` Richard Henderson
@ 2020-08-12 17:13       ` Alex Bennée
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Bennée @ 2020-08-12 17:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.rutland, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> On 8/12/20 2:49 AM, Alex Bennée wrote:
>> 
>> Richard Henderson <richard.henderson@linaro.org> writes:
>> 
>>> Without hardware acceleration, a cryptographically strong
>>> algorithm is too expensive for pauth_computepac.
>>>
>>> Even with hardware accel, we are not currently expecting
>>> to link the linux-user binaries to any crypto libraries,
>>> and doing so would generally make the --static build fail.
>>>
>>> So choose XXH64 as a reasonably quick and decent hash.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>>  target/arm/pauth_helper.c | 75 ++++++++++++++++++++++++++++++++++++---
>>>  1 file changed, 70 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
>>> index 6dbab03768..f1a4389465 100644
>>> --- a/target/arm/pauth_helper.c
>>> +++ b/target/arm/pauth_helper.c
>>> @@ -207,8 +207,8 @@ static uint64_t tweak_inv_shuffle(uint64_t i)
>>>      return o;
>>>  }
>>>  
>>> -static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
>>> -                                 ARMPACKey key)
>>> +static uint64_t __attribute__((noinline))
>>> +pauth_computepac_architected(uint64_t data, uint64_t modifier, ARMPACKey key)
>>>  {
>>>      static const uint64_t RC[5] = {
>>>          0x0000000000000000ull,
>>> @@ -272,6 +272,71 @@ static uint64_t pauth_computepac(uint64_t data, uint64_t modifier,
>>>      return workingval;
>>>  }
>>>  
>>> +/*
>>> + * The XXH64 algorithm from
>>> + * https://github.com/Cyan4973/xxHash/blob/v0.8.0/xxhash.h
>>> + */
>>> +#define PRIME64_1   0x9E3779B185EBCA87ULL
>>> +#define PRIME64_2   0xC2B2AE3D27D4EB4FULL
>>> +#define PRIME64_3   0x165667B19E3779F9ULL
>>> +#define PRIME64_4   0x85EBCA77C2B2AE63ULL
>>> +#define PRIME64_5   0x27D4EB2F165667C5ULL
>>> +
>>> +static inline uint64_t XXH64_round(uint64_t acc, uint64_t input)
>>> +{
>>> +    return rol64(acc + input * PRIME64_2, 31) * PRIME64_1;
>>> +}
>>> +
>>> +static inline uint64_t XXH64_mergeround(uint64_t acc, uint64_t val)
>>> +{
>>> +    return (acc ^ XXH64_round(0, val)) * PRIME64_1 + PRIME64_4;
>>> +}
>>> +
>>> +static inline uint64_t XXH64_avalanche(uint64_t h64)
>>> +{
>>> +    h64 ^= h64 >> 33;
>>> +    h64 *= PRIME64_2;
>>> +    h64 ^= h64 >> 29;
>>> +    h64 *= PRIME64_3;
>>> +    /* h64 ^= h64 >> 32; -- does not affect high 64 for pauth */
>>> +    return h64;
>>> +}
>>> +
>>> +static uint64_t __attribute__((noinline))
>>> +pauth_computepac_impdef(uint64_t data, uint64_t modifier, ARMPACKey key)
>>> +{
>>> +    uint64_t v1 = 1 + PRIME64_1 + PRIME64_2;
>>> +    uint64_t v2 = 1 + PRIME64_2;
>>> +    uint64_t v3 = 1 + 0;
>>> +    uint64_t v4 = 1 - PRIME64_1;
>>> +    uint64_t h64;
>>> +
>>> +    v1 = XXH64_round(v1, data);
>>> +    v2 = XXH64_round(v2, modifier);
>>> +    v3 = XXH64_round(v3, key.lo);
>>> +    v4 = XXH64_round(v4, key.hi);
>>> +
>>> +    h64 = rol64(v1, 1) + rol64(v2, 7) + rol64(v3, 12) + rol64(v4, 18);
>>> +    h64 = XXH64_mergeround(h64, v1);
>>> +    h64 = XXH64_mergeround(h64, v2);
>>> +    h64 = XXH64_mergeround(h64, v3);
>>> +    h64 = XXH64_mergeround(h64, v4);
>>> +
>>> +    return XXH64_avalanche(h64);
>>> +}
>> 
>> You might find it easier to #include "qemu/xxhash.h" which we use for tb
>> hashing amongst other things.  
>
> First, that's the 32-bit version, XXH32.

Ahh I missed that detail.

> Second, we define xxhash7 there; we would need xxhash8 here.

We could at least put the code in the header with the others.

>
>
> r~


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-12 16:31       ` Andrew Jones
@ 2020-08-13  6:03         ` Andrew Jones
  2020-08-13  9:05           ` Mark Rutland
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Jones @ 2020-08-13  6:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.rutland, peter.maydell, alex.bennee, qemu-devel

On Wed, Aug 12, 2020 at 06:31:11PM +0200, Andrew Jones wrote:
> On Wed, Aug 12, 2020 at 08:10:47AM -0700, Richard Henderson wrote:
> > On 8/12/20 4:00 AM, Andrew Jones wrote:
> > > On Tue, Aug 11, 2020 at 11:53:38PM -0700, Richard Henderson wrote:
> > >> The crypto overhead of emulating pauth can be significant for
> > >> some workloads.  Add an enumeration property that allows the
> > >> feature to be turned off, on with the architected algorithm,
> > >> or on with an implementation defined algorithm.
> > >>
> > >> The architected algorithm is quite expensive to emulate;
> > >> using another algorithm may allow hardware acceleration.
> > >>
> > >> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> > >> ---
> > >>  target/arm/cpu64.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
> > >>  1 file changed, 64 insertions(+)
> > >>
> > >> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> > >> index dd696183df..3181d0e2f8 100644
> > >> --- a/target/arm/cpu64.c
> > >> +++ b/target/arm/cpu64.c
> > >> @@ -572,6 +572,69 @@ void aarch64_add_sve_properties(Object *obj)
> > >>      }
> > >>  }
> > >>  
> > >> +static const char * const pauth_names[] = {
> > >> +    "off", "impdef", "arch"
> > >> +};
> > > 
> > > Hi Richard,
> > > 
> > > Please add three boolean properties, rather than one enum:
> > > 
> > > pauth:            enable support of the pauth feature
> > > pauth-fast:       enable QEMU's fast non-cryptographic hash for pauth
> > >                   (pauth must be enabled)
> > > pauth-arch:       enable the architected algorithm for pauth
> > >                   (pauth must be enabled)
> > > 
> > > These booleans can then be added to the cpu feature probing list used by
> > > qmp_query_cpu_model_expansion()
> > 
> > Why are 3 booleans better than one enum?
> > 
> > I'd forgotten about qmp_query_cpu_model_expansion(); can it not take anything
> > but booleans?
> 
> Right. The probing works by getting a list of possible CPU features, which
> are all boolean properties. That way the prober can try enabling/disabling
> them without having to know about each property's set of valid values. We
> could implement each as an enum and a second level of probing, but that
> would complicate the probing, and I'm not sure enums gain us much over
> multiple properties.
> 
> In this case, since pauth-fast and pauth-arch are mutually exclusive and
> we want a pauth=on/off too, then we'll need a finalize function like SVE
> has in order to support the following selections:
> 
>  # Default (pauth-arch), explicitly selected or not
>  -cpu max[,pauth=on]
>  -cpu max[,pauth=on][,pauth-fast=off],pauth-arch=on
> 
>  # Select pauth-fast
>  -cpu max[,pauth=on][,pauth-arch=off],pauth-fast=on
> 
>  # Disable
>  -cpu max,pauth=off
>  -cpu max[,pauth=off],pauth-arch=off,pauth-fast=off
> 
>  # Mutual exclusion errors
>  -cpu max,pauth=off,pauth-{arch,fast}=on
>  -cpu max,pauth=on,pauth-arch=off,pauth-fast=off
>  -cpu max[,pauth=on],pauth-arch=on,pauth-fast=on
> 
>  # Errors because we don't want to guess what the user means
>  -cpu max[,pauth=on],pauth-arch=off
>  -cpu max[,pauth=on],pauth-fast=off

Thinking about this some more, maybe we don't need pauth-arch.
If we don't, then it simplifies nicely to

 # Default (enabled with architected algorithm)
 -cpu max[,pauth=on][,pauth-fast=off]

 # Select pauth-fast
 -cpu max[,pauth=on],pauth-fast=on

 # Disable
 -cpu max[,pauth-fast=off],pauth=off

 # Mutual exclusion errors
 -cpu max,pauth=off,pauth-fast=on

Thanks,
drew



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-13  6:03         ` Andrew Jones
@ 2020-08-13  9:05           ` Mark Rutland
  2020-08-13  9:49             ` Andrew Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Rutland @ 2020-08-13  9:05 UTC (permalink / raw)
  To: Andrew Jones; +Cc: alex.bennee, peter.maydell, Richard Henderson, qemu-devel

On Thu, Aug 13, 2020 at 08:03:21AM +0200, Andrew Jones wrote:
> Thinking about this some more, maybe we don't need pauth-arch.
> If we don't, then it simplifies nicely to
> 
>  # Default (enabled with architected algorithm)
>  -cpu max[,pauth=on][,pauth-fast=off]
> 
>  # Select pauth-fast
>  -cpu max[,pauth=on],pauth-fast=on

One reason that users may wish to choose the IMP-DEF algorithm is for
functional testing regardless of speed (since APA+GPA / API+GPI depend
on whether the algo is architected or imp-def).

Given that, I think that "impdef" is a better option name than
"pauth-fast", and speed is a benefit but not the only reason to use the
option.

How about hacing a 'pauth-algo' option which defaults to architected,
e.g.

| -cpu max,pauth={on,off},pauth-algo={impdef,architected}

... then in future the 'pauth={on,off}' bit could de extended to cover
FPAC version etc independently of the algorithm, but for now that can be
a boolean.

Thanks,
Mark.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-13  9:05           ` Mark Rutland
@ 2020-08-13  9:49             ` Andrew Jones
  2020-08-13 11:10               ` Mark Rutland
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Jones @ 2020-08-13  9:49 UTC (permalink / raw)
  To: Mark Rutland; +Cc: alex.bennee, peter.maydell, Richard Henderson, qemu-devel

On Thu, Aug 13, 2020 at 10:05:04AM +0100, Mark Rutland wrote:
> On Thu, Aug 13, 2020 at 08:03:21AM +0200, Andrew Jones wrote:
> > Thinking about this some more, maybe we don't need pauth-arch.
> > If we don't, then it simplifies nicely to
> > 
> >  # Default (enabled with architected algorithm)
> >  -cpu max[,pauth=on][,pauth-fast=off]
> > 
> >  # Select pauth-fast
> >  -cpu max[,pauth=on],pauth-fast=on
> 
> One reason that users may wish to choose the IMP-DEF algorithm is for
> functional testing regardless of speed (since APA+GPA / API+GPI depend
> on whether the algo is architected or imp-def).
> 
> Given that, I think that "impdef" is a better option name than
> "pauth-fast", and speed is a benefit but not the only reason to use the
> option.

I was going with pauth-fast because in this case Richard identified a
need for a fast version. If we identify another need later, then we may
want to add another impdef version, e.g. pauth-foo. Maybe all the impdef
versions should have impdef in their name to make that more explicit?

 pauth-impdef-fast
 pauth-impdef-foo
 ...

That seems a bit verbose, though, and we can document that pauth-* are
impdefs and that the default is architected.

> 
> How about hacing a 'pauth-algo' option which defaults to architected,
> e.g.
> 
> | -cpu max,pauth={on,off},pauth-algo={impdef,architected}
> 
> ... then in future the 'pauth={on,off}' bit could de extended to cover
> FPAC version etc independently of the algorithm, but for now that can be
> a boolean.
>

Keeping pauth a boolean, but creating a pauth-algo enum doesn't help us
much, because probing will only be possible for pauth. The prober could
only decide to enable pauth with the default algo or not. We could create
a pauth-specific probe, similar to the gic-specific probe, but, IMO, it's
really not necessary. If we need multiple pauth-* properties, then we can
have them all. It just adds a few more lines of logic to the pauth
finalize function.

Thanks,
drew



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] target/arm: Add cpu property to control pauth
  2020-08-13  9:49             ` Andrew Jones
@ 2020-08-13 11:10               ` Mark Rutland
  0 siblings, 0 replies; 13+ messages in thread
From: Mark Rutland @ 2020-08-13 11:10 UTC (permalink / raw)
  To: Andrew Jones; +Cc: alex.bennee, peter.maydell, Richard Henderson, qemu-devel

On Thu, Aug 13, 2020 at 11:49:07AM +0200, Andrew Jones wrote:
> On Thu, Aug 13, 2020 at 10:05:04AM +0100, Mark Rutland wrote:
> > On Thu, Aug 13, 2020 at 08:03:21AM +0200, Andrew Jones wrote:
> > > Thinking about this some more, maybe we don't need pauth-arch.
> > > If we don't, then it simplifies nicely to
> > > 
> > >  # Default (enabled with architected algorithm)
> > >  -cpu max[,pauth=on][,pauth-fast=off]
> > > 
> > >  # Select pauth-fast
> > >  -cpu max[,pauth=on],pauth-fast=on
> > 
> > One reason that users may wish to choose the IMP-DEF algorithm is for
> > functional testing regardless of speed (since APA+GPA / API+GPI depend
> > on whether the algo is architected or imp-def).
> > 
> > Given that, I think that "impdef" is a better option name than
> > "pauth-fast", and speed is a benefit but not the only reason to use the
> > option.
> 
> I was going with pauth-fast because in this case Richard identified a
> need for a fast version. If we identify another need later, then we may
> want to add another impdef version, e.g. pauth-foo. Maybe all the impdef
> versions should have impdef in their name to make that more explicit?
> 
>  pauth-impdef-fast
>  pauth-impdef-foo

Something like that is fine by me.

> > How about hacing a 'pauth-algo' option which defaults to architected,
> > e.g.
> > 
> > | -cpu max,pauth={on,off},pauth-algo={impdef,architected}
> > 
> > ... then in future the 'pauth={on,off}' bit could de extended to cover
> > FPAC version etc independently of the algorithm, but for now that can be
> > a boolean.
> 
> Keeping pauth a boolean, but creating a pauth-algo enum doesn't help us
> much, because probing will only be possible for pauth. The prober could
> only decide to enable pauth with the default algo or not. We could create
> a pauth-specific probe, similar to the gic-specific probe, but, IMO, it's
> really not necessary. If we need multiple pauth-* properties, then we can
> have them all. It just adds a few more lines of logic to the pauth
> finalize function.

I suspect that it will be necessary in future handle multiple options to
enumerate things like FPAC/EPAC separately from the algorithm used,
which is why I suggested the algo being its own option now.

I can live with whatever you folk decide on.

Thanks,
Mark.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-08-13 11:19 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-12  6:53 [PATCH 0/2] target/arm: Implement an IMPDEF pauth algorithm Richard Henderson
2020-08-12  6:53 ` [PATCH 1/2] target/arm: Add cpu property to control pauth Richard Henderson
2020-08-12 11:00   ` Andrew Jones
2020-08-12 15:10     ` Richard Henderson
2020-08-12 16:31       ` Andrew Jones
2020-08-13  6:03         ` Andrew Jones
2020-08-13  9:05           ` Mark Rutland
2020-08-13  9:49             ` Andrew Jones
2020-08-13 11:10               ` Mark Rutland
2020-08-12  6:53 ` [PATCH 2/2] target/arm: Implement an IMPDEF pauth algorithm Richard Henderson
2020-08-12  9:49   ` Alex Bennée
2020-08-12 15:13     ` Richard Henderson
2020-08-12 17:13       ` Alex Bennée

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.