[Qemu-devel] [PATCH v5 0/3] Live migration optimization for Thunderx platform

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v5 0/3] Live migration optimization for Thunderx platform
@ 2016-12-07 17:06 vijay.kilari
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 1/3] cutils: Set __builtin_prefetch optional parameters vijay.kilari
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: vijay.kilari @ 2016-12-07 17:06 UTC (permalink / raw)
  To: qemu-arm, peter.maydell, pbonzini, rth
  Cc: qemu-devel, vijay.kilari, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

The CPU MIDR_EL1 register is exposed to userspace for arm64
with the below patch.
https://lkml.org/lkml/2016/7/8/467

Thunderx platform requires explicit prefetch instruction to
provide prefetch hint. Using MIDR_EL1 information, provided
by above kernel patch, prefetch is executed if the platform
is Thunderx.

The results of live migration time improvement is provided
in commit message of patch 3.

Note: Check for size of while prefetching beyond page is
not added. Making this check is counter productive on
performance of live migration.

v4 => v5:
   - Compile util/aarch64-cpuid.c when CONFIG_LINUX enabled
   - Added stubs include/qemu/aarch64-cpuid.h if __aarch64__ and
     CONFIG_LINUX are not enabled.
v3 => v4:
   - Dropped allocation of memory for buf in
     qemu_read_aarch64_midr_el1()
   - Moved MIDR reg definitions to header file
   - Dropped arm64 and thunder specific code from generic
     function.

v2 => v3:
   - Rebased on top of richard's patches.
   - Consider cache line size and line number to prefetch
   - Passed optional parameters to __builtin_prefetch
v1 => v2:
   - Rename util/cpuinfo.c as util/aarch64-cpuid.c
   - Introduced header file include/qemu/aarch64-cpuid.h
   - Place all arch specific code under define __aarch64__ and
     CONFIG_LINUX.
   - Used builtin_prefetch() to add prefetch instruction.
   - Moved arch specific changes out of generic code
   - Dropped prefetching 5th cache line.

Vijaya Kumar K (3):
  cutils: Set __builtin_prefetch optional parameters
  utils: Add helper to read arm MIDR_EL1 register
  utils: Add prefetch for Thunderx platform

 include/qemu/aarch64-cpuid.h | 38 ++++++++++++++++++++++++++++++++
 util/Makefile.objs           |  1 +
 util/aarch64-cpuid.c         | 52 ++++++++++++++++++++++++++++++++++++++++++++
 util/bufferiszero.c          | 43 +++++++++++++++++++++++++++++++-----
 4 files changed, 129 insertions(+), 5 deletions(-)
 create mode 100644 include/qemu/aarch64-cpuid.h
 create mode 100644 util/aarch64-cpuid.c

-- 
1.9.1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH v5 1/3] cutils: Set __builtin_prefetch optional parameters
  2016-12-07 17:06 [Qemu-devel] [PATCH v5 0/3] Live migration optimization for Thunderx platform vijay.kilari
@ 2016-12-07 17:06 ` vijay.kilari
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 2/3] utils: Add helper to read arm MIDR_EL1 register vijay.kilari
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 3/3] utils: Add prefetch for Thunderx platform vijay.kilari
  2 siblings, 0 replies; 6+ messages in thread
From: vijay.kilari @ 2016-12-07 17:06 UTC (permalink / raw)
  To: qemu-arm, peter.maydell, pbonzini, rth
  Cc: qemu-devel, vijay.kilari, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Optional parameters of __builtin_prefetch() which specifies
rw and locality to 0's. For checking buffer is zero, set rw as read
and temporal locality to 0.

On arm64, __builtin_prefetch(addr) generates 'prfm    pldl1keep'
where __builtin_prefetch(addr, 0, 0) generates 'prfm pldl1strm'
instruction which is optimal for this use case

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 util/bufferiszero.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index eb974b7..421d945 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -49,7 +49,7 @@ buffer_zero_int(const void *buf, size_t len)
         const uint64_t *e = (uint64_t *)(((uintptr_t)buf + len) & -8);
 
         for (; p + 8 <= e; p += 8) {
-            __builtin_prefetch(p + 8);
+            __builtin_prefetch(p + 8, 0, 0);
             if (t) {
                 return false;
             }
@@ -86,7 +86,7 @@ buffer_zero_sse2(const void *buf, size_t len)
 
     /* Loop over 16-byte aligned blocks of 64.  */
     while (likely(p <= e)) {
-        __builtin_prefetch(p);
+        __builtin_prefetch(p, 0, 0);
         t = _mm_cmpeq_epi8(t, zero);
         if (unlikely(_mm_movemask_epi8(t) != 0xFFFF)) {
             return false;
@@ -127,7 +127,7 @@ buffer_zero_sse4(const void *buf, size_t len)
 
     /* Loop over 16-byte aligned blocks of 64.  */
     while (likely(p <= e)) {
-        __builtin_prefetch(p);
+        __builtin_prefetch(p, 0, 0);
         if (unlikely(!_mm_testz_si128(t, t))) {
             return false;
         }
@@ -162,7 +162,7 @@ buffer_zero_avx2(const void *buf, size_t len)
     if (likely(p <= e)) {
         /* Loop over 32-byte aligned blocks of 128.  */
         do {
-            __builtin_prefetch(p);
+            __builtin_prefetch(p, 0, 0);
             if (unlikely(!_mm256_testz_si256(t, t))) {
                 return false;
             }
@@ -303,7 +303,7 @@ bool buffer_is_zero(const void *buf, size_t len)
     }
 
     /* Fetch the beginning of the buffer while we select the accelerator.  */
-    __builtin_prefetch(buf);
+    __builtin_prefetch(buf, 0, 0);
 
     /* Use an optimized zero check if possible.  Note that this also
        includes a check for an unrolled loop over 64-bit integers.  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH v5 2/3] utils: Add helper to read arm MIDR_EL1 register
  2016-12-07 17:06 [Qemu-devel] [PATCH v5 0/3] Live migration optimization for Thunderx platform vijay.kilari
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 1/3] cutils: Set __builtin_prefetch optional parameters vijay.kilari
@ 2016-12-07 17:06 ` vijay.kilari
  2016-12-16 14:04   ` Peter Maydell
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 3/3] utils: Add prefetch for Thunderx platform vijay.kilari
  2 siblings, 1 reply; 6+ messages in thread
From: vijay.kilari @ 2016-12-07 17:06 UTC (permalink / raw)
  To: qemu-arm, peter.maydell, pbonzini, rth
  Cc: qemu-devel, vijay.kilari, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Add helper API to read MIDR_EL1 registers to fetch
cpu identification information. This helps in
adding errata's and architecture specific features.

This is implemented only for arm architecture.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 include/qemu/aarch64-cpuid.h | 38 ++++++++++++++++++++++++++++++++
 util/Makefile.objs           |  1 +
 util/aarch64-cpuid.c         | 52 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+)

diff --git a/include/qemu/aarch64-cpuid.h b/include/qemu/aarch64-cpuid.h
new file mode 100644
index 0000000..fb88ed8
--- /dev/null
+++ b/include/qemu/aarch64-cpuid.h
@@ -0,0 +1,38 @@
+#ifndef QEMU_AARCH64_CPUID_H
+#define QEMU_AARCH64_CPUID_H
+
+#if defined(__aarch64__) && defined(CONFIG_LINUX)
+#define MIDR_IMPLEMENTER_SHIFT  24
+#define MIDR_IMPLEMENTER_MASK   (0xffULL << MIDR_IMPLEMENTER_SHIFT)
+#define MIDR_ARCHITECTURE_SHIFT 16
+#define MIDR_ARCHITECTURE_MASK  (0xf << MIDR_ARCHITECTURE_SHIFT)
+#define MIDR_PARTNUM_SHIFT      4
+#define MIDR_PARTNUM_MASK       (0xfff << MIDR_PARTNUM_SHIFT)
+
+#define MIDR_CPU_PART(imp, partnum) \
+        (((imp)                 << MIDR_IMPLEMENTER_SHIFT)  | \
+        (0xf                    << MIDR_ARCHITECTURE_SHIFT) | \
+        ((partnum)              << MIDR_PARTNUM_SHIFT))
+
+#define ARM_CPU_IMP_CAVIUM        0x43
+#define CAVIUM_CPU_PART_THUNDERX  0x0A1
+
+#define MIDR_THUNDERX_PASS2  \
+               MIDR_CPU_PART(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
+#define CPU_MODEL_MASK  (MIDR_IMPLEMENTER_MASK | MIDR_ARCHITECTURE_MASK | \
+                         MIDR_PARTNUM_MASK)
+
+uint64_t get_aarch64_cpu_id(void);
+bool is_thunderx_pass2_cpu(void);
+#else
+static inline uint64_t get_aarch64_cpu_id(void)
+{
+    return 0;
+}
+
+static inline bool is_thunderx_pass2_cpu(void)
+{
+    return false;
+}
+#endif
+#endif
diff --git a/util/Makefile.objs b/util/Makefile.objs
index ad0f9c7..a9585c9 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -36,3 +36,4 @@ util-obj-y += log.o
 util-obj-y += qdist.o
 util-obj-y += qht.o
 util-obj-y += range.o
+util-obj-$(CONFIG_LINUX) += aarch64-cpuid.o
diff --git a/util/aarch64-cpuid.c b/util/aarch64-cpuid.c
new file mode 100644
index 0000000..575f52e
--- /dev/null
+++ b/util/aarch64-cpuid.c
@@ -0,0 +1,52 @@
+/*
+ * Dealing with arm cpu identification information.
+ *
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * Authors:
+ *  Vijaya Kumar K <Vijaya.Kumar@cavium.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1
+ * or later.  See the COPYING.LIB file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/cutils.h"
+#include "qemu/aarch64-cpuid.h"
+
+#if defined(__aarch64__)
+static uint64_t qemu_read_aarch64_midr_el1(void)
+{
+    const char *file = "/sys/devices/system/cpu/cpu0/regs/identification/midr_el1";
+    char *buf;
+    uint64_t midr = 0;
+
+    if (!g_file_get_contents(file, &buf, 0, NULL)) {
+        goto out;
+    }
+
+    if (qemu_strtoull(buf, NULL, 0, &midr) < 0) {
+        midr = 0;
+        goto out;
+    }
+
+out:
+    g_free(buf);
+
+    return midr;
+}
+
+static uint64_t aarch64_midr_val;
+uint64_t get_aarch64_cpu_id(void)
+{
+    aarch64_midr_val = qemu_read_aarch64_midr_el1();
+    aarch64_midr_val &= CPU_MODEL_MASK;
+
+    return aarch64_midr_val;
+}
+
+bool is_thunderx_pass2_cpu(void)
+{
+    return aarch64_midr_val == MIDR_THUNDERX_PASS2;
+}
+#endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH v5 3/3] utils: Add prefetch for Thunderx platform
  2016-12-07 17:06 [Qemu-devel] [PATCH v5 0/3] Live migration optimization for Thunderx platform vijay.kilari
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 1/3] cutils: Set __builtin_prefetch optional parameters vijay.kilari
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 2/3] utils: Add helper to read arm MIDR_EL1 register vijay.kilari
@ 2016-12-07 17:06 ` vijay.kilari
  2 siblings, 0 replies; 6+ messages in thread
From: vijay.kilari @ 2016-12-07 17:06 UTC (permalink / raw)
  To: qemu-arm, peter.maydell, pbonzini, rth
  Cc: qemu-devel, vijay.kilari, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Thunderx pass2 chip requires explicit prefetch
instruction to give prefetch hint.

To speed up live migration on Thunderx platform,
prefetch instruction is added in zero buffer check
function.The below results show live migration time improvement
with prefetch instruction. VM with 4 VCPUs, 8GB RAM is migrated.

Code for decoding cache size is taken from Richard's patch.

With 1K page size and without prefetch
======================================
Migration status: completed
total time: 13556 milliseconds
downtime: 380 milliseconds
setup: 15 milliseconds
transferred ram: 265557 kbytes
throughput: 160.51 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 8344672 pages
skipped: 0 pages
normal: 190724 pages
normal bytes: 190724 kbytes
dirty sync count: 3

With 1K page size and with prefetch
===================================
Migration status: completed
total time: 8218 milliseconds
downtime: 395 milliseconds
setup: 15 milliseconds
transferred ram: 274484 kbytes
throughput: 273.67 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 8341921 pages
skipped: 0 pages
normal: 199606 pages
normal bytes: 199606 kbytes
dirty sync count: 3
(qemu)

With 4K page size and without prefetch
======================================
Migration status: completed
total time: 11121 milliseconds
downtime: 372 milliseconds
setup: 5 milliseconds
transferred ram: 231777 kbytes
throughput: 170.77 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2082158 pages
skipped: 0 pages
normal: 53265 pages
normal bytes: 213060 kbytes
dirty sync count: 3

With 4K page size and with prefetch
===================================
Migration status: completed
total time: 5893 milliseconds
downtime: 359 milliseconds
setup: 5 milliseconds
transferred ram: 225795 kbytes
throughput: 313.96 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2081903 pages
skipped: 0 pages
normal: 51773 pages
normal bytes: 207092 kbytes
dirty sync count: 3

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 util/bufferiszero.c | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index 421d945..ed3b31d 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -25,6 +25,11 @@
 #include "qemu-common.h"
 #include "qemu/cutils.h"
 #include "qemu/bswap.h"
+#include "qemu/aarch64-cpuid.h"
+
+static uint32_t cache_line_size = 64;
+static uint32_t prefetch_line_dist = 1;
+static uint32_t prefetch_distance = 8;
 
 static bool
 buffer_zero_int(const void *buf, size_t len)
@@ -49,7 +54,7 @@ buffer_zero_int(const void *buf, size_t len)
         const uint64_t *e = (uint64_t *)(((uintptr_t)buf + len) & -8);
 
         for (; p + 8 <= e; p += 8) {
-            __builtin_prefetch(p + 8, 0, 0);
+            __builtin_prefetch(p + prefetch_distance, 0, 0);
             if (t) {
                 return false;
             }
@@ -293,17 +298,45 @@ bool test_buffer_is_zero_next_accel(void)
 }
 #endif
 
+static void __attribute__((constructor)) init_cache_size(void)
+{
+#if defined(__aarch64__)
+    uint64_t t;
+
+    /* Use the DZP block size as a proxy for the cacheline size,
+       since the later is not available to userspace.  This seems
+       to work in practice for existing implementations.  */
+    asm("mrs %0, dczid_el0" : "=r"(t));
+    if ((1 << ((t & 0xf) + 2)) >= 128) {
+        cache_line_size = 128;
+    }
+#endif
+
+    get_aarch64_cpu_id();
+    if (is_thunderx_pass2_cpu()) {
+        prefetch_line_dist = 3;
+        prefetch_distance = (prefetch_line_dist * cache_line_size) /
+                             sizeof(uint64_t);
+    }
+}
+
 /*
  * Checks if a buffer is all zeroes
  */
 bool buffer_is_zero(const void *buf, size_t len)
 {
+    int i;
+    uint32_t prefetch_distance_bytes;
+
     if (unlikely(len == 0)) {
         return true;
     }
 
     /* Fetch the beginning of the buffer while we select the accelerator.  */
-    __builtin_prefetch(buf, 0, 0);
+    prefetch_distance_bytes = prefetch_line_dist * cache_line_size;
+    for (i = 0; i < prefetch_distance_bytes && i < len; i += cache_line_size) {
+        __builtin_prefetch(buf + i, 0, 0);
+    }
 
     /* Use an optimized zero check if possible.  Note that this also
        includes a check for an unrolled loop over 64-bit integers.  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH v5 2/3] utils: Add helper to read arm MIDR_EL1 register
  2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 2/3] utils: Add helper to read arm MIDR_EL1 register vijay.kilari
@ 2016-12-16 14:04   ` Peter Maydell
  2016-12-19  9:10     ` Vijay Kilari
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Maydell @ 2016-12-16 14:04 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: qemu-arm, Paolo Bonzini, Richard Henderson, QEMU Developers,
	Vijaya Kumar K

On 7 December 2016 at 17:06,  <vijay.kilari@gmail.com> wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Add helper API to read MIDR_EL1 registers to fetch
> cpu identification information. This helps in
> adding errata's and architecture specific features.
>
> This is implemented only for arm architecture.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  include/qemu/aarch64-cpuid.h | 38 ++++++++++++++++++++++++++++++++
>  util/Makefile.objs           |  1 +
>  util/aarch64-cpuid.c         | 52 ++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 91 insertions(+)
>
> diff --git a/include/qemu/aarch64-cpuid.h b/include/qemu/aarch64-cpuid.h
> new file mode 100644
> index 0000000..fb88ed8
> --- /dev/null
> +++ b/include/qemu/aarch64-cpuid.h
> @@ -0,0 +1,38 @@
> +#ifndef QEMU_AARCH64_CPUID_H
> +#define QEMU_AARCH64_CPUID_H
> +
> +#if defined(__aarch64__) && defined(CONFIG_LINUX)
> +#define MIDR_IMPLEMENTER_SHIFT  24
> +#define MIDR_IMPLEMENTER_MASK   (0xffULL << MIDR_IMPLEMENTER_SHIFT)
> +#define MIDR_ARCHITECTURE_SHIFT 16
> +#define MIDR_ARCHITECTURE_MASK  (0xf << MIDR_ARCHITECTURE_SHIFT)
> +#define MIDR_PARTNUM_SHIFT      4
> +#define MIDR_PARTNUM_MASK       (0xfff << MIDR_PARTNUM_SHIFT)
> +
> +#define MIDR_CPU_PART(imp, partnum) \
> +        (((imp)                 << MIDR_IMPLEMENTER_SHIFT)  | \
> +        (0xf                    << MIDR_ARCHITECTURE_SHIFT) | \
> +        ((partnum)              << MIDR_PARTNUM_SHIFT))
> +
> +#define ARM_CPU_IMP_CAVIUM        0x43
> +#define CAVIUM_CPU_PART_THUNDERX  0x0A1
> +
> +#define MIDR_THUNDERX_PASS2  \
> +               MIDR_CPU_PART(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
> +#define CPU_MODEL_MASK  (MIDR_IMPLEMENTER_MASK | MIDR_ARCHITECTURE_MASK | \
> +                         MIDR_PARTNUM_MASK)
> +
> +uint64_t get_aarch64_cpu_id(void);
> +bool is_thunderx_pass2_cpu(void);
> +#else
> +static inline uint64_t get_aarch64_cpu_id(void)
> +{
> +    return 0;
> +}
> +
> +static inline bool is_thunderx_pass2_cpu(void)
> +{
> +    return false;
> +}
> +#endif
> +#endif
> diff --git a/util/Makefile.objs b/util/Makefile.objs
> index ad0f9c7..a9585c9 100644
> --- a/util/Makefile.objs
> +++ b/util/Makefile.objs
> @@ -36,3 +36,4 @@ util-obj-y += log.o
>  util-obj-y += qdist.o
>  util-obj-y += qht.o
>  util-obj-y += range.o
> +util-obj-$(CONFIG_LINUX) += aarch64-cpuid.o
> diff --git a/util/aarch64-cpuid.c b/util/aarch64-cpuid.c
> new file mode 100644
> index 0000000..575f52e
> --- /dev/null
> +++ b/util/aarch64-cpuid.c
> @@ -0,0 +1,52 @@
> +/*
> + * Dealing with arm cpu identification information.
> + *
> + * Copyright (C) 2016 Cavium, Inc.
> + *
> + * Authors:
> + *  Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.1
> + * or later.  See the COPYING.LIB file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/cutils.h"
> +#include "qemu/aarch64-cpuid.h"
> +
> +#if defined(__aarch64__)
> +static uint64_t qemu_read_aarch64_midr_el1(void)
> +{
> +    const char *file = "/sys/devices/system/cpu/cpu0/regs/identification/midr_el1";

If CPU0 happens to be offline (eg hot-unplugged) then this file
won't exist, and we'll fail to identify any MIDR value.

The API as designed here also doesn't seem to consider
the idea of big.LITTLE systems -- if there are multiple
CPUs with different MIDRs, which one should we return here?

> +    char *buf;
> +    uint64_t midr = 0;
> +
> +    if (!g_file_get_contents(file, &buf, 0, NULL)) {
> +        goto out;
> +    }
> +
> +    if (qemu_strtoull(buf, NULL, 0, &midr) < 0) {
> +        midr = 0;
> +        goto out;
> +    }
> +
> +out:
> +    g_free(buf);
> +
> +    return midr;

thanks
-- PMM

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH v5 2/3] utils: Add helper to read arm MIDR_EL1 register
  2016-12-16 14:04   ` Peter Maydell
@ 2016-12-19  9:10     ` Vijay Kilari
  0 siblings, 0 replies; 6+ messages in thread
From: Vijay Kilari @ 2016-12-19  9:10 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Paolo Bonzini, Richard Henderson, QEMU Developers,
	Vijaya Kumar K

On Fri, Dec 16, 2016 at 7:34 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 7 December 2016 at 17:06,  <vijay.kilari@gmail.com> wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Add helper API to read MIDR_EL1 registers to fetch
>> cpu identification information. This helps in
>> adding errata's and architecture specific features.
>>
>> This is implemented only for arm architecture.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  include/qemu/aarch64-cpuid.h | 38 ++++++++++++++++++++++++++++++++
>>  util/Makefile.objs           |  1 +
>>  util/aarch64-cpuid.c         | 52 ++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 91 insertions(+)
>>
>> diff --git a/include/qemu/aarch64-cpuid.h b/include/qemu/aarch64-cpuid.h
>> new file mode 100644
>> index 0000000..fb88ed8
>> --- /dev/null
>> +++ b/include/qemu/aarch64-cpuid.h
>> @@ -0,0 +1,38 @@
>> +#ifndef QEMU_AARCH64_CPUID_H
>> +#define QEMU_AARCH64_CPUID_H
>> +
>> +#if defined(__aarch64__) && defined(CONFIG_LINUX)
>> +#define MIDR_IMPLEMENTER_SHIFT  24
>> +#define MIDR_IMPLEMENTER_MASK   (0xffULL << MIDR_IMPLEMENTER_SHIFT)
>> +#define MIDR_ARCHITECTURE_SHIFT 16
>> +#define MIDR_ARCHITECTURE_MASK  (0xf << MIDR_ARCHITECTURE_SHIFT)
>> +#define MIDR_PARTNUM_SHIFT      4
>> +#define MIDR_PARTNUM_MASK       (0xfff << MIDR_PARTNUM_SHIFT)
>> +
>> +#define MIDR_CPU_PART(imp, partnum) \
>> +        (((imp)                 << MIDR_IMPLEMENTER_SHIFT)  | \
>> +        (0xf                    << MIDR_ARCHITECTURE_SHIFT) | \
>> +        ((partnum)              << MIDR_PARTNUM_SHIFT))
>> +
>> +#define ARM_CPU_IMP_CAVIUM        0x43
>> +#define CAVIUM_CPU_PART_THUNDERX  0x0A1
>> +
>> +#define MIDR_THUNDERX_PASS2  \
>> +               MIDR_CPU_PART(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
>> +#define CPU_MODEL_MASK  (MIDR_IMPLEMENTER_MASK | MIDR_ARCHITECTURE_MASK | \
>> +                         MIDR_PARTNUM_MASK)
>> +
>> +uint64_t get_aarch64_cpu_id(void);
>> +bool is_thunderx_pass2_cpu(void);
>> +#else
>> +static inline uint64_t get_aarch64_cpu_id(void)
>> +{
>> +    return 0;
>> +}
>> +
>> +static inline bool is_thunderx_pass2_cpu(void)
>> +{
>> +    return false;
>> +}
>> +#endif
>> +#endif
>> diff --git a/util/Makefile.objs b/util/Makefile.objs
>> index ad0f9c7..a9585c9 100644
>> --- a/util/Makefile.objs
>> +++ b/util/Makefile.objs
>> @@ -36,3 +36,4 @@ util-obj-y += log.o
>>  util-obj-y += qdist.o
>>  util-obj-y += qht.o
>>  util-obj-y += range.o
>> +util-obj-$(CONFIG_LINUX) += aarch64-cpuid.o
>> diff --git a/util/aarch64-cpuid.c b/util/aarch64-cpuid.c
>> new file mode 100644
>> index 0000000..575f52e
>> --- /dev/null
>> +++ b/util/aarch64-cpuid.c
>> @@ -0,0 +1,52 @@
>> +/*
>> + * Dealing with arm cpu identification information.
>> + *
>> + * Copyright (C) 2016 Cavium, Inc.
>> + *
>> + * Authors:
>> + *  Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> + *
>> + * This work is licensed under the terms of the GNU LGPL, version 2.1
>> + * or later.  See the COPYING.LIB file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/cutils.h"
>> +#include "qemu/aarch64-cpuid.h"
>> +
>> +#if defined(__aarch64__)
>> +static uint64_t qemu_read_aarch64_midr_el1(void)
>> +{
>> +    const char *file = "/sys/devices/system/cpu/cpu0/regs/identification/midr_el1";
>
> If CPU0 happens to be offline (eg hot-unplugged) then this file
> won't exist, and we'll fail to identify any MIDR value.

I thought wrongly that cpu0 cannot be hot-plugged on arm64.
At-least on our platform, it is not allowed.

One solution I think of is to get current running cpu using sched_getcpu()
and fetch midr from that cpu path
OR  read /sys/devices/system/cpu/online and find online cpu.

>
> The API as designed here also doesn't seem to consider
> the idea of big.LITTLE systems -- if there are multiple
> CPUs with different MIDRs, which one should we return here?

Yes, this is the limitation here to handle big.LITTLE configuration.
It was discussed in initial version of this patch series.

https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg01221.html

(From use case point of view, we require only Implementer ID, which
 won't be different for big.LITTLE configuration. I agree that this generic
 function should work for other use cases as well).

So I will add a comment here.

>
>> +    char *buf;
>> +    uint64_t midr = 0;
>> +
>> +    if (!g_file_get_contents(file, &buf, 0, NULL)) {
>> +        goto out;
>> +    }
>> +
>> +    if (qemu_strtoull(buf, NULL, 0, &midr) < 0) {
>> +        midr = 0;
>> +        goto out;
>> +    }
>> +
>> +out:
>> +    g_free(buf);
>> +
>> +    return midr;
>
> thanks
> -- PMM

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-12-19  9:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-07 17:06 [Qemu-devel] [PATCH v5 0/3] Live migration optimization for Thunderx platform vijay.kilari
2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 1/3] cutils: Set __builtin_prefetch optional parameters vijay.kilari
2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 2/3] utils: Add helper to read arm MIDR_EL1 register vijay.kilari
2016-12-16 14:04   ` Peter Maydell
2016-12-19  9:10     ` Vijay Kilari
2016-12-07 17:06 ` [Qemu-devel] [PATCH v5 3/3] utils: Add prefetch for Thunderx platform vijay.kilari

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.