qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] target/arm: Support for Data Cache Clean up to PoP
@ 2019-09-10  9:56 Beata Michalska
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 1/4] tcg: cputlb: Add probe_read Beata Michalska
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Beata Michalska @ 2019-09-10  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini


ARMv8.2 introduced support for Data Cache Clean instructions to PoP
(point-of-persistence) and PoDP (point-of-deep-persistence):
ARMv8.2-DCCVAP &  ARMv8.2-DCCVADP respectively.
This patch set adds support for emulating both, though there is no
distinction between the two points: the PoDP is assumed to represent
the same point of persistence as PoP. Case there is no such point specified
for the considered memory system both will fall back to the DV CVAC inst
(clean up to the point of coherency).
The changes introduced include adding probe_read for validating read memory
access to allow verification for mandatory read access for both cache
clean instructions, along with support for writeback for requested memory
regions through msync, if one is available, based otherwise on fsyncdata.

As currently the virt platform is missing support for NVDIMM,
the changes have been tested  with [1] & [2]


[1] https://patchwork.kernel.org/cover/10830237/
[2] https://patchwork.kernel.org/project/qemu-devel/list/?series=159441


Beata Michalska (4):
  tcg: cputlb: Add probe_read
  Memory: Enable writeback for given memory region
  migration: ram: Switch to ram block writeback
  target/arm: Add support for DC CVAP & DC CVADP ins

 configure                  | 24 ++++++++++++++++++++++++
 exec.c                     | 38 ++++++++++++++++++++++++++++++++++++++
 include/exec/exec-all.h    |  6 ++++++
 include/exec/memory.h      |  6 ++++++
 include/exec/ram_addr.h    |  6 ++++++
 linux-user/elfload.c       | 18 +++++++++++++++++-
 memory.c                   | 12 ++++++++++++
 migration/ram.c            |  5 +----
 target/arm/cpu.h           | 13 ++++++++++++-
 target/arm/cpu64.c         |  1 +
 target/arm/helper.c        | 24 ++++++++++++++++++++++++
 target/arm/helper.h        |  1 +
 target/arm/op_helper.c     | 36 ++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c |  5 +++++
 14 files changed, 189 insertions(+), 6 deletions(-)

-- 
2.7.4



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 1/4] tcg: cputlb: Add probe_read
  2019-09-10  9:56 [Qemu-devel] [PATCH 0/4] target/arm: Support for Data Cache Clean up to PoP Beata Michalska
@ 2019-09-10  9:56 ` Beata Michalska
  2019-09-23 23:54   ` Alex Bennée
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region Beata Michalska
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Beata Michalska @ 2019-09-10  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini

Add probe_read alongside the write probing equivalent.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 include/exec/exec-all.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 81b02eb2fe..e1785700c3 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -319,6 +319,12 @@ static inline void *probe_write(CPUArchState *env, target_ulong addr, int size,
     return probe_access(env, addr, size, MMU_DATA_STORE, mmu_idx, retaddr);
 }
 
+static inline void *probe_read(CPUArchState *env, target_ulong addr, int size,
+                               int mmu_idx, uintptr_t retaddr)
+{
+    return probe_access(env, addr, size, MMU_DATA_LOAD, mmu_idx, retaddr);
+}
+
 #define CODE_GEN_ALIGN           16 /* must be >= of the size of a icache line */
 
 /* Estimated block size for TB allocation.  */
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region
  2019-09-10  9:56 [Qemu-devel] [PATCH 0/4] target/arm: Support for Data Cache Clean up to PoP Beata Michalska
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 1/4] tcg: cputlb: Add probe_read Beata Michalska
@ 2019-09-10  9:56 ` Beata Michalska
  2019-09-24  0:00   ` Alex Bennée
                     ` (2 more replies)
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback Beata Michalska
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins Beata Michalska
  3 siblings, 3 replies; 25+ messages in thread
From: Beata Michalska @ 2019-09-10  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini

Add an option to trigger memory writeback to sync given memory region
with the corresponding backing store, case one is available.
This extends the support for persistent memory, allowing syncing on-demand.

Also, adding verification for msync support on host.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 configure               | 24 ++++++++++++++++++++++++
 exec.c                  | 38 ++++++++++++++++++++++++++++++++++++++
 include/exec/memory.h   |  6 ++++++
 include/exec/ram_addr.h |  6 ++++++
 memory.c                | 12 ++++++++++++
 5 files changed, 86 insertions(+)

diff --git a/configure b/configure
index 95134c0180..bdb7dc47e9 100755
--- a/configure
+++ b/configure
@@ -5081,6 +5081,26 @@ if compile_prog "" "" ; then
     fdatasync=yes
 fi
 
+##########################################
+# verify support for msyc
+
+msync=no
+cat > $TMPC << EOF
+#include <unistd.h>
+#include <sys/mman.h>
+int main(void) {
+#if defined(_POSIX_MAPPED_FILES) && _POSIX_MAPPED_FILES > 0 \
+&& defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
+return msync(NULL,0, MS_SYNC);
+#else
+#error Not supported
+#endif
+}
+EOF
+if compile_prog "" "" ; then
+    msync=yes
+fi
+
 ##########################################
 # check if we have madvise
 
@@ -6413,6 +6433,7 @@ echo "fdt support       $fdt"
 echo "membarrier        $membarrier"
 echo "preadv support    $preadv"
 echo "fdatasync         $fdatasync"
+echo "msync             $msync"
 echo "madvise           $madvise"
 echo "posix_madvise     $posix_madvise"
 echo "posix_memalign    $posix_memalign"
@@ -6952,6 +6973,9 @@ fi
 if test "$fdatasync" = "yes" ; then
   echo "CONFIG_FDATASYNC=y" >> $config_host_mak
 fi
+if test "$msync" = "yes" ; then
+    echo "CONFIG_MSYNC=y" >> $config_host_mak
+fi
 if test "$madvise" = "yes" ; then
   echo "CONFIG_MADVISE=y" >> $config_host_mak
 fi
diff --git a/exec.c b/exec.c
index 235d6bc883..5ed6908368 100644
--- a/exec.c
+++ b/exec.c
@@ -65,6 +65,8 @@
 #include "exec/ram_addr.h"
 #include "exec/log.h"
 
+#include "qemu/pmem.h"
+
 #include "migration/vmstate.h"
 
 #include "qemu/range.h"
@@ -2182,6 +2184,42 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
     return 0;
 }
 
+/*
+ * Trigger sync on the given ram block for range [start, start + length]
+ * with the backing store if available.
+ * @Note: this is supposed to be a SYNC op.
+ */
+void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length)
+{
+    void *addr = ramblock_ptr(block, start);
+
+    /*
+     * The requested range might spread up to the very end of the block
+     */
+    if ((start + length) > block->used_length) {
+        error_report("%s: sync range outside the block boundires: "
+                     "start: " RAM_ADDR_FMT " length: " RAM_ADDR_FMT
+                     " block length: " RAM_ADDR_FMT " Narrowing down ..." ,
+                     __func__, start, length, block->used_length);
+        length = block->used_length - start;
+    }
+
+#ifdef CONFIG_LIBPMEM
+    /* The lack of support for pmem should not block the sync */
+    if (ramblock_is_pmem(block)) {
+        pmem_persist(addr, length);
+    } else
+#endif
+    if (block->fd >= 0) {
+#ifdef CONFIG_MSYNC
+        msync((void *)((uintptr_t)addr & qemu_host_page_mask),
+               HOST_PAGE_ALIGN(length), MS_SYNC);
+#else
+        qemu_fdatasync(block->fd);
+#endif
+    }
+}
+
 /* Called with ram_list.mutex held */
 static void dirty_memory_extend(ram_addr_t old_ram_size,
                                 ram_addr_t new_ram_size)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2dd810259d..ff0d7937cf 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1242,6 +1242,12 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr);
  */
 void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize,
                               Error **errp);
+/**
+ * memory_region_do_writeback: Trigger writeback for selected address range
+ * [addr, addr + size]
+ *
+ */
+void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size);
 
 /**
  * memory_region_set_log: Turn dirty logging on or off for a region.
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index a327a80cfe..d4bce81a03 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -180,6 +180,12 @@ void qemu_ram_free(RAMBlock *block);
 
 int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp);
 
+void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length);
+
+/* Clear whole block of mem */
+#define qemu_ram_block_writeback(rb)    \
+    qemu_ram_writeback(rb, 0, rb->used_length)
+
 #define DIRTY_CLIENTS_ALL     ((1 << DIRTY_MEMORY_NUM) - 1)
 #define DIRTY_CLIENTS_NOCODE  (DIRTY_CLIENTS_ALL & ~(1 << DIRTY_MEMORY_CODE))
 
diff --git a/memory.c b/memory.c
index 61a254c3f9..436eb64737 100644
--- a/memory.c
+++ b/memory.c
@@ -2228,6 +2228,18 @@ void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize, Error **errp
     qemu_ram_resize(mr->ram_block, newsize, errp);
 }
 
+
+void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size)
+{
+    /*
+     * Might be extended case needed to cover
+     * different types of memory regions
+     */
+    if (mr->ram_block && mr->dirty_log_mask) {
+        qemu_ram_writeback(mr->ram_block, addr, size);
+    }
+}
+
 /*
  * Call proper memory listeners about the change on the newly
  * added/removed CoalescedMemoryRange.
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-10  9:56 [Qemu-devel] [PATCH 0/4] target/arm: Support for Data Cache Clean up to PoP Beata Michalska
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 1/4] tcg: cputlb: Add probe_read Beata Michalska
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region Beata Michalska
@ 2019-09-10  9:56 ` Beata Michalska
  2019-09-10 10:26   ` Dr. David Alan Gilbert
  2019-09-24 16:30   ` Richard Henderson
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins Beata Michalska
  3 siblings, 2 replies; 25+ messages in thread
From: Beata Michalska @ 2019-09-10  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini

Switch to ram block writeback for pmem migration.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 migration/ram.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index b01a37e7ca..8ea0bd63fc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -33,7 +33,6 @@
 #include "qemu/bitops.h"
 #include "qemu/bitmap.h"
 #include "qemu/main-loop.h"
-#include "qemu/pmem.h"
 #include "xbzrle.h"
 #include "ram.h"
 #include "migration.h"
@@ -4064,9 +4063,7 @@ static int ram_load_cleanup(void *opaque)
     RAMBlock *rb;
 
     RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
-        if (ramblock_is_pmem(rb)) {
-            pmem_persist(rb->host, rb->used_length);
-        }
+        qemu_ram_block_writeback(rb);
     }
 
     xbzrle_load_cleanup();
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins
  2019-09-10  9:56 [Qemu-devel] [PATCH 0/4] target/arm: Support for Data Cache Clean up to PoP Beata Michalska
                   ` (2 preceding siblings ...)
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback Beata Michalska
@ 2019-09-10  9:56 ` Beata Michalska
  2019-09-23 23:54   ` Alex Bennée
                     ` (2 more replies)
  3 siblings, 3 replies; 25+ messages in thread
From: Beata Michalska @ 2019-09-10  9:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini

ARMv8.2 introduced support for Data Cache Clean instructions
to PoP (point-of-persistence) - DC CVAP and PoDP (point-of-deep-persistence)
- DV CVADP. Both specify conceptual points in a memory system where all writes
that are to reach them are considered persistent.
The support provided considers both to be actually the same so there is no
distinction between the two. If none is available (there is no backing store
for given memory) both will result in Data Cache Clean up to the point of
coherency. Otherwise sync for the specified range shall be performed.

Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
---
 linux-user/elfload.c       | 18 +++++++++++++++++-
 target/arm/cpu.h           | 13 ++++++++++++-
 target/arm/cpu64.c         |  1 +
 target/arm/helper.c        | 24 ++++++++++++++++++++++++
 target/arm/helper.h        |  1 +
 target/arm/op_helper.c     | 36 ++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c |  5 +++++
 7 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 3365e192eb..1ec00308d5 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -609,7 +609,12 @@ enum {
     ARM_HWCAP_A64_PACG          = 1UL << 31,
 };
 
+enum {
+    ARM_HWCAP2_A64_DCPODP   = 1 << 0,
+};
+
 #define ELF_HWCAP get_elf_hwcap()
+#define ELF_HWCAP2 get_elf_hwcap2()
 
 static uint32_t get_elf_hwcap(void)
 {
@@ -644,12 +649,23 @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE_ID(aa64_jscvt, ARM_HWCAP_A64_JSCVT);
     GET_FEATURE_ID(aa64_sb, ARM_HWCAP_A64_SB);
     GET_FEATURE_ID(aa64_condm_4, ARM_HWCAP_A64_FLAGM);
+    GET_FEATURE_ID(aa64_dcpop, ARM_HWCAP_A64_DCPOP);
 
-#undef GET_FEATURE_ID
 
     return hwcaps;
 }
 
+static uint32_t get_elf_hwcap2(void)
+{
+    ARMCPU *cpu = ARM_CPU(thread_cpu);
+    uint32_t hwcaps = 0;
+
+    GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP);
+    return hwcaps;
+}
+
+#undef GET_FEATURE_ID
+
 #endif /* not TARGET_AARCH64 */
 #endif /* TARGET_ARM */
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 297ad5e47a..1713d76590 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
 #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
 #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
 #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
-#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
+#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
+#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP
 #define ARM_CP_FPU               0x1000
 #define ARM_CP_SVE               0x2000
 #define ARM_CP_NO_GDB            0x4000
@@ -3572,6 +3573,16 @@ static inline bool isar_feature_aa64_frint(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FRINTTS) != 0;
 }
 
+static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
+}
+
+static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
+{
+    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
+}
+
 static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
 {
     /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index d7f5bf610a..20094f980d 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -331,6 +331,7 @@ static void aarch64_max_initfn(Object *obj)
         cpu->isar.id_aa64isar0 = t;
 
         t = cpu->isar.id_aa64isar1;
+        t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);
         t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);
         t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
         t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 507026c915..99ae01b7e7 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3843,6 +3843,22 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
     return CP_ACCESS_OK;
 }
 
+static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
+                                                  const ARMCPRegInfo *ri,
+                                                  bool isread)
+{
+    ARMCPU *cpu = env_archcpu(env);
+    /*
+     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
+     * DC CVAP ->  CRm: 0xc
+     * DC CVADP -> CRm: 0xd
+     */
+    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
+           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
+           ? CP_ACCESS_TRAP_UNCATEGORIZED
+           : aa64_cacheop_access(env, ri, isread);
+}
+
 /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
  * Page D4-1736 (DDI0487A.b)
  */
@@ -4251,6 +4267,14 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NOP,
       .accessfn = aa64_cacheop_access },
+    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
+      .access = PL0_W, .type = ARM_CP_DC_CVAP,
+      .accessfn = aa64_cacheop_persist_access },
+    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
+      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
+      .access = PL0_W, .type = ARM_CP_DC_CVAP,
+      .accessfn = aa64_cacheop_persist_access },
     { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
       .access = PL1_W, .type = ARM_CP_NOP },
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 1fb2cb5a77..a850364944 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -561,6 +561,7 @@ DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
 DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_2(dc_zva, void, env, i64)
+DEF_HELPER_2(dc_cvap, void, env, i64)
 
 DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 0fd4bd0238..75ebf6afa4 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -987,3 +987,39 @@ void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
     memset(g2h(vaddr), 0, blocklen);
 #endif
 }
+
+void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
+{
+#ifndef CONFIG_USER_ONLY
+    ARMCPU *cpu = env_archcpu(env);
+    /* CTR_EL0 System register -> DminLine, bits [19:16] */
+    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
+    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
+    void *haddr;
+    int mem_idx = cpu_mmu_index(env, false);
+
+    /* This won't be crossing page boundaries */
+    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
+    if (haddr) {
+
+        ram_addr_t offset;
+        MemoryRegion *mr;
+
+        /*
+         * RCU critical section + ref counting,
+         * so that MR won't disappear behind the scene
+         */
+        rcu_read_lock();
+        mr = memory_region_from_host(haddr, &offset);
+        if (mr) {
+            memory_region_ref(mr);
+        }
+        rcu_read_unlock();
+
+        if (mr) {
+            memory_region_do_writeback(mr, offset, dline_size);
+            memory_region_unref(mr);
+        }
+    }
+#endif
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2d6cd09634..21ea3631d6 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1746,6 +1746,11 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         tcg_rt = cpu_reg(s, rt);
         gen_helper_dc_zva(cpu_env, tcg_rt);
         return;
+    case ARM_CP_DC_CVAP:
+        /* DC CVAP / DC CVADP */
+        tcg_rt = cpu_reg(s, rt);
+        gen_helper_dc_cvap(cpu_env, tcg_rt);
+        return;
     default:
         break;
     }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback Beata Michalska
@ 2019-09-10 10:26   ` Dr. David Alan Gilbert
  2019-09-10 11:28     ` Beata Michalska
  2019-09-24 16:30   ` Richard Henderson
  1 sibling, 1 reply; 25+ messages in thread
From: Dr. David Alan Gilbert @ 2019-09-10 10:26 UTC (permalink / raw)
  To: Beata Michalska
  Cc: peter.maydell, quintela, richard.henderson, qemu-devel,
	shameerali.kolothum.thodi, eric.auger, qemu-arm, pbonzini

* Beata Michalska (beata.michalska@linaro.org) wrote:
> Switch to ram block writeback for pmem migration.
> 
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  migration/ram.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index b01a37e7ca..8ea0bd63fc 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -33,7 +33,6 @@
>  #include "qemu/bitops.h"
>  #include "qemu/bitmap.h"
>  #include "qemu/main-loop.h"
> -#include "qemu/pmem.h"
>  #include "xbzrle.h"
>  #include "ram.h"
>  #include "migration.h"
> @@ -4064,9 +4063,7 @@ static int ram_load_cleanup(void *opaque)
>      RAMBlock *rb;
>  
>      RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
> -        if (ramblock_is_pmem(rb)) {
> -            pmem_persist(rb->host, rb->used_length);
> -        }
> +        qemu_ram_block_writeback(rb);

ACK for migration

Although I do worry that if you really have pmem hardware, is it better
to fail the migration if you don't have libpmem available?

Dave

>      }
>  
>      xbzrle_load_cleanup();
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-10 10:26   ` Dr. David Alan Gilbert
@ 2019-09-10 11:28     ` Beata Michalska
  2019-09-10 13:16       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 25+ messages in thread
From: Beata Michalska @ 2019-09-10 11:28 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, eric.auger, qemu-arm, pbonzini

On Tue, 10 Sep 2019 at 12:26, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
>
> * Beata Michalska (beata.michalska@linaro.org) wrote:
> > Switch to ram block writeback for pmem migration.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  migration/ram.c | 5 +----
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/migration/ram.c b/migration/ram.c
> > index b01a37e7ca..8ea0bd63fc 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -33,7 +33,6 @@
> >  #include "qemu/bitops.h"
> >  #include "qemu/bitmap.h"
> >  #include "qemu/main-loop.h"
> > -#include "qemu/pmem.h"
> >  #include "xbzrle.h"
> >  #include "ram.h"
> >  #include "migration.h"
> > @@ -4064,9 +4063,7 @@ static int ram_load_cleanup(void *opaque)
> >      RAMBlock *rb;
> >
> >      RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
> > -        if (ramblock_is_pmem(rb)) {
> > -            pmem_persist(rb->host, rb->used_length);
> > -        }
> > +        qemu_ram_block_writeback(rb);
>
> ACK for migration
>
> Although I do worry that if you really have pmem hardware, is it better
> to fail the migration if you don't have libpmem available?

According to the PMDG man page, pmem_persist is supposed to be
equivalent for the msync.
It's just more performant. So in case of real pmem hardware it should
be all good.

[http://pmem.io/pmdk/manpages/linux/v1.2/libpmem.3.html]

BR
Beata
>
> Dave
>
> >      }
> >
> >      xbzrle_load_cleanup();
> > --
> > 2.17.1
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-10 11:28     ` Beata Michalska
@ 2019-09-10 13:16       ` Dr. David Alan Gilbert
  2019-09-10 14:21         ` Beata Michalska
  0 siblings, 1 reply; 25+ messages in thread
From: Dr. David Alan Gilbert @ 2019-09-10 13:16 UTC (permalink / raw)
  To: Beata Michalska
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, eric.auger, qemu-arm, pbonzini

* Beata Michalska (beata.michalska@linaro.org) wrote:
> On Tue, 10 Sep 2019 at 12:26, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> >
> > * Beata Michalska (beata.michalska@linaro.org) wrote:
> > > Switch to ram block writeback for pmem migration.
> > >
> > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > ---
> > >  migration/ram.c | 5 +----
> > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > >
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index b01a37e7ca..8ea0bd63fc 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -33,7 +33,6 @@
> > >  #include "qemu/bitops.h"
> > >  #include "qemu/bitmap.h"
> > >  #include "qemu/main-loop.h"
> > > -#include "qemu/pmem.h"
> > >  #include "xbzrle.h"
> > >  #include "ram.h"
> > >  #include "migration.h"
> > > @@ -4064,9 +4063,7 @@ static int ram_load_cleanup(void *opaque)
> > >      RAMBlock *rb;
> > >
> > >      RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
> > > -        if (ramblock_is_pmem(rb)) {
> > > -            pmem_persist(rb->host, rb->used_length);
> > > -        }
> > > +        qemu_ram_block_writeback(rb);
> >
> > ACK for migration
> >
> > Although I do worry that if you really have pmem hardware, is it better
> > to fail the migration if you don't have libpmem available?
> 
> According to the PMDG man page, pmem_persist is supposed to be
> equivalent for the msync.

OK, but you do define qemu_ram_block_writeback to fall back to fdatasync;
so that would be too little?

> It's just more performant. So in case of real pmem hardware it should
> be all good.
> 
> [http://pmem.io/pmdk/manpages/linux/v1.2/libpmem.3.html]

Dave

> 
> BR
> Beata
> >
> > Dave
> >
> > >      }
> > >
> > >      xbzrle_load_cleanup();
> > > --
> > > 2.17.1
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-10 13:16       ` Dr. David Alan Gilbert
@ 2019-09-10 14:21         ` Beata Michalska
  2019-09-11 10:36           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 25+ messages in thread
From: Beata Michalska @ 2019-09-10 14:21 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, eric.auger, qemu-arm, pbonzini

On Tue, 10 Sep 2019 at 14:16, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
>
> * Beata Michalska (beata.michalska@linaro.org) wrote:
> > On Tue, 10 Sep 2019 at 12:26, Dr. David Alan Gilbert
> > <dgilbert@redhat.com> wrote:
> > >
> > > * Beata Michalska (beata.michalska@linaro.org) wrote:
> > > > Switch to ram block writeback for pmem migration.
> > > >
> > > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > > ---
> > > >  migration/ram.c | 5 +----
> > > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > >
> > > > diff --git a/migration/ram.c b/migration/ram.c
> > > > index b01a37e7ca..8ea0bd63fc 100644
> > > > --- a/migration/ram.c
> > > > +++ b/migration/ram.c
> > > > @@ -33,7 +33,6 @@
> > > >  #include "qemu/bitops.h"
> > > >  #include "qemu/bitmap.h"
> > > >  #include "qemu/main-loop.h"
> > > > -#include "qemu/pmem.h"
> > > >  #include "xbzrle.h"
> > > >  #include "ram.h"
> > > >  #include "migration.h"
> > > > @@ -4064,9 +4063,7 @@ static int ram_load_cleanup(void *opaque)
> > > >      RAMBlock *rb;
> > > >
> > > >      RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
> > > > -        if (ramblock_is_pmem(rb)) {
> > > > -            pmem_persist(rb->host, rb->used_length);
> > > > -        }
> > > > +        qemu_ram_block_writeback(rb);
> > >
> > > ACK for migration
> > >
> > > Although I do worry that if you really have pmem hardware, is it better
> > > to fail the migration if you don't have libpmem available?
> >
> > According to the PMDG man page, pmem_persist is supposed to be
> > equivalent for the msync.
>
> OK, but you do define qemu_ram_block_writeback to fall back to fdatasync;
> so that would be too little?

Actually it shouldn't. All will end-up in vfs_fsync_range; msync will
narrow the range.
fdatasync will trigger the same call just that with a wider range. At
least for Linux.
fdatasync will also fallback to fsync if it is not available.
So it's going from the best case scenario (as of performance and range of mem
to be synced) towards the worst case one.

I should probably double-check earlier versions of Linux.
I'll also try to verify that for other host variants.

BTW: Thank you for having a look at the changes.

BR
Beata

>
> > It's just more performant. So in case of real pmem hardware it should
> > be all good.
> >
> > [http://pmem.io/pmdk/manpages/linux/v1.2/libpmem.3.html]
>
> Dave
>
> >
> > BR
> > Beata
> > >
> > > Dave
> > >
> > > >      }
> > > >
> > > >      xbzrle_load_cleanup();
> > > > --
> > > > 2.17.1
> > > >
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-10 14:21         ` Beata Michalska
@ 2019-09-11 10:36           ` Dr. David Alan Gilbert
  2019-09-12  9:10             ` Beata Michalska
  0 siblings, 1 reply; 25+ messages in thread
From: Dr. David Alan Gilbert @ 2019-09-11 10:36 UTC (permalink / raw)
  To: Beata Michalska
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, eric.auger, qemu-arm, pbonzini

* Beata Michalska (beata.michalska@linaro.org) wrote:
> On Tue, 10 Sep 2019 at 14:16, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> >
> > * Beata Michalska (beata.michalska@linaro.org) wrote:
> > > On Tue, 10 Sep 2019 at 12:26, Dr. David Alan Gilbert
> > > <dgilbert@redhat.com> wrote:
> > > >
> > > > * Beata Michalska (beata.michalska@linaro.org) wrote:
> > > > > Switch to ram block writeback for pmem migration.
> > > > >
> > > > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > > > ---
> > > > >  migration/ram.c | 5 +----
> > > > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/migration/ram.c b/migration/ram.c
> > > > > index b01a37e7ca..8ea0bd63fc 100644
> > > > > --- a/migration/ram.c
> > > > > +++ b/migration/ram.c
> > > > > @@ -33,7 +33,6 @@
> > > > >  #include "qemu/bitops.h"
> > > > >  #include "qemu/bitmap.h"
> > > > >  #include "qemu/main-loop.h"
> > > > > -#include "qemu/pmem.h"
> > > > >  #include "xbzrle.h"
> > > > >  #include "ram.h"
> > > > >  #include "migration.h"
> > > > > @@ -4064,9 +4063,7 @@ static int ram_load_cleanup(void *opaque)
> > > > >      RAMBlock *rb;
> > > > >
> > > > >      RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
> > > > > -        if (ramblock_is_pmem(rb)) {
> > > > > -            pmem_persist(rb->host, rb->used_length);
> > > > > -        }
> > > > > +        qemu_ram_block_writeback(rb);
> > > >
> > > > ACK for migration
> > > >
> > > > Although I do worry that if you really have pmem hardware, is it better
> > > > to fail the migration if you don't have libpmem available?
> > >
> > > According to the PMDG man page, pmem_persist is supposed to be
> > > equivalent for the msync.
> >
> > OK, but you do define qemu_ram_block_writeback to fall back to fdatasync;
> > so that would be too little?
> 
> Actually it shouldn't. All will end-up in vfs_fsync_range; msync will
> narrow the range.
> fdatasync will trigger the same call just that with a wider range. At
> least for Linux.
> fdatasync will also fallback to fsync if it is not available.
> So it's going from the best case scenario (as of performance and range of mem
> to be synced) towards the worst case one.
> 
> I should probably double-check earlier versions of Linux.
> I'll also try to verify that for other host variants.

Well I guess it should probably follow whatever Posix says;  it's OK to
make Linux specific assumptions for Linux specific bits - but you can't
do it by code examination to guarantee it'll be right for other
platforms, especially if this is in code ifdef'd for portability.
Also it needs commenting to explain why it's safe to avoid someone else
asking this question.

> BTW: Thank you for having a look at the changes.

No problem.

Dave

> BR
> Beata
> 
> >
> > > It's just more performant. So in case of real pmem hardware it should
> > > be all good.
> > >
> > > [http://pmem.io/pmdk/manpages/linux/v1.2/libpmem.3.html]
> >
> > Dave
> >
> > >
> > > BR
> > > Beata
> > > >
> > > > Dave
> > > >
> > > > >      }
> > > > >
> > > > >      xbzrle_load_cleanup();
> > > > > --
> > > > > 2.17.1
> > > > >
> > > > --
> > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-11 10:36           ` Dr. David Alan Gilbert
@ 2019-09-12  9:10             ` Beata Michalska
  0 siblings, 0 replies; 25+ messages in thread
From: Beata Michalska @ 2019-09-12  9:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, eric.auger, qemu-arm, pbonzini

On Wed, 11 Sep 2019 at 11:36, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
>
> * Beata Michalska (beata.michalska@linaro.org) wrote:
> > On Tue, 10 Sep 2019 at 14:16, Dr. David Alan Gilbert
> > <dgilbert@redhat.com> wrote:
> > >
> > > * Beata Michalska (beata.michalska@linaro.org) wrote:
> > > > On Tue, 10 Sep 2019 at 12:26, Dr. David Alan Gilbert
> > > > <dgilbert@redhat.com> wrote:
> > > > >
> > > > > * Beata Michalska (beata.michalska@linaro.org) wrote:
> > > > > > Switch to ram block writeback for pmem migration.
> > > > > >
> > > > > > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > > > > > ---
> > > > > >  migration/ram.c | 5 +----
> > > > > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/migration/ram.c b/migration/ram.c
> > > > > > index b01a37e7ca..8ea0bd63fc 100644
> > > > > > --- a/migration/ram.c
> > > > > > +++ b/migration/ram.c
> > > > > > @@ -33,7 +33,6 @@
> > > > > >  #include "qemu/bitops.h"
> > > > > >  #include "qemu/bitmap.h"
> > > > > >  #include "qemu/main-loop.h"
> > > > > > -#include "qemu/pmem.h"
> > > > > >  #include "xbzrle.h"
> > > > > >  #include "ram.h"
> > > > > >  #include "migration.h"
> > > > > > @@ -4064,9 +4063,7 @@ static int ram_load_cleanup(void *opaque)
> > > > > >      RAMBlock *rb;
> > > > > >
> > > > > >      RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
> > > > > > -        if (ramblock_is_pmem(rb)) {
> > > > > > -            pmem_persist(rb->host, rb->used_length);
> > > > > > -        }
> > > > > > +        qemu_ram_block_writeback(rb);
> > > > >
> > > > > ACK for migration
> > > > >
> > > > > Although I do worry that if you really have pmem hardware, is it better
> > > > > to fail the migration if you don't have libpmem available?
> > > >
> > > > According to the PMDG man page, pmem_persist is supposed to be
> > > > equivalent for the msync.
> > >
> > > OK, but you do define qemu_ram_block_writeback to fall back to fdatasync;
> > > so that would be too little?
> >
> > Actually it shouldn't. All will end-up in vfs_fsync_range; msync will
> > narrow the range.
> > fdatasync will trigger the same call just that with a wider range. At
> > least for Linux.
> > fdatasync will also fallback to fsync if it is not available.
> > So it's going from the best case scenario (as of performance and range of mem
> > to be synced) towards the worst case one.
> >
> > I should probably double-check earlier versions of Linux.
> > I'll also try to verify that for other host variants.
>
> Well I guess it should probably follow whatever Posix says;  it's OK to
> make Linux specific assumptions for Linux specific bits - but you can't
> do it by code examination to guarantee it'll be right for other
> platforms, especially if this is in code ifdef'd for portability.
> Also it needs commenting to explain why it's safe to avoid someone else
> asking this question.
>
I will definitely address that in the next version.
Will just wait a bit to potentially gather more input
on the series.

> > BTW: Thank you for having a look at the changes.
>
> No problem.
>
Thanks again.

BR
Beata
> Dave
>
> > BR
> > Beata
> >
> > >
> > > > It's just more performant. So in case of real pmem hardware it should
> > > > be all good.
> > > >
> > > > [http://pmem.io/pmdk/manpages/linux/v1.2/libpmem.3.html]
> > >
> > > Dave
> > >
> > > >
> > > > BR
> > > > Beata
> > > > >
> > > > > Dave
> > > > >
> > > > > >      }
> > > > > >
> > > > > >      xbzrle_load_cleanup();
> > > > > > --
> > > > > > 2.17.1
> > > > > >
> > > > > --
> > > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins Beata Michalska
@ 2019-09-23 23:54   ` Alex Bennée
  2019-10-09 11:47     ` Beata Michalska
  2019-09-24  1:16   ` Alex Bennée
  2019-09-24 17:22   ` Richard Henderson
  2 siblings, 1 reply; 25+ messages in thread
From: Alex Bennée @ 2019-09-23 23:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini


Beata Michalska <beata.michalska@linaro.org> writes:

> ARMv8.2 introduced support for Data Cache Clean instructions
> to PoP (point-of-persistence) - DC CVAP and PoDP (point-of-deep-persistence)
> - DV CVADP. Both specify conceptual points in a memory system where all writes
> that are to reach them are considered persistent.
> The support provided considers both to be actually the same so there is no
> distinction between the two. If none is available (there is no backing store
> for given memory) both will result in Data Cache Clean up to the point of
> coherency. Otherwise sync for the specified range shall be performed.
>
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  linux-user/elfload.c       | 18 +++++++++++++++++-

There are conflicts from the recent elfload.c tweaks to fix on your next rebase.

>  target/arm/cpu.h           | 13 ++++++++++++-
>  target/arm/cpu64.c         |  1 +
>  target/arm/helper.c        | 24 ++++++++++++++++++++++++
>  target/arm/helper.h        |  1 +
>  target/arm/op_helper.c     | 36 ++++++++++++++++++++++++++++++++++++
>  target/arm/translate-a64.c |  5 +++++
>  7 files changed, 96 insertions(+), 2 deletions(-)
>
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index 3365e192eb..1ec00308d5 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -609,7 +609,12 @@ enum {
>      ARM_HWCAP_A64_PACG          = 1UL << 31,
>  };
>
> +enum {
> +    ARM_HWCAP2_A64_DCPODP   = 1 << 0,
> +};
> +
>  #define ELF_HWCAP get_elf_hwcap()
> +#define ELF_HWCAP2 get_elf_hwcap2()
>
>  static uint32_t get_elf_hwcap(void)
>  {
> @@ -644,12 +649,23 @@ static uint32_t get_elf_hwcap(void)
>      GET_FEATURE_ID(aa64_jscvt, ARM_HWCAP_A64_JSCVT);
>      GET_FEATURE_ID(aa64_sb, ARM_HWCAP_A64_SB);
>      GET_FEATURE_ID(aa64_condm_4, ARM_HWCAP_A64_FLAGM);
> +    GET_FEATURE_ID(aa64_dcpop, ARM_HWCAP_A64_DCPOP);
>
> -#undef GET_FEATURE_ID
>
>      return hwcaps;
>  }
>
> +static uint32_t get_elf_hwcap2(void)
> +{
> +    ARMCPU *cpu = ARM_CPU(thread_cpu);
> +    uint32_t hwcaps = 0;
> +
> +    GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP);
> +    return hwcaps;
> +}
> +
> +#undef GET_FEATURE_ID
> +
>  #endif /* not TARGET_AARCH64 */
>  #endif /* TARGET_ARM */
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 297ad5e47a..1713d76590 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
>  #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
>  #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
>  #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
> -#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
> +#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
> +#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP
>  #define ARM_CP_FPU               0x1000
>  #define ARM_CP_SVE               0x2000
>  #define ARM_CP_NO_GDB            0x4000
> @@ -3572,6 +3573,16 @@ static inline bool isar_feature_aa64_frint(const ARMISARegisters *id)
>      return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FRINTTS) != 0;
>  }
>
> +static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
> +{
> +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
> +}
> +
> +static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
> +{
> +    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
> +}
> +
>  static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
>  {
>      /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index d7f5bf610a..20094f980d 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -331,6 +331,7 @@ static void aarch64_max_initfn(Object *obj)
>          cpu->isar.id_aa64isar0 = t;
>
>          t = cpu->isar.id_aa64isar1;
> +        t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);
>          t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);
>          t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
>          t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only */
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 507026c915..99ae01b7e7 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -3843,6 +3843,22 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
>      return CP_ACCESS_OK;
>  }
>
> +static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
> +                                                  const ARMCPRegInfo *ri,
> +                                                  bool isread)
> +{
> +    ARMCPU *cpu = env_archcpu(env);
> +    /*
> +     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
> +     * DC CVAP ->  CRm: 0xc
> +     * DC CVADP -> CRm: 0xd
> +     */
> +    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
> +           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
> +           ? CP_ACCESS_TRAP_UNCATEGORIZED
> +           : aa64_cacheop_access(env, ri, isread);
> +}
> +
>  /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
>   * Page D4-1736 (DDI0487A.b)
>   */
> @@ -4251,6 +4267,14 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
>        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
>        .access = PL0_W, .type = ARM_CP_NOP,
>        .accessfn = aa64_cacheop_access },
> +    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> +      .accessfn = aa64_cacheop_persist_access },
> +    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> +      .accessfn = aa64_cacheop_persist_access },
>      { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
>        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
>        .access = PL1_W, .type = ARM_CP_NOP },
> diff --git a/target/arm/helper.h b/target/arm/helper.h
> index 1fb2cb5a77..a850364944 100644
> --- a/target/arm/helper.h
> +++ b/target/arm/helper.h
> @@ -561,6 +561,7 @@ DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
>  DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
>  DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
>  DEF_HELPER_2(dc_zva, void, env, i64)
> +DEF_HELPER_2(dc_cvap, void, env, i64)
>
>  DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
>  DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> index 0fd4bd0238..75ebf6afa4 100644
> --- a/target/arm/op_helper.c
> +++ b/target/arm/op_helper.c
> @@ -987,3 +987,39 @@ void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
>      memset(g2h(vaddr), 0, blocklen);
>  #endif
>  }
> +
> +void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
> +{
> +#ifndef CONFIG_USER_ONLY
> +    ARMCPU *cpu = env_archcpu(env);
> +    /* CTR_EL0 System register -> DminLine, bits [19:16] */
> +    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
> +    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
> +    void *haddr;
> +    int mem_idx = cpu_mmu_index(env, false);
> +
> +    /* This won't be crossing page boundaries */
> +    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
> +    if (haddr) {
> +
> +        ram_addr_t offset;
> +        MemoryRegion *mr;
> +
> +        /*
> +         * RCU critical section + ref counting,
> +         * so that MR won't disappear behind the scene
> +         */
> +        rcu_read_lock();
> +        mr = memory_region_from_host(haddr, &offset);
> +        if (mr) {
> +            memory_region_ref(mr);
> +        }
> +        rcu_read_unlock();
> +
> +        if (mr) {
> +            memory_region_do_writeback(mr, offset, dline_size);
> +            memory_region_unref(mr);
> +        }
> +    }
> +#endif
> +}
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index 2d6cd09634..21ea3631d6 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -1746,6 +1746,11 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
>          tcg_rt = cpu_reg(s, rt);
>          gen_helper_dc_zva(cpu_env, tcg_rt);
>          return;
> +    case ARM_CP_DC_CVAP:
> +        /* DC CVAP / DC CVADP */
> +        tcg_rt = cpu_reg(s, rt);
> +        gen_helper_dc_cvap(cpu_env, tcg_rt);
> +        return;
>      default:
>          break;
>      }


--
Alex Bennée


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] tcg: cputlb: Add probe_read
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 1/4] tcg: cputlb: Add probe_read Beata Michalska
@ 2019-09-23 23:54   ` Alex Bennée
  0 siblings, 0 replies; 25+ messages in thread
From: Alex Bennée @ 2019-09-23 23:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini


Beata Michalska <beata.michalska@linaro.org> writes:

> Add probe_read alongside the write probing equivalent.
>
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  include/exec/exec-all.h | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 81b02eb2fe..e1785700c3 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -319,6 +319,12 @@ static inline void *probe_write(CPUArchState *env, target_ulong addr, int size,
>      return probe_access(env, addr, size, MMU_DATA_STORE, mmu_idx, retaddr);
>  }
>
> +static inline void *probe_read(CPUArchState *env, target_ulong addr, int size,
> +                               int mmu_idx, uintptr_t retaddr)
> +{
> +    return probe_access(env, addr, size, MMU_DATA_LOAD, mmu_idx, retaddr);
> +}
> +
>  #define CODE_GEN_ALIGN           16 /* must be >= of the size of a icache line */
>
>  /* Estimated block size for TB allocation.  */


--
Alex Bennée


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region Beata Michalska
@ 2019-09-24  0:00   ` Alex Bennée
  2019-10-09 11:40     ` Beata Michalska
  2019-09-24 16:28   ` Richard Henderson
  2019-09-24 16:30   ` Richard Henderson
  2 siblings, 1 reply; 25+ messages in thread
From: Alex Bennée @ 2019-09-24  0:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini


Beata Michalska <beata.michalska@linaro.org> writes:

> Add an option to trigger memory writeback to sync given memory region
> with the corresponding backing store, case one is available.
> This extends the support for persistent memory, allowing syncing on-demand.
>
> Also, adding verification for msync support on host.
>
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  configure               | 24 ++++++++++++++++++++++++
>  exec.c                  | 38 ++++++++++++++++++++++++++++++++++++++
>  include/exec/memory.h   |  6 ++++++
>  include/exec/ram_addr.h |  6 ++++++
>  memory.c                | 12 ++++++++++++
>  5 files changed, 86 insertions(+)
>
> diff --git a/configure b/configure
> index 95134c0180..bdb7dc47e9 100755
> --- a/configure
> +++ b/configure
> @@ -5081,6 +5081,26 @@ if compile_prog "" "" ; then
>      fdatasync=yes
>  fi
>
> +##########################################
> +# verify support for msyc
> +
> +msync=no
> +cat > $TMPC << EOF
> +#include <unistd.h>
> +#include <sys/mman.h>
> +int main(void) {
> +#if defined(_POSIX_MAPPED_FILES) && _POSIX_MAPPED_FILES > 0 \
> +&& defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
> +return msync(NULL,0, MS_SYNC);
> +#else
> +#error Not supported
> +#endif
> +}
> +EOF
> +if compile_prog "" "" ; then
> +    msync=yes
> +fi
> +
>  ##########################################
>  # check if we have madvise
>
> @@ -6413,6 +6433,7 @@ echo "fdt support       $fdt"
>  echo "membarrier        $membarrier"
>  echo "preadv support    $preadv"
>  echo "fdatasync         $fdatasync"
> +echo "msync             $msync"
>  echo "madvise           $madvise"
>  echo "posix_madvise     $posix_madvise"
>  echo "posix_memalign    $posix_memalign"
> @@ -6952,6 +6973,9 @@ fi
>  if test "$fdatasync" = "yes" ; then
>    echo "CONFIG_FDATASYNC=y" >> $config_host_mak
>  fi
> +if test "$msync" = "yes" ; then
> +    echo "CONFIG_MSYNC=y" >> $config_host_mak
> +fi

I think it's best to split this configure check into a new prequel patch and...

>  if test "$madvise" = "yes" ; then
>    echo "CONFIG_MADVISE=y" >> $config_host_mak
>  fi
> diff --git a/exec.c b/exec.c
> index 235d6bc883..5ed6908368 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -65,6 +65,8 @@
>  #include "exec/ram_addr.h"
>  #include "exec/log.h"
>
> +#include "qemu/pmem.h"
> +
>  #include "migration/vmstate.h"
>
>  #include "qemu/range.h"
> @@ -2182,6 +2184,42 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
>      return 0;
>  }
>
> +/*
> + * Trigger sync on the given ram block for range [start, start + length]
> + * with the backing store if available.
> + * @Note: this is supposed to be a SYNC op.
> + */
> +void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length)
> +{
> +    void *addr = ramblock_ptr(block, start);
> +
> +    /*
> +     * The requested range might spread up to the very end of the block
> +     */
> +    if ((start + length) > block->used_length) {
> +        error_report("%s: sync range outside the block boundires: "
> +                     "start: " RAM_ADDR_FMT " length: " RAM_ADDR_FMT
> +                     " block length: " RAM_ADDR_FMT " Narrowing down ..." ,
> +                     __func__, start, length, block->used_length);

Is this an error or just logging? error_report should be used for stuff
that the user needs to know about so it will appear on the HMP console
(or via stderr). If so what is the user expected to do? Have they
misconfigured their system?

> +        length = block->used_length - start;
> +    }
> +
> +#ifdef CONFIG_LIBPMEM
> +    /* The lack of support for pmem should not block the sync */
> +    if (ramblock_is_pmem(block)) {
> +        pmem_persist(addr, length);
> +    } else
> +#endif
> +    if (block->fd >= 0) {
> +#ifdef CONFIG_MSYNC
> +        msync((void *)((uintptr_t)addr & qemu_host_page_mask),
> +               HOST_PAGE_ALIGN(length), MS_SYNC);
> +#else
> +        qemu_fdatasync(block->fd);
> +#endif

... hide the implementation details in util/cutils.c, maybe as
qemu_msync()?

> +    }
> +}
> +
>  /* Called with ram_list.mutex held */
>  static void dirty_memory_extend(ram_addr_t old_ram_size,
>                                  ram_addr_t new_ram_size)
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 2dd810259d..ff0d7937cf 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -1242,6 +1242,12 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr);
>   */
>  void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize,
>                                Error **errp);
> +/**
> + * memory_region_do_writeback: Trigger writeback for selected address range
> + * [addr, addr + size]
> + *
> + */
> +void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size);
>
>  /**
>   * memory_region_set_log: Turn dirty logging on or off for a region.
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index a327a80cfe..d4bce81a03 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -180,6 +180,12 @@ void qemu_ram_free(RAMBlock *block);
>
>  int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp);
>
> +void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length);
> +
> +/* Clear whole block of mem */
> +#define qemu_ram_block_writeback(rb)    \
> +    qemu_ram_writeback(rb, 0, rb->used_length)
> +
>  #define DIRTY_CLIENTS_ALL     ((1 << DIRTY_MEMORY_NUM) - 1)
>  #define DIRTY_CLIENTS_NOCODE  (DIRTY_CLIENTS_ALL & ~(1 << DIRTY_MEMORY_CODE))
>
> diff --git a/memory.c b/memory.c
> index 61a254c3f9..436eb64737 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -2228,6 +2228,18 @@ void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize, Error **errp
>      qemu_ram_resize(mr->ram_block, newsize, errp);
>  }
>
> +
> +void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size)
> +{
> +    /*
> +     * Might be extended case needed to cover
> +     * different types of memory regions
> +     */
> +    if (mr->ram_block && mr->dirty_log_mask) {
> +        qemu_ram_writeback(mr->ram_block, addr, size);
> +    }
> +}
> +
>  /*
>   * Call proper memory listeners about the change on the newly
>   * added/removed CoalescedMemoryRange.


--
Alex Bennée


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins Beata Michalska
  2019-09-23 23:54   ` Alex Bennée
@ 2019-09-24  1:16   ` Alex Bennée
  2019-10-09 11:49     ` Beata Michalska
  2019-09-24 17:22   ` Richard Henderson
  2 siblings, 1 reply; 25+ messages in thread
From: Alex Bennée @ 2019-09-24  1:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Beata Michalska, quintela, richard.henderson,
	dgilbert, shameerali.kolothum.thodi, eric.auger, qemu-arm,
	pbonzini


Beata Michalska <beata.michalska@linaro.org> writes:

> ARMv8.2 introduced support for Data Cache Clean instructions
> to PoP (point-of-persistence) - DC CVAP and PoDP (point-of-deep-persistence)
> - DV CVADP. Both specify conceptual points in a memory system where all writes
> that are to reach them are considered persistent.
> The support provided considers both to be actually the same so there is no
> distinction between the two. If none is available (there is no backing store
> for given memory) both will result in Data Cache Clean up to the point of
> coherency. Otherwise sync for the specified range shall be performed.
>
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  linux-user/elfload.c       | 18 +++++++++++++++++-
>  target/arm/cpu.h           | 13 ++++++++++++-
>  target/arm/cpu64.c         |  1 +
>  target/arm/helper.c        | 24 ++++++++++++++++++++++++
>  target/arm/helper.h        |  1 +
>  target/arm/op_helper.c     | 36 ++++++++++++++++++++++++++++++++++++
>  target/arm/translate-a64.c |  5 +++++
>  7 files changed, 96 insertions(+), 2 deletions(-)
>
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index 3365e192eb..1ec00308d5 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -609,7 +609,12 @@ enum {
>      ARM_HWCAP_A64_PACG          = 1UL << 31,
>  };
>
> +enum {
> +    ARM_HWCAP2_A64_DCPODP   = 1 << 0,
> +};
> +
>  #define ELF_HWCAP get_elf_hwcap()
> +#define ELF_HWCAP2 get_elf_hwcap2()
>
>  static uint32_t get_elf_hwcap(void)
>  {
> @@ -644,12 +649,23 @@ static uint32_t get_elf_hwcap(void)
>      GET_FEATURE_ID(aa64_jscvt, ARM_HWCAP_A64_JSCVT);
>      GET_FEATURE_ID(aa64_sb, ARM_HWCAP_A64_SB);
>      GET_FEATURE_ID(aa64_condm_4, ARM_HWCAP_A64_FLAGM);
> +    GET_FEATURE_ID(aa64_dcpop, ARM_HWCAP_A64_DCPOP);
>
> -#undef GET_FEATURE_ID
>
>      return hwcaps;
>  }
>
> +static uint32_t get_elf_hwcap2(void)
> +{
> +    ARMCPU *cpu = ARM_CPU(thread_cpu);
> +    uint32_t hwcaps = 0;
> +
> +    GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP);
> +    return hwcaps;
> +}
> +
> +#undef GET_FEATURE_ID
> +
>  #endif /* not TARGET_AARCH64 */
>  #endif /* TARGET_ARM */
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 297ad5e47a..1713d76590 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
>  #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
>  #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
>  #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
> -#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
> +#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
> +#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP
>  #define ARM_CP_FPU               0x1000
>  #define ARM_CP_SVE               0x2000
>  #define ARM_CP_NO_GDB            0x4000
> @@ -3572,6 +3573,16 @@ static inline bool isar_feature_aa64_frint(const ARMISARegisters *id)
>      return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FRINTTS) != 0;
>  }
>
> +static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
> +{
> +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
> +}
> +
> +static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
> +{
> +    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
> +}
> +
>  static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
>  {
>      /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index d7f5bf610a..20094f980d 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -331,6 +331,7 @@ static void aarch64_max_initfn(Object *obj)
>          cpu->isar.id_aa64isar0 = t;
>
>          t = cpu->isar.id_aa64isar1;
> +        t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);
>          t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);
>          t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
>          t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only */
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 507026c915..99ae01b7e7 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -3843,6 +3843,22 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
>      return CP_ACCESS_OK;
>  }
>
> +static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
> +                                                  const ARMCPRegInfo *ri,
> +                                                  bool isread)
> +{
> +    ARMCPU *cpu = env_archcpu(env);
> +    /*
> +     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
> +     * DC CVAP ->  CRm: 0xc
> +     * DC CVADP -> CRm: 0xd
> +     */
> +    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
> +           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
> +           ? CP_ACCESS_TRAP_UNCATEGORIZED
> +           : aa64_cacheop_access(env, ri, isread);
> +}
> +
>  /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
>   * Page D4-1736 (DDI0487A.b)
>   */
> @@ -4251,6 +4267,14 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
>        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
>        .access = PL0_W, .type = ARM_CP_NOP,
>        .accessfn = aa64_cacheop_access },
> +    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> +      .accessfn = aa64_cacheop_persist_access },
> +    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> +      .accessfn = aa64_cacheop_persist_access },
>      { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
>        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
>        .access = PL1_W, .type = ARM_CP_NOP },
> diff --git a/target/arm/helper.h b/target/arm/helper.h
> index 1fb2cb5a77..a850364944 100644
> --- a/target/arm/helper.h
> +++ b/target/arm/helper.h
> @@ -561,6 +561,7 @@ DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
>  DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
>  DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
>  DEF_HELPER_2(dc_zva, void, env, i64)
> +DEF_HELPER_2(dc_cvap, void, env, i64)
>
>  DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
>  DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> index 0fd4bd0238..75ebf6afa4 100644
> --- a/target/arm/op_helper.c
> +++ b/target/arm/op_helper.c
> @@ -987,3 +987,39 @@ void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
>      memset(g2h(vaddr), 0, blocklen);
>  #endif
>  }
> +
> +void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
> +{
> +#ifndef CONFIG_USER_ONLY

Are we essentially saying the operation is a NOP for user-mode
emulation? Is it just because we don't really expose HW to linux-user?

If so move the #ifndef outside the HELPER definition and...

> +    ARMCPU *cpu = env_archcpu(env);
> +    /* CTR_EL0 System register -> DminLine, bits [19:16] */
> +    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
> +    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
> +    void *haddr;
> +    int mem_idx = cpu_mmu_index(env, false);
> +
> +    /* This won't be crossing page boundaries */
> +    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
> +    if (haddr) {
> +
> +        ram_addr_t offset;
> +        MemoryRegion *mr;
> +
> +        /*
> +         * RCU critical section + ref counting,
> +         * so that MR won't disappear behind the scene
> +         */
> +        rcu_read_lock();
> +        mr = memory_region_from_host(haddr, &offset);
> +        if (mr) {
> +            memory_region_ref(mr);
> +        }
> +        rcu_read_unlock();
> +
> +        if (mr) {
> +            memory_region_do_writeback(mr, offset, dline_size);
> +            memory_region_unref(mr);
> +        }
> +    }
> +#endif
> +}
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index 2d6cd09634..21ea3631d6 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -1746,6 +1746,11 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
>          tcg_rt = cpu_reg(s, rt);
>          gen_helper_dc_zva(cpu_env, tcg_rt);
>          return;
> +    case ARM_CP_DC_CVAP:
> +        /* DC CVAP / DC CVADP */

.. #ifndef CONFIG_USER_ONLY this code so we don't add the overhead of a
helper call in linux-user mode.

> +        tcg_rt = cpu_reg(s, rt);
> +        gen_helper_dc_cvap(cpu_env, tcg_rt);
> +        return;
>      default:
>          break;
>      }


--
Alex Bennée


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region Beata Michalska
  2019-09-24  0:00   ` Alex Bennée
@ 2019-09-24 16:28   ` Richard Henderson
  2019-10-09 11:44     ` Beata Michalska
  2019-09-24 16:30   ` Richard Henderson
  2 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2019-09-24 16:28 UTC (permalink / raw)
  To: Beata Michalska, qemu-devel
  Cc: peter.maydell, quintela, dgilbert, shameerali.kolothum.thodi,
	eric.auger, qemu-arm, pbonzini

On 9/10/19 2:56 AM, Beata Michalska wrote:
> +int main(void) {
> +#if defined(_POSIX_MAPPED_FILES) && _POSIX_MAPPED_FILES > 0 \
> +&& defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
> +return msync(NULL,0, MS_SYNC);
> +#else
> +#error Not supported
> +#endif
> +}

Is there any particular reason to check _POSIX_MAPPED_FILES &
_POSIX_SYNCHRONIZED_IO?  IIRC, you can use those to "safely" use MS_SYNC.  But
this is a configure test, and an error is in fact our defined failure case, so
"safely" doesn't seem particularly relevant.

Alternately, do we even support any systems (besides perhaps windows) that do
not provide POSIX-2001 support, and so include msync + MS_SYNC?  My first guess
is that we don't.

> +        msync((void *)((uintptr_t)addr & qemu_host_page_mask),
> +               HOST_PAGE_ALIGN(length), MS_SYNC);

This isn't quite right.  If you move addr down to a lower address via this page
mask, you must also increase length by the same amount, and only afterward
increase length to the host page size.

Consider addr == 0xffffff, length = 2.  This covers two pages, so you'd expect
the final parameters to be, for 4k page size, 0xfff000, 0x2000.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region Beata Michalska
  2019-09-24  0:00   ` Alex Bennée
  2019-09-24 16:28   ` Richard Henderson
@ 2019-09-24 16:30   ` Richard Henderson
  2019-10-09 11:45     ` Beata Michalska
  2 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2019-09-24 16:30 UTC (permalink / raw)
  To: Beata Michalska, qemu-devel
  Cc: peter.maydell, quintela, dgilbert, shameerali.kolothum.thodi,
	eric.auger, qemu-arm, pbonzini

On 9/10/19 2:56 AM, Beata Michalska wrote:
> +void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length);
> +
> +/* Clear whole block of mem */
> +#define qemu_ram_block_writeback(rb)    \
> +    qemu_ram_writeback(rb, 0, rb->used_length)
> +

Inline function, with its typechecking, is preferred.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback Beata Michalska
  2019-09-10 10:26   ` Dr. David Alan Gilbert
@ 2019-09-24 16:30   ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2019-09-24 16:30 UTC (permalink / raw)
  To: Beata Michalska, qemu-devel
  Cc: peter.maydell, quintela, dgilbert, shameerali.kolothum.thodi,
	eric.auger, qemu-arm, pbonzini

On 9/10/19 2:56 AM, Beata Michalska wrote:
> Switch to ram block writeback for pmem migration.
> 
> Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> ---
>  migration/ram.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins
  2019-09-10  9:56 ` [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins Beata Michalska
  2019-09-23 23:54   ` Alex Bennée
  2019-09-24  1:16   ` Alex Bennée
@ 2019-09-24 17:22   ` Richard Henderson
  2019-10-09 11:53     ` Beata Michalska
  2 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2019-09-24 17:22 UTC (permalink / raw)
  To: Beata Michalska, qemu-devel
  Cc: peter.maydell, quintela, dgilbert, shameerali.kolothum.thodi,
	eric.auger, qemu-arm, pbonzini

On 9/10/19 2:56 AM, Beata Michalska wrote:
> @@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
>  #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
>  #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
>  #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
> -#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
> +#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
> +#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP

I don't see that this operation needs to be handled via "special".  It's a
function call upon write, as for many other system registers.

> +static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
> +{
> +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
> +}
> +
> +static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
> +{
> +    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
> +}

The correct check is FIELD_EX(...) >= 2.  This is a 4-bit field, as with all of
the others.

> +static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
> +                                                  const ARMCPRegInfo *ri,
> +                                                  bool isread)
> +{
> +    ARMCPU *cpu = env_archcpu(env);
> +    /*
> +     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
> +     * DC CVAP ->  CRm: 0xc
> +     * DC CVADP -> CRm: 0xd
> +     */
> +    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
> +           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
> +           ? CP_ACCESS_TRAP_UNCATEGORIZED
> +           : aa64_cacheop_access(env, ri, isread);
> +}
...
> +    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> +      .accessfn = aa64_cacheop_persist_access },
> +    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> +      .accessfn = aa64_cacheop_persist_access },

While this access function works, it's better to simply not register these at
all when they're not supported.  Compare the registration of rndr_reginfo.

As I described above, I think this can use a normal write function.  In which
case this would use .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END.

(I believe that ARM_CP_IO is not required, but I'm not 100% sure.  Certainly
there does not seem to be anything in dc_cvap() that affects state which can
queried from another virtual cpu, so there does not appear to be any reason to
grab the global i/o lock.  The host kernel should be just fine with concurrent
msync syscalls, or whatever it is that libpmem uses.)


> +void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
> +{
> +#ifndef CONFIG_USER_ONLY
> +    ARMCPU *cpu = env_archcpu(env);
> +    /* CTR_EL0 System register -> DminLine, bits [19:16] */
> +    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
> +    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
> +    void *haddr;
> +    int mem_idx = cpu_mmu_index(env, false);
> +
> +    /* This won't be crossing page boundaries */
> +    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
> +    if (haddr) {
> +
> +        ram_addr_t offset;
> +        MemoryRegion *mr;
> +
> +        /*
> +         * RCU critical section + ref counting,
> +         * so that MR won't disappear behind the scene
> +         */
> +        rcu_read_lock();
> +        mr = memory_region_from_host(haddr, &offset);
> +        if (mr) {
> +            memory_region_ref(mr);
> +        }
> +        rcu_read_unlock();
> +
> +        if (mr) {
> +            memory_region_do_writeback(mr, offset, dline_size);
> +            memory_region_unref(mr);
> +        }
> +    }
> +#endif


We hold the rcu lock whenever a TB is executing.  I don't believe there's any
point in increasing the lock count here.  Similarly with memory_region
refcounts -- they cannot vanish while we're executing a TB.

Thus I believe that about half of this function can fold away.


r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region
  2019-09-24  0:00   ` Alex Bennée
@ 2019-10-09 11:40     ` Beata Michalska
  0 siblings, 0 replies; 25+ messages in thread
From: Beata Michalska @ 2019-10-09 11:40 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, Dr. David Alan Gilbert, eric.auger,
	qemu-arm, pbonzini

On Tue, 24 Sep 2019 at 01:00, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Beata Michalska <beata.michalska@linaro.org> writes:
>
> > Add an option to trigger memory writeback to sync given memory region
> > with the corresponding backing store, case one is available.
> > This extends the support for persistent memory, allowing syncing on-demand.
> >
> > Also, adding verification for msync support on host.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  configure               | 24 ++++++++++++++++++++++++
> >  exec.c                  | 38 ++++++++++++++++++++++++++++++++++++++
> >  include/exec/memory.h   |  6 ++++++
> >  include/exec/ram_addr.h |  6 ++++++
> >  memory.c                | 12 ++++++++++++
> >  5 files changed, 86 insertions(+)
> >
> > diff --git a/configure b/configure
> > index 95134c0180..bdb7dc47e9 100755
> > --- a/configure
> > +++ b/configure
> > @@ -5081,6 +5081,26 @@ if compile_prog "" "" ; then
> >      fdatasync=yes
> >  fi
> >
> > +##########################################
> > +# verify support for msyc
> > +
> > +msync=no
> > +cat > $TMPC << EOF
> > +#include <unistd.h>
> > +#include <sys/mman.h>
> > +int main(void) {
> > +#if defined(_POSIX_MAPPED_FILES) && _POSIX_MAPPED_FILES > 0 \
> > +&& defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
> > +return msync(NULL,0, MS_SYNC);
> > +#else
> > +#error Not supported
> > +#endif
> > +}
> > +EOF
> > +if compile_prog "" "" ; then
> > +    msync=yes
> > +fi
> > +
> >  ##########################################
> >  # check if we have madvise
> >
> > @@ -6413,6 +6433,7 @@ echo "fdt support       $fdt"
> >  echo "membarrier        $membarrier"
> >  echo "preadv support    $preadv"
> >  echo "fdatasync         $fdatasync"
> > +echo "msync             $msync"
> >  echo "madvise           $madvise"
> >  echo "posix_madvise     $posix_madvise"
> >  echo "posix_memalign    $posix_memalign"
> > @@ -6952,6 +6973,9 @@ fi
> >  if test "$fdatasync" = "yes" ; then
> >    echo "CONFIG_FDATASYNC=y" >> $config_host_mak
> >  fi
> > +if test "$msync" = "yes" ; then
> > +    echo "CONFIG_MSYNC=y" >> $config_host_mak
> > +fi
>
> I think it's best to split this configure check into a new prequel patch and...

I might just drop it in favour of CONFIG_POSIX switch ......
>
> >  if test "$madvise" = "yes" ; then
> >    echo "CONFIG_MADVISE=y" >> $config_host_mak
> >  fi
> > diff --git a/exec.c b/exec.c
> > index 235d6bc883..5ed6908368 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -65,6 +65,8 @@
> >  #include "exec/ram_addr.h"
> >  #include "exec/log.h"
> >
> > +#include "qemu/pmem.h"
> > +
> >  #include "migration/vmstate.h"
> >
> >  #include "qemu/range.h"
> > @@ -2182,6 +2184,42 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
> >      return 0;
> >  }
> >
> > +/*
> > + * Trigger sync on the given ram block for range [start, start + length]
> > + * with the backing store if available.
> > + * @Note: this is supposed to be a SYNC op.
> > + */
> > +void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length)
> > +{
> > +    void *addr = ramblock_ptr(block, start);
> > +
> > +    /*
> > +     * The requested range might spread up to the very end of the block
> > +     */
> > +    if ((start + length) > block->used_length) {
> > +        error_report("%s: sync range outside the block boundires: "
> > +                     "start: " RAM_ADDR_FMT " length: " RAM_ADDR_FMT
> > +                     " block length: " RAM_ADDR_FMT " Narrowing down ..." ,
> > +                     __func__, start, length, block->used_length);
>
> Is this an error or just logging? error_report should be used for stuff
> that the user needs to know about so it will appear on the HMP console
> (or via stderr). If so what is the user expected to do? Have they
> misconfigured their system?
>

This should be logging  rather than 'error reporting as such. My bad.
Will address that in the next version.

> > +        length = block->used_length - start;
> > +    }
> > +
> > +#ifdef CONFIG_LIBPMEM
> > +    /* The lack of support for pmem should not block the sync */
> > +    if (ramblock_is_pmem(block)) {
> > +        pmem_persist(addr, length);
> > +    } else
> > +#endif
> > +    if (block->fd >= 0) {
> > +#ifdef CONFIG_MSYNC
> > +        msync((void *)((uintptr_t)addr & qemu_host_page_mask),
> > +               HOST_PAGE_ALIGN(length), MS_SYNC);
> > +#else
> > +        qemu_fdatasync(block->fd);
> > +#endif
>
> ... hide the implementation details in util/cutils.c, maybe as
> qemu_msync()?
>
> > +    }
> > +}
> > +
> >  /* Called with ram_list.mutex held */
> >  static void dirty_memory_extend(ram_addr_t old_ram_size,
> >                                  ram_addr_t new_ram_size)
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 2dd810259d..ff0d7937cf 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -1242,6 +1242,12 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr);
> >   */
> >  void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize,
> >                                Error **errp);
> > +/**
> > + * memory_region_do_writeback: Trigger writeback for selected address range
> > + * [addr, addr + size]
> > + *
> > + */
> > +void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size);
> >
> >  /**
> >   * memory_region_set_log: Turn dirty logging on or off for a region.
> > diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> > index a327a80cfe..d4bce81a03 100644
> > --- a/include/exec/ram_addr.h
> > +++ b/include/exec/ram_addr.h
> > @@ -180,6 +180,12 @@ void qemu_ram_free(RAMBlock *block);
> >
> >  int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp);
> >
> > +void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length);
> > +
> > +/* Clear whole block of mem */
> > +#define qemu_ram_block_writeback(rb)    \
> > +    qemu_ram_writeback(rb, 0, rb->used_length)
> > +
> >  #define DIRTY_CLIENTS_ALL     ((1 << DIRTY_MEMORY_NUM) - 1)
> >  #define DIRTY_CLIENTS_NOCODE  (DIRTY_CLIENTS_ALL & ~(1 << DIRTY_MEMORY_CODE))
> >
> > diff --git a/memory.c b/memory.c
> > index 61a254c3f9..436eb64737 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -2228,6 +2228,18 @@ void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize, Error **errp
> >      qemu_ram_resize(mr->ram_block, newsize, errp);
> >  }
> >
> > +
> > +void memory_region_do_writeback(MemoryRegion *mr, hwaddr addr, hwaddr size)
> > +{
> > +    /*
> > +     * Might be extended case needed to cover
> > +     * different types of memory regions
> > +     */
> > +    if (mr->ram_block && mr->dirty_log_mask) {
> > +        qemu_ram_writeback(mr->ram_block, addr, size);
> > +    }
> > +}
> > +
> >  /*
> >   * Call proper memory listeners about the change on the newly
> >   * added/removed CoalescedMemoryRange.
>
>
> --
> Alex Bennée

Thank you.

BR
Beata


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region
  2019-09-24 16:28   ` Richard Henderson
@ 2019-10-09 11:44     ` Beata Michalska
  0 siblings, 0 replies; 25+ messages in thread
From: Beata Michalska @ 2019-10-09 11:44 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Peter Maydell, quintela, Dr. David Alan Gilbert,
	shameerali.kolothum.thodi, qemu-devel, eric.auger, qemu-arm,
	pbonzini

On Tue, 24 Sep 2019 at 17:28, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/10/19 2:56 AM, Beata Michalska wrote:
> > +int main(void) {
> > +#if defined(_POSIX_MAPPED_FILES) && _POSIX_MAPPED_FILES > 0 \
> > +&& defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
> > +return msync(NULL,0, MS_SYNC);
> > +#else
> > +#error Not supported
> > +#endif
> > +}
>
> Is there any particular reason to check _POSIX_MAPPED_FILES &
> _POSIX_SYNCHRONIZED_IO?  IIRC, you can use those to "safely" use MS_SYNC.  But
> this is a configure test, and an error is in fact our defined failure case, so
> "safely" doesn't seem particularly relevant.
>
> Alternately, do we even support any systems (besides perhaps windows) that do
> not provide POSIX-2001 support, and so include msync + MS_SYNC?  My first guess
> is that we don't.
>

Both flags are there to verify support for msync itself.
The check there is for posix systems , where if both set to value
greater than '0'
the msync call is available.
AFAIK Windows is the only posix non-compliant system being supported . Though
I might be wrong (?)
I might just drop the check here and use CONFIG_POSIX to handle the
msync call instead.

> > +        msync((void *)((uintptr_t)addr & qemu_host_page_mask),
> > +               HOST_PAGE_ALIGN(length), MS_SYNC);
>
> This isn't quite right.  If you move addr down to a lower address via this page
> mask, you must also increase length by the same amount, and only afterward
> increase length to the host page size.
>
> Consider addr == 0xffffff, length = 2.  This covers two pages, so you'd expect
> the final parameters to be, for 4k page size, 0xfff000, 0x2000.
>
Thanks for catching this - guess I was too focused on the cache line
size, which would not cross page boundaries. Will fix that in the next version.

> r~

Thank you.

BR
Beata


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region
  2019-09-24 16:30   ` Richard Henderson
@ 2019-10-09 11:45     ` Beata Michalska
  0 siblings, 0 replies; 25+ messages in thread
From: Beata Michalska @ 2019-10-09 11:45 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Peter Maydell, quintela, Dr. David Alan Gilbert,
	shameerali.kolothum.thodi, qemu-devel, eric.auger, qemu-arm,
	pbonzini

On Tue, 24 Sep 2019 at 17:30, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/10/19 2:56 AM, Beata Michalska wrote:
> > +void qemu_ram_writeback(RAMBlock *block, ram_addr_t start, ram_addr_t length);
> > +
> > +/* Clear whole block of mem */
> > +#define qemu_ram_block_writeback(rb)    \
> > +    qemu_ram_writeback(rb, 0, rb->used_length)
> > +
>
> Inline function, with its typechecking, is preferred.
>
>
Noted.
To be fixed in the next version.

BR
Beata
> r~


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins
  2019-09-23 23:54   ` Alex Bennée
@ 2019-10-09 11:47     ` Beata Michalska
  0 siblings, 0 replies; 25+ messages in thread
From: Beata Michalska @ 2019-10-09 11:47 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, Dr. David Alan Gilbert, eric.auger,
	qemu-arm, pbonzini

On Tue, 24 Sep 2019 at 00:54, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Beata Michalska <beata.michalska@linaro.org> writes:
>
> > ARMv8.2 introduced support for Data Cache Clean instructions
> > to PoP (point-of-persistence) - DC CVAP and PoDP (point-of-deep-persistence)
> > - DV CVADP. Both specify conceptual points in a memory system where all writes
> > that are to reach them are considered persistent.
> > The support provided considers both to be actually the same so there is no
> > distinction between the two. If none is available (there is no backing store
> > for given memory) both will result in Data Cache Clean up to the point of
> > coherency. Otherwise sync for the specified range shall be performed.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  linux-user/elfload.c       | 18 +++++++++++++++++-
>
> There are conflicts from the recent elfload.c tweaks to fix on your next rebase.

Will address it in the next version.

> >  target/arm/cpu.h           | 13 ++++++++++++-
> >  target/arm/cpu64.c         |  1 +
> >  target/arm/helper.c        | 24 ++++++++++++++++++++++++
> >  target/arm/helper.h        |  1 +
> >  target/arm/op_helper.c     | 36 ++++++++++++++++++++++++++++++++++++
> >  target/arm/translate-a64.c |  5 +++++
> >  7 files changed, 96 insertions(+), 2 deletions(-)
> >
> > diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> > index 3365e192eb..1ec00308d5 100644
> > --- a/linux-user/elfload.c
> > +++ b/linux-user/elfload.c
> > @@ -609,7 +609,12 @@ enum {
> >      ARM_HWCAP_A64_PACG          = 1UL << 31,
> >  };
> >
> > +enum {
> > +    ARM_HWCAP2_A64_DCPODP   = 1 << 0,
> > +};
> > +
> >  #define ELF_HWCAP get_elf_hwcap()
> > +#define ELF_HWCAP2 get_elf_hwcap2()
> >
> >  static uint32_t get_elf_hwcap(void)
> >  {
> > @@ -644,12 +649,23 @@ static uint32_t get_elf_hwcap(void)
> >      GET_FEATURE_ID(aa64_jscvt, ARM_HWCAP_A64_JSCVT);
> >      GET_FEATURE_ID(aa64_sb, ARM_HWCAP_A64_SB);
> >      GET_FEATURE_ID(aa64_condm_4, ARM_HWCAP_A64_FLAGM);
> > +    GET_FEATURE_ID(aa64_dcpop, ARM_HWCAP_A64_DCPOP);
> >
> > -#undef GET_FEATURE_ID
> >
> >      return hwcaps;
> >  }
> >
> > +static uint32_t get_elf_hwcap2(void)
> > +{
> > +    ARMCPU *cpu = ARM_CPU(thread_cpu);
> > +    uint32_t hwcaps = 0;
> > +
> > +    GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP);
> > +    return hwcaps;
> > +}
> > +
> > +#undef GET_FEATURE_ID
> > +
> >  #endif /* not TARGET_AARCH64 */
> >  #endif /* TARGET_ARM */
> >
> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > index 297ad5e47a..1713d76590 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
> >  #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
> >  #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
> >  #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
> > -#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
> > +#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
> > +#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP
> >  #define ARM_CP_FPU               0x1000
> >  #define ARM_CP_SVE               0x2000
> >  #define ARM_CP_NO_GDB            0x4000
> > @@ -3572,6 +3573,16 @@ static inline bool isar_feature_aa64_frint(const ARMISARegisters *id)
> >      return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FRINTTS) != 0;
> >  }
> >
> > +static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
> > +{
> > +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
> > +}
> > +
> > +static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
> > +{
> > +    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
> > +}
> > +
> >  static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
> >  {
> >      /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
> > diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> > index d7f5bf610a..20094f980d 100644
> > --- a/target/arm/cpu64.c
> > +++ b/target/arm/cpu64.c
> > @@ -331,6 +331,7 @@ static void aarch64_max_initfn(Object *obj)
> >          cpu->isar.id_aa64isar0 = t;
> >
> >          t = cpu->isar.id_aa64isar1;
> > +        t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);
> >          t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);
> >          t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
> >          t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only */
> > diff --git a/target/arm/helper.c b/target/arm/helper.c
> > index 507026c915..99ae01b7e7 100644
> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -3843,6 +3843,22 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
> >      return CP_ACCESS_OK;
> >  }
> >
> > +static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
> > +                                                  const ARMCPRegInfo *ri,
> > +                                                  bool isread)
> > +{
> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /*
> > +     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
> > +     * DC CVAP ->  CRm: 0xc
> > +     * DC CVADP -> CRm: 0xd
> > +     */
> > +    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
> > +           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
> > +           ? CP_ACCESS_TRAP_UNCATEGORIZED
> > +           : aa64_cacheop_access(env, ri, isread);
> > +}
> > +
> >  /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
> >   * Page D4-1736 (DDI0487A.b)
> >   */
> > @@ -4251,6 +4267,14 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
> >        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
> >        .access = PL0_W, .type = ARM_CP_NOP,
> >        .accessfn = aa64_cacheop_access },
> > +    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
> > +    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
> >      { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
> >        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
> >        .access = PL1_W, .type = ARM_CP_NOP },
> > diff --git a/target/arm/helper.h b/target/arm/helper.h
> > index 1fb2cb5a77..a850364944 100644
> > --- a/target/arm/helper.h
> > +++ b/target/arm/helper.h
> > @@ -561,6 +561,7 @@ DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
> >  DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
> >  DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
> >  DEF_HELPER_2(dc_zva, void, env, i64)
> > +DEF_HELPER_2(dc_cvap, void, env, i64)
> >
> >  DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> >  DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> > diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> > index 0fd4bd0238..75ebf6afa4 100644
> > --- a/target/arm/op_helper.c
> > +++ b/target/arm/op_helper.c
> > @@ -987,3 +987,39 @@ void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
> >      memset(g2h(vaddr), 0, blocklen);
> >  #endif
> >  }
> > +
> > +void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
> > +{
> > +#ifndef CONFIG_USER_ONLY
> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /* CTR_EL0 System register -> DminLine, bits [19:16] */
> > +    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
> > +    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
> > +    void *haddr;
> > +    int mem_idx = cpu_mmu_index(env, false);
> > +
> > +    /* This won't be crossing page boundaries */
> > +    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
> > +    if (haddr) {
> > +
> > +        ram_addr_t offset;
> > +        MemoryRegion *mr;
> > +
> > +        /*
> > +         * RCU critical section + ref counting,
> > +         * so that MR won't disappear behind the scene
> > +         */
> > +        rcu_read_lock();
> > +        mr = memory_region_from_host(haddr, &offset);
> > +        if (mr) {
> > +            memory_region_ref(mr);
> > +        }
> > +        rcu_read_unlock();
> > +
> > +        if (mr) {
> > +            memory_region_do_writeback(mr, offset, dline_size);
> > +            memory_region_unref(mr);
> > +        }
> > +    }
> > +#endif
> > +}
> > diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> > index 2d6cd09634..21ea3631d6 100644
> > --- a/target/arm/translate-a64.c
> > +++ b/target/arm/translate-a64.c
> > @@ -1746,6 +1746,11 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
> >          tcg_rt = cpu_reg(s, rt);
> >          gen_helper_dc_zva(cpu_env, tcg_rt);
> >          return;
> > +    case ARM_CP_DC_CVAP:
> > +        /* DC CVAP / DC CVADP */
> > +        tcg_rt = cpu_reg(s, rt);
> > +        gen_helper_dc_cvap(cpu_env, tcg_rt);
> > +        return;
> >      default:
> >          break;
> >      }
>
>
> --
> Alex Bennée

BR
Beata


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins
  2019-09-24  1:16   ` Alex Bennée
@ 2019-10-09 11:49     ` Beata Michalska
  0 siblings, 0 replies; 25+ messages in thread
From: Beata Michalska @ 2019-10-09 11:49 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, quintela, Richard Henderson, qemu-devel,
	shameerali.kolothum.thodi, Dr. David Alan Gilbert, eric.auger,
	qemu-arm, pbonzini

On Tue, 24 Sep 2019 at 02:16, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Beata Michalska <beata.michalska@linaro.org> writes:
>
> > ARMv8.2 introduced support for Data Cache Clean instructions
> > to PoP (point-of-persistence) - DC CVAP and PoDP (point-of-deep-persistence)
> > - DV CVADP. Both specify conceptual points in a memory system where all writes
> > that are to reach them are considered persistent.
> > The support provided considers both to be actually the same so there is no
> > distinction between the two. If none is available (there is no backing store
> > for given memory) both will result in Data Cache Clean up to the point of
> > coherency. Otherwise sync for the specified range shall be performed.
> >
> > Signed-off-by: Beata Michalska <beata.michalska@linaro.org>
> > ---
> >  linux-user/elfload.c       | 18 +++++++++++++++++-
> >  target/arm/cpu.h           | 13 ++++++++++++-
> >  target/arm/cpu64.c         |  1 +
> >  target/arm/helper.c        | 24 ++++++++++++++++++++++++
> >  target/arm/helper.h        |  1 +
> >  target/arm/op_helper.c     | 36 ++++++++++++++++++++++++++++++++++++
> >  target/arm/translate-a64.c |  5 +++++
> >  7 files changed, 96 insertions(+), 2 deletions(-)
> >
> > diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> > index 3365e192eb..1ec00308d5 100644
> > --- a/linux-user/elfload.c
> > +++ b/linux-user/elfload.c
> > @@ -609,7 +609,12 @@ enum {
> >      ARM_HWCAP_A64_PACG          = 1UL << 31,
> >  };
> >
> > +enum {
> > +    ARM_HWCAP2_A64_DCPODP   = 1 << 0,
> > +};
> > +
> >  #define ELF_HWCAP get_elf_hwcap()
> > +#define ELF_HWCAP2 get_elf_hwcap2()
> >
> >  static uint32_t get_elf_hwcap(void)
> >  {
> > @@ -644,12 +649,23 @@ static uint32_t get_elf_hwcap(void)
> >      GET_FEATURE_ID(aa64_jscvt, ARM_HWCAP_A64_JSCVT);
> >      GET_FEATURE_ID(aa64_sb, ARM_HWCAP_A64_SB);
> >      GET_FEATURE_ID(aa64_condm_4, ARM_HWCAP_A64_FLAGM);
> > +    GET_FEATURE_ID(aa64_dcpop, ARM_HWCAP_A64_DCPOP);
> >
> > -#undef GET_FEATURE_ID
> >
> >      return hwcaps;
> >  }
> >
> > +static uint32_t get_elf_hwcap2(void)
> > +{
> > +    ARMCPU *cpu = ARM_CPU(thread_cpu);
> > +    uint32_t hwcaps = 0;
> > +
> > +    GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP);
> > +    return hwcaps;
> > +}
> > +
> > +#undef GET_FEATURE_ID
> > +
> >  #endif /* not TARGET_AARCH64 */
> >  #endif /* TARGET_ARM */
> >
> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > index 297ad5e47a..1713d76590 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
> >  #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
> >  #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
> >  #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
> > -#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
> > +#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
> > +#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP
> >  #define ARM_CP_FPU               0x1000
> >  #define ARM_CP_SVE               0x2000
> >  #define ARM_CP_NO_GDB            0x4000
> > @@ -3572,6 +3573,16 @@ static inline bool isar_feature_aa64_frint(const ARMISARegisters *id)
> >      return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, FRINTTS) != 0;
> >  }
> >
> > +static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
> > +{
> > +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
> > +}
> > +
> > +static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
> > +{
> > +    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
> > +}
> > +
> >  static inline bool isar_feature_aa64_fp16(const ARMISARegisters *id)
> >  {
> >      /* We always set the AdvSIMD and FP fields identically wrt FP16.  */
> > diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> > index d7f5bf610a..20094f980d 100644
> > --- a/target/arm/cpu64.c
> > +++ b/target/arm/cpu64.c
> > @@ -331,6 +331,7 @@ static void aarch64_max_initfn(Object *obj)
> >          cpu->isar.id_aa64isar0 = t;
> >
> >          t = cpu->isar.id_aa64isar1;
> > +        t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);
> >          t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);
> >          t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
> >          t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only */
> > diff --git a/target/arm/helper.c b/target/arm/helper.c
> > index 507026c915..99ae01b7e7 100644
> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -3843,6 +3843,22 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
> >      return CP_ACCESS_OK;
> >  }
> >
> > +static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
> > +                                                  const ARMCPRegInfo *ri,
> > +                                                  bool isread)
> > +{
> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /*
> > +     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
> > +     * DC CVAP ->  CRm: 0xc
> > +     * DC CVADP -> CRm: 0xd
> > +     */
> > +    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
> > +           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
> > +           ? CP_ACCESS_TRAP_UNCATEGORIZED
> > +           : aa64_cacheop_access(env, ri, isread);
> > +}
> > +
> >  /* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions
> >   * Page D4-1736 (DDI0487A.b)
> >   */
> > @@ -4251,6 +4267,14 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
> >        .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 1,
> >        .access = PL0_W, .type = ARM_CP_NOP,
> >        .accessfn = aa64_cacheop_access },
> > +    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
> > +    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
> >      { .name = "DC_CSW", .state = ARM_CP_STATE_AA64,
> >        .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 2,
> >        .access = PL1_W, .type = ARM_CP_NOP },
> > diff --git a/target/arm/helper.h b/target/arm/helper.h
> > index 1fb2cb5a77..a850364944 100644
> > --- a/target/arm/helper.h
> > +++ b/target/arm/helper.h
> > @@ -561,6 +561,7 @@ DEF_HELPER_FLAGS_3(crypto_sm4ekey, TCG_CALL_NO_RWG, void, ptr, ptr, ptr)
> >  DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
> >  DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
> >  DEF_HELPER_2(dc_zva, void, env, i64)
> > +DEF_HELPER_2(dc_cvap, void, env, i64)
> >
> >  DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> >  DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> > diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> > index 0fd4bd0238..75ebf6afa4 100644
> > --- a/target/arm/op_helper.c
> > +++ b/target/arm/op_helper.c
> > @@ -987,3 +987,39 @@ void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
> >      memset(g2h(vaddr), 0, blocklen);
> >  #endif
> >  }
> > +
> > +void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
> > +{
> > +#ifndef CONFIG_USER_ONLY
>
> Are we essentially saying the operation is a NOP for user-mode
> emulation? Is it just because we don't really expose HW to linux-user?
>
> If so move the #ifndef outside the HELPER definition and...
>

I do not think there is a way to specify persistent mem for user mode.
Or am I mistaken ?
I will move the switch to the HELPER def.

> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /* CTR_EL0 System register -> DminLine, bits [19:16] */
> > +    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
> > +    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
> > +    void *haddr;
> > +    int mem_idx = cpu_mmu_index(env, false);
> > +
> > +    /* This won't be crossing page boundaries */
> > +    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
> > +    if (haddr) {
> > +
> > +        ram_addr_t offset;
> > +        MemoryRegion *mr;
> > +
> > +        /*
> > +         * RCU critical section + ref counting,
> > +         * so that MR won't disappear behind the scene
> > +         */
> > +        rcu_read_lock();
> > +        mr = memory_region_from_host(haddr, &offset);
> > +        if (mr) {
> > +            memory_region_ref(mr);
> > +        }
> > +        rcu_read_unlock();
> > +
> > +        if (mr) {
> > +            memory_region_do_writeback(mr, offset, dline_size);
> > +            memory_region_unref(mr);
> > +        }
> > +    }
> > +#endif
> > +}
> > diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> > index 2d6cd09634..21ea3631d6 100644
> > --- a/target/arm/translate-a64.c
> > +++ b/target/arm/translate-a64.c
> > @@ -1746,6 +1746,11 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
> >          tcg_rt = cpu_reg(s, rt);
> >          gen_helper_dc_zva(cpu_env, tcg_rt);
> >          return;
> > +    case ARM_CP_DC_CVAP:
> > +        /* DC CVAP / DC CVADP */
>
> .. #ifndef CONFIG_USER_ONLY this code so we don't add the overhead of a
> helper call in linux-user mode.

Noted.
>
> > +        tcg_rt = cpu_reg(s, rt);
> > +        gen_helper_dc_cvap(cpu_env, tcg_rt);
> > +        return;
> >      default:
> >          break;
> >      }
>
>
> --
> Alex Bennée

Thank you,

BR
Beata


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins
  2019-09-24 17:22   ` Richard Henderson
@ 2019-10-09 11:53     ` Beata Michalska
  0 siblings, 0 replies; 25+ messages in thread
From: Beata Michalska @ 2019-10-09 11:53 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Peter Maydell, quintela, Dr. David Alan Gilbert,
	shameerali.kolothum.thodi, qemu-devel, eric.auger, qemu-arm,
	pbonzini

On Tue, 24 Sep 2019 at 18:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/10/19 2:56 AM, Beata Michalska wrote:
> > @@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
> >  #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
> >  #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
> >  #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
> > -#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
> > +#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
> > +#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP
>
> I don't see that this operation needs to be handled via "special".  It's a
> function call upon write, as for many other system registers.
>

Too inspired by ZVA I guess. Will make the appropriate changes in the
next version.

> > +static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
> > +{
> > +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
> > +}
> > +
> > +static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
> > +{
> > +    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
> > +}
>
> The correct check is FIELD_EX(...) >= 2.  This is a 4-bit field, as with all of
> the others.
>
Noted.
> > +static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
> > +                                                  const ARMCPRegInfo *ri,
> > +                                                  bool isread)
> > +{
> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /*
> > +     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
> > +     * DC CVAP ->  CRm: 0xc
> > +     * DC CVADP -> CRm: 0xd
> > +     */
> > +    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
> > +           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
> > +           ? CP_ACCESS_TRAP_UNCATEGORIZED
> > +           : aa64_cacheop_access(env, ri, isread);
> > +}
> ...
> > +    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
> > +    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
>
> While this access function works, it's better to simply not register these at
> all when they're not supported.  Compare the registration of rndr_reginfo.
>
> As I described above, I think this can use a normal write function.  In which
> case this would use .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END.
>
> (I believe that ARM_CP_IO is not required, but I'm not 100% sure.  Certainly
> there does not seem to be anything in dc_cvap() that affects state which can
> queried from another virtual cpu, so there does not appear to be any reason to
> grab the global i/o lock.  The host kernel should be just fine with concurrent
> msync syscalls, or whatever it is that libpmem uses.)
>
>
OK, will move that to conditional registration and double check the type.
Thanks for the suggestion.

> > +void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
> > +{
> > +#ifndef CONFIG_USER_ONLY
> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /* CTR_EL0 System register -> DminLine, bits [19:16] */
> > +    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
> > +    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
> > +    void *haddr;
> > +    int mem_idx = cpu_mmu_index(env, false);
> > +
> > +    /* This won't be crossing page boundaries */
> > +    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
> > +    if (haddr) {
> > +
> > +        ram_addr_t offset;
> > +        MemoryRegion *mr;
> > +
> > +        /*
> > +         * RCU critical section + ref counting,
> > +         * so that MR won't disappear behind the scene
> > +         */
> > +        rcu_read_lock();
> > +        mr = memory_region_from_host(haddr, &offset);
> > +        if (mr) {
> > +            memory_region_ref(mr);
> > +        }
> > +        rcu_read_unlock();
> > +
> > +        if (mr) {
> > +            memory_region_do_writeback(mr, offset, dline_size);
> > +            memory_region_unref(mr);
> > +        }
> > +    }
> > +#endif
>
>
> We hold the rcu lock whenever a TB is executing.  I don't believe there's any
> point in increasing the lock count here.  Similarly with memory_region
> refcounts -- they cannot vanish while we're executing a TB.
>
> Thus I believe that about half of this function can fold away.
>
>
So I was chasing the wrong locking herre...
Indeed if the RCU lock is being already held I can safely drop the locking here.

> r~

Thank you for the review,

BR
Beata


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-10-09 18:32 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-10  9:56 [Qemu-devel] [PATCH 0/4] target/arm: Support for Data Cache Clean up to PoP Beata Michalska
2019-09-10  9:56 ` [Qemu-devel] [PATCH 1/4] tcg: cputlb: Add probe_read Beata Michalska
2019-09-23 23:54   ` Alex Bennée
2019-09-10  9:56 ` [Qemu-devel] [PATCH 2/4] Memory: Enable writeback for given memory region Beata Michalska
2019-09-24  0:00   ` Alex Bennée
2019-10-09 11:40     ` Beata Michalska
2019-09-24 16:28   ` Richard Henderson
2019-10-09 11:44     ` Beata Michalska
2019-09-24 16:30   ` Richard Henderson
2019-10-09 11:45     ` Beata Michalska
2019-09-10  9:56 ` [Qemu-devel] [PATCH 3/4] migration: ram: Switch to ram block writeback Beata Michalska
2019-09-10 10:26   ` Dr. David Alan Gilbert
2019-09-10 11:28     ` Beata Michalska
2019-09-10 13:16       ` Dr. David Alan Gilbert
2019-09-10 14:21         ` Beata Michalska
2019-09-11 10:36           ` Dr. David Alan Gilbert
2019-09-12  9:10             ` Beata Michalska
2019-09-24 16:30   ` Richard Henderson
2019-09-10  9:56 ` [Qemu-devel] [PATCH 4/4] target/arm: Add support for DC CVAP & DC CVADP ins Beata Michalska
2019-09-23 23:54   ` Alex Bennée
2019-10-09 11:47     ` Beata Michalska
2019-09-24  1:16   ` Alex Bennée
2019-10-09 11:49     ` Beata Michalska
2019-09-24 17:22   ` Richard Henderson
2019-10-09 11:53     ` Beata Michalska

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).