All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3
@ 2017-09-30  2:08 Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 01/13] target/arm: A53: Initialize PMCEID[0] Aaron Lindsay
                   ` (14 more replies)
  0 siblings, 15 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

The ARM PMU implementation currently contains a basic cycle counter, but it is
often useful to gather counts of other events and filter them based on
execution mode. These patches flesh out the implementations of various PMU
registers including PM[X]EVCNTR and PM[X]EVTYPER, add a struct definition to
represent arbitrary counter types, implement mode filtering, and add
instruction, cycle, and software increment events.

I am particularly interested in feedback on the following two patches because I
think I'm likely Doing It Wrong:
 [1] target/arm: Filter cycle counter based on PMCCFILTR_EL0
 [2] target/arm: PMU: Add instruction and cycle events

In order to implement mode filtering in an event-driven way, [1] adds a pair of
calls to pmu_sync() surrounding every update to a register/variable which may
affect whether any counter is currently filtered. These pmu_sync() calls
ultimately call cpu_get_icount_raw() for enabled instruction and cycle counters
when using icount. Unfortunately, cpu->can_do_io may otherwise be zero for
these calls so the current implementation in [2] temporarily sets can_do_io to
1. I haven't see any ill side effects from this in my testing, but it doesn't
seem like the right way to handle this.

I would like to eventually add sending interrupts on counter overflow.
Suggestions for the best direction to handle this are most welcome.  

Thanks for any feedback,
Aaron

Aaron Lindsay (13):
  target/arm: A53: Initialize PMCEID[0]
  target/arm: Check PMCNTEN for whether PMCCNTR is enabled
  target/arm: Reorganize PMCCNTR read, write, sync
  target/arm: Mask PMU register writes based on PMCR_EL0.N
  target/arm: Allow AArch32 access for PMCCFILTR
  target/arm: Filter cycle counter based on PMCCFILTR_EL0
  target/arm: Implement PMOVSSET
  target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled
  target/arm: Add array for supported PMU events, generate PMCEID[01]
  target/arm: Finish implementation of PM[X]EVCNTR and PM[X]EVTYPER
  target/arm: PMU: Add instruction and cycle events
  target/arm: PMU: Set PMCR.N to 4
  target/arm: Implement PMSWINC

 target/arm/cpu.c       |  10 +-
 target/arm/cpu.h       |  34 +++-
 target/arm/cpu64.c     |   2 +
 target/arm/helper.c    | 535 +++++++++++++++++++++++++++++++++++++++++--------
 target/arm/kvm64.c     |   2 +
 target/arm/machine.c   |   2 +
 target/arm/op_helper.c |   4 +
 7 files changed, 500 insertions(+), 89 deletions(-)

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 01/13] target/arm: A53: Initialize PMCEID[0]
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled Aaron Lindsay
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

A53 advertises ARM_FEATURE_PMU, but wasn't initializing pmceid[01]

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/cpu.c   | 2 +-
 target/arm/cpu64.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 4300de6..a6c29cf 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1402,7 +1402,7 @@ static void cortex_a15_initfn(Object *obj)
     cpu->id_pfr0 = 0x00001131;
     cpu->id_pfr1 = 0x00011011;
     cpu->id_dfr0 = 0x02010555;
-    cpu->pmceid0 = 0x0000000;
+    cpu->pmceid0 = 0x00000000;
     cpu->pmceid1 = 0x00000000;
     cpu->id_afr0 = 0x00000000;
     cpu->id_mmfr0 = 0x10201105;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 670c07a..7b1642e 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -198,6 +198,8 @@ static void aarch64_a53_initfn(Object *obj)
     cpu->id_isar5 = 0x00011121;
     cpu->id_aa64pfr0 = 0x00002222;
     cpu->id_aa64dfr0 = 0x10305106;
+    cpu->pmceid0 = 0x00000000;
+    cpu->pmceid1 = 0x00000000;
     cpu->id_aa64isar0 = 0x00011120;
     cpu->id_aa64mmfr0 = 0x00001122; /* 40 bit physical addr */
     cpu->dbgdidr = 0x3516d000;
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 01/13] target/arm: A53: Initialize PMCEID[0] Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-17 12:49   ` Peter Maydell
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 03/13] target/arm: Reorganize PMCCNTR read, write, sync Aaron Lindsay
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8be78ea..40c9273 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -964,7 +964,7 @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
 {
     /* This does not support checking PMCCFILTR_EL0 register */
 
-    if (!(env->cp15.c9_pmcr & PMCRE)) {
+    if (!(env->cp15.c9_pmcr & PMCRE) || !(env->cp15.c9_pmcnten & (1 << 31))) {
         return false;
     }
 
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 03/13] target/arm: Reorganize PMCCNTR read, write, sync
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 01/13] target/arm: A53: Initialize PMCEID[0] Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-17 13:25   ` Peter Maydell
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N Aaron Lindsay
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

pmccntr_read and pmccntr_write contained duplicate code that was already
being handled by pmccntr_sync. This also moves the calls to get the
clock inside the 'if' statement so they are not executed if not needed.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 55 ++++++++++++++++-------------------------------------
 1 file changed, 16 insertions(+), 39 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 40c9273..ecf8c55 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -973,17 +973,18 @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
 
 void pmccntr_sync(CPUARMState *env)
 {
-    uint64_t temp_ticks;
+    if (arm_ccnt_enabled(env) &&
+          !pmu_counter_filtered(env, env->cp15.pmccfiltr_el0)) {
+        uint64_t temp_ticks;
 
-    temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-                          ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
+        temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
+                              ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
 
-    if (env->cp15.c9_pmcr & PMCRD) {
-        /* Increment once every 64 processor clock cycles */
-        temp_ticks /= 64;
-    }
+        if (env->cp15.c9_pmcr & PMCRD) {
+            /* Increment once every 64 processor clock cycles */
+            temp_ticks /= 64;
+        }
 
-    if (arm_ccnt_enabled(env)) {
         env->cp15.c15_ccnt = temp_ticks - env->cp15.c15_ccnt;
     }
 }
@@ -1007,21 +1008,11 @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
 static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
-    uint64_t total_ticks;
-
-    if (!arm_ccnt_enabled(env)) {
-        /* Counter is disabled, do not change value */
-        return env->cp15.c15_ccnt;
-    }
-
-    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
-
-    if (env->cp15.c9_pmcr & PMCRD) {
-        /* Increment once every 64 processor clock cycles */
-        total_ticks /= 64;
-    }
-    return total_ticks - env->cp15.c15_ccnt;
+    uint64_t ret;
+    pmccntr_sync(env);
+    ret = env->cp15.c15_ccnt;
+    pmccntr_sync(env);
+    return ret;
 }
 
 static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -1038,22 +1029,8 @@ static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void pmccntr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                         uint64_t value)
 {
-    uint64_t total_ticks;
-
-    if (!arm_ccnt_enabled(env)) {
-        /* Counter is disabled, set the absolute value */
-        env->cp15.c15_ccnt = value;
-        return;
-    }
-
-    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
-
-    if (env->cp15.c9_pmcr & PMCRD) {
-        /* Increment once every 64 processor clock cycles */
-        total_ticks /= 64;
-    }
-    env->cp15.c15_ccnt = total_ticks - value;
+    env->cp15.c15_ccnt = value;
+    pmccntr_sync(env);
 }
 
 static void pmccntr_write32(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (2 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 03/13] target/arm: Reorganize PMCCNTR read, write, sync Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-17 13:41   ` Peter Maydell
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 05/13] target/arm: Allow AArch32 access for PMCCFILTR Aaron Lindsay
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

This is in preparation for enabling counters other than PMCCNTR

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index ecf8c55..54070a3 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -30,11 +30,6 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr, uint32_t *fsr,
                                ARMMMUFaultInfo *fi);
-
-/* Definitions for the PMCCNTR and PMCR registers */
-#define PMCRD   0x8
-#define PMCRC   0x4
-#define PMCRE   0x1
 #endif
 
 static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
@@ -876,6 +871,17 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
     REGINFO_SENTINEL
 };
 
+/* Definitions for the PMU registers */
+#define PMCRN   0xf800
+#define PMCRN_SHIFT 11
+#define PMCRD   0x8
+#define PMCRC   0x4
+#define PMCRE   0x1
+
+#define PMU_NUM_COUNTERS(env) ((env->cp15.c9_pmcr & PMCRN) >> PMCRN_SHIFT)
+/* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */
+#define PMU_COUNTER_MASK(env) ((1 << 31) | ((1 << PMU_NUM_COUNTERS(env)) - 1))
+
 static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                    bool isread)
 {
@@ -1060,14 +1066,14 @@ static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
-    value &= (1 << 31);
+    value &= (PMU_COUNTER_MASK(env) | (1 << 31));
     env->cp15.c9_pmcnten |= value;
 }
 
 static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    value &= (1 << 31);
+    value &= (PMU_COUNTER_MASK(env) | (1 << 31));
     env->cp15.c9_pmcnten &= ~value;
 }
 
@@ -1115,14 +1121,14 @@ static void pmintenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
     /* We have no event counters so only the C bit can be changed */
-    value &= (1 << 31);
+    value &= PMU_COUNTER_MASK(env);
     env->cp15.c9_pminten |= value;
 }
 
 static void pmintenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    value &= (1 << 31);
+    value &= PMU_COUNTER_MASK(env);
     env->cp15.c9_pminten &= ~value;
 }
 
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 05/13] target/arm: Allow AArch32 access for PMCCFILTR
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (3 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-17 13:52   ` Peter Maydell
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 06/13] target/arm: Filter cycle counter based on PMCCFILTR_EL0 Aaron Lindsay
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

Also fix the existing bitmask for writes.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 54070a3..fcc2fcf 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1059,10 +1059,25 @@ static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
     pmccntr_sync(env);
-    env->cp15.pmccfiltr_el0 = value & 0x7E000000;
+    env->cp15.pmccfiltr_el0 = value & 0xfc000000;
     pmccntr_sync(env);
 }
 
+static void pmccfiltr_write_a32(CPUARMState *env, const ARMCPRegInfo *ri,
+                            uint64_t value)
+{
+    pmccntr_sync(env);
+    env->cp15.pmccfiltr_el0 = (env->cp15.pmccfiltr_el0 & 0x04000000) |
+        (value & 0xf8000000); /* M is not visible in AArch32 */
+    pmccntr_sync(env);
+}
+
+static uint64_t pmccfiltr_read_a32(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    /* M is not visible in AArch32 */
+    return env->cp15.pmccfiltr_el0 & 0xf8000000;
+}
+
 static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
@@ -1280,6 +1295,12 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .type = ARM_CP_IO,
       .readfn = pmccntr_read, .writefn = pmccntr_write, },
 #endif
+    { .name = "PMCCFILTR", .cp = 15, .opc1 = 0, .crn = 14, .crm = 15, .opc2 = 7,
+      .writefn = pmccfiltr_write_a32, .readfn = pmccfiltr_read_a32,
+      .access = PL0_RW, .accessfn = pmreg_access,
+      .type = ARM_CP_ALIAS,
+      .fieldoffset = offsetoflow32(CPUARMState, cp15.pmccfiltr_el0),
+      .resetvalue = 0, },
     { .name = "PMCCFILTR_EL0", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 15, .opc2 = 7,
       .writefn = pmccfiltr_write,
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 06/13] target/arm: Filter cycle counter based on PMCCFILTR_EL0
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (4 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 05/13] target/arm: Allow AArch32 access for PMCCFILTR Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-17 14:57   ` Peter Maydell
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 07/13] target/arm: Implement PMOVSSET Aaron Lindsay
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

The pmu_counter_filtered and pmu_sync functions are generic (as opposed
to PMCCNTR-specific) to allow for the implementation of other events.

RFC: I know that many of the locations of the calls to pmu_sync are
problematic when icount is enabled because can_do_io will not be set.
The documentation says that for deterministic execution, IO must only be
performed by the last instruction of a thread block. Because
cpu_handle_interrupt() and cpu_handle_exception() are actually made
outside of a thread block, is it safe to set can_do_io=1 for them to
allow this to succeed? Is there a better mechanism for handling this?

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/cpu.c       |  4 +++
 target/arm/cpu.h       | 15 +++++++++++
 target/arm/helper.c    | 73 +++++++++++++++++++++++++++++++++++++++++++++++---
 target/arm/kvm64.c     |  2 ++
 target/arm/machine.c   |  2 ++
 target/arm/op_helper.c |  4 +++
 6 files changed, 97 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index a6c29cf..dfadaad 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -139,6 +139,8 @@ static void arm_cpu_reset(CPUState *s)
         env->iwmmxt.cregs[ARM_IWMMXT_wCID] = 0x69051000 | 'Q';
     }
 
+    pmu_sync(env); /* Surround writes to uncached_cpsr, pstate, and aarch64 */
+
     if (arm_feature(env, ARM_FEATURE_AARCH64)) {
         /* 64 bit CPUs always start in 64 bit mode */
         env->aarch64 = 1;
@@ -180,6 +182,8 @@ static void arm_cpu_reset(CPUState *s)
     env->uncached_cpsr = ARM_CPU_MODE_SVC;
     env->daif = PSTATE_D | PSTATE_A | PSTATE_I | PSTATE_F;
 
+    pmu_sync(env); /* Surround writes to uncached_cpsr, pstate, and aarch64 */
+
     if (arm_feature(env, ARM_FEATURE_M)) {
         uint32_t initial_msp; /* Loaded from 0x0 */
         uint32_t initial_pc; /* Loaded from 0x4 */
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 8afceca..811b1fe 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -822,6 +822,19 @@ int cpu_arm_signal_handler(int host_signum, void *pinfo,
  */
 void pmccntr_sync(CPUARMState *env);
 
+/**
+ * pmu_sync
+ * @env: CPUARMState
+ *
+ * Synchronises all PMU counters. This must always be called twice, once before
+ * any action that might affect the filtering of all counters and again
+ * afterwards. The function is used to swap the state of the registers if
+ * required. This only happens when not in user mode (!CONFIG_USER_ONLY). Any
+ * writes to env's aarch64, pstate, uncached_cpsr, cp15.scr_el3, or
+ * cp15.hcr_el2 must be protected by calls to this function.
+ */
+void pmu_sync(CPUARMState *env);
+
 /* SCTLR bit meanings. Several bits have been reused in newer
  * versions of the architecture; in that case we define constants
  * for both old and new bit meanings. Code which tests against those
@@ -1018,7 +1031,9 @@ static inline void pstate_write(CPUARMState *env, uint32_t val)
     env->CF = (val >> 29) & 1;
     env->VF = (val << 3) & 0x80000000;
     env->daif = val & PSTATE_DAIF;
+    pmu_sync(env);
     env->pstate = val & ~CACHED_PSTATE_BITS;
+    pmu_sync(env);
 }
 
 /* Return the current CPSR value.  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index fcc2fcf..74e90c5 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -878,6 +878,15 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
 #define PMCRC   0x4
 #define PMCRE   0x1
 
+#define PMXEVTYPER_P          0x80000000
+#define PMXEVTYPER_U          0x40000000
+#define PMXEVTYPER_NSK        0x20000000
+#define PMXEVTYPER_NSU        0x10000000
+#define PMXEVTYPER_NSH        0x08000000
+#define PMXEVTYPER_M          0x04000000
+#define PMXEVTYPER_MT         0x02000000
+#define PMXEVTYPER_EVTCOUNT   0x000003ff
+
 #define PMU_NUM_COUNTERS(env) ((env->cp15.c9_pmcr & PMCRN) >> PMCRN_SHIFT)
 /* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */
 #define PMU_COUNTER_MASK(env) ((1 << 31) | ((1 << PMU_NUM_COUNTERS(env)) - 1))
@@ -968,7 +977,7 @@ static CPAccessResult pmreg_access_ccntr(CPUARMState *env,
 
 static inline bool arm_ccnt_enabled(CPUARMState *env)
 {
-    /* This does not support checking PMCCFILTR_EL0 register */
+    /* Does not check PMCCFILTR_EL0, which is handled by pmu_counter_filtered */
 
     if (!(env->cp15.c9_pmcr & PMCRE) || !(env->cp15.c9_pmcnten & (1 << 31))) {
         return false;
@@ -977,6 +986,43 @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
     return true;
 }
 
+/* Returns true if the counter corresponding to the passed-in pmevtyper or
+ * pmccfiltr value is filtered using the current state */
+static inline bool pmu_counter_filtered(CPUARMState *env, uint64_t pmxevtyper)
+{
+    bool secure = arm_is_secure(env);
+    int el = arm_current_el(env);
+
+    bool P   = pmxevtyper & PMXEVTYPER_P;
+    bool U   = pmxevtyper & PMXEVTYPER_U;
+    bool NSK = pmxevtyper & PMXEVTYPER_NSK;
+    bool NSU = pmxevtyper & PMXEVTYPER_NSU;
+    bool NSH = pmxevtyper & PMXEVTYPER_NSH;
+    bool M   = pmxevtyper & PMXEVTYPER_M;
+
+    if (el == 1 && P) {
+        return true;
+    } else if (el == 0 && U) {
+        return true;
+    }
+
+    if (arm_feature(env, ARM_FEATURE_EL3)) {
+        if (el == 1 && !secure && NSK != P) {
+            return true;
+        } else if (el == 0 && !secure && NSU != U) {
+            return true;
+        } else if (el == 3 && secure && M != P) {
+            return true;
+        }
+    }
+
+    if (arm_feature(env, ARM_FEATURE_EL2) && el == 2 && !secure && !NSH) {
+        return true;
+    }
+
+    return false;
+}
+
 void pmccntr_sync(CPUARMState *env)
 {
     if (arm_ccnt_enabled(env) &&
@@ -995,10 +1041,15 @@ void pmccntr_sync(CPUARMState *env)
     }
 }
 
+void pmu_sync(CPUARMState *env)
+{
+    pmccntr_sync(env);
+}
+
 static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                        uint64_t value)
 {
-    pmccntr_sync(env);
+    pmu_sync(env);
 
     if (value & PMCRC) {
         /* The counter has been reset */
@@ -1009,7 +1060,7 @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     env->cp15.c9_pmcr &= ~0x39;
     env->cp15.c9_pmcr |= (value & 0x39);
 
-    pmccntr_sync(env);
+    pmu_sync(env);
 }
 
 static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -1053,6 +1104,10 @@ void pmccntr_sync(CPUARMState *env)
 {
 }
 
+void pmu_sync(CPUARMState *env)
+{
+}
+
 #endif
 
 static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -1184,7 +1239,9 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 
     /* Clear all-context RES0 bits.  */
     value &= valid_mask;
+    pmu_sync(env);
     raw_write(env, ri, value);
+    pmu_sync(env);
 }
 
 static uint64_t ccsidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -3736,7 +3793,9 @@ static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
     if ((raw_read(env, ri) ^ value) & (HCR_VM | HCR_PTW | HCR_DC)) {
         tlb_flush(CPU(cpu));
     }
+    pmu_sync(env);
     raw_write(env, ri, value);
+    pmu_sync(env);
 }
 
 static const ARMCPRegInfo el2_cp_reginfo[] = {
@@ -5815,7 +5874,9 @@ void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask,
         }
     }
     mask &= ~CACHED_CPSR_BITS;
+    pmu_sync(env);
     env->uncached_cpsr = (env->uncached_cpsr & ~mask) | (val & mask);
+    pmu_sync(env);
 }
 
 /* Sign/zero extend */
@@ -6864,6 +6925,8 @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
         addr += A32_BANKED_CURRENT_REG_GET(env, vbar);
     }
 
+    pmu_sync(env); /* Surrounds updates to scr_el3 and uncached_cpsr */
+
     if ((env->uncached_cpsr & CPSR_M) == ARM_CPU_MODE_MON) {
         env->cp15.scr_el3 &= ~SCR_NS;
     }
@@ -6891,6 +6954,8 @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
     }
     env->regs[14] = env->regs[15] + offset;
     env->regs[15] = addr;
+
+    pmu_sync(env); /* Surrounds updates to scr_el3 and uncached_cpsr */
 }
 
 /* Handle exception entry to a target EL which is using AArch64 */
@@ -6980,7 +7045,9 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
                   env->elr_el[new_el]);
 
     pstate_write(env, PSTATE_DAIF | new_mode);
+    pmu_sync(env);
     env->aarch64 = 1;
+    pmu_sync(env);
     aarch64_restore_sp(env, new_el);
 
     env->pc = addr;
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 6554c30..55a4e04 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -783,7 +783,9 @@ int kvm_arch_get_registers(CPUState *cs)
         return ret;
     }
 
+    pmu_sync(env);
     env->aarch64 = ((val & PSTATE_nRW) == 0);
+    pmu_sync(env);
     if (is_a64(env)) {
         pstate_write(env, val);
     } else {
diff --git a/target/arm/machine.c b/target/arm/machine.c
index 29df7ac..496b67c 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -325,7 +325,9 @@ static int get_cpsr(QEMUFile *f, void *opaque, size_t size,
         return 0;
     }
 
+    pmu_sync(env);
     env->aarch64 = ((val & PSTATE_nRW) == 0);
+    pmu_sync(env);
 
     if (is_a64(env)) {
         pstate_write(env, val);
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 6a60464..fa976a8 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -1063,7 +1063,9 @@ void HELPER(exception_return)(CPUARMState *env)
     }
 
     if (!return_to_aa64) {
+        pmu_sync(env);
         env->aarch64 = 0;
+        pmu_sync(env);
         /* We do a raw CPSR write because aarch64_sync_64_to_32()
          * will sort the register banks out for us, and we've already
          * caught all the bad-mode cases in el_from_spsr().
@@ -1083,7 +1085,9 @@ void HELPER(exception_return)(CPUARMState *env)
                       "AArch32 EL%d PC 0x%" PRIx32 "\n",
                       cur_el, new_el, env->regs[15]);
     } else {
+        pmu_sync(env);
         env->aarch64 = 1;
+        pmu_sync(env);
         pstate_write(env, spsr);
         if (!arm_singlestep_active(env)) {
             env->pstate &= ~PSTATE_SS;
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 07/13] target/arm: Implement PMOVSSET
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (5 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 06/13] target/arm: Filter cycle counter based on PMCCFILTR_EL0 Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-17 14:19   ` Peter Maydell
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 08/13] target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled Aaron Lindsay
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

Also modify it to be stored as a uint64_t

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/cpu.h    |  2 +-
 target/arm/helper.c | 27 ++++++++++++++++++++++++---
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 811b1fe..365a809 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -325,7 +325,7 @@ typedef struct CPUARMState {
         uint32_t c9_data;
         uint64_t c9_pmcr; /* performance monitor control register */
         uint64_t c9_pmcnten; /* perf monitor counter enables */
-        uint32_t c9_pmovsr; /* perf monitor overflow status */
+        uint64_t c9_pmovsr; /* perf monitor overflow status */
         uint32_t c9_pmuserenr; /* perf monitor user enable */
         uint64_t c9_pmselr; /* perf monitor counter selection register */
         uint64_t c9_pminten; /* perf monitor interrupt enables */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 74e90c5..3932ac0 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1150,9 +1150,17 @@ static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void pmovsr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                          uint64_t value)
 {
+    value &= PMU_COUNTER_MASK(env);
     env->cp15.c9_pmovsr &= ~value;
 }
 
+static void pmovsset_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                         uint64_t value)
+{
+    value &= PMU_COUNTER_MASK(env);
+    env->cp15.c9_pmovsr |= value;
+}
+
 static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
@@ -1317,10 +1325,10 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcnten),
       .writefn = pmcntenclr_write },
     { .name = "PMOVSR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 3,
-      .access = PL0_RW, .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
-      .accessfn = pmreg_access,
+      .access = PL0_RW, .accessfn = pmreg_access,
+      .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),
       .writefn = pmovsr_write,
-      .raw_writefn = raw_write },
+      .raw_writefn = raw_write, .resetvalue = 0 },
     { .name = "PMOVSCLR_EL0", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 3,
       .access = PL0_RW, .accessfn = pmreg_access,
@@ -1328,6 +1336,19 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
       .writefn = pmovsr_write,
       .raw_writefn = raw_write },
+    { .name = "PMOVSSET", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 3,
+      .access = PL0_RW, .accessfn = pmreg_access,
+      .type = ARM_CP_ALIAS,
+      .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),
+      .writefn = pmovsset_write,
+      .raw_writefn = raw_write },
+    { .name = "PMOVSSET_EL0", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 14, .opc2 = 3,
+      .access = PL0_RW, .accessfn = pmreg_access,
+      .type = ARM_CP_ALIAS,
+      .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
+      .writefn = pmovsset_write,
+      .raw_writefn = raw_write },
     /* Unimplemented so WI. */
     { .name = "PMSWINC", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 4,
       .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NOP },
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 08/13] target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (6 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 07/13] target/arm: Implement PMOVSSET Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-17 14:04   ` Peter Maydell
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 09/13] target/arm: Add array for supported PMU events, generate PMCEID[01] Aaron Lindsay
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3932ac0..c0ef367 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -975,17 +975,22 @@ static CPAccessResult pmreg_access_ccntr(CPUARMState *env,
     return pmreg_access(env, ri, isread);
 }
 
-static inline bool arm_ccnt_enabled(CPUARMState *env)
+static inline bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
 {
     /* Does not check PMCCFILTR_EL0, which is handled by pmu_counter_filtered */
-
-    if (!(env->cp15.c9_pmcr & PMCRE) || !(env->cp15.c9_pmcnten & (1 << 31))) {
+    if (!(env->cp15.c9_pmcr & PMCRE) ||
+            !(env->cp15.c9_pmcnten & (1 << counter))) {
         return false;
     }
 
     return true;
 }
 
+static inline bool arm_ccnt_enabled(CPUARMState *env)
+{
+    return pmu_counter_enabled(env, 31);
+}
+
 /* Returns true if the counter corresponding to the passed-in pmevtyper or
  * pmccfiltr value is filtered using the current state */
 static inline bool pmu_counter_filtered(CPUARMState *env, uint64_t pmxevtyper)
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 09/13] target/arm: Add array for supported PMU events, generate PMCEID[01]
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (7 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 08/13] target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 10/13] target/arm: Finish implementation of PM[X]EVCNTR and PM[X]EVTYPER Aaron Lindsay
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

This commit doesn't add any supported events, but provides the framework
for adding them. We store the pm_event structs in a simple array, and
provide the mapping from the event numbers to array indexes in
the supported_event_map array.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/cpu.c    |  4 ++++
 target/arm/cpu.h    | 10 ++++++++++
 target/arm/helper.c | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index dfadaad..da14864 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -830,6 +830,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     if (!cpu->has_pmu) {
         unset_feature(env, ARM_FEATURE_PMU);
         cpu->id_aa64dfr0 &= ~0xf00;
+    } else {
+        uint64_t pmceid = get_pmceid(&cpu->env);
+        cpu->pmceid0 = pmceid & 0xffffffff;
+        cpu->pmceid1 = (pmceid >> 32) & 0xffffffff;
     }
 
     if (!arm_feature(env, ARM_FEATURE_EL2)) {
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 365a809..9feb45a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -835,6 +835,16 @@ void pmccntr_sync(CPUARMState *env);
  */
 void pmu_sync(CPUARMState *env);
 
+/*
+ * get_pmceid
+ * @env: CPUARMState
+ *
+ * Return the PMCEID[01] register values corresponding to the counters which
+ * are supported given the current configuration (0 is low 32, 1 is high 32
+ * bits)
+ */
+uint64_t get_pmceid(CPUARMState *env);
+
 /* SCTLR bit meanings. Several bits have been reused in newer
  * versions of the architecture; in that case we define constants
  * for both old and new bit meanings. Code which tests against those
diff --git a/target/arm/helper.c b/target/arm/helper.c
index c0ef367..a56e7b4 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -891,6 +891,43 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
 /* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */
 #define PMU_COUNTER_MASK(env) ((1 << 31) | ((1 << PMU_NUM_COUNTERS(env)) - 1))
 
+typedef struct pm_event {
+    uint16_t number; /* PMEVTYPER.evtCount is 10 bits wide */
+    /* If the event is supported on this CPU (used to generate PMCEID[01]) */
+    bool (*supported)(CPUARMState *);
+    /* Retrieve the current count of the underlying event. The programmed
+     * counters hold a difference from the return value from this function */
+    uint64_t (*get_count)(CPUARMState *);
+} pm_event;
+
+#define SUPPORTED_EVENT_SENTINEL UINT16_MAX
+static const pm_event pm_events[] = {
+    { .number = SUPPORTED_EVENT_SENTINEL }
+};
+static uint16_t supported_event_map[0x3f];
+
+/*
+ * Called upon initialization to build PMCEID0 (low 32 bits) and PMCEID1 (high
+ * 32). We also use it to build a map of ARM event numbers to indices in
+ * our pm_events array.
+ */
+uint64_t get_pmceid(CPUARMState *env)
+{
+    uint64_t pmceid = 0;
+    unsigned int i = 0;
+    while (pm_events[i].number != SUPPORTED_EVENT_SENTINEL) {
+        const pm_event *cnt = &pm_events[i];
+        if (cnt->number < 0x3f && cnt->supported(env)) {
+            pmceid |= (1 << cnt->number);
+            supported_event_map[cnt->number] = i;
+        } else {
+            supported_event_map[cnt->number] = SUPPORTED_EVENT_SENTINEL;
+        }
+        i++;
+    }
+    return pmceid;
+}
+
 static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                    bool isread)
 {
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 10/13] target/arm: Finish implementation of PM[X]EVCNTR and PM[X]EVTYPER
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (8 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 09/13] target/arm: Add array for supported PMU events, generate PMCEID[01] Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 11/13] target/arm: PMU: Add instruction and cycle events Aaron Lindsay
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

Add arrays to hold the registers, the definitions themselves, access
functions, and add logic to reset counters when PMCR.P is set.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/cpu.h    |   7 +-
 target/arm/helper.c | 187 ++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 179 insertions(+), 15 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9feb45a..d816e0a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -412,10 +412,13 @@ typedef struct CPUARMState {
         uint64_t oslsr_el1; /* OS Lock Status */
         uint64_t mdcr_el2;
         uint64_t mdcr_el3;
-        /* If the counter is enabled, this stores the last time the counter
-         * was reset. Otherwise it stores the counter value
+        /* If the pmccntr and pmevcntr counters are enabled, they store the
+         * offset the last time the counter was reset. Otherwise they store the
+         * counter value.
          */
         uint64_t c15_ccnt;
+        uint64_t c14_pmevcntr[31];
+        uint64_t c14_pmevtyper[31];
         uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */
         uint64_t vpidr_el2; /* Virtualization Processor ID Register */
         uint64_t vmpidr_el2; /* Virtualization Multiprocessor ID Register */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index a56e7b4..f183199 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -876,6 +876,7 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
 #define PMCRN_SHIFT 11
 #define PMCRD   0x8
 #define PMCRC   0x4
+#define PMCRP   0x2
 #define PMCRE   0x1
 
 #define PMXEVTYPER_P          0x80000000
@@ -1020,6 +1021,21 @@ static inline bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
         return false;
     }
 
+    if (counter != 31) {
+        /* If not checking PMCCNTR, ensure the counter is setup to an event we
+         * support */
+        uint16_t event = env->cp15.c14_pmevtyper[counter] & PMXEVTYPER_EVTCOUNT;
+        if (event > 0x3f) {
+            return false; /* We only support common architectural and
+                             microarchitectural events */
+        }
+
+        uint16_t event_idx = supported_event_map[event];
+        if (event_idx == SUPPORTED_EVENT_SENTINEL) {
+            return false;
+        }
+    }
+
     return true;
 }
 
@@ -1083,8 +1099,26 @@ void pmccntr_sync(CPUARMState *env)
     }
 }
 
+static void pmu_sync_counter(CPUARMState *env, uint8_t counter)
+{
+    if (pmu_counter_enabled(env, counter) &&
+        !pmu_counter_filtered(env, env->cp15.c14_pmevtyper[counter])) {
+
+        uint16_t event = env->cp15.c14_pmevtyper[counter] & PMXEVTYPER_EVTCOUNT;
+        uint16_t event_idx = supported_event_map[event];
+
+        uint64_t count = pm_events[event_idx].get_count(env);
+        env->cp15.c14_pmevcntr[counter] =
+            count - env->cp15.c14_pmevcntr[counter];
+    }
+}
+
 void pmu_sync(CPUARMState *env)
 {
+    unsigned int i;
+    for (i = 0; i < PMU_NUM_COUNTERS(env); i++) {
+        pmu_sync_counter(env, i);
+    }
     pmccntr_sync(env);
 }
 
@@ -1098,6 +1132,13 @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
         env->cp15.c15_ccnt = 0;
     }
 
+    if (value & PMCRP) {
+        unsigned int i;
+        for (i = 0; i < PMU_NUM_COUNTERS(env); i++) {
+            env->cp15.c14_pmevcntr[i] = 0;
+        }
+    }
+
     /* only the DP, X, D and E bits are writable */
     env->cp15.c9_pmcr &= ~0x39;
     env->cp15.c9_pmcr |= (value & 0x39);
@@ -1203,30 +1244,112 @@ static void pmovsset_write(CPUARMState *env, const ARMCPRegInfo *ri,
     env->cp15.c9_pmovsr |= value;
 }
 
-static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                             uint64_t value)
+static void pmevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value, const uint8_t counter)
 {
+    if (counter == 0x1f) {
+        pmccfiltr_write(env, ri, value);
+    } else if (counter < PMU_NUM_COUNTERS(env)) {
+        pmu_sync_counter(env, counter);
+        env->cp15.c14_pmevtyper[counter] = value & 0xfe0003ff;
+        pmu_sync_counter(env, counter);
+    }
     /* Attempts to access PMXEVTYPER are CONSTRAINED UNPREDICTABLE when
      * PMSELR value is equal to or greater than the number of implemented
      * counters, but not equal to 0x1f. We opt to behave as a RAZ/WI.
      */
-    if (env->cp15.c9_pmselr == 0x1f) {
-        pmccfiltr_write(env, ri, value);
+}
+
+static uint64_t pmevtyper_read(CPUARMState *env, const ARMCPRegInfo *ri,
+                               const uint8_t counter)
+{
+    if (counter == 0x1f) {
+        return env->cp15.pmccfiltr_el0;
+    } else if (counter < PMU_NUM_COUNTERS(env)) {
+        return env->cp15.c14_pmevtyper[counter];
+    } else {
+      /* We opt to behave as a RAZ/WI when attempts to access PMXEVTYPER
+       * are CONSTRAINED UNPREDICTABLE. See comments in pmevtyper_write().
+       */
+        return 0;
     }
 }
 
+static void pmevtyper_writefn(CPUARMState *env, const ARMCPRegInfo *ri,
+                              uint64_t value)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    pmevtyper_write(env, ri, value, counter);
+}
+
+static uint64_t pmevtyper_readfn(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    return pmevtyper_read(env, ri, counter);
+}
+
+static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value)
+{
+    pmevtyper_write(env, ri, value, env->cp15.c9_pmselr & 31);
+}
+
 static uint64_t pmxevtyper_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
-    /* We opt to behave as a RAZ/WI when attempts to access PMXEVTYPER
-     * are CONSTRAINED UNPREDICTABLE. See comments in pmxevtyper_write().
-     */
-    if (env->cp15.c9_pmselr == 0x1f) {
-        return env->cp15.pmccfiltr_el0;
+    return pmevtyper_read(env, ri, env->cp15.c9_pmselr & 31);
+}
+
+static void pmevcntr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value, uint8_t counter)
+{
+    if (counter < PMU_NUM_COUNTERS(env)) {
+        env->cp15.c14_pmevcntr[counter] = value;
+        pmu_sync_counter(env, counter);
+    }
+    /* We opt to behave as a RAZ/WI when attempts to access PM[X]EVCNTR
+     * are CONSTRAINED UNPREDICTABLE. */
+}
+
+static uint64_t pmevcntr_read(CPUARMState *env, const ARMCPRegInfo *ri,
+                              uint8_t counter)
+{
+    if (counter < PMU_NUM_COUNTERS(env)) {
+        uint64_t ret;
+        pmu_sync_counter(env, counter);
+        ret = env->cp15.c14_pmevcntr[counter];
+        pmu_sync_counter(env, counter);
+        return ret;
     } else {
+      /* We opt to behave as a RAZ/WI when attempts to access PM[X]EVCNTR
+       * are CONSTRAINED UNPREDICTABLE. */
         return 0;
     }
 }
 
+static void pmevcntr_writefn(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    pmevcntr_write(env, ri, value, counter);
+}
+
+static uint64_t pmevcntr_readfn(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    uint8_t counter = ((ri->crm & 3) << 3) | (ri->opc2 & 7);
+    return pmevcntr_read(env, ri, counter);
+}
+
+static void pmxevcntr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value)
+{
+    pmevcntr_write(env, ri, value, env->cp15.c9_pmselr & 31);
+}
+
+static uint64_t pmxevcntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    return pmevcntr_read(env, ri, env->cp15.c9_pmselr & 31);
+}
+
 static void pmuserenr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
@@ -1435,10 +1558,13 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 1,
       .access = PL0_RW, .type = ARM_CP_NO_RAW, .accessfn = pmreg_access,
       .writefn = pmxevtyper_write, .readfn = pmxevtyper_read },
-    /* Unimplemented, RAZ/WI. */
     { .name = "PMXEVCNTR", .cp = 15, .crn = 9, .crm = 13, .opc1 = 0, .opc2 = 2,
-      .access = PL0_RW, .type = ARM_CP_CONST, .resetvalue = 0,
-      .accessfn = pmreg_access_xevcntr },
+      .access = PL0_RW, .type = ARM_CP_NO_RAW, .accessfn = pmreg_access_xevcntr,
+      .writefn = pmxevcntr_write, .readfn = pmxevcntr_read },
+    { .name = "PMXEVCNTR_EL0", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 13, .opc2 = 2,
+      .access = PL0_RW, .type = ARM_CP_NO_RAW, .accessfn = pmreg_access_xevcntr,
+      .writefn = pmxevcntr_write, .readfn = pmxevcntr_read },
     { .name = "PMUSERENR", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 0,
       .access = PL0_R | PL1_RW, .accessfn = access_tpm,
       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmuserenr),
@@ -4073,7 +4199,7 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
 #endif
     /* The only field of MDCR_EL2 that has a defined architectural reset value
      * is MDCR_EL2.HPMN which should reset to the value of PMCR_EL0.N; but we
-     * don't impelment any PMU event counters, so using zero as a reset
+     * don't implement any PMU event counters, so using zero as a reset
      * value for MDCR_EL2 is okay
      */
     { .name = "MDCR_EL2", .state = ARM_CP_STATE_BOTH,
@@ -4746,6 +4872,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, v7mp_cp_reginfo);
     }
     if (arm_feature(env, ARM_FEATURE_V7)) {
+        unsigned int i;
         /* v7 performance monitor control register: same implementor
          * field as main ID register, and we implement only the cycle
          * count register.
@@ -4770,6 +4897,40 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         };
         define_one_arm_cp_reg(cpu, &pmcr);
         define_one_arm_cp_reg(cpu, &pmcr64);
+        for (i = 0; i < 31; i++) {
+            char *pmevcntr_name = g_strdup_printf("PMEVCNTR%d", i);
+            char *pmevcntr_el0_name = g_strdup_printf("PMEVCNTR%d_EL0", i);
+            char *pmevtyper_name = g_strdup_printf("PMEVTYPER%d", i);
+            char *pmevtyper_el0_name = g_strdup_printf("PMEVTYPER%d_EL0", i);
+            ARMCPRegInfo pmev_regs[] = {
+                { .name = pmevcntr_name, .cp = 15, .crn = 15,
+                  .crm = 8 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
+                  .access = PL0_RW, .type = ARM_CP_NO_RAW,
+                  .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn,
+                  .accessfn = pmreg_access },
+                { .name = pmevcntr_el0_name, .state = ARM_CP_STATE_AA64,
+                  .opc0 = 3, .opc1 = 3, .crn = 15, .crm = 8 | (3 & (i >> 3)),
+                  .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
+                  .type = ARM_CP_NO_RAW,
+                  .readfn = pmevcntr_readfn, .writefn = pmevcntr_writefn },
+                { .name = pmevtyper_name, .cp = 15, .crn = 15,
+                  .crm = 12 | (3 & (i >> 3)), .opc1 = 0, .opc2 = i & 7,
+                  .access = PL0_RW, .type = ARM_CP_NO_RAW,
+                  .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
+                  .accessfn = pmreg_access },
+                { .name = pmevtyper_el0_name, .state = ARM_CP_STATE_AA64,
+                  .opc0 = 3, .opc1 = 3, .crn = 15, .crm = 12 | (3 & (i >> 3)),
+                  .opc2 = i & 7, .access = PL0_RW, .accessfn = pmreg_access,
+                  .type = ARM_CP_NO_RAW,
+                  .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn },
+                REGINFO_SENTINEL
+            };
+            define_arm_cp_regs(cpu, pmev_regs);
+            g_free(pmevcntr_name);
+            g_free(pmevcntr_el0_name);
+            g_free(pmevtyper_name);
+            g_free(pmevtyper_el0_name);
+        }
 #endif
         ARMCPRegInfo clidr = {
             .name = "CLIDR", .state = ARM_CP_STATE_BOTH,
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 11/13] target/arm: PMU: Add instruction and cycle events
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (9 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 10/13] target/arm: Finish implementation of PM[X]EVCNTR and PM[X]EVTYPER Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 12/13] target/arm: PMU: Set PMCR.N to 4 Aaron Lindsay
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

The instruction event is only enabled when icount is used, cycles are
always supported.

Note: Setting can_do_io=1 should not be done here. It is ugly and wrong,
but I am not sure of the proper way to handle this (See 'target/arm:
Filter cycle counter based on PMCCFILTR_EL0')

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 75 ++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 54 insertions(+), 21 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index f183199..e48bb67 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -14,6 +14,7 @@
 #include "arm_ldst.h"
 #include <zlib.h> /* For crc32 */
 #include "exec/semihost.h"
+#include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
 
 #define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
@@ -901,8 +902,57 @@ typedef struct pm_event {
     uint64_t (*get_count)(CPUARMState *);
 } pm_event;
 
+static bool event_always_supported(CPUARMState *env)
+{
+    return true;
+}
+
+#ifndef CONFIG_USER_ONLY
+static uint64_t cycles_get_count(CPUARMState *env)
+{
+    uint64_t ret;
+    CPUState *cpu = ENV_GET_CPU(env);
+    uint32_t saved_can_do_io = cpu->can_do_io;
+    cpu->can_do_io = 1;
+
+    ret = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
+                   ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
+
+    cpu->can_do_io = saved_can_do_io;
+    return ret;
+}
+
+static bool instructions_supported(CPUARMState *env)
+{
+    return use_icount == 1 /* Precise instruction counting */;
+}
+
+static uint64_t instructions_get_count(CPUARMState *env)
+{
+    uint64_t ret;
+    CPUState *cpu = ENV_GET_CPU(env);
+    uint32_t saved_can_do_io = cpu->can_do_io;
+    cpu->can_do_io = 1;
+
+    ret = (uint64_t)cpu_get_icount_raw();
+
+    cpu->can_do_io = saved_can_do_io;
+    return ret;
+}
+#endif
+
 #define SUPPORTED_EVENT_SENTINEL UINT16_MAX
 static const pm_event pm_events[] = {
+#ifndef CONFIG_USER_ONLY
+    { .number = 0x008, /* INST_RETIRED */
+      .supported = instructions_supported,
+      .get_count = instructions_get_count
+    },
+    { .number = 0x011, /* CPU_CYCLES */
+      .supported = event_always_supported,
+      .get_count = cycles_get_count
+    },
+#endif
     { .number = SUPPORTED_EVENT_SENTINEL }
 };
 static uint16_t supported_event_map[0x3f];
@@ -982,8 +1032,6 @@ static CPAccessResult pmreg_access_swinc(CPUARMState *env,
     return pmreg_access(env, ri, isread);
 }
 
-#ifndef CONFIG_USER_ONLY
-
 static CPAccessResult pmreg_access_selr(CPUARMState *env,
                                         const ARMCPRegInfo *ri,
                                         bool isread)
@@ -1085,10 +1133,11 @@ void pmccntr_sync(CPUARMState *env)
 {
     if (arm_ccnt_enabled(env) &&
           !pmu_counter_filtered(env, env->cp15.pmccfiltr_el0)) {
-        uint64_t temp_ticks;
+        uint64_t temp_ticks = 0;
 
-        temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-                              ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
+#ifndef CONFIG_USER_ONLY
+        temp_ticks = cycles_get_count(env);
+#endif
 
         if (env->cp15.c9_pmcr & PMCRD) {
             /* Increment once every 64 processor clock cycles */
@@ -1181,18 +1230,6 @@ static void pmccntr_write32(CPUARMState *env, const ARMCPRegInfo *ri,
     pmccntr_write(env, ri, deposit64(cur_val, 0, 32, value));
 }
 
-#else /* CONFIG_USER_ONLY */
-
-void pmccntr_sync(CPUARMState *env)
-{
-}
-
-void pmu_sync(CPUARMState *env)
-{
-}
-
-#endif
-
 static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
@@ -1517,7 +1554,6 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
     /* Unimplemented so WI. */
     { .name = "PMSWINC", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 4,
       .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NOP },
-#ifndef CONFIG_USER_ONLY
     { .name = "PMSELR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 5,
       .access = PL0_RW, .type = ARM_CP_ALIAS,
       .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmselr),
@@ -1537,7 +1573,6 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .access = PL0_RW, .accessfn = pmreg_access_ccntr,
       .type = ARM_CP_IO,
       .readfn = pmccntr_read, .writefn = pmccntr_write, },
-#endif
     { .name = "PMCCFILTR", .cp = 15, .opc1 = 0, .crn = 14, .crm = 15, .opc2 = 7,
       .writefn = pmccfiltr_write_a32, .readfn = pmccfiltr_read_a32,
       .access = PL0_RW, .accessfn = pmreg_access,
@@ -4877,7 +4912,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
          * field as main ID register, and we implement only the cycle
          * count register.
          */
-#ifndef CONFIG_USER_ONLY
         ARMCPRegInfo pmcr = {
             .name = "PMCR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 0,
             .access = PL0_RW,
@@ -4931,7 +4965,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
             g_free(pmevtyper_name);
             g_free(pmevtyper_el0_name);
         }
-#endif
         ARMCPRegInfo clidr = {
             .name = "CLIDR", .state = ARM_CP_STATE_BOTH,
             .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 1,
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 12/13] target/arm: PMU: Set PMCR.N to 4
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (10 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 11/13] target/arm: PMU: Add instruction and cycle events Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 13/13] target/arm: Implement PMSWINC Aaron Lindsay
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

This both advertises that we support four counters and adds them to the
implementation because the PMU_NUM_COUNTERS macro reads this value from
the PMCR.

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index e48bb67..659b2b8 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4926,7 +4926,8 @@ void register_cp_regs_for_features(ARMCPU *cpu)
             .access = PL0_RW, .accessfn = pmreg_access,
             .type = ARM_CP_IO,
             .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcr),
-            .resetvalue = cpu->midr & 0xff000000,
+            /* 4 counters enabled */
+            .resetvalue = (cpu->midr & 0xff000000) | (0x4 << PMCRN_SHIFT),
             .writefn = pmcr_write, .raw_writefn = raw_write,
         };
         define_one_arm_cp_reg(cpu, &pmcr);
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 13/13] target/arm: Implement PMSWINC
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (11 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 12/13] target/arm: PMU: Set PMCR.N to 4 Aaron Lindsay
@ 2017-09-30  2:08 ` Aaron Lindsay
  2017-10-09 14:46 ` [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
  2017-10-17 15:09 ` Peter Maydell
  14 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-09-30  2:08 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Peter Crosthwaite, Wei Huang
  Cc: qemu-devel, Michael Spradling, Aaron Lindsay, Digant Desai

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 659b2b8..c688e56 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -907,6 +907,15 @@ static bool event_always_supported(CPUARMState *env)
     return true;
 }
 
+static uint64_t swinc_get_count(CPUARMState *env)
+{
+    /*
+     * SW_INCR events are written directly to the pmevcntr's by writes to
+     * PMSWINC, so don't do anything here...
+     */
+    return 0;
+}
+
 #ifndef CONFIG_USER_ONLY
 static uint64_t cycles_get_count(CPUARMState *env)
 {
@@ -943,6 +952,10 @@ static uint64_t instructions_get_count(CPUARMState *env)
 
 #define SUPPORTED_EVENT_SENTINEL UINT16_MAX
 static const pm_event pm_events[] = {
+    { .number = 0x000, /* SW_INCR */
+      .supported = event_always_supported,
+      .get_count = swinc_get_count
+    },
 #ifndef CONFIG_USER_ONLY
     { .number = 0x008, /* INST_RETIRED */
       .supported = instructions_supported,
@@ -1195,6 +1208,25 @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     pmu_sync(env);
 }
 
+static void pmswinc_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                          uint64_t value)
+{
+    unsigned int i;
+    for (i = 0; i < PMU_NUM_COUNTERS(env); i++) {
+        /* Increment a counter's count iff: */
+        if ((value & (1 << i)) && /* counter's bit is set */
+                /* counter is enabled and not filtered */
+                pmu_counter_enabled(env, i) &&
+                !pmu_counter_filtered(env, env->cp15.c14_pmevtyper[i]) &&
+                /* counter is SW_INCR */
+                (env->cp15.c14_pmevtyper[i] & PMXEVTYPER_EVTCOUNT) == 0x0) {
+            pmu_sync_counter(env, i);
+            env->cp15.c14_pmevcntr[i]++;
+            pmu_sync_counter(env, i);
+        }
+    }
+}
+
 static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
     uint64_t ret;
@@ -1551,9 +1583,13 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
       .writefn = pmovsset_write,
       .raw_writefn = raw_write },
-    /* Unimplemented so WI. */
     { .name = "PMSWINC", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 4,
-      .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NOP },
+      .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NO_RAW,
+      .writefn = pmswinc_write },
+    { .name = "PMSWINC_EL0", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 4,
+      .access = PL0_W, .accessfn = pmreg_access_swinc, .type = ARM_CP_NO_RAW,
+      .writefn = pmswinc_write },
     { .name = "PMSELR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 5,
       .access = PL0_RW, .type = ARM_CP_ALIAS,
       .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmselr),
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (12 preceding siblings ...)
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 13/13] target/arm: Implement PMSWINC Aaron Lindsay
@ 2017-10-09 14:46 ` Aaron Lindsay
  2017-10-09 18:27   ` Peter Maydell
  2017-10-17 15:09 ` Peter Maydell
  14 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-09 14:46 UTC (permalink / raw)
  To: qemu-arm, Peter Maydell, Alistair Francis, Wei Huang, Peter Crosthwaite
  Cc: qemu-devel, Michael Spradling, Digant Desai

Ping!

Unfortunately I'm not sure who to add other than the current recipients,
but I'm eager for feedback and would love to work this into something
that will allow for using the full ARM PMU. 

I've also updated Peter Crosthwaite's email since the xilinx one appears
to be stale.

-Aaron

On Sep 29 22:08, Aaron Lindsay wrote:
> The ARM PMU implementation currently contains a basic cycle counter, but it is
> often useful to gather counts of other events and filter them based on
> execution mode. These patches flesh out the implementations of various PMU
> registers including PM[X]EVCNTR and PM[X]EVTYPER, add a struct definition to
> represent arbitrary counter types, implement mode filtering, and add
> instruction, cycle, and software increment events.
> 
> I am particularly interested in feedback on the following two patches because I
> think I'm likely Doing It Wrong:
>  [1] target/arm: Filter cycle counter based on PMCCFILTR_EL0
>  [2] target/arm: PMU: Add instruction and cycle events
> 
> In order to implement mode filtering in an event-driven way, [1] adds a pair of
> calls to pmu_sync() surrounding every update to a register/variable which may
> affect whether any counter is currently filtered. These pmu_sync() calls
> ultimately call cpu_get_icount_raw() for enabled instruction and cycle counters
> when using icount. Unfortunately, cpu->can_do_io may otherwise be zero for
> these calls so the current implementation in [2] temporarily sets can_do_io to
> 1. I haven't see any ill side effects from this in my testing, but it doesn't
> seem like the right way to handle this.
> 
> I would like to eventually add sending interrupts on counter overflow.
> Suggestions for the best direction to handle this are most welcome.  
> 
> Thanks for any feedback,
> Aaron
> 
> Aaron Lindsay (13):
>   target/arm: A53: Initialize PMCEID[0]
>   target/arm: Check PMCNTEN for whether PMCCNTR is enabled
>   target/arm: Reorganize PMCCNTR read, write, sync
>   target/arm: Mask PMU register writes based on PMCR_EL0.N
>   target/arm: Allow AArch32 access for PMCCFILTR
>   target/arm: Filter cycle counter based on PMCCFILTR_EL0
>   target/arm: Implement PMOVSSET
>   target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled
>   target/arm: Add array for supported PMU events, generate PMCEID[01]
>   target/arm: Finish implementation of PM[X]EVCNTR and PM[X]EVTYPER
>   target/arm: PMU: Add instruction and cycle events
>   target/arm: PMU: Set PMCR.N to 4
>   target/arm: Implement PMSWINC
> 
>  target/arm/cpu.c       |  10 +-
>  target/arm/cpu.h       |  34 +++-
>  target/arm/cpu64.c     |   2 +
>  target/arm/helper.c    | 535 +++++++++++++++++++++++++++++++++++++++++--------
>  target/arm/kvm64.c     |   2 +
>  target/arm/machine.c   |   2 +
>  target/arm/op_helper.c |   4 +
>  7 files changed, 500 insertions(+), 89 deletions(-)
> 
> -- 
> Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.
> 

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3
  2017-10-09 14:46 ` [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
@ 2017-10-09 18:27   ` Peter Maydell
  2017-10-09 20:25     ` Aaron Lindsay
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2017-10-09 18:27 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Wei Huang, Peter Crosthwaite,
	QEMU Developers, Michael Spradling, Digant Desai

On 9 October 2017 at 15:46, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> Unfortunately I'm not sure who to add other than the current recipients,
> but I'm eager for feedback and would love to work this into something
> that will allow for using the full ARM PMU.

Hi -- I do have this on my review queue, but unfortunately it's
sitting behind some other fairly chunky hard-to-review patchsets.

As a first quick "is this going in the right direction" review based
pretty much only on the cover letter:

What extra events do you want to try to support in the emulated PMU?
Part of the reason we only support the cycle counter is because
 (1) a lot of the events in a real PMU would be hard to support
 (2) it's not clear to me that exposing events to the guest would be
     very useful to it anyway -- performance profiling of guest code
     running under emulation is fraught with difficulty

Giving more of an idea of what your use case is would help in
evaluating these patches.

Some of what you're doing looks like it's fixing bugs in our current
implementation, which is definitely fine in principle.

I haven't looked at the icount related stuff (and I can never remember
how it works either) but fiddling with can_do_io does sound like it's not
the right approach...

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3
  2017-10-09 18:27   ` Peter Maydell
@ 2017-10-09 20:25     ` Aaron Lindsay
  0 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-09 20:25 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Alistair Francis, Wei Huang, Peter Crosthwaite,
	QEMU Developers, Michael Spradling, Digant Desai

On Oct 09 19:27, Peter Maydell wrote:
> On 9 October 2017 at 15:46, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> > Unfortunately I'm not sure who to add other than the current recipients,
> > but I'm eager for feedback and would love to work this into something
> > that will allow for using the full ARM PMU.
> 
> Hi -- I do have this on my review queue, but unfortunately it's
> sitting behind some other fairly chunky hard-to-review patchsets.

No problem - I'll wait my turn.

> As a first quick "is this going in the right direction" review based
> pretty much only on the cover letter:
> 
> What extra events do you want to try to support in the emulated PMU?
> Part of the reason we only support the cycle counter is because
>  (1) a lot of the events in a real PMU would be hard to support
>  (2) it's not clear to me that exposing events to the guest would be
>      very useful to it anyway -- performance profiling of guest code
>      running under emulation is fraught with difficulty
> 
> Giving more of an idea of what your use case is would help in
> evaluating these patches.

My goal isn't to expose events concerned with microarchitectural
performance, but rather those that can help characterize architectural
behavior (initially instructions and maybe branches, but perhaps
anything in ARM ARM D5.10.4: "Common architectural event numbers"). We
use a number of platforms at different points along the accuracy/speed
trade-off continuum, and it is convenient to use self-hosted tools that
we can expect to work on all of them.

For instance, implementing the instruction counter allows us to stop
processes after executing prescribed numbers of instructions and resume
them on other platforms for further study (using CRIU). It can also be
useful to get a `perf` profile based on instruction count to get a rough
idea of where an application is spending its time.

> Some of what you're doing looks like it's fixing bugs in our current
> implementation, which is definitely fine in principle.

Yes, I believe patches 1-5 are fairly straightforward fixes, with the
later patches getting into the more invasive changes.

> I haven't looked at the icount related stuff (and I can never remember
> how it works either) but fiddling with can_do_io does sound like it's not
> the right approach...

Agreed. Perhaps part of the same offense as the pmu_sync() calls
scattered around - I was unable to find a better way to drive the mode
filtering, and am more than glad to pursue a different approach if
pointed in the right direction.

-Aaron

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled Aaron Lindsay
@ 2017-10-17 12:49   ` Peter Maydell
  2017-10-17 14:59     ` Aaron Lindsay
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 12:49 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> ---
>  target/arm/helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 8be78ea..40c9273 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -964,7 +964,7 @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
>  {
>      /* This does not support checking PMCCFILTR_EL0 register */
>
> -    if (!(env->cp15.c9_pmcr & PMCRE)) {
> +    if (!(env->cp15.c9_pmcr & PMCRE) || !(env->cp15.c9_pmcnten & (1 << 31))) {
>          return false;
>      }
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

I keep replying to your v1 patchset by mistake -- can you use
the git format-patch options to put make the subject prefix
be "[PATCH v3]" for the next round in all the patchmails to
help keep the versions distinct, please?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 03/13] target/arm: Reorganize PMCCNTR read, write, sync
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 03/13] target/arm: Reorganize PMCCNTR read, write, sync Aaron Lindsay
@ 2017-10-17 13:25   ` Peter Maydell
  2017-10-17 15:26     ` Aaron Lindsay
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 13:25 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> pmccntr_read and pmccntr_write contained duplicate code that was already
> being handled by pmccntr_sync. This also moves the calls to get the
> clock inside the 'if' statement so they are not executed if not needed.

It is duplicate code, yes, but I also find it a bit confusing,
because it's the same code doing two different operations:

(1) if (as is usual when the counter is running) c15_ccnt
contains a delta between the clock ticks and the register
value, pmccntr_sync() sets c15_ccnt to the current
guest-visible register value

(2) if c15_ccnt contains a guest-visible register value
and the clock is running, pmccntr_sync() sets c15_ccnt
to the ticks-to-value delta

and we use this in a couple of places like:
   pmccntr_sync();
   do stuff that operates on the guest register values,
     or maybe turns off the counter, etc;
   pmccntr_sync();

But that's wrong really -- we'll slightly lose time because
the nanosecond clock advances between the two calls
to qemu_clock_get_ns(). (It only works at all because
it happens that f(x) = C - x  is a self-inverse function.)
It's also a confusing API because it looks like something
you only need to call once but actually in most cases
you need to call it twice, before and after whatever it
is you're doing with the counter.

So I think we should refactor this so that we have a
pair of functions, something like:
    uint64_t clocknow = pmccntr_op_start(env);
    /* Now c15_ccnt is the guest visible value, and
     * we can do things like change PMCRD, enable bits, etc
     */
    [...]
    /* convert c15_ccnt back to the clock-to-value delta,
     * passing it the tick count we used when we did the
     * delta-to-value conversion earlier.
     */
    pmccntr_op_finish(env, clocknow);

(clocknow here should be the output of the muldiv64(),
not the divided-by-64 version.)

Some more specific review comments below, though the
suggested refactoring above might render some of them moot.


> Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> ---
>  target/arm/helper.c | 55 ++++++++++++++++-------------------------------------
>  1 file changed, 16 insertions(+), 39 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 40c9273..ecf8c55 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -973,17 +973,18 @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
>
>  void pmccntr_sync(CPUARMState *env)
>  {
> -    uint64_t temp_ticks;
> +    if (arm_ccnt_enabled(env) &&
> +          !pmu_counter_filtered(env, env->cp15.pmccfiltr_el0)) {

This seems to be adding an extra condition that the commit
message doesn't mention.

> +        uint64_t temp_ticks;
>
> -    temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> -                          ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
> +        temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> +                              ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
>
> -    if (env->cp15.c9_pmcr & PMCRD) {
> -        /* Increment once every 64 processor clock cycles */
> -        temp_ticks /= 64;
> -    }
> +        if (env->cp15.c9_pmcr & PMCRD) {
> +            /* Increment once every 64 processor clock cycles */
> +            temp_ticks /= 64;
> +        }
>
> -    if (arm_ccnt_enabled(env)) {
>          env->cp15.c15_ccnt = temp_ticks - env->cp15.c15_ccnt;
>      }
>  }
> @@ -1007,21 +1008,11 @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>
>  static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
>  {
> -    uint64_t total_ticks;
> -
> -    if (!arm_ccnt_enabled(env)) {
> -        /* Counter is disabled, do not change value */
> -        return env->cp15.c15_ccnt;
> -    }
> -
> -    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> -                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
> -
> -    if (env->cp15.c9_pmcr & PMCRD) {
> -        /* Increment once every 64 processor clock cycles */
> -        total_ticks /= 64;
> -    }
> -    return total_ticks - env->cp15.c15_ccnt;
> +    uint64_t ret;
> +    pmccntr_sync(env);
> +    ret = env->cp15.c15_ccnt;
> +    pmccntr_sync(env);
> +    return ret;

This change means that as a result of this refactoring we
now do the 'sync' operations (muldiv etc) twice, whereas
previously we did them only once.

>  }
>
>  static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -1038,22 +1029,8 @@ static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void pmccntr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                          uint64_t value)
>  {
> -    uint64_t total_ticks;
> -
> -    if (!arm_ccnt_enabled(env)) {
> -        /* Counter is disabled, set the absolute value */
> -        env->cp15.c15_ccnt = value;
> -        return;
> -    }
> -
> -    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> -                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
> -
> -    if (env->cp15.c9_pmcr & PMCRD) {
> -        /* Increment once every 64 processor clock cycles */
> -        total_ticks /= 64;
> -    }
> -    env->cp15.c15_ccnt = total_ticks - value;
> +    env->cp15.c15_ccnt = value;
> +    pmccntr_sync(env);
>  }

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N Aaron Lindsay
@ 2017-10-17 13:41   ` Peter Maydell
  2017-10-17 15:42     ` Aaron Lindsay
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 13:41 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> This is in preparation for enabling counters other than PMCCNTR
>
> Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> ---
>  target/arm/helper.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index ecf8c55..54070a3 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -30,11 +30,6 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
>                                 hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
>                                 target_ulong *page_size_ptr, uint32_t *fsr,
>                                 ARMMMUFaultInfo *fi);
> -
> -/* Definitions for the PMCCNTR and PMCR registers */
> -#define PMCRD   0x8
> -#define PMCRC   0x4
> -#define PMCRE   0x1
>  #endif
>
>  static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
> @@ -876,6 +871,17 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
>      REGINFO_SENTINEL
>  };
>
> +/* Definitions for the PMU registers */
> +#define PMCRN   0xf800

We usually name this kind of whole-field mask value something like PMCRN_MASK.

(We also have a FIELD() macro in hw/registerfields.h for defining
mask/length/shift constants; though it can be a bit verbose in
the length of the names it uses, so I don't insist on its use
if you don't like the results. I'm on the fence myself about it.)

> +#define PMCRN_SHIFT 11
> +#define PMCRD   0x8
> +#define PMCRC   0x4
> +#define PMCRE   0x1
> +
> +#define PMU_NUM_COUNTERS(env) ((env->cp15.c9_pmcr & PMCRN) >> PMCRN_SHIFT)
> +/* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */
> +#define PMU_COUNTER_MASK(env) ((1 << 31) | ((1 << PMU_NUM_COUNTERS(env)) - 1))
> +
>  static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
>                                     bool isread)
>  {
> @@ -1060,14 +1066,14 @@ static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                              uint64_t value)
>  {
> -    value &= (1 << 31);
> +    value &= (PMU_COUNTER_MASK(env) | (1 << 31));

PMU_COUNTER_MASK always contains bit 31, so why do we manually OR it in
again here?

>      env->cp15.c9_pmcnten |= value;
>  }
>
>  static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    value &= (1 << 31);
> +    value &= (PMU_COUNTER_MASK(env) | (1 << 31));
>      env->cp15.c9_pmcnten &= ~value;
>  }
>
> @@ -1115,14 +1121,14 @@ static void pmintenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
>      /* We have no event counters so only the C bit can be changed */
> -    value &= (1 << 31);
> +    value &= PMU_COUNTER_MASK(env);
>      env->cp15.c9_pminten |= value;
>  }
>
>  static void pmintenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    value &= (1 << 31);
> +    value &= PMU_COUNTER_MASK(env);
>      env->cp15.c9_pminten &= ~value;
>  }
>
> --
> Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.

Shouldn't these '\n' in your sig be literal line breaks? :-)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 05/13] target/arm: Allow AArch32 access for PMCCFILTR
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 05/13] target/arm: Allow AArch32 access for PMCCFILTR Aaron Lindsay
@ 2017-10-17 13:52   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 13:52 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> Also fix the existing bitmask for writes.

"Also" in a commit message can often be translated as
"This should really go in a separate commit, but" :-)

> Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> ---
>  target/arm/helper.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 54070a3..fcc2fcf 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -1059,10 +1059,25 @@ static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                              uint64_t value)
>  {
>      pmccntr_sync(env);
> -    env->cp15.pmccfiltr_el0 = value & 0x7E000000;
> +    env->cp15.pmccfiltr_el0 = value & 0xfc000000;
>      pmccntr_sync(env);
>  }
>
> +static void pmccfiltr_write_a32(CPUARMState *env, const ARMCPRegInfo *ri,
> +                            uint64_t value)
> +{
> +    pmccntr_sync(env);
> +    env->cp15.pmccfiltr_el0 = (env->cp15.pmccfiltr_el0 & 0x04000000) |
> +        (value & 0xf8000000); /* M is not visible in AArch32 */

I think the comment would look better on its own line.

> +    pmccntr_sync(env);
> +}

I think we could probably benefit from defining constants for
the bit names in this register.

> +
> +static uint64_t pmccfiltr_read_a32(CPUARMState *env, const ARMCPRegInfo *ri)
> +{
> +    /* M is not visible in AArch32 */
> +    return env->cp15.pmccfiltr_el0 & 0xf8000000;
> +}
> +
>  static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                              uint64_t value)
>  {
> @@ -1280,6 +1295,12 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
>        .type = ARM_CP_IO,
>        .readfn = pmccntr_read, .writefn = pmccntr_write, },
>  #endif
> +    { .name = "PMCCFILTR", .cp = 15, .opc1 = 0, .crn = 14, .crm = 15, .opc2 = 7,
> +      .writefn = pmccfiltr_write_a32, .readfn = pmccfiltr_read_a32,
> +      .access = PL0_RW, .accessfn = pmreg_access,
> +      .type = ARM_CP_ALIAS,
> +      .fieldoffset = offsetoflow32(CPUARMState, cp15.pmccfiltr_el0),
> +      .resetvalue = 0, },
>      { .name = "PMCCFILTR_EL0", .state = ARM_CP_STATE_AA64,
>        .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 15, .opc2 = 7,
>        .writefn = pmccfiltr_write,

If you're defining both a .readfn and a .writefn for a register
you don't need to provide a .fieldoffset.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 08/13] target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 08/13] target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled Aaron Lindsay
@ 2017-10-17 14:04   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 14:04 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> ---
>  target/arm/helper.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 3932ac0..c0ef367 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -975,17 +975,22 @@ static CPAccessResult pmreg_access_ccntr(CPUARMState *env,
>      return pmreg_access(env, ri, isread);
>  }
>
> -static inline bool arm_ccnt_enabled(CPUARMState *env)
> +static inline bool pmu_counter_enabled(CPUARMState *env, uint8_t counter)
>  {
>      /* Does not check PMCCFILTR_EL0, which is handled by pmu_counter_filtered */
> -
> -    if (!(env->cp15.c9_pmcr & PMCRE) || !(env->cp15.c9_pmcnten & (1 << 31))) {
> +    if (!(env->cp15.c9_pmcr & PMCRE) ||
> +            !(env->cp15.c9_pmcnten & (1 << counter))) {
>          return false;
>      }
>
>      return true;
>  }
>
> +static inline bool arm_ccnt_enabled(CPUARMState *env)
> +{
> +    return pmu_counter_enabled(env, 31);
> +}
> +

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 07/13] target/arm: Implement PMOVSSET
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 07/13] target/arm: Implement PMOVSSET Aaron Lindsay
@ 2017-10-17 14:19   ` Peter Maydell
  2017-10-17 16:02     ` Aaron Lindsay
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 14:19 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> Also modify it to be stored as a uint64_t
>
> Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> ---
>  target/arm/cpu.h    |  2 +-
>  target/arm/helper.c | 27 ++++++++++++++++++++++++---
>  2 files changed, 25 insertions(+), 4 deletions(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 811b1fe..365a809 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -325,7 +325,7 @@ typedef struct CPUARMState {
>          uint32_t c9_data;
>          uint64_t c9_pmcr; /* performance monitor control register */
>          uint64_t c9_pmcnten; /* perf monitor counter enables */
> -        uint32_t c9_pmovsr; /* perf monitor overflow status */
> +        uint64_t c9_pmovsr; /* perf monitor overflow status */

This is a bug fix, so it should go in its own patch. Specifically,
we already have an AArch64 sysreg PMOVSCLR_EL0 which has
      .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
so we should have made the CPUARMState field 64 bits when we
added it. (I think without this bugfix reads of the AArch64
reg will return data from the adjoining field in the struct.)


>          uint32_t c9_pmuserenr; /* perf monitor user enable */
>          uint64_t c9_pmselr; /* perf monitor counter selection register */
>          uint64_t c9_pminten; /* perf monitor interrupt enables */
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 74e90c5..3932ac0 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -1150,9 +1150,17 @@ static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void pmovsr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                           uint64_t value)
>  {
> +    value &= PMU_COUNTER_MASK(env);
>      env->cp15.c9_pmovsr &= ~value;
>  }
>
> +static void pmovsset_write(CPUARMState *env, const ARMCPRegInfo *ri,
> +                         uint64_t value)
> +{
> +    value &= PMU_COUNTER_MASK(env);
> +    env->cp15.c9_pmovsr |= value;
> +}
> +
>  static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> @@ -1317,10 +1325,10 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
>        .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcnten),
>        .writefn = pmcntenclr_write },
>      { .name = "PMOVSR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 3,
> -      .access = PL0_RW, .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
> -      .accessfn = pmreg_access,
> +      .access = PL0_RW, .accessfn = pmreg_access,
> +      .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),

Is this just reshuffling the order of .field initializers?

>        .writefn = pmovsr_write,
> -      .raw_writefn = raw_write },
> +      .raw_writefn = raw_write, .resetvalue = 0 },

.resetvalue 0 is the default, but it doesn't hurt to specify it
explicitly I guess.

>      { .name = "PMOVSCLR_EL0", .state = ARM_CP_STATE_AA64,
>        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 3,
>        .access = PL0_RW, .accessfn = pmreg_access,
> @@ -1328,6 +1336,19 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
>        .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
>        .writefn = pmovsr_write,
>        .raw_writefn = raw_write },
> +    { .name = "PMOVSSET", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 3,
> +      .access = PL0_RW, .accessfn = pmreg_access,
> +      .type = ARM_CP_ALIAS,
> +      .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),
> +      .writefn = pmovsset_write,
> +      .raw_writefn = raw_write },
> +    { .name = "PMOVSSET_EL0", .state = ARM_CP_STATE_AA64,
> +      .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 14, .opc2 = 3,
> +      .access = PL0_RW, .accessfn = pmreg_access,
> +      .type = ARM_CP_ALIAS,
> +      .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
> +      .writefn = pmovsset_write,
> +      .raw_writefn = raw_write },

You can implement these using a single STATE_BOTH regdef, I think.
Also, there's a standard order for the fields which goes
.cp .opc0 .opc1 .crn .crm .opc2
(you can often omit the .cp as the default is 15).

We need to be a bit careful here, because the AArch32 PMMOVSET
register isn't implemented in ARMv7 until v7VE. So we need to
put the regdef somewhere else...


thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 06/13] target/arm: Filter cycle counter based on PMCCFILTR_EL0
  2017-09-30  2:08 ` [Qemu-devel] [PATCH 06/13] target/arm: Filter cycle counter based on PMCCFILTR_EL0 Aaron Lindsay
@ 2017-10-17 14:57   ` Peter Maydell
  2017-10-17 19:32     ` Aaron Lindsay
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 14:57 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Wei Huang, QEMU Developers,
	Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> The pmu_counter_filtered and pmu_sync functions are generic (as opposed
> to PMCCNTR-specific) to allow for the implementation of other events.
>
> RFC: I know that many of the locations of the calls to pmu_sync are
> problematic when icount is enabled because can_do_io will not be set.
> The documentation says that for deterministic execution, IO must only be
> performed by the last instruction of a thread block.

Yes. You need to arrange that gen_io_start() and gen_io_end()
are called around the generation of code for operations that might
do IO or care about the state of the clock, and that we end the TB
after gen_io_end().

> Because
> cpu_handle_interrupt() and cpu_handle_exception() are actually made
> outside of a thread block, is it safe to set can_do_io=1 for them to
> allow this to succeed? Is there a better mechanism for handling this?

>From my reading of the code, can_do_io should already be 1
when these functions are called. It's only set to 0 just
before we call tcg_qemu_tb_exec() and then set back to 1
immediately after (see cpu_tb_exec()).

In general, the approach you have here looks like it's going to
be pretty invasive and also hard to keep working right. I think
we can come up with something a bit better.

Specifically, the filtering criteria effectively only change
when we change exception level, right? (since you can only
change security state as part of an exception level change).
We already have a mechanism for getting a callback when the EL
changes -- arm_register_el_change_hook(). (We would need to
upgrade it to support more than one hook function.)

You still need to get the io-count handling right, but there
are many fewer places that need to change (just the codegen
for calls to exception-return helpers, I think) and they already
end the TB, so you don't have the complexity of trying to end the
TB in places you didn't before, you just need the gen_io_start/end.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled
  2017-10-17 12:49   ` Peter Maydell
@ 2017-10-17 14:59     ` Aaron Lindsay
  2017-10-17 15:00       ` Peter Maydell
  0 siblings, 1 reply; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-17 14:59 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On Oct 17 13:49, Peter Maydell wrote:
> On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> > Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> > ---
> >  target/arm/helper.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/target/arm/helper.c b/target/arm/helper.c
> > index 8be78ea..40c9273 100644
> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -964,7 +964,7 @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
> >  {
> >      /* This does not support checking PMCCFILTR_EL0 register */
> >
> > -    if (!(env->cp15.c9_pmcr & PMCRE)) {
> > +    if (!(env->cp15.c9_pmcr & PMCRE) || !(env->cp15.c9_pmcnten & (1 << 31))) {
> >          return false;
> >      }
> >
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> 
> I keep replying to your v1 patchset by mistake -- can you use
> the git format-patch options to put make the subject prefix
> be "[PATCH v3]" for the next round in all the patchmails to
> help keep the versions distinct, please?

Yes, and I apologize for the nuisance. Also, thanks - I didn't know
about the `-v, --reroll-count` option and must have gotten distracted
between updating the subject to be 'v2' for the cover letter and the patches.

-Aaron

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled
  2017-10-17 14:59     ` Aaron Lindsay
@ 2017-10-17 15:00       ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 15:00 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 17 October 2017 at 15:59, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> On Oct 17 13:49, Peter Maydell wrote:
>> I keep replying to your v1 patchset by mistake -- can you use
>> the git format-patch options to put make the subject prefix
>> be "[PATCH v3]" for the next round in all the patchmails to
>> help keep the versions distinct, please?
>
> Yes, and I apologize for the nuisance. Also, thanks - I didn't know
> about the `-v, --reroll-count` option and must have gotten distracted
> between updating the subject to be 'v2' for the cover letter and the patches.

There's also --subject-prefix if you need more flexibility than -v.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3
  2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
                   ` (13 preceding siblings ...)
  2017-10-09 14:46 ` [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
@ 2017-10-17 15:09 ` Peter Maydell
  2017-10-17 19:52   ` Aaron Lindsay
  14 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2017-10-17 15:09 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> The ARM PMU implementation currently contains a basic cycle counter, but it is
> often useful to gather counts of other events and filter them based on
> execution mode. These patches flesh out the implementations of various PMU
> registers including PM[X]EVCNTR and PM[X]EVTYPER, add a struct definition to
> represent arbitrary counter types, implement mode filtering, and add
> instruction, cycle, and software increment events.
>
> I am particularly interested in feedback on the following two patches because I
> think I'm likely Doing It Wrong:
>  [1] target/arm: Filter cycle counter based on PMCCFILTR_EL0
>  [2] target/arm: PMU: Add instruction and cycle events
>
> In order to implement mode filtering in an event-driven way, [1] adds a pair of
> calls to pmu_sync() surrounding every update to a register/variable which may
> affect whether any counter is currently filtered. These pmu_sync() calls
> ultimately call cpu_get_icount_raw() for enabled instruction and cycle counters
> when using icount. Unfortunately, cpu->can_do_io may otherwise be zero for
> these calls so the current implementation in [2] temporarily sets can_do_io to
> 1. I haven't see any ill side effects from this in my testing, but it doesn't
> seem like the right way to handle this.

I've now reviewed the early stuff and provided what I hope is
a useful direction for the mode-filtering, so I'm not going to
look at the patches at the tail end on this version of the series.


> I would like to eventually add sending interrupts on counter overflow.
> Suggestions for the best direction to handle this are most welcome.

Check out how the helper.c timer interrupts are wired up
(the CPU object exposes outbound irq lines, which then get
wired up by the board to the GIC.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 03/13] target/arm: Reorganize PMCCNTR read, write, sync
  2017-10-17 13:25   ` Peter Maydell
@ 2017-10-17 15:26     ` Aaron Lindsay
  0 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-17 15:26 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On Oct 17 14:25, Peter Maydell wrote:
> On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> > pmccntr_read and pmccntr_write contained duplicate code that was already
> > being handled by pmccntr_sync. This also moves the calls to get the
> > clock inside the 'if' statement so they are not executed if not needed.
> 
> It is duplicate code, yes, but I also find it a bit confusing,
> because it's the same code doing two different operations:
> 
> (1) if (as is usual when the counter is running) c15_ccnt
> contains a delta between the clock ticks and the register
> value, pmccntr_sync() sets c15_ccnt to the current
> guest-visible register value
> 
> (2) if c15_ccnt contains a guest-visible register value
> and the clock is running, pmccntr_sync() sets c15_ccnt
> to the ticks-to-value delta
> 
> and we use this in a couple of places like:
>    pmccntr_sync();
>    do stuff that operates on the guest register values,
>      or maybe turns off the counter, etc;
>    pmccntr_sync();
> 
> But that's wrong really -- we'll slightly lose time because
> the nanosecond clock advances between the two calls
> to qemu_clock_get_ns(). (It only works at all because
> it happens that f(x) = C - x  is a self-inverse function.)
> It's also a confusing API because it looks like something
> you only need to call once but actually in most cases
> you need to call it twice, before and after whatever it
> is you're doing with the counter.

Interesting, I hadn't thought about the loss of time issue.

> So I think we should refactor this so that we have a
> pair of functions, something like:
>     uint64_t clocknow = pmccntr_op_start(env);
>     /* Now c15_ccnt is the guest visible value, and
>      * we can do things like change PMCRD, enable bits, etc
>      */
>     [...]
>     /* convert c15_ccnt back to the clock-to-value delta,
>      * passing it the tick count we used when we did the
>      * delta-to-value conversion earlier.
>      */
>     pmccntr_op_finish(env, clocknow);
> 
> (clocknow here should be the output of the muldiv64(),
> not the divided-by-64 version.)
> 
> Some more specific review comments below, though the
> suggested refactoring above might render some of them moot.

I agree that what you describe would be cleaner. I'll work towards that
in v3.

> > Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> > ---
> >  target/arm/helper.c | 55 ++++++++++++++++-------------------------------------
> >  1 file changed, 16 insertions(+), 39 deletions(-)
> >
> > diff --git a/target/arm/helper.c b/target/arm/helper.c
> > index 40c9273..ecf8c55 100644
> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -973,17 +973,18 @@ static inline bool arm_ccnt_enabled(CPUARMState *env)
> >
> >  void pmccntr_sync(CPUARMState *env)
> >  {
> > -    uint64_t temp_ticks;
> > +    if (arm_ccnt_enabled(env) &&
> > +          !pmu_counter_filtered(env, env->cp15.pmccfiltr_el0)) {
> 
> This seems to be adding an extra condition that the commit
> message doesn't mention.

Eek. This slipped the cracks through during a rebase and belongs in
'target/arm: Filter cycle counter based on PMCCFILTR_EL0'.

> > +        uint64_t temp_ticks;
> >
> > -    temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> > -                          ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
> > +        temp_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> > +                              ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
> >
> > -    if (env->cp15.c9_pmcr & PMCRD) {
> > -        /* Increment once every 64 processor clock cycles */
> > -        temp_ticks /= 64;
> > -    }
> > +        if (env->cp15.c9_pmcr & PMCRD) {
> > +            /* Increment once every 64 processor clock cycles */
> > +            temp_ticks /= 64;
> > +        }
> >
> > -    if (arm_ccnt_enabled(env)) {
> >          env->cp15.c15_ccnt = temp_ticks - env->cp15.c15_ccnt;
> >      }
> >  }
> > @@ -1007,21 +1008,11 @@ static void pmcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >
> >  static uint64_t pmccntr_read(CPUARMState *env, const ARMCPRegInfo *ri)
> >  {
> > -    uint64_t total_ticks;
> > -
> > -    if (!arm_ccnt_enabled(env)) {
> > -        /* Counter is disabled, do not change value */
> > -        return env->cp15.c15_ccnt;
> > -    }
> > -
> > -    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> > -                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
> > -
> > -    if (env->cp15.c9_pmcr & PMCRD) {
> > -        /* Increment once every 64 processor clock cycles */
> > -        total_ticks /= 64;
> > -    }
> > -    return total_ticks - env->cp15.c15_ccnt;
> > +    uint64_t ret;
> > +    pmccntr_sync(env);
> > +    ret = env->cp15.c15_ccnt;
> > +    pmccntr_sync(env);
> > +    return ret;
> 
> This change means that as a result of this refactoring we
> now do the 'sync' operations (muldiv etc) twice, whereas
> previously we did them only once.
> 
> >  }
> >
> >  static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> > @@ -1038,22 +1029,8 @@ static void pmselr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >  static void pmccntr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >                          uint64_t value)
> >  {
> > -    uint64_t total_ticks;
> > -
> > -    if (!arm_ccnt_enabled(env)) {
> > -        /* Counter is disabled, set the absolute value */
> > -        env->cp15.c15_ccnt = value;
> > -        return;
> > -    }
> > -
> > -    total_ticks = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> > -                           ARM_CPU_FREQ, NANOSECONDS_PER_SECOND);
> > -
> > -    if (env->cp15.c9_pmcr & PMCRD) {
> > -        /* Increment once every 64 processor clock cycles */
> > -        total_ticks /= 64;
> > -    }
> > -    env->cp15.c15_ccnt = total_ticks - value;
> > +    env->cp15.c15_ccnt = value;
> > +    pmccntr_sync(env);
> >  }
> 
> thanks
> -- PMM

-Aaron

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N
  2017-10-17 13:41   ` Peter Maydell
@ 2017-10-17 15:42     ` Aaron Lindsay
  0 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-17 15:42 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On Oct 17 14:41, Peter Maydell wrote:
> On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> > This is in preparation for enabling counters other than PMCCNTR
> >
> > Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> > ---
> >  target/arm/helper.c | 24 +++++++++++++++---------
> >  1 file changed, 15 insertions(+), 9 deletions(-)
> >
> > diff --git a/target/arm/helper.c b/target/arm/helper.c
> > index ecf8c55..54070a3 100644
> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -30,11 +30,6 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
> >                                 hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
> >                                 target_ulong *page_size_ptr, uint32_t *fsr,
> >                                 ARMMMUFaultInfo *fi);
> > -
> > -/* Definitions for the PMCCNTR and PMCR registers */
> > -#define PMCRD   0x8
> > -#define PMCRC   0x4
> > -#define PMCRE   0x1
> >  #endif
> >
> >  static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
> > @@ -876,6 +871,17 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
> >      REGINFO_SENTINEL
> >  };
> >
> > +/* Definitions for the PMU registers */
> > +#define PMCRN   0xf800
> 
> We usually name this kind of whole-field mask value something like PMCRN_MASK.
> 
> (We also have a FIELD() macro in hw/registerfields.h for defining
> mask/length/shift constants; though it can be a bit verbose in
> the length of the names it uses, so I don't insist on its use
> if you don't like the results. I'm on the fence myself about it.)

Will do.

> > +#define PMCRN_SHIFT 11
> > +#define PMCRD   0x8
> > +#define PMCRC   0x4
> > +#define PMCRE   0x1
> > +
> > +#define PMU_NUM_COUNTERS(env) ((env->cp15.c9_pmcr & PMCRN) >> PMCRN_SHIFT)
> > +/* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */
> > +#define PMU_COUNTER_MASK(env) ((1 << 31) | ((1 << PMU_NUM_COUNTERS(env)) - 1))
> > +
> >  static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
> >                                     bool isread)
> >  {
> > @@ -1060,14 +1066,14 @@ static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >  static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >                              uint64_t value)
> >  {
> > -    value &= (1 << 31);
> > +    value &= (PMU_COUNTER_MASK(env) | (1 << 31));
> 
> PMU_COUNTER_MASK always contains bit 31, so why do we manually OR it in
> again here?

I forgot I'd also included the cycle counter in PMU_COUNTER_MASK... I'll
remove the duplication here and in pmcntenclr_write below.

> >      env->cp15.c9_pmcnten |= value;
> >  }
> >
> >  static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >                               uint64_t value)
> >  {
> > -    value &= (1 << 31);
> > +    value &= (PMU_COUNTER_MASK(env) | (1 << 31));
> >      env->cp15.c9_pmcnten &= ~value;
> >  }
> >
> > @@ -1115,14 +1121,14 @@ static void pmintenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >                               uint64_t value)
> >  {
> >      /* We have no event counters so only the C bit can be changed */
> > -    value &= (1 << 31);
> > +    value &= PMU_COUNTER_MASK(env);
> >      env->cp15.c9_pminten |= value;
> >  }
> >
> >  static void pmintenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >                               uint64_t value)
> >  {
> > -    value &= (1 << 31);
> > +    value &= PMU_COUNTER_MASK(env);
> >      env->cp15.c9_pminten &= ~value;
> >  }
> >
> > --
> > Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.\nQualcomm Technologies, Inc. is a member of the\nCode Aurora Forum, a Linux Foundation Collaborative Project.
> 
> Shouldn't these '\n' in your sig be literal line breaks? :-)

Yes :( ...apparently `git config` doesn't interpret them as I expected.

> thanks
> -- PMM

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 07/13] target/arm: Implement PMOVSSET
  2017-10-17 14:19   ` Peter Maydell
@ 2017-10-17 16:02     ` Aaron Lindsay
  0 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-17 16:02 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Alistair Francis, Peter Crosthwaite, Wei Huang,
	QEMU Developers, Michael Spradling, Digant Desai

On Oct 17 15:19, Peter Maydell wrote:
> On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> > Also modify it to be stored as a uint64_t
> >
> > Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
> > ---
> >  target/arm/cpu.h    |  2 +-
> >  target/arm/helper.c | 27 ++++++++++++++++++++++++---
> >  2 files changed, 25 insertions(+), 4 deletions(-)
> >
> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > index 811b1fe..365a809 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -325,7 +325,7 @@ typedef struct CPUARMState {
> >          uint32_t c9_data;
> >          uint64_t c9_pmcr; /* performance monitor control register */
> >          uint64_t c9_pmcnten; /* perf monitor counter enables */
> > -        uint32_t c9_pmovsr; /* perf monitor overflow status */
> > +        uint64_t c9_pmovsr; /* perf monitor overflow status */
> 
> This is a bug fix, so it should go in its own patch. Specifically,
> we already have an AArch64 sysreg PMOVSCLR_EL0 which has
>       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
> so we should have made the CPUARMState field 64 bits when we
> added it. (I think without this bugfix reads of the AArch64
> reg will return data from the adjoining field in the struct.)

Okay, I'll split it off in v3.

> 
> >          uint32_t c9_pmuserenr; /* perf monitor user enable */
> >          uint64_t c9_pmselr; /* perf monitor counter selection register */
> >          uint64_t c9_pminten; /* perf monitor interrupt enables */
> > diff --git a/target/arm/helper.c b/target/arm/helper.c
> > index 74e90c5..3932ac0 100644
> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -1150,9 +1150,17 @@ static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >  static void pmovsr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >                           uint64_t value)
> >  {
> > +    value &= PMU_COUNTER_MASK(env);
> >      env->cp15.c9_pmovsr &= ~value;
> >  }
> >
> > +static void pmovsset_write(CPUARMState *env, const ARMCPRegInfo *ri,
> > +                         uint64_t value)
> > +{
> > +    value &= PMU_COUNTER_MASK(env);
> > +    env->cp15.c9_pmovsr |= value;
> > +}
> > +
> >  static void pmxevtyper_write(CPUARMState *env, const ARMCPRegInfo *ri,
> >                               uint64_t value)
> >  {
> > @@ -1317,10 +1325,10 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
> >        .fieldoffset = offsetof(CPUARMState, cp15.c9_pmcnten),
> >        .writefn = pmcntenclr_write },
> >      { .name = "PMOVSR", .cp = 15, .crn = 9, .crm = 12, .opc1 = 0, .opc2 = 3,
> > -      .access = PL0_RW, .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
> > -      .accessfn = pmreg_access,
> > +      .access = PL0_RW, .accessfn = pmreg_access,
> > +      .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),
> 
> Is this just reshuffling the order of .field initializers?

Also updated offsetof -> offsetoflow32. IIRC, I shuffled the order to
match that of the surrounding registers and to appease my sense of
order. Would you prefer that I don't shuffle them, or split that off as
a separate patch?

> >        .writefn = pmovsr_write,
> > -      .raw_writefn = raw_write },
> > +      .raw_writefn = raw_write, .resetvalue = 0 },
> 
> .resetvalue 0 is the default, but it doesn't hurt to specify it
> explicitly I guess.
> 
> >      { .name = "PMOVSCLR_EL0", .state = ARM_CP_STATE_AA64,
> >        .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 3,
> >        .access = PL0_RW, .accessfn = pmreg_access,
> > @@ -1328,6 +1336,19 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
> >        .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
> >        .writefn = pmovsr_write,
> >        .raw_writefn = raw_write },
> > +    { .name = "PMOVSSET", .cp = 15, .crn = 9, .crm = 14, .opc1 = 0, .opc2 = 3,
> > +      .access = PL0_RW, .accessfn = pmreg_access,
> > +      .type = ARM_CP_ALIAS,
> > +      .fieldoffset = offsetoflow32(CPUARMState, cp15.c9_pmovsr),
> > +      .writefn = pmovsset_write,
> > +      .raw_writefn = raw_write },
> > +    { .name = "PMOVSSET_EL0", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 14, .opc2 = 3,
> > +      .access = PL0_RW, .accessfn = pmreg_access,
> > +      .type = ARM_CP_ALIAS,
> > +      .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
> > +      .writefn = pmovsset_write,
> > +      .raw_writefn = raw_write },
> 
> You can implement these using a single STATE_BOTH regdef, I think.
> Also, there's a standard order for the fields which goes
> .cp .opc0 .opc1 .crn .crm .opc2
> (you can often omit the .cp as the default is 15).
> 
> We need to be a bit careful here, because the AArch32 PMMOVSET
> register isn't implemented in ARMv7 until v7VE. So we need to
> put the regdef somewhere else...

Hmmm, what does it need to be protected by? I assume ARM_FEATURE_V8 ||
(ARM_FEATURE_V7 && something), but I'm not familiar enough with v7 that
it's immediately obvious from looking around what 'something' is.

-Aaron

> 
> 
> thanks
> -- PMM

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 06/13] target/arm: Filter cycle counter based on PMCCFILTR_EL0
  2017-10-17 14:57   ` Peter Maydell
@ 2017-10-17 19:32     ` Aaron Lindsay
  0 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-17 19:32 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Alistair Francis, Wei Huang, QEMU Developers,
	Michael Spradling, Digant Desai

On Oct 17 15:57, Peter Maydell wrote:
> On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> > The pmu_counter_filtered and pmu_sync functions are generic (as opposed
> > to PMCCNTR-specific) to allow for the implementation of other events.
> >
> > RFC: I know that many of the locations of the calls to pmu_sync are
> > problematic when icount is enabled because can_do_io will not be set.
> > The documentation says that for deterministic execution, IO must only be
> > performed by the last instruction of a thread block.
> 
> Yes. You need to arrange that gen_io_start() and gen_io_end()
> are called around the generation of code for operations that might
> do IO or care about the state of the clock, and that we end the TB
> after gen_io_end().
> 
> > Because
> > cpu_handle_interrupt() and cpu_handle_exception() are actually made
> > outside of a thread block, is it safe to set can_do_io=1 for them to
> > allow this to succeed? Is there a better mechanism for handling this?
> 
> From my reading of the code, can_do_io should already be 1
> when these functions are called. It's only set to 0 just
> before we call tcg_qemu_tb_exec() and then set back to 1
> immediately after (see cpu_tb_exec()).
> 
> In general, the approach you have here looks like it's going to
> be pretty invasive and also hard to keep working right. I think
> we can come up with something a bit better.

Yes, I am hoping so.

> Specifically, the filtering criteria effectively only change
> when we change exception level, right? (since you can only
> change security state as part of an exception level change).
> We already have a mechanism for getting a callback when the EL
> changes -- arm_register_el_change_hook(). (We would need to
> upgrade it to support more than one hook function.)
> 
> You still need to get the io-count handling right, but there
> are many fewer places that need to change (just the codegen
> for calls to exception-return helpers, I think) and they already
> end the TB, so you don't have the complexity of trying to end the
> TB in places you didn't before, you just need the gen_io_start/end.

Using hooks/callbacks is clearly a better solution. I shied away from
using *el_change_hook in this patchset because A) it seemed to be marked
explicitly for GICv3 emulation, B) the one-hook limitation you
mentioned, and C) it doesn't currently provide you with which EL you're
coming from.

If everyone is amenable to changing how the hook is structured, I don't
think any of those reasons stand. My initial thought is that the best
way to fix C) is to add a pre_el_change_hook to be called at the top of
the exception_return and cpsr_write_eret helpers in
target/arm/op_helper.c to drive the first call to pmu_sync() (or
whatever I rename the first half of that pair to based on your
suggestions on that patch).

If I don't receive any additional feedback before then, I'll do my best
to adapt the existing hooks to allow for a cleaner approach to filtering
for v3.

-Aaron

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3
  2017-10-17 15:09 ` Peter Maydell
@ 2017-10-17 19:52   ` Aaron Lindsay
  0 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-10-17 19:52 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-arm, Alistair Francis, Wei Huang, QEMU Developers,
	Michael Spradling, Digant Desai

On Oct 17 16:09, Peter Maydell wrote:
> On 30 September 2017 at 03:08, Aaron Lindsay <alindsay@codeaurora.org> wrote:
> > The ARM PMU implementation currently contains a basic cycle counter, but it is
> > often useful to gather counts of other events and filter them based on
> > execution mode. These patches flesh out the implementations of various PMU
> > registers including PM[X]EVCNTR and PM[X]EVTYPER, add a struct definition to
> > represent arbitrary counter types, implement mode filtering, and add
> > instruction, cycle, and software increment events.
> >
> > I am particularly interested in feedback on the following two patches because I
> > think I'm likely Doing It Wrong:
> >  [1] target/arm: Filter cycle counter based on PMCCFILTR_EL0
> >  [2] target/arm: PMU: Add instruction and cycle events
> >
> > In order to implement mode filtering in an event-driven way, [1] adds a pair of
> > calls to pmu_sync() surrounding every update to a register/variable which may
> > affect whether any counter is currently filtered. These pmu_sync() calls
> > ultimately call cpu_get_icount_raw() for enabled instruction and cycle counters
> > when using icount. Unfortunately, cpu->can_do_io may otherwise be zero for
> > these calls so the current implementation in [2] temporarily sets can_do_io to
> > 1. I haven't see any ill side effects from this in my testing, but it doesn't
> > seem like the right way to handle this.
> 
> I've now reviewed the early stuff and provided what I hope is
> a useful direction for the mode-filtering, so I'm not going to
> look at the patches at the tail end on this version of the series.

Thank you, your review has been very helpful - the next pass of the
mode-filtering patch will be more maintainable.

> > I would like to eventually add sending interrupts on counter overflow.
> > Suggestions for the best direction to handle this are most welcome.
> 
> Check out how the helper.c timer interrupts are wired up
> (the CPU object exposes outbound irq lines, which then get
> wired up by the board to the GIC.)

Thanks for the pointer - I'll take a look.

-Aaron

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N
  2017-04-19 17:41 [Qemu-devel] [PATCH " Aaron Lindsay
@ 2017-04-19 17:41 ` Aaron Lindsay
  0 siblings, 0 replies; 33+ messages in thread
From: Aaron Lindsay @ 2017-04-19 17:41 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm; +Cc: qemu-devel, mspradli, Aaron Lindsay

This is in preparation for enabling counters other than PMCCNTR

Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
---
 target/arm/helper.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 390256b..e8189b8 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -30,11 +30,6 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
                                hwaddr *phys_ptr, MemTxAttrs *txattrs, int *prot,
                                target_ulong *page_size_ptr, uint32_t *fsr,
                                ARMMMUFaultInfo *fi);
-
-/* Definitions for the PMCCNTR and PMCR registers */
-#define PMCRD   0x8
-#define PMCRC   0x4
-#define PMCRE   0x1
 #endif
 
 static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
@@ -876,6 +871,17 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
     REGINFO_SENTINEL
 };
 
+/* Definitions for the PMU registers */
+#define PMCRN   0xf800
+#define PMCRN_SHIFT 11
+#define PMCRD   0x8
+#define PMCRC   0x4
+#define PMCRE   0x1
+
+#define PMU_NUM_COUNTERS(env) ((env->cp15.c9_pmcr & PMCRN) >> PMCRN_SHIFT)
+/* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */
+#define PMU_COUNTER_MASK(env) ((1 << 31) | ((1 << PMU_NUM_COUNTERS(env)) - 1))
+
 static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                    bool isread)
 {
@@ -1060,14 +1066,14 @@ static void pmccfiltr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
-    value &= (1 << 31);
+    value &= (PMU_COUNTER_MASK(env) | (1 << 31));
     env->cp15.c9_pmcnten |= value;
 }
 
 static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    value &= (1 << 31);
+    value &= (PMU_COUNTER_MASK(env) | (1 << 31));
     env->cp15.c9_pmcnten &= ~value;
 }
 
@@ -1115,14 +1121,14 @@ static void pmintenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
     /* We have no event counters so only the C bit can be changed */
-    value &= (1 << 31);
+    value &= PMU_COUNTER_MASK(env);
     env->cp15.c9_pminten |= value;
 }
 
 static void pmintenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    value &= (1 << 31);
+    value &= PMU_COUNTER_MASK(env);
     env->cp15.c9_pminten &= ~value;
 }
 
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2017-10-17 19:52 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-30  2:08 [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 01/13] target/arm: A53: Initialize PMCEID[0] Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 02/13] target/arm: Check PMCNTEN for whether PMCCNTR is enabled Aaron Lindsay
2017-10-17 12:49   ` Peter Maydell
2017-10-17 14:59     ` Aaron Lindsay
2017-10-17 15:00       ` Peter Maydell
2017-09-30  2:08 ` [Qemu-devel] [PATCH 03/13] target/arm: Reorganize PMCCNTR read, write, sync Aaron Lindsay
2017-10-17 13:25   ` Peter Maydell
2017-10-17 15:26     ` Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N Aaron Lindsay
2017-10-17 13:41   ` Peter Maydell
2017-10-17 15:42     ` Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 05/13] target/arm: Allow AArch32 access for PMCCFILTR Aaron Lindsay
2017-10-17 13:52   ` Peter Maydell
2017-09-30  2:08 ` [Qemu-devel] [PATCH 06/13] target/arm: Filter cycle counter based on PMCCFILTR_EL0 Aaron Lindsay
2017-10-17 14:57   ` Peter Maydell
2017-10-17 19:32     ` Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 07/13] target/arm: Implement PMOVSSET Aaron Lindsay
2017-10-17 14:19   ` Peter Maydell
2017-10-17 16:02     ` Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 08/13] target/arm: Split arm_ccnt_enabled into generic pmu_counter_enabled Aaron Lindsay
2017-10-17 14:04   ` Peter Maydell
2017-09-30  2:08 ` [Qemu-devel] [PATCH 09/13] target/arm: Add array for supported PMU events, generate PMCEID[01] Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 10/13] target/arm: Finish implementation of PM[X]EVCNTR and PM[X]EVTYPER Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 11/13] target/arm: PMU: Add instruction and cycle events Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 12/13] target/arm: PMU: Set PMCR.N to 4 Aaron Lindsay
2017-09-30  2:08 ` [Qemu-devel] [PATCH 13/13] target/arm: Implement PMSWINC Aaron Lindsay
2017-10-09 14:46 ` [Qemu-devel] [PATCH v2 00/13] More fully implement ARM PMUv3 Aaron Lindsay
2017-10-09 18:27   ` Peter Maydell
2017-10-09 20:25     ` Aaron Lindsay
2017-10-17 15:09 ` Peter Maydell
2017-10-17 19:52   ` Aaron Lindsay
  -- strict thread matches above, loose matches on Subject: below --
2017-04-19 17:41 [Qemu-devel] [PATCH " Aaron Lindsay
2017-04-19 17:41 ` [Qemu-devel] [PATCH 04/13] target/arm: Mask PMU register writes based on PMCR_EL0.N Aaron Lindsay

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.