linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt
@ 2019-07-08 14:32 Julien Thierry
  2019-07-08 14:32 ` [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection Julien Thierry
                   ` (8 more replies)
  0 siblings, 9 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	acme, alexander.shishkin, mingo, namhyung, jolsa

Hi,

After fixing the arm64 Pseudo-NMIs, I'm dusting off this series.

The series makes the arm_pmu driver use NMIs for the perf interrupt when
NMIs are available on the platform (currently, only arm64 + GICv3).

* Patches 1 to 4 remove the need to use spinlocks for the Arm PMU
  driver for Armv7 and Armv8 (aarch64).
* Patch 5 moves the locking to Armv6-specific code, which is its sole
  user.
* Patches 6 and 7 make the PMU interrupt handler NMI-safe
* Patches 8 and 9 enable using pseudo-NMI for the PMU interrupt when
  the feature is available

Changes since v2[1]:
- Rebased on recent linux-next (next-20190708)
- Fixed a number of bugs with indices (reported by Wei)
- Minor style fixes

Changes since v1[2]:
- Rebased on v5.1-rc1
- Pseudo-NMI has changed a lot since then, use the (now merged) NMI API
- Remove locking from armv7 perf_event
- Use locking only in armv6 perf_event
- Use direct counter/type registers instead of selector register for armv8

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/640536.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-January/554611.html

Cheers,

Julien

-->

Julien Thierry (8):
  arm64: perf: Remove PMU locking
  arm: perf: save/restore pmsel
  arm: perf: Remove PMU locking
  perf/arm_pmu: Move PMU lock to ARMv6 events
  arm64: perf: Do not call irq_work_run in NMI context
  arm/arm64: kvm: pmu: Make overflow handler NMI safe
  arm_pmu: Introduce pmu_irq_ops
  arm_pmu: Use NMIs for PMU

Mark Rutland (1):
  arm64: perf: avoid PMXEV* indirection

 arch/arm/kernel/perf_event_v6.c |  26 +++++---
 arch/arm/kernel/perf_event_v7.c |  77 +++++++---------------
 arch/arm64/kernel/perf_event.c  | 131 ++++++++++++++++++++++++------------
 drivers/perf/arm_pmu.c          | 143 ++++++++++++++++++++++++++++++++++------
 include/kvm/arm_pmu.h           |   1 +
 include/linux/perf/arm_pmu.h    |   5 --
 virt/kvm/arm/pmu.c              |  25 ++++++-
 7 files changed, 277 insertions(+), 131 deletions(-)

--
1.9.1


* [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 15:03   ` Mark Rutland
  2019-07-10 10:57   ` Steven Price
  2019-07-08 14:32 ` [PATCH v3 2/9] arm64: perf: Remove PMU locking Julien Thierry
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	acme, alexander.shishkin, mingo, Catalin Marinas, namhyung,
	jolsa

From: Mark Rutland <mark.rutland@arm.com>

Currently we access the counter registers and their respective type
registers indirectly. This requires us to write to PMSELR, issue an ISB,
then access the relevant PMXEV* registers.

This is unfortunate, because:

* Under virtualization, accessing one register requires two traps to
  the hypervisor, even though we could access the register directly with
  a single trap.

* We have to issue an ISB which we could otherwise avoid the cost of.

* When we use NMIs, the NMI handler will have to save/restore the select
  register in case the code it preempted was attempting to access a
  counter or its type register.

We can avoid these issues by directly accessing the relevant registers.
This patch adds helpers to do so.
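
To make the difference concrete, here is a rough sketch of the two access
patterns (illustrative only, not part of the patch; the function names are
made up, but read_pmevcntrn() is the helper added below and the sysreg
accessors are the ones already used by the existing code):

	/* Indirect access, as removed by this patch: select, synchronize, read. */
	static u32 read_evcntr_indirect(u32 counter)
	{
		write_sysreg(counter, pmselr_el0);	/* select the counter */
		isb();					/* order the selection */
		return read_sysreg(pmxevcntr_el0);	/* read the selected counter */
	}

	/* Direct access, as added by this patch: a single sysreg access, no ISB. */
	static u32 read_evcntr_direct(u32 counter)
	{
		return read_pmevcntrn(counter);		/* expands to pmevcntr<n>_el0 */
	}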

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[Julien T.: Don't inline read/write functions to avoid big code-size
	increase, remove unused read_pmevtypern function,
	fix counter index issue.]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 83 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 96e90e2..7759f8a 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
 #define	ARMV8_IDX_TO_COUNTER(x)	\
 	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)

+/*
+ * Helpers for directly accessing the per-counter PMEVCNTR<n>_EL0 and
+ * PMEVTYPER<n>_EL0 registers, avoiding the PMSELR_EL0 indirection.
+ */
+
+#define PMEVN_CASE(n, case_macro) \
+	case n: case_macro(n); break
+
+#define PMEVN_SWITCH(x, case_macro)				\
+	do {							\
+		switch (x) {					\
+		PMEVN_CASE(0,  case_macro);			\
+		PMEVN_CASE(1,  case_macro);			\
+		PMEVN_CASE(2,  case_macro);			\
+		PMEVN_CASE(3,  case_macro);			\
+		PMEVN_CASE(4,  case_macro);			\
+		PMEVN_CASE(5,  case_macro);			\
+		PMEVN_CASE(6,  case_macro);			\
+		PMEVN_CASE(7,  case_macro);			\
+		PMEVN_CASE(8,  case_macro);			\
+		PMEVN_CASE(9,  case_macro);			\
+		PMEVN_CASE(10, case_macro);			\
+		PMEVN_CASE(11, case_macro);			\
+		PMEVN_CASE(12, case_macro);			\
+		PMEVN_CASE(13, case_macro);			\
+		PMEVN_CASE(14, case_macro);			\
+		PMEVN_CASE(15, case_macro);			\
+		PMEVN_CASE(16, case_macro);			\
+		PMEVN_CASE(17, case_macro);			\
+		PMEVN_CASE(18, case_macro);			\
+		PMEVN_CASE(19, case_macro);			\
+		PMEVN_CASE(20, case_macro);			\
+		PMEVN_CASE(21, case_macro);			\
+		PMEVN_CASE(22, case_macro);			\
+		PMEVN_CASE(23, case_macro);			\
+		PMEVN_CASE(24, case_macro);			\
+		PMEVN_CASE(25, case_macro);			\
+		PMEVN_CASE(26, case_macro);			\
+		PMEVN_CASE(27, case_macro);			\
+		PMEVN_CASE(28, case_macro);			\
+		PMEVN_CASE(29, case_macro);			\
+		PMEVN_CASE(30, case_macro);			\
+		default: WARN(1, "Inavlid PMEV* index");	\
+		}						\
+	} while (0)
+
+#define RETURN_READ_PMEVCNTRN(n) \
+	return read_sysreg(pmevcntr##n##_el0);
+static unsigned long read_pmevcntrn(int n)
+{
+	PMEVN_SWITCH(n, RETURN_READ_PMEVCNTRN);
+	return 0;
+}
+#undef RETURN_READ_PMEVCNTRN
+
+#define WRITE_PMEVCNTRN(n) \
+	write_sysreg(val, pmevcntr##n##_el0);
+static void write_pmevcntrn(int n, unsigned long val)
+{
+	PMEVN_SWITCH(n, WRITE_PMEVCNTRN);
+}
+#undef WRITE_PMEVCNTRN
+
+#define WRITE_PMEVTYPERN(n) \
+	write_sysreg(val, pmevtyper##n##_el0);
+static void write_pmevtypern(int n, unsigned long val)
+{
+	PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
+}
+#undef WRITE_PMEVTYPERN
+
+#undef PMEVN_SWITCH
+
 static inline u32 armv8pmu_pmcr_read(void)
 {
 	return read_sysreg(pmcr_el0);
@@ -397,17 +468,11 @@ static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
 	return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx));
 }

-static inline void armv8pmu_select_counter(int idx)
+static inline u32 armv8pmu_read_evcntr(int idx)
 {
 	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
-	write_sysreg(counter, pmselr_el0);
-	isb();
-}

-static inline u32 armv8pmu_read_evcntr(int idx)
-{
-	armv8pmu_select_counter(idx);
-	return read_sysreg(pmxevcntr_el0);
+	return read_pmevcntrn(counter);
 }

 static inline u64 armv8pmu_read_hw_counter(struct perf_event *event)
@@ -441,8 +506,9 @@ static u64 armv8pmu_read_counter(struct perf_event *event)

 static inline void armv8pmu_write_evcntr(int idx, u32 value)
 {
-	armv8pmu_select_counter(idx);
-	write_sysreg(value, pmxevcntr_el0);
+	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
+
+	write_pmevcntrn(counter, value);
 }

 static inline void armv8pmu_write_hw_counter(struct perf_event *event,
@@ -483,9 +549,10 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)

 static inline void armv8pmu_write_evtype(int idx, u32 val)
 {
-	armv8pmu_select_counter(idx);
+	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
+
 	val &= ARMV8_PMU_EVTYPE_MASK;
-	write_sysreg(val, pmxevtyper_el0);
+	write_pmevtypern(counter, val);
 }

 static inline void armv8pmu_write_event_type(struct perf_event *event)
@@ -505,7 +572,10 @@ static inline void armv8pmu_write_event_type(struct perf_event *event)
 		armv8pmu_write_evtype(idx - 1, hwc->config_base);
 		armv8pmu_write_evtype(idx, chain_evt);
 	} else {
-		armv8pmu_write_evtype(idx, hwc->config_base);
+		if (idx == ARMV8_IDX_CYCLE_COUNTER)
+			write_sysreg(hwc->config_base, pmccfiltr_el0);
+		else
+			armv8pmu_write_evtype(idx, hwc->config_base);
 	}
 }

--
1.9.1


* [PATCH v3 2/9] arm64: perf: Remove PMU locking
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
  2019-07-08 14:32 ` [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 15:03   ` Mark Rutland
  2019-07-08 14:32 ` [PATCH v3 3/9] arm: perf: save/restore pmsel Julien Thierry
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	acme, alexander.shishkin, mingo, Catalin Marinas, namhyung,
	jolsa

Since the PMU driver uses direct registers for counter
setup/manipulation, locking around these operations is no longer needed.

For operations that can be called with interrupts enabled, preemption
still needs to be disabled to ensure the programming of the PMU is
done on the expected CPU and not migrated mid-programming.
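
To make this concrete, the migration hazard that the
preempt_disable()/preempt_enable() pairs below guard against looks roughly
like this (sketch only, not part of the patch; the function name is made up):

	/* Hypothetical, unprotected version of the PMCR_EL0 update. */
	static void pmcr_set_enable_racy(void)
	{
		u32 val = armv8pmu_pmcr_read();		/* reads CPU0's PMCR_EL0 */

		/* <- task preempted and migrated to CPU1 here -> */

		armv8pmu_pmcr_write(val | ARMV8_PMU_PMCR_E);	/* writes CPU1's PMCR_EL0
								 * using CPU0's value */
	}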

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/kernel/perf_event.c | 32 ++++----------------------------
 1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 7759f8a..878c142 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -677,15 +677,10 @@ static inline u32 armv8pmu_getreset_flags(void)

 static void armv8pmu_enable_event(struct perf_event *event)
 {
-	unsigned long flags;
-	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
 	/*
 	 * Enable counter and interrupt, and set the counter to count
 	 * the event that we're interested in.
 	 */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);

 	/*
 	 * Disable counter
@@ -706,21 +701,10 @@ static void armv8pmu_enable_event(struct perf_event *event)
 	 * Enable counter
 	 */
 	armv8pmu_enable_event_counter(event);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static void armv8pmu_disable_event(struct perf_event *event)
 {
-	unsigned long flags;
-	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
-	/*
-	 * Disable counter and interrupt
-	 */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
-
 	/*
 	 * Disable counter
 	 */
@@ -730,30 +714,22 @@ static void armv8pmu_disable_event(struct perf_event *event)
 	 * Disable interrupt for this counter
 	 */
 	armv8pmu_disable_event_irq(event);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static void armv8pmu_start(struct arm_pmu *cpu_pmu)
 {
-	unsigned long flags;
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	preempt_disable();
 	/* Enable all counters */
 	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+	preempt_enable();
 }

 static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
 {
-	unsigned long flags;
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	preempt_disable();
 	/* Disable all counters */
 	armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+	preempt_enable();
 }

 static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
--
1.9.1


* [PATCH v3 3/9] arm: perf: save/restore pmsel
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
  2019-07-08 14:32 ` [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection Julien Thierry
  2019-07-08 14:32 ` [PATCH v3 2/9] arm64: perf: Remove PMU locking Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 15:06   ` Mark Rutland
  2019-07-08 14:32 ` [PATCH v3 4/9] arm: perf: Remove PMU locking Julien Thierry
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	Russell King, acme, alexander.shishkin, mingo, namhyung, jolsa

The callback pmu->read() can be called with interrupts enabled.
Currently, on ARM, this can cause the following callchain:

armpmu_read() -> armpmu_event_update() -> armv7pmu_read_counter()

The last function might modify the counter selector register and then
read the target counter, without taking any lock. With interrupts
enabled, a PMU interrupt could occur and modify the selector register
as well, between the selection and read of the interrupted context.

Save and restore the value of the selector register in the PMU interrupt
handler, ensuring the interrupted context is left with the correct PMU
registers selected.
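
For illustration, the window being closed looks roughly like this (a sketch,
not part of the patch; it uses the helpers named in the diff below):

	/*
	 * pmu->read() path, running with interrupts enabled:
	 *
	 *	armv7_pmnc_select_counter(idx);	// writes PMSELR, isb()
	 *	// <- PMU interrupt taken here; the handler reprograms PMSELR ->
	 *	value = <read of PMXEVCNTR>;	// reads whichever counter is
	 *					// now selected
	 *
	 * With this patch, armv7pmu_handle_irq() saves PMSELR via
	 * armv7_pmsel_read() on entry and restores it via armv7_pmsel_write()
	 * on exit, so the interrupted sequence resumes with its own selection.
	 */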

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
---
 arch/arm/kernel/perf_event_v7.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index a4fb0f8..c3da7a5 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -736,10 +736,22 @@ static inline int armv7_pmnc_counter_has_overflowed(u32 pmnc, int idx)
 	return pmnc & BIT(ARMV7_IDX_TO_COUNTER(idx));
 }

-static inline void armv7_pmnc_select_counter(int idx)
+static inline u32 armv7_pmsel_read(void)
+{
+	u32 pmsel;
+
+	asm volatile("mrc p15, 0, %0, c9, c12, 5" : "=&r" (pmsel));
+	return pmsel;
+}
+
+static inline void armv7_pmsel_write(u32 counter)
 {
-	u32 counter = ARMV7_IDX_TO_COUNTER(idx);
 	asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (counter));
+}
+
+static inline void armv7_pmnc_select_counter(int idx)
+{
+	armv7_pmsel_write(ARMV7_IDX_TO_COUNTER(idx));
 	isb();
 }

@@ -952,8 +964,11 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
 	struct perf_sample_data data;
 	struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
 	struct pt_regs *regs;
+	u32 pmsel;
 	int idx;

+	pmsel = armv7_pmsel_read();
+
 	/*
 	 * Get and reset the IRQ flags
 	 */
@@ -1004,6 +1019,8 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
 	 */
 	irq_work_run();

+	armv7_pmsel_write(pmsel);
+
 	return IRQ_HANDLED;
 }

--
1.9.1


* [PATCH v3 4/9] arm: perf: Remove PMU locking
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
                   ` (2 preceding siblings ...)
  2019-07-08 14:32 ` [PATCH v3 3/9] arm: perf: save/restore pmsel Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 15:10   ` Mark Rutland
  2019-07-08 14:32 ` [PATCH v3 5/9] perf/arm_pmu: Move PMU lock to ARMv6 events Julien Thierry
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	Russell King, acme, alexander.shishkin, mingo, namhyung, jolsa

Since the PMU interrupt saves and restores the value of the selector
register, there is no need to serialize register accesses against the
interrupt contexts.

For operations that can be called with interrupts enabled, preemption
still needs to be disabled to ensure the programming of the PMU is
done on the expected CPU and not migrated mid-programming.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
---
 arch/arm/kernel/perf_event_v7.c | 56 +++--------------------------------------
 1 file changed, 4 insertions(+), 52 deletions(-)

diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index c3da7a5..cb3b7a4 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -882,10 +882,8 @@ static void armv7_pmnc_dump_regs(struct arm_pmu *cpu_pmu)

 static void armv7pmu_enable_event(struct perf_event *event)
 {
-	unsigned long flags;
 	struct hw_perf_event *hwc = &event->hw;
 	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
 	int idx = hwc->idx;

 	if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) {
@@ -898,7 +896,6 @@ static void armv7pmu_enable_event(struct perf_event *event)
 	 * Enable counter and interrupt, and set the counter to count
 	 * the event that we're interested in.
 	 */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);

 	/*
 	 * Disable counter
@@ -922,16 +919,12 @@ static void armv7pmu_enable_event(struct perf_event *event)
 	 * Enable counter
 	 */
 	armv7_pmnc_enable_counter(idx);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static void armv7pmu_disable_event(struct perf_event *event)
 {
-	unsigned long flags;
 	struct hw_perf_event *hwc = &event->hw;
 	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
 	int idx = hwc->idx;

 	if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) {
@@ -941,11 +934,6 @@ static void armv7pmu_disable_event(struct perf_event *event)
 	}

 	/*
-	 * Disable counter and interrupt
-	 */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
-
-	/*
 	 * Disable counter
 	 */
 	armv7_pmnc_disable_counter(idx);
@@ -954,8 +942,6 @@ static void armv7pmu_disable_event(struct perf_event *event)
 	 * Disable interrupt for this counter
 	 */
 	armv7_pmnc_disable_intens(idx);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
@@ -1026,24 +1012,18 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)

 static void armv7pmu_start(struct arm_pmu *cpu_pmu)
 {
-	unsigned long flags;
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	preempt_disable();
 	/* Enable all counters */
 	armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E);
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+	preempt_enable();
 }

 static void armv7pmu_stop(struct arm_pmu *cpu_pmu)
 {
-	unsigned long flags;
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	preempt_disable();
 	/* Disable all counters */
 	armv7_pmnc_write(armv7_pmnc_read() & ~ARMV7_PMNC_E);
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+	preempt_enable();
 }

 static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc,
@@ -1509,14 +1489,8 @@ static void krait_clearpmu(u32 config_base)

 static void krait_pmu_disable_event(struct perf_event *event)
 {
-	unsigned long flags;
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
-	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
-	/* Disable counter and interrupt */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);

 	/* Disable counter */
 	armv7_pmnc_disable_counter(idx);
@@ -1529,23 +1503,17 @@ static void krait_pmu_disable_event(struct perf_event *event)

 	/* Disable interrupt for this counter */
 	armv7_pmnc_disable_intens(idx);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static void krait_pmu_enable_event(struct perf_event *event)
 {
-	unsigned long flags;
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
-	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);

 	/*
 	 * Enable counter and interrupt, and set the counter to count
 	 * the event that we're interested in.
 	 */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);

 	/* Disable counter */
 	armv7_pmnc_disable_counter(idx);
@@ -1565,8 +1533,6 @@ static void krait_pmu_enable_event(struct perf_event *event)

 	/* Enable counter */
 	armv7_pmnc_enable_counter(idx);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static void krait_pmu_reset(void *info)
@@ -1842,14 +1808,8 @@ static void scorpion_clearpmu(u32 config_base)

 static void scorpion_pmu_disable_event(struct perf_event *event)
 {
-	unsigned long flags;
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
-	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
-
-	/* Disable counter and interrupt */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);

 	/* Disable counter */
 	armv7_pmnc_disable_counter(idx);
@@ -1862,23 +1822,17 @@ static void scorpion_pmu_disable_event(struct perf_event *event)

 	/* Disable interrupt for this counter */
 	armv7_pmnc_disable_intens(idx);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static void scorpion_pmu_enable_event(struct perf_event *event)
 {
-	unsigned long flags;
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
-	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);

 	/*
 	 * Enable counter and interrupt, and set the counter to count
 	 * the event that we're interested in.
 	 */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);

 	/* Disable counter */
 	armv7_pmnc_disable_counter(idx);
@@ -1898,8 +1852,6 @@ static void scorpion_pmu_enable_event(struct perf_event *event)

 	/* Enable counter */
 	armv7_pmnc_enable_counter(idx);
-
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }

 static void scorpion_pmu_reset(void *info)
--
1.9.1


* [PATCH v3 5/9] perf/arm_pmu: Move PMU lock to ARMv6 events
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
                   ` (3 preceding siblings ...)
  2019-07-08 14:32 ` [PATCH v3 4/9] arm: perf: Remove PMU locking Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 15:19   ` Mark Rutland
  2019-07-08 14:32 ` [PATCH v3 6/9] arm64: perf: Do not call irq_work_run in NMI context Julien Thierry
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	Russell King, acme, alexander.shishkin, mingo, namhyung, jolsa

The perf event backends for ARMv8 and ARMv7 no longer use the pmu_lock.
The only remaining user is the ARMv6 event backend.

Move the pmu_lock out of the generic arm_pmu driver into the ARMv6 code.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
---
 arch/arm/kernel/perf_event_v6.c | 26 +++++++++++++++++---------
 drivers/perf/arm_pmu.c          |  1 -
 include/linux/perf/arm_pmu.h    |  5 -----
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c
index 1ae99de..4106a03 100644
--- a/arch/arm/kernel/perf_event_v6.c
+++ b/arch/arm/kernel/perf_event_v6.c
@@ -69,6 +69,12 @@ enum armv6_counters {
 };

 /*
+ * Hardware lock to serialize accesses to PMU registers. Needed for the
+ * read/modify/write sequences.
+ */
+static DEFINE_PER_CPU(raw_spinlock_t, pmu_lock);
+
+/*
  * The hardware events that we support. We do support cache operations but
  * we have harvard caches and no way to combine instruction and data
  * accesses/misses in hardware.
@@ -271,7 +277,7 @@ static void armv6pmu_enable_event(struct perf_event *event)
 	unsigned long val, mask, evt, flags;
 	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
+	raw_spinlock_t *lock = this_cpu_ptr(&pmu_lock);
 	int idx = hwc->idx;

 	if (ARMV6_CYCLE_COUNTER == idx) {
@@ -294,12 +300,12 @@ static void armv6pmu_enable_event(struct perf_event *event)
 	 * Mask out the current event and set the counter to count the event
 	 * that we're interested in.
 	 */
-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	raw_spin_lock_irqsave(lock, flags);
 	val = armv6_pmcr_read();
 	val &= ~mask;
 	val |= evt;
 	armv6_pmcr_write(val);
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+	raw_spin_unlock_irqrestore(lock, flags);
 }

 static irqreturn_t
@@ -363,25 +369,25 @@ static void armv6pmu_enable_event(struct perf_event *event)
 static void armv6pmu_start(struct arm_pmu *cpu_pmu)
 {
 	unsigned long flags, val;
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
+	raw_spinlock_t *lock = this_cpu_ptr(&pmu_lock);

-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	raw_spin_lock_irqsave(lock, flags);
 	val = armv6_pmcr_read();
 	val |= ARMV6_PMCR_ENABLE;
 	armv6_pmcr_write(val);
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+	raw_spin_unlock_irqrestore(lock, flags);
 }

 static void armv6pmu_stop(struct arm_pmu *cpu_pmu)
 {
 	unsigned long flags, val;
-	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
+	raw_spinlock_t *lock = this_cpu_ptr(&pmu_lock);

-	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	raw_spin_lock_irqsave(lock, flags);
 	val = armv6_pmcr_read();
 	val &= ~ARMV6_PMCR_ENABLE;
 	armv6_pmcr_write(val);
-	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+	raw_spin_unlock_irqrestore(lock, flags);
 }

 static int
@@ -502,6 +508,8 @@ static void armv6pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->stop		= armv6pmu_stop;
 	cpu_pmu->map_event	= armv6_map_event;
 	cpu_pmu->num_events	= 3;
+
+	raw_spin_lock_init(this_cpu_ptr(&pmu_lock));
 }

 static int armv6_1136_pmu_init(struct arm_pmu *cpu_pmu)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2d06b80..7fd9f15 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -823,7 +823,6 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 		struct pmu_hw_events *events;

 		events = per_cpu_ptr(pmu->hw_events, cpu);
-		raw_spin_lock_init(&events->pmu_lock);
 		events->percpu_pmu = pmu;
 	}

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 71f525a..8640b23 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -54,11 +54,6 @@ struct pmu_hw_events {
 	 */
 	DECLARE_BITMAP(used_mask, ARMPMU_MAX_HWEVENTS);

-	/*
-	 * Hardware lock to serialize accesses to PMU registers. Needed for the
-	 * read/modify/write sequences.
-	 */
-	raw_spinlock_t		pmu_lock;

 	/*
 	 * When using percpu IRQs, we need a percpu dev_id. Place it here as we
--
1.9.1


* [PATCH v3 6/9] arm64: perf: Do not call irq_work_run in NMI context
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
                   ` (4 preceding siblings ...)
  2019-07-08 14:32 ` [PATCH v3 5/9] perf/arm_pmu: Move PMU lock to ARMv6 events Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 15:29   ` Mark Rutland
  2019-07-08 14:32 ` [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe Julien Thierry
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	acme, alexander.shishkin, mingo, Catalin Marinas, namhyung,
	jolsa

Function irq_work_run is not NMI safe and should not be called from NMI
context.

When the PMU interrupt is an NMI, do not call irq_work_run. Instead, rely
on the IRQ work IPI to run the irq_work queue once the NMI/IRQ contexts
have been exited.

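As a rough sketch of the mechanism being relied upon (based on the generic
irq_work behaviour, not on code in this patch):

	/*
	 * From NMI context, perf defers its work with irq_work_queue(), e.g.:
	 *
	 *	irq_work_queue(&event->pending);
	 *
	 * irq_work_queue() raises the IRQ work self-IPI (arch_irq_work_raise()
	 * on arm64), which is only delivered once the NMI has been exited, so
	 * the queued work still runs promptly without irq_work_run() being
	 * called from the NMI handler itself.
	 */
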
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/kernel/perf_event.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 878c142..a622139 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -794,7 +794,8 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 	 * platforms that can have the PMU interrupts raised as an NMI, this
 	 * will not work.
 	 */
-	irq_work_run();
+	if (!in_nmi())
+		irq_work_run();
 
 	return IRQ_HANDLED;
 }
-- 
1.9.1



* [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
                   ` (5 preceding siblings ...)
  2019-07-08 14:32 ` [PATCH v3 6/9] arm64: perf: Do not call irq_work_run in NMI context Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 15:30   ` Mark Rutland
  2019-07-11 12:38   ` Zenghui Yu
  2019-07-08 14:32 ` [PATCH v3 8/9] arm_pmu: Introduce pmu_irq_ops Julien Thierry
  2019-07-08 14:32 ` [PATCH v3 9/9] arm_pmu: Use NMIs for PMU Julien Thierry
  8 siblings, 2 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, Suzuki K Pouloze,
	will.deacon, Christoffer Dall, acme, alexander.shishkin, mingo,
	James Morse, Marc Zyngier, namhyung, jolsa, kvmarm

When using an NMI for the PMU interrupt, taking any lock might cause a
deadlock. The current PMU overflow handler in KVM takes locks when trying
to wake up a vcpu.

When the overflow handler is called in NMI context, defer the vcpu wakeup
to an irq_work queue.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Pouloze <suzuki.poulose@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
---
 include/kvm/arm_pmu.h |  1 +
 virt/kvm/arm/pmu.c    | 25 ++++++++++++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 16c769a..8202ed7 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -27,6 +27,7 @@ struct kvm_pmu {
 	bool ready;
 	bool created;
 	bool irq_level;
+	struct irq_work overflow_work;
 };

 #define kvm_arm_pmu_v3_ready(v)		((v)->arch.pmu.ready)
diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 3dd8238..63f358e 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -421,6 +421,22 @@ void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu)
 }

 /**
+ * When the perf interrupt is an NMI, we cannot safely notify the vcpu
+ * corresponding to the event.
+ * This is why we need a callback to do it once outside of the NMI context.
+ */
+static void kvm_pmu_perf_overflow_notify_vcpu(struct irq_work *work)
+{
+	struct kvm_vcpu *vcpu;
+	struct kvm_pmu *pmu;
+
+	pmu = container_of(work, struct kvm_pmu, overflow_work);
+	vcpu = kvm_pmc_to_vcpu(&pmu->pmc[0]);
+
+	kvm_vcpu_kick(vcpu);
+}
+
+/**
  * When the perf event overflows, set the overflow status and inform the vcpu.
  */
 static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
@@ -435,7 +451,11 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,

 	if (kvm_pmu_overflow_status(vcpu)) {
 		kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
-		kvm_vcpu_kick(vcpu);
+
+		if (!in_nmi())
+			kvm_vcpu_kick(vcpu);
+		else
+			irq_work_queue(&vcpu->arch.pmu.overflow_work);
 	}
 }

@@ -706,6 +726,9 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
 			return ret;
 	}

+	init_irq_work(&vcpu->arch.pmu.overflow_work,
+		      kvm_pmu_perf_overflow_notify_vcpu);
+
 	vcpu->arch.pmu.created = true;
 	return 0;
 }
--
1.9.1


* [PATCH v3 8/9] arm_pmu: Introduce pmu_irq_ops
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
                   ` (6 preceding siblings ...)
  2019-07-08 14:32 ` [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  2019-07-08 14:32 ` [PATCH v3 9/9] arm_pmu: Use NMIs for PMU Julien Thierry
  8 siblings, 0 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	acme, alexander.shishkin, mingo, namhyung, jolsa

Currently the PMU interrupt can either be a normal irq or a percpu irq.
Supporting NMI will introduce two cases for each existing one. It becomes
a mess of 'if's when managing the interrupt.

Define sets of callbacks for operations commonly done on the interrupt. The
appropriate set of callbacks is selected at interrupt request time and
simplifies interrupt enabling/disabling and freeing.
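
As a usage sketch (illustrative; the helper name is made up, but the body
mirrors what the CPU hotplug callbacks in the diff below reduce to):

	/* Example: enabling the PMU interrupt for a CPU that comes online. */
	static void example_enable_cpu_pmuirq(unsigned int cpu, int irq)
	{
		const struct pmu_irq_ops *ops = per_cpu(cpu_irq_ops, cpu);

		if (irq && ops)
			ops->enable_pmuirq(irq);	/* enable_irq() or
							 * enable_percpu_irq(),
							 * depending on how the
							 * line was requested */
	}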

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
---
 drivers/perf/arm_pmu.c | 86 ++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 70 insertions(+), 16 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 7fd9f15..9ac072a 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -26,8 +26,46 @@
 
 #include <asm/irq_regs.h>
 
+static int armpmu_count_irq_users(const int irq);
+
+struct pmu_irq_ops {
+	void (*enable_pmuirq)(unsigned int irq);
+	void (*disable_pmuirq)(unsigned int irq);
+	void (*free_pmuirq)(unsigned int irq, int cpu, void __percpu *devid);
+};
+
+static void armpmu_free_pmuirq(unsigned int irq, int cpu, void __percpu *devid)
+{
+	free_irq(irq, per_cpu_ptr(devid, cpu));
+}
+
+static const struct pmu_irq_ops pmuirq_ops = {
+	.enable_pmuirq = enable_irq,
+	.disable_pmuirq = disable_irq_nosync,
+	.free_pmuirq = armpmu_free_pmuirq
+};
+
+static void armpmu_enable_percpu_pmuirq(unsigned int irq)
+{
+	enable_percpu_irq(irq, IRQ_TYPE_NONE);
+}
+
+static void armpmu_free_percpu_pmuirq(unsigned int irq, int cpu,
+				   void __percpu *devid)
+{
+	if (armpmu_count_irq_users(irq) == 1)
+		free_percpu_irq(irq, devid);
+}
+
+static const struct pmu_irq_ops percpu_pmuirq_ops = {
+	.enable_pmuirq = armpmu_enable_percpu_pmuirq,
+	.disable_pmuirq = disable_percpu_irq,
+	.free_pmuirq = armpmu_free_percpu_pmuirq
+};
+
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
+static DEFINE_PER_CPU(const struct pmu_irq_ops *, cpu_irq_ops);
 
 static inline u64 arm_pmu_event_max_period(struct perf_event *event)
 {
@@ -544,6 +582,19 @@ static int armpmu_count_irq_users(const int irq)
 	return count;
 }
 
+static const struct pmu_irq_ops *armpmu_find_irq_ops(int irq)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (per_cpu(cpu_irq, cpu) == irq
+		    && per_cpu(cpu_irq_ops, cpu))
+			return per_cpu(cpu_irq_ops, cpu);
+	}
+
+	return NULL;
+}
+
 void armpmu_free_irq(int irq, int cpu)
 {
 	if (per_cpu(cpu_irq, cpu) == 0)
@@ -551,18 +602,18 @@ void armpmu_free_irq(int irq, int cpu)
 	if (WARN_ON(irq != per_cpu(cpu_irq, cpu)))
 		return;
 
-	if (!irq_is_percpu_devid(irq))
-		free_irq(irq, per_cpu_ptr(&cpu_armpmu, cpu));
-	else if (armpmu_count_irq_users(irq) == 1)
-		free_percpu_irq(irq, &cpu_armpmu);
+	per_cpu(cpu_irq_ops, cpu)->free_pmuirq(irq, cpu, &cpu_armpmu);
 
 	per_cpu(cpu_irq, cpu) = 0;
+	per_cpu(cpu_irq_ops, cpu) = NULL;
 }
 
 int armpmu_request_irq(int irq, int cpu)
 {
 	int err = 0;
 	const irq_handler_t handler = armpmu_dispatch_irq;
+	const struct pmu_irq_ops *irq_ops;
+
 	if (!irq)
 		return 0;
 
@@ -584,15 +635,26 @@ int armpmu_request_irq(int irq, int cpu)
 		irq_set_status_flags(irq, IRQ_NOAUTOEN);
 		err = request_irq(irq, handler, irq_flags, "arm-pmu",
 				  per_cpu_ptr(&cpu_armpmu, cpu));
+
+		irq_ops = &pmuirq_ops;
 	} else if (armpmu_count_irq_users(irq) == 0) {
 		err = request_percpu_irq(irq, handler, "arm-pmu",
 					 &cpu_armpmu);
+
+		irq_ops = &percpu_pmuirq_ops;
+	} else {
+		/* Per cpudevid irq was already requested by another CPU */
+		irq_ops = armpmu_find_irq_ops(irq);
+
+		if (WARN_ON(!irq_ops))
+			err = -EINVAL;
 	}
 
 	if (err)
 		goto err_out;
 
 	per_cpu(cpu_irq, cpu) = irq;
+	per_cpu(cpu_irq_ops, cpu) = irq_ops;
 	return 0;
 
 err_out:
@@ -625,12 +687,8 @@ static int arm_perf_starting_cpu(unsigned int cpu, struct hlist_node *node)
 	per_cpu(cpu_armpmu, cpu) = pmu;
 
 	irq = armpmu_get_cpu_irq(pmu, cpu);
-	if (irq) {
-		if (irq_is_percpu_devid(irq))
-			enable_percpu_irq(irq, IRQ_TYPE_NONE);
-		else
-			enable_irq(irq);
-	}
+	if (irq)
+		per_cpu(cpu_irq_ops, cpu)->enable_pmuirq(irq);
 
 	return 0;
 }
@@ -644,12 +702,8 @@ static int arm_perf_teardown_cpu(unsigned int cpu, struct hlist_node *node)
 		return 0;
 
 	irq = armpmu_get_cpu_irq(pmu, cpu);
-	if (irq) {
-		if (irq_is_percpu_devid(irq))
-			disable_percpu_irq(irq);
-		else
-			disable_irq_nosync(irq);
-	}
+	if (irq)
+		per_cpu(cpu_irq_ops, cpu)->disable_pmuirq(irq);
 
 	per_cpu(cpu_armpmu, cpu) = NULL;
 
-- 
1.9.1



* [PATCH v3 9/9] arm_pmu: Use NMIs for PMU
  2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
                   ` (7 preceding siblings ...)
  2019-07-08 14:32 ` [PATCH v3 8/9] arm_pmu: Introduce pmu_irq_ops Julien Thierry
@ 2019-07-08 14:32 ` Julien Thierry
  8 siblings, 0 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 14:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, Julien Thierry, peterz, liwei391, will.deacon,
	acme, alexander.shishkin, mingo, namhyung, jolsa

Add required PMU interrupt operations for NMIs. Request interrupt lines as
NMIs when possible, otherwise fall back to normal interrupts.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
---
 drivers/perf/arm_pmu.c | 62 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 9ac072a..a9c00cd 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -45,6 +45,17 @@ static void armpmu_free_pmuirq(unsigned int irq, int cpu, void __percpu *devid)
 	.free_pmuirq = armpmu_free_pmuirq
 };
 
+static void armpmu_free_pmunmi(unsigned int irq, int cpu, void __percpu *devid)
+{
+	free_nmi(irq, per_cpu_ptr(devid, cpu));
+}
+
+static const struct pmu_irq_ops pmunmi_ops = {
+	.enable_pmuirq = enable_nmi,
+	.disable_pmuirq = disable_nmi_nosync,
+	.free_pmuirq = armpmu_free_pmunmi
+};
+
 static void armpmu_enable_percpu_pmuirq(unsigned int irq)
 {
 	enable_percpu_irq(irq, IRQ_TYPE_NONE);
@@ -63,6 +74,31 @@ static void armpmu_free_percpu_pmuirq(unsigned int irq, int cpu,
 	.free_pmuirq = armpmu_free_percpu_pmuirq
 };
 
+static void armpmu_enable_percpu_pmunmi(unsigned int irq)
+{
+	if (!prepare_percpu_nmi(irq))
+		enable_percpu_nmi(irq, IRQ_TYPE_NONE);
+}
+
+static void armpmu_disable_percpu_pmunmi(unsigned int irq)
+{
+	disable_percpu_nmi(irq);
+	teardown_percpu_nmi(irq);
+}
+
+static void armpmu_free_percpu_pmunmi(unsigned int irq, int cpu,
+				      void __percpu *devid)
+{
+	if (armpmu_count_irq_users(irq) == 1)
+		free_percpu_nmi(irq, devid);
+}
+
+static const struct pmu_irq_ops percpu_pmunmi_ops = {
+	.enable_pmuirq = armpmu_enable_percpu_pmunmi,
+	.disable_pmuirq = armpmu_disable_percpu_pmunmi,
+	.free_pmuirq = armpmu_free_percpu_pmunmi
+};
+
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
 static DEFINE_PER_CPU(const struct pmu_irq_ops *, cpu_irq_ops);
@@ -633,15 +669,29 @@ int armpmu_request_irq(int irq, int cpu)
 			    IRQF_NO_THREAD;
 
 		irq_set_status_flags(irq, IRQ_NOAUTOEN);
-		err = request_irq(irq, handler, irq_flags, "arm-pmu",
+
+		err = request_nmi(irq, handler, irq_flags, "arm-pmu",
 				  per_cpu_ptr(&cpu_armpmu, cpu));
 
-		irq_ops = &pmuirq_ops;
+		/* If cannot get an NMI, get a normal interrupt */
+		if (err) {
+			err = request_irq(irq, handler, irq_flags, "arm-pmu",
+					  per_cpu_ptr(&cpu_armpmu, cpu));
+			irq_ops = &pmuirq_ops;
+		} else {
+			irq_ops = &pmunmi_ops;
+		}
 	} else if (armpmu_count_irq_users(irq) == 0) {
-		err = request_percpu_irq(irq, handler, "arm-pmu",
-					 &cpu_armpmu);
-
-		irq_ops = &percpu_pmuirq_ops;
+		err = request_percpu_nmi(irq, handler, "arm-pmu", &cpu_armpmu);
+
+		/* If cannot get an NMI, get a normal interrupt */
+		if (err) {
+			err = request_percpu_irq(irq, handler, "arm-pmu",
+						 &cpu_armpmu);
+			irq_ops = &percpu_pmuirq_ops;
+		} else {
+			irq_ops = &percpu_pmunmi_ops;
+		}
 	} else {
 		/* Per cpudevid irq was already requested by another CPU */
 		irq_ops = armpmu_find_irq_ops(irq);
-- 
1.9.1



* Re: [PATCH v3 2/9] arm64: perf: Remove PMU locking
  2019-07-08 14:32 ` [PATCH v3 2/9] arm64: perf: Remove PMU locking Julien Thierry
@ 2019-07-08 15:03   ` Mark Rutland
  2019-07-08 15:34     ` Julien Thierry
  0 siblings, 1 reply; 28+ messages in thread
From: Mark Rutland @ 2019-07-08 15:03 UTC (permalink / raw)
  To: Julien Thierry
  Cc: peterz, liwei391, will.deacon, acme, alexander.shishkin, mingo,
	Catalin Marinas, namhyung, jolsa, linux-arm-kernel

On Mon, Jul 08, 2019 at 03:32:50PM +0100, Julien Thierry wrote:
> Since the PMU driver uses direct registers for counter
> setup/manipulation, locking around these operations is no longer needed.
> 
> For operations that can be called with interrupts enabled, preemption
> still needs to be disabled to ensure the programming of the PMU is
> done on the expected CPU and not migrated mid-programming.

[...]

>  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
>  {
> -	unsigned long flags;
> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
> -
> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
> +	preempt_disable();
>  	/* Enable all counters */
>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> +	preempt_enable();
>  }
> 
>  static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
>  {
> -	unsigned long flags;
> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
> -
> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
> +	preempt_disable();
>  	/* Disable all counters */
>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> +	preempt_enable();
>  }

I think that we'd have bigger problems if these could be called in
preemptible context, since we couldn't guarantee which HW PMU instance
they'd operate on.

I also thought that the interrupt disable/enable was a hold-over from
the old days of perf core, and these days all of the synchronous
operations are called with the pmu ctx lock held (and interrupts
disabled).

Do you have an example of when these are called with interrupts enabled?
Or in a preemptible context?

Thanks,
Mark.

> 
>  static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
> --
> 1.9.1


* Re: [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
  2019-07-08 14:32 ` [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection Julien Thierry
@ 2019-07-08 15:03   ` Mark Rutland
  2019-07-10 10:57   ` Steven Price
  1 sibling, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2019-07-08 15:03 UTC (permalink / raw)
  To: Julien Thierry
  Cc: peterz, liwei391, will.deacon, acme, alexander.shishkin, mingo,
	Catalin Marinas, namhyung, jolsa, linux-arm-kernel

On Mon, Jul 08, 2019 at 03:32:49PM +0100, Julien Thierry wrote:
> From: Mark Rutland <mark.rutland@arm.com>
> 
> Currently we access the counter registers and their respective type
> registers indirectly. This requires us to write to PMSELR, issue an ISB,
> then access the relevant PMXEV* registers.
> 
> This is unfortunate, because:
> 
> * Under virtualization, accessing one register requires two traps to
>   the hypervisor, even though we could access the register directly with
>   a single trap.
> 
> * We have to issue an ISB which we could otherwise avoid the cost of.
> 
> * When we use NMIs, the NMI handler will have to save/restore the select
>   register in case the code it preempted was attempting to access a
>   counter or its type register.
> 
> We can avoid these issues by directly accessing the relevant registers.
> This patch adds helpers to do so.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> [Julien T.: Don't inline read/write functions to avoid big code-size
> 	increase, remove unused read_pmevtypern function,
> 	fix counter index issue.]
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 83 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 96e90e2..7759f8a 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>  #define	ARMV8_IDX_TO_COUNTER(x)	\
>  	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
> 
> +/*
> + * Helpers for directly accessing the per-counter PMEVCNTR<n>_EL0 and
> + * PMEVTYPER<n>_EL0 registers, avoiding the PMSELR_EL0 indirection.
> + */
> +
> +#define PMEVN_CASE(n, case_macro) \
> +	case n: case_macro(n); break
> +
> +#define PMEVN_SWITCH(x, case_macro)				\
> +	do {							\
> +		switch (x) {					\
> +		PMEVN_CASE(0,  case_macro);			\
> +		PMEVN_CASE(1,  case_macro);			\
> +		PMEVN_CASE(2,  case_macro);			\
> +		PMEVN_CASE(3,  case_macro);			\
> +		PMEVN_CASE(4,  case_macro);			\
> +		PMEVN_CASE(5,  case_macro);			\
> +		PMEVN_CASE(6,  case_macro);			\
> +		PMEVN_CASE(7,  case_macro);			\
> +		PMEVN_CASE(8,  case_macro);			\
> +		PMEVN_CASE(9,  case_macro);			\
> +		PMEVN_CASE(10, case_macro);			\
> +		PMEVN_CASE(11, case_macro);			\
> +		PMEVN_CASE(12, case_macro);			\
> +		PMEVN_CASE(13, case_macro);			\
> +		PMEVN_CASE(14, case_macro);			\
> +		PMEVN_CASE(15, case_macro);			\
> +		PMEVN_CASE(16, case_macro);			\
> +		PMEVN_CASE(17, case_macro);			\
> +		PMEVN_CASE(18, case_macro);			\
> +		PMEVN_CASE(19, case_macro);			\
> +		PMEVN_CASE(20, case_macro);			\
> +		PMEVN_CASE(21, case_macro);			\
> +		PMEVN_CASE(22, case_macro);			\
> +		PMEVN_CASE(23, case_macro);			\
> +		PMEVN_CASE(24, case_macro);			\
> +		PMEVN_CASE(25, case_macro);			\
> +		PMEVN_CASE(26, case_macro);			\
> +		PMEVN_CASE(27, case_macro);			\
> +		PMEVN_CASE(28, case_macro);			\
> +		PMEVN_CASE(29, case_macro);			\
> +		PMEVN_CASE(30, case_macro);			\
> +		default: WARN(1, "Inavlid PMEV* index");	\

Nit: s/inavlid/invalid/

> +		}						\
> +	} while (0)
> +
> +#define RETURN_READ_PMEVCNTRN(n) \
> +	return read_sysreg(pmevcntr##n##_el0);
> +static unsigned long read_pmevcntrn(int n)
> +{
> +	PMEVN_SWITCH(n, RETURN_READ_PMEVCNTRN);
> +	return 0;
> +}
> +#undef RETURN_READ_PMEVCNTRN
> +
> +#define WRITE_PMEVCNTRN(n) \
> +	write_sysreg(val, pmevcntr##n##_el0);
> +static void write_pmevcntrn(int n, unsigned long val)
> +{
> +	PMEVN_SWITCH(n, WRITE_PMEVCNTRN);
> +}
> +#undef WRITE_PMEVCNTRN
> +
> +#define WRITE_PMEVTYPERN(n) \
> +	write_sysreg(val, pmevtyper##n##_el0);
> +static void write_pmevtypern(int n, unsigned long val)
> +{
> +	PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
> +}
> +#undef WRITE_PMEVTYPERN
> +
> +#undef PMEVN_SWITCH

I think we can drop the undefs. These are local to this C file, and the
names are sufficiently unique to avoid collision. Note that we missed
the undef for PMEVN_CASE, and I imagine keeping that sane will be messy
in future.

Other than that, I haven't come up with a nicer way of writing the
above, so that looks good to me.

> +
>  static inline u32 armv8pmu_pmcr_read(void)
>  {
>  	return read_sysreg(pmcr_el0);
> @@ -397,17 +468,11 @@ static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
>  	return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx));
>  }
> 
> -static inline void armv8pmu_select_counter(int idx)
> +static inline u32 armv8pmu_read_evcntr(int idx)
>  {
>  	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> -	write_sysreg(counter, pmselr_el0);
> -	isb();
> -}
> 
> -static inline u32 armv8pmu_read_evcntr(int idx)
> -{
> -	armv8pmu_select_counter(idx);
> -	return read_sysreg(pmxevcntr_el0);
> +	return read_pmevcntrn(counter);
>  }
> 
>  static inline u64 armv8pmu_read_hw_counter(struct perf_event *event)
> @@ -441,8 +506,9 @@ static u64 armv8pmu_read_counter(struct perf_event *event)
> 
>  static inline void armv8pmu_write_evcntr(int idx, u32 value)
>  {
> -	armv8pmu_select_counter(idx);
> -	write_sysreg(value, pmxevcntr_el0);
> +	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> +
> +	write_pmevcntrn(counter, value);
>  }
> 
>  static inline void armv8pmu_write_hw_counter(struct perf_event *event,
> @@ -483,9 +549,10 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
> 
>  static inline void armv8pmu_write_evtype(int idx, u32 val)
>  {
> -	armv8pmu_select_counter(idx);
> +	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> +
>  	val &= ARMV8_PMU_EVTYPE_MASK;
> -	write_sysreg(val, pmxevtyper_el0);
> +	write_pmevtypern(counter, val);
>  }
> 
>  static inline void armv8pmu_write_event_type(struct perf_event *event)
> @@ -505,7 +572,10 @@ static inline void armv8pmu_write_event_type(struct perf_event *event)
>  		armv8pmu_write_evtype(idx - 1, hwc->config_base);
>  		armv8pmu_write_evtype(idx, chain_evt);
>  	} else {
> -		armv8pmu_write_evtype(idx, hwc->config_base);
> +		if (idx == ARMV8_IDX_CYCLE_COUNTER)
> +			write_sysreg(hwc->config_base, pmccfiltr_el0);
> +		else
> +			armv8pmu_write_evtype(idx, hwc->config_base);
>  	}
>  }

... and this all looks sound.

With the typo fixed and undefs dropped:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

> 
> --
> 1.9.1


* Re: [PATCH v3 3/9] arm: perf: save/restore pmsel
  2019-07-08 14:32 ` [PATCH v3 3/9] arm: perf: save/restore pmsel Julien Thierry
@ 2019-07-08 15:06   ` Mark Rutland
  2019-07-08 15:40     ` Julien Thierry
  0 siblings, 1 reply; 28+ messages in thread
From: Mark Rutland @ 2019-07-08 15:06 UTC (permalink / raw)
  To: Julien Thierry
  Cc: peterz, liwei391, will.deacon, Russell King, acme,
	alexander.shishkin, mingo, namhyung, jolsa, linux-arm-kernel

On Mon, Jul 08, 2019 at 03:32:51PM +0100, Julien Thierry wrote:
> The callback pmu->read() can be called with interrupts enabled.
> Currently, on ARM, this can cause the following callchain:
> 
> armpmu_read() -> armpmu_event_update() -> armv7pmu_read_counter()
> 
> The last function might modify the counter selector register and then
> read the target counter, without taking any lock. With interrupts
> enabled, a PMU interrupt could occur and modify the selector register
> as well, between the selection and read of the interrupted context.
> 
> Save and restore the value of the selector register in the PMU interrupt
> handler, ensuring the interrupted context is left with the correct PMU
> registers selected.

IIUC, this is a latent bug, so I guess it should be Cc'd stable?

> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Russell King <linux@armlinux.org.uk>
> ---
>  arch/arm/kernel/perf_event_v7.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
> index a4fb0f8..c3da7a5 100644
> --- a/arch/arm/kernel/perf_event_v7.c
> +++ b/arch/arm/kernel/perf_event_v7.c
> @@ -736,10 +736,22 @@ static inline int armv7_pmnc_counter_has_overflowed(u32 pmnc, int idx)
>  	return pmnc & BIT(ARMV7_IDX_TO_COUNTER(idx));
>  }
> 
> -static inline void armv7_pmnc_select_counter(int idx)
> +static inline u32 armv7_pmsel_read(void)
> +{
> +	u32 pmsel;
> +
> +	asm volatile("mrc p15, 0, %0, c9, c12, 5" : "=&r" (pmsel));
> +	return pmsel;
> +}
> +
> +static inline void armv7_pmsel_write(u32 counter)
>  {
> -	u32 counter = ARMV7_IDX_TO_COUNTER(idx);
>  	asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (counter));
> +}
> +
> +static inline void armv7_pmnc_select_counter(int idx)
> +{
> +	armv7_pmsel_write(ARMV7_IDX_TO_COUNTER(idx));
>  	isb();
>  }
> 
> @@ -952,8 +964,11 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
>  	struct perf_sample_data data;
>  	struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
>  	struct pt_regs *regs;
> +	u32 pmsel;
>  	int idx;
> 
> +	pmsel = armv7_pmsel_read();

Could we add a comment explaining why we need to save/restore this?

Otherwise, this looks good to me.

Thanks,
Mark.

> +
>  	/*
>  	 * Get and reset the IRQ flags
>  	 */
> @@ -1004,6 +1019,8 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
>  	 */
>  	irq_work_run();
> 
> +	armv7_pmsel_write(pmsel);
> +
>  	return IRQ_HANDLED;
>  }
> 
> --
> 1.9.1


* Re: [PATCH v3 4/9] arm: perf: Remove Remove PMU locking
  2019-07-08 14:32 ` [PATCH v3 4/9] arm: perf: Remove Remove PMU locking Julien Thierry
@ 2019-07-08 15:10   ` Mark Rutland
  0 siblings, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2019-07-08 15:10 UTC (permalink / raw)
  To: Julien Thierry
  Cc: peterz, liwei391, will.deacon, Russell King, acme,
	alexander.shishkin, mingo, namhyung, jolsa, linux-arm-kernel

On Mon, Jul 08, 2019 at 03:32:52PM +0100, Julien Thierry wrote:
> Since the PMU interrupt saves and restores the value of the selector
> register, there is no need to serialize register accesses against the
> interrupt contexts.
> 
> For operations that can be called with interrupts enabled, preemption
> still needs to be disabled to ensure the programming of the PMU is
> done on the expected CPU and not migrated mid-programming.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Russell King <linux@armlinux.org.uk>
> ---
>  arch/arm/kernel/perf_event_v7.c | 56 +++--------------------------------------
>  1 file changed, 4 insertions(+), 52 deletions(-)

>  static void armv7pmu_start(struct arm_pmu *cpu_pmu)
>  {
> -	unsigned long flags;
> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
> -
> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
> +	preempt_disable();
>  	/* Enable all counters */
>  	armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E);
> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> +	preempt_enable();
>  }
> 
>  static void armv7pmu_stop(struct arm_pmu *cpu_pmu)
>  {
> -	unsigned long flags;
> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
> -
> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
> +	preempt_disable();
>  	/* Disable all counters */
>  	armv7_pmnc_write(armv7_pmnc_read() & ~ARMV7_PMNC_E);
> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> +	preempt_enable();
>  }

As with the arm64 patch, I suspect/hope it's not necessary to disable
preemption here.
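
If so, these would presumably reduce to plain PMNC updates, e.g. (an
untested sketch, relying on callers already running on the right CPU
with the PMU context held):

static void armv7pmu_start(struct arm_pmu *cpu_pmu)
{
	/* Enable all counters */
	armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E);
}

... and likewise for armv7pmu_stop().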

Thanks,
Mark.


* Re: [PATCH v3 5/9] perf/arm_pmu: Move PMU lock to ARMv6 events
  2019-07-08 14:32 ` [PATCH v3 5/9] perf/arm_pmu: Move PMU lock to ARMv6 events Julien Thierry
@ 2019-07-08 15:19   ` Mark Rutland
  2019-07-08 15:50     ` Julien Thierry
  0 siblings, 1 reply; 28+ messages in thread
From: Mark Rutland @ 2019-07-08 15:19 UTC (permalink / raw)
  To: Julien Thierry
  Cc: peterz, liwei391, will.deacon, Russell King, acme,
	alexander.shishkin, mingo, namhyung, jolsa, linux-arm-kernel

On Mon, Jul 08, 2019 at 03:32:53PM +0100, Julien Thierry wrote:
> Perf event backend for ARMv8 and ARMv7 no longer uses the pmu_lock.
> The only remaining user is the ARMv6 event backend.
> 
> Move the pmu_lock out of the generic arm_pmu driver into the ARMv6 code.

Moving this into the ARMv6 PMU code makes sense to me.

[...]

> @@ -502,6 +508,8 @@ static void armv6pmu_init(struct arm_pmu *cpu_pmu)
>  	cpu_pmu->stop		= armv6pmu_stop;
>  	cpu_pmu->map_event	= armv6_map_event;
>  	cpu_pmu->num_events	= 3;
> +
> +	raw_spin_lock_init(this_cpu_ptr(&pmu_lock));

This needs to initialize the lock on each CPU in order for armv6mpcore.

In __armpmu_alloc we had:

for_each_possible_cpu(cpu) {
	struct pmu_hw_events *events;
	events = per_cpu_ptr(pmu->hw_events, cpu);
	raw_spin_lock_init(&events->pmu_lock);
	...
}

... which did that for us.
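
With the lock now private to the ARMv6 code, the equivalent would be
something like this in armv6pmu_init() (a sketch, assuming pmu_lock
stays a per-CPU raw spinlock):

	int cpu;

	for_each_possible_cpu(cpu)
		raw_spin_lock_init(per_cpu_ptr(&pmu_lock, cpu));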

Otherwise, this looks good to me.

Thanks,
Mark.


* Re: [PATCH v3 6/9] arm64: perf: Do not call irq_work_run in NMI context
  2019-07-08 14:32 ` [PATCH v3 6/9] arm64: perf: Do not call irq_work_run in NMI context Julien Thierry
@ 2019-07-08 15:29   ` Mark Rutland
  2019-07-08 16:00     ` Julien Thierry
  0 siblings, 1 reply; 28+ messages in thread
From: Mark Rutland @ 2019-07-08 15:29 UTC (permalink / raw)
  To: Julien Thierry
  Cc: peterz, liwei391, will.deacon, acme, alexander.shishkin, mingo,
	Catalin Marinas, namhyung, jolsa, linux-arm-kernel

On Mon, Jul 08, 2019 at 03:32:54PM +0100, Julien Thierry wrote:
> Function irq_work_run is not NMI safe and should not be called from NMI
> context.
> 
> When PMU interrupt is an NMI do not call irq_work_run. Instead rely on the
> IRQ work IPI to run the irq_work queue once NMI/IRQ contexts have been
> exited.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 878c142..a622139 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -794,7 +794,8 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>  	 * platforms that can have the PMU interrupts raised as an NMI, this
>  	 * will not work.
>  	 */
> -	irq_work_run();
> +	if (!in_nmi())
> +		irq_work_run();

Could we always defer the IRQ work, or would that be problematic for
non-NMI?

We should probably update the comment to explain why this is safe in
either case.

Thanks,
Mark.

>  
>  	return IRQ_HANDLED;
>  }
> -- 
> 1.9.1
> 


* Re: [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe
  2019-07-08 14:32 ` [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe Julien Thierry
@ 2019-07-08 15:30   ` Mark Rutland
  2019-07-11 12:38   ` Zenghui Yu
  1 sibling, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2019-07-08 15:30 UTC (permalink / raw)
  To: Julien Thierry, Marc Zyngier
  Cc: Suzuki K Pouloze, peterz, liwei391, will.deacon,
	Christoffer Dall, acme, alexander.shishkin, mingo, James Morse,
	namhyung, jolsa, kvmarm, linux-arm-kernel

Marc, I assume that you will take a look at this from the KVM side. :)

On Mon, Jul 08, 2019 at 03:32:55PM +0100, Julien Thierry wrote:
> When using an NMI for the PMU interrupt, taking any lock migh cause a
> deadlock. The current PMU overflow handler in KVM takes takes locks when
> trying to wake up a vcpu.
> 
> When overflow handler is called by an NMI, defer the vcpu waking in an
> irq_work queue.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com>
> Cc: kvmarm@lists.cs.columbia.edu
> ---
>  include/kvm/arm_pmu.h |  1 +
>  virt/kvm/arm/pmu.c    | 25 ++++++++++++++++++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 16c769a..8202ed7 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -27,6 +27,7 @@ struct kvm_pmu {
>  	bool ready;
>  	bool created;
>  	bool irq_level;
> +	struct irq_work overflow_work;
>  };
> 
>  #define kvm_arm_pmu_v3_ready(v)		((v)->arch.pmu.ready)
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 3dd8238..63f358e 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -421,6 +421,22 @@ void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu)
>  }
> 
>  /**
> + * When perf interrupt is an NMI, we cannot safely notify the vcpu corresponding
> + * to the even.

Nit: s/even/event/

Thanks,
Mark.

> + * This is why we need a callback to do it once outside of the NMI context.
> + */
> +static void kvm_pmu_perf_overflow_notify_vcpu(struct irq_work *work)
> +{
> +	struct kvm_vcpu *vcpu;
> +	struct kvm_pmu *pmu;
> +
> +	pmu = container_of(work, struct kvm_pmu, overflow_work);
> +	vcpu = kvm_pmc_to_vcpu(&pmu->pmc[0]);
> +
> +	kvm_vcpu_kick(vcpu);
> +}
> +
> +/**
>   * When the perf event overflows, set the overflow status and inform the vcpu.
>   */
>  static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
> @@ -435,7 +451,11 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
> 
>  	if (kvm_pmu_overflow_status(vcpu)) {
>  		kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
> -		kvm_vcpu_kick(vcpu);
> +
> +		if (!in_nmi())
> +			kvm_vcpu_kick(vcpu);
> +		else
> +			irq_work_queue(&vcpu->arch.pmu.overflow_work);
>  	}
>  }
> 
> @@ -706,6 +726,9 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
>  			return ret;
>  	}
> 
> +	init_irq_work(&vcpu->arch.pmu.overflow_work,
> +		      kvm_pmu_perf_overflow_notify_vcpu);
> +
>  	vcpu->arch.pmu.created = true;
>  	return 0;
>  }
> --
> 1.9.1


* Re: [PATCH v3 2/9] arm64: perf: Remove PMU locking
  2019-07-08 15:03   ` Mark Rutland
@ 2019-07-08 15:34     ` Julien Thierry
  2019-07-09 11:22       ` Mark Rutland
  0 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 15:34 UTC (permalink / raw)
  To: Mark Rutland
  Cc: peterz, Catalin Marinas, jolsa, will.deacon, acme,
	alexander.shishkin, mingo, namhyung, liwei391, linux-arm-kernel



On 08/07/2019 16:03, Mark Rutland wrote:
> On Mon, Jul 08, 2019 at 03:32:50PM +0100, Julien Thierry wrote:
>> Since the PMU driver uses direct registers for counter
>> setup/manipulation, locking around these operations is no longer needed.
>>
>> For operations that can be called with interrupts enabled, preemption
>> still needs to be disabled to ensure the programming of the PMU is
>> done on the expected CPU and not migrated mid-programming.
> 
> [...]
> 
>>  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
>>  {
>> -	unsigned long flags;
>> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
>> -
>> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
>> +	preempt_disable();
>>  	/* Enable all counters */
>>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
>> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
>> +	preempt_enable();
>>  }
>>
>>  static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
>>  {
>> -	unsigned long flags;
>> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
>> -
>> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
>> +	preempt_disable();
>>  	/* Disable all counters */
>>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
>> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
>> +	preempt_enable();
>>  }
> 
> I think that we'd have bigger problems if these could be called in
> preemptible context, since we couldn't guarantee which HW PMU instance
> they'd operate on.
> 
> I also thought that the interrupt disable/enable was a hold-over from
> the old days of perf core, and these days all of the synchronous
> operations are held with the pmu ctx lock held (and interrupts
> disabled).
> 
> Do you have an example of when these are called with interrupts enabled?
> Or in a preemptible context?
> 

I must admit that not seeing mention of the pmu_enable/disable callbacks
being called with preemption disabled in perf_event.h, I assumed the
worst. But looking at it, it does look like they are always called
either with interrupts or preemption disabled, and the x86
implementation doesn't seem to worry about being preempted in those
callbacks.

I guess I can get rid of the preempt_enable/disable.
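
So for the next version, armv8pmu_start()/stop() would end up as plain
PMCR updates, e.g. (sketch):

static void armv8pmu_start(struct arm_pmu *cpu_pmu)
{
	/* Enable all counters */
	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
}

static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
{
	/* Disable all counters */
	armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
}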

Thanks,

-- 
Julien Thierry


* Re: [PATCH v3 3/9] arm: perf: save/resore pmsel
  2019-07-08 15:06   ` Mark Rutland
@ 2019-07-08 15:40     ` Julien Thierry
  0 siblings, 0 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 15:40 UTC (permalink / raw)
  To: Mark Rutland
  Cc: peterz, liwei391, will.deacon, Russell King, acme,
	alexander.shishkin, mingo, namhyung, jolsa, linux-arm-kernel



On 08/07/2019 16:06, Mark Rutland wrote:
> On Mon, Jul 08, 2019 at 03:32:51PM +0100, Julien Thierry wrote:
>> The callback pmu->read() can be called with interrupts enabled.
>> Currently, on ARM, this can cause the following callchain:
>>
>> armpmu_read() -> armpmu_event_update() -> armv7pmu_read_counter()
>>
>> The last function might modify the counter selector register and then
>> read the target counter, without taking any lock. With interrupts
>> enabled, a PMU interrupt could occur and modify the selector register
>> as well, between the selection and read of the interrupted context.
>>
>> Save and restore the value of the selector register in the PMU interrupt
>> handler, ensuring the interrupted context is left with the correct PMU
>> registers selected.
> 
> IIUC, this is a latent bug, so I guess it should be Cc'd stable?
> 

It's my understanding as well. I'll put stable in copy for next iteration.
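
i.e. a tag along the lines of:

	Cc: stable@vger.kernel.org

next to the other Cc's in the patch.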

>>
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Russell King <linux@armlinux.org.uk>
>> ---
>>  arch/arm/kernel/perf_event_v7.c | 21 +++++++++++++++++++--
>>  1 file changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
>> index a4fb0f8..c3da7a5 100644
>> --- a/arch/arm/kernel/perf_event_v7.c
>> +++ b/arch/arm/kernel/perf_event_v7.c
>> @@ -736,10 +736,22 @@ static inline int armv7_pmnc_counter_has_overflowed(u32 pmnc, int idx)
>>  	return pmnc & BIT(ARMV7_IDX_TO_COUNTER(idx));
>>  }
>>
>> -static inline void armv7_pmnc_select_counter(int idx)
>> +static inline u32 armv7_pmsel_read(void)
>> +{
>> +	u32 pmsel;
>> +
>> +	asm volatile("mrc p15, 0, %0, c9, c12, 5" : "=&r" (pmsel));
>> +	return pmsel;
>> +}
>> +
>> +static inline void armv7_pmsel_write(u32 counter)
>>  {
>> -	u32 counter = ARMV7_IDX_TO_COUNTER(idx);
>>  	asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (counter));
>> +}
>> +
>> +static inline void armv7_pmnc_select_counter(int idx)
>> +{
>> +	armv7_pmsel_write(ARMV7_IDX_TO_COUNTER(idx));
>>  	isb();
>>  }
>>
>> @@ -952,8 +964,11 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>  	struct perf_sample_data data;
>>  	struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
>>  	struct pt_regs *regs;
>> +	u32 pmsel;
>>  	int idx;
>>
>> +	pmsel = armv7_pmsel_read();
> 
> Could we add a comment explaining why we need to save/restore this?
> 

Sure, will do.
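
Probably something along these lines (wording to be polished for the
next version):

	/*
	 * The PMU interrupt can be taken in the middle of a counter
	 * select/access sequence (e.g. armv7pmu_read_counter() called
	 * with interrupts enabled). Save the interrupted context's
	 * PMSELR value here and restore it before returning, so that
	 * context still operates on the counter it selected.
	 */
	pmsel = armv7_pmsel_read();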

> Otherwise, this looks good to me.

Thanks,

-- 
Julien Thierry


* Re: [PATCH v3 5/9] perf/arm_pmu: Move PMU lock to ARMv6 events
  2019-07-08 15:19   ` Mark Rutland
@ 2019-07-08 15:50     ` Julien Thierry
  0 siblings, 0 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 15:50 UTC (permalink / raw)
  To: Mark Rutland
  Cc: peterz, liwei391, will.deacon, Russell King, acme,
	alexander.shishkin, mingo, namhyung, jolsa, linux-arm-kernel



On 08/07/2019 16:19, Mark Rutland wrote:
> On Mon, Jul 08, 2019 at 03:32:53PM +0100, Julien Thierry wrote:
>> Perf event backend for ARMv8 and ARMv7 no longer uses the pmu_lock.
>> The only remaining user is the ARMv6 event backend.
>>
>> Move the pmu_lock out of the generic arm_pmu driver into the ARMv6 code.
> 
> Moving this into the ARMv6 PMU code makes sense to me.
> 
> [...]
> 
>> @@ -502,6 +508,8 @@ static void armv6pmu_init(struct arm_pmu *cpu_pmu)
>>  	cpu_pmu->stop		= armv6pmu_stop;
>>  	cpu_pmu->map_event	= armv6_map_event;
>>  	cpu_pmu->num_events	= 3;
>> +
>> +	raw_spin_lock_init(this_cpu_ptr(&pmu_lock));
> 
> This needs to initialize the lock on each CPU in order for armv6mpcore.
> 
> In __armpmu_alloc we had:
> 
> for_each_possible_cpu(cpu) {
> 	struct pmu_hw_events *events;
> 	events = per_cpu_ptr(pmu->hw_events, cpu);
> 	raw_spin_lock_init(&events->pmu_lock);
> 	...
> }
> 
> ... which did that for us.
> 

Oops, thanks for spotting that. Will fix it for next version.

> Otherwise, this looks good to me.
> 

Thanks,

-- 
Julien Thierry


* Re: [PATCH v3 6/9] arm64: perf: Do not call irq_work_run in NMI context
  2019-07-08 15:29   ` Mark Rutland
@ 2019-07-08 16:00     ` Julien Thierry
  0 siblings, 0 replies; 28+ messages in thread
From: Julien Thierry @ 2019-07-08 16:00 UTC (permalink / raw)
  To: Mark Rutland
  Cc: peterz, Catalin Marinas, jolsa, will.deacon, acme,
	alexander.shishkin, mingo, namhyung, liwei391, linux-arm-kernel



On 08/07/2019 16:29, Mark Rutland wrote:
> On Mon, Jul 08, 2019 at 03:32:54PM +0100, Julien Thierry wrote:
>> Function irq_work_run is not NMI safe and should not be called from NMI
>> context.
>>
>> When PMU interrupt is an NMI do not call irq_work_run. Instead rely on the
>> IRQ work IPI to run the irq_work queue once NMI/IRQ contexts have been
>> exited.
>>
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> ---
>>  arch/arm64/kernel/perf_event.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>> index 878c142..a622139 100644
>> --- a/arch/arm64/kernel/perf_event.c
>> +++ b/arch/arm64/kernel/perf_event.c
>> @@ -794,7 +794,8 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>  	 * platforms that can have the PMU interrupts raised as an NMI, this
>>  	 * will not work.
>>  	 */
>> -	irq_work_run();
>> +	if (!in_nmi())
>> +		irq_work_run();
> 
> Could we always defer the IRQ work, or would that be problematic for
> non-NMI?
> 

In terms of functionality, we can always defer it. It will just delay
the event processing (and possibly re-enabling) to the moment where the
IRQ work IPI is taken.

If I remember correctly, queuing irq_works will send the IPI request
anyway; the only difference is how much work will be done in the IPI
handler...

Personally, I think it makes sense to always rely on the IPI.

> We should probably update the comment to explain why this is safe in
> either case.
> 

I'll add a comment where the irq_work is queued.
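
Roughly along these lines (exact wording/placement to be decided):

	/*
	 * If the PMU interrupt is a regular IRQ, the irq_work can be
	 * run right here. If it is an NMI, irq_work_queue() has already
	 * raised the IRQ work IPI, so the queued work runs once we have
	 * left NMI/IRQ context.
	 */
	if (!in_nmi())
		irq_work_run();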

Thanks,

-- 
Julien Thierry


* Re: [PATCH v3 2/9] arm64: perf: Remove PMU locking
  2019-07-08 15:34     ` Julien Thierry
@ 2019-07-09 11:22       ` Mark Rutland
  0 siblings, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2019-07-09 11:22 UTC (permalink / raw)
  To: Julien Thierry
  Cc: peterz, Catalin Marinas, jolsa, will.deacon, acme,
	alexander.shishkin, mingo, namhyung, liwei391, linux-arm-kernel

On Mon, Jul 08, 2019 at 04:34:01PM +0100, Julien Thierry wrote:
> 
> 
> On 08/07/2019 16:03, Mark Rutland wrote:
> > On Mon, Jul 08, 2019 at 03:32:50PM +0100, Julien Thierry wrote:
> >> Since the PMU driver uses direct registers for counter
> >> setup/manipulation, locking around these operations is no longer needed.
> >>
> >> For operations that can be called with interrupts enabled, preemption
> >> still needs to be disabled to ensure the programming of the PMU is
> >> done on the expected CPU and not migrated mid-programming.
> > 
> > [...]
> > 
> >>  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
> >>  {
> >> -	unsigned long flags;
> >> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
> >> -
> >> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
> >> +	preempt_disable();
> >>  	/* Enable all counters */
> >>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> >> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> >> +	preempt_enable();
> >>  }
> >>
> >>  static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
> >>  {
> >> -	unsigned long flags;
> >> -	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
> >> -
> >> -	raw_spin_lock_irqsave(&events->pmu_lock, flags);
> >> +	preempt_disable();
> >>  	/* Disable all counters */
> >>  	armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
> >> -	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> >> +	preempt_enable();
> >>  }
> > 
> > I think that we'd have bigger problems if these could be called in
> > preemptible context, since we couldn't guarantee which HW PMU instance
> > they'd operate on.
> > 
> > I also thought that the interrupt disable/enable was a hold-over from
> > the old days of perf core, and these days all of the synchronous
> > operations are held with the pmu ctx lock held (and interrupts
> > disabled).
> > 
> > Do you have an example of when these are called with interrupts enabled?
> > Or in a preemptible context?
> 
> I must admit that not seeing mention of the pmu_enable/disable callbacks
> being called with preemption disabled in perf_event.h, I assumed the
> worst. But looking at it, it does look like they are always called
> either with interrupts or preemption disabled, and the x86
> implementation doesn't seem to worry about being preempted in those
> callbacks.
> 
> I guess I can get rid of the preempt_enable/disable.

Yes please!

As an aside, this is something I'd like to explicitly annotate on
functions so that we could statically and dnyamically test it. For now
I'm happy to rely on the semantics documented in perf_event.h.

Thanks,
Mark.


* Re: [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
  2019-07-08 14:32 ` [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection Julien Thierry
  2019-07-08 15:03   ` Mark Rutland
@ 2019-07-10 10:57   ` Steven Price
  2019-07-10 11:01     ` Julien Thierry
  1 sibling, 1 reply; 28+ messages in thread
From: Steven Price @ 2019-07-10 10:57 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel
  Cc: mark.rutland, peterz, Catalin Marinas, jolsa, will.deacon, acme,
	alexander.shishkin, mingo, namhyung, liwei391

On 08/07/2019 15:32, Julien Thierry wrote:
> From: Mark Rutland <mark.rutland@arm.com>
> 
> Currently we access the counter registers and their respective type
> registers indirectly. This requires us to write to PMSELR, issue an ISB,
> then access the relevant PMXEV* registers.
> 
> This is unfortunate, because:
> 
> * Under virtualization, accessing one registers requires two traps to
>   the hypervisor, even though we could access the register directly with
>   a single trap.
> 
> * We have to issue an ISB which we could otherwise avoid the cost of.
> 
> * When we use NMIs, the NMI handler will have to save/restore the select
>   register in case the code it preempted was attempting to access a
>   counter or its type register.
> 
> We can avoid these issues by directly accessing the relevant registers.
> This patch adds helpers to do so.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> [Julien T.: Don't inline read/write functions to avoid big code-size
> 	increase, remove unused read_pmevtypern function,
> 	fix counter index issue.]
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 83 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 96e90e2..7759f8a 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>  #define	ARMV8_IDX_TO_COUNTER(x)	\
>  	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
> 
> +/*
> + * This code is really good
> + */
> +
> +#define PMEVN_CASE(n, case_macro) \
> +	case n: case_macro(n); break
> +
> +#define PMEVN_SWITCH(x, case_macro)				\
> +	do {							\
> +		switch (x) {					\
> +		PMEVN_CASE(0,  case_macro);			\
> +		PMEVN_CASE(1,  case_macro);			\
> +		PMEVN_CASE(2,  case_macro);			\
> +		PMEVN_CASE(3,  case_macro);			\
> +		PMEVN_CASE(4,  case_macro);			\
> +		PMEVN_CASE(5,  case_macro);			\
> +		PMEVN_CASE(6,  case_macro);			\
> +		PMEVN_CASE(7,  case_macro);			\
> +		PMEVN_CASE(8,  case_macro);			\
> +		PMEVN_CASE(9,  case_macro);			\
> +		PMEVN_CASE(10, case_macro);			\
> +		PMEVN_CASE(11, case_macro);			\
> +		PMEVN_CASE(12, case_macro);			\
> +		PMEVN_CASE(13, case_macro);			\
> +		PMEVN_CASE(14, case_macro);			\
> +		PMEVN_CASE(15, case_macro);			\
> +		PMEVN_CASE(16, case_macro);			\
> +		PMEVN_CASE(17, case_macro);			\
> +		PMEVN_CASE(18, case_macro);			\
> +		PMEVN_CASE(19, case_macro);			\

Is 20 missing on purpose?

> +		PMEVN_CASE(21, case_macro);			\
> +		PMEVN_CASE(22, case_macro);			\
> +		PMEVN_CASE(23, case_macro);			\
> +		PMEVN_CASE(24, case_macro);			\
> +		PMEVN_CASE(25, case_macro);			\
> +		PMEVN_CASE(26, case_macro);			\
> +		PMEVN_CASE(27, case_macro);			\
> +		PMEVN_CASE(28, case_macro);			\
> +		PMEVN_CASE(29, case_macro);			\
> +		PMEVN_CASE(30, case_macro);			\
> +		default: WARN(1, "Inavlid PMEV* index");	\
				  ^^^^^^^ Invalid

Steve

> +		}						\
> +	} while (0)
> +
> +#define RETURN_READ_PMEVCNTRN(n) \
> +	return read_sysreg(pmevcntr##n##_el0);
> +static unsigned long read_pmevcntrn(int n)
> +{
> +	PMEVN_SWITCH(n, RETURN_READ_PMEVCNTRN);
> +	return 0;
> +}
> +#undef RETURN_READ_PMEVCNTRN
> +
> +#define WRITE_PMEVCNTRN(n) \
> +	write_sysreg(val, pmevcntr##n##_el0);
> +static void write_pmevcntrn(int n, unsigned long val)
> +{
> +	PMEVN_SWITCH(n, WRITE_PMEVCNTRN);
> +}
> +#undef WRITE_PMEVCNTRN
> +
> +#define WRITE_PMEVTYPERN(n) \
> +	write_sysreg(val, pmevtyper##n##_el0);
> +static void write_pmevtypern(int n, unsigned long val)
> +{
> +	PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
> +}
> +#undef WRITE_PMEVTYPERN
> +
> +#undef PMEVN_SWITCH
> +
>  static inline u32 armv8pmu_pmcr_read(void)
>  {
>  	return read_sysreg(pmcr_el0);
> @@ -397,17 +468,11 @@ static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
>  	return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx));
>  }
> 
> -static inline void armv8pmu_select_counter(int idx)
> +static inline u32 armv8pmu_read_evcntr(int idx)
>  {
>  	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> -	write_sysreg(counter, pmselr_el0);
> -	isb();
> -}
> 
> -static inline u32 armv8pmu_read_evcntr(int idx)
> -{
> -	armv8pmu_select_counter(idx);
> -	return read_sysreg(pmxevcntr_el0);
> +	return read_pmevcntrn(counter);
>  }
> 
>  static inline u64 armv8pmu_read_hw_counter(struct perf_event *event)
> @@ -441,8 +506,9 @@ static u64 armv8pmu_read_counter(struct perf_event *event)
> 
>  static inline void armv8pmu_write_evcntr(int idx, u32 value)
>  {
> -	armv8pmu_select_counter(idx);
> -	write_sysreg(value, pmxevcntr_el0);
> +	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> +
> +	write_pmevcntrn(counter, value);
>  }
> 
>  static inline void armv8pmu_write_hw_counter(struct perf_event *event,
> @@ -483,9 +549,10 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
> 
>  static inline void armv8pmu_write_evtype(int idx, u32 val)
>  {
> -	armv8pmu_select_counter(idx);
> +	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
> +
>  	val &= ARMV8_PMU_EVTYPE_MASK;
> -	write_sysreg(val, pmxevtyper_el0);
> +	write_pmevtypern(counter, val);
>  }
> 
>  static inline void armv8pmu_write_event_type(struct perf_event *event)
> @@ -505,7 +572,10 @@ static inline void armv8pmu_write_event_type(struct perf_event *event)
>  		armv8pmu_write_evtype(idx - 1, hwc->config_base);
>  		armv8pmu_write_evtype(idx, chain_evt);
>  	} else {
> -		armv8pmu_write_evtype(idx, hwc->config_base);
> +		if (idx == ARMV8_IDX_CYCLE_COUNTER)
> +			write_sysreg(hwc->config_base, pmccfiltr_el0);
> +		else
> +			armv8pmu_write_evtype(idx, hwc->config_base);
>  	}
>  }
> 
> --
> 1.9.1
> 

* Re: [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
  2019-07-10 10:57   ` Steven Price
@ 2019-07-10 11:01     ` Julien Thierry
  2019-07-16 10:33       ` Shijith Thotton
  0 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-10 11:01 UTC (permalink / raw)
  To: Steven Price, linux-arm-kernel
  Cc: mark.rutland, peterz, Catalin Marinas, jolsa, will.deacon, acme,
	alexander.shishkin, mingo, namhyung, liwei391



On 10/07/2019 11:57, Steven Price wrote:
> On 08/07/2019 15:32, Julien Thierry wrote:
>> From: Mark Rutland <mark.rutland@arm.com>
>>
>> Currently we access the counter registers and their respective type
>> registers indirectly. This requires us to write to PMSELR, issue an ISB,
>> then access the relevant PMXEV* registers.
>>
>> This is unfortunate, because:
>>
>> * Under virtualization, accessing one registers requires two traps to
>>   the hypervisor, even though we could access the register directly with
>>   a single trap.
>>
>> * We have to issue an ISB which we could otherwise avoid the cost of.
>>
>> * When we use NMIs, the NMI handler will have to save/restore the select
>>   register in case the code it preempted was attempting to access a
>>   counter or its type register.
>>
>> We can avoid these issues by directly accessing the relevant registers.
>> This patch adds helpers to do so.
>>
>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>> [Julien T.: Don't inline read/write functions to avoid big code-size
>> 	increase, remove unused read_pmevtypern function,
>> 	fix counter index issue.]
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> ---
>>  arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 83 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>> index 96e90e2..7759f8a 100644
>> --- a/arch/arm64/kernel/perf_event.c
>> +++ b/arch/arm64/kernel/perf_event.c
>> @@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>>  #define	ARMV8_IDX_TO_COUNTER(x)	\
>>  	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
>>
>> +/*
>> + * This code is really good
>> + */
>> +
>> +#define PMEVN_CASE(n, case_macro) \
>> +	case n: case_macro(n); break
>> +
>> +#define PMEVN_SWITCH(x, case_macro)				\
>> +	do {							\
>> +		switch (x) {					\
>> +		PMEVN_CASE(0,  case_macro);			\
>> +		PMEVN_CASE(1,  case_macro);			\
>> +		PMEVN_CASE(2,  case_macro);			\
>> +		PMEVN_CASE(3,  case_macro);			\
>> +		PMEVN_CASE(4,  case_macro);			\
>> +		PMEVN_CASE(5,  case_macro);			\
>> +		PMEVN_CASE(6,  case_macro);			\
>> +		PMEVN_CASE(7,  case_macro);			\
>> +		PMEVN_CASE(8,  case_macro);			\
>> +		PMEVN_CASE(9,  case_macro);			\
>> +		PMEVN_CASE(10, case_macro);			\
>> +		PMEVN_CASE(11, case_macro);			\
>> +		PMEVN_CASE(12, case_macro);			\
>> +		PMEVN_CASE(13, case_macro);			\
>> +		PMEVN_CASE(14, case_macro);			\
>> +		PMEVN_CASE(15, case_macro);			\
>> +		PMEVN_CASE(16, case_macro);			\
>> +		PMEVN_CASE(17, case_macro);			\
>> +		PMEVN_CASE(18, case_macro);			\
>> +		PMEVN_CASE(19, case_macro);			\
> 
> Is 20 missing on purpose?
> 

That would have been fun to debug. Well spotted!

I'll fix it in the next version.
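
For reference, the missing line is just the following, slotted between
the 19 and 21 cases:

		PMEVN_CASE(20, case_macro);			\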

Thanks,

-- 
Julien Thierry


* Re: [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe
  2019-07-08 14:32 ` [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe Julien Thierry
  2019-07-08 15:30   ` Mark Rutland
@ 2019-07-11 12:38   ` Zenghui Yu
  1 sibling, 0 replies; 28+ messages in thread
From: Zenghui Yu @ 2019-07-11 12:38 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel
  Cc: peterz, jolsa, will.deacon, acme, alexander.shishkin, mingo,
	Marc Zyngier, namhyung, liwei391, kvmarm

Hi Julien,

On 2019/7/8 22:32, Julien Thierry wrote:
> When using an NMI for the PMU interrupt, taking any lock migh cause a
                                                            ^^^^
s/migh/might/

> deadlock. The current PMU overflow handler in KVM takes takes locks when
                                                     ^^^^^ ^^^^^
                                                     two "takes" ?


Thanks,
zenghui

> trying to wake up a vcpu.
> 
> When overflow handler is called by an NMI, defer the vcpu waking in an
> irq_work queue.



* Re: [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
  2019-07-10 11:01     ` Julien Thierry
@ 2019-07-16 10:33       ` Shijith Thotton
  2019-07-16 10:54         ` Julien Thierry
  0 siblings, 1 reply; 28+ messages in thread
From: Shijith Thotton @ 2019-07-16 10:33 UTC (permalink / raw)
  To: Julien Thierry, Steven Price, linux-arm-kernel
  Cc: mark.rutland, peterz, Catalin Marinas, will.deacon, acme,
	alexander.shishkin, mingo, namhyung, jolsa, liwei391



On 7/10/19 4:01 AM, Julien Thierry wrote:
> 
> 
> On 10/07/2019 11:57, Steven Price wrote:
>> On 08/07/2019 15:32, Julien Thierry wrote:
>>> From: Mark Rutland <mark.rutland@arm.com>
>>>
>>> Currently we access the counter registers and their respective type
>>> registers indirectly. This requires us to write to PMSELR, issue an ISB,
>>> then access the relevant PMXEV* registers.
>>>
>>> This is unfortunate, because:
>>>
>>> * Under virtualization, accessing one registers requires two traps to
>>>    the hypervisor, even though we could access the register directly with
>>>    a single trap.
>>>
>>> * We have to issue an ISB which we could otherwise avoid the cost of.
>>>
>>> * When we use NMIs, the NMI handler will have to save/restore the select
>>>    register in case the code it preempted was attempting to access a
>>>    counter or its type register.
>>>
>>> We can avoid these issues by directly accessing the relevant registers.
>>> This patch adds helpers to do so.
>>>
>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>> [Julien T.: Don't inline read/write functions to avoid big code-size
>>> 	increase, remove unused read_pmevtypern function,
>>> 	fix counter index issue.]
>>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>>> Cc: Will Deacon <will.deacon@arm.com>
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> ---
>>>   arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
>>>   1 file changed, 83 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>>> index 96e90e2..7759f8a 100644
>>> --- a/arch/arm64/kernel/perf_event.c
>>> +++ b/arch/arm64/kernel/perf_event.c
>>> @@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>>>   #define	ARMV8_IDX_TO_COUNTER(x)	\
>>>   	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
>>>
>>> +/*
>>> + * This code is really good
>>> + */
>>> +
>>> +#define PMEVN_CASE(n, case_macro) \
>>> +	case n: case_macro(n); break
>>> +
>>> +#define PMEVN_SWITCH(x, case_macro)				\
>>> +	do {							\
>>> +		switch (x) {					\
>>> +		PMEVN_CASE(0,  case_macro);			\
>>> +		PMEVN_CASE(1,  case_macro);			\
>>> +		PMEVN_CASE(2,  case_macro);			\
>>> +		PMEVN_CASE(3,  case_macro);			\
>>> +		PMEVN_CASE(4,  case_macro);			\
>>> +		PMEVN_CASE(5,  case_macro);			\
>>> +		PMEVN_CASE(6,  case_macro);			\
>>> +		PMEVN_CASE(7,  case_macro);			\
>>> +		PMEVN_CASE(8,  case_macro);			\
>>> +		PMEVN_CASE(9,  case_macro);			\
>>> +		PMEVN_CASE(10, case_macro);			\
>>> +		PMEVN_CASE(11, case_macro);			\
>>> +		PMEVN_CASE(12, case_macro);			\
>>> +		PMEVN_CASE(13, case_macro);			\
>>> +		PMEVN_CASE(14, case_macro);			\
>>> +		PMEVN_CASE(15, case_macro);			\
>>> +		PMEVN_CASE(16, case_macro);			\
>>> +		PMEVN_CASE(17, case_macro);			\
>>> +		PMEVN_CASE(18, case_macro);			\
>>> +		PMEVN_CASE(19, case_macro);			\
>>
>> Is 20 missing on purpose?
>>
> 
> That would have been fun to debug. Well spotted!
> 
> I'll fix it in the next version.
> 
> Thanks,
> 

Tried perf top/record on this patch and both are working fine.
Output of perf record on "iperf -c 127.0.0.1 -t 30" is below. (single core)

With Pseudo-NMI:
     20.35%  [k] lock_acquire
     16.95%  [k] lock_release
     11.02%  [k] __arch_copy_from_user
      7.78%  [k] lock_is_held_type
      2.12%  [k] ipt_do_table
      1.34%  [k] kmem_cache_free
      1.25%  [k] _raw_spin_unlock_irqrestore
      1.21%  [k] __nf_conntrack_find_get
      1.06%  [k] get_page_from_freelist
      1.04%  [k] ktime_get
      1.03%  [k] kfree
      1.00%  [k] nf_conntrack_tcp_packet
      0.96%  [k] tcp_sendmsg_locked
      0.87%  [k] __softirqentry_text_start
      0.87%  [k] process_backlog
      0.76%  [k] __local_bh_enable_ip
      0.75%  [k] ip_finish_output2
      0.68%  [k] __tcp_transmit_skb
      0.62%  [k] enqueue_to_backlog
      0.60%  [k] __lock_acquire.isra.17
      0.58%  [k] __free_pages_ok
      0.54%  [k] nf_conntrack_in

With IRQ:
     16.49%  [k] __arch_copy_from_user
     12.38%  [k] _raw_spin_unlock_irqrestore
      9.41%  [k] lock_acquire
      6.92%  [k] lock_release
      3.71%  [k] lock_is_held_type
      2.80%  [k] ipt_do_table
      2.06%  [k] get_page_from_freelist
      1.82%  [k] ktime_get
      1.73%  [k] process_backlog
      1.27%  [k] nf_conntrack_tcp_packet
      1.21%  [k] enqueue_to_backlog
      1.17%  [k] __tcp_transmit_skb
      1.12%  [k] ip_finish_output2
      1.11%  [k] tcp_sendmsg_locked
      1.06%  [k] __free_pages_ok
      1.05%  [k] tcp_ack
      0.99%  [k] __netif_receive_skb_core
      0.88%  [k] __nf_conntrack_find_get
      0.71%  [k] nf_conntrack_in
      0.61%  [k] kmem_cache_free
      0.59%  [k] kfree
      0.57%  [k] __alloc_pages_nodemask

Thanks Julien and Wei,
Tested-by: Shijith Thotton <sthotton@marvell.com>

* Re: [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
  2019-07-16 10:33       ` Shijith Thotton
@ 2019-07-16 10:54         ` Julien Thierry
  2019-07-17  4:45           ` Shijith Thotton
  0 siblings, 1 reply; 28+ messages in thread
From: Julien Thierry @ 2019-07-16 10:54 UTC (permalink / raw)
  To: Shijith Thotton, Steven Price, linux-arm-kernel
  Cc: Mark Rutland, peterz, Catalin Marinas, Will Deacon, acme,
	alexander.shishkin, mingo, namhyung, jolsa, liwei391

Hi Shijith,

On 16/07/2019 11:33, Shijith Thotton wrote:
>
>
> On 7/10/19 4:01 AM, Julien Thierry wrote:
>>
>>
>> On 10/07/2019 11:57, Steven Price wrote:
>>> On 08/07/2019 15:32, Julien Thierry wrote:
>>>> From: Mark Rutland <mark.rutland@arm.com>
>>>>
>>>> Currently we access the counter registers and their respective type
>>>> registers indirectly. This requires us to write to PMSELR, issue an ISB,
>>>> then access the relevant PMXEV* registers.
>>>>
>>>> This is unfortunate, because:
>>>>
>>>> * Under virtualization, accessing one registers requires two traps to
>>>>    the hypervisor, even though we could access the register directly with
>>>>    a single trap.
>>>>
>>>> * We have to issue an ISB which we could otherwise avoid the cost of.
>>>>
>>>> * When we use NMIs, the NMI handler will have to save/restore the select
>>>>    register in case the code it preempted was attempting to access a
>>>>    counter or its type register.
>>>>
>>>> We can avoid these issues by directly accessing the relevant registers.
>>>> This patch adds helpers to do so.
>>>>
>>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>>> [Julien T.: Don't inline read/write functions to avoid big code-size
>>>>    increase, remove unused read_pmevtypern function,
>>>>    fix counter index issue.]
>>>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>>>> Cc: Will Deacon <will.deacon@arm.com>
>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>> ---
>>>>   arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
>>>>   1 file changed, 83 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>>>> index 96e90e2..7759f8a 100644
>>>> --- a/arch/arm64/kernel/perf_event.c
>>>> +++ b/arch/arm64/kernel/perf_event.c
>>>> @@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>>>>   #define  ARMV8_IDX_TO_COUNTER(x) \
>>>>    (((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
>>>>
>>>> +/*
>>>> + * This code is really good
>>>> + */
>>>> +
>>>> +#define PMEVN_CASE(n, case_macro) \
>>>> +  case n: case_macro(n); break
>>>> +
>>>> +#define PMEVN_SWITCH(x, case_macro)                               \
>>>> +  do {                                                    \
>>>> +          switch (x) {                                    \
>>>> +          PMEVN_CASE(0,  case_macro);                     \
>>>> +          PMEVN_CASE(1,  case_macro);                     \
>>>> +          PMEVN_CASE(2,  case_macro);                     \
>>>> +          PMEVN_CASE(3,  case_macro);                     \
>>>> +          PMEVN_CASE(4,  case_macro);                     \
>>>> +          PMEVN_CASE(5,  case_macro);                     \
>>>> +          PMEVN_CASE(6,  case_macro);                     \
>>>> +          PMEVN_CASE(7,  case_macro);                     \
>>>> +          PMEVN_CASE(8,  case_macro);                     \
>>>> +          PMEVN_CASE(9,  case_macro);                     \
>>>> +          PMEVN_CASE(10, case_macro);                     \
>>>> +          PMEVN_CASE(11, case_macro);                     \
>>>> +          PMEVN_CASE(12, case_macro);                     \
>>>> +          PMEVN_CASE(13, case_macro);                     \
>>>> +          PMEVN_CASE(14, case_macro);                     \
>>>> +          PMEVN_CASE(15, case_macro);                     \
>>>> +          PMEVN_CASE(16, case_macro);                     \
>>>> +          PMEVN_CASE(17, case_macro);                     \
>>>> +          PMEVN_CASE(18, case_macro);                     \
>>>> +          PMEVN_CASE(19, case_macro);                     \
>>>
>>> Is 20 missing on purpose?
>>>
>>
>> That would have been fun to debug. Well spotted!
>>
>> I'll fix it in the next version.
>>
>> Thanks,
>>
>
> Tried perf top/record on this patch and both are working fine.
> Output of perf record on "iperf -c 127.0.0.1 -t 30" is below. (single core)
>
> With Pseudo-NMI:
>      20.35%  [k] lock_acquire
>      16.95%  [k] lock_release
>      11.02%  [k] __arch_copy_from_user
>       7.78%  [k] lock_is_held_type
>       2.12%  [k] ipt_do_table
>       1.34%  [k] kmem_cache_free
>       1.25%  [k] _raw_spin_unlock_irqrestore
>       1.21%  [k] __nf_conntrack_find_get
>       1.06%  [k] get_page_from_freelist
>       1.04%  [k] ktime_get
>       1.03%  [k] kfree
>       1.00%  [k] nf_conntrack_tcp_packet
>       0.96%  [k] tcp_sendmsg_locked
>       0.87%  [k] __softirqentry_text_start
>       0.87%  [k] process_backlog
>       0.76%  [k] __local_bh_enable_ip
>       0.75%  [k] ip_finish_output2
>       0.68%  [k] __tcp_transmit_skb
>       0.62%  [k] enqueue_to_backlog
>       0.60%  [k] __lock_acquire.isra.17
>       0.58%  [k] __free_pages_ok
>       0.54%  [k] nf_conntrack_in
>
> With IRQ:
>      16.49%  [k] __arch_copy_from_user
>      12.38%  [k] _raw_spin_unlock_irqrestore
>       9.41%  [k] lock_acquire
>       6.92%  [k] lock_release
>       3.71%  [k] lock_is_held_type
>       2.80%  [k] ipt_do_table
>       2.06%  [k] get_page_from_freelist
>       1.82%  [k] ktime_get
>       1.73%  [k] process_backlog
>       1.27%  [k] nf_conntrack_tcp_packet
>       1.21%  [k] enqueue_to_backlog
>       1.17%  [k] __tcp_transmit_skb
>       1.12%  [k] ip_finish_output2
>       1.11%  [k] tcp_sendmsg_locked
>       1.06%  [k] __free_pages_ok
>       1.05%  [k] tcp_ack
>       0.99%  [k] __netif_receive_skb_core
>       0.88%  [k] __nf_conntrack_find_get
>       0.71%  [k] nf_conntrack_in
>       0.61%  [k] kmem_cache_free
>       0.59%  [k] kfree
>       0.57%  [k] __alloc_pages_nodemask
>
> Thanks Julien and Wei,
> Tested-by: Shijith Thotton <sthotton@marvell.com>
>

Thanks for testing this and confirming the improvement.

I'm gonna post a new version soon. Is it alright if I apply this tag for
the other arm64 patches that enable the use of Pseudo-NMI for the PMU?
(I'm mostly thinking of patches 8 and 9, since there haven't been
comments on them and they won't have behavioural changes in the next version).

Thanks,

--
Julien Thierry

* Re: [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
  2019-07-16 10:54         ` Julien Thierry
@ 2019-07-17  4:45           ` Shijith Thotton
  0 siblings, 0 replies; 28+ messages in thread
From: Shijith Thotton @ 2019-07-17  4:45 UTC (permalink / raw)
  To: Julien Thierry, Steven Price, linux-arm-kernel
  Cc: Mark Rutland, peterz, Catalin Marinas, Will Deacon, acme,
	alexander.shishkin, mingo, namhyung, jolsa, liwei391

Hi Julien,

On 7/16/19 3:54 AM, Julien Thierry wrote:
> On 16/07/2019 11:33, Shijith Thotton wrote:
>> On 7/10/19 4:01 AM, Julien Thierry wrote:
>>> On 10/07/2019 11:57, Steven Price wrote:
>>>> On 08/07/2019 15:32, Julien Thierry wrote:
>>>>> From: Mark Rutland <mark.rutland@arm.com>
>>>>>
>>>>> Currently we access the counter registers and their respective type
>>>>> registers indirectly. This requires us to write to PMSELR, issue an ISB,
>>>>> then access the relevant PMXEV* registers.
>>>>>
>>>>> This is unfortunate, because:
>>>>>
>>>>> * Under virtualization, accessing one registers requires two traps to
>>>>>     the hypervisor, even though we could access the register directly with
>>>>>     a single trap.
>>>>>
>>>>> * We have to issue an ISB which we could otherwise avoid the cost of.
>>>>>
>>>>> * When we use NMIs, the NMI handler will have to save/restore the select
>>>>>     register in case the code it preempted was attempting to access a
>>>>>     counter or its type register.
>>>>>
>>>>> We can avoid these issues by directly accessing the relevant registers.
>>>>> This patch adds helpers to do so.
>>>>>
>>>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>>>> [Julien T.: Don't inline read/write functions to avoid big code-size
>>>>>     increase, remove unused read_pmevtypern function,
>>>>>     fix counter index issue.]
>>>>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>>>>> Cc: Will Deacon <will.deacon@arm.com>
>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>>>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>> ---
>>>>>    arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
>>>>>    1 file changed, 83 insertions(+), 13 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>>>>> index 96e90e2..7759f8a 100644
>>>>> --- a/arch/arm64/kernel/perf_event.c
>>>>> +++ b/arch/arm64/kernel/perf_event.c
>>>>> @@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>>>>>    #define  ARMV8_IDX_TO_COUNTER(x) \
>>>>>     (((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
>>>>>
>>>>> +/*
>>>>> + * This code is really good
>>>>> + */
>>>>> +
>>>>> +#define PMEVN_CASE(n, case_macro) \
>>>>> +  case n: case_macro(n); break
>>>>> +
>>>>> +#define PMEVN_SWITCH(x, case_macro)                               \
>>>>> +  do {                                                    \
>>>>> +          switch (x) {                                    \
>>>>> +          PMEVN_CASE(0,  case_macro);                     \
>>>>> +          PMEVN_CASE(1,  case_macro);                     \
>>>>> +          PMEVN_CASE(2,  case_macro);                     \
>>>>> +          PMEVN_CASE(3,  case_macro);                     \
>>>>> +          PMEVN_CASE(4,  case_macro);                     \
>>>>> +          PMEVN_CASE(5,  case_macro);                     \
>>>>> +          PMEVN_CASE(6,  case_macro);                     \
>>>>> +          PMEVN_CASE(7,  case_macro);                     \
>>>>> +          PMEVN_CASE(8,  case_macro);                     \
>>>>> +          PMEVN_CASE(9,  case_macro);                     \
>>>>> +          PMEVN_CASE(10, case_macro);                     \
>>>>> +          PMEVN_CASE(11, case_macro);                     \
>>>>> +          PMEVN_CASE(12, case_macro);                     \
>>>>> +          PMEVN_CASE(13, case_macro);                     \
>>>>> +          PMEVN_CASE(14, case_macro);                     \
>>>>> +          PMEVN_CASE(15, case_macro);                     \
>>>>> +          PMEVN_CASE(16, case_macro);                     \
>>>>> +          PMEVN_CASE(17, case_macro);                     \
>>>>> +          PMEVN_CASE(18, case_macro);                     \
>>>>> +          PMEVN_CASE(19, case_macro);                     \
>>>>
>>>> Is 20 missing on purpose?
>>>>
>>>
>>> That would have been fun to debug. Well spotted!
>>>
>>> I'll fix it in the next version.
>>>
>>> Thanks,
>>>
>>
>> Tried perf top/record on this patch and both are working fine.
>> Output of perf record on "iperf -c 127.0.0.1 -t 30" is below. (single core)
>>
>> With Pseudo-NMI:
>>       20.35%  [k] lock_acquire
>>       16.95%  [k] lock_release
>>       11.02%  [k] __arch_copy_from_user
>>        7.78%  [k] lock_is_held_type
>>        2.12%  [k] ipt_do_table
>>        1.34%  [k] kmem_cache_free
>>        1.25%  [k] _raw_spin_unlock_irqrestore
>>        1.21%  [k] __nf_conntrack_find_get
>>        1.06%  [k] get_page_from_freelist
>>        1.04%  [k] ktime_get
>>        1.03%  [k] kfree
>>        1.00%  [k] nf_conntrack_tcp_packet
>>        0.96%  [k] tcp_sendmsg_locked
>>        0.87%  [k] __softirqentry_text_start
>>        0.87%  [k] process_backlog
>>        0.76%  [k] __local_bh_enable_ip
>>        0.75%  [k] ip_finish_output2
>>        0.68%  [k] __tcp_transmit_skb
>>        0.62%  [k] enqueue_to_backlog
>>        0.60%  [k] __lock_acquire.isra.17
>>        0.58%  [k] __free_pages_ok
>>        0.54%  [k] nf_conntrack_in
>>
>> With IRQ:
>>       16.49%  [k] __arch_copy_from_user
>>       12.38%  [k] _raw_spin_unlock_irqrestore
>>        9.41%  [k] lock_acquire
>>        6.92%  [k] lock_release
>>        3.71%  [k] lock_is_held_type
>>        2.80%  [k] ipt_do_table
>>        2.06%  [k] get_page_from_freelist
>>        1.82%  [k] ktime_get
>>        1.73%  [k] process_backlog
>>        1.27%  [k] nf_conntrack_tcp_packet
>>        1.21%  [k] enqueue_to_backlog
>>        1.17%  [k] __tcp_transmit_skb
>>        1.12%  [k] ip_finish_output2
>>        1.11%  [k] tcp_sendmsg_locked
>>        1.06%  [k] __free_pages_ok
>>        1.05%  [k] tcp_ack
>>        0.99%  [k] __netif_receive_skb_core
>>        0.88%  [k] __nf_conntrack_find_get
>>        0.71%  [k] nf_conntrack_in
>>        0.61%  [k] kmem_cache_free
>>        0.59%  [k] kfree
>>        0.57%  [k] __alloc_pages_nodemask
>>
>> Thanks Julien and Wei,
>> Tested-by: Shijith Thotton <sthotton@marvell.com>
>>
> 
> Thanks for testing this and confirming the improvement.
> 
> I'm gonna post a new version soon. Is it alright if I apply this tag for
> the other arm64 patches that enable the use of Pseudo-NMI for the PMU?
> (I'm mostly thinking of patches 8 and 9, since there haven't been
> comments on them and they won't have behavioural changes in the next version).
> 

Yes please.

Thanks,
Shijith

end of thread, other threads:[~2019-07-17  4:46 UTC | newest]

Thread overview: 28+ messages
2019-07-08 14:32 [PATCH v3 0/9] arm_pmu: Use NMI for perf interrupt Julien Thierry
2019-07-08 14:32 ` [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection Julien Thierry
2019-07-08 15:03   ` Mark Rutland
2019-07-10 10:57   ` Steven Price
2019-07-10 11:01     ` Julien Thierry
2019-07-16 10:33       ` Shijith Thotton
2019-07-16 10:54         ` Julien Thierry
2019-07-17  4:45           ` Shijith Thotton
2019-07-08 14:32 ` [PATCH v3 2/9] arm64: perf: Remove PMU locking Julien Thierry
2019-07-08 15:03   ` Mark Rutland
2019-07-08 15:34     ` Julien Thierry
2019-07-09 11:22       ` Mark Rutland
2019-07-08 14:32 ` [PATCH v3 3/9] arm: perf: save/resore pmsel Julien Thierry
2019-07-08 15:06   ` Mark Rutland
2019-07-08 15:40     ` Julien Thierry
2019-07-08 14:32 ` [PATCH v3 4/9] arm: perf: Remove Remove PMU locking Julien Thierry
2019-07-08 15:10   ` Mark Rutland
2019-07-08 14:32 ` [PATCH v3 5/9] perf/arm_pmu: Move PMU lock to ARMv6 events Julien Thierry
2019-07-08 15:19   ` Mark Rutland
2019-07-08 15:50     ` Julien Thierry
2019-07-08 14:32 ` [PATCH v3 6/9] arm64: perf: Do not call irq_work_run in NMI context Julien Thierry
2019-07-08 15:29   ` Mark Rutland
2019-07-08 16:00     ` Julien Thierry
2019-07-08 14:32 ` [PATCH v3 7/9] arm/arm64: kvm: pmu: Make overflow handler NMI safe Julien Thierry
2019-07-08 15:30   ` Mark Rutland
2019-07-11 12:38   ` Zenghui Yu
2019-07-08 14:32 ` [PATCH v3 8/9] arm_pmu: Introduce pmu_irq_ops Julien Thierry
2019-07-08 14:32 ` [PATCH v3 9/9] arm_pmu: Use NMIs for PMU Julien Thierry
