* [PATCH 0/4] perf IRQ fixes
@ 2012-02-23 15:58 Will Deacon
  2012-02-23 15:58 ` [PATCH 1/4] ARM: perf: limit sample_period to half max_period in non-sampling mode Will Deacon
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Will Deacon @ 2012-02-23 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

This series is a set of fixes to the IRQ handling for perf-events on
ARM. Thanks to Ming Lei for identifying a couple of the issues addressed
here.

I tested the xscale2 changes on my Gumstix Verdex (PXA270 although the
board isn't actually supported in mainline), the v6 changes on a PB1176
and the v7 changes on a Cortex-A9 vexpress.

I think all of these are candidates for -stable, although it may be too
late for the final cut of 3.3.

All comments welcome,

Will


Will Deacon (4):
  ARM: perf: limit sample_period to half max_period in non-sampling
    mode
  ARM: perf: clear overflow flag when disabling counter on ARMv7 PMU
  ARM: perf: check that we have an event in the PMU IRQ handlers
  ARM: perf: fix overflow handling for xscale2 PMUs

 arch/arm/include/asm/pmu.h          |    2 +-
 arch/arm/kernel/perf_event.c        |   22 +++++++++++-----------
 arch/arm/kernel/perf_event_v6.c     |   22 +++-------------------
 arch/arm/kernel/perf_event_v7.c     |   11 ++++++++++-
 arch/arm/kernel/perf_event_xscale.c |   20 ++++++++++++++++----
 5 files changed, 41 insertions(+), 36 deletions(-)

-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4] ARM: perf: limit sample_period to half max_period in non-sampling mode
  2012-02-23 15:58 [PATCH 0/4] perf IRQ fixes Will Deacon
@ 2012-02-23 15:58 ` Will Deacon
  2012-02-23 15:58 ` [PATCH 2/4] ARM: perf: clear overflow flag when disabling counter on ARMv7 PMU Will Deacon
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2012-02-23 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

On ARM, the PMU does not stop counting after an overflow and therefore
IRQ latency affects the new counter value read by the kernel. This is
significant for non-sampling runs where it is possible for the new value
to overtake the previous one, causing the delta to be out by up to
max_period events.

Commit a737823d ("ARM: 6835/1: perf: ensure overflows aren't missed due
to IRQ latency") attempted to fix this problem by allowing interrupt
handlers to pass an overflow flag to the event update function, causing
the overflow calculation to assume that the counter passed through zero
when going from prev to new. Unfortunately, this doesn't work when
overflow occurs on the perf_task_tick path because we have the flag
cleared and end up computing a large negative delta.

This patch removes the overflow flag from armpmu_event_update and
instead limits the sample_period to half of the max_period for
non-sampling profiling runs.
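
The new wrap-safe delta computation can be illustrated in isolation. This is a standalone sketch rather than the kernel code: it assumes, as armpmu_event_update does, that max_period is an all-ones mask for the counter width and that fewer than max_period events elapse between the two reads.

```c
#include <stdint.h>

/*
 * For a hardware counter that wraps at (max_period + 1), where
 * max_period is an all-ones mask (e.g. 0xffffffff for a 32-bit
 * counter), masking the subtraction yields the correct event count
 * even when the counter wrapped between the two reads.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t new_count,
                              uint64_t max_period)
{
	return (new_count - prev) & max_period;
}
```

Halving sample_period for non-sampling runs then guarantees, barring extreme IRQ latency, that the true delta stays below max_period, which is the condition under which this masked subtraction is unambiguous.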

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/pmu.h          |    2 +-
 arch/arm/kernel/perf_event.c        |   22 +++++++++++-----------
 arch/arm/kernel/perf_event_v6.c     |    2 +-
 arch/arm/kernel/perf_event_v7.c     |    2 +-
 arch/arm/kernel/perf_event_xscale.c |    4 ++--
 5 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h
index b5a5be2..90114fa 100644
--- a/arch/arm/include/asm/pmu.h
+++ b/arch/arm/include/asm/pmu.h
@@ -134,7 +134,7 @@ int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type);
 
 u64 armpmu_event_update(struct perf_event *event,
 			struct hw_perf_event *hwc,
-			int idx, int overflow);
+			int idx);
 
 int armpmu_event_set_period(struct perf_event *event,
 			    struct hw_perf_event *hwc,
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 5bb91bf..56173ae 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -180,7 +180,7 @@ armpmu_event_set_period(struct perf_event *event,
 u64
 armpmu_event_update(struct perf_event *event,
 		    struct hw_perf_event *hwc,
-		    int idx, int overflow)
+		    int idx)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	u64 delta, prev_raw_count, new_raw_count;
@@ -193,13 +193,7 @@ again:
 			     new_raw_count) != prev_raw_count)
 		goto again;
 
-	new_raw_count &= armpmu->max_period;
-	prev_raw_count &= armpmu->max_period;
-
-	if (overflow)
-		delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
-	else
-		delta = new_raw_count - prev_raw_count;
+	delta = (new_raw_count - prev_raw_count) & armpmu->max_period;
 
 	local64_add(delta, &event->count);
 	local64_sub(delta, &hwc->period_left);
@@ -216,7 +210,7 @@ armpmu_read(struct perf_event *event)
 	if (hwc->idx < 0)
 		return;
 
-	armpmu_event_update(event, hwc, hwc->idx, 0);
+	armpmu_event_update(event, hwc, hwc->idx);
 }
 
 static void
@@ -232,7 +226,7 @@ armpmu_stop(struct perf_event *event, int flags)
 	if (!(hwc->state & PERF_HES_STOPPED)) {
 		armpmu->disable(hwc, hwc->idx);
 		barrier(); /* why? */
-		armpmu_event_update(event, hwc, hwc->idx, 0);
+		armpmu_event_update(event, hwc, hwc->idx);
 		hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
 	}
 }
@@ -518,7 +512,13 @@ __hw_perf_event_init(struct perf_event *event)
 	hwc->config_base	    |= (unsigned long)mapping;
 
 	if (!hwc->sample_period) {
-		hwc->sample_period  = armpmu->max_period;
+		/*
+		 * For non-sampling runs, limit the sample_period to half
+		 * of the counter width. That way, the new counter value
+		 * is far less likely to overtake the previous one unless
+		 * you have some serious IRQ latency issues.
+		 */
+		hwc->sample_period  = armpmu->max_period >> 1;
 		hwc->last_period    = hwc->sample_period;
 		local64_set(&hwc->period_left, hwc->sample_period);
 	}
diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c
index 533be99..88bf152 100644
--- a/arch/arm/kernel/perf_event_v6.c
+++ b/arch/arm/kernel/perf_event_v6.c
@@ -524,7 +524,7 @@ armv6pmu_handle_irq(int irq_num,
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx, 1);
+		armpmu_event_update(event, hwc, idx);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 6933244..6f48861 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -963,7 +963,7 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev)
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx, 1);
+		armpmu_event_update(event, hwc, idx);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c
index 3b99d82..831e019 100644
--- a/arch/arm/kernel/perf_event_xscale.c
+++ b/arch/arm/kernel/perf_event_xscale.c
@@ -259,7 +259,7 @@ xscale1pmu_handle_irq(int irq_num, void *dev)
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx, 1);
+		armpmu_event_update(event, hwc, idx);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
@@ -596,7 +596,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev)
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx, 1);
+		armpmu_event_update(event, hwc, idx);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
-- 
1.7.4.1


* [PATCH 2/4] ARM: perf: clear overflow flag when disabling counter on ARMv7 PMU
  2012-02-23 15:58 [PATCH 0/4] perf IRQ fixes Will Deacon
  2012-02-23 15:58 ` [PATCH 1/4] ARM: perf: limit sample_period to half max_period in non-sampling mode Will Deacon
@ 2012-02-23 15:58 ` Will Deacon
  2012-02-24  2:05   ` Ming Lei
  2012-02-23 15:58 ` [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers Will Deacon
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2012-02-23 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

When disabling a counter on an ARMv7 PMU, we should also clear the
overflow flag in case an overflow occurred whilst stopping the counter.
This prevents a spurious overflow being picked up later and leading to
either false accounting or a NULL dereference.
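
The window being closed can be modelled abstractly with a hypothetical register layout (the struct, field names and write-1-to-clear behaviour below are illustrative assumptions, not the ARMv7 PMU programming model):

```c
#include <stdint.h>

/* Hypothetical PMU register block, for illustration only. */
struct pmu_regs {
	uint32_t int_enable;	/* one interrupt-enable bit per counter */
	uint32_t overflow;	/* overflow flags, modelled as a plain word */
};

static void disable_counter(struct pmu_regs *pmu, int counter)
{
	uint32_t bit = 1u << counter;

	pmu->int_enable &= ~bit;
	/*
	 * An overflow may have been raised in the window before the
	 * disable took effect; clear the flag so a later PMU interrupt
	 * does not attribute it to a counter with no event attached.
	 * On real hardware this would be a write-1-to-clear register.
	 */
	pmu->overflow = bit;
}
```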

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/perf_event_v7.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 6f48861..050cc8b 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -809,6 +809,11 @@ static inline int armv7_pmnc_disable_intens(int idx)
 
 	counter = ARMV7_IDX_TO_COUNTER(idx);
 	asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (BIT(counter)));
+	isb();
+	/* Clear the overflow flag in case an interrupt is pending. */
+	asm volatile("mcr p15, 0, %0, c9, c12, 3" : : "r" (BIT(counter)));
+	isb();
+
 	return idx;
 }
 
-- 
1.7.4.1


* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-23 15:58 [PATCH 0/4] perf IRQ fixes Will Deacon
  2012-02-23 15:58 ` [PATCH 1/4] ARM: perf: limit sample_period to half max_period in non-sampling mode Will Deacon
  2012-02-23 15:58 ` [PATCH 2/4] ARM: perf: clear overflow flag when disabling counter on ARMv7 PMU Will Deacon
@ 2012-02-23 15:58 ` Will Deacon
  2012-02-24  1:34   ` Ming Lei
  2012-02-23 15:58 ` [PATCH 4/4] ARM: perf: fix overflow handling for xscale2 PMUs Will Deacon
  2012-03-07  9:41 ` [PATCH 0/4] perf IRQ fixes Russell King - ARM Linux
  4 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2012-02-23 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

The PMU IRQ handlers in perf assume that if a counter has overflowed
then perf must be responsible. In the paranoid world of crazy hardware,
this could be false, so check that we do have a valid event before
attempting to dereference NULL in the interrupt path.
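
The defensive loop structure the patch introduces can be sketched on its own, using simplified stand-in types (the struct and function names here are hypothetical, not the kernel's):

```c
#include <stddef.h>

/* Simplified stand-in for struct perf_event. */
struct perf_event_sketch {
	int overflow_count;
};

/*
 * Walk the per-cpu event array the way the IRQ handlers do, skipping
 * slots with no registered event: a counter can raise its overflow
 * interrupt after perf has released the slot, leaving NULL behind.
 */
static int handle_irq_sketch(struct perf_event_sketch *events[], int nr)
{
	int handled = 0;

	for (int idx = 0; idx < nr; idx++) {
		if (!events[idx])	/* ignore if we don't have an event */
			continue;
		events[idx]->overflow_count++;
		handled++;
	}
	return handled;
}
```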

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/perf_event_v6.c     |   20 ++------------------
 arch/arm/kernel/perf_event_v7.c     |    4 ++++
 arch/arm/kernel/perf_event_xscale.c |    6 ++++++
 3 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c
index 88bf152..b78af0c 100644
--- a/arch/arm/kernel/perf_event_v6.c
+++ b/arch/arm/kernel/perf_event_v6.c
@@ -467,23 +467,6 @@ armv6pmu_enable_event(struct hw_perf_event *hwc,
 	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }
 
-static int counter_is_active(unsigned long pmcr, int idx)
-{
-	unsigned long mask = 0;
-	if (idx == ARMV6_CYCLE_COUNTER)
-		mask = ARMV6_PMCR_CCOUNT_IEN;
-	else if (idx == ARMV6_COUNTER0)
-		mask = ARMV6_PMCR_COUNT0_IEN;
-	else if (idx == ARMV6_COUNTER1)
-		mask = ARMV6_PMCR_COUNT1_IEN;
-
-	if (mask)
-		return pmcr & mask;
-
-	WARN_ONCE(1, "invalid counter number (%d)\n", idx);
-	return 0;
-}
-
 static irqreturn_t
 armv6pmu_handle_irq(int irq_num,
 		    void *dev)
@@ -513,7 +496,8 @@ armv6pmu_handle_irq(int irq_num,
 		struct perf_event *event = cpuc->events[idx];
 		struct hw_perf_event *hwc;
 
-		if (!counter_is_active(pmcr, idx))
+		/* Ignore if we don't have an event. */
+		if (!event)
 			continue;
 
 		/*
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 050cc8b..4d7095a 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -960,6 +960,10 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev)
 		struct perf_event *event = cpuc->events[idx];
 		struct hw_perf_event *hwc;
 
+		/* Ignore if we don't have an event. */
+		if (!event)
+			continue;
+
 		/*
 		 * We have a single interrupt for all counters. Check that
 		 * each counter has overflowed before we process it.
diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c
index 831e019..a5bbd36 100644
--- a/arch/arm/kernel/perf_event_xscale.c
+++ b/arch/arm/kernel/perf_event_xscale.c
@@ -255,6 +255,9 @@ xscale1pmu_handle_irq(int irq_num, void *dev)
 		struct perf_event *event = cpuc->events[idx];
 		struct hw_perf_event *hwc;
 
+		if (!event)
+			continue;
+
 		if (!xscale1_pmnc_counter_has_overflowed(pmnc, idx))
 			continue;
 
@@ -592,6 +595,9 @@ xscale2pmu_handle_irq(int irq_num, void *dev)
 		struct perf_event *event = cpuc->events[idx];
 		struct hw_perf_event *hwc;
 
+		if (!event)
+			continue;
+
 		if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx))
 			continue;
 
-- 
1.7.4.1


* [PATCH 4/4] ARM: perf: fix overflow handling for xscale2 PMUs
  2012-02-23 15:58 [PATCH 0/4] perf IRQ fixes Will Deacon
                   ` (2 preceding siblings ...)
  2012-02-23 15:58 ` [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers Will Deacon
@ 2012-02-23 15:58 ` Will Deacon
  2012-03-07  9:41 ` [PATCH 0/4] perf IRQ fixes Russell King - ARM Linux
  4 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2012-02-23 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

xscale2 PMUs indicate overflow not via the PMU control register, but
via a separate overflow FLAG register.

This patch fixes the xscale2 PMU code to use this register to detect
overflow and ensures that we clear any pending overflow when disabling
a counter.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/perf_event_xscale.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c
index a5bbd36..71a21e6 100644
--- a/arch/arm/kernel/perf_event_xscale.c
+++ b/arch/arm/kernel/perf_event_xscale.c
@@ -598,7 +598,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev)
 		if (!event)
 			continue;
 
-		if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx))
+		if (!xscale2_pmnc_counter_has_overflowed(of_flags, idx))
 			continue;
 
 		hwc = &event->hw;
@@ -669,7 +669,7 @@ xscale2pmu_enable_event(struct hw_perf_event *hwc, int idx)
 static void
 xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx)
 {
-	unsigned long flags, ien, evtsel;
+	unsigned long flags, ien, evtsel, of_flags;
 	struct pmu_hw_events *events = cpu_pmu->get_hw_events();
 
 	ien = xscale2pmu_read_int_enable();
@@ -678,26 +678,31 @@ xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx)
 	switch (idx) {
 	case XSCALE_CYCLE_COUNTER:
 		ien &= ~XSCALE2_CCOUNT_INT_EN;
+		of_flags = XSCALE2_CCOUNT_OVERFLOW;
 		break;
 	case XSCALE_COUNTER0:
 		ien &= ~XSCALE2_COUNT0_INT_EN;
 		evtsel &= ~XSCALE2_COUNT0_EVT_MASK;
 		evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT0_EVT_SHFT;
+		of_flags = XSCALE2_COUNT0_OVERFLOW;
 		break;
 	case XSCALE_COUNTER1:
 		ien &= ~XSCALE2_COUNT1_INT_EN;
 		evtsel &= ~XSCALE2_COUNT1_EVT_MASK;
 		evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT1_EVT_SHFT;
+		of_flags = XSCALE2_COUNT1_OVERFLOW;
 		break;
 	case XSCALE_COUNTER2:
 		ien &= ~XSCALE2_COUNT2_INT_EN;
 		evtsel &= ~XSCALE2_COUNT2_EVT_MASK;
 		evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT2_EVT_SHFT;
+		of_flags = XSCALE2_COUNT2_OVERFLOW;
 		break;
 	case XSCALE_COUNTER3:
 		ien &= ~XSCALE2_COUNT3_INT_EN;
 		evtsel &= ~XSCALE2_COUNT3_EVT_MASK;
 		evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT3_EVT_SHFT;
+		of_flags = XSCALE2_COUNT3_OVERFLOW;
 		break;
 	default:
 		WARN_ONCE(1, "invalid counter number (%d)\n", idx);
@@ -707,6 +712,7 @@ xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx)
 	raw_spin_lock_irqsave(&events->pmu_lock, flags);
 	xscale2pmu_write_event_select(evtsel);
 	xscale2pmu_write_int_enable(ien);
+	xscale2pmu_write_overflow_flags(of_flags);
 	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
 }
 
-- 
1.7.4.1


* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-23 15:58 ` [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers Will Deacon
@ 2012-02-24  1:34   ` Ming Lei
  2012-02-24 10:11     ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2012-02-24  1:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Thu, Feb 23, 2012 at 11:58 PM, Will Deacon <will.deacon@arm.com> wrote:
> The PMU IRQ handlers in perf assume that if a counter has overflowed
> then perf must be responsible. In the paranoid world of crazy hardware,
> this could be false, so check that we do have a valid event before
> attempting to dereference NULL in the interrupt path.
>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/kernel/perf_event_v6.c     |   20 ++------------------
>  arch/arm/kernel/perf_event_v7.c     |    4 ++++
>  arch/arm/kernel/perf_event_xscale.c |    6 ++++++
>  3 files changed, 12 insertions(+), 18 deletions(-)
>
> diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c
> index 88bf152..b78af0c 100644
> --- a/arch/arm/kernel/perf_event_v6.c
> +++ b/arch/arm/kernel/perf_event_v6.c
> @@ -467,23 +467,6 @@ armv6pmu_enable_event(struct hw_perf_event *hwc,
>  	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
>  }
>
> -static int counter_is_active(unsigned long pmcr, int idx)
> -{
> -	unsigned long mask = 0;
> -	if (idx == ARMV6_CYCLE_COUNTER)
> -		mask = ARMV6_PMCR_CCOUNT_IEN;
> -	else if (idx == ARMV6_COUNTER0)
> -		mask = ARMV6_PMCR_COUNT0_IEN;
> -	else if (idx == ARMV6_COUNTER1)
> -		mask = ARMV6_PMCR_COUNT1_IEN;
> -
> -	if (mask)
> -		return pmcr & mask;
> -
> -	WARN_ONCE(1, "invalid counter number (%d)\n", idx);
> -	return 0;
> -}
> -
>  static irqreturn_t
>  armv6pmu_handle_irq(int irq_num,
>  		    void *dev)
> @@ -513,7 +496,8 @@ armv6pmu_handle_irq(int irq_num,
>  		struct perf_event *event = cpuc->events[idx];
>  		struct hw_perf_event *hwc;
>
> -		if (!counter_is_active(pmcr, idx))
> +		/* Ignore if we don't have an event. */
> +		if (!event)

I think we should check it via test_bit(idx, cpuc->used_mask) because
'hw_events->events[idx] = val' is not atomic operation and it is read here
in irq context.

>  			continue;
>
>  		/*
> diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
> index 050cc8b..4d7095a 100644
> --- a/arch/arm/kernel/perf_event_v7.c
> +++ b/arch/arm/kernel/perf_event_v7.c
> @@ -960,6 +960,10 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev)
>  		struct perf_event *event = cpuc->events[idx];
>  		struct hw_perf_event *hwc;
>
> +		/* Ignore if we don't have an event. */
> +		if (!event)

Same with above.

> +			continue;
> +
>  		/*
>  		 * We have a single interrupt for all counters. Check that
>  		 * each counter has overflowed before we process it.
> diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c
> index 831e019..a5bbd36 100644
> --- a/arch/arm/kernel/perf_event_xscale.c
> +++ b/arch/arm/kernel/perf_event_xscale.c
> @@ -255,6 +255,9 @@ xscale1pmu_handle_irq(int irq_num, void *dev)
>  		struct perf_event *event = cpuc->events[idx];
>  		struct hw_perf_event *hwc;
>
> +		if (!event)

Same with above.

> +			continue;
> +
>  		if (!xscale1_pmnc_counter_has_overflowed(pmnc, idx))
>  			continue;
>
> @@ -592,6 +595,9 @@ xscale2pmu_handle_irq(int irq_num, void *dev)
>  		struct perf_event *event = cpuc->events[idx];
>  		struct hw_perf_event *hwc;
>
> +		if (!event)

Same with above.

> +			continue;
> +
>  		if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx))
>  			continue;
>
> --
> 1.7.4.1
>


thanks,
-- 
Ming Lei


* [PATCH 2/4] ARM: perf: clear overflow flag when disabling counter on ARMv7 PMU
  2012-02-23 15:58 ` [PATCH 2/4] ARM: perf: clear overflow flag when disabling counter on ARMv7 PMU Will Deacon
@ 2012-02-24  2:05   ` Ming Lei
  0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2012-02-24  2:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 23, 2012 at 11:58 PM, Will Deacon <will.deacon@arm.com> wrote:
> When disabling a counter on an ARMv7 PMU, we should also clear the
> overflow flag in case an overflow occurred whilst stopping the counter.
> This prevents a spurious overflow being picked up later and leading to
> either false accounting or a NULL dereference.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Reported-by: Ming Lei <tom.leiming@gmail.com>

> ---
>  arch/arm/kernel/perf_event_v7.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
> index 6f48861..050cc8b 100644
> --- a/arch/arm/kernel/perf_event_v7.c
> +++ b/arch/arm/kernel/perf_event_v7.c
> @@ -809,6 +809,11 @@ static inline int armv7_pmnc_disable_intens(int idx)
>
>  	counter = ARMV7_IDX_TO_COUNTER(idx);
>  	asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (BIT(counter)));
> +	isb();
> +	/* Clear the overflow flag in case an interrupt is pending. */
> +	asm volatile("mcr p15, 0, %0, c9, c12, 3" : : "r" (BIT(counter)));
> +	isb();
> +
>  	return idx;
>  }
>
> --
> 1.7.4.1
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
Ming Lei


* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-24  1:34   ` Ming Lei
@ 2012-02-24 10:11     ` Will Deacon
  2012-02-27  9:56       ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2012-02-24 10:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 24, 2012 at 01:34:32AM +0000, Ming Lei wrote:
> On Thu, Feb 23, 2012 at 11:58 PM, Will Deacon <will.deacon@arm.com> wrote:
> > @@ -513,7 +496,8 @@ armv6pmu_handle_irq(int irq_num,
> >  		struct perf_event *event = cpuc->events[idx];
> >  		struct hw_perf_event *hwc;
> >
> > -		if (!counter_is_active(pmcr, idx))
> > +		/* Ignore if we don't have an event. */
> > +		if (!event)
> 
> I think we should check it via test_bit(idx, cpuc->used_mask) because
> 'hw_events->events[idx] = val' is not atomic operation and it is read here
> in irq context.

I dunno, that code is compiled to:

e5973000        ldr     r3, [r7]
e7834106        str     r4, [r3, r6, lsl #2]

so you should either see the new value or the old one - you can't see half a
pointer in there since it's a single 32-bit store.

Will


* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-24 10:11     ` Will Deacon
@ 2012-02-27  9:56       ` Ming Lei
  2012-02-27 19:36         ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2012-02-27  9:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Fri, 24 Feb 2012 10:11:30 +0000
Will Deacon <will.deacon@arm.com> wrote:

> On Fri, Feb 24, 2012 at 01:34:32AM +0000, Ming Lei wrote:
> > On Thu, Feb 23, 2012 at 11:58 PM, Will Deacon <will.deacon@arm.com> wrote:
> > > @@ -513,7 +496,8 @@ armv6pmu_handle_irq(int irq_num,
> > >  		struct perf_event *event = cpuc->events[idx];
> > >  		struct hw_perf_event *hwc;
> > >
> > > -		if (!counter_is_active(pmcr, idx))
> > > +		/* Ignore if we don't have an event. */
> > > +		if (!event)
> > 
> > I think we should check it via test_bit(idx, cpuc->used_mask) because
> > 'hw_events->events[idx] = val' is not atomic operation and it is read here
> > in irq context.
> 
> I dunno, that code is compiled to:
> 
> e5973000        ldr     r3, [r7]
> e7834106        str     r4, [r3, r6, lsl #2]
> 
> so you should either see the new value or the old one - you can't see half a
> pointer in there since it's a single 32-bit store.

Firstly the code above is only generated from one of many existing compile
options, for example, maybe storing to 32bit variable involves two instructions
in thumb mode, so are you sure it is always OK for all cases?

Secondly, I am even not sure if ARM irq handler is always triggered in instruction
boundary, maybe the irq handler is started during execution of the store instruction.
In fact, I can find the sentence below in 'B1.6.16 IRQ exception' of ARMv7 manual:

	This relaxation of the normal definition of a precise asynchronous exception
	permits interrupts to occur during the execution of instructions that change register
	or memory values, while only requiring the implementation to restore those register
	values that are needed to correctly re-execute the instruction after the preferred
	exception return. LDM and STM are examples of such instructions.

So suggest to take the correct way in theory, IMO.

thanks,
-- 
Ming Lei


* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-27  9:56       ` Ming Lei
@ 2012-02-27 19:36         ` Will Deacon
  2012-02-28  0:53           ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2012-02-27 19:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 27, 2012 at 09:56:46AM +0000, Ming Lei wrote:
> Will Deacon <will.deacon@arm.com> wrote:
> > On Fri, Feb 24, 2012 at 01:34:32AM +0000, Ming Lei wrote:
> > > 
> > > I think we should check it via test_bit(idx, cpuc->used_mask) because
> > > 'hw_events->events[idx] = val' is not atomic operation and it is read here
> > > in irq context.
> > 
> > I dunno, that code is compiled to:
> > 
> > e5973000        ldr     r3, [r7]
> > e7834106        str     r4, [r3, r6, lsl #2]
> > 
> > so you should either see the new value or the old one - you can't see half a
> > pointer in there since it's a single 32-bit store.
> 
> Firstly the code above is only generated from one of many existing compile
> options, for example, maybe storing to 32bit variable involves two instructions
> in thumb mode, so are you sure it is always OK for all cases?

I'll check with the compiler guys if the tools ever do this. It would seem a
bit weird given that we may have things like memory type, alignment and
endianness coming into play that would make splitting up 32-bit stores a
really bad idea.

> Secondly, I am even not sure if ARM irq handler is always triggered in instruction
> boundary, maybe the irq handler is started during execution of the store instruction.
> In fact, I can find the sentence below in 'B1.6.16 IRQ exception' of ARMv7 manual:
> 
> 	This relaxation of the normal definition of a precise asynchronous exception
> 	permits interrupts to occur during the execution of instructions that change register
> 	or memory values, while only requiring the implementation to restore those register
> 	values that are needed to correctly re-execute the instruction after the preferred
> 	exception return. LDM and STM are examples of such instructions.

A single 32-bit store is not interruptible in the same way as load/store
multiple instructions are, so this isn't a problem.

> So suggest to take the correct way in theory, IMO.

I don't like checking the used_mask, because then we have to throw in some
barriers to ensure ordering between updates of the event arrays and the
mask. If we do have to change something, I think putting ACCESS_ONCE around
updates to the event array should do the trick.

Cheers,

Will
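
For readers unfamiliar with it, the ACCESS_ONCE idiom discussed above is essentially a volatile cast (this matches the kernel's classic definition of the macro; the helper function around it is an illustrative sketch):

```c
/*
 * The volatile cast forces the compiler to emit exactly one access
 * for each use, preventing it from tearing, caching, or duplicating
 * the load or store.
 */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Publish an event pointer so a concurrent reader (e.g. an IRQ
 * handler) observes either the old or the new value, never a mix. */
static void publish_event(void **slot, void *event)
{
	ACCESS_ONCE(*slot) = event;
}
```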


* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-27 19:36         ` Will Deacon
@ 2012-02-28  0:53           ` Ming Lei
  2012-02-28 10:55             ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2012-02-28  0:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 28, 2012 at 3:36 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Feb 27, 2012 at 09:56:46AM +0000, Ming Lei wrote:
>> Will Deacon <will.deacon@arm.com> wrote:
>> > On Fri, Feb 24, 2012 at 01:34:32AM +0000, Ming Lei wrote:
>> > >
>> > > I think we should check it via test_bit(idx, cpuc->used_mask) because
>> > > 'hw_events->events[idx] = val' is not atomic operation and it is read here
>> > > in irq context.
>> >
>> > I dunno, that code is compiled to:
>> >
>> > e5973000        ldr     r3, [r7]
>> > e7834106        str     r4, [r3, r6, lsl #2]
>> >
>> > so you should either see the new value or the old one - you can't see half a
>> > pointer in there since it's a single 32-bit store.
>>
>> Firstly the code above is only generated from one of many existing compile
>> options, for example, maybe storing to 32bit variable involves two instructions
>> in thumb mode, so are you sure it is always OK for all cases?
>
> I'll check with the compiler guys if the tools ever do this. It would seem a
> bit weird given that we may have things like memory type, alignment and
> endianness coming into play that would make splitting up 32-bit stores a
> really bad idea.
>
>> Secondly, I am even not sure if ARM irq handler is always triggered in instruction
>> boundary, maybe the irq handler is started during execution of the store instruction.
>> In fact, I can find the sentence below in 'B1.6.16 IRQ exception' of ARMv7 manual:
>>
>> 	This relaxation of the normal definition of a precise asynchronous exception
>> 	permits interrupts to occur during the execution of instructions that change register
>> 	or memory values, while only requiring the implementation to restore those register
>> 	values that are needed to correctly re-execute the instruction after the preferred
>> 	exception return. LDM and STM are examples of such instructions.
>
> A single 32-bit store is not interruptible in the same way as load/store
> multiple instructions are, so this isn't a problem.

In fact, I have tried to look for the similar description in ARMv7 manual, but
didn't find it unfortunately, so could you point out where it is
stated explicitly?

>
>> So suggest to take the correct way in theory, IMO.
>
> I don't like checking the used_mask, because then we have to throw in some
> barriers to ensure ordering between updates of the event arrays and the
> mask. If we do have to change something, I think putting ACCESS_ONCE around

Both used_mask and event arrays are defined as percpu variables, so are you sure
barriers are required for it?

> updates to the event array should do the trick.


thanks,
-- 
Ming Lei


* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-28  0:53           ` Ming Lei
@ 2012-02-28 10:55             ` Will Deacon
  2012-02-28 14:28               ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2012-02-28 10:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 28, 2012 at 12:53:28AM +0000, Ming Lei wrote:
> On Tue, Feb 28, 2012 at 3:36 AM, Will Deacon <will.deacon@arm.com> wrote:
> > A single 32-bit store is not interruptible in the same way as load/store
> > multiple instructions are, so this isn't a problem.
> 
> In fact, I tried to find a similar statement in the ARMv7 manual but could not;
> could you point out where it is stated explicitly?

It's the section about Single-copy atomicity (3.5.3 in revision C). All word
accesses to word-aligned locations are single-copy atomic. We rely on this
for our atomic_{read,set} implementations.

> >
> >> So I suggest taking the theoretically correct way, IMO.
> >
> > I don't like checking the used_mask, because then we have to throw in some
> > barriers to ensure ordering between updates of the event arrays and the
> > mask. If we do have to change something, I think putting ACCESS_ONCE around
> 
> Both used_mask and the event arrays are defined as percpu variables, so are you
> sure barriers are required for them?

I expect the compiler could still re-order the statements.

I heard back from the tools guys and they say that the store will not be
split up, but may be coalesced with others in the form of a store-multiple.
Referring back to the ARM ARM, that's ok since `each 32-bit word access is
guaranteed to be single-copy atomic'.

So I'm still in favour of the original patch. It's easy to read, fairly minimal
and will work correctly.

Will

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers
  2012-02-28 10:55             ` Will Deacon
@ 2012-02-28 14:28               ` Ming Lei
  0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2012-02-28 14:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 28, 2012 at 6:55 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, Feb 28, 2012 at 12:53:28AM +0000, Ming Lei wrote:
>> On Tue, Feb 28, 2012 at 3:36 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > A single 32-bit store is not interruptible in the same way as load/store
>> > multiple instructions are, so this isn't a problem.
>>
>> In fact, I tried to find a similar statement in the ARMv7 manual but could not;
>> could you point out where it is stated explicitly?
>
> It's the section about Single-copy atomicity (3.5.3 in revision C). All word
> accesses to word-aligned locations are single-copy atomic. We rely on this
> for our atomic_{read,set} implementations.

Got it.

>
>> >
>> >> So I suggest taking the theoretically correct way, IMO.
>> >
>> > I don't like checking the used_mask, because then we have to throw in some
>> > barriers to ensure ordering between updates of the event arrays and the
>> > mask. If we do have to change something, I think putting ACCESS_ONCE around
>>
>> Both used_mask and the event arrays are defined as percpu variables, so are you
>> sure barriers are required for them?
>
> I expect the compiler could still re-order the statements.

Yes, it looks like a compiler barrier is required if checking with used_mask.

> I heard back from the tools guys and they say that the store will not be
> split up, but may be coalesced with others in the form of a store-multiple.
> Referring back to the ARM ARM, that's ok since `each 32-bit word access is
> guaranteed to be single-copy atomic'.
>
> So I'm still in favour of the original patch. It's easy to read, fairly minimal
> and will work correctly.

OK, thanks for your detailed explanation.


thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 0/4] perf IRQ fixes
  2012-02-23 15:58 [PATCH 0/4] perf IRQ fixes Will Deacon
                   ` (3 preceding siblings ...)
  2012-02-23 15:58 ` [PATCH 4/4] ARM: perf: fix overflow handling for xscale2 PMUs Will Deacon
@ 2012-03-07  9:41 ` Russell King - ARM Linux
  2012-03-07  9:44   ` Will Deacon
  4 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2012-03-07  9:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 23, 2012 at 03:58:12PM +0000, Will Deacon wrote:
> Hello,
> 
> This series is a set of fixes to the IRQ handling for perf-events on
> ARM. Thanks to Ming Lei for identifying a couple of the issues addressed
> here.
> 
> I tested the xscale2 changes on my Gumstix Verdex (PXA270 although the
> board isn't actually supported in mainline), the v6 changes on a PB1176
> and the v7 changes on a Cortex-A9 vexpress.
> 
> I think all of these are candidates for -stable, although it may be too
> late for the final cut of 3.3.

So do you want these (and the 5th patch) to have a stable tag on them?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 0/4] perf IRQ fixes
  2012-03-07  9:41 ` [PATCH 0/4] perf IRQ fixes Russell King - ARM Linux
@ 2012-03-07  9:44   ` Will Deacon
  0 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2012-03-07  9:44 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Wed, Mar 07, 2012 at 09:41:29AM +0000, Russell King - ARM Linux wrote:
> On Thu, Feb 23, 2012 at 03:58:12PM +0000, Will Deacon wrote:
> > Hello,
> > 
> > This series is a set of fixes to the IRQ handling for perf-events on
> > ARM. Thanks to Ming Lei for identifying a couple of the issues addressed
> > here.
> > 
> > I tested the xscale2 changes on my Gumstix Verdex (PXA270 although the
> > board isn't actually supported in mainline), the v6 changes on a PB1176
> > and the v7 changes on a Cortex-A9 vexpress.
> > 
> > I think all of these are candidates for -stable, although it may be too
> > late for the final cut of 3.3.
> 
> So do you want these (and the 5th patch) to have a stable tag on them?

I posted them to the patch system yesterday and included a CC stable tag
on the patches then. The only patch which doesn't need that is the hotplug
notifier.

Cheers,

Will

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-03-07  9:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-23 15:58 [PATCH 0/4] perf IRQ fixes Will Deacon
2012-02-23 15:58 ` [PATCH 1/4] ARM: perf: limit sample_period to half max_period in non-sampling mode Will Deacon
2012-02-23 15:58 ` [PATCH 2/4] ARM: perf: clear overflow flag when disabling counter on ARMv7 PMU Will Deacon
2012-02-24  2:05   ` Ming Lei
2012-02-23 15:58 ` [PATCH 3/4] ARM: perf: check that we have an event in the PMU IRQ handlers Will Deacon
2012-02-24  1:34   ` Ming Lei
2012-02-24 10:11     ` Will Deacon
2012-02-27  9:56       ` Ming Lei
2012-02-27 19:36         ` Will Deacon
2012-02-28  0:53           ` Ming Lei
2012-02-28 10:55             ` Will Deacon
2012-02-28 14:28               ` Ming Lei
2012-02-23 15:58 ` [PATCH 4/4] ARM: perf: fix overflow handling for xscale2 PMUs Will Deacon
2012-03-07  9:41 ` [PATCH 0/4] perf IRQ fixes Russell King - ARM Linux
2012-03-07  9:44   ` Will Deacon
