linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf/imx_ddr: Add stop event counters support for i.MX8MP
@ 2020-09-07  9:53 Joakim Zhang
  2020-09-07 17:06 ` Will Deacon
  0 siblings, 1 reply; 3+ messages in thread
From: Joakim Zhang @ 2020-09-07  9:53 UTC (permalink / raw)
  To: will, mark.rutland, robin.murphy
  Cc: linux-imx, linux-arm-kernel, Joakim Zhang

DDR Perf driver only supports free-running event counters(counter1/2/3)
now, this patch adds support for stop event counters.

Legacy SoCs:
Cycle counter(counter0) is a special counter, only count cycles. When
cycle counter overflow, it will lock all counters and generate an
interrupt. In ddr_perf_irq_handler, disable cycle counter then all
counters would stop at the same time, update all counters' count, then
enable cycle counter that all counters count again. During this process,
only clear cycle counter, no need to clear event counters since they are
free-running counters. They would continue counting after overflow and
do/while loop from ddr_perf_event_update can handle event counters
overflow case.

i.MX8MP:
Almost all is the same as legacy SoCs, the only difference is that, event
counters are not free-running any more. Like cycle counter, when event
counters overflow, they would stop counting unless clear the counter,
and no interrupt generate for event counters. So we should clear event
counters that let them re-count when cycle counter overflow, which ensure
event counters will not lose data.

Take one case into consideration, from cycle counter interrupt context,
when invoke ddr_perf_counter_enable to clear event counters, but have
not set prev_count equal 0 yet. Concurrently, ddr_perf_event_update from
another thread context invokes ddr_perf_read_counter to read that event
counter value, it will return 0. Delta(new_raw_count - prev_raw_count)
calculate is incorrect. So I add a spinlock, for that clear event
counters and update event counters never happened concurrently. It is
save for cycle counter to clear then update the counter, since it is
exactly overflow.

This patch adds stop event counters support which would be compatible to
free-running event counters.

Hi Will,

I resend the patch for your review since last mail time span is too long.

I am not sure whether it is a formal solution or not. If any better solution,
please share me how to implement it? Thanks.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
---
 drivers/perf/fsl_imx8_ddr_perf.c | 52 +++++++++++++++++++++++++++-----
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c
index 90884d14f95f..057e361eb391 100644
--- a/drivers/perf/fsl_imx8_ddr_perf.c
+++ b/drivers/perf/fsl_imx8_ddr_perf.c
@@ -14,6 +14,7 @@
 #include <linux/of_device.h>
 #include <linux/of_irq.h>
 #include <linux/perf_event.h>
+#include <linux/spinlock.h>
 #include <linux/slab.h>
 
 #define COUNTER_CNTL		0x0
@@ -82,6 +83,7 @@ struct ddr_pmu {
 	const struct fsl_ddr_devtype_data *devtype_data;
 	int irq;
 	int id;
+	spinlock_t lock;
 };
 
 enum ddr_perf_filter_capabilities {
@@ -368,16 +370,19 @@ static void ddr_perf_event_update(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	u64 delta, prev_raw_count, new_raw_count;
 	int counter = hwc->idx;
+	unsigned long flags;
 
-	do {
-		prev_raw_count = local64_read(&hwc->prev_count);
-		new_raw_count = ddr_perf_read_counter(pmu, counter);
-	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
-			new_raw_count) != prev_raw_count);
+	spin_lock_irqsave(&pmu->lock, flags);
+
+	prev_raw_count = local64_read(&hwc->prev_count);
+	new_raw_count = ddr_perf_read_counter(pmu, counter);
 
 	delta = (new_raw_count - prev_raw_count) & 0xFFFFFFFF;
 
 	local64_add(delta, &event->count);
+	local64_set(&hwc->prev_count, new_raw_count);
+
+	spin_unlock_irqrestore(&pmu->lock, flags);
 }
 
 static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config,
@@ -404,6 +409,15 @@ static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config,
 	}
 }
 
+static bool ddr_perf_counter_overflow(struct ddr_pmu *pmu, int counter)
+{
+	int val;
+
+	val = readl_relaxed(pmu->base + counter * 4 + COUNTER_CNTL);
+
+	return val & CNTL_OVER ? true : false;
+}
+
 static void ddr_perf_event_start(struct perf_event *event, int flags)
 {
 	struct ddr_pmu *pmu = to_ddr_pmu(event->pmu);
@@ -534,7 +548,7 @@ static int ddr_perf_init(struct ddr_pmu *pmu, void __iomem *base,
 
 static irqreturn_t ddr_perf_irq_handler(int irq, void *p)
 {
-	int i;
+	int i, ret;
 	struct ddr_pmu *pmu = (struct ddr_pmu *) p;
 	struct perf_event *event, *cycle_event = NULL;
 
@@ -546,7 +560,7 @@ static irqreturn_t ddr_perf_irq_handler(int irq, void *p)
 	/*
 	 * When the cycle counter overflows, all counters are stopped,
 	 * and an IRQ is raised. If any other counter overflows, it
-	 * continues counting, and no IRQ is raised.
+	 * continues counting (stop counting for i.MX8MP), and no IRQ is raised.
 	 *
 	 * Cycles occur at least 4 times as often as other events, so we
 	 * can update all events on a cycle counter overflow and not
@@ -566,6 +580,29 @@ static irqreturn_t ddr_perf_irq_handler(int irq, void *p)
 			cycle_event = event;
 	}
 
+	/* Clear event counters to avoid they stop counting when overflow, such as i.MX8MP */
+	spin_lock(&pmu->lock);
+	for (i = 0; i < NUM_COUNTERS; i++) {
+		if (!pmu->events[i])
+			continue;
+
+		event = pmu->events[i];
+
+		if (event->hw.idx == EVENT_CYCLES_COUNTER)
+			continue;
+
+		/* check event counters overflow */
+		ret = ddr_perf_counter_overflow(pmu, event->hw.idx);
+		if (ret)
+			dev_warn(pmu->dev, "Event Counter%d overflow happened, data incorrect!!\n", i);
+
+		/* clear event counters */
+		ddr_perf_counter_enable(pmu, event->attr.config, event->hw.idx, true);
+
+		local64_set(&event->hw.prev_count, 0);
+	}
+	spin_unlock(&pmu->lock);
+
 	ddr_perf_counter_enable(pmu,
 			      EVENT_CYCLES_ID,
 			      EVENT_CYCLES_COUNTER,
@@ -619,6 +656,7 @@ static int ddr_perf_probe(struct platform_device *pdev)
 	num = ddr_perf_init(pmu, base, &pdev->dev);
 
 	platform_set_drvdata(pdev, pmu);
+	spin_lock_init(&pmu->lock);
 
 	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, DDR_PERF_DEV_NAME "%d",
 			      num);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] perf/imx_ddr: Add stop event counters support for i.MX8MP
  2020-09-07  9:53 [PATCH] perf/imx_ddr: Add stop event counters support for i.MX8MP Joakim Zhang
@ 2020-09-07 17:06 ` Will Deacon
  2020-09-08  4:10   ` Joakim Zhang
  0 siblings, 1 reply; 3+ messages in thread
From: Will Deacon @ 2020-09-07 17:06 UTC (permalink / raw)
  To: Joakim Zhang; +Cc: mark.rutland, robin.murphy, linux-imx, linux-arm-kernel

On Mon, Sep 07, 2020 at 05:53:59PM +0800, Joakim Zhang wrote:
> DDR Perf driver only supports free-running event counters(counter1/2/3)
> now, this patch adds support for stop event counters.
> 
> Legacy SoCs:
> Cycle counter(counter0) is a special counter, only count cycles. When
> cycle counter overflow, it will lock all counters and generate an
> interrupt. In ddr_perf_irq_handler, disable cycle counter then all
> counters would stop at the same time, update all counters' count, then
> enable cycle counter that all counters count again. During this process,
> only clear cycle counter, no need to clear event counters since they are
> free-running counters. They would continue counting after overflow and
> do/while loop from ddr_perf_event_update can handle event counters
> overflow case.
> 
> i.MX8MP:
> Almost all is the same as legacy SoCs, the only difference is that, event
> counters are not free-running any more. Like cycle counter, when event
> counters overflow, they would stop counting unless clear the counter,
> and no interrupt generate for event counters. So we should clear event
> counters that let them re-count when cycle counter overflow, which ensure
> event counters will not lose data.

Was this supposed to be an improvement over the "Legacy SoCs"
implementation? It seems even worse...

Do you _have_ to write zeroes back to the event counters to get them going
again, or will any value do?

> diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c
> index 90884d14f95f..057e361eb391 100644
> --- a/drivers/perf/fsl_imx8_ddr_perf.c
> +++ b/drivers/perf/fsl_imx8_ddr_perf.c
> @@ -14,6 +14,7 @@
>  #include <linux/of_device.h>
>  #include <linux/of_irq.h>
>  #include <linux/perf_event.h>
> +#include <linux/spinlock.h>
>  #include <linux/slab.h>
>  
>  #define COUNTER_CNTL		0x0
> @@ -82,6 +83,7 @@ struct ddr_pmu {
>  	const struct fsl_ddr_devtype_data *devtype_data;
>  	int irq;
>  	int id;
> +	spinlock_t lock;
>  };
>  
>  enum ddr_perf_filter_capabilities {
> @@ -368,16 +370,19 @@ static void ddr_perf_event_update(struct perf_event *event)
>  	struct hw_perf_event *hwc = &event->hw;
>  	u64 delta, prev_raw_count, new_raw_count;
>  	int counter = hwc->idx;
> +	unsigned long flags;
>  
> -	do {
> -		prev_raw_count = local64_read(&hwc->prev_count);
> -		new_raw_count = ddr_perf_read_counter(pmu, counter);
> -	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> -			new_raw_count) != prev_raw_count);
> +	spin_lock_irqsave(&pmu->lock, flags);
> +
> +	prev_raw_count = local64_read(&hwc->prev_count);
> +	new_raw_count = ddr_perf_read_counter(pmu, counter);
>  
>  	delta = (new_raw_count - prev_raw_count) & 0xFFFFFFFF;
>  
>  	local64_add(delta, &event->count);
> +	local64_set(&hwc->prev_count, new_raw_count);

Hmm, assuming that the event counters never overflow, why do we care about
the prev count at all? In other words, why don't we just add the counter
value to event->count and reset the hardware to zero every time?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [PATCH] perf/imx_ddr: Add stop event counters support for i.MX8MP
  2020-09-07 17:06 ` Will Deacon
@ 2020-09-08  4:10   ` Joakim Zhang
  0 siblings, 0 replies; 3+ messages in thread
From: Joakim Zhang @ 2020-09-08  4:10 UTC (permalink / raw)
  To: Will Deacon; +Cc: mark.rutland, robin.murphy, dl-linux-imx, linux-arm-kernel


> -----Original Message-----
> From: Will Deacon <will@kernel.org>
> Sent: 2020年9月8日 1:07
> To: Joakim Zhang <qiangqing.zhang@nxp.com>
> Cc: mark.rutland@arm.com; robin.murphy@arm.com; dl-linux-imx
> <linux-imx@nxp.com>; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH] perf/imx_ddr: Add stop event counters support for
> i.MX8MP
> 
> On Mon, Sep 07, 2020 at 05:53:59PM +0800, Joakim Zhang wrote:
> > DDR Perf driver only supports free-running event
> > counters(counter1/2/3) now, this patch adds support for stop event counters.
> >
> > Legacy SoCs:
> > Cycle counter(counter0) is a special counter, only count cycles. When
> > cycle counter overflow, it will lock all counters and generate an
> > interrupt. In ddr_perf_irq_handler, disable cycle counter then all
> > counters would stop at the same time, update all counters' count, then
> > enable cycle counter that all counters count again. During this
> > process, only clear cycle counter, no need to clear event counters
> > since they are free-running counters. They would continue counting
> > after overflow and do/while loop from ddr_perf_event_update can handle
> > event counters overflow case.
> >
> > i.MX8MP:
> > Almost all is the same as legacy SoCs, the only difference is that,
> > event counters are not free-running any more. Like cycle counter, when
> > event counters overflow, they would stop counting unless clear the
> > counter, and no interrupt generate for event counters. So we should
> > clear event counters that let them re-count when cycle counter
> > overflow, which ensure event counters will not lose data.
> 
> Was this supposed to be an improvement over the "Legacy SoCs"
> implementation? It seems even worse...
Per IC guys' perspective, they think this is an improvement. Event counters should also stop counting when they are overflow. So they fix it as a bug.
Per software perspective, we more hope event counters are free-running. However, IC guys has not informed us when they do this change.


> Do you _have_ to write zeroes back to the event counters to get them going
> again, or will any value do?
No, event counters also have a CLEAR bit, only need clear this CLEAR bit, then event counters start counting again from zero.


> > diff --git a/drivers/perf/fsl_imx8_ddr_perf.c
> > b/drivers/perf/fsl_imx8_ddr_perf.c
> > index 90884d14f95f..057e361eb391 100644
> > --- a/drivers/perf/fsl_imx8_ddr_perf.c
> > +++ b/drivers/perf/fsl_imx8_ddr_perf.c
> > @@ -14,6 +14,7 @@
> >  #include <linux/of_device.h>
> >  #include <linux/of_irq.h>
> >  #include <linux/perf_event.h>
> > +#include <linux/spinlock.h>
> >  #include <linux/slab.h>
> >
> >  #define COUNTER_CNTL		0x0
> > @@ -82,6 +83,7 @@ struct ddr_pmu {
> >  	const struct fsl_ddr_devtype_data *devtype_data;
> >  	int irq;
> >  	int id;
> > +	spinlock_t lock;
> >  };
> >
> >  enum ddr_perf_filter_capabilities {
> > @@ -368,16 +370,19 @@ static void ddr_perf_event_update(struct
> perf_event *event)
> >  	struct hw_perf_event *hwc = &event->hw;
> >  	u64 delta, prev_raw_count, new_raw_count;
> >  	int counter = hwc->idx;
> > +	unsigned long flags;
> >
> > -	do {
> > -		prev_raw_count = local64_read(&hwc->prev_count);
> > -		new_raw_count = ddr_perf_read_counter(pmu, counter);
> > -	} while (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> > -			new_raw_count) != prev_raw_count);
> > +	spin_lock_irqsave(&pmu->lock, flags);
> > +
> > +	prev_raw_count = local64_read(&hwc->prev_count);
> > +	new_raw_count = ddr_perf_read_counter(pmu, counter);
> >
> >  	delta = (new_raw_count - prev_raw_count) & 0xFFFFFFFF;
> >
> >  	local64_add(delta, &event->count);
> > +	local64_set(&hwc->prev_count, new_raw_count);
> 
> Hmm, assuming that the event counters never overflow, why do we care about
> the prev count at all? In other words, why don't we just add the counter value
> to event->count and reset the hardware to zero every time?
Do you mean that:
for cycle counter, keep the original routine,
for event counters, add counter value to event->count, and then clear event counters to let them counting from zero again?

Sounds great! I will have a try. Thanks.

Best Regards,
Joakim Zhang
> Will
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-09-08  4:12 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-07  9:53 [PATCH] perf/imx_ddr: Add stop event counters support for i.MX8MP Joakim Zhang
2020-09-07 17:06 ` Will Deacon
2020-09-08  4:10   ` Joakim Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).