From: Eric B Munson <emunson@mgebm.net>
To: benh@kernel.crashing.org
Cc: a.p.zijlstra@chello.nl, linux-kernel@vger.kernel.org,
David.Laight@ACULAB.COM, paulus@samba.org, anton@samba.org,
acme@ghostprotocols.net, mingo@elte.hu,
linuxppc-dev@lists.ozlabs.org, stable@kernel.org,
Eric B Munson <emunson@mgebm.net>
Subject: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
Date: Fri, 15 Apr 2011 14:12:30 -0400 [thread overview]
Message-ID: <1302891150-8121-1-git-send-email-emunson@mgebm.net> (raw)
Because of speculative event roll back, it is possible for some event coutners
to decrease between reads on POWER7. This causes a problem with the way that
counters are updated. Delta calues are calculated in a 64 bit value and the
top 32 bits are masked. If the register value has decreased, this leaves us
with a very large positive value added to the kernel counters. This patch
protects against this by skipping the update if the delta would be negative.
This can lead to a lack of precision in the coutner values, but from my testing
the value is typcially fewer than 10 samples at a time.
Signed-off-by: Eric B Munson <emunson@mgebm.net>
Cc: stable@kernel.org
---
Changes from V3:
Fix delta checking so that only roll backs are discarded
Changes from V2:
Create a helper that should handle counter roll back as well as registers that
might be allowed to roll over
Changes from V1:
Updated patch leader
Added stable CC
Use an s32 to hold delta values and discard any values that are less than 0
arch/powerpc/kernel/perf_event.c | 37 ++++++++++++++++++++++++++++++-------
1 files changed, 30 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/kernel/perf_event.c b/arch/powerpc/kernel/perf_event.c
index c4063b7..822f630 100644
--- a/arch/powerpc/kernel/perf_event.c
+++ b/arch/powerpc/kernel/perf_event.c
@@ -398,6 +398,25 @@ static int check_excludes(struct perf_event **ctrs, unsigned int cflags[],
return 0;
}
+static u64 check_and_compute_delta(u64 prev, u64 val)
+{
+ u64 delta = (val - prev) & 0xfffffffful;
+
+ /*
+ * POWER7 can roll back counter values, if the new value is smaller
+ * than the previous value it will cause the delta and the counter to
+ * have bogus values unless we rolled a counter over. If a coutner is
+ * rolled back, it will be smaller, but within 256, which is the maximum
+ * number of events to rollback at once. If we dectect a rollback
+ * return 0. This can lead to a small lack of precision in the
+ * counters.
+ */
+ if (prev > val && (prev - val) < 256)
+ delta = 0;
+
+ return delta;
+}
+
static void power_pmu_read(struct perf_event *event)
{
s64 val, delta, prev;
@@ -416,10 +435,11 @@ static void power_pmu_read(struct perf_event *event)
prev = local64_read(&event->hw.prev_count);
barrier();
val = read_pmc(event->hw.idx);
+ delta = check_and_compute_delta(prev, val);
+ if (!delta)
+ return;
} while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
- /* The counters are only 32 bits wide */
- delta = (val - prev) & 0xfffffffful;
local64_add(delta, &event->count);
local64_sub(delta, &event->hw.period_left);
}
@@ -449,8 +469,9 @@ static void freeze_limited_counters(struct cpu_hw_events *cpuhw,
val = (event->hw.idx == 5) ? pmc5 : pmc6;
prev = local64_read(&event->hw.prev_count);
event->hw.idx = 0;
- delta = (val - prev) & 0xfffffffful;
- local64_add(delta, &event->count);
+ delta = check_and_compute_delta(prev, val);
+ if (delta)
+ local64_add(delta, &event->count);
}
}
@@ -458,14 +479,16 @@ static void thaw_limited_counters(struct cpu_hw_events *cpuhw,
unsigned long pmc5, unsigned long pmc6)
{
struct perf_event *event;
- u64 val;
+ u64 val, prev;
int i;
for (i = 0; i < cpuhw->n_limited; ++i) {
event = cpuhw->limited_counter[i];
event->hw.idx = cpuhw->limited_hwidx[i];
val = (event->hw.idx == 5) ? pmc5 : pmc6;
- local64_set(&event->hw.prev_count, val);
+ prev = local64_read(&event->hw.prev_count);
+ if (check_and_compute_delta(prev, val))
+ local64_set(&event->hw.prev_count, val);
perf_event_update_userpage(event);
}
}
@@ -1197,7 +1220,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
/* we don't have to worry about interrupts here */
prev = local64_read(&event->hw.prev_count);
- delta = (val - prev) & 0xfffffffful;
+ delta = check_and_compute_delta(prev, val);
local64_add(delta, &event->count);
/*
--
1.7.1
next reply other threads:[~2011-04-15 18:12 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-15 18:12 Eric B Munson [this message]
2011-04-27 7:40 ` [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks David Laight
2011-04-27 12:19 ` Eric B Munson
2011-04-27 12:34 ` David Laight
2011-04-27 12:59 ` Eric B Munson
2011-04-27 13:04 ` David Laight
2011-04-27 12:42 ` David Laight
2011-04-27 13:08 ` Eric B Munson
2011-04-27 13:13 ` David Laight
2011-04-27 13:20 ` Eric B Munson
2011-04-27 13:26 ` Benjamin Herrenschmidt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1302891150-8121-1-git-send-email-emunson@mgebm.net \
--to=emunson@mgebm.net \
--cc=David.Laight@ACULAB.COM \
--cc=a.p.zijlstra@chello.nl \
--cc=acme@ghostprotocols.net \
--cc=anton@samba.org \
--cc=benh@kernel.crashing.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mingo@elte.hu \
--cc=paulus@samba.org \
--cc=stable@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).