* [PATCH v3 1/7] sched_clock: Expose struct clock_read_data
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
@ 2020-07-16 5:11 ` Leo Yan
2020-07-16 5:11 ` [PATCH v3 2/7] time/sched_clock: Use raw_read_seqcount_latch() Leo Yan
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2020-07-16 5:11 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
Ahmed S. Darwish, Ben Dooks (Codethink),
Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
linux-arm-kernel
Cc: Leo Yan
From: Peter Zijlstra <peterz@infradead.org>
In order to support perf_event_mmap_page::cap_time features, an
architecture needs, aside from a userspace readable counter register,
to expose the exact clock data so that userspace can convert the
counter register into a correct timestamp.
Provide struct clock_read_data and two (seqcount) helpers so that
architectures (arm64 in particular) can expose the numbers to userspace.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
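The read loop that the two helpers are meant to support can be sketched
in userspace roughly as follows. This is a single-threaded mock under
stated assumptions, not the kernel code: mock_clock_read_data, mock_cd,
mock_read_begin(), mock_read_retry() and mock_cyc_to_sched_ns() are
hypothetical names, the read_sched_clock callback is replaced by a
caller-supplied counter value, and plain loads stand in for the kernel's
seqcount primitives and their barriers.

```c
#include <stdint.h>

/* Hypothetical userspace mirror of struct clock_read_data (callback omitted). */
struct mock_clock_read_data {
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
	uint64_t sched_clock_mask;
	uint32_t mult;
	uint32_t shift;
};

/* Hypothetical stand-in for the kernel's private 'cd': a counter plus two copies. */
static struct {
	unsigned int seq;
	struct mock_clock_read_data read_data[2];
} mock_cd;

/* Sample the sequence counter and pick the copy it currently points at. */
static struct mock_clock_read_data *mock_read_begin(unsigned int *seq)
{
	*seq = mock_cd.seq;	/* the kernel uses a seqcount read here */
	return &mock_cd.read_data[*seq & 1];
}

/* Non-zero if the counter moved since mock_read_begin(), i.e. retry needed. */
static int mock_read_retry(unsigned int seq)
{
	return mock_cd.seq != seq;
}

/* Convert a raw counter value to nanoseconds, following sched_clock(). */
static uint64_t mock_cyc_to_sched_ns(uint64_t now_cyc)
{
	struct mock_clock_read_data *rd;
	unsigned int seq;
	uint64_t cyc, res;

	do {
		rd = mock_read_begin(&seq);
		/* masked subtraction copes with counters narrower than 64 bits */
		cyc = (now_cyc - rd->epoch_cyc) & rd->sched_clock_mask;
		res = rd->epoch_ns + ((cyc * rd->mult) >> rd->shift);
	} while (mock_read_retry(seq));

	return res;
}
```

With epoch_ns = 1000, epoch_cyc = 50, mult = 40 and shift = 0, a counter
value of 150 maps to 1000 + 100 * 40 = 5000 ns.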
include/linux/sched_clock.h | 28 +++++++++++++++++++++++++
kernel/time/sched_clock.c | 41 ++++++++++++-------------------------
2 files changed, 41 insertions(+), 28 deletions(-)
diff --git a/include/linux/sched_clock.h b/include/linux/sched_clock.h
index 0bb04a96a6d4..528718e4ed52 100644
--- a/include/linux/sched_clock.h
+++ b/include/linux/sched_clock.h
@@ -6,6 +6,34 @@
#define LINUX_SCHED_CLOCK
#ifdef CONFIG_GENERIC_SCHED_CLOCK
+/**
+ * struct clock_read_data - data required to read from sched_clock()
+ *
+ * @epoch_ns: sched_clock() value at last update
+ * @epoch_cyc: Clock cycle value at last update.
+ * @sched_clock_mask: Bitmask for two's complement subtraction of non 64bit
+ * clocks.
+ * @read_sched_clock: Current clock source (or dummy source when suspended).
+ * @mult: Multipler for scaled math conversion.
+ * @shift: Shift value for scaled math conversion.
+ *
+ * Care must be taken when updating this structure; it is read by
+ * some very hot code paths. It occupies <=40 bytes and, when combined
+ * with the seqcount used to synchronize access, comfortably fits into
+ * a 64 byte cache line.
+ */
+struct clock_read_data {
+ u64 epoch_ns;
+ u64 epoch_cyc;
+ u64 sched_clock_mask;
+ u64 (*read_sched_clock)(void);
+ u32 mult;
+ u32 shift;
+};
+
+extern struct clock_read_data *sched_clock_read_begin(unsigned int *seq);
+extern int sched_clock_read_retry(unsigned int seq);
+
extern void generic_sched_clock_init(void);
extern void sched_clock_register(u64 (*read)(void), int bits,
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index fa3f800d7d76..0acaadc3156c 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -19,31 +19,6 @@
#include "timekeeping.h"
-/**
- * struct clock_read_data - data required to read from sched_clock()
- *
- * @epoch_ns: sched_clock() value at last update
- * @epoch_cyc: Clock cycle value at last update.
- * @sched_clock_mask: Bitmask for two's complement subtraction of non 64bit
- * clocks.
- * @read_sched_clock: Current clock source (or dummy source when suspended).
- * @mult: Multipler for scaled math conversion.
- * @shift: Shift value for scaled math conversion.
- *
- * Care must be taken when updating this structure; it is read by
- * some very hot code paths. It occupies <=40 bytes and, when combined
- * with the seqcount used to synchronize access, comfortably fits into
- * a 64 byte cache line.
- */
-struct clock_read_data {
- u64 epoch_ns;
- u64 epoch_cyc;
- u64 sched_clock_mask;
- u64 (*read_sched_clock)(void);
- u32 mult;
- u32 shift;
-};
-
/**
* struct clock_data - all data needed for sched_clock() (including
* registration of a new clock source)
@@ -93,6 +68,17 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
return (cyc * mult) >> shift;
}
+struct clock_read_data *sched_clock_read_begin(unsigned int *seq)
+{
+ *seq = raw_read_seqcount(&cd.seq);
+ return cd.read_data + (*seq & 1);
+}
+
+int sched_clock_read_retry(unsigned int seq)
+{
+ return read_seqcount_retry(&cd.seq, seq);
+}
+
unsigned long long notrace sched_clock(void)
{
u64 cyc, res;
@@ -100,13 +86,12 @@ unsigned long long notrace sched_clock(void)
struct clock_read_data *rd;
do {
- seq = raw_read_seqcount(&cd.seq);
- rd = cd.read_data + (seq & 1);
+ rd = sched_clock_read_begin(&seq);
cyc = (rd->read_sched_clock() - rd->epoch_cyc) &
rd->sched_clock_mask;
res = rd->epoch_ns + cyc_to_ns(cyc, rd->mult, rd->shift);
- } while (read_seqcount_retry(&cd.seq, seq));
+ } while (sched_clock_read_retry(seq));
return res;
}
--
2.17.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH v3 2/7] time/sched_clock: Use raw_read_seqcount_latch()
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
2020-07-16 5:11 ` [PATCH v3 1/7] sched_clock: Expose struct clock_read_data Leo Yan
@ 2020-07-16 5:11 ` Leo Yan
2020-07-16 5:11 ` [PATCH v3 3/7] arm64: perf: Implement correct cap_user_time Leo Yan
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2020-07-16 5:11 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
Ahmed S. Darwish, Ben Dooks (Codethink),
Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
linux-arm-kernel
Cc: Leo Yan
From: "Ahmed S. Darwish" <a.darwish@linutronix.de>
sched_clock uses seqcount_t latching to switch between two storage
places protected by the sequence counter. This allows it to have
interruptible, NMI-safe, seqcount_t write side critical sections.
Since 7fc26327b756 ("seqlock: Introduce raw_read_seqcount_latch()"),
raw_read_seqcount_latch() became the standardized way for seqcount_t
latch read paths. Due to the dependent load, it also has one read
memory barrier less than the currently used raw_read_seqcount() API.
Use raw_read_seqcount_latch() for the seqcount_t latch read path.
Link: https://lkml.kernel.org/r/20200625085745.GD117543@hirez.programming.kicks-ass.net
Link: https://lkml.kernel.org/r/20200715092345.GA231464@debian-buster-darwi.lab.linutronix.de
References: 1809bfa44e10 ("timers, sched/clock: Avoid deadlock during read from NMI")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
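For context, the latch scheme described above can be sketched with C11
atomics as below. This is only an approximation under stated
assumptions: struct latch, struct latch_data, latch_update() and
latch_read() are hypothetical names, sequentially consistent increments
and acquire loads stand in for the kernel's barriers (C11 has no direct
equivalent of the dependent-load ordering the commit message relies on),
and the example is meant to be run single-threaded since the data copies
are plain, non-atomic objects.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical payload; the kernel protects struct clock_read_data. */
struct latch_data {
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
};

/* Hypothetical latch: one sequence counter steering readers to one of two copies. */
struct latch {
	atomic_uint seq;
	struct latch_data copy[2];
};

/*
 * Writer: readers are always pointed at a copy that is not being
 * modified, so the writer may be interrupted at any point.
 */
static void latch_update(struct latch *l, const struct latch_data *new_data)
{
	l->copy[1] = *new_data;		/* refresh the odd copy */
	atomic_fetch_add(&l->seq, 1);	/* seq odd: readers use copy[1] */
	l->copy[0] = *new_data;		/* now the even copy can change */
	atomic_fetch_add(&l->seq, 1);	/* seq even: readers use copy[0] */
}

/* Reader: pick the copy selected by the low bit, retry if seq moved. */
static struct latch_data latch_read(struct latch *l)
{
	struct latch_data snap;
	unsigned int seq;

	do {
		seq = atomic_load_explicit(&l->seq, memory_order_acquire);
		snap = l->copy[seq & 1];
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&l->seq, memory_order_relaxed) != seq);

	return snap;
}
```

Each update advances the counter by two, so an undisturbed reader sees
an even value and reads copy[0]; a reader that interrupts the writer
between the two increments sees an odd value and reads copy[1].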
kernel/time/sched_clock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0acaadc3156c..0deaf4b79fb4 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -70,7 +70,7 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
struct clock_read_data *sched_clock_read_begin(unsigned int *seq)
{
- *seq = raw_read_seqcount(&cd.seq);
+ *seq = raw_read_seqcount_latch(&cd.seq);
return cd.read_data + (*seq & 1);
}
--
2.17.1
* [PATCH v3 3/7] arm64: perf: Implement correct cap_user_time
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
2020-07-16 5:11 ` [PATCH v3 1/7] sched_clock: Expose struct clock_read_data Leo Yan
2020-07-16 5:11 ` [PATCH v3 2/7] time/sched_clock: Use raw_read_seqcount_latch() Leo Yan
@ 2020-07-16 5:11 ` Leo Yan
2020-07-16 5:11 ` [PATCH v3 4/7] arm64: perf: Only advertise cap_user_time for arch_timer Leo Yan
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2020-07-16 5:11 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
Ahmed S. Darwish, Ben Dooks (Codethink),
Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
linux-arm-kernel
Cc: Leo Yan
From: Peter Zijlstra <peterz@infradead.org>
As reported by Leo, the existing implementation is broken when the
clock and counter don't intersect at 0.
Use the sched_clock's struct clock_read_data information to correctly
implement cap_user_time and cap_user_time_zero.
Note that the ARM64 counter is architecturally only guaranteed to be
56bit wide (implementations are allowed to be wider) and the existing
perf ABI cannot deal with wrap-around.
This implementation should also be faster than the old one, since we
no longer need to recompute mult and shift every time.
[leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow]
Reported-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
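The arithmetic behind time_zero can be illustrated with a small
userspace sketch. It assumes a portable reimplementation of
mul_u64_u32_shr() (valid for shift <= 32) in place of the kernel's
helper; perf_time_zero() and user_cyc_to_time() are hypothetical names
used only for this illustration.

```c
#include <stdint.h>

/*
 * Portable stand-in for the kernel's mul_u64_u32_shr(): computes
 * (a * mul) >> shift without dropping the high bits of the 96-bit
 * product. Assumes shift <= 32.
 */
static uint64_t mul_u64_u32_shr(uint64_t a, uint32_t mul, unsigned int shift)
{
	uint32_t ah = (uint32_t)(a >> 32);
	uint32_t al = (uint32_t)a;
	uint64_t ret;

	ret = ((uint64_t)al * mul) >> shift;
	if (ah)
		ret += ((uint64_t)ah * mul) << (32 - shift);
	return ret;
}

/*
 * time_zero as computed in the patch: the sched_clock value that a
 * counter reading of 0 would correspond to.
 */
static uint64_t perf_time_zero(uint64_t epoch_ns, uint64_t epoch_cyc,
			       uint32_t mult, uint32_t shift)
{
	return epoch_ns - mul_u64_u32_shr(epoch_cyc, mult, shift);
}

/* What a cap_user_time_zero consumer does with a raw counter value. */
static uint64_t user_cyc_to_time(uint64_t cyc, uint64_t time_zero,
				 uint32_t mult, uint32_t shift)
{
	return time_zero + mul_u64_u32_shr(cyc, mult, shift);
}
```

For example, with epoch_ns = 1000000, epoch_cyc = 5000, mult = 20 and
shift = 0, time_zero is 900000 and a counter value of 5000 converts
back to exactly 1000000. The overflow concern is visible with
a = 1 << 40, mult = 1 << 31, shift = 31: a plain 64-bit multiply wraps
to 0 before the shift, while the split multiplication returns 1 << 40.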
arch/arm64/kernel/perf_event.c | 38 ++++++++++++++++++++++++++--------
1 file changed, 29 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 4d7879484cec..47db6c7cae6a 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,6 +19,7 @@
#include <linux/of.h>
#include <linux/perf/arm_pmu.h>
#include <linux/platform_device.h>
+#include <linux/sched_clock.h>
#include <linux/smp.h>
/* ARMv8 Cortex-A53 specific event types. */
@@ -1165,28 +1166,47 @@ device_initcall(armv8_pmu_driver_init)
void arch_perf_update_userpage(struct perf_event *event,
struct perf_event_mmap_page *userpg, u64 now)
{
- u32 freq;
- u32 shift;
+ struct clock_read_data *rd;
+ unsigned int seq;
+ u64 ns;
/*
* Internal timekeeping for enabled/running/stopped times
* is always computed with the sched_clock.
*/
- freq = arch_timer_get_rate();
userpg->cap_user_time = 1;
+ userpg->cap_user_time_zero = 1;
+
+ do {
+ rd = sched_clock_read_begin(&seq);
+
+ userpg->time_mult = rd->mult;
+ userpg->time_shift = rd->shift;
+ userpg->time_zero = rd->epoch_ns;
+
+ /*
+ * This isn't strictly correct, the ARM64 counter can be
+ * 'short' and then we get funnies when it wraps. The correct
+ * thing would be to extend the perf ABI with a cycle and mask
+ * value, but because wrapping on ARM64 is very rare in
+ * practise this 'works'.
+ */
+ ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
+ userpg->time_zero -= ns;
+
+ } while (sched_clock_read_retry(seq));
+
+ userpg->time_offset = userpg->time_zero - now;
- clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
- NSEC_PER_SEC, 0);
/*
* time_shift is not expected to be greater than 31 due to
* the original published conversion algorithm shifting a
* 32-bit value (now specifies a 64-bit value) - refer
* perf_event_mmap_page documentation in perf_event.h.
*/
- if (shift == 32) {
- shift = 31;
+ if (userpg->time_shift == 32) {
+ userpg->time_shift = 31;
userpg->time_mult >>= 1;
}
- userpg->time_shift = (u16)shift;
- userpg->time_offset = -now;
+
}
--
2.17.1
* [PATCH v3 4/7] arm64: perf: Only advertise cap_user_time for arch_timer
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
` (2 preceding siblings ...)
2020-07-16 5:11 ` [PATCH v3 3/7] arm64: perf: Implement correct cap_user_time Leo Yan
@ 2020-07-16 5:11 ` Leo Yan
2020-07-16 5:11 ` [PATCH v3 5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI Leo Yan
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2020-07-16 5:11 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
Ahmed S. Darwish, Ben Dooks (Codethink),
Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
linux-arm-kernel
Cc: Leo Yan
From: Peter Zijlstra <peterz@infradead.org>
When sched_clock is running on anything other than arch_timer, don't
advertise cap_user_time*.
Requested-by: Will Deacon <will@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
arch/arm64/kernel/perf_event.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 47db6c7cae6a..c016b116ae33 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -13,6 +13,8 @@
#include <asm/sysreg.h>
#include <asm/virt.h>
+#include <clocksource/arm_arch_timer.h>
+
#include <linux/acpi.h>
#include <linux/clocksource.h>
#include <linux/kvm_host.h>
@@ -1170,16 +1172,15 @@ void arch_perf_update_userpage(struct perf_event *event,
unsigned int seq;
u64 ns;
- /*
- * Internal timekeeping for enabled/running/stopped times
- * is always computed with the sched_clock.
- */
- userpg->cap_user_time = 1;
- userpg->cap_user_time_zero = 1;
+ userpg->cap_user_time = 0;
+ userpg->cap_user_time_zero = 0;
do {
rd = sched_clock_read_begin(&seq);
+ if (rd->read_sched_clock != arch_timer_read_counter)
+ return;
+
userpg->time_mult = rd->mult;
userpg->time_shift = rd->shift;
userpg->time_zero = rd->epoch_ns;
@@ -1209,4 +1210,10 @@ void arch_perf_update_userpage(struct perf_event *event,
userpg->time_mult >>= 1;
}
+ /*
+ * Internal timekeeping for enabled/running/stopped times
+ * is always computed with the sched_clock.
+ */
+ userpg->cap_user_time = 1;
+ userpg->cap_user_time_zero = 1;
}
--
2.17.1
* [PATCH v3 5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
` (3 preceding siblings ...)
2020-07-16 5:11 ` [PATCH v3 4/7] arm64: perf: Only advertise cap_user_time for arch_timer Leo Yan
@ 2020-07-16 5:11 ` Leo Yan
2020-07-16 5:11 ` [PATCH v3 6/7] arm64: perf: Add cap_user_time_short Leo Yan
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2020-07-16 5:11 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
Ahmed S. Darwish, Ben Dooks (Codethink),
Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
linux-arm-kernel
Cc: Leo Yan
From: Peter Zijlstra <peterz@infradead.org>
In order to support short clock counters, provide an ABI extension.
As a whole:
u64 time, delta, cyc = read_cycle_counter();
+ if (cap_user_time_short)
+ cyc = time_cycles + ((cyc - time_cycles) & time_mask);
delta = mul_u64_u32_shr(cyc, time_mult, time_shift);
if (cap_user_time_zero)
time = time_zero + delta;
delta += time_offset;
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
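A userspace consumer of the extended ABI might look roughly like the
sketch below. It is an illustration under assumptions: struct
user_time_params is a hypothetical local mirror of the relevant
perf_event_mmap_page fields, cyc_to_perf_time() is a hypothetical
helper that assumes cap_user_time_zero is set, and mul_u64_u32_shr()
is a portable reimplementation valid for shift <= 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local copy of the perf_event_mmap_page time fields. */
struct user_time_params {
	bool cap_user_time_short;
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
	uint64_t time_cycles;
	uint64_t time_mask;
};

/* (a * mul) >> shift without losing high bits; assumes shift <= 32. */
static uint64_t mul_u64_u32_shr(uint64_t a, uint32_t mul, unsigned int shift)
{
	uint32_t ah = (uint32_t)(a >> 32);
	uint32_t al = (uint32_t)a;
	uint64_t ret;

	ret = ((uint64_t)al * mul) >> shift;
	if (ah)
		ret += ((uint64_t)ah * mul) << (32 - shift);
	return ret;
}

/* Convert a raw counter reading to a perf timestamp (time_zero based). */
static uint64_t cyc_to_perf_time(const struct user_time_params *p, uint64_t cyc)
{
	/*
	 * Short counter: rebuild a monotonic cycle value relative to the
	 * kernel's last snapshot, so one wrap since then is handled.
	 */
	if (p->cap_user_time_short)
		cyc = p->time_cycles + ((cyc - p->time_cycles) & p->time_mask);

	return p->time_zero + mul_u64_u32_shr(cyc, p->time_mult, p->time_shift);
}
```

Taking an 8-bit counter purely for illustration (time_mask = 0xff,
time_cycles = 0xf0, time_mult = 1, time_shift = 0, time_zero = 0), a
raw reading of 0x05 taken after a wrap is corrected to
0xf0 + 0x15 = 0x105, whereas code unaware of cap_user_time_short would
use 0x05 directly.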
include/uapi/linux/perf_event.h | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..21a1edd08cbe 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -532,9 +532,10 @@ struct perf_event_mmap_page {
cap_bit0_is_deprecated : 1, /* Always 1, signals that bit 0 is zero */
cap_user_rdpmc : 1, /* The RDPMC instruction can be used to read counts */
- cap_user_time : 1, /* The time_* fields are used */
+ cap_user_time : 1, /* The time_{shift,mult,offset} fields are used */
cap_user_time_zero : 1, /* The time_zero field is used */
- cap_____res : 59;
+ cap_user_time_short : 1, /* the time_{cycle,mask} fields are used */
+ cap_____res : 58;
};
};
@@ -593,13 +594,29 @@ struct perf_event_mmap_page {
* ((rem * time_mult) >> time_shift);
*/
__u64 time_zero;
+
__u32 size; /* Header size up to __reserved[] fields. */
+ __u32 __reserved_1;
+
+ /*
+ * If cap_usr_time_short, the hardware clock is less than 64bit wide
+ * and we must compute the 'cyc' value, as used by cap_usr_time, as:
+ *
+ * cyc = time_cycles + ((cyc - time_cycles) & time_mask)
+ *
+ * NOTE: this form is explicitly chosen such that cap_usr_time_short
+ * is a correction on top of cap_usr_time, and code that doesn't
+ * know about cap_usr_time_short still works under the assumption
+ * the counter doesn't wrap.
+ */
+ __u64 time_cycles;
+ __u64 time_mask;
/*
* Hole for extension of the self monitor capabilities
*/
- __u8 __reserved[118*8+4]; /* align to 1k. */
+ __u8 __reserved[116*8]; /* align to 1k. */
/*
* Control data for the mmap() data buffer.
--
2.17.1
* [PATCH v3 6/7] arm64: perf: Add cap_user_time_short
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
` (4 preceding siblings ...)
2020-07-16 5:11 ` [PATCH v3 5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI Leo Yan
@ 2020-07-16 5:11 ` Leo Yan
2020-07-16 5:11 ` [PATCH v3 7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h Leo Yan
2020-07-20 11:56 ` [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Will Deacon
7 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2020-07-16 5:11 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
Ahmed S. Darwish, Ben Dooks (Codethink),
Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
linux-arm-kernel
Cc: Leo Yan
From: Peter Zijlstra <peterz@infradead.org>
This completes the ARM64 cap_user_time support.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
arch/arm64/kernel/perf_event.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index c016b116ae33..888bcb5d1388 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1174,6 +1174,7 @@ void arch_perf_update_userpage(struct perf_event *event,
userpg->cap_user_time = 0;
userpg->cap_user_time_zero = 0;
+ userpg->cap_user_time_short = 0;
do {
rd = sched_clock_read_begin(&seq);
@@ -1184,13 +1185,13 @@ void arch_perf_update_userpage(struct perf_event *event,
userpg->time_mult = rd->mult;
userpg->time_shift = rd->shift;
userpg->time_zero = rd->epoch_ns;
+ userpg->time_cycles = rd->epoch_cyc;
+ userpg->time_mask = rd->sched_clock_mask;
/*
- * This isn't strictly correct, the ARM64 counter can be
- * 'short' and then we get funnies when it wraps. The correct
- * thing would be to extend the perf ABI with a cycle and mask
- * value, but because wrapping on ARM64 is very rare in
- * practise this 'works'.
+ * Subtract the cycle base, such that software that
+ * doesn't know about cap_user_time_short still 'works'
+ * assuming no wraps.
*/
ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
userpg->time_zero -= ns;
@@ -1216,4 +1217,5 @@ void arch_perf_update_userpage(struct perf_event *event,
*/
userpg->cap_user_time = 1;
userpg->cap_user_time_zero = 1;
+ userpg->cap_user_time_short = 1;
}
--
2.17.1
* [PATCH v3 7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
` (5 preceding siblings ...)
2020-07-16 5:11 ` [PATCH v3 6/7] arm64: perf: Add cap_user_time_short Leo Yan
@ 2020-07-16 5:11 ` Leo Yan
2020-07-20 11:56 ` [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Will Deacon
7 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2020-07-16 5:11 UTC (permalink / raw)
To: Peter Zijlstra, Will Deacon, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
Ahmed S. Darwish, Ben Dooks (Codethink),
Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
linux-arm-kernel
Cc: Leo Yan
To get the changes in the commit:
"perf: Add perf_event_mmap_page::cap_user_time_short ABI"
This update is a prerequisite for adding support for the ABI extension
that handles short clock counters.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
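The change to the __reserved[] size in the synced header follows from
the sizes of the newly added fields, which can be checked with a few
lines of C. The enum names and reserved_bytes_consumed() are
hypothetical and exist only for this check.

```c
#include <stdint.h>

enum {
	OLD_RESERVED_BYTES = 118 * 8 + 4,	/* __reserved[] before the update */
	NEW_RESERVED_BYTES = 116 * 8,		/* __reserved[] after the update */
	/* __reserved_1 (u32) + time_cycles (u64) + time_mask (u64) */
	ADDED_FIELD_BYTES = sizeof(uint32_t) + 2 * sizeof(uint64_t),
};

/* The new fields must consume exactly the bytes taken from the hole. */
_Static_assert(OLD_RESERVED_BYTES - ADDED_FIELD_BYTES == NEW_RESERVED_BYTES,
	       "new fields must be carved out of the reserved hole");

/* Number of reserved bytes given up to the new fields. */
static int reserved_bytes_consumed(void)
{
	return OLD_RESERVED_BYTES - NEW_RESERVED_BYTES;
}
```

That is, 948 reserved bytes minus the 20 bytes of new fields leaves
928 = 116 * 8, so the offset of the data that follows the hole is
unchanged.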
tools/include/uapi/linux/perf_event.h | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..21a1edd08cbe 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -532,9 +532,10 @@ struct perf_event_mmap_page {
cap_bit0_is_deprecated : 1, /* Always 1, signals that bit 0 is zero */
cap_user_rdpmc : 1, /* The RDPMC instruction can be used to read counts */
- cap_user_time : 1, /* The time_* fields are used */
+ cap_user_time : 1, /* The time_{shift,mult,offset} fields are used */
cap_user_time_zero : 1, /* The time_zero field is used */
- cap_____res : 59;
+ cap_user_time_short : 1, /* the time_{cycle,mask} fields are used */
+ cap_____res : 58;
};
};
@@ -593,13 +594,29 @@ struct perf_event_mmap_page {
* ((rem * time_mult) >> time_shift);
*/
__u64 time_zero;
+
__u32 size; /* Header size up to __reserved[] fields. */
+ __u32 __reserved_1;
+
+ /*
+ * If cap_usr_time_short, the hardware clock is less than 64bit wide
+ * and we must compute the 'cyc' value, as used by cap_usr_time, as:
+ *
+ * cyc = time_cycles + ((cyc - time_cycles) & time_mask)
+ *
+ * NOTE: this form is explicitly chosen such that cap_usr_time_short
+ * is a correction on top of cap_usr_time, and code that doesn't
+ * know about cap_usr_time_short still works under the assumption
+ * the counter doesn't wrap.
+ */
+ __u64 time_cycles;
+ __u64 time_mask;
/*
* Hole for extension of the self monitor capabilities
*/
- __u8 __reserved[118*8+4]; /* align to 1k. */
+ __u8 __reserved[116*8]; /* align to 1k. */
/*
* Control data for the mmap() data buffer.
--
2.17.1
* Re: [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support
2020-07-16 5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
` (6 preceding siblings ...)
2020-07-16 5:11 ` [PATCH v3 7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h Leo Yan
@ 2020-07-20 11:56 ` Will Deacon
7 siblings, 0 replies; 9+ messages in thread
From: Will Deacon @ 2020-07-20 11:56 UTC (permalink / raw)
To: Catalin Marinas, Namhyung Kim, Alexander Shishkin, linux-kernel,
Paul Cercueil, Leo Yan, Ingo Molnar, linux-arm-kernel,
Arnaldo Carvalho de Melo, Ahmed S. Darwish, Adrian Hunter,
Jiri Olsa, Ben Dooks (Codethink),
Peter Zijlstra, Kan Liang, Thomas Gleixner, Mark Rutland
Cc: Will Deacon, kernel-team
On Thu, 16 Jul 2020 13:11:23 +0800, Leo Yan wrote:
> This patch set is rebased for Peter's patch set to support
> cap_user_time/cap_user_time_short ABI for Arm64, and export Arm arch
> timer counter related parameters from kernel to Perf tool.
>
> After get feedback from Ahmed, this patch set contains Ahmed's new patch
> to refine sched clock data accessing with raw_read_seqcount_latch().
>
> [...]
Applied to will (for-next/perf), thanks!
[1/7] sched_clock: Expose struct clock_read_data
https://git.kernel.org/will/c/1b86abc1c645
[2/7] time/sched_clock: Use raw_read_seqcount_latch()
https://git.kernel.org/will/c/aadd6e5caaac
[3/7] arm64: perf: Implement correct cap_user_time
https://git.kernel.org/will/c/950b74ddefc4
[4/7] arm64: perf: Only advertise cap_user_time for arch_timer
https://git.kernel.org/will/c/279a811eb520
[5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI
https://git.kernel.org/will/c/6c0246a4588d
[6/7] arm64: perf: Add cap_user_time_short
https://git.kernel.org/will/c/c8f9eb0d6eba
[7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h
https://git.kernel.org/will/c/5271d915a99c
Cheers,
--
Will
https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev