All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support
@ 2020-07-16  5:11 ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

This patch set is rebased for Peter's patch set to support
cap_user_time/cap_user_time_short ABI for Arm64, and export Arm arch
timer counter related parameters from kernel to Perf tool.

After get feedback from Ahmed, this patch set contains Ahmed's new patch
to refine sched clock data accessing with raw_read_seqcount_latch().

This patch set has been rebased on mainline kernel with the latest
commit 994e99a96c9b ("Merge tag 'platform-drivers-x86-v5.8-2' of
git://git.infradead.org/linux-platform-drivers-x86 into master"); it
has been verified with Perf tool for Arm SPE timestamp enabling.


Changes from v2:
- Included Ahmed's patch to use raw_read_seqcount_latch() for
  sched_clock's seqcount latching;
- Changed to use mul_u64_u32_shr() for converting counter to ns
  in Arm64's arch_perf_update_userpage() (PeterZ).


Ahmed S. Darwish (1):
  time/sched_clock: Use raw_read_seqcount_latch()

Leo Yan (1):
  tools headers UAPI: Update tools's copy of linux/perf_event.h

Peter Zijlstra (5):
  sched_clock: Expose struct clock_read_data
  arm64: perf: Implement correct cap_user_time
  arm64: perf: Only advertise cap_user_time for arch_timer
  perf: Add perf_event_mmap_page::cap_user_time_short ABI
  arm64: perf: Add cap_user_time_short

 arch/arm64/kernel/perf_event.c        | 57 ++++++++++++++++++++-------
 include/linux/sched_clock.h           | 28 +++++++++++++
 include/uapi/linux/perf_event.h       | 23 +++++++++--
 kernel/time/sched_clock.c             | 41 ++++++-------------
 tools/include/uapi/linux/perf_event.h | 23 +++++++++--
 5 files changed, 124 insertions(+), 48 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support
@ 2020-07-16  5:11 ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

This patch set is rebased for Peter's patch set to support
cap_user_time/cap_user_time_short ABI for Arm64, and export Arm arch
timer counter related parameters from kernel to Perf tool.

After get feedback from Ahmed, this patch set contains Ahmed's new patch
to refine sched clock data accessing with raw_read_seqcount_latch().

This patch set has been rebased on mainline kernel with the latest
commit 994e99a96c9b ("Merge tag 'platform-drivers-x86-v5.8-2' of
git://git.infradead.org/linux-platform-drivers-x86 into master"); it
has been verified with Perf tool for Arm SPE timestamp enabling.


Changes from v2:
- Included Ahmed's patch to use raw_read_seqcount_latch() for
  sched_clock's seqcount latching;
- Changed to use mul_u64_u32_shr() for converting counter to ns
  in Arm64's arch_perf_update_userpage() (PeterZ).


Ahmed S. Darwish (1):
  time/sched_clock: Use raw_read_seqcount_latch()

Leo Yan (1):
  tools headers UAPI: Update tools's copy of linux/perf_event.h

Peter Zijlstra (5):
  sched_clock: Expose struct clock_read_data
  arm64: perf: Implement correct cap_user_time
  arm64: perf: Only advertise cap_user_time for arch_timer
  perf: Add perf_event_mmap_page::cap_user_time_short ABI
  arm64: perf: Add cap_user_time_short

 arch/arm64/kernel/perf_event.c        | 57 ++++++++++++++++++++-------
 include/linux/sched_clock.h           | 28 +++++++++++++
 include/uapi/linux/perf_event.h       | 23 +++++++++--
 kernel/time/sched_clock.c             | 41 ++++++-------------
 tools/include/uapi/linux/perf_event.h | 23 +++++++++--
 5 files changed, 124 insertions(+), 48 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/7] sched_clock: Expose struct clock_read_data
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-16  5:11   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

In order to support perf_event_mmap_page::cap_time features, an
architecture needs, aside from a userspace readable counter register,
to expose the exact clock data so that userspace can convert the
counter register into a correct timestamp.

Provide struct clock_read_data and two (seqcount) helpers so that
architectures (arm64 in specific) can expose the numbers to userspace.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 include/linux/sched_clock.h | 28 +++++++++++++++++++++++++
 kernel/time/sched_clock.c   | 41 ++++++++++++-------------------------
 2 files changed, 41 insertions(+), 28 deletions(-)

diff --git a/include/linux/sched_clock.h b/include/linux/sched_clock.h
index 0bb04a96a6d4..528718e4ed52 100644
--- a/include/linux/sched_clock.h
+++ b/include/linux/sched_clock.h
@@ -6,6 +6,34 @@
 #define LINUX_SCHED_CLOCK
 
 #ifdef CONFIG_GENERIC_SCHED_CLOCK
+/**
+ * struct clock_read_data - data required to read from sched_clock()
+ *
+ * @epoch_ns:		sched_clock() value at last update
+ * @epoch_cyc:		Clock cycle value at last update.
+ * @sched_clock_mask:   Bitmask for two's complement subtraction of non 64bit
+ *			clocks.
+ * @read_sched_clock:	Current clock source (or dummy source when suspended).
+ * @mult:		Multipler for scaled math conversion.
+ * @shift:		Shift value for scaled math conversion.
+ *
+ * Care must be taken when updating this structure; it is read by
+ * some very hot code paths. It occupies <=40 bytes and, when combined
+ * with the seqcount used to synchronize access, comfortably fits into
+ * a 64 byte cache line.
+ */
+struct clock_read_data {
+	u64 epoch_ns;
+	u64 epoch_cyc;
+	u64 sched_clock_mask;
+	u64 (*read_sched_clock)(void);
+	u32 mult;
+	u32 shift;
+};
+
+extern struct clock_read_data *sched_clock_read_begin(unsigned int *seq);
+extern int sched_clock_read_retry(unsigned int seq);
+
 extern void generic_sched_clock_init(void);
 
 extern void sched_clock_register(u64 (*read)(void), int bits,
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index fa3f800d7d76..0acaadc3156c 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -19,31 +19,6 @@
 
 #include "timekeeping.h"
 
-/**
- * struct clock_read_data - data required to read from sched_clock()
- *
- * @epoch_ns:		sched_clock() value at last update
- * @epoch_cyc:		Clock cycle value at last update.
- * @sched_clock_mask:   Bitmask for two's complement subtraction of non 64bit
- *			clocks.
- * @read_sched_clock:	Current clock source (or dummy source when suspended).
- * @mult:		Multipler for scaled math conversion.
- * @shift:		Shift value for scaled math conversion.
- *
- * Care must be taken when updating this structure; it is read by
- * some very hot code paths. It occupies <=40 bytes and, when combined
- * with the seqcount used to synchronize access, comfortably fits into
- * a 64 byte cache line.
- */
-struct clock_read_data {
-	u64 epoch_ns;
-	u64 epoch_cyc;
-	u64 sched_clock_mask;
-	u64 (*read_sched_clock)(void);
-	u32 mult;
-	u32 shift;
-};
-
 /**
  * struct clock_data - all data needed for sched_clock() (including
  *                     registration of a new clock source)
@@ -93,6 +68,17 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
 	return (cyc * mult) >> shift;
 }
 
+struct clock_read_data *sched_clock_read_begin(unsigned int *seq)
+{
+	*seq = raw_read_seqcount(&cd.seq);
+	return cd.read_data + (*seq & 1);
+}
+
+int sched_clock_read_retry(unsigned int seq)
+{
+	return read_seqcount_retry(&cd.seq, seq);
+}
+
 unsigned long long notrace sched_clock(void)
 {
 	u64 cyc, res;
@@ -100,13 +86,12 @@ unsigned long long notrace sched_clock(void)
 	struct clock_read_data *rd;
 
 	do {
-		seq = raw_read_seqcount(&cd.seq);
-		rd = cd.read_data + (seq & 1);
+		rd = sched_clock_read_begin(&seq);
 
 		cyc = (rd->read_sched_clock() - rd->epoch_cyc) &
 		      rd->sched_clock_mask;
 		res = rd->epoch_ns + cyc_to_ns(cyc, rd->mult, rd->shift);
-	} while (read_seqcount_retry(&cd.seq, seq));
+	} while (sched_clock_read_retry(seq));
 
 	return res;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 1/7] sched_clock: Expose struct clock_read_data
@ 2020-07-16  5:11   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

In order to support perf_event_mmap_page::cap_time features, an
architecture needs, aside from a userspace readable counter register,
to expose the exact clock data so that userspace can convert the
counter register into a correct timestamp.

Provide struct clock_read_data and two (seqcount) helpers so that
architectures (arm64 in specific) can expose the numbers to userspace.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 include/linux/sched_clock.h | 28 +++++++++++++++++++++++++
 kernel/time/sched_clock.c   | 41 ++++++++++++-------------------------
 2 files changed, 41 insertions(+), 28 deletions(-)

diff --git a/include/linux/sched_clock.h b/include/linux/sched_clock.h
index 0bb04a96a6d4..528718e4ed52 100644
--- a/include/linux/sched_clock.h
+++ b/include/linux/sched_clock.h
@@ -6,6 +6,34 @@
 #define LINUX_SCHED_CLOCK
 
 #ifdef CONFIG_GENERIC_SCHED_CLOCK
+/**
+ * struct clock_read_data - data required to read from sched_clock()
+ *
+ * @epoch_ns:		sched_clock() value at last update
+ * @epoch_cyc:		Clock cycle value at last update.
+ * @sched_clock_mask:   Bitmask for two's complement subtraction of non 64bit
+ *			clocks.
+ * @read_sched_clock:	Current clock source (or dummy source when suspended).
+ * @mult:		Multipler for scaled math conversion.
+ * @shift:		Shift value for scaled math conversion.
+ *
+ * Care must be taken when updating this structure; it is read by
+ * some very hot code paths. It occupies <=40 bytes and, when combined
+ * with the seqcount used to synchronize access, comfortably fits into
+ * a 64 byte cache line.
+ */
+struct clock_read_data {
+	u64 epoch_ns;
+	u64 epoch_cyc;
+	u64 sched_clock_mask;
+	u64 (*read_sched_clock)(void);
+	u32 mult;
+	u32 shift;
+};
+
+extern struct clock_read_data *sched_clock_read_begin(unsigned int *seq);
+extern int sched_clock_read_retry(unsigned int seq);
+
 extern void generic_sched_clock_init(void);
 
 extern void sched_clock_register(u64 (*read)(void), int bits,
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index fa3f800d7d76..0acaadc3156c 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -19,31 +19,6 @@
 
 #include "timekeeping.h"
 
-/**
- * struct clock_read_data - data required to read from sched_clock()
- *
- * @epoch_ns:		sched_clock() value at last update
- * @epoch_cyc:		Clock cycle value at last update.
- * @sched_clock_mask:   Bitmask for two's complement subtraction of non 64bit
- *			clocks.
- * @read_sched_clock:	Current clock source (or dummy source when suspended).
- * @mult:		Multipler for scaled math conversion.
- * @shift:		Shift value for scaled math conversion.
- *
- * Care must be taken when updating this structure; it is read by
- * some very hot code paths. It occupies <=40 bytes and, when combined
- * with the seqcount used to synchronize access, comfortably fits into
- * a 64 byte cache line.
- */
-struct clock_read_data {
-	u64 epoch_ns;
-	u64 epoch_cyc;
-	u64 sched_clock_mask;
-	u64 (*read_sched_clock)(void);
-	u32 mult;
-	u32 shift;
-};
-
 /**
  * struct clock_data - all data needed for sched_clock() (including
  *                     registration of a new clock source)
@@ -93,6 +68,17 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
 	return (cyc * mult) >> shift;
 }
 
+struct clock_read_data *sched_clock_read_begin(unsigned int *seq)
+{
+	*seq = raw_read_seqcount(&cd.seq);
+	return cd.read_data + (*seq & 1);
+}
+
+int sched_clock_read_retry(unsigned int seq)
+{
+	return read_seqcount_retry(&cd.seq, seq);
+}
+
 unsigned long long notrace sched_clock(void)
 {
 	u64 cyc, res;
@@ -100,13 +86,12 @@ unsigned long long notrace sched_clock(void)
 	struct clock_read_data *rd;
 
 	do {
-		seq = raw_read_seqcount(&cd.seq);
-		rd = cd.read_data + (seq & 1);
+		rd = sched_clock_read_begin(&seq);
 
 		cyc = (rd->read_sched_clock() - rd->epoch_cyc) &
 		      rd->sched_clock_mask;
 		res = rd->epoch_ns + cyc_to_ns(cyc, rd->mult, rd->shift);
-	} while (read_seqcount_retry(&cd.seq, seq));
+	} while (sched_clock_read_retry(seq));
 
 	return res;
 }
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/7] time/sched_clock: Use raw_read_seqcount_latch()
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-16  5:11   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: "Ahmed S. Darwish" <a.darwish@linutronix.de>

sched_clock uses seqcount_t latching to switch between two storage
places protected by the sequence counter. This allows it to have
interruptible, NMI-safe, seqcount_t write side critical sections.

Since 7fc26327b756 ("seqlock: Introduce raw_read_seqcount_latch()"),
raw_read_seqcount_latch() became the standardized way for seqcount_t
latch read paths. Due to the dependent load, it also has one read
memory barrier less than the currently used raw_read_seqcount() API.

Use raw_read_seqcount_latch() for the seqcount_t latch read path.

Link: https://lkml.kernel.org/r/20200625085745.GD117543@hirez.programming.kicks-ass.net
Link: https://lkml.kernel.org/r/20200715092345.GA231464@debian-buster-darwi.lab.linutronix.de
References: 1809bfa44e10 ("timers, sched/clock: Avoid deadlock during read from NMI")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/time/sched_clock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0acaadc3156c..0deaf4b79fb4 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -70,7 +70,7 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
 
 struct clock_read_data *sched_clock_read_begin(unsigned int *seq)
 {
-	*seq = raw_read_seqcount(&cd.seq);
+	*seq = raw_read_seqcount_latch(&cd.seq);
 	return cd.read_data + (*seq & 1);
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/7] time/sched_clock: Use raw_read_seqcount_latch()
@ 2020-07-16  5:11   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: "Ahmed S. Darwish" <a.darwish@linutronix.de>

sched_clock uses seqcount_t latching to switch between two storage
places protected by the sequence counter. This allows it to have
interruptible, NMI-safe, seqcount_t write side critical sections.

Since 7fc26327b756 ("seqlock: Introduce raw_read_seqcount_latch()"),
raw_read_seqcount_latch() became the standardized way for seqcount_t
latch read paths. Due to the dependent load, it also has one read
memory barrier less than the currently used raw_read_seqcount() API.

Use raw_read_seqcount_latch() for the seqcount_t latch read path.

Link: https://lkml.kernel.org/r/20200625085745.GD117543@hirez.programming.kicks-ass.net
Link: https://lkml.kernel.org/r/20200715092345.GA231464@debian-buster-darwi.lab.linutronix.de
References: 1809bfa44e10 ("timers, sched/clock: Avoid deadlock during read from NMI")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/time/sched_clock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0acaadc3156c..0deaf4b79fb4 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -70,7 +70,7 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
 
 struct clock_read_data *sched_clock_read_begin(unsigned int *seq)
 {
-	*seq = raw_read_seqcount(&cd.seq);
+	*seq = raw_read_seqcount_latch(&cd.seq);
 	return cd.read_data + (*seq & 1);
 }
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 3/7] arm64: perf: Implement correct cap_user_time
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-16  5:11   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

As reported by Leo; the existing implementation is broken when the
clock and counter don't intersect at 0.

Use the sched_clock's struct clock_read_data information to correctly
implement cap_user_time and cap_user_time_zero.

Note that the ARM64 counter is architecturally only guaranteed to be
56bit wide (implementations are allowed to be wider) and the existing
perf ABI cannot deal with wrap-around.

This implementation should also be faster than the old; seeing how we
don't need to recompute mult and shift all the time.

[leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow]

Reported-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kernel/perf_event.c | 38 ++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 4d7879484cec..47db6c7cae6a 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
+#include <linux/sched_clock.h>
 #include <linux/smp.h>
 
 /* ARMv8 Cortex-A53 specific event types. */
@@ -1165,28 +1166,47 @@ device_initcall(armv8_pmu_driver_init)
 void arch_perf_update_userpage(struct perf_event *event,
 			       struct perf_event_mmap_page *userpg, u64 now)
 {
-	u32 freq;
-	u32 shift;
+	struct clock_read_data *rd;
+	unsigned int seq;
+	u64 ns;
 
 	/*
 	 * Internal timekeeping for enabled/running/stopped times
 	 * is always computed with the sched_clock.
 	 */
-	freq = arch_timer_get_rate();
 	userpg->cap_user_time = 1;
+	userpg->cap_user_time_zero = 1;
+
+	do {
+		rd = sched_clock_read_begin(&seq);
+
+		userpg->time_mult = rd->mult;
+		userpg->time_shift = rd->shift;
+		userpg->time_zero = rd->epoch_ns;
+
+		/*
+		 * This isn't strictly correct, the ARM64 counter can be
+		 * 'short' and then we get funnies when it wraps. The correct
+		 * thing would be to extend the perf ABI with a cycle and mask
+		 * value, but because wrapping on ARM64 is very rare in
+		 * practise this 'works'.
+		 */
+		ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
+		userpg->time_zero -= ns;
+
+	} while (sched_clock_read_retry(seq));
+
+	userpg->time_offset = userpg->time_zero - now;
 
-	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
-			NSEC_PER_SEC, 0);
 	/*
 	 * time_shift is not expected to be greater than 31 due to
 	 * the original published conversion algorithm shifting a
 	 * 32-bit value (now specifies a 64-bit value) - refer
 	 * perf_event_mmap_page documentation in perf_event.h.
 	 */
-	if (shift == 32) {
-		shift = 31;
+	if (userpg->time_shift == 32) {
+		userpg->time_shift = 31;
 		userpg->time_mult >>= 1;
 	}
-	userpg->time_shift = (u16)shift;
-	userpg->time_offset = -now;
+
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 3/7] arm64: perf: Implement correct cap_user_time
@ 2020-07-16  5:11   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

As reported by Leo; the existing implementation is broken when the
clock and counter don't intersect at 0.

Use the sched_clock's struct clock_read_data information to correctly
implement cap_user_time and cap_user_time_zero.

Note that the ARM64 counter is architecturally only guaranteed to be
56bit wide (implementations are allowed to be wider) and the existing
perf ABI cannot deal with wrap-around.

This implementation should also be faster than the old; seeing how we
don't need to recompute mult and shift all the time.

[leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow]

Reported-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kernel/perf_event.c | 38 ++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 4d7879484cec..47db6c7cae6a 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
+#include <linux/sched_clock.h>
 #include <linux/smp.h>
 
 /* ARMv8 Cortex-A53 specific event types. */
@@ -1165,28 +1166,47 @@ device_initcall(armv8_pmu_driver_init)
 void arch_perf_update_userpage(struct perf_event *event,
 			       struct perf_event_mmap_page *userpg, u64 now)
 {
-	u32 freq;
-	u32 shift;
+	struct clock_read_data *rd;
+	unsigned int seq;
+	u64 ns;
 
 	/*
 	 * Internal timekeeping for enabled/running/stopped times
 	 * is always computed with the sched_clock.
 	 */
-	freq = arch_timer_get_rate();
 	userpg->cap_user_time = 1;
+	userpg->cap_user_time_zero = 1;
+
+	do {
+		rd = sched_clock_read_begin(&seq);
+
+		userpg->time_mult = rd->mult;
+		userpg->time_shift = rd->shift;
+		userpg->time_zero = rd->epoch_ns;
+
+		/*
+		 * This isn't strictly correct, the ARM64 counter can be
+		 * 'short' and then we get funnies when it wraps. The correct
+		 * thing would be to extend the perf ABI with a cycle and mask
+		 * value, but because wrapping on ARM64 is very rare in
+		 * practise this 'works'.
+		 */
+		ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
+		userpg->time_zero -= ns;
+
+	} while (sched_clock_read_retry(seq));
+
+	userpg->time_offset = userpg->time_zero - now;
 
-	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
-			NSEC_PER_SEC, 0);
 	/*
 	 * time_shift is not expected to be greater than 31 due to
 	 * the original published conversion algorithm shifting a
 	 * 32-bit value (now specifies a 64-bit value) - refer
 	 * perf_event_mmap_page documentation in perf_event.h.
 	 */
-	if (shift == 32) {
-		shift = 31;
+	if (userpg->time_shift == 32) {
+		userpg->time_shift = 31;
 		userpg->time_mult >>= 1;
 	}
-	userpg->time_shift = (u16)shift;
-	userpg->time_offset = -now;
+
 }
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 4/7] arm64: perf: Only advertise cap_user_time for arch_timer
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-16  5:11   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

When sched_clock is running on anything other than arch_timer, don't
advertise cap_user_time*.

Requested-by: Will Deacon <will@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kernel/perf_event.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 47db6c7cae6a..c016b116ae33 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -13,6 +13,8 @@
 #include <asm/sysreg.h>
 #include <asm/virt.h>
 
+#include <clocksource/arm_arch_timer.h>
+
 #include <linux/acpi.h>
 #include <linux/clocksource.h>
 #include <linux/kvm_host.h>
@@ -1170,16 +1172,15 @@ void arch_perf_update_userpage(struct perf_event *event,
 	unsigned int seq;
 	u64 ns;
 
-	/*
-	 * Internal timekeeping for enabled/running/stopped times
-	 * is always computed with the sched_clock.
-	 */
-	userpg->cap_user_time = 1;
-	userpg->cap_user_time_zero = 1;
+	userpg->cap_user_time = 0;
+	userpg->cap_user_time_zero = 0;
 
 	do {
 		rd = sched_clock_read_begin(&seq);
 
+		if (rd->read_sched_clock != arch_timer_read_counter)
+			return;
+
 		userpg->time_mult = rd->mult;
 		userpg->time_shift = rd->shift;
 		userpg->time_zero = rd->epoch_ns;
@@ -1209,4 +1210,10 @@ void arch_perf_update_userpage(struct perf_event *event,
 		userpg->time_mult >>= 1;
 	}
 
+	/*
+	 * Internal timekeeping for enabled/running/stopped times
+	 * is always computed with the sched_clock.
+	 */
+	userpg->cap_user_time = 1;
+	userpg->cap_user_time_zero = 1;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 4/7] arm64: perf: Only advertise cap_user_time for arch_timer
@ 2020-07-16  5:11   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

When sched_clock is running on anything other than arch_timer, don't
advertise cap_user_time*.

Requested-by: Will Deacon <will@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kernel/perf_event.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 47db6c7cae6a..c016b116ae33 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -13,6 +13,8 @@
 #include <asm/sysreg.h>
 #include <asm/virt.h>
 
+#include <clocksource/arm_arch_timer.h>
+
 #include <linux/acpi.h>
 #include <linux/clocksource.h>
 #include <linux/kvm_host.h>
@@ -1170,16 +1172,15 @@ void arch_perf_update_userpage(struct perf_event *event,
 	unsigned int seq;
 	u64 ns;
 
-	/*
-	 * Internal timekeeping for enabled/running/stopped times
-	 * is always computed with the sched_clock.
-	 */
-	userpg->cap_user_time = 1;
-	userpg->cap_user_time_zero = 1;
+	userpg->cap_user_time = 0;
+	userpg->cap_user_time_zero = 0;
 
 	do {
 		rd = sched_clock_read_begin(&seq);
 
+		if (rd->read_sched_clock != arch_timer_read_counter)
+			return;
+
 		userpg->time_mult = rd->mult;
 		userpg->time_shift = rd->shift;
 		userpg->time_zero = rd->epoch_ns;
@@ -1209,4 +1210,10 @@ void arch_perf_update_userpage(struct perf_event *event,
 		userpg->time_mult >>= 1;
 	}
 
+	/*
+	 * Internal timekeeping for enabled/running/stopped times
+	 * is always computed with the sched_clock.
+	 */
+	userpg->cap_user_time = 1;
+	userpg->cap_user_time_zero = 1;
 }
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-16  5:11   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

In order to support short clock counters, provide an ABI extension.

As a whole:

    u64 time, delta, cyc = read_cycle_counter();

+   if (cap_user_time_short)
+	cyc = time_cycle + ((cyc - time_cycle) & time_mask);

    delta = mul_u64_u32_shr(cyc, time_mult, time_shift);

    if (cap_user_time_zero)
	time = time_zero + delta;

    delta += time_offset;

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 include/uapi/linux/perf_event.h | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..21a1edd08cbe 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -532,9 +532,10 @@ struct perf_event_mmap_page {
 				cap_bit0_is_deprecated	: 1, /* Always 1, signals that bit 0 is zero */
 
 				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
-				cap_user_time		: 1, /* The time_* fields are used */
+				cap_user_time		: 1, /* The time_{shift,mult,offset} fields are used */
 				cap_user_time_zero	: 1, /* The time_zero field is used */
-				cap_____res		: 59;
+				cap_user_time_short	: 1, /* the time_{cycle,mask} fields are used */
+				cap_____res		: 58;
 		};
 	};
 
@@ -593,13 +594,29 @@ struct perf_event_mmap_page {
 	 *               ((rem * time_mult) >> time_shift);
 	 */
 	__u64	time_zero;
+
 	__u32	size;			/* Header size up to __reserved[] fields. */
+	__u32	__reserved_1;
+
+	/*
+	 * If cap_usr_time_short, the hardware clock is less than 64bit wide
+	 * and we must compute the 'cyc' value, as used by cap_usr_time, as:
+	 *
+	 *   cyc = time_cycles + ((cyc - time_cycles) & time_mask)
+	 *
+	 * NOTE: this form is explicitly chosen such that cap_usr_time_short
+	 *       is a correction on top of cap_usr_time, and code that doesn't
+	 *       know about cap_usr_time_short still works under the assumption
+	 *       the counter doesn't wrap.
+	 */
+	__u64	time_cycles;
+	__u64	time_mask;
 
 		/*
 		 * Hole for extension of the self monitor capabilities
 		 */
 
-	__u8	__reserved[118*8+4];	/* align to 1k. */
+	__u8	__reserved[116*8];	/* align to 1k. */
 
 	/*
 	 * Control data for the mmap() data buffer.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI
@ 2020-07-16  5:11   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

In order to support short clock counters, provide an ABI extension.

As a whole:

    u64 time, delta, cyc = read_cycle_counter();

+   if (cap_user_time_short)
+	cyc = time_cycle + ((cyc - time_cycle) & time_mask);

    delta = mul_u64_u32_shr(cyc, time_mult, time_shift);

    if (cap_user_time_zero)
	time = time_zero + delta;

    delta += time_offset;

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 include/uapi/linux/perf_event.h | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..21a1edd08cbe 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -532,9 +532,10 @@ struct perf_event_mmap_page {
 				cap_bit0_is_deprecated	: 1, /* Always 1, signals that bit 0 is zero */
 
 				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
-				cap_user_time		: 1, /* The time_* fields are used */
+				cap_user_time		: 1, /* The time_{shift,mult,offset} fields are used */
 				cap_user_time_zero	: 1, /* The time_zero field is used */
-				cap_____res		: 59;
+				cap_user_time_short	: 1, /* the time_{cycle,mask} fields are used */
+				cap_____res		: 58;
 		};
 	};
 
@@ -593,13 +594,29 @@ struct perf_event_mmap_page {
 	 *               ((rem * time_mult) >> time_shift);
 	 */
 	__u64	time_zero;
+
 	__u32	size;			/* Header size up to __reserved[] fields. */
+	__u32	__reserved_1;
+
+	/*
+	 * If cap_usr_time_short, the hardware clock is less than 64bit wide
+	 * and we must compute the 'cyc' value, as used by cap_usr_time, as:
+	 *
+	 *   cyc = time_cycles + ((cyc - time_cycles) & time_mask)
+	 *
+	 * NOTE: this form is explicitly chosen such that cap_usr_time_short
+	 *       is a correction on top of cap_usr_time, and code that doesn't
+	 *       know about cap_usr_time_short still works under the assumption
+	 *       the counter doesn't wrap.
+	 */
+	__u64	time_cycles;
+	__u64	time_mask;
 
 		/*
 		 * Hole for extension of the self monitor capabilities
 		 */
 
-	__u8	__reserved[118*8+4];	/* align to 1k. */
+	__u8	__reserved[116*8];	/* align to 1k. */
 
 	/*
 	 * Control data for the mmap() data buffer.
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 6/7] arm64: perf: Add cap_user_time_short
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-16  5:11   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

This completes the ARM64 cap_user_time support.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kernel/perf_event.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index c016b116ae33..888bcb5d1388 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1174,6 +1174,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 
 	userpg->cap_user_time = 0;
 	userpg->cap_user_time_zero = 0;
+	userpg->cap_user_time_short = 0;
 
 	do {
 		rd = sched_clock_read_begin(&seq);
@@ -1184,13 +1185,13 @@ void arch_perf_update_userpage(struct perf_event *event,
 		userpg->time_mult = rd->mult;
 		userpg->time_shift = rd->shift;
 		userpg->time_zero = rd->epoch_ns;
+		userpg->time_cycles = rd->epoch_cyc;
+		userpg->time_mask = rd->sched_clock_mask;
 
 		/*
-		 * This isn't strictly correct, the ARM64 counter can be
-		 * 'short' and then we get funnies when it wraps. The correct
-		 * thing would be to extend the perf ABI with a cycle and mask
-		 * value, but because wrapping on ARM64 is very rare in
-		 * practise this 'works'.
+		 * Subtract the cycle base, such that software that
+		 * doesn't know about cap_user_time_short still 'works'
+		 * assuming no wraps.
 		 */
 		ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
 		userpg->time_zero -= ns;
@@ -1216,4 +1217,5 @@ void arch_perf_update_userpage(struct perf_event *event,
 	 */
 	userpg->cap_user_time = 1;
 	userpg->cap_user_time_zero = 1;
+	userpg->cap_user_time_short = 1;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 6/7] arm64: perf: Add cap_user_time_short
@ 2020-07-16  5:11   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

From: Peter Zijlstra <peterz@infradead.org>

This completes the ARM64 cap_user_time support.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kernel/perf_event.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index c016b116ae33..888bcb5d1388 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1174,6 +1174,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 
 	userpg->cap_user_time = 0;
 	userpg->cap_user_time_zero = 0;
+	userpg->cap_user_time_short = 0;
 
 	do {
 		rd = sched_clock_read_begin(&seq);
@@ -1184,13 +1185,13 @@ void arch_perf_update_userpage(struct perf_event *event,
 		userpg->time_mult = rd->mult;
 		userpg->time_shift = rd->shift;
 		userpg->time_zero = rd->epoch_ns;
+		userpg->time_cycles = rd->epoch_cyc;
+		userpg->time_mask = rd->sched_clock_mask;
 
 		/*
-		 * This isn't strictly correct, the ARM64 counter can be
-		 * 'short' and then we get funnies when it wraps. The correct
-		 * thing would be to extend the perf ABI with a cycle and mask
-		 * value, but because wrapping on ARM64 is very rare in
-		 * practise this 'works'.
+		 * Subtract the cycle base, such that software that
+		 * doesn't know about cap_user_time_short still 'works'
+		 * assuming no wraps.
 		 */
 		ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
 		userpg->time_zero -= ns;
@@ -1216,4 +1217,5 @@ void arch_perf_update_userpage(struct perf_event *event,
 	 */
 	userpg->cap_user_time = 1;
 	userpg->cap_user_time_zero = 1;
+	userpg->cap_user_time_short = 1;
 }
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-16  5:11   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

To get the changes in the commit:

  "perf: Add perf_event_mmap_page::cap_user_time_short ABI"

This update is a prerequisite to add support for short clock counters
related ABI extension.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/include/uapi/linux/perf_event.h | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..21a1edd08cbe 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -532,9 +532,10 @@ struct perf_event_mmap_page {
 				cap_bit0_is_deprecated	: 1, /* Always 1, signals that bit 0 is zero */
 
 				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
-				cap_user_time		: 1, /* The time_* fields are used */
+				cap_user_time		: 1, /* The time_{shift,mult,offset} fields are used */
 				cap_user_time_zero	: 1, /* The time_zero field is used */
-				cap_____res		: 59;
+				cap_user_time_short	: 1, /* the time_{cycle,mask} fields are used */
+				cap_____res		: 58;
 		};
 	};
 
@@ -593,13 +594,29 @@ struct perf_event_mmap_page {
 	 *               ((rem * time_mult) >> time_shift);
 	 */
 	__u64	time_zero;
+
 	__u32	size;			/* Header size up to __reserved[] fields. */
+	__u32	__reserved_1;
+
+	/*
+	 * If cap_usr_time_short, the hardware clock is less than 64bit wide
+	 * and we must compute the 'cyc' value, as used by cap_usr_time, as:
+	 *
+	 *   cyc = time_cycles + ((cyc - time_cycles) & time_mask)
+	 *
+	 * NOTE: this form is explicitly chosen such that cap_usr_time_short
+	 *       is a correction on top of cap_usr_time, and code that doesn't
+	 *       know about cap_usr_time_short still works under the assumption
+	 *       the counter doesn't wrap.
+	 */
+	__u64	time_cycles;
+	__u64	time_mask;
 
 		/*
 		 * Hole for extension of the self monitor capabilities
 		 */
 
-	__u8	__reserved[118*8+4];	/* align to 1k. */
+	__u8	__reserved[116*8];	/* align to 1k. */
 
 	/*
 	 * Control data for the mmap() data buffer.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h
@ 2020-07-16  5:11   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2020-07-16  5:11 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Catalin Marinas, Thomas Gleixner,
	Ahmed S. Darwish, Ben Dooks (Codethink),
	Paul Cercueil, Adrian Hunter, Kan Liang, linux-kernel,
	linux-arm-kernel
  Cc: Leo Yan

To get the changes in the commit:

  "perf: Add perf_event_mmap_page::cap_user_time_short ABI"

This update is a prerequisite to add support for short clock counters
related ABI extension.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/include/uapi/linux/perf_event.h | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..21a1edd08cbe 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -532,9 +532,10 @@ struct perf_event_mmap_page {
 				cap_bit0_is_deprecated	: 1, /* Always 1, signals that bit 0 is zero */
 
 				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
-				cap_user_time		: 1, /* The time_* fields are used */
+				cap_user_time		: 1, /* The time_{shift,mult,offset} fields are used */
 				cap_user_time_zero	: 1, /* The time_zero field is used */
-				cap_____res		: 59;
+				cap_user_time_short	: 1, /* the time_{cycle,mask} fields are used */
+				cap_____res		: 58;
 		};
 	};
 
@@ -593,13 +594,29 @@ struct perf_event_mmap_page {
 	 *               ((rem * time_mult) >> time_shift);
 	 */
 	__u64	time_zero;
+
 	__u32	size;			/* Header size up to __reserved[] fields. */
+	__u32	__reserved_1;
+
+	/*
+	 * If cap_usr_time_short, the hardware clock is less than 64bit wide
+	 * and we must compute the 'cyc' value, as used by cap_usr_time, as:
+	 *
+	 *   cyc = time_cycles + ((cyc - time_cycles) & time_mask)
+	 *
+	 * NOTE: this form is explicitly chosen such that cap_usr_time_short
+	 *       is a correction on top of cap_usr_time, and code that doesn't
+	 *       know about cap_usr_time_short still works under the assumption
+	 *       the counter doesn't wrap.
+	 */
+	__u64	time_cycles;
+	__u64	time_mask;
 
 		/*
 		 * Hole for extension of the self monitor capabilities
 		 */
 
-	__u8	__reserved[118*8+4];	/* align to 1k. */
+	__u8	__reserved[116*8];	/* align to 1k. */
 
 	/*
 	 * Control data for the mmap() data buffer.
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support
  2020-07-16  5:11 ` Leo Yan
@ 2020-07-20 11:56   ` Will Deacon
  -1 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2020-07-20 11:56 UTC (permalink / raw)
  To: Catalin Marinas, Namhyung Kim, Alexander Shishkin, linux-kernel,
	Paul Cercueil, Leo Yan, Ingo Molnar, linux-arm-kernel,
	Arnaldo Carvalho de Melo, Ahmed S. Darwish, Adrian Hunter,
	Jiri Olsa, Ben Dooks (Codethink),
	Peter Zijlstra, Kan Liang, Thomas Gleixner, Mark Rutland
  Cc: kernel-team, Will Deacon

On Thu, 16 Jul 2020 13:11:23 +0800, Leo Yan wrote:
> This patch set is rebased for Peter's patch set to support
> cap_user_time/cap_user_time_short ABI for Arm64, and export Arm arch
> timer counter related parameters from kernel to Perf tool.
> 
> After get feedback from Ahmed, this patch set contains Ahmed's new patch
> to refine sched clock data accessing with raw_read_seqcount_latch().
> 
> [...]

Applied to will (for-next/perf), thanks!

[1/7] sched_clock: Expose struct clock_read_data
      https://git.kernel.org/will/c/1b86abc1c645
[2/7] time/sched_clock: Use raw_read_seqcount_latch()
      https://git.kernel.org/will/c/aadd6e5caaac
[3/7] arm64: perf: Implement correct cap_user_time
      https://git.kernel.org/will/c/950b74ddefc4
[4/7] arm64: perf: Only advertise cap_user_time for arch_timer
      https://git.kernel.org/will/c/279a811eb520
[5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI
      https://git.kernel.org/will/c/6c0246a4588d
[6/7] arm64: perf: Add cap_user_time_short
      https://git.kernel.org/will/c/c8f9eb0d6eba
[7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h
      https://git.kernel.org/will/c/5271d915a99c

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support
@ 2020-07-20 11:56   ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2020-07-20 11:56 UTC (permalink / raw)
  To: Catalin Marinas, Namhyung Kim, Alexander Shishkin, linux-kernel,
	Paul Cercueil, Leo Yan, Ingo Molnar, linux-arm-kernel,
	Arnaldo Carvalho de Melo, Ahmed S. Darwish, Adrian Hunter,
	Jiri Olsa, Ben Dooks (Codethink),
	Peter Zijlstra, Kan Liang, Thomas Gleixner, Mark Rutland
  Cc: Will Deacon, kernel-team

On Thu, 16 Jul 2020 13:11:23 +0800, Leo Yan wrote:
> This patch set is rebased for Peter's patch set to support
> cap_user_time/cap_user_time_short ABI for Arm64, and export Arm arch
> timer counter related parameters from kernel to Perf tool.
> 
> After get feedback from Ahmed, this patch set contains Ahmed's new patch
> to refine sched clock data accessing with raw_read_seqcount_latch().
> 
> [...]

Applied to will (for-next/perf), thanks!

[1/7] sched_clock: Expose struct clock_read_data
      https://git.kernel.org/will/c/1b86abc1c645
[2/7] time/sched_clock: Use raw_read_seqcount_latch()
      https://git.kernel.org/will/c/aadd6e5caaac
[3/7] arm64: perf: Implement correct cap_user_time
      https://git.kernel.org/will/c/950b74ddefc4
[4/7] arm64: perf: Only advertise cap_user_time for arch_timer
      https://git.kernel.org/will/c/279a811eb520
[5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI
      https://git.kernel.org/will/c/6c0246a4588d
[6/7] arm64: perf: Add cap_user_time_short
      https://git.kernel.org/will/c/c8f9eb0d6eba
[7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h
      https://git.kernel.org/will/c/5271d915a99c

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-07-20 11:57 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-16  5:11 [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Leo Yan
2020-07-16  5:11 ` Leo Yan
2020-07-16  5:11 ` [PATCH v3 1/7] sched_clock: Expose struct clock_read_data Leo Yan
2020-07-16  5:11   ` Leo Yan
2020-07-16  5:11 ` [PATCH v3 2/7] time/sched_clock: Use raw_read_seqcount_latch() Leo Yan
2020-07-16  5:11   ` Leo Yan
2020-07-16  5:11 ` [PATCH v3 3/7] arm64: perf: Implement correct cap_user_time Leo Yan
2020-07-16  5:11   ` Leo Yan
2020-07-16  5:11 ` [PATCH v3 4/7] arm64: perf: Only advertise cap_user_time for arch_timer Leo Yan
2020-07-16  5:11   ` Leo Yan
2020-07-16  5:11 ` [PATCH v3 5/7] perf: Add perf_event_mmap_page::cap_user_time_short ABI Leo Yan
2020-07-16  5:11   ` Leo Yan
2020-07-16  5:11 ` [PATCH v3 6/7] arm64: perf: Add cap_user_time_short Leo Yan
2020-07-16  5:11   ` Leo Yan
2020-07-16  5:11 ` [PATCH v3 7/7] tools headers UAPI: Update tools's copy of linux/perf_event.h Leo Yan
2020-07-16  5:11   ` Leo Yan
2020-07-20 11:56 ` [PATCH v3 0/7] arm64: perf: Proper cap_user_time* support Will Deacon
2020-07-20 11:56   ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.