linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer
@ 2021-06-02 10:29 Leo Yan
  2021-06-02 10:30 ` [PATCH v2 1/8] perf/ring_buffer: Add comment for barriers on " Leo Yan
                   ` (7 more replies)
  0 siblings, 8 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Based on the discussion [1], this patch series is to refine the memory
barriers for AUX ring buffer.

Patches 01 ~ 04 to address the barriers usage in the kernel.  The first
patch is to make clear comment for how to use the barriers between the
data store and aux_head store, this asks the driver to make sure the
data is visible.  Patches 02 ~ 04 is to refine the drivers for barriers
after the data store.

Patches 05 ~ 07 is to fix and clean up the memory barries in perf tool
for AUX ring buffer.

Since the 64-bit value's atomicity is not promised on 32-bit perf, the
last patch is to report error and let perf to directly exit for this
case.

Have testes the patches on Arm64 Juno platform.

[1] https://lore.kernel.org/patchwork/patch/1431867/


Leo Yan (8):
  perf/ring_buffer: Add comment for barriers on AUX ring buffer
  coresight: tmc-etr: Add barrier after updating AUX ring buffer
  coresight: tmc-etf: Add comment for store ordering
  perf/x86: Add barrier after updating bts
  perf auxtrace: Change to use SMP memory barriers
  perf auxtrace: Drop legacy __sync functions
  perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  perf record: Directly bail out for compat case

 arch/x86/events/intel/bts.c                   |  3 +++
 .../hwtracing/coresight/coresight-tmc-etf.c   |  6 +++++
 .../hwtracing/coresight/coresight-tmc-etr.c   |  8 ++++++
 kernel/events/ring_buffer.c                   |  9 +++++++
 tools/perf/builtin-record.c                   | 17 ++++++++++++
 tools/perf/util/auxtrace.h                    | 27 +++----------------
 6 files changed, 47 insertions(+), 23 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/8] perf/ring_buffer: Add comment for barriers on AUX ring buffer
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-07 15:27   ` Peter Zijlstra
  2021-06-02 10:30 ` [PATCH v2 2/8] coresight: tmc-etr: Add barrier after updating " Leo Yan
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

AUX ring buffer applies almost the same barriers as perf ring buffer,
but there has an exception for ordering between writing the AUX trace
data and updating user_page::aux_head.

This patch adds comment for how to use the barriers on AUX ring buffer,
and gives comment to ask the drivers to flush the trace data into AUX
ring buffer prior to updating user_page::aux_head.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/events/ring_buffer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 52868716ec35..5cf6579be05e 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -509,6 +509,15 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)
 		perf_event_aux_event(handle->event, aux_head, size,
 				     handle->aux_flags);
 
+	/*
+	 * See perf_output_put_handle(), AUX ring buffer applies the same
+	 * barrier pairing as the perf ring buffer; except for B, since
+	 * AUX ring buffer is written by hardware trace, we cannot simply
+	 * use the generic memory barrier (like smp_wmb()) prior to update
+	 * user_page::aux_head, the hardware trace driver takes the
+	 * responsibility to ensure the trace data has been flushed into
+	 * the AUX buffer before calling perf_aux_output_end().
+	 */
 	WRITE_ONCE(rb->user_page->aux_head, rb->aux_head);
 	if (rb_need_aux_wakeup(rb))
 		wakeup = true;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/8] coresight: tmc-etr: Add barrier after updating AUX ring buffer
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
  2021-06-02 10:30 ` [PATCH v2 1/8] perf/ring_buffer: Add comment for barriers on " Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-02 10:30 ` [PATCH v2 3/8] coresight: tmc-etf: Add comment for store ordering Leo Yan
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since a memory barrier is required between AUX trace data store and
aux_head store, and the AUX trace data is filled with memcpy(), it's
sufficient to use smp_wmb() so can ensure the trace data is visible
prior to updating aux_head.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index acdb59e0e661..713205db15a1 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 	 */
 	if (etr_perf->snapshot)
 		handle->head += size;
+
+	/*
+	 * It requires the ordering between the AUX trace data and aux_head
+	 * store, below smp_wmb() ensures the AUX trace data is visible prior
+	 * to updating aux_head.
+	 */
+	smp_wmb();
+
 out:
 	/*
 	 * Don't set the TRUNCATED flag in snapshot mode because 1) the
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/8] coresight: tmc-etf: Add comment for store ordering
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
  2021-06-02 10:30 ` [PATCH v2 1/8] perf/ring_buffer: Add comment for barriers on " Leo Yan
  2021-06-02 10:30 ` [PATCH v2 2/8] coresight: tmc-etr: Add barrier after updating " Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-02 10:30 ` [PATCH v2 4/8] perf/x86: Add barrier after updating bts Leo Yan
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

AUX ring buffer is required to separate the data store and aux_head
store, since the function CS_LOCK() has contained memory barrier mb(),
mb() is a more conservative barrier than smp_wmb() on Arm32/Arm64, thus
it's needless to add any explicit barrier anymore.

Add comment to make clear for the barrier usage for ETF.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 45b85edfc690..9a42ee689921 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -553,6 +553,12 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	if (buf->snapshot)
 		handle->head += to_read;
 
+	/*
+	 * AUX ring buffer requires to use memory barrier to separate the trace
+	 * data store and aux_head store, because CS_LOCK() contains mb() which
+	 * gives more heavy barrier than smp_wmb(), it's not necessary to
+	 * explicitly invoke any barrier.
+	 */
 	CS_LOCK(drvdata->base);
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/8] perf/x86: Add barrier after updating bts
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (2 preceding siblings ...)
  2021-06-02 10:30 ` [PATCH v2 3/8] coresight: tmc-etf: Add comment for store ordering Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-07 15:29   ` Peter Zijlstra
  2021-06-02 10:30 ` [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers Leo Yan
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Add barrier wmb() to separate the AUX data store and aux_head store.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/x86/events/intel/bts.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 6320d2cfd9d3..4a015d160bc5 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -209,6 +209,9 @@ static void bts_update(struct bts_ctx *bts)
 	} else {
 		local_set(&buf->data_size, head);
 	}
+
+	/* The WMB separates data store and aux_head store matches. */
+	wmb();
 }
 
 static int
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (3 preceding siblings ...)
  2021-06-02 10:30 ` [PATCH v2 4/8] perf/x86: Add barrier after updating bts Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-07 10:02   ` Adrian Hunter
  2021-06-07 15:29   ` Peter Zijlstra
  2021-06-02 10:30 ` [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions Leo Yan
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

The kernel and the userspace tool can access the AUX ring buffer head
and tail from different CPUs, thus SMP class of barriers are required
on SMP system.

This patch changes to use SMP barriers to replace mb() and rmb()
barriers.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index a4fbb33b7245..42b7ef811bde 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -444,7 +444,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 	u64 head = READ_ONCE(pc->aux_head);
 
 	/* Ensure all reads are done after we read the head */
-	rmb();
+	smp_rmb();
 	return head;
 }
 
@@ -458,7 +458,7 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 #endif
 
 	/* Ensure all reads are done after we read the head */
-	rmb();
+	smp_rmb();
 	return head;
 }
 
@@ -470,7 +470,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 #endif
 
 	/* Ensure all reads are done before we write the tail out */
-	mb();
+	smp_mb();
 #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	pc->aux_tail = tail;
 #else
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (4 preceding siblings ...)
  2021-06-02 10:30 ` [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-02 10:47   ` Adrian Hunter
  2021-06-02 10:30 ` [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
  2021-06-02 10:30 ` [PATCH v2 8/8] perf record: Directly bail out for compat case Leo Yan
  7 siblings, 1 reply; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

The main purpose for using __sync built-in functions is to support
compat mode for 32-bit perf with 64-bit kernel.  But using these
built-in functions might cause couple potential issues.

Firstly, __sync functions originally support Intel Itanium processoer [1]
but it cannot promise to support all 32-bit archs.  Now these
functions have become the legacy functions.

As Peter also pointed out the logic issue in the function
auxtrace_mmap__write_tail(), it does a cmpxchg with 0 values to load
old_tail, and then executes a further cmpxchg with old_tail to write
the new tail.  If consider the aux_tail might be assigned to '0' in the
middle of loops, this can introduce mess for AUX buffer if the kernel
fetches the temporary value '0'.

Considering __sync functions cannot really fix the 64-bit value
atomicity on 32-bit archs, thus this patch drops __sync functions.

Credits to Peter for detailed analysis.

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.h | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 42b7ef811bde..e625bc76cdde 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -432,12 +432,6 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
-/*
- * In snapshot mode the mmapped page is read-only which makes using
- * __sync_val_compare_and_swap() problematic.  However, snapshot mode expects
- * the buffer is not updated while the snapshot is made (e.g. Intel PT disables
- * the event) so there is not a race anyway.
- */
 static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
@@ -451,11 +445,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	u64 head = READ_ONCE(pc->aux_head);
-#else
-	u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0);
-#endif
 
 	/* Ensure all reads are done after we read the head */
 	smp_rmb();
@@ -465,19 +455,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
-	u64 old_tail;
-#endif
 
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
-#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	pc->aux_tail = tail;
-#else
-	do {
-		old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
-	} while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail));
-#endif
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (5 preceding siblings ...)
  2021-06-02 10:30 ` [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-07 10:03   ` Adrian Hunter
  2021-06-07 15:31   ` Peter Zijlstra
  2021-06-02 10:30 ` [PATCH v2 8/8] perf record: Directly bail out for compat case Leo Yan
  7 siblings, 2 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
behaviour.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index e625bc76cdde..abc4282f5272 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -458,7 +458,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
-	pc->aux_tail = tail;
+	WRITE_ONCE(pc->aux_tail, tail);
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 8/8] perf record: Directly bail out for compat case
  2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (6 preceding siblings ...)
  2021-06-02 10:30 ` [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
@ 2021-06-02 10:30 ` Leo Yan
  2021-06-02 11:18   ` Adrian Hunter
  7 siblings, 1 reply; 28+ messages in thread
From: Leo Yan @ 2021-06-02 10:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Adrian Hunter, Andi Kleen,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since the 64-bit atomicity is not promised in 32-bit perf, directly
report the error and bail out for this case.

Now only applies on x86_64 and Arm64 platforms.

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-record.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3337b5f93336..f47e298281f7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -74,6 +74,7 @@
 #include <linux/zalloc.h>
 #include <linux/bitmap.h>
 #include <sys/time.h>
+#include <sys/utsname.h>
 
 struct switch_output {
 	bool		 enabled;
@@ -848,6 +849,22 @@ static int record__mmap_evlist(struct record *rec,
 				  opts->auxtrace_sample_mode;
 	char msg[512];
 
+#ifndef __LP64__
+	struct utsname uts;
+	int ret;
+
+	ret = uname(&uts);
+	if (ret < 0)
+		return ret;
+
+	if (!strncmp(uts.machine, "x86_64", 6) || !strncmp(uts.machine, "aarch64", 7) ||
+	    !strncmp(uts.machine, "arm64", 5)) {
+		pr_err("Error, 32-bit perf cannot record from a 64-bit kernel.\n"
+		       "Please use a 64-bit version of perf instead.\n");
+		return -ENOTSUP;
+	}
+#endif
+
 	if (opts->affinity != PERF_AFFINITY_SYS)
 		cpu__setup_cpunode_map();
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions
  2021-06-02 10:30 ` [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions Leo Yan
@ 2021-06-02 10:47   ` Adrian Hunter
  2021-06-02 11:16     ` Leo Yan
  0 siblings, 1 reply; 28+ messages in thread
From: Adrian Hunter @ 2021-06-02 10:47 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 2/06/21 1:30 pm, Leo Yan wrote:
> The main purpose for using __sync built-in functions is to support
> compat mode for 32-bit perf with 64-bit kernel.  But using these
> built-in functions might cause couple potential issues.
> 
> Firstly, __sync functions originally support Intel Itanium processoer [1]
> but it cannot promise to support all 32-bit archs.  Now these
> functions have become the legacy functions.
> 
> As Peter also pointed out the logic issue in the function
> auxtrace_mmap__write_tail(), it does a cmpxchg with 0 values to load
> old_tail, and then executes a further cmpxchg with old_tail to write
> the new tail.  If consider the aux_tail might be assigned to '0' in the
> middle of loops, this can introduce mess for AUX buffer if the kernel
> fetches the temporary value '0'.
> 
> Considering __sync functions cannot really fix the 64-bit value
> atomicity on 32-bit archs, thus this patch drops __sync functions.
> 
> Credits to Peter for detailed analysis.
> 
> [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins
> 
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/auxtrace.h | 19 -------------------
>  1 file changed, 19 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index 42b7ef811bde..e625bc76cdde 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -432,12 +432,6 @@ struct auxtrace_cache;
>  
>  #ifdef HAVE_AUXTRACE_SUPPORT
>  
> -/*
> - * In snapshot mode the mmapped page is read-only which makes using
> - * __sync_val_compare_and_swap() problematic.  However, snapshot mode expects
> - * the buffer is not updated while the snapshot is made (e.g. Intel PT disables
> - * the event) so there is not a race anyway.
> - */
>  static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> @@ -451,11 +445,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
>  static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)

The test and setup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT is not needed anymore either.

>  	u64 head = READ_ONCE(pc->aux_head);
> -#else
> -	u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0);
> -#endif
>  
>  	/* Ensure all reads are done after we read the head */
>  	smp_rmb();
> @@ -465,19 +455,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>  static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> -#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
> -	u64 old_tail;
> -#endif
>  
>  	/* Ensure all reads are done before we write the tail out */
>  	smp_mb();
> -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
>  	pc->aux_tail = tail;
> -#else
> -	do {
> -		old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
> -	} while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail));
> -#endif
>  }
>  
>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions
  2021-06-02 10:47   ` Adrian Hunter
@ 2021-06-02 11:16     ` Leo Yan
  2021-06-02 11:21       ` Adrian Hunter
  0 siblings, 1 reply; 28+ messages in thread
From: Leo Yan @ 2021-06-02 11:16 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

Hi Adrian,

On Wed, Jun 02, 2021 at 01:47:42PM +0300, Adrian Hunter wrote:

[...]

> > @@ -451,11 +445,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
> >  static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
> >  {
> >  	struct perf_event_mmap_page *pc = mm->userpg;
> > -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
> 
> The test and setup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT is not needed anymore either.

Yes, I think there have two files should be cleaned:

  Makefile.config
  util/auxtrace.c

If still miss anything, please let me know (I remembered there have a
test case for __sync_xxx_compare_and_swap, but I cannot find it now,
so this is why I am concern if I miss anything not).

Thanks for review,
Leo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 8/8] perf record: Directly bail out for compat case
  2021-06-02 10:30 ` [PATCH v2 8/8] perf record: Directly bail out for compat case Leo Yan
@ 2021-06-02 11:18   ` Adrian Hunter
  2021-06-02 12:38     ` Leo Yan
  0 siblings, 1 reply; 28+ messages in thread
From: Adrian Hunter @ 2021-06-02 11:18 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 2/06/21 1:30 pm, Leo Yan wrote:
> Since the 64-bit atomicity is not promised in 32-bit perf, directly
> report the error and bail out for this case.
> 
> Now only applies on x86_64 and Arm64 platforms.
> 
> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>

Maybe we can do better for the compat case.

We can assume the upper 32-bits change very seldom,
and always increase. So for the 'read' case:

	u64 first, second, last;
	u64 mask = (u64)((u32)-1) << 32;

	do {
		first = READ_ONCE(pc->aux_head);
		rmb();
		second = READ_ONCE(pc->aux_head);
		rmb();
		last = READ_ONCE(pc->aux_head);
	} while ((first & mask) != (last & mask));
	return second;

For the write case, we can cause a fatal error only if the new
tail has non-zero upper 32-bits.  That gives up to 4GiB of data
before aborting:

	if (tail & mask)
		return -1;
	smp_mb();
	WRITE_ONCE(pc->aux_tail, tail);

> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/builtin-record.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 3337b5f93336..f47e298281f7 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -74,6 +74,7 @@
>  #include <linux/zalloc.h>
>  #include <linux/bitmap.h>
>  #include <sys/time.h>
> +#include <sys/utsname.h>
>  
>  struct switch_output {
>  	bool		 enabled;
> @@ -848,6 +849,22 @@ static int record__mmap_evlist(struct record *rec,
>  				  opts->auxtrace_sample_mode;
>  	char msg[512];
>  
> +#ifndef __LP64__
> +	struct utsname uts;
> +	int ret;
> +
> +	ret = uname(&uts);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (!strncmp(uts.machine, "x86_64", 6) || !strncmp(uts.machine, "aarch64", 7) ||
> +	    !strncmp(uts.machine, "arm64", 5)) {
> +		pr_err("Error, 32-bit perf cannot record from a 64-bit kernel.\n"
> +		       "Please use a 64-bit version of perf instead.\n");
> +		return -ENOTSUP;
> +	}
> +#endif
> +
>  	if (opts->affinity != PERF_AFFINITY_SYS)
>  		cpu__setup_cpunode_map();
>  
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions
  2021-06-02 11:16     ` Leo Yan
@ 2021-06-02 11:21       ` Adrian Hunter
  2021-06-02 13:01         ` Leo Yan
  0 siblings, 1 reply; 28+ messages in thread
From: Adrian Hunter @ 2021-06-02 11:21 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 2/06/21 2:16 pm, Leo Yan wrote:
> Hi Adrian,
> 
> On Wed, Jun 02, 2021 at 01:47:42PM +0300, Adrian Hunter wrote:
> 
> [...]
> 
>>> @@ -451,11 +445,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
>>>  static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>>>  {
>>>  	struct perf_event_mmap_page *pc = mm->userpg;
>>> -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
>>
>> The test and setup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT is not needed anymore either.
> 
> Yes, I think there have two files should be cleaned:
> 
>   Makefile.config
>   util/auxtrace.c
> 
> If still miss anything, please let me know (I remembered there have a
> test case for __sync_xxx_compare_and_swap, but I cannot find it now,
> so this is why I am concern if I miss anything not).

tools/build/feature/test-sync-compare-and-swap.c

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 8/8] perf record: Directly bail out for compat case
  2021-06-02 11:18   ` Adrian Hunter
@ 2021-06-02 12:38     ` Leo Yan
  2021-06-07 10:23       ` Adrian Hunter
  0 siblings, 1 reply; 28+ messages in thread
From: Leo Yan @ 2021-06-02 12:38 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

Hi Adrain,

On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
> On 2/06/21 1:30 pm, Leo Yan wrote:
> > Since the 64-bit atomicity is not promised in 32-bit perf, directly
> > report the error and bail out for this case.
> > 
> > Now only applies on x86_64 and Arm64 platforms.
> > 
> > Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
> 
> Maybe we can do better for the compat case.
> 
> We can assume the upper 32-bits change very seldom,
> and always increase. So for the 'read' case:
> 
> 	u64 first, second, last;
> 	u64 mask = (u64)((u32)-1) << 32;
> 
> 	do {
> 		first = READ_ONCE(pc->aux_head);
> 		rmb();
> 		second = READ_ONCE(pc->aux_head);
> 		rmb();
> 		last = READ_ONCE(pc->aux_head);
> 	} while ((first & mask) != (last & mask));
> 	return second;
> 
> For the write case, we can cause a fatal error only if the new
> tail has non-zero upper 32-bits.  That gives up to 4GiB of data
> before aborting:
> 
> 	if (tail & mask)
> 		return -1;
> 	smp_mb();
> 	WRITE_ONCE(pc->aux_tail, tail);

Seems to me, it's pointless to only support aux_head for 64-bit and
support aux_tail for 32-bit.  I understand this can be helpful for the
snapshot mode which only uses aux_head, but it still fails to support
the normal case for AUX ring buffer using 64-bit head/tail.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions
  2021-06-02 11:21       ` Adrian Hunter
@ 2021-06-02 13:01         ` Leo Yan
  0 siblings, 0 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-02 13:01 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Wed, Jun 02, 2021 at 02:21:05PM +0300, Adrian Hunter wrote:
> On 2/06/21 2:16 pm, Leo Yan wrote:
> > Hi Adrian,
> > 
> > On Wed, Jun 02, 2021 at 01:47:42PM +0300, Adrian Hunter wrote:
> > 
> > [...]
> > 
> >>> @@ -451,11 +445,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
> >>>  static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
> >>>  {
> >>>  	struct perf_event_mmap_page *pc = mm->userpg;
> >>> -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
> >>
> >> The test and setup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT is not needed anymore either.
> > 
> > Yes, I think there have two files should be cleaned:
> > 
> >   Makefile.config
> >   util/auxtrace.c
> > 
> > If still miss anything, please let me know (I remembered there have a
> > test case for __sync_xxx_compare_and_swap, but I cannot find it now,
> > so this is why I am concern if I miss anything not).
> 
> tools/build/feature/test-sync-compare-and-swap.c

Thanks a lot!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers
  2021-06-02 10:30 ` [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers Leo Yan
@ 2021-06-07 10:02   ` Adrian Hunter
  2021-06-07 15:29   ` Peter Zijlstra
  1 sibling, 0 replies; 28+ messages in thread
From: Adrian Hunter @ 2021-06-07 10:02 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 2/06/21 1:30 pm, Leo Yan wrote:
> The kernel and the userspace tool can access the AUX ring buffer head
> and tail from different CPUs, thus SMP class of barriers are required
> on SMP system.
> 
> This patch changes to use SMP barriers to replace mb() and rmb()
> barriers.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  tools/perf/util/auxtrace.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index a4fbb33b7245..42b7ef811bde 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -444,7 +444,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
>  	u64 head = READ_ONCE(pc->aux_head);
>  
>  	/* Ensure all reads are done after we read the head */
> -	rmb();
> +	smp_rmb();
>  	return head;
>  }
>  
> @@ -458,7 +458,7 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>  #endif
>  
>  	/* Ensure all reads are done after we read the head */
> -	rmb();
> +	smp_rmb();
>  	return head;
>  }
>  
> @@ -470,7 +470,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
>  #endif
>  
>  	/* Ensure all reads are done before we write the tail out */
> -	mb();
> +	smp_mb();
>  #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
>  	pc->aux_tail = tail;
>  #else
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  2021-06-02 10:30 ` [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
@ 2021-06-07 10:03   ` Adrian Hunter
  2021-06-07 15:31   ` Peter Zijlstra
  1 sibling, 0 replies; 28+ messages in thread
From: Adrian Hunter @ 2021-06-07 10:03 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 2/06/21 1:30 pm, Leo Yan wrote:
> Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
> behaviour.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  tools/perf/util/auxtrace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index e625bc76cdde..abc4282f5272 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -458,7 +458,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
>  
>  	/* Ensure all reads are done before we write the tail out */
>  	smp_mb();
> -	pc->aux_tail = tail;
> +	WRITE_ONCE(pc->aux_tail, tail);
>  }
>  
>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 8/8] perf record: Directly bail out for compat case
  2021-06-02 12:38     ` Leo Yan
@ 2021-06-07 10:23       ` Adrian Hunter
  2021-06-07 15:09         ` Leo Yan
  0 siblings, 1 reply; 28+ messages in thread
From: Adrian Hunter @ 2021-06-07 10:23 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 2/06/21 3:38 pm, Leo Yan wrote:
> Hi Adrain,
> 
> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
>> On 2/06/21 1:30 pm, Leo Yan wrote:
>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly
>>> report the error and bail out for this case.
>>>
>>> Now only applies on x86_64 and Arm64 platforms.
>>>
>>> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
>>
>> Maybe we can do better for the compat case.
>>
>> We can assume the upper 32-bits change very seldom,
>> and always increase. So for the 'read' case:
>>
>> 	u64 first, second, last;
>> 	u64 mask = (u64)((u32)-1) << 32;
>>
>> 	do {
>> 		first = READ_ONCE(pc->aux_head);
>> 		rmb();
>> 		second = READ_ONCE(pc->aux_head);
>> 		rmb();
>> 		last = READ_ONCE(pc->aux_head);
>> 	} while ((first & mask) != (last & mask));
>> 	return second;
>>
>> For the write case, we can cause a fatal error only if the new
>> tail has non-zero upper 32-bits.  That gives up to 4GiB of data
>> before aborting:
>>
>> 	if (tail & mask)
>> 		return -1;
>> 	smp_mb();
>> 	WRITE_ONCE(pc->aux_tail, tail);
> 
> Seems to me, it's pointless to only support aux_head for 64-bit and
> support aux_tail for 32-bit.  I understand this can be helpful for the
> snapshot mode which only uses aux_head, but it still fails to support
> the normal case for AUX ring buffer using 64-bit head/tail.

I am not sure why you say it is pointless.  'perf record' would still be
able to capture up to 4GiB of data. Do you mean you usually capture more
than 4GiB of data?

I was thinking we would separate out the compat case:

#if BITS_PER_LONG == 32
	if (kernel_is_64_bit)
		return compat_auxtrace_mmap__[read_head/write_tail]()
#endif

So the non-compat cases would not be affected.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 8/8] perf record: Directly bail out for compat case
  2021-06-07 10:23       ` Adrian Hunter
@ 2021-06-07 15:09         ` Leo Yan
  2021-06-09  8:23           ` Adrian Hunter
  0 siblings, 1 reply; 28+ messages in thread
From: Leo Yan @ 2021-06-07 15:09 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote:
> On 2/06/21 3:38 pm, Leo Yan wrote:
> > Hi Adrain,
> > 
> > On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
> >> On 2/06/21 1:30 pm, Leo Yan wrote:
> >>> Since the 64-bit atomicity is not promised in 32-bit perf, directly
> >>> report the error and bail out for this case.
> >>>
> >>> Now only applies on x86_64 and Arm64 platforms.
> >>>
> >>> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
> >>
> >> Maybe we can do better for the compat case.
> >>
> >> We can assume the upper 32-bits change very seldom,
> >> and always increase. So for the 'read' case:
> >>
> >> 	u64 first, second, last;
> >> 	u64 mask = (u64)((u32)-1) << 32;
> >>
> >> 	do {
> >> 		first = READ_ONCE(pc->aux_head);
> >> 		rmb();
> >> 		second = READ_ONCE(pc->aux_head);
> >> 		rmb();
> >> 		last = READ_ONCE(pc->aux_head);
> >> 	} while ((first & mask) != (last & mask));
> >> 	return second;
> >>
> >> For the write case, we can cause a fatal error only if the new
> >> tail has non-zero upper 32-bits.  That gives up to 4GiB of data
> >> before aborting:
> >>
> >> 	if (tail & mask)
> >> 		return -1;
> >> 	smp_mb();
> >> 	WRITE_ONCE(pc->aux_tail, tail);
> > 
> > Seems to me, it's pointless to only support aux_head for 64-bit and
> > support aux_tail for 32-bit.  I understand this can be helpful for the
> > snapshot mode which only uses aux_head, but it still fails to support
> > the normal case for AUX ring buffer using 64-bit head/tail.
> 
> I am not sure why you say it is pointless.  'perf record' would still be
> able to capture up to 4GiB of data. Do you mean you usually capture more
> than 4GiB of data?

Okay, understand.  We can support 32-bit perf for compat mode when the
trace data is less than 4GiB.

> I was thinking we would separate out the compat case:
> 
> #if BITS_PER_LONG == 32
> 	if (kernel_is_64_bit)
> 		return compat_auxtrace_mmap__[read_head/write_tail]()
> #endif
> 
> So the non-compat cases would not be affected.

Because I don't want to introduce the complexity for read/write head
and tail, and we also need to handle the same issue for the perf ring
buffer.  So how about below change?

The main idea for below change is it allows the perf to run normally
on the compat mode and exitly if detects the buffer head is close to
the low 32-bit's overflow: when detect the low 32-bit value is bigger
than 0xf0000000 (so we have 256MiB margin to the overflow), it reports
error and exit.

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 1b4091a3b508..2a9965bfeab4 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map,
 	pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n",
 		  mm->idx, old, head, head - old);
 
+#ifdef BITS_PER_LONG == 32
+	if (kernel_is_64bit() && head >= 0xf0000000) {
+		pr_err("32-bit perf cannot read 64-bit value atomically;\n");
+		pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n");
+		return -ENOMEM;
+	}
+#endif
+
 	if (mm->mask) {
 		head_off = head & mm->mask;
 		old_off = old & mm->mask;
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 9130f6fad8d5..823b69895b85 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu)
 
 	return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1;
 }
+
+int perf_kernel_is_64bit(void)
+{
+	struct utsname uts;
+	int ret;
+
+	ret = uname(&uts);
+	if (ret < 0)
+		return 0;
+
+	if (!strncmp(uts.machine, "x86_64", 6) ||
+	    !strncmp(uts.machine, "aarch64", 7) ||
+	    !strncmp(uts.machine, "arm64", 5))
+		return 1;
+
+	return 0;
+}
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index ca249bf5e984..c6c034fc08f6 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
 struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
 
 int perf_env__numa_node(struct perf_env *env, int cpu);
+
+int perf_kernel_is_64bit(void);
 #endif /* __PERF_ENV_H */
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index ab7108d22428..f1d3725d599a 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to,
 	if (rc < 0)
 		return (rc == -EAGAIN) ? 1 : -1;
 
+#ifdef BITS_PER_LONG == 32
+	if (kernel_is_64bit() && head >= 0xf0000000) {
+		pr_err("32-bit perf cannot read 64-bit value atomically;\n");
+		pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n");
+		return -ENOMEM;
+	}
+#endif
+
 	size = md->core.end - md->core.start;
 
 	if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) {

Thanks,
Leo

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/8] perf/ring_buffer: Add comment for barriers on AUX ring buffer
  2021-06-02 10:30 ` [PATCH v2 1/8] perf/ring_buffer: Add comment for barriers on " Leo Yan
@ 2021-06-07 15:27   ` Peter Zijlstra
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-06-07 15:27 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	x86, H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Adrian Hunter, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Wed, Jun 02, 2021 at 06:30:00PM +0800, Leo Yan wrote:
> AUX ring buffer applies almost the same barriers as perf ring buffer,
> but there has an exception for ordering between writing the AUX trace
> data and updating user_page::aux_head.
> 
> This patch adds comment for how to use the barriers on AUX ring buffer,
> and gives comment to ask the drivers to flush the trace data into AUX
> ring buffer prior to updating user_page::aux_head.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  kernel/events/ring_buffer.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 52868716ec35..5cf6579be05e 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -509,6 +509,15 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)
>  		perf_event_aux_event(handle->event, aux_head, size,
>  				     handle->aux_flags);
>  
> +	/*
> +	 * See perf_output_put_handle(), AUX ring buffer applies the same
> +	 * barrier pairing as the perf ring buffer; except for B, since
> +	 * AUX ring buffer is written by hardware trace, we cannot simply
> +	 * use the generic memory barrier (like smp_wmb()) prior to update
> +	 * user_page::aux_head, the hardware trace driver takes the
> +	 * responsibility to ensure the trace data has been flushed into
> +	 * the AUX buffer before calling perf_aux_output_end().
> +	 */
>  	WRITE_ONCE(rb->user_page->aux_head, rb->aux_head);
>  	if (rb_need_aux_wakeup(rb))
>  		wakeup = true;
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/8] perf/x86: Add barrier after updating bts
  2021-06-02 10:30 ` [PATCH v2 4/8] perf/x86: Add barrier after updating bts Leo Yan
@ 2021-06-07 15:29   ` Peter Zijlstra
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2021-06-07 15:29 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	x86, H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Adrian Hunter, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Wed, Jun 02, 2021 at 06:30:03PM +0800, Leo Yan wrote:
> Add barrier wmb() to separate the AUX data store and aux_head store.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  arch/x86/events/intel/bts.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
> index 6320d2cfd9d3..4a015d160bc5 100644
> --- a/arch/x86/events/intel/bts.c
> +++ b/arch/x86/events/intel/bts.c
> @@ -209,6 +209,9 @@ static void bts_update(struct bts_ctx *bts)
>  	} else {
>  		local_set(&buf->data_size, head);
>  	}
> +
> +	/* The WMB separates data store and aux_head store matches. */
> +	wmb();

Alexander, do we indeed need an MFENCE here? or is the BTS hardware
coherent, in which case a compiler barrier would be sufficient.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers
  2021-06-02 10:30 ` [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers Leo Yan
  2021-06-07 10:02   ` Adrian Hunter
@ 2021-06-07 15:29   ` Peter Zijlstra
  2021-06-08 16:45     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-06-07 15:29 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	x86, H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Adrian Hunter, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Wed, Jun 02, 2021 at 06:30:04PM +0800, Leo Yan wrote:
> The kernel and the userspace tool can access the AUX ring buffer head
> and tail from different CPUs, thus SMP class of barriers are required
> on SMP system.
> 
> This patch changes to use SMP barriers to replace mb() and rmb()
> barriers.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  2021-06-02 10:30 ` [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
  2021-06-07 10:03   ` Adrian Hunter
@ 2021-06-07 15:31   ` Peter Zijlstra
  2021-06-08 17:04     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2021-06-07 15:31 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	x86, H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Adrian Hunter, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Wed, Jun 02, 2021 at 06:30:06PM +0800, Leo Yan wrote:
> Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
> behaviour.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  tools/perf/util/auxtrace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index e625bc76cdde..abc4282f5272 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -458,7 +458,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
>  
>  	/* Ensure all reads are done before we write the tail out */
>  	smp_mb();
> -	pc->aux_tail = tail;
> +	WRITE_ONCE(pc->aux_tail, tail);
>  }
>  
>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers
  2021-06-07 15:29   ` Peter Zijlstra
@ 2021-06-08 16:45     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-08 16:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Leo Yan, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Adrian Hunter,
	Andi Kleen, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel

Em Mon, Jun 07, 2021 at 05:29:55PM +0200, Peter Zijlstra escreveu:
> On Wed, Jun 02, 2021 at 06:30:04PM +0800, Leo Yan wrote:
> > The kernel and the userspace tool can access the AUX ring buffer head
> > and tail from different CPUs, thus SMP class of barriers are required
> > on SMP system.
> > 
> > This patch changes to use SMP barriers to replace mb() and rmb()
> > barriers.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thanks, applied.

- Arnaldo


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  2021-06-07 15:31   ` Peter Zijlstra
@ 2021-06-08 17:04     ` Arnaldo Carvalho de Melo
  2021-06-09  0:21       ` Leo Yan
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-08 17:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Leo Yan, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Adrian Hunter,
	Andi Kleen, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel

Em Mon, Jun 07, 2021 at 05:31:01PM +0200, Peter Zijlstra escreveu:
> On Wed, Jun 02, 2021 at 06:30:06PM +0800, Leo Yan wrote:
> > Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
> > behaviour.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Leo, this one is dependendant on the 6/8, will wait for a resubmission,
keeping 5/8 tho, as was Acked and applies cleanly, perf/core.
 
> > ---
> >  tools/perf/util/auxtrace.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> > index e625bc76cdde..abc4282f5272 100644
> > --- a/tools/perf/util/auxtrace.h
> > +++ b/tools/perf/util/auxtrace.h
> > @@ -458,7 +458,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
> >  
> >  	/* Ensure all reads are done before we write the tail out */
> >  	smp_mb();
> > -	pc->aux_tail = tail;
> > +	WRITE_ONCE(pc->aux_tail, tail);
> >  }
> >  
> >  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
> > -- 
> > 2.25.1
> > 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  2021-06-08 17:04     ` Arnaldo Carvalho de Melo
@ 2021-06-09  0:21       ` Leo Yan
  0 siblings, 0 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-09  0:21 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Adrian Hunter,
	Andi Kleen, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel

Hi Arnaldo,

On Tue, Jun 08, 2021 at 02:04:30PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jun 07, 2021 at 05:31:01PM +0200, Peter Zijlstra escreveu:
> > On Wed, Jun 02, 2021 at 06:30:06PM +0800, Leo Yan wrote:
> > > Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
> > > behaviour.
> > > 
> > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > 
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> Leo, this one is dependendant on the 6/8, will wait for a resubmission,
> keeping 5/8 tho, as was Acked and applies cleanly, perf/core.

Yeah, will respin for patches 6/8 and 7/8.

Thanks for merging patch 5/8.

Leo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 8/8] perf record: Directly bail out for compat case
  2021-06-07 15:09         ` Leo Yan
@ 2021-06-09  8:23           ` Adrian Hunter
  2021-06-09  8:57             ` Leo Yan
  0 siblings, 1 reply; 28+ messages in thread
From: Adrian Hunter @ 2021-06-09  8:23 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 7/06/21 6:09 pm, Leo Yan wrote:
> On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote:
>> On 2/06/21 3:38 pm, Leo Yan wrote:
>>> Hi Adrain,
>>>
>>> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
>>>> On 2/06/21 1:30 pm, Leo Yan wrote:
>>>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly
>>>>> report the error and bail out for this case.
>>>>>
>>>>> Now only applies on x86_64 and Arm64 platforms.
>>>>>
>>>>> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
>>>>
>>>> Maybe we can do better for the compat case.
>>>>
>>>> We can assume the upper 32-bits change very seldom,
>>>> and always increase. So for the 'read' case:
>>>>
>>>> 	u64 first, second, last;
>>>> 	u64 mask = (u64)((u32)-1) << 32;
>>>>
>>>> 	do {
>>>> 		first = READ_ONCE(pc->aux_head);
>>>> 		rmb();
>>>> 		second = READ_ONCE(pc->aux_head);
>>>> 		rmb();
>>>> 		last = READ_ONCE(pc->aux_head);
>>>> 	} while ((first & mask) != (last & mask));
>>>> 	return second;
>>>>
>>>> For the write case, we can cause a fatal error only if the new
>>>> tail has non-zero upper 32-bits.  That gives up to 4GiB of data
>>>> before aborting:
>>>>
>>>> 	if (tail & mask)
>>>> 		return -1;
>>>> 	smp_mb();
>>>> 	WRITE_ONCE(pc->aux_tail, tail);
>>>
>>> Seems to me, it's pointless to only support aux_head for 64-bit and
>>> support aux_tail for 32-bit.  I understand this can be helpful for the
>>> snapshot mode which only uses aux_head, but it still fails to support
>>> the normal case for AUX ring buffer using 64-bit head/tail.
>>
>> I am not sure why you say it is pointless.  'perf record' would still be
>> able to capture up to 4GiB of data. Do you mean you usually capture more
>> than 4GiB of data?
> 
> Okay, understand.  We can support 32-bit perf for compat mode when the
> trace data is less than 4GiB.
> 
>> I was thinking we would separate out the compat case:
>>
>> #if BITS_PER_LONG == 32
>> 	if (kernel_is_64_bit)
>> 		return compat_auxtrace_mmap__[read_head/write_tail]()
>> #endif
>>
>> So the non-compat cases would not be affected.
> 
> Because I don't want to introduce the complexity for read/write head
> and tail, and we also need to handle the same issue for the perf ring
> buffer.  So how about below change?
> 
> The main idea for below change is it allows the perf to run normally
> on the compat mode and exitly if detects the buffer head is close to
> the low 32-bit's overflow: when detect the low 32-bit value is bigger
> than 0xf0000000 (so we have 256MiB margin to the overflow), it reports
> error and exit.
> 
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 1b4091a3b508..2a9965bfeab4 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map,
>  	pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n",
>  		  mm->idx, old, head, head - old);
>  
> +#ifdef BITS_PER_LONG == 32
> +	if (kernel_is_64bit() && head >= 0xf0000000) {

You are assuming the head never increases by more than 256MiB which
means you should limit the buffer size to 256MiB maximum.

To me this seems a bit too far from an ideal solution.

I would have thought separating out the compat case makes things
simpler to understand.

> +		pr_err("32-bit perf cannot read 64-bit value atomically;\n");
> +		pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n");
> +		return -ENOMEM;
> +	}
> +#endif
> +
>  	if (mm->mask) {
>  		head_off = head & mm->mask;
>  		old_off = old & mm->mask;
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 9130f6fad8d5..823b69895b85 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu)
>  
>  	return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1;
>  }
> +
> +int perf_kernel_is_64bit(void)
> +{
> +	struct utsname uts;
> +	int ret;
> +
> +	ret = uname(&uts);
> +	if (ret < 0)
> +		return 0;
> +
> +	if (!strncmp(uts.machine, "x86_64", 6) ||
> +	    !strncmp(uts.machine, "aarch64", 7) ||
> +	    !strncmp(uts.machine, "arm64", 5))
> +		return 1;
> +
> +	return 0;
> +}

Obviously, we don't need to keep checking uname. It could be a global variable
that is always set up early.


> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index ca249bf5e984..c6c034fc08f6 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
>  struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
>  
>  int perf_env__numa_node(struct perf_env *env, int cpu);
> +
> +int perf_kernel_is_64bit(void);
>  #endif /* __PERF_ENV_H */
> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index ab7108d22428..f1d3725d599a 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to,
>  	if (rc < 0)
>  		return (rc == -EAGAIN) ? 1 : -1;
>  
> +#ifdef BITS_PER_LONG == 32
> +	if (kernel_is_64bit() && head >= 0xf0000000) {
> +		pr_err("32-bit perf cannot read 64-bit value atomically;\n");
> +		pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n");
> +		return -ENOMEM;
> +	}
> +#endif
> +
>  	size = md->core.end - md->core.start;
>  
>  	if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) {
> 
> Thanks,
> Leo
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 8/8] perf record: Directly bail out for compat case
  2021-06-09  8:23           ` Adrian Hunter
@ 2021-06-09  8:57             ` Leo Yan
  0 siblings, 0 replies; 28+ messages in thread
From: Leo Yan @ 2021-06-09  8:57 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, Andi Kleen, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

Hi Adrian,

On Wed, Jun 09, 2021 at 11:23:25AM +0300, Adrian Hunter wrote:

[...]

> >> I was thinking we would separate out the compat case:
> >>
> >> #if BITS_PER_LONG == 32
> >> 	if (kernel_is_64_bit)
> >> 		return compat_auxtrace_mmap__[read_head/write_tail]()
> >> #endif
> >>
> >> So the non-compat cases would not be affected.
> > 
> > Because I don't want to introduce the complexity for read/write head
> > and tail, and we also need to handle the same issue for the perf ring
> > buffer.  So how about below change?
> > 
> > The main idea for below change is it allows the perf to run normally
> > on the compat mode and exitly if detects the buffer head is close to
> > the low 32-bit's overflow: when detect the low 32-bit value is bigger
> > than 0xf0000000 (so we have 256MiB margin to the overflow), it reports
> > error and exit.
> > 
> > diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> > index 1b4091a3b508..2a9965bfeab4 100644
> > --- a/tools/perf/util/auxtrace.c
> > +++ b/tools/perf/util/auxtrace.c
> > @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map,
> >  	pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n",
> >  		  mm->idx, old, head, head - old);
> >  
> > +#ifdef BITS_PER_LONG == 32
> > +	if (kernel_is_64bit() && head >= 0xf0000000) {
> 
> You are assuming the head never increases by more than 256MiB which
> means you should limit the buffer size to 256MiB maximum.
> 
> To me this seems a bit too far from an ideal solution.
> 
> I would have thought separating out the compat case makes things
> simpler to understand.

Agreed.  I will follow up the suggestions to add compat variants for
accessing AUX head and tail, and will distinguish compat case with
global env variable for 64-bit kernel.

After get ready, will send out for review. Thanks a lot for suggestions!

Leo

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-06-09  8:58 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-02 10:29 [PATCH v2 0/8] perf: Refine barriers for AUX ring buffer Leo Yan
2021-06-02 10:30 ` [PATCH v2 1/8] perf/ring_buffer: Add comment for barriers on " Leo Yan
2021-06-07 15:27   ` Peter Zijlstra
2021-06-02 10:30 ` [PATCH v2 2/8] coresight: tmc-etr: Add barrier after updating " Leo Yan
2021-06-02 10:30 ` [PATCH v2 3/8] coresight: tmc-etf: Add comment for store ordering Leo Yan
2021-06-02 10:30 ` [PATCH v2 4/8] perf/x86: Add barrier after updating bts Leo Yan
2021-06-07 15:29   ` Peter Zijlstra
2021-06-02 10:30 ` [PATCH v2 5/8] perf auxtrace: Change to use SMP memory barriers Leo Yan
2021-06-07 10:02   ` Adrian Hunter
2021-06-07 15:29   ` Peter Zijlstra
2021-06-08 16:45     ` Arnaldo Carvalho de Melo
2021-06-02 10:30 ` [PATCH v2 6/8] perf auxtrace: Drop legacy __sync functions Leo Yan
2021-06-02 10:47   ` Adrian Hunter
2021-06-02 11:16     ` Leo Yan
2021-06-02 11:21       ` Adrian Hunter
2021-06-02 13:01         ` Leo Yan
2021-06-02 10:30 ` [PATCH v2 7/8] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
2021-06-07 10:03   ` Adrian Hunter
2021-06-07 15:31   ` Peter Zijlstra
2021-06-08 17:04     ` Arnaldo Carvalho de Melo
2021-06-09  0:21       ` Leo Yan
2021-06-02 10:30 ` [PATCH v2 8/8] perf record: Directly bail out for compat case Leo Yan
2021-06-02 11:18   ` Adrian Hunter
2021-06-02 12:38     ` Leo Yan
2021-06-07 10:23       ` Adrian Hunter
2021-06-07 15:09         ` Leo Yan
2021-06-09  8:23           ` Adrian Hunter
2021-06-09  8:57             ` Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).