linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer
@ 2021-07-11 10:40 Leo Yan
  2021-07-11 10:40 ` [PATCH v4 01/11] perf/ring_buffer: Add comment for barriers on " Leo Yan
                   ` (10 more replies)
  0 siblings, 11 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

This patch series is to refine the memory barriers for AUX ring buffer.

Patches 01 ~ 04 to address the barriers usage in the kernel.  The first
patch is to make clear comment for how to use the barriers between the
data store and aux_head store, this asks the driver to make sure the
data is visible.  Patches 02 ~ 04 is to refine the drivers for barriers
after the data store.

Patch 05 is to use WRITE_ONCE() for updating aux_tail.

Patches 06 ~ 09 is to drop the legacy __sync functions, and polish for
duplicate code and cleanup the build and feature test after
SYNC_COMPARE_AND_SWAP is not used.

Patch 10 introduces a new global variable to indicate the kernel runs in
64-bit mode which can be used to confirm if in compat mode; patch 11
introduces variant functions for accessing AUX head/tail, it resolves
the aotmicity for reading head pointer, and returns error for the tail
is bigger than 4GB.

Have testes the patches on Arm64 Juno platform.

Changes from v3:
- Removed the inapprocate paragraph in the commit log for patch "perf
  auxtrace: Drop legacy __sync functions" (Adrian);
- Added new patch to remove feature-sync-compare-and-swap test (Adrian);
- Th patch for "perf auxtrace: Use WRITE_ONCE() for updating aux_tail",
  is a standlone and simple change, so moved it ahead in the patch set
  for better ordering;
- Minor improvement for commit logs in the last two patches.

Changes from v2:
- Removed auxtrace_mmap__read_snapshot_head(), which has the duplicated
  code with auxtrace_mmap__read_head();
- Cleanuped the build for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT (Adrian);
- Added global variable "kernel_is_64_bit" (Adrian);
- Added compat variants compat_auxtrace_mmap__{read_head|write_tail}
  (Adrian).


Leo Yan (11):
  perf/ring_buffer: Add comment for barriers on AUX ring buffer
  coresight: tmc-etr: Add barrier after updating AUX ring buffer
  coresight: tmc-etf: Add comment for store ordering
  perf/x86: Add barrier after updating bts
  perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  perf auxtrace: Drop legacy __sync functions
  perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
  perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
  tools: Remove feature-sync-compare-and-swap feature detection
  perf env: Set flag for kernel is 64-bit mode
  perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}

 arch/x86/events/intel/bts.c                   |   3 +
 .../hwtracing/coresight/coresight-tmc-etf.c   |   6 +
 .../hwtracing/coresight/coresight-tmc-etr.c   |   8 ++
 kernel/events/ring_buffer.c                   |   9 ++
 tools/build/Makefile.feature                  |   1 -
 tools/build/feature/Makefile                  |   4 -
 tools/build/feature/test-all.c                |   4 -
 .../feature/test-sync-compare-and-swap.c      |  15 ---
 tools/perf/Makefile.config                    |   4 -
 tools/perf/util/auxtrace.c                    |  19 ++-
 tools/perf/util/auxtrace.h                    | 109 ++++++++++++++----
 tools/perf/util/env.c                         |  17 ++-
 tools/perf/util/env.h                         |   1 +
 13 files changed, 136 insertions(+), 64 deletions(-)
 delete mode 100644 tools/build/feature/test-sync-compare-and-swap.c

-- 
2.25.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 01/11] perf/ring_buffer: Add comment for barriers on AUX ring buffer
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
@ 2021-07-11 10:40 ` Leo Yan
  2021-07-11 10:40 ` [PATCH v4 02/11] coresight: tmc-etr: Add barrier after updating " Leo Yan
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

AUX ring buffer applies almost the same barriers as perf ring buffer,
but there has an exception for ordering between writing the AUX trace
data and updating user_page::aux_head.

This patch adds comment for how to use the barriers on AUX ring buffer,
and gives comment to ask the drivers to flush the trace data into AUX
ring buffer prior to updating user_page::aux_head.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/ring_buffer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 52868716ec35..5cf6579be05e 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -509,6 +509,15 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)
 		perf_event_aux_event(handle->event, aux_head, size,
 				     handle->aux_flags);
 
+	/*
+	 * See perf_output_put_handle(), AUX ring buffer applies the same
+	 * barrier pairing as the perf ring buffer; except for B, since
+	 * AUX ring buffer is written by hardware trace, we cannot simply
+	 * use the generic memory barrier (like smp_wmb()) prior to update
+	 * user_page::aux_head, the hardware trace driver takes the
+	 * responsibility to ensure the trace data has been flushed into
+	 * the AUX buffer before calling perf_aux_output_end().
+	 */
 	WRITE_ONCE(rb->user_page->aux_head, rb->aux_head);
 	if (rb_need_aux_wakeup(rb))
 		wakeup = true;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 02/11] coresight: tmc-etr: Add barrier after updating AUX ring buffer
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
  2021-07-11 10:40 ` [PATCH v4 01/11] perf/ring_buffer: Add comment for barriers on " Leo Yan
@ 2021-07-11 10:40 ` Leo Yan
  2021-07-12 10:40   ` Suzuki K Poulose
  2021-07-11 10:40 ` [PATCH v4 03/11] coresight: tmc-etf: Add comment for store ordering Leo Yan
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since a memory barrier is required between AUX trace data store and
aux_head store, and the AUX trace data is filled with memcpy(), it's
sufficient to use smp_wmb() so can ensure the trace data is visible
prior to updating aux_head.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index acdb59e0e661..713205db15a1 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 	 */
 	if (etr_perf->snapshot)
 		handle->head += size;
+
+	/*
+	 * It requires the ordering between the AUX trace data and aux_head
+	 * store, below smp_wmb() ensures the AUX trace data is visible prior
+	 * to updating aux_head.
+	 */
+	smp_wmb();
+
 out:
 	/*
 	 * Don't set the TRUNCATED flag in snapshot mode because 1) the
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 03/11] coresight: tmc-etf: Add comment for store ordering
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
  2021-07-11 10:40 ` [PATCH v4 01/11] perf/ring_buffer: Add comment for barriers on " Leo Yan
  2021-07-11 10:40 ` [PATCH v4 02/11] coresight: tmc-etr: Add barrier after updating " Leo Yan
@ 2021-07-11 10:40 ` Leo Yan
  2021-07-13 12:56   ` Peter Zijlstra
  2021-07-11 10:40 ` [PATCH v4 04/11] perf/x86: Add barrier after updating bts Leo Yan
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

AUX ring buffer is required to separate the data store and aux_head
store, since the function CS_LOCK() has contained memory barrier mb(),
mb() is a more conservative barrier than smp_wmb() on Arm32/Arm64, thus
it's needless to add any explicit barrier anymore.

Add comment to make clear for the barrier usage for ETF.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 45b85edfc690..9a42ee689921 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -553,6 +553,12 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	if (buf->snapshot)
 		handle->head += to_read;
 
+	/*
+	 * AUX ring buffer requires to use memory barrier to separate the trace
+	 * data store and aux_head store, because CS_LOCK() contains mb() which
+	 * gives more heavy barrier than smp_wmb(), it's not necessary to
+	 * explicitly invoke any barrier.
+	 */
 	CS_LOCK(drvdata->base);
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 04/11] perf/x86: Add barrier after updating bts
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (2 preceding siblings ...)
  2021-07-11 10:40 ` [PATCH v4 03/11] coresight: tmc-etf: Add comment for store ordering Leo Yan
@ 2021-07-11 10:40 ` Leo Yan
  2021-07-13 13:01   ` Peter Zijlstra
  2021-07-11 10:40 ` [PATCH v4 05/11] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Add barrier wmb() to separate the AUX data store and aux_head store.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/x86/events/intel/bts.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 6320d2cfd9d3..4a015d160bc5 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -209,6 +209,9 @@ static void bts_update(struct bts_ctx *bts)
 	} else {
 		local_set(&buf->data_size, head);
 	}
+
+	/* The WMB separates data store and aux_head store matches. */
+	wmb();
 }
 
 static int
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 05/11] perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (3 preceding siblings ...)
  2021-07-11 10:40 ` [PATCH v4 04/11] perf/x86: Add barrier after updating bts Leo Yan
@ 2021-07-11 10:40 ` Leo Yan
  2021-07-11 10:41 ` [PATCH v4 06/11] perf auxtrace: Drop legacy __sync functions Leo Yan
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
behaviour.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/perf/util/auxtrace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index cc1c1b9cec9c..79227b8864cd 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -480,7 +480,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
 #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
-	pc->aux_tail = tail;
+	WRITE_ONCE(pc->aux_tail, tail);
 #else
 	do {
 		old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 06/11] perf auxtrace: Drop legacy __sync functions
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (4 preceding siblings ...)
  2021-07-11 10:40 ` [PATCH v4 05/11] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
@ 2021-07-11 10:41 ` Leo Yan
  2021-07-11 10:41 ` [PATCH v4 07/11] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() Leo Yan
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

The main purpose for using __sync built-in functions is to support
compat mode for 32-bit perf with 64-bit kernel.  But using these
built-in functions might cause potential issues.

__sync functions originally support Intel Itanium processoer [1]
but it cannot promise to support all 32-bit archs.  Now these
functions have become the legacy functions.

Considering __sync functions cannot really fix the 64-bit value
atomicity on 32-bit archs, thus this patch drops __sync functions.

Credits to Peter for detailed analysis.

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.h | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 79227b8864cd..4f9176368134 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -440,12 +440,6 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
-/*
- * In snapshot mode the mmapped page is read-only which makes using
- * __sync_val_compare_and_swap() problematic.  However, snapshot mode expects
- * the buffer is not updated while the snapshot is made (e.g. Intel PT disables
- * the event) so there is not a race anyway.
- */
 static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
@@ -459,11 +453,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	u64 head = READ_ONCE(pc->aux_head);
-#else
-	u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0);
-#endif
 
 	/* Ensure all reads are done after we read the head */
 	smp_rmb();
@@ -473,19 +463,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
-	u64 old_tail;
-#endif
 
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
-#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	WRITE_ONCE(pc->aux_tail, tail);
-#else
-	do {
-		old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
-	} while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail));
-#endif
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 07/11] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (5 preceding siblings ...)
  2021-07-11 10:41 ` [PATCH v4 06/11] perf auxtrace: Drop legacy __sync functions Leo Yan
@ 2021-07-11 10:41 ` Leo Yan
  2021-07-12 14:32   ` Adrian Hunter
  2021-07-11 10:41 ` [PATCH v4 08/11] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT Leo Yan
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since the function auxtrace_mmap__read_snapshot_head() is exactly same
with auxtrace_mmap__read_head(), whether the session is in snapshot mode
or not, it's unified to use function auxtrace_mmap__read_head() for
reading AUX buffer head.

And the function auxtrace_mmap__read_snapshot_head() is unused so this
patch removes it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.c |  5 ++---
 tools/perf/util/auxtrace.h | 10 ----------
 2 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index cb19669d2a5b..7958e17229ea 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1686,13 +1686,12 @@ static int __auxtrace_mmap__read(struct mmap *map,
 	union perf_event ev;
 	void *data1, *data2;
 
+	head = auxtrace_mmap__read_head(mm);
+
 	if (snapshot) {
-		head = auxtrace_mmap__read_snapshot_head(mm);
 		if (auxtrace_record__find_snapshot(itr, mm->idx, mm, data,
 						   &head, &old))
 			return -1;
-	} else {
-		head = auxtrace_mmap__read_head(mm);
 	}
 
 	if (old == head)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 4f9176368134..d68a5e80b217 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -440,16 +440,6 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
-static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
-{
-	struct perf_event_mmap_page *pc = mm->userpg;
-	u64 head = READ_ONCE(pc->aux_head);
-
-	/* Ensure all reads are done after we read the head */
-	smp_rmb();
-	return head;
-}
-
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 08/11] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (6 preceding siblings ...)
  2021-07-11 10:41 ` [PATCH v4 07/11] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() Leo Yan
@ 2021-07-11 10:41 ` Leo Yan
  2021-07-11 10:41 ` [PATCH v4 09/11] tools: Remove feature-sync-compare-and-swap feature detection Leo Yan
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since the __sync functions have been dropped, This patch removes unused
build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/Makefile.config | 4 ----
 tools/perf/util/auxtrace.c | 5 -----
 2 files changed, 9 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index eb8e487ef90b..4a0d9a6defc7 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -349,10 +349,6 @@ CXXFLAGS += $(INC_FLAGS)
 
 LIBPERF_CFLAGS := $(CORE_CFLAGS) $(EXTRA_CFLAGS)
 
-ifeq ($(feature-sync-compare-and-swap), 1)
-  CFLAGS += -DHAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
-endif
-
 ifeq ($(feature-pthread-attr-setaffinity-np), 1)
   CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP
 endif
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 7958e17229ea..6a63be8b2430 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -130,11 +130,6 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
 		return 0;
 	}
 
-#if BITS_PER_LONG != 64 && !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
-	pr_err("Cannot use AUX area tracing mmaps\n");
-	return -1;
-#endif
-
 	pc->aux_offset = mp->offset;
 	pc->aux_size = mp->len;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 09/11] tools: Remove feature-sync-compare-and-swap feature detection
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (7 preceding siblings ...)
  2021-07-11 10:41 ` [PATCH v4 08/11] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT Leo Yan
@ 2021-07-11 10:41 ` Leo Yan
  2021-07-11 10:41 ` [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode Leo Yan
  2021-07-11 10:41 ` [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan
  10 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since the __sync functions have been removed from perf, it's needless
for perf tool to test the feature sync-compare-and-swap.

The feature test is not used by any other components, remove it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/build/Makefile.feature                     |  1 -
 tools/build/feature/Makefile                     |  4 ----
 tools/build/feature/test-all.c                   |  4 ----
 tools/build/feature/test-sync-compare-and-swap.c | 15 ---------------
 4 files changed, 24 deletions(-)
 delete mode 100644 tools/build/feature/test-sync-compare-and-swap.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 04a8e3db8a54..3dd2f68366f9 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -34,7 +34,6 @@ FEATURE_TESTS_BASIC :=                  \
         dwarf_getlocations              \
         eventfd                         \
         fortify-source                  \
-        sync-compare-and-swap           \
         get_current_dir_name            \
         gettid				\
         glibc                           \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index ec203e28407f..eff55d287db1 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -9,7 +9,6 @@ FILES=                                          \
          test-dwarf_getlocations.bin            \
          test-eventfd.bin                       \
          test-fortify-source.bin                \
-         test-sync-compare-and-swap.bin         \
          test-get_current_dir_name.bin          \
          test-glibc.bin                         \
          test-gtk2.bin                          \
@@ -260,9 +259,6 @@ $(OUTPUT)test-libdw-dwarf-unwind.bin:
 $(OUTPUT)test-libbabeltrace.bin:
 	$(BUILD) # -lbabeltrace provided by $(FEATURE_CHECK_LDFLAGS-libbabeltrace)
 
-$(OUTPUT)test-sync-compare-and-swap.bin:
-	$(BUILD)
-
 $(OUTPUT)test-compile-32.bin:
 	$(CC) -m32 -o $@ test-compile.c
 
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 464873883396..920439527291 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -106,10 +106,6 @@
 # include "test-libdw-dwarf-unwind.c"
 #undef main
 
-#define main main_test_sync_compare_and_swap
-# include "test-sync-compare-and-swap.c"
-#undef main
-
 #define main main_test_zlib
 # include "test-zlib.c"
 #undef main
diff --git a/tools/build/feature/test-sync-compare-and-swap.c b/tools/build/feature/test-sync-compare-and-swap.c
deleted file mode 100644
index 3bc6b0768a53..000000000000
--- a/tools/build/feature/test-sync-compare-and-swap.c
+++ /dev/null
@@ -1,15 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <stdint.h>
-
-volatile uint64_t x;
-
-int main(int argc, char *argv[])
-{
-	uint64_t old, new = argc;
-
-	(void)argv;
-	do {
-		old = __sync_val_compare_and_swap(&x, 0, 0);
-	} while (!__sync_bool_compare_and_swap(&x, old, new));
-	return old == new;
-}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (8 preceding siblings ...)
  2021-07-11 10:41 ` [PATCH v4 09/11] tools: Remove feature-sync-compare-and-swap feature detection Leo Yan
@ 2021-07-11 10:41 ` Leo Yan
  2021-07-12 14:37   ` Adrian Hunter
  2021-07-12 18:14   ` Arnaldo Carvalho de Melo
  2021-07-11 10:41 ` [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan
  10 siblings, 2 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

It's useful to know that the kernel is running in 32-bit or 64-bit
mode.  E.g. We can decide if perf tool is running in compat mode
from this info.

This patch adds a global variable "kernel_is_64_bit", it's initialized
when a session setups environment, its value is decided by checking the
architecture string.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/env.c | 17 ++++++++++++++++-
 tools/perf/util/env.h |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index ebc5e9ad35db..345635a2e842 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -11,6 +11,7 @@
 #include <stdlib.h>
 #include <string.h>
 
+int kernel_is_64_bit;
 struct perf_env perf_env;
 
 #ifdef HAVE_LIBBPF_SUPPORT
@@ -172,6 +173,19 @@ static void perf_env__purge_bpf(struct perf_env *env __maybe_unused)
 }
 #endif // HAVE_LIBBPF_SUPPORT
 
+static void perf_env__init_kernel_mode(struct perf_env *env)
+{
+	const char *arch = perf_env__raw_arch(env);
+
+	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
+	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
+	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
+	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
+		kernel_is_64_bit = 1;
+	else
+		kernel_is_64_bit = 0;
+}
+
 void perf_env__exit(struct perf_env *env)
 {
 	int i;
@@ -217,13 +231,14 @@ void perf_env__exit(struct perf_env *env)
 	zfree(&env->hybrid_cpc_nodes);
 }
 
-void perf_env__init(struct perf_env *env __maybe_unused)
+void perf_env__init(struct perf_env *env)
 {
 #ifdef HAVE_LIBBPF_SUPPORT
 	env->bpf_progs.infos = RB_ROOT;
 	env->bpf_progs.btfs = RB_ROOT;
 	init_rwsem(&env->bpf_progs.lock);
 #endif
+	perf_env__init_kernel_mode(env);
 }
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 6824a7423a2d..cc989ff49740 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -139,6 +139,7 @@ enum perf_compress_type {
 struct bpf_prog_info_node;
 struct btf_node;
 
+extern int kernel_is_64_bit;
 extern struct perf_env perf_env;
 
 void perf_env__exit(struct perf_env *env);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (9 preceding siblings ...)
  2021-07-11 10:41 ` [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode Leo Yan
@ 2021-07-11 10:41 ` Leo Yan
  2021-07-12 14:44   ` Russell King (Oracle)
  2021-07-13  7:07   ` Adrian Hunter
  10 siblings, 2 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-11 10:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

When perf runs in compat mode (kernel in 64-bit mode and the perf is in
32-bit mode), the 64-bit value atomicity in the user space cannot be
assured, E.g. on some architectures, the 64-bit value accessing is split
into two instructions, one is for the low 32-bit word accessing and
another is for the high 32-bit word.

This patch introduces two functions compat_auxtrace_mmap__read_head()
and compat_auxtrace_mmap__write_tail(), as their naming indicates, when
perf tool works in compat mode, it uses these two functions to access
the AUX head and tail.  These two functions can allow the perf tool to
work properly in certain conditions, e.g. when perf tool works in
snapshot mode with only using AUX head pointer, or perf tool uses the
AUX buffer and the incremented tail is not bigger than 4GB.

When perf tool cannot handle the case when the AUX tail is bigger than
4GB, the function compat_auxtrace_mmap__write_tail() returns -1 and
tells the caller to bail out for the error.

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.c |  9 ++--
 tools/perf/util/auxtrace.h | 94 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 98 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 6a63be8b2430..d6fc250fbf97 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1766,10 +1766,13 @@ static int __auxtrace_mmap__read(struct mmap *map,
 	mm->prev = head;
 
 	if (!snapshot) {
-		auxtrace_mmap__write_tail(mm, head);
-		if (itr->read_finish) {
-			int err;
+		int err;
 
+		err = auxtrace_mmap__write_tail(mm, head);
+		if (err < 0)
+			return err;
+
+		if (itr->read_finish) {
 			err = itr->read_finish(itr, mm->idx);
 			if (err < 0)
 				return err;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index d68a5e80b217..66de7b6e65ec 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -18,6 +18,8 @@
 #include <asm/bitsperlong.h>
 #include <asm/barrier.h>
 
+#include "env.h"
+
 union perf_event;
 struct perf_session;
 struct evlist;
@@ -440,23 +442,111 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
+/*
+ * In the compat mode kernel runs in 64-bit and perf tool runs in 32-bit mode,
+ * 32-bit perf tool cannot access 64-bit value atomically, which might lead to
+ * the issues caused by the below sequence on multiple CPUs: when perf tool
+ * accesses either the load operation or the store operation for 64-bit value,
+ * on some architectures the operation is divided into two instructions, one
+ * is for accessing the low 32-bit value and another is for the high 32-bit;
+ * thus these two user operations can give the kernel chances to access the
+ * 64-bit value, and thus leads to the unexpected load values.
+ *
+ *   kernel (64-bit)                        user (32-bit)
+ *
+ *   if (LOAD ->aux_tail) { --,             LOAD ->aux_head_lo
+ *       STORE $aux_data      |       ,--->
+ *       FLUSH $aux_data      |       |     LOAD ->aux_head_hi
+ *       STORE ->aux_head   --|-------`     smp_rmb()
+ *   }                        |             LOAD $data
+ *                            |             smp_mb()
+ *                            |             STORE ->aux_tail_lo
+ *                            `----------->
+ *                                          STORE ->aux_tail_hi
+ *
+ * For this reason, it's impossible for the perf tool to work correctly when
+ * the AUX head or tail is bigger than 4GB (more than 32 bits length); and we
+ * can not simply limit the AUX ring buffer to less than 4GB, the reason is
+ * the pointers can be increased monotonically (e.g in snapshot mode), whatever
+ * the buffer size it is, at the end the head and tail can be bigger than 4GB
+ * and carry out to the high 32-bit.
+ *
+ * To mitigate the issues and improve the user experience, we can allow the
+ * perf tool working in certain conditions and bail out with error if detect
+ * any overflow cannot be handled.
+ *
+ * For reading the AUX head, it reads out the values for three times, and
+ * compares the high 4 bytes of the values between the first time and the last
+ * time, if there has no change for high 4 bytes injected by the kernel during
+ * the user reading sequence, it's safe for use the second value.
+ *
+ * When update the AUX tail and detects any carrying in the high 32 bits, it
+ * means there have two store operations in user space and it cannot promise
+ * the atomicity for 64-bit write, so return '-1' in this case to tell the
+ * caller an overflow error has happened.
+ */
+static inline u64 compat_auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
+{
+	struct perf_event_mmap_page *pc = mm->userpg;
+	u64 first, second, last;
+	u64 mask = (u64)(UINT32_MAX) << 32;
+
+	do {
+		first = READ_ONCE(pc->aux_head);
+		/* Ensure all reads are done after we read the head */
+		smp_rmb();
+		second = READ_ONCE(pc->aux_head);
+		/* Ensure all reads are done after we read the head */
+		smp_rmb();
+		last = READ_ONCE(pc->aux_head);
+	} while ((first & mask) != (last & mask));
+
+	return second;
+}
+
+static inline int compat_auxtrace_mmap__write_tail(struct auxtrace_mmap *mm,
+						    u64 tail)
+{
+	struct perf_event_mmap_page *pc = mm->userpg;
+	u64 mask = (u64)(UINT32_MAX) << 32;
+
+	if (tail & mask)
+		return -1;
+
+	/* Ensure all reads are done before we write the tail out */
+	smp_mb();
+	WRITE_ONCE(pc->aux_tail, tail);
+	return 0;
+}
+
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-	u64 head = READ_ONCE(pc->aux_head);
+	u64 head;
+
+#if BITS_PER_LONG == 32
+	if (kernel_is_64_bit)
+		return compat_auxtrace_mmap__read_head(mm);
+#endif
+	head = READ_ONCE(pc->aux_head);
 
 	/* Ensure all reads are done after we read the head */
 	smp_rmb();
 	return head;
 }
 
-static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
+static inline int auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
 
+#if BITS_PER_LONG == 32
+	if (kernel_is_64_bit)
+		return compat_auxtrace_mmap__write_tail(mm, tail);
+#endif
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
 	WRITE_ONCE(pc->aux_tail, tail);
+	return 0;
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 02/11] coresight: tmc-etr: Add barrier after updating AUX ring buffer
  2021-07-11 10:40 ` [PATCH v4 02/11] coresight: tmc-etr: Add barrier after updating " Leo Yan
@ 2021-07-12 10:40   ` Suzuki K Poulose
  2021-07-12 10:54     ` Leo Yan
  0 siblings, 1 reply; 34+ messages in thread
From: Suzuki K Poulose @ 2021-07-12 10:40 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 11/07/2021 11:40, Leo Yan wrote:
> Since a memory barrier is required between AUX trace data store and
> aux_head store, and the AUX trace data is filled with memcpy(), it's
> sufficient to use smp_wmb() so can ensure the trace data is visible
> prior to updating aux_head.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index acdb59e0e661..713205db15a1 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
>   	 */
>   	if (etr_perf->snapshot)
>   		handle->head += size;
> +
> +	/*
> +	 * It requires the ordering between the AUX trace data and aux_head
> +	 * store, below smp_wmb() ensures the AUX trace data is visible prior
> +	 * to updating aux_head.
> +	 */

Please could we reword this a bit, something like :

	/*
	 * Ensure that the AUX trace data is visible before the aux_head
	 * is updated via perf_aux_output_end(), as expected by the
	 * perf ring buffer.
	 */
> +	smp_wmb();
> +

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 02/11] coresight: tmc-etr: Add barrier after updating AUX ring buffer
  2021-07-12 10:40   ` Suzuki K Poulose
@ 2021-07-12 10:54     ` Leo Yan
  0 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-12 10:54 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

Hi Suzuki,

On Mon, Jul 12, 2021 at 11:40:12AM +0100, Suzuki Kuruppassery Poulose wrote:
> On 11/07/2021 11:40, Leo Yan wrote:
> > Since a memory barrier is required between AUX trace data store and
> > aux_head store, and the AUX trace data is filled with memcpy(), it's
> > sufficient to use smp_wmb() so can ensure the trace data is visible
> > prior to updating aux_head.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >   drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++
> >   1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index acdb59e0e661..713205db15a1 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
> >   	 */
> >   	if (etr_perf->snapshot)
> >   		handle->head += size;
> > +
> > +	/*
> > +	 * It requires the ordering between the AUX trace data and aux_head
> > +	 * store, below smp_wmb() ensures the AUX trace data is visible prior
> > +	 * to updating aux_head.
> > +	 */
> 
> Please could we reword this a bit, something like :
> 
> 	/*
> 	 * Ensure that the AUX trace data is visible before the aux_head
> 	 * is updated via perf_aux_output_end(), as expected by the
> 	 * perf ring buffer.
> 	 */

Will refine with this in next spin.  Thanks for review!

> > +	smp_wmb();
> > +
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 07/11] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
  2021-07-11 10:41 ` [PATCH v4 07/11] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() Leo Yan
@ 2021-07-12 14:32   ` Adrian Hunter
  2021-07-13 13:10     ` Leo Yan
  0 siblings, 1 reply; 34+ messages in thread
From: Adrian Hunter @ 2021-07-12 14:32 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 11/07/21 1:41 pm, Leo Yan wrote:
> Since the function auxtrace_mmap__read_snapshot_head() is exactly same
> with auxtrace_mmap__read_head(), whether the session is in snapshot mode
> or not, it's unified to use function auxtrace_mmap__read_head() for
> reading AUX buffer head.
> 
> And the function auxtrace_mmap__read_snapshot_head() is unused so this
> patch removes it.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/auxtrace.c |  5 ++---
>  tools/perf/util/auxtrace.h | 10 ----------
>  2 files changed, 2 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index cb19669d2a5b..7958e17229ea 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -1686,13 +1686,12 @@ static int __auxtrace_mmap__read(struct mmap *map,
>  	union perf_event ev;
>  	void *data1, *data2;
>  
> +	head = auxtrace_mmap__read_head(mm);
> +
>  	if (snapshot) {
> -		head = auxtrace_mmap__read_snapshot_head(mm);
>  		if (auxtrace_record__find_snapshot(itr, mm->idx, mm, data,
>  						   &head, &old))

That leaves a nested 'if' which is not kernel style i.e. could be

	if (snapshot &&
	    auxtrace_record__find_snapshot(itr, mm->idx, mm, data, &head, &old))


>  			return -1;
> -	} else {
> -		head = auxtrace_mmap__read_head(mm);
>  	}
>  
>  	if (old == head)
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index 4f9176368134..d68a5e80b217 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -440,16 +440,6 @@ struct auxtrace_cache;
>  
>  #ifdef HAVE_AUXTRACE_SUPPORT
>  
> -static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
> -{
> -	struct perf_event_mmap_page *pc = mm->userpg;
> -	u64 head = READ_ONCE(pc->aux_head);
> -
> -	/* Ensure all reads are done after we read the head */
> -	smp_rmb();
> -	return head;
> -}
> -
>  static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-11 10:41 ` [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode Leo Yan
@ 2021-07-12 14:37   ` Adrian Hunter
  2021-07-12 18:14   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 34+ messages in thread
From: Adrian Hunter @ 2021-07-12 14:37 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 11/07/21 1:41 pm, Leo Yan wrote:
> It's useful to know that the kernel is running in 32-bit or 64-bit
> mode.  E.g. We can decide if perf tool is running in compat mode
> from this info.
> 
> This patch adds a global variable "kernel_is_64_bit", it's initialized
> when a session setups environment, its value is decided by checking the
> architecture string.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/env.c | 17 ++++++++++++++++-
>  tools/perf/util/env.h |  1 +
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index ebc5e9ad35db..345635a2e842 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -11,6 +11,7 @@
>  #include <stdlib.h>
>  #include <string.h>
>  
> +int kernel_is_64_bit;
>  struct perf_env perf_env;
>  
>  #ifdef HAVE_LIBBPF_SUPPORT
> @@ -172,6 +173,19 @@ static void perf_env__purge_bpf(struct perf_env *env __maybe_unused)
>  }
>  #endif // HAVE_LIBBPF_SUPPORT
>  
> +static void perf_env__init_kernel_mode(struct perf_env *env)
> +{
> +	const char *arch = perf_env__raw_arch(env);
> +
> +	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
> +	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
> +	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
> +	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
> +		kernel_is_64_bit = 1;
> +	else
> +		kernel_is_64_bit = 0;
> +}
> +
>  void perf_env__exit(struct perf_env *env)
>  {
>  	int i;
> @@ -217,13 +231,14 @@ void perf_env__exit(struct perf_env *env)
>  	zfree(&env->hybrid_cpc_nodes);
>  }
>  
> -void perf_env__init(struct perf_env *env __maybe_unused)
> +void perf_env__init(struct perf_env *env)
>  {
>  #ifdef HAVE_LIBBPF_SUPPORT
>  	env->bpf_progs.infos = RB_ROOT;
>  	env->bpf_progs.btfs = RB_ROOT;
>  	init_rwsem(&env->bpf_progs.lock);
>  #endif
> +	perf_env__init_kernel_mode(env);

perf_env__init() is also used for session->header.env which is not
necessarily the current machine.  So this initialization could be
separate from perf_env__init() to avoid confusion.

>  }
>  
>  int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index 6824a7423a2d..cc989ff49740 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -139,6 +139,7 @@ enum perf_compress_type {
>  struct bpf_prog_info_node;
>  struct btf_node;
>  
> +extern int kernel_is_64_bit;
>  extern struct perf_env perf_env;
>  
>  void perf_env__exit(struct perf_env *env);
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-11 10:41 ` [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan
@ 2021-07-12 14:44   ` Russell King (Oracle)
  2021-07-13 15:46     ` Leo Yan
  2021-07-13  7:07   ` Adrian Hunter
  1 sibling, 1 reply; 34+ messages in thread
From: Russell King (Oracle) @ 2021-07-12 14:44 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel

On Sun, Jul 11, 2021 at 06:41:05PM +0800, Leo Yan wrote:
> When perf runs in compat mode (kernel in 64-bit mode and the perf is in
> 32-bit mode), the 64-bit value atomicity in the user space cannot be
> assured, E.g. on some architectures, the 64-bit value accessing is split
> into two instructions, one is for the low 32-bit word accessing and
> another is for the high 32-bit word.

Does this apply to 32-bit ARM code on aarch64? I would not have thought
it would, as the structure member is a __u64 and
compat_auxtrace_mmap__read_head() doesn't seem to be marking anything
as packed, so the compiler _should_ be able to use a LDRD instruction
to load the value.

Is this a problem noticed on non-ARM architectures?

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-11 10:41 ` [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode Leo Yan
  2021-07-12 14:37   ` Adrian Hunter
@ 2021-07-12 18:14   ` Arnaldo Carvalho de Melo
  2021-07-13 15:09     ` Leo Yan
  1 sibling, 1 reply; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-12 18:14 UTC (permalink / raw)
  To: Leo Yan
  Cc: Adrian Hunter, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, linux-perf-users, linux-kernel,
	coresight, linux-arm-kernel

Em Sun, Jul 11, 2021 at 06:41:04PM +0800, Leo Yan escreveu:
> It's useful to know that the kernel is running in 32-bit or 64-bit
> mode.  E.g. We can decide if perf tool is running in compat mode
> from this info.
> 
> This patch adds a global variable "kernel_is_64_bit", it's initialized
> when a session setups environment, its value is decided by checking the
> architecture string.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/env.c | 17 ++++++++++++++++-
>  tools/perf/util/env.h |  1 +
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index ebc5e9ad35db..345635a2e842 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -11,6 +11,7 @@
>  #include <stdlib.h>
>  #include <string.h>
>  
> +int kernel_is_64_bit;
>  struct perf_env perf_env;

Why can't this be in 'struct perf_env'?

- Arnaldo
  
>  #ifdef HAVE_LIBBPF_SUPPORT
> @@ -172,6 +173,19 @@ static void perf_env__purge_bpf(struct perf_env *env __maybe_unused)
>  }
>  #endif // HAVE_LIBBPF_SUPPORT
>  
> +static void perf_env__init_kernel_mode(struct perf_env *env)
> +{
> +	const char *arch = perf_env__raw_arch(env);
> +
> +	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
> +	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
> +	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
> +	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
> +		kernel_is_64_bit = 1;
> +	else
> +		kernel_is_64_bit = 0;
> +}
> +
>  void perf_env__exit(struct perf_env *env)
>  {
>  	int i;
> @@ -217,13 +231,14 @@ void perf_env__exit(struct perf_env *env)
>  	zfree(&env->hybrid_cpc_nodes);
>  }
>  
> -void perf_env__init(struct perf_env *env __maybe_unused)
> +void perf_env__init(struct perf_env *env)
>  {
>  #ifdef HAVE_LIBBPF_SUPPORT
>  	env->bpf_progs.infos = RB_ROOT;
>  	env->bpf_progs.btfs = RB_ROOT;
>  	init_rwsem(&env->bpf_progs.lock);
>  #endif
> +	perf_env__init_kernel_mode(env);
>  }
>  
>  int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index 6824a7423a2d..cc989ff49740 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -139,6 +139,7 @@ enum perf_compress_type {
>  struct bpf_prog_info_node;
>  struct btf_node;
>  
> +extern int kernel_is_64_bit;
>  extern struct perf_env perf_env;
>  
>  void perf_env__exit(struct perf_env *env);
> -- 
> 2.25.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-11 10:41 ` [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan
  2021-07-12 14:44   ` Russell King (Oracle)
@ 2021-07-13  7:07   ` Adrian Hunter
  2021-07-13 15:48     ` Leo Yan
  1 sibling, 1 reply; 34+ messages in thread
From: Adrian Hunter @ 2021-07-13  7:07 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 11/07/21 1:41 pm, Leo Yan wrote:
> When perf runs in compat mode (kernel in 64-bit mode and the perf is in
> 32-bit mode), the 64-bit value atomicity in the user space cannot be
> assured, E.g. on some architectures, the 64-bit value accessing is split
> into two instructions, one is for the low 32-bit word accessing and
> another is for the high 32-bit word.
> 
> This patch introduces two functions compat_auxtrace_mmap__read_head()
> and compat_auxtrace_mmap__write_tail(), as their naming indicates, when
> perf tool works in compat mode, it uses these two functions to access
> the AUX head and tail.  These two functions can allow the perf tool to
> work properly in certain conditions, e.g. when perf tool works in
> snapshot mode with only using AUX head pointer, or perf tool uses the
> AUX buffer and the incremented tail is not bigger than 4GB.
> 
> When perf tool cannot handle the case when the AUX tail is bigger than
> 4GB, the function compat_auxtrace_mmap__write_tail() returns -1 and
> tells the caller to bail out for the error.
> 
> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/auxtrace.c |  9 ++--
>  tools/perf/util/auxtrace.h | 94 +++++++++++++++++++++++++++++++++++++-
>  2 files changed, 98 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 6a63be8b2430..d6fc250fbf97 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -1766,10 +1766,13 @@ static int __auxtrace_mmap__read(struct mmap *map,
>  	mm->prev = head;
>  
>  	if (!snapshot) {
> -		auxtrace_mmap__write_tail(mm, head);
> -		if (itr->read_finish) {
> -			int err;
> +		int err;
>  
> +		err = auxtrace_mmap__write_tail(mm, head);
> +		if (err < 0)
> +			return err;
> +
> +		if (itr->read_finish) {
>  			err = itr->read_finish(itr, mm->idx);
>  			if (err < 0)
>  				return err;
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index d68a5e80b217..66de7b6e65ec 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -18,6 +18,8 @@
>  #include <asm/bitsperlong.h>
>  #include <asm/barrier.h>
>  
> +#include "env.h"
> +
>  union perf_event;
>  struct perf_session;
>  struct evlist;
> @@ -440,23 +442,111 @@ struct auxtrace_cache;
>  
>  #ifdef HAVE_AUXTRACE_SUPPORT
>  
> +/*
> + * In the compat mode kernel runs in 64-bit and perf tool runs in 32-bit mode,
> + * 32-bit perf tool cannot access 64-bit value atomically, which might lead to
> + * the issues caused by the below sequence on multiple CPUs: when perf tool
> + * accesses either the load operation or the store operation for 64-bit value,
> + * on some architectures the operation is divided into two instructions, one
> + * is for accessing the low 32-bit value and another is for the high 32-bit;
> + * thus these two user operations can give the kernel chances to access the
> + * 64-bit value, and thus leads to the unexpected load values.
> + *
> + *   kernel (64-bit)                        user (32-bit)
> + *
> + *   if (LOAD ->aux_tail) { --,             LOAD ->aux_head_lo
> + *       STORE $aux_data      |       ,--->
> + *       FLUSH $aux_data      |       |     LOAD ->aux_head_hi
> + *       STORE ->aux_head   --|-------`     smp_rmb()
> + *   }                        |             LOAD $data
> + *                            |             smp_mb()
> + *                            |             STORE ->aux_tail_lo
> + *                            `----------->
> + *                                          STORE ->aux_tail_hi
> + *
> + * For this reason, it's impossible for the perf tool to work correctly when
> + * the AUX head or tail is bigger than 4GB (more than 32 bits length); and we
> + * can not simply limit the AUX ring buffer to less than 4GB, the reason is
> + * the pointers can be increased monotonically (e.g in snapshot mode), whatever

At least for Intel PT, in snapshot mode, the head is always an offset
into the buffer, so never more than 4GB for a 32-bit perf tool. So maybe
leave out "(e.g in snapshot mode)"

> + * the buffer size it is, at the end the head and tail can be bigger than 4GB
> + * and carry out to the high 32-bit.
> + *
> + * To mitigate the issues and improve the user experience, we can allow the
> + * perf tool working in certain conditions and bail out with error if detect
> + * any overflow cannot be handled.
> + *
> + * For reading the AUX head, it reads out the values for three times, and
> + * compares the high 4 bytes of the values between the first time and the last
> + * time, if there has no change for high 4 bytes injected by the kernel during
> + * the user reading sequence, it's safe for use the second value.
> + *
> + * When update the AUX tail and detects any carrying in the high 32 bits, it
> + * means there have two store operations in user space and it cannot promise
> + * the atomicity for 64-bit write, so return '-1' in this case to tell the
> + * caller an overflow error has happened.
> + */
> +static inline u64 compat_auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
> +{
> +	struct perf_event_mmap_page *pc = mm->userpg;
> +	u64 first, second, last;
> +	u64 mask = (u64)(UINT32_MAX) << 32;
> +
> +	do {
> +		first = READ_ONCE(pc->aux_head);
> +		/* Ensure all reads are done after we read the head */
> +		smp_rmb();
> +		second = READ_ONCE(pc->aux_head);
> +		/* Ensure all reads are done after we read the head */
> +		smp_rmb();
> +		last = READ_ONCE(pc->aux_head);
> +	} while ((first & mask) != (last & mask));
> +
> +	return second;
> +}
> +
> +static inline int compat_auxtrace_mmap__write_tail(struct auxtrace_mmap *mm,
> +						    u64 tail)
> +{
> +	struct perf_event_mmap_page *pc = mm->userpg;
> +	u64 mask = (u64)(UINT32_MAX) << 32;
> +
> +	if (tail & mask)
> +		return -1;
> +
> +	/* Ensure all reads are done before we write the tail out */
> +	smp_mb();
> +	WRITE_ONCE(pc->aux_tail, tail);
> +	return 0;
> +}
> +
>  static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> -	u64 head = READ_ONCE(pc->aux_head);
> +	u64 head;
> +
> +#if BITS_PER_LONG == 32
> +	if (kernel_is_64_bit)
> +		return compat_auxtrace_mmap__read_head(mm);
> +#endif
> +	head = READ_ONCE(pc->aux_head);
>  
>  	/* Ensure all reads are done after we read the head */
>  	smp_rmb();
>  	return head;
>  }
>  
> -static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
> +static inline int auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
>  
> +#if BITS_PER_LONG == 32
> +	if (kernel_is_64_bit)
> +		return compat_auxtrace_mmap__write_tail(mm, tail);
> +#endif
>  	/* Ensure all reads are done before we write the tail out */
>  	smp_mb();
>  	WRITE_ONCE(pc->aux_tail, tail);
> +	return 0;
>  }
>  
>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 03/11] coresight: tmc-etf: Add comment for store ordering
  2021-07-11 10:40 ` [PATCH v4 03/11] coresight: tmc-etf: Add comment for store ordering Leo Yan
@ 2021-07-13 12:56   ` Peter Zijlstra
  2021-07-13 15:49     ` Leo Yan
  0 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2021-07-13 12:56 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Sun, Jul 11, 2021 at 06:40:57PM +0800, Leo Yan wrote:
> AUX ring buffer is required to separate the data store and aux_head
> store, since the function CS_LOCK() has contained memory barrier mb(),
> mb() is a more conservative barrier than smp_wmb() on Arm32/Arm64, thus
> it's needless to add any explicit barrier anymore.
> 
> Add comment to make clear for the barrier usage for ETF.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> index 45b85edfc690..9a42ee689921 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> @@ -553,6 +553,12 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
>  	if (buf->snapshot)
>  		handle->head += to_read;
>  
> +	/*
> +	 * AUX ring buffer requires to use memory barrier to separate the trace
> +	 * data store and aux_head store, because CS_LOCK() contains mb() which
> +	 * gives more heavy barrier than smp_wmb(), it's not necessary to
> +	 * explicitly invoke any barrier.
> +	 */
>  	CS_LOCK(drvdata->base);

'more heavy' is not a correctness argument :-)

The argument to make here is that CS_LOCK() ensures completion /
visibility of the hardware buffer.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 04/11] perf/x86: Add barrier after updating bts
  2021-07-11 10:40 ` [PATCH v4 04/11] perf/x86: Add barrier after updating bts Leo Yan
@ 2021-07-13 13:01   ` Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: Peter Zijlstra @ 2021-07-13 13:01 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Sun, Jul 11, 2021 at 06:40:58PM +0800, Leo Yan wrote:
> Add barrier wmb() to separate the AUX data store and aux_head store.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  arch/x86/events/intel/bts.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
> index 6320d2cfd9d3..4a015d160bc5 100644
> --- a/arch/x86/events/intel/bts.c
> +++ b/arch/x86/events/intel/bts.c
> @@ -209,6 +209,9 @@ static void bts_update(struct bts_ctx *bts)
>  	} else {
>  		local_set(&buf->data_size, head);
>  	}
> +
> +	/* The WMB separates data store and aux_head store matches. */
> +	wmb();

Alexander, last time you mentioned (on IRC) that BTS is supposed to be
coherent, in which case we can probably get away with just a compiler
barrier. Can you confirm?

That said; this BTS crud is so horrifically slow, an extra MFENCE isn't
going to matter one way or another.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 07/11] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
  2021-07-12 14:32   ` Adrian Hunter
@ 2021-07-13 13:10     ` Leo Yan
  0 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-13 13:10 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Mon, Jul 12, 2021 at 05:32:11PM +0300, Adrian Hunter wrote:

[...]

> > --- a/tools/perf/util/auxtrace.c
> > +++ b/tools/perf/util/auxtrace.c
> > @@ -1686,13 +1686,12 @@ static int __auxtrace_mmap__read(struct mmap *map,
> >  	union perf_event ev;
> >  	void *data1, *data2;
> >  
> > +	head = auxtrace_mmap__read_head(mm);
> > +
> >  	if (snapshot) {
> > -		head = auxtrace_mmap__read_snapshot_head(mm);
> >  		if (auxtrace_record__find_snapshot(itr, mm->idx, mm, data,
> >  						   &head, &old))
> 
> That leaves a nested 'if' which is not kernel style i.e. could be
> 
> 	if (snapshot &&
> 	    auxtrace_record__find_snapshot(itr, mm->idx, mm, data, &head, &old))

Will refine in next spin, thanks for suggestion!

> >  			return -1;
> > -	} else {
> > -		head = auxtrace_mmap__read_head(mm);
> >  	}
> >  
> >  	if (old == head)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-12 18:14   ` Arnaldo Carvalho de Melo
@ 2021-07-13 15:09     ` Leo Yan
  2021-07-13 17:31       ` Hunter, Adrian
  0 siblings, 1 reply; 34+ messages in thread
From: Leo Yan @ 2021-07-13 15:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Adrian Hunter, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, linux-perf-users, linux-kernel,
	coresight, linux-arm-kernel

Hi Arnaldo, Adrian,

On Mon, Jul 12, 2021 at 03:14:35PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Sun, Jul 11, 2021 at 06:41:04PM +0800, Leo Yan escreveu:
> > It's useful to know that the kernel is running in 32-bit or 64-bit
> > mode.  E.g. We can decide if perf tool is running in compat mode
> > from this info.
> > 
> > This patch adds a global variable "kernel_is_64_bit", it's initialized
> > when a session setups environment, its value is decided by checking the
> > architecture string.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  tools/perf/util/env.c | 17 ++++++++++++++++-
> >  tools/perf/util/env.h |  1 +
> >  2 files changed, 17 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> > index ebc5e9ad35db..345635a2e842 100644
> > --- a/tools/perf/util/env.c
> > +++ b/tools/perf/util/env.c
> > @@ -11,6 +11,7 @@
> >  #include <stdlib.h>
> >  #include <string.h>
> >  
> > +int kernel_is_64_bit;
> >  struct perf_env perf_env;
> 
> Why can't this be in 'struct perf_env'?

Good question.  I considered to add it in struct perf_env but finally
I used this way; the reason is this variable "kernel_is_64_bit" is only
used during recording phase for AUX ring buffer, and don't use it for
report.  So seems to me it's over complexity to add a new field and
just wander if it's necessary to save this field as new feature in the
perf header.

Combining the comment from Adrian in another email, I think it's good
to add a new field "compat_mode" in the struct perf_env, and this field
will be initialized in build-record.c.  Currently we don't need to save
this value into the perf file, if later we need to use this value for
decoding phase, then we can add a new feature item to save "compat_mode"
into the perf file's header.

If you have any different idea, please let me know.  Thanks!

Leo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-12 14:44   ` Russell King (Oracle)
@ 2021-07-13 15:46     ` Leo Yan
  2021-07-13 16:14       ` Russell King (Oracle)
  0 siblings, 1 reply; 34+ messages in thread
From: Leo Yan @ 2021-07-13 15:46 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel

Hi Russell,

On Mon, Jul 12, 2021 at 03:44:11PM +0100, Russell King (Oracle) wrote:
> On Sun, Jul 11, 2021 at 06:41:05PM +0800, Leo Yan wrote:
> > When perf runs in compat mode (kernel in 64-bit mode and the perf is in
> > 32-bit mode), the 64-bit value atomicity in the user space cannot be
> > assured, E.g. on some architectures, the 64-bit value accessing is split
> > into two instructions, one is for the low 32-bit word accessing and
> > another is for the high 32-bit word.
> 
> Does this apply to 32-bit ARM code on aarch64? I would not have thought
> it would, as the structure member is a __u64 and
> compat_auxtrace_mmap__read_head() doesn't seem to be marking anything
> as packed, so the compiler _should_ be able to use a LDRD instruction
> to load the value.

I think essentially your question is relevant to the memory model.
For 32-bit Arm application on aarch64, in the Armv8 architecture
reference manual ARM DDI 0487F.c, chapter "E2.2.1
Requirements for single-copy atomicity" describes:

"LDM, LDC, LDRD, STM, STC, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR, VSTM,
and VSTR instructions are executed as a sequence of word-aligned word
accesses. Each 32-bit word access is guaranteed to be single-copy
atomic. The architecture does not require subsequences of two or more
word accesses from the sequence to be single-copy atomic."

So I think LDRD/STRD instruction cannot promise the atomicity for
loading or storing two words in 32-bit Arm.

And another thought is the functions compat_auxtrace_mmap__read_head()
is a general function, I avoid to write it with any architecture
specific instructions.

> Is this a problem noticed on non-ARM architectures?

No, actually we just concluded the potential issue based on the analysis
for the weak memory model.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-13  7:07   ` Adrian Hunter
@ 2021-07-13 15:48     ` Leo Yan
  0 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-13 15:48 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Tue, Jul 13, 2021 at 10:07:03AM +0300, Adrian Hunter wrote:

[...]

> > +/*
> > + * In the compat mode kernel runs in 64-bit and perf tool runs in 32-bit mode,
> > + * 32-bit perf tool cannot access 64-bit value atomically, which might lead to
> > + * the issues caused by the below sequence on multiple CPUs: when perf tool
> > + * accesses either the load operation or the store operation for 64-bit value,
> > + * on some architectures the operation is divided into two instructions, one
> > + * is for accessing the low 32-bit value and another is for the high 32-bit;
> > + * thus these two user operations can give the kernel chances to access the
> > + * 64-bit value, and thus leads to the unexpected load values.
> > + *
> > + *   kernel (64-bit)                        user (32-bit)
> > + *
> > + *   if (LOAD ->aux_tail) { --,             LOAD ->aux_head_lo
> > + *       STORE $aux_data      |       ,--->
> > + *       FLUSH $aux_data      |       |     LOAD ->aux_head_hi
> > + *       STORE ->aux_head   --|-------`     smp_rmb()
> > + *   }                        |             LOAD $data
> > + *                            |             smp_mb()
> > + *                            |             STORE ->aux_tail_lo
> > + *                            `----------->
> > + *                                          STORE ->aux_tail_hi
> > + *
> > + * For this reason, it's impossible for the perf tool to work correctly when
> > + * the AUX head or tail is bigger than 4GB (more than 32 bits length); and we
> > + * can not simply limit the AUX ring buffer to less than 4GB, the reason is
> > + * the pointers can be increased monotonically (e.g in snapshot mode), whatever
> 
> At least for Intel PT, in snapshot mode, the head is always an offset
> into the buffer, so never more than 4GB for a 32-bit perf tool. So maybe
> leave out "(e.g in snapshot mode)"

Sure, will leave out "(e.g in snapshot mode)".

Thanks,
Leo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 03/11] coresight: tmc-etf: Add comment for store ordering
  2021-07-13 12:56   ` Peter Zijlstra
@ 2021-07-13 15:49     ` Leo Yan
  0 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-13 15:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Tue, Jul 13, 2021 at 02:56:06PM +0200, Peter Zijlstra wrote:
> On Sun, Jul 11, 2021 at 06:40:57PM +0800, Leo Yan wrote:
> > AUX ring buffer is required to separate the data store and aux_head
> > store, since the function CS_LOCK() has contained memory barrier mb(),
> > mb() is a more conservative barrier than smp_wmb() on Arm32/Arm64, thus
> > it's needless to add any explicit barrier anymore.
> > 
> > Add comment to make clear for the barrier usage for ETF.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> > index 45b85edfc690..9a42ee689921 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> > @@ -553,6 +553,12 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
> >  	if (buf->snapshot)
> >  		handle->head += to_read;
> >  
> > +	/*
> > +	 * AUX ring buffer requires to use memory barrier to separate the trace
> > +	 * data store and aux_head store, because CS_LOCK() contains mb() which
> > +	 * gives more heavy barrier than smp_wmb(), it's not necessary to
> > +	 * explicitly invoke any barrier.
> > +	 */
> >  	CS_LOCK(drvdata->base);
> 
> 'more heavy' is not a correctness argument :-)
> 
> The argument to make here is that CS_LOCK() ensures completion /
> visibility of the hardware buffer.

Will correct for this, thanks for reminding!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-13 15:46     ` Leo Yan
@ 2021-07-13 16:14       ` Russell King (Oracle)
  2021-07-13 18:13         ` Catalin Marinas
  0 siblings, 1 reply; 34+ messages in thread
From: Russell King (Oracle) @ 2021-07-13 16:14 UTC (permalink / raw)
  To: Leo Yan, Will Deacon, Catalin Marinas
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel

On Tue, Jul 13, 2021 at 11:46:02PM +0800, Leo Yan wrote:
> Hi Russell,
> 
> On Mon, Jul 12, 2021 at 03:44:11PM +0100, Russell King (Oracle) wrote:
> > On Sun, Jul 11, 2021 at 06:41:05PM +0800, Leo Yan wrote:
> > > When perf runs in compat mode (kernel in 64-bit mode and the perf is in
> > > 32-bit mode), the 64-bit value atomicity in the user space cannot be
> > > assured, E.g. on some architectures, the 64-bit value accessing is split
> > > into two instructions, one is for the low 32-bit word accessing and
> > > another is for the high 32-bit word.
> > 
> > Does this apply to 32-bit ARM code on aarch64? I would not have thought
> > it would, as the structure member is a __u64 and
> > compat_auxtrace_mmap__read_head() doesn't seem to be marking anything
> > as packed, so the compiler _should_ be able to use a LDRD instruction
> > to load the value.
> 
> I think essentially your question is relevant to the memory model.
> For 32-bit Arm application on aarch64, in the Armv8 architecture
> reference manual ARM DDI 0487F.c, chapter "E2.2.1
> Requirements for single-copy atomicity" describes:
> 
> "LDM, LDC, LDRD, STM, STC, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR, VSTM,
> and VSTR instructions are executed as a sequence of word-aligned word
> accesses. Each 32-bit word access is guaranteed to be single-copy
> atomic. The architecture does not require subsequences of two or more
> word accesses from the sequence to be single-copy atomic."

... which is an interesting statement for ARMv7 code. DDI0406C says
similar but goes on to say:

   In an implementation that includes the Large Physical Address
   Extension, LDRD and STRD accesses to 64-bit aligned locations
   are 64-bit single-copy atomic as seen by translation table
   walks and accesses to translation tables.

then states that LPAE page tables must be stored in memory that such
page tables must be in memory that is capable of supporting 64-bit
single-copy atomic accesses.

In Linux, we assume all RAM that the kernel has access to can contain
page tables. So by implication, all RAM that the kernel has access to
and exposes to userspace must be 64-bit single-copy atomic (if not,
we have a rather serious bug.)

The remaining question is whether it would be sane for LDRD and STRD
to be single-copy atomic to translation table walkers but not to other
CPUs. Since Linux expects to be able to modify the page tables from
any CPU in the system, this requirement must hold, otherwise it's going
to be a really strangely designed system.

Therefore, I put it that for Linux to operate correctly on 32-bit Arm
CPUs with LPAE, LDRD and STRD must be 64-bit single-copy atomic
inspite of what the architecture reference documentation may allow.

Now, since we allow 32-bit ARM kernels to run under KVM on ARMv8, it
would be pretty silly if this was broken on aarch64 - it would mean
such a guest would have no way to atomically update the LPAE page
tables. We know that's not true, since we can run 32-bit kernels and
userspace just fine under aarch64.

I'd be interested to hear what Catalin and Will have to say on this,
but I suspect in practice, Arm systems that are running Linux with
LPAE (ARMv7+LPAE, ARMv8) will implement LDRD and STRD with 64-bit
single-copy atomic semantics.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-13 15:09     ` Leo Yan
@ 2021-07-13 17:31       ` Hunter, Adrian
  2021-07-14 13:59         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 34+ messages in thread
From: Hunter, Adrian @ 2021-07-13 17:31 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel



> -----Original Message-----
> From: Leo Yan <leo.yan@linaro.org>
> Sent: Tuesday, July 13, 2021 6:10 PM
> To: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Hunter, Adrian <adrian.hunter@intel.com>; Peter Zijlstra
> <peterz@infradead.org>; Ingo Molnar <mingo@redhat.com>; Mark Rutland
> <mark.rutland@arm.com>; Alexander Shishkin
> <alexander.shishkin@linux.intel.com>; Jiri Olsa <jolsa@redhat.com>;
> Namhyung Kim <namhyung@kernel.org>; Thomas Gleixner
> <tglx@linutronix.de>; Borislav Petkov <bp@alien8.de>; x86@kernel.org; H.
> Peter Anvin <hpa@zytor.com>; Mathieu Poirier
> <mathieu.poirier@linaro.org>; Suzuki K Poulose
> <suzuki.poulose@arm.com>; Mike Leach <mike.leach@linaro.org>; linux-
> perf-users@vger.kernel.org; linux-kernel@vger.kernel.org;
> coresight@lists.linaro.org; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
> 
> Hi Arnaldo, Adrian,
> 
> On Mon, Jul 12, 2021 at 03:14:35PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Sun, Jul 11, 2021 at 06:41:04PM +0800, Leo Yan escreveu:
> > > It's useful to know that the kernel is running in 32-bit or 64-bit
> > > mode.  E.g. We can decide if perf tool is running in compat mode
> > > from this info.
> > >
> > > This patch adds a global variable "kernel_is_64_bit", it's
> > > initialized when a session setups environment, its value is decided
> > > by checking the architecture string.
> > >
> > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > > ---
> > >  tools/perf/util/env.c | 17 ++++++++++++++++-  tools/perf/util/env.h
> > > |  1 +
> > >  2 files changed, 17 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c index
> > > ebc5e9ad35db..345635a2e842 100644
> > > --- a/tools/perf/util/env.c
> > > +++ b/tools/perf/util/env.c
> > > @@ -11,6 +11,7 @@
> > >  #include <stdlib.h>
> > >  #include <string.h>
> > >
> > > +int kernel_is_64_bit;
> > >  struct perf_env perf_env;
> >
> > Why can't this be in 'struct perf_env'?
> 
> Good question.  I considered to add it in struct perf_env but finally I used this
> way; the reason is this variable "kernel_is_64_bit" is only used during
> recording phase for AUX ring buffer, and don't use it for report.  So seems to
> me it's over complexity to add a new field and just wander if it's necessary to
> save this field as new feature in the perf header.

I think we store the arch, so if the "kernel_is_64_bit" calculation depends only on arch
then I guess we don't need a new feature at the moment.

> 
> Combining the comment from Adrian in another email, I think it's good to add
> a new field "compat_mode" in the struct perf_env, and this field will be
> initialized in build-record.c.  Currently we don't need to save this value into
> the perf file, if later we need to use this value for decoding phase, then we
> can add a new feature item to save "compat_mode"
> into the perf file's header.
> 
> If you have any different idea, please let me know.  Thanks!
> 
> Leo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-13 16:14       ` Russell King (Oracle)
@ 2021-07-13 18:13         ` Catalin Marinas
  2021-07-14  8:40           ` Russell King (Oracle)
  0 siblings, 1 reply; 34+ messages in thread
From: Catalin Marinas @ 2021-07-13 18:13 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Leo Yan, Will Deacon, Arnaldo Carvalho de Melo, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel

On Tue, Jul 13, 2021 at 05:14:41PM +0100, Russell King wrote:
> On Tue, Jul 13, 2021 at 11:46:02PM +0800, Leo Yan wrote:
> > On Mon, Jul 12, 2021 at 03:44:11PM +0100, Russell King (Oracle) wrote:
> > > On Sun, Jul 11, 2021 at 06:41:05PM +0800, Leo Yan wrote:
> > > > When perf runs in compat mode (kernel in 64-bit mode and the perf is in
> > > > 32-bit mode), the 64-bit value atomicity in the user space cannot be
> > > > assured, E.g. on some architectures, the 64-bit value accessing is split
> > > > into two instructions, one is for the low 32-bit word accessing and
> > > > another is for the high 32-bit word.
> > > 
> > > Does this apply to 32-bit ARM code on aarch64? I would not have thought
> > > it would, as the structure member is a __u64 and
> > > compat_auxtrace_mmap__read_head() doesn't seem to be marking anything
> > > as packed, so the compiler _should_ be able to use a LDRD instruction
> > > to load the value.
> > 
> > I think essentially your question is relevant to the memory model.
> > For 32-bit Arm application on aarch64, in the Armv8 architecture
> > reference manual ARM DDI 0487F.c, chapter "E2.2.1
> > Requirements for single-copy atomicity" describes:
> > 
> > "LDM, LDC, LDRD, STM, STC, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR, VSTM,
> > and VSTR instructions are executed as a sequence of word-aligned word
> > accesses. Each 32-bit word access is guaranteed to be single-copy
> > atomic. The architecture does not require subsequences of two or more
> > word accesses from the sequence to be single-copy atomic."
> 
> ... which is an interesting statement for ARMv7 code. DDI0406C says
> similar but goes on to say:
> 
>    In an implementation that includes the Large Physical Address
>    Extension, LDRD and STRD accesses to 64-bit aligned locations
>    are 64-bit single-copy atomic as seen by translation table
>    walks and accesses to translation tables.
> 
> then states that LPAE page tables must be stored in memory that such
> page tables must be in memory that is capable of supporting 64-bit
> single-copy atomic accesses.

A similar statement is in the ARMv8 ARM (E2.2.1 in version G.a).

> In Linux, we assume all RAM that the kernel has access to can contain
> page tables. So by implication, all RAM that the kernel has access to
> and exposes to userspace must be 64-bit single-copy atomic (if not,
> we have a rather serious bug.)

Indeed. We should assume that the SDRAM supports all the CPU features.

> The remaining question is whether it would be sane for LDRD and STRD
> to be single-copy atomic to translation table walkers but not to other
> CPUs. Since Linux expects to be able to modify the page tables from
> any CPU in the system, this requirement must hold, otherwise it's going
> to be a really strangely designed system.

The above statement does say "translation table walks and accesses to
translation tables". The accesses can be LDRD/STRD instructions from
other CPUs. Since the hardware can't tell whether the access is to a
page table, the designers just made LDRD/STRD single-copy atomic.

> I'd be interested to hear what Catalin and Will have to say on this,
> but I suspect in practice, Arm systems that are running Linux with
> LPAE (ARMv7+LPAE, ARMv8) will implement LDRD and STRD with 64-bit
> single-copy atomic semantics.

That's my understanding as well. In theory one could have a page table
access from EL0, so it should be atomic.

We could try to clarify E2.2.1 to simply state that naturally aligned
LDRD/STRD are single-copy atomic without any subsequent statement on the
translation table.

-- 
Catalin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-13 18:13         ` Catalin Marinas
@ 2021-07-14  8:40           ` Russell King (Oracle)
  2021-07-23  7:23             ` Leo Yan
  0 siblings, 1 reply; 34+ messages in thread
From: Russell King (Oracle) @ 2021-07-14  8:40 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Leo Yan, Will Deacon, Arnaldo Carvalho de Melo, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel

On Tue, Jul 13, 2021 at 07:13:02PM +0100, Catalin Marinas wrote:
> We could try to clarify E2.2.1 to simply state that naturally aligned
> LDRD/STRD are single-copy atomic without any subsequent statement on the
> translation table.

I think that clarification would be most helpful. Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-13 17:31       ` Hunter, Adrian
@ 2021-07-14 13:59         ` Arnaldo Carvalho de Melo
  2021-07-14 14:00           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-14 13:59 UTC (permalink / raw)
  To: Hunter, Adrian
  Cc: Leo Yan, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, linux-perf-users, linux-kernel,
	coresight, linux-arm-kernel

Em Tue, Jul 13, 2021 at 05:31:03PM +0000, Hunter, Adrian escreveu:
> > On Mon, Jul 12, 2021 at 03:14:35PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Sun, Jul 11, 2021 at 06:41:04PM +0800, Leo Yan escreveu:
> > > > +++ b/tools/perf/util/env.c
> > > > @@ -11,6 +11,7 @@
> > > >  #include <stdlib.h>
> > > >  #include <string.h>

> > > > +int kernel_is_64_bit;
> > > >  struct perf_env perf_env;

> > > Why can't this be in 'struct perf_env'?

> > Good question.  I considered to add it in struct perf_env but finally I used this
> > way; the reason is this variable "kernel_is_64_bit" is only used during
> > recording phase for AUX ring buffer, and don't use it for report.  So seems to
> > me it's over complexity to add a new field and just wander if it's necessary to
> > save this field as new feature in the perf header.

> I think we store the arch, so if the "kernel_is_64_bit" calculation depends only on arch
> then I guess we don't need a new feature at the moment.

So, I wasn't suggesting to add this info to the perf.data file header,
just to the in-memory 'struct perf_env'.

And also we should avoid unconditionally initializing things that we may
never need, please structure it as:

static void perf_env__init_kernel_mode(struct perf_env *env)
{
       const char *arch = perf_env__raw_arch(env);

       if (!strncmp(arch, "x86_64", 6)   || !strncmp(arch, "aarch64", 7) ||
           !strncmp(arch, "arm64", 5)    || !strncmp(arch, "mips64", 6) ||
           !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
           !strncmp(arch, "s390x", 5)    || !strncmp(arch, "sparc64", 7))
               kernel_is_64_bit = 1;
       else
               kernel_is_64_bit = 0;
}


void perf_env__init(struct perf_env *env)
{
	...
	env->kernel_is_64_bit = -1;
	...
}

bool perf_env__kernel_is_64_bit(struct perf_env *env)
{
	if (env->kernel_is_64_bit == -1)
		perf_env__init_kernel_mode(env);

	return env->kernel_is_64_bit;
}

One thing in my TODO is to crack down on the tons of initializations
perf does unconditionally, last time I looked there are lots :-\

- Arnaldo
 
> > Combining the comment from Adrian in another email, I think it's good to add
> > a new field "compat_mode" in the struct perf_env, and this field will be
> > initialized in build-record.c.  Currently we don't need to save this value into
> > the perf file, if later we need to use this value for decoding phase, then we
> > can add a new feature item to save "compat_mode"
> > into the perf file's header.

> > If you have any different idea, please let me know.  Thanks!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-14 13:59         ` Arnaldo Carvalho de Melo
@ 2021-07-14 14:00           ` Arnaldo Carvalho de Melo
  2021-07-23  7:11             ` Leo Yan
  0 siblings, 1 reply; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-14 14:00 UTC (permalink / raw)
  To: Hunter, Adrian
  Cc: Leo Yan, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, linux-perf-users, linux-kernel,
	coresight, linux-arm-kernel

Em Wed, Jul 14, 2021 at 10:59:49AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Jul 13, 2021 at 05:31:03PM +0000, Hunter, Adrian escreveu:
> > > On Mon, Jul 12, 2021 at 03:14:35PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Sun, Jul 11, 2021 at 06:41:04PM +0800, Leo Yan escreveu:
> > > > > +++ b/tools/perf/util/env.c
> > > > > @@ -11,6 +11,7 @@
> > > > >  #include <stdlib.h>
> > > > >  #include <string.h>
> 
> > > > > +int kernel_is_64_bit;
> > > > >  struct perf_env perf_env;
> 
> > > > Why can't this be in 'struct perf_env'?
> 
> > > Good question.  I considered to add it in struct perf_env but finally I used this
> > > way; the reason is this variable "kernel_is_64_bit" is only used during
> > > recording phase for AUX ring buffer, and don't use it for report.  So seems to
> > > me it's over complexity to add a new field and just wander if it's necessary to
> > > save this field as new feature in the perf header.
> 
> > I think we store the arch, so if the "kernel_is_64_bit" calculation depends only on arch
> > then I guess we don't need a new feature at the moment.
> 
> So, I wasn't suggesting to add this info to the perf.data file header,
> just to the in-memory 'struct perf_env'.
> 
> And also we should avoid unconditionally initializing things that we may
> never need, please structure it as:

Oops, forgot these:
 
> static void perf_env__init_kernel_mode(struct perf_env *env)
> {
>        const char *arch = perf_env__raw_arch(env);
> 
>        if (!strncmp(arch, "x86_64", 6)   || !strncmp(arch, "aarch64", 7) ||
>            !strncmp(arch, "arm64", 5)    || !strncmp(arch, "mips64", 6) ||
>            !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
>            !strncmp(arch, "s390x", 5)    || !strncmp(arch, "sparc64", 7))
>                kernel_is_64_bit = 1;
                 env->kernel_is_64_bit = 1;
>        else
>                kernel_is_64_bit = 0;
                 env->kernel_is_64_bit = 0;
> }
> 
> 
> void perf_env__init(struct perf_env *env)
> {
> 	...
> 	env->kernel_is_64_bit = -1;
> 	...
> }
> 
> bool perf_env__kernel_is_64_bit(struct perf_env *env)
> {
> 	if (env->kernel_is_64_bit == -1)
> 		perf_env__init_kernel_mode(env);
> 
> 	return env->kernel_is_64_bit;
> }
> 
> One thing in my TODO is to crack down on the tons of initializations
> perf does unconditionally, last time I looked there are lots :-\
> 
> - Arnaldo
>  
> > > Combining the comment from Adrian in another email, I think it's good to add
> > > a new field "compat_mode" in the struct perf_env, and this field will be
> > > initialized in build-record.c.  Currently we don't need to save this value into
> > > the perf file, if later we need to use this value for decoding phase, then we
> > > can add a new feature item to save "compat_mode"
> > > into the perf file's header.
> 
> > > If you have any different idea, please let me know.  Thanks!

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode
  2021-07-14 14:00           ` Arnaldo Carvalho de Melo
@ 2021-07-23  7:11             ` Leo Yan
  0 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-23  7:11 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Hunter, Adrian, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, linux-perf-users, linux-kernel,
	coresight, linux-arm-kernel

Hi Arnaldo,

On Wed, Jul 14, 2021 at 11:00:44AM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > So, I wasn't suggesting to add this info to the perf.data file header,
> > just to the in-memory 'struct perf_env'.
> > 
> > And also we should avoid unconditionally initializing things that we may
> > never need, please structure it as:
> 
> Oops, forgot these:
>  
> > static void perf_env__init_kernel_mode(struct perf_env *env)
> > {
> >        const char *arch = perf_env__raw_arch(env);
> > 
> >        if (!strncmp(arch, "x86_64", 6)   || !strncmp(arch, "aarch64", 7) ||
> >            !strncmp(arch, "arm64", 5)    || !strncmp(arch, "mips64", 6) ||
> >            !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
> >            !strncmp(arch, "s390x", 5)    || !strncmp(arch, "sparc64", 7))
> >                kernel_is_64_bit = 1;
>                  env->kernel_is_64_bit = 1;
> >        else
> >                kernel_is_64_bit = 0;
>                  env->kernel_is_64_bit = 0;
> > }
> > 
> > 
> > void perf_env__init(struct perf_env *env)
> > {
> > 	...
> > 	env->kernel_is_64_bit = -1;
> > 	...
> > }
> > 
> > bool perf_env__kernel_is_64_bit(struct perf_env *env)
> > {
> > 	if (env->kernel_is_64_bit == -1)
> > 		perf_env__init_kernel_mode(env);
> > 
> > 	return env->kernel_is_64_bit;
> > }

Thanks a lot for the suggestion; this is much clear for me, will spin
new patch set by following it.

Sorry for slow response due to my bandwidth was occupied by a task in
hand.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-14  8:40           ` Russell King (Oracle)
@ 2021-07-23  7:23             ` Leo Yan
  0 siblings, 0 replies; 34+ messages in thread
From: Leo Yan @ 2021-07-23  7:23 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Catalin Marinas, Will Deacon, Arnaldo Carvalho de Melo,
	Adrian Hunter, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin, Mathieu Poirier,
	Suzuki K Poulose, Mike Leach, linux-perf-users, linux-kernel,
	coresight, linux-arm-kernel

On Wed, Jul 14, 2021 at 09:40:15AM +0100, Russell King (Oracle) wrote:
> On Tue, Jul 13, 2021 at 07:13:02PM +0100, Catalin Marinas wrote:
> > We could try to clarify E2.2.1 to simply state that naturally aligned
> > LDRD/STRD are single-copy atomic without any subsequent statement on the
> > translation table.
> 
> I think that clarification would be most helpful. Thanks.

Thanks for the suggestion and confirmation, Russell & Catalin.

If so, I will implement the weak functions for
compat_auxtrace_mmap__{read_head|write_tail}; and write the arm/arm64
specific functions with using LDRD/STRD instructions.

For better patches organization, I will use a separate patch set for
enabling the compat functions (in particular patches 10, 11/11) in
the next spin.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2021-07-23  7:23 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-11 10:40 [PATCH v4 00/11] perf: Refine barriers for AUX ring buffer Leo Yan
2021-07-11 10:40 ` [PATCH v4 01/11] perf/ring_buffer: Add comment for barriers on " Leo Yan
2021-07-11 10:40 ` [PATCH v4 02/11] coresight: tmc-etr: Add barrier after updating " Leo Yan
2021-07-12 10:40   ` Suzuki K Poulose
2021-07-12 10:54     ` Leo Yan
2021-07-11 10:40 ` [PATCH v4 03/11] coresight: tmc-etf: Add comment for store ordering Leo Yan
2021-07-13 12:56   ` Peter Zijlstra
2021-07-13 15:49     ` Leo Yan
2021-07-11 10:40 ` [PATCH v4 04/11] perf/x86: Add barrier after updating bts Leo Yan
2021-07-13 13:01   ` Peter Zijlstra
2021-07-11 10:40 ` [PATCH v4 05/11] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
2021-07-11 10:41 ` [PATCH v4 06/11] perf auxtrace: Drop legacy __sync functions Leo Yan
2021-07-11 10:41 ` [PATCH v4 07/11] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() Leo Yan
2021-07-12 14:32   ` Adrian Hunter
2021-07-13 13:10     ` Leo Yan
2021-07-11 10:41 ` [PATCH v4 08/11] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT Leo Yan
2021-07-11 10:41 ` [PATCH v4 09/11] tools: Remove feature-sync-compare-and-swap feature detection Leo Yan
2021-07-11 10:41 ` [PATCH v4 10/11] perf env: Set flag for kernel is 64-bit mode Leo Yan
2021-07-12 14:37   ` Adrian Hunter
2021-07-12 18:14   ` Arnaldo Carvalho de Melo
2021-07-13 15:09     ` Leo Yan
2021-07-13 17:31       ` Hunter, Adrian
2021-07-14 13:59         ` Arnaldo Carvalho de Melo
2021-07-14 14:00           ` Arnaldo Carvalho de Melo
2021-07-23  7:11             ` Leo Yan
2021-07-11 10:41 ` [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan
2021-07-12 14:44   ` Russell King (Oracle)
2021-07-13 15:46     ` Leo Yan
2021-07-13 16:14       ` Russell King (Oracle)
2021-07-13 18:13         ` Catalin Marinas
2021-07-14  8:40           ` Russell King (Oracle)
2021-07-23  7:23             ` Leo Yan
2021-07-13  7:07   ` Adrian Hunter
2021-07-13 15:48     ` Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).