linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer
@ 2021-07-04  7:16 Leo Yan
  2021-07-04  7:16 ` [PATCH v3 01/10] perf/ring_buffer: Add comment for barriers on " Leo Yan
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

This patch series is to refine the memory barriers for AUX ring buffer.

Patches 01 ~ 04 to address the barriers usage in the kernel.  The first
patch is to make clear comment for how to use the barriers between the
data store and aux_head store, this asks the driver to make sure the
data is visible.  Patches 02 ~ 04 is to refine the drivers for barriers
after the data store.

Patches 05 ~ 07 is to drop the legacy __sync functions, and polish for
duplicate code and cleanup the build after SYNC_COMPARE_AND_SWAP is not
used.

Patch 08 is to use WRITE_ONCE() for updating aux_tail.

Since the 64-bit value's atomicity is not promised on 32-bit perf, the
last two patches tries to fixup for perf tool when it runs in compat
mode.  Patch 09 introduces a new global variable to indicate the kernel
runs in 64-bit mode which can be used to confirm if in compat mode;
patch 10 introduces variant functions for accessing AUX head/tail, it
can resolve the aotmicity issue for reading head pointer, and for the
tail write overflow issue it returns error to notify the tool to exit.

Have testes the patches on Arm64 Juno platform.

Changes from v2:
- Removed auxtrace_mmap__read_snapshot_head(), which has the duplicated
  code with auxtrace_mmap__read_head();
- Cleanuped the build for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT (Adrian);
- Added global variable "kernel_is_64_bit" (Adrian);
- Added compat variants compat_auxtrace_mmap__{read_head|write_tail}
  (Adrian).


Leo Yan (10):
  perf/ring_buffer: Add comment for barriers on AUX ring buffer
  coresight: tmc-etr: Add barrier after updating AUX ring buffer
  coresight: tmc-etf: Add comment for store ordering
  perf/x86: Add barrier after updating bts
  perf auxtrace: Drop legacy __sync functions
  perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
  perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
  perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  perf env: Set kernel bit mode
  perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}

 arch/x86/events/intel/bts.c                   |   3 +
 .../hwtracing/coresight/coresight-tmc-etf.c   |   6 +
 .../hwtracing/coresight/coresight-tmc-etr.c   |   8 ++
 kernel/events/ring_buffer.c                   |   9 ++
 tools/perf/Makefile.config                    |   4 -
 tools/perf/util/auxtrace.c                    |  19 ++-
 tools/perf/util/auxtrace.h                    | 109 ++++++++++++++----
 tools/perf/util/env.c                         |  17 ++-
 tools/perf/util/env.h                         |   1 +
 9 files changed, 136 insertions(+), 40 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 01/10] perf/ring_buffer: Add comment for barriers on AUX ring buffer
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-04  7:16 ` [PATCH v3 02/10] coresight: tmc-etr: Add barrier after updating " Leo Yan
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

AUX ring buffer applies almost the same barriers as perf ring buffer,
but there has an exception for ordering between writing the AUX trace
data and updating user_page::aux_head.

This patch adds comment for how to use the barriers on AUX ring buffer,
and gives comment to ask the drivers to flush the trace data into AUX
ring buffer prior to updating user_page::aux_head.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/ring_buffer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 52868716ec35..5cf6579be05e 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -509,6 +509,15 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size)
 		perf_event_aux_event(handle->event, aux_head, size,
 				     handle->aux_flags);
 
+	/*
+	 * See perf_output_put_handle(), AUX ring buffer applies the same
+	 * barrier pairing as the perf ring buffer; except for B, since
+	 * AUX ring buffer is written by hardware trace, we cannot simply
+	 * use the generic memory barrier (like smp_wmb()) prior to update
+	 * user_page::aux_head, the hardware trace driver takes the
+	 * responsibility to ensure the trace data has been flushed into
+	 * the AUX buffer before calling perf_aux_output_end().
+	 */
 	WRITE_ONCE(rb->user_page->aux_head, rb->aux_head);
 	if (rb_need_aux_wakeup(rb))
 		wakeup = true;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 02/10] coresight: tmc-etr: Add barrier after updating AUX ring buffer
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
  2021-07-04  7:16 ` [PATCH v3 01/10] perf/ring_buffer: Add comment for barriers on " Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-04  7:16 ` [PATCH v3 03/10] coresight: tmc-etf: Add comment for store ordering Leo Yan
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since a memory barrier is required between AUX trace data store and
aux_head store, and the AUX trace data is filled with memcpy(), it's
sufficient to use smp_wmb() so can ensure the trace data is visible
prior to updating aux_head.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index acdb59e0e661..713205db15a1 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 	 */
 	if (etr_perf->snapshot)
 		handle->head += size;
+
+	/*
+	 * It requires the ordering between the AUX trace data and aux_head
+	 * store, below smp_wmb() ensures the AUX trace data is visible prior
+	 * to updating aux_head.
+	 */
+	smp_wmb();
+
 out:
 	/*
 	 * Don't set the TRUNCATED flag in snapshot mode because 1) the
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 03/10] coresight: tmc-etf: Add comment for store ordering
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
  2021-07-04  7:16 ` [PATCH v3 01/10] perf/ring_buffer: Add comment for barriers on " Leo Yan
  2021-07-04  7:16 ` [PATCH v3 02/10] coresight: tmc-etr: Add barrier after updating " Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-04  7:16 ` [PATCH v3 04/10] perf/x86: Add barrier after updating bts Leo Yan
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

AUX ring buffer is required to separate the data store and aux_head
store, since the function CS_LOCK() has contained memory barrier mb(),
mb() is a more conservative barrier than smp_wmb() on Arm32/Arm64, thus
it's needless to add any explicit barrier anymore.

Add comment to make clear for the barrier usage for ETF.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 45b85edfc690..9a42ee689921 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -553,6 +553,12 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	if (buf->snapshot)
 		handle->head += to_read;
 
+	/*
+	 * AUX ring buffer requires to use memory barrier to separate the trace
+	 * data store and aux_head store, because CS_LOCK() contains mb() which
+	 * gives more heavy barrier than smp_wmb(), it's not necessary to
+	 * explicitly invoke any barrier.
+	 */
 	CS_LOCK(drvdata->base);
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 04/10] perf/x86: Add barrier after updating bts
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (2 preceding siblings ...)
  2021-07-04  7:16 ` [PATCH v3 03/10] coresight: tmc-etf: Add comment for store ordering Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-04  7:16 ` [PATCH v3 05/10] perf auxtrace: Drop legacy __sync functions Leo Yan
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Add barrier wmb() to separate the AUX data store and aux_head store.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/x86/events/intel/bts.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 6320d2cfd9d3..4a015d160bc5 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -209,6 +209,9 @@ static void bts_update(struct bts_ctx *bts)
 	} else {
 		local_set(&buf->data_size, head);
 	}
+
+	/* The WMB separates data store and aux_head store matches. */
+	wmb();
 }
 
 static int
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 05/10] perf auxtrace: Drop legacy __sync functions
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (3 preceding siblings ...)
  2021-07-04  7:16 ` [PATCH v3 04/10] perf/x86: Add barrier after updating bts Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-10 12:34   ` Adrian Hunter
  2021-07-04  7:16 ` [PATCH v3 06/10] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() Leo Yan
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

The main purpose for using __sync built-in functions is to support
compat mode for 32-bit perf with 64-bit kernel.  But using these
built-in functions might cause couple potential issues.

Firstly, __sync functions originally support Intel Itanium processoer [1]
but it cannot promise to support all 32-bit archs.  Now these
functions have become the legacy functions.

As Peter also pointed out the logic issue in the function
auxtrace_mmap__write_tail(), it does a cmpxchg with 0 values to load
old_tail, and then executes a further cmpxchg with old_tail to write
the new tail.  If consider the aux_tail might be assigned to '0' in the
middle of loops, this can introduce mess for AUX buffer if the kernel
fetches the temporary value '0'.

Considering __sync functions cannot really fix the 64-bit value
atomicity on 32-bit archs, thus this patch drops __sync functions.

Credits to Peter for detailed analysis.

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.h | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index cc1c1b9cec9c..f489ca159997 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -440,12 +440,6 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
-/*
- * In snapshot mode the mmapped page is read-only which makes using
- * __sync_val_compare_and_swap() problematic.  However, snapshot mode expects
- * the buffer is not updated while the snapshot is made (e.g. Intel PT disables
- * the event) so there is not a race anyway.
- */
 static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
@@ -459,11 +453,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	u64 head = READ_ONCE(pc->aux_head);
-#else
-	u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0);
-#endif
 
 	/* Ensure all reads are done after we read the head */
 	smp_rmb();
@@ -473,19 +463,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
-	u64 old_tail;
-#endif
 
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
-#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	pc->aux_tail = tail;
-#else
-	do {
-		old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
-	} while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail));
-#endif
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 06/10] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (4 preceding siblings ...)
  2021-07-04  7:16 ` [PATCH v3 05/10] perf auxtrace: Drop legacy __sync functions Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-04  7:16 ` [PATCH v3 07/10] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT Leo Yan
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since the function auxtrace_mmap__read_snapshot_head() is exactly same
with auxtrace_mmap__read_head(), whether the session is in snapshot mode
or not, it's unified to use function auxtrace_mmap__read_head() for
reading AUX buffer head.

And the function auxtrace_mmap__read_snapshot_head() is unused so this
patch removes it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.c |  5 ++---
 tools/perf/util/auxtrace.h | 10 ----------
 2 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 9350eeb3a3fc..44bd04dbb6d6 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1686,13 +1686,12 @@ static int __auxtrace_mmap__read(struct mmap *map,
 	union perf_event ev;
 	void *data1, *data2;
 
+	head = auxtrace_mmap__read_head(mm);
+
 	if (snapshot) {
-		head = auxtrace_mmap__read_snapshot_head(mm);
 		if (auxtrace_record__find_snapshot(itr, mm->idx, mm, data,
 						   &head, &old))
 			return -1;
-	} else {
-		head = auxtrace_mmap__read_head(mm);
 	}
 
 	if (old == head)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index f489ca159997..1a2f42980e3f 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -440,16 +440,6 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
-static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
-{
-	struct perf_event_mmap_page *pc = mm->userpg;
-	u64 head = READ_ONCE(pc->aux_head);
-
-	/* Ensure all reads are done after we read the head */
-	smp_rmb();
-	return head;
-}
-
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 07/10] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (5 preceding siblings ...)
  2021-07-04  7:16 ` [PATCH v3 06/10] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-10 12:36   ` Adrian Hunter
  2021-07-04  7:16 ` [PATCH v3 08/10] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Since the __sync functions have been dropped, This patch removes unused
build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.

Note, there have a test for SYNC_COMPARE_AND_SWAP and the test file is
located in build/feature/test-sync-compare-and-swap.c.  Since there
still has several components using the sync functions, it's deliberately
to not be removed.

  $ cd linux/tools
  $ git grep __sync_val_compare_and_swap | awk '{ printf $1"\n" }'
  build/feature/test-sync-compare-and-swap.c:
  include/asm-generic/atomic-gcc.h:
  testing/selftests/bpf/progs/atomics.c:
  testing/selftests/bpf/progs/atomics.c:
  testing/selftests/bpf/progs/atomics.c:
  testing/selftests/bpf/progs/atomics.c:
  testing/selftests/futex/include/atomic.h:
  testing/selftests/futex/include/futextest.h:

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/Makefile.config | 4 ----
 tools/perf/util/auxtrace.c | 5 -----
 2 files changed, 9 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index eb8e487ef90b..4a0d9a6defc7 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -349,10 +349,6 @@ CXXFLAGS += $(INC_FLAGS)
 
 LIBPERF_CFLAGS := $(CORE_CFLAGS) $(EXTRA_CFLAGS)
 
-ifeq ($(feature-sync-compare-and-swap), 1)
-  CFLAGS += -DHAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
-endif
-
 ifeq ($(feature-pthread-attr-setaffinity-np), 1)
   CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP
 endif
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 44bd04dbb6d6..db6255b55c90 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -130,11 +130,6 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
 		return 0;
 	}
 
-#if BITS_PER_LONG != 64 && !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
-	pr_err("Cannot use AUX area tracing mmaps\n");
-	return -1;
-#endif
-
 	pc->aux_offset = mp->offset;
 	pc->aux_size = mp->len;
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 08/10] perf auxtrace: Use WRITE_ONCE() for updating aux_tail
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (6 preceding siblings ...)
  2021-07-04  7:16 ` [PATCH v3 07/10] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-04  7:16 ` [PATCH v3 09/10] perf env: Set kernel bit mode Leo Yan
  2021-07-04  7:16 ` [PATCH v3 10/10] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
behaviour.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/perf/util/auxtrace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 1a2f42980e3f..d68a5e80b217 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -456,7 +456,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
-	pc->aux_tail = tail;
+	WRITE_ONCE(pc->aux_tail, tail);
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 09/10] perf env: Set kernel bit mode
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (7 preceding siblings ...)
  2021-07-04  7:16 ` [PATCH v3 08/10] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  2021-07-04  7:16 ` [PATCH v3 10/10] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

It's useful to know that the kernel is running in 32-bit or 64-bit
mode.  E.g. We can decide perf tool is running in compat mode when
detects kernel is running in 64-bit mode and the tool is in 32-bit
mode with the compiler macro BITS_PER_LONG is 32.

This patch adds a global variable "kernel_is_64_bit", during the
environment initialization for the session, the kernel running mode
is decided by checking the architecture string.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/env.c | 17 ++++++++++++++++-
 tools/perf/util/env.h |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index ebc5e9ad35db..345635a2e842 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -11,6 +11,7 @@
 #include <stdlib.h>
 #include <string.h>
 
+int kernel_is_64_bit;
 struct perf_env perf_env;
 
 #ifdef HAVE_LIBBPF_SUPPORT
@@ -172,6 +173,19 @@ static void perf_env__purge_bpf(struct perf_env *env __maybe_unused)
 }
 #endif // HAVE_LIBBPF_SUPPORT
 
+static void perf_env__init_kernel_mode(struct perf_env *env)
+{
+	const char *arch = perf_env__raw_arch(env);
+
+	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
+	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
+	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
+	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
+		kernel_is_64_bit = 1;
+	else
+		kernel_is_64_bit = 0;
+}
+
 void perf_env__exit(struct perf_env *env)
 {
 	int i;
@@ -217,13 +231,14 @@ void perf_env__exit(struct perf_env *env)
 	zfree(&env->hybrid_cpc_nodes);
 }
 
-void perf_env__init(struct perf_env *env __maybe_unused)
+void perf_env__init(struct perf_env *env)
 {
 #ifdef HAVE_LIBBPF_SUPPORT
 	env->bpf_progs.infos = RB_ROOT;
 	env->bpf_progs.btfs = RB_ROOT;
 	init_rwsem(&env->bpf_progs.lock);
 #endif
+	perf_env__init_kernel_mode(env);
 }
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 6824a7423a2d..cc989ff49740 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -139,6 +139,7 @@ enum perf_compress_type {
 struct bpf_prog_info_node;
 struct btf_node;
 
+extern int kernel_is_64_bit;
 extern struct perf_env perf_env;
 
 void perf_env__exit(struct perf_env *env);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 10/10] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
  2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
                   ` (8 preceding siblings ...)
  2021-07-04  7:16 ` [PATCH v3 09/10] perf env: Set kernel bit mode Leo Yan
@ 2021-07-04  7:16 ` Leo Yan
  9 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-04  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Adrian Hunter,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86,
	H. Peter Anvin, Mathieu Poirier, Suzuki K Poulose, Mike Leach,
	linux-perf-users, linux-kernel, coresight, linux-arm-kernel
  Cc: Leo Yan

When perf runs in compat mode (kernel in 64-bit mode and the perf is in
32-bit mode), the 64-bit value atomicity in the user space cannot be
assured, E.g. on some architectures, the 64-bit value accessing is split
into two instructions, one is the low 32-bit accessing and another is
for the high 32-bit.

This patch introduces two functions compat_auxtrace_mmap__read_head()
and compat_auxtrace_mmap__write_tail(), as their naming indicates, when
perf tool works in compat mode, it uses these two functions to access
the AUX head and tail.  These two functions can allow the perf tool to
work properly in certain conditions, e.g. when perf tool works in
snapshot mode with only using AUX head pointer, or perf tool uses the
AUX buffer and the tail is not bigger than 4GB.

When the perf tool cannot handle the case when the AUX tail is overflow
for 4GB, the function compat_auxtrace_mmap__write_tail() returns -1 and
the caller is changed to bail out for the overflow error.

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/auxtrace.c |  9 ++--
 tools/perf/util/auxtrace.h | 94 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 98 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index db6255b55c90..a430d74b279f 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1766,10 +1766,13 @@ static int __auxtrace_mmap__read(struct mmap *map,
 	mm->prev = head;
 
 	if (!snapshot) {
-		auxtrace_mmap__write_tail(mm, head);
-		if (itr->read_finish) {
-			int err;
+		int err;
 
+		err = auxtrace_mmap__write_tail(mm, head);
+		if (err < 0)
+			return err;
+
+		if (itr->read_finish) {
 			err = itr->read_finish(itr, mm->idx);
 			if (err < 0)
 				return err;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index d68a5e80b217..66de7b6e65ec 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -18,6 +18,8 @@
 #include <asm/bitsperlong.h>
 #include <asm/barrier.h>
 
+#include "env.h"
+
 union perf_event;
 struct perf_session;
 struct evlist;
@@ -440,23 +442,111 @@ struct auxtrace_cache;
 
 #ifdef HAVE_AUXTRACE_SUPPORT
 
+/*
+ * In the compat mode kernel runs in 64-bit and perf tool runs in 32-bit mode,
+ * 32-bit perf tool cannot access 64-bit value atomically, which might lead to
+ * the issues caused by the below sequence on multiple CPUs: when perf tool
+ * accesses either the load operation or the store operation for 64-bit value,
+ * on some architectures the operation is divided into two instructions, one
+ * is for accessing the low 32-bit value and another is for the high 32-bit;
+ * thus these two user operations can give the kernel chances to access the
+ * 64-bit value, and thus leads to the unexpected load values.
+ *
+ *   kernel (64-bit)                        user (32-bit)
+ *
+ *   if (LOAD ->aux_tail) { --,             LOAD ->aux_head_lo
+ *       STORE $aux_data      |       ,--->
+ *       FLUSH $aux_data      |       |     LOAD ->aux_head_hi
+ *       STORE ->aux_head   --|-------`     smp_rmb()
+ *   }                        |             LOAD $data
+ *                            |             smp_mb()
+ *                            |             STORE ->aux_tail_lo
+ *                            `----------->
+ *                                          STORE ->aux_tail_hi
+ *
+ * For this reason, it's impossible for the perf tool to work correctly when
+ * the AUX head or tail is bigger than 4GB (more than 32 bits length); and we
+ * can not simply limit the AUX ring buffer to less than 4GB, the reason is
+ * the pointers can be increased monotonically (e.g in snapshot mode), whatever
+ * the buffer size it is, at the end the head and tail can be bigger than 4GB
+ * and carry out to the high 32-bit.
+ *
+ * To mitigate the issues and improve the user experience, we can allow the
+ * perf tool working in certain conditions and bail out with error if detect
+ * any overflow cannot be handled.
+ *
+ * For reading the AUX head, it reads out the values for three times, and
+ * compares the high 4 bytes of the values between the first time and the last
+ * time, if there has no change for high 4 bytes injected by the kernel during
+ * the user reading sequence, it's safe for use the second value.
+ *
+ * When update the AUX tail and detects any carrying in the high 32 bits, it
+ * means there have two store operations in user space and it cannot promise
+ * the atomicity for 64-bit write, so return '-1' in this case to tell the
+ * caller an overflow error has happened.
+ */
+static inline u64 compat_auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
+{
+	struct perf_event_mmap_page *pc = mm->userpg;
+	u64 first, second, last;
+	u64 mask = (u64)(UINT32_MAX) << 32;
+
+	do {
+		first = READ_ONCE(pc->aux_head);
+		/* Ensure all reads are done after we read the head */
+		smp_rmb();
+		second = READ_ONCE(pc->aux_head);
+		/* Ensure all reads are done after we read the head */
+		smp_rmb();
+		last = READ_ONCE(pc->aux_head);
+	} while ((first & mask) != (last & mask));
+
+	return second;
+}
+
+static inline int compat_auxtrace_mmap__write_tail(struct auxtrace_mmap *mm,
+						    u64 tail)
+{
+	struct perf_event_mmap_page *pc = mm->userpg;
+	u64 mask = (u64)(UINT32_MAX) << 32;
+
+	if (tail & mask)
+		return -1;
+
+	/* Ensure all reads are done before we write the tail out */
+	smp_mb();
+	WRITE_ONCE(pc->aux_tail, tail);
+	return 0;
+}
+
 static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
-	u64 head = READ_ONCE(pc->aux_head);
+	u64 head;
+
+#if BITS_PER_LONG == 32
+	if (kernel_is_64_bit)
+		return compat_auxtrace_mmap__read_head(mm);
+#endif
+	head = READ_ONCE(pc->aux_head);
 
 	/* Ensure all reads are done after we read the head */
 	smp_rmb();
 	return head;
 }
 
-static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
+static inline int auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 {
 	struct perf_event_mmap_page *pc = mm->userpg;
 
+#if BITS_PER_LONG == 32
+	if (kernel_is_64_bit)
+		return compat_auxtrace_mmap__write_tail(mm, tail);
+#endif
 	/* Ensure all reads are done before we write the tail out */
 	smp_mb();
 	WRITE_ONCE(pc->aux_tail, tail);
+	return 0;
 }
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 05/10] perf auxtrace: Drop legacy __sync functions
  2021-07-04  7:16 ` [PATCH v3 05/10] perf auxtrace: Drop legacy __sync functions Leo Yan
@ 2021-07-10 12:34   ` Adrian Hunter
  2021-07-11  7:41     ` Leo Yan
  0 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2021-07-10 12:34 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 4/07/21 10:16 am, Leo Yan wrote:
> The main purpose for using __sync built-in functions is to support
> compat mode for 32-bit perf with 64-bit kernel.  But using these
> built-in functions might cause couple potential issues.
> 
> Firstly, __sync functions originally support Intel Itanium processoer [1]
> but it cannot promise to support all 32-bit archs.  Now these
> functions have become the legacy functions.
> 
> As Peter also pointed out the logic issue in the function
> auxtrace_mmap__write_tail(), it does a cmpxchg with 0 values to load
> old_tail, and then executes a further cmpxchg with old_tail to write
> the new tail.  If consider the aux_tail might be assigned to '0' in the
> middle of loops, this can introduce mess for AUX buffer if the kernel
> fetches the temporary value '0'.

That is not exactly true. The definition of __sync_*_compare_and_swap is
"if the current value of *ptr is oldval, then write newval into *pt"
so replacing zero with zero won't make any difference, but it will return
the old value in any case.  Probably better to leave out that paragraph.

> 
> Considering __sync functions cannot really fix the 64-bit value
> atomicity on 32-bit archs, thus this patch drops __sync functions.
> 
> Credits to Peter for detailed analysis.
> 
> [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins
> 
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/auxtrace.h | 19 -------------------
>  1 file changed, 19 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index cc1c1b9cec9c..f489ca159997 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -440,12 +440,6 @@ struct auxtrace_cache;
>  
>  #ifdef HAVE_AUXTRACE_SUPPORT
>  
> -/*
> - * In snapshot mode the mmapped page is read-only which makes using
> - * __sync_val_compare_and_swap() problematic.  However, snapshot mode expects
> - * the buffer is not updated while the snapshot is made (e.g. Intel PT disables
> - * the event) so there is not a race anyway.
> - */
>  static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> @@ -459,11 +453,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
>  static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
>  	u64 head = READ_ONCE(pc->aux_head);
> -#else
> -	u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0);
> -#endif
>  
>  	/* Ensure all reads are done after we read the head */
>  	smp_rmb();
> @@ -473,19 +463,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>  static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
>  {
>  	struct perf_event_mmap_page *pc = mm->userpg;
> -#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
> -	u64 old_tail;
> -#endif
>  
>  	/* Ensure all reads are done before we write the tail out */
>  	smp_mb();
> -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
>  	pc->aux_tail = tail;
> -#else
> -	do {
> -		old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
> -	} while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail));
> -#endif
>  }
>  
>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 07/10] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
  2021-07-04  7:16 ` [PATCH v3 07/10] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT Leo Yan
@ 2021-07-10 12:36   ` Adrian Hunter
  2021-07-11  8:08     ` Leo Yan
  0 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2021-07-10 12:36 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On 4/07/21 10:16 am, Leo Yan wrote:
> Since the __sync functions have been dropped, This patch removes unused
> build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.
> 
> Note, there have a test for SYNC_COMPARE_AND_SWAP and the test file is
> located in build/feature/test-sync-compare-and-swap.c.  Since there
> still has several components using the sync functions, it's deliberately
> to not be removed.

I don't quite follow that.  If they aren't using the feature test
macro, then why keep the feature test?

> 
>   $ cd linux/tools
>   $ git grep __sync_val_compare_and_swap | awk '{ printf $1"\n" }'
>   build/feature/test-sync-compare-and-swap.c:
>   include/asm-generic/atomic-gcc.h:
>   testing/selftests/bpf/progs/atomics.c:
>   testing/selftests/bpf/progs/atomics.c:
>   testing/selftests/bpf/progs/atomics.c:
>   testing/selftests/bpf/progs/atomics.c:
>   testing/selftests/futex/include/atomic.h:
>   testing/selftests/futex/include/futextest.h:
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/Makefile.config | 4 ----
>  tools/perf/util/auxtrace.c | 5 -----
>  2 files changed, 9 deletions(-)
> 
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index eb8e487ef90b..4a0d9a6defc7 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -349,10 +349,6 @@ CXXFLAGS += $(INC_FLAGS)
>  
>  LIBPERF_CFLAGS := $(CORE_CFLAGS) $(EXTRA_CFLAGS)
>  
> -ifeq ($(feature-sync-compare-and-swap), 1)
> -  CFLAGS += -DHAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
> -endif
> -
>  ifeq ($(feature-pthread-attr-setaffinity-np), 1)
>    CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP
>  endif
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 44bd04dbb6d6..db6255b55c90 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -130,11 +130,6 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
>  		return 0;
>  	}
>  
> -#if BITS_PER_LONG != 64 && !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
> -	pr_err("Cannot use AUX area tracing mmaps\n");
> -	return -1;
> -#endif
> -
>  	pc->aux_offset = mp->offset;
>  	pc->aux_size = mp->len;
>  
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 05/10] perf auxtrace: Drop legacy __sync functions
  2021-07-10 12:34   ` Adrian Hunter
@ 2021-07-11  7:41     ` Leo Yan
  0 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-11  7:41 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

On Sat, Jul 10, 2021 at 03:34:24PM +0300, Adrian Hunter wrote:
> On 4/07/21 10:16 am, Leo Yan wrote:
> > The main purpose for using __sync built-in functions is to support
> > compat mode for 32-bit perf with 64-bit kernel.  But using these
> > built-in functions might cause couple potential issues.
> > 
> > Firstly, __sync functions originally support Intel Itanium processoer [1]
> > but it cannot promise to support all 32-bit archs.  Now these
> > functions have become the legacy functions.
> > 
> > As Peter also pointed out the logic issue in the function
> > auxtrace_mmap__write_tail(), it does a cmpxchg with 0 values to load
> > old_tail, and then executes a further cmpxchg with old_tail to write
> > the new tail.  If consider the aux_tail might be assigned to '0' in the
> > middle of loops, this can introduce mess for AUX buffer if the kernel
> > fetches the temporary value '0'.
> 
> That is not exactly true. The definition of __sync_*_compare_and_swap is
> "if the current value of *ptr is oldval, then write newval into *pt"
> so replacing zero with zero won't make any difference, but it will return
> the old value in any case.  Probably better to leave out that paragraph.

Okay, I admit the paragraph is not right, will drop it to avoid
confusion.  Thanks for review!

Leo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 07/10] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
  2021-07-10 12:36   ` Adrian Hunter
@ 2021-07-11  8:08     ` Leo Yan
  0 siblings, 0 replies; 15+ messages in thread
From: Leo Yan @ 2021-07-11  8:08 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel

Hi Adrian,

On Sat, Jul 10, 2021 at 03:36:53PM +0300, Adrian Hunter wrote:
> On 4/07/21 10:16 am, Leo Yan wrote:
> > Since the __sync functions have been dropped, This patch removes unused
> > build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.
> > 
> > Note, there have a test for SYNC_COMPARE_AND_SWAP and the test file is
> > located in build/feature/test-sync-compare-and-swap.c.  Since there
> > still has several components using the sync functions, it's deliberately
> > to not be removed.
> 
> I don't quite follow that.  If they aren't using the feature test
> macro, then why keep the feature test?

There are files are still using __sync_xxx_compare_and_swap() functions,
e.g. in the folder tools/testing/selftests/bpf.  On the other hand,
after drop __sync functions from perf, there have no any Makefile check
the feature 'feature-sync-compare-and-swap'.  So it's safe to remove the
feature test.

Sorry for confusion.  Will drop the feature test in new patch set.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-07-11  8:08 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-04  7:16 [PATCH v3 00/10] perf: Refine barriers for AUX ring buffer Leo Yan
2021-07-04  7:16 ` [PATCH v3 01/10] perf/ring_buffer: Add comment for barriers on " Leo Yan
2021-07-04  7:16 ` [PATCH v3 02/10] coresight: tmc-etr: Add barrier after updating " Leo Yan
2021-07-04  7:16 ` [PATCH v3 03/10] coresight: tmc-etf: Add comment for store ordering Leo Yan
2021-07-04  7:16 ` [PATCH v3 04/10] perf/x86: Add barrier after updating bts Leo Yan
2021-07-04  7:16 ` [PATCH v3 05/10] perf auxtrace: Drop legacy __sync functions Leo Yan
2021-07-10 12:34   ` Adrian Hunter
2021-07-11  7:41     ` Leo Yan
2021-07-04  7:16 ` [PATCH v3 06/10] perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() Leo Yan
2021-07-04  7:16 ` [PATCH v3 07/10] perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT Leo Yan
2021-07-10 12:36   ` Adrian Hunter
2021-07-11  8:08     ` Leo Yan
2021-07-04  7:16 ` [PATCH v3 08/10] perf auxtrace: Use WRITE_ONCE() for updating aux_tail Leo Yan
2021-07-04  7:16 ` [PATCH v3 09/10] perf env: Set kernel bit mode Leo Yan
2021-07-04  7:16 ` [PATCH v3 10/10] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail} Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).