linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] Generic IPI sending tracepoint
@ 2022-10-07 15:41 Valentin Schneider
  2022-10-07 15:45 ` [RFC PATCH 1/5] trace: Add trace_ipi_send_{cpu, cpumask} Valentin Schneider
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Valentin Schneider @ 2022-10-07 15:41 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86
  Cc: Paul E. McKenney, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Marcelo Tosatti, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Guo Ren, David S. Miller

Background
==========

Detecting IPI *reception* is relatively easy, e.g. using
trace_irq_handler_{entry,exit} or even just function-trace
flush_smp_call_function_queue() for SMP calls.  

Figuring out their *origin*, is trickier as there is no generic tracepoint tied
to e.g. smp_call_function():

o AFAIA x86 has no tracepoint tied to sending IPIs, only receiving them
  (cf. trace_call_function{_single}_entry()).
o arm/arm64 do have trace_ipi_raise(), which gives us the target cpus but also a
  mostly useless string (smp_calls will all be "Function call interrupts").
o Other architectures don't seem to have any IPI-sending related tracepoint.  

I believe one reason those tracepoints used by arm/arm64 ended up as they were
is because these archs used to handle IPIs differently from regular interrupts
(the IRQ driver would directly invoke an IPI-handling routine), which meant they 
never showed up in trace_irq_handler_{entry, exit}. The trace_ipi_{entry,exit}
tracepoints gave a way to trace IPI reception but those have become redundant as
of: 

      56afcd3dbd19 ("ARM: Allow IPIs to be handled as normal interrupts")
      d3afc7f12987 ("arm64: Allow IPIs to be handled as normal interrupts")

which gave IPIs a "proper" handler function used through
generic_handle_domain_irq(), which makes them show up via
trace_irq_handler_{entry, exit}.

Changing stuff up
=================

Per the above, it would make sense to reshuffle trace_ipi_raise() and move it
into generic code. This also came up during Daniel's talk on Osnoise at the CPU
isolation MC of LPC 2022 [1]. 

Now, to be useful, such a tracepoint needs to export:
o targeted CPU(s)
o calling context

The only way to get the calling context with trace_ipi_raise() is to trigger a
stack dump, e.g. $(trace-cmd -e ipi* -T echo 42).

As for the targeted CPUs, the existing tracepoint does export them, albeit in
cpumask form, which is quite inconvenient from a tooling perspective. For
instance, as far as I'm aware, it's not possible to do event filtering on a
cpumask via trace-cmd.

Because of the above points, this is introducing a new tracepoint.

Patches
=======

This results in having trace events for:

o smp_call_function*()
o smp_send_reschedule()
o irq_work_queue*()

This is incomplete, just looking at arm64 there's more IPI types that aren't covered:

  IPI_CPU_STOP,
  IPI_CPU_CRASH_STOP,
  IPI_TIMER,
  IPI_WAKEUP,

... But it feels like a good starting point.

Another thing worth mentioning is that depending on the callsite, the _RET_IP_
fed to the tracepoint is not always useful - generic_exec_single() doesn't tell
you much about the actual callback being sent via IPI, so there might be value
in exploding the single tracepoint into at least one variant for smp_calls.

Links
=====

[1]: https://youtu.be/5gT57y4OzBM?t=14234

Valentin Schneider (5):
  trace: Add trace_ipi_send_{cpu, cpumask}
  sched, smp: Trace send_call_function_single_ipi()
  smp: Add a multi-CPU variant to send_call_function_single_ipi()
  irq_work: Trace calls to arch_irq_work_raise()
  treewide: Rename and trace arch-definitions of smp_send_reschedule()

 arch/alpha/kernel/smp.c          |  2 +-
 arch/arc/kernel/smp.c            |  2 +-
 arch/arm/kernel/smp.c            |  5 +----
 arch/arm64/kernel/smp.c          |  3 +--
 arch/csky/kernel/smp.c           |  2 +-
 arch/hexagon/kernel/smp.c        |  2 +-
 arch/ia64/kernel/smp.c           |  4 ++--
 arch/loongarch/include/asm/smp.h |  2 +-
 arch/mips/include/asm/smp.h      |  2 +-
 arch/openrisc/kernel/smp.c       |  2 +-
 arch/parisc/kernel/smp.c         |  4 ++--
 arch/powerpc/kernel/smp.c        |  4 ++--
 arch/riscv/kernel/smp.c          |  4 ++--
 arch/s390/kernel/smp.c           |  2 +-
 arch/sh/kernel/smp.c             |  2 +-
 arch/sparc/kernel/smp_32.c       |  2 +-
 arch/sparc/kernel/smp_64.c       |  2 +-
 arch/x86/include/asm/smp.h       |  2 +-
 arch/xtensa/kernel/smp.c         |  2 +-
 include/linux/smp.h              |  1 +
 include/trace/events/ipi.h       | 27 +++++++++++++++++++++++++++
 kernel/irq_work.c                | 12 +++++++++++-
 kernel/sched/core.c              |  7 +++++--
 kernel/smp.c                     | 18 +++++++++++++++++-
 24 files changed, 84 insertions(+), 31 deletions(-)

--
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH 1/5] trace: Add trace_ipi_send_{cpu, cpumask}
  2022-10-07 15:41 [RFC PATCH 0/5] Generic IPI sending tracepoint Valentin Schneider
@ 2022-10-07 15:45 ` Valentin Schneider
  2022-10-07 15:45 ` [RFC PATCH 2/5] sched, smp: Trace send_call_function_single_ipi() Valentin Schneider
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Valentin Schneider @ 2022-10-07 15:45 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86
  Cc: Paul E. McKenney, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Marcelo Tosatti, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Guo Ren, David S. Miller

trace_ipi_raise is unsuitable for generically tracing IPI sources; add a
variant of it that takes a callsite and a CPU. Define a macro helper for
handling IPIs sent to multiple CPUs.

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 include/trace/events/ipi.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/include/trace/events/ipi.h b/include/trace/events/ipi.h
index 0be71dad6ec0..fd2f2aeb36fe 100644
--- a/include/trace/events/ipi.h
+++ b/include/trace/events/ipi.h
@@ -35,6 +35,33 @@ TRACE_EVENT(ipi_raise,
 	TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
 );
 
+TRACE_EVENT(ipi_send_cpu,
+
+	TP_PROTO(unsigned long callsite, unsigned int cpu),
+
+	TP_ARGS(callsite, cpu),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, callsite)
+		__field(unsigned int, cpu)
+	),
+
+	TP_fast_assign(
+		__entry->callsite = callsite;
+		__entry->cpu      = cpu;
+	),
+
+	TP_printk("callsite=%pS target_cpu=%d", (void *)__entry->callsite, __entry->cpu)
+);
+
+#define trace_ipi_send_cpumask(callsite, mask) do {		\
+	if (static_key_false(&__tracepoint_ipi_send_cpu.key)) { \
+		int cpu;					\
+		for_each_cpu(cpu, mask)				\
+			trace_ipi_send_cpu(callsite, cpu);	\
+	}							\
+} while (0)
+
 DECLARE_EVENT_CLASS(ipi_handler,
 
 	TP_PROTO(const char *reason),
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 2/5] sched, smp: Trace send_call_function_single_ipi()
  2022-10-07 15:41 [RFC PATCH 0/5] Generic IPI sending tracepoint Valentin Schneider
  2022-10-07 15:45 ` [RFC PATCH 1/5] trace: Add trace_ipi_send_{cpu, cpumask} Valentin Schneider
@ 2022-10-07 15:45 ` Valentin Schneider
  2022-10-07 15:45 ` [RFC PATCH 3/5] smp: Add a multi-CPU variant to send_call_function_single_ipi() Valentin Schneider
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Valentin Schneider @ 2022-10-07 15:45 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86
  Cc: Paul E. McKenney, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Marcelo Tosatti, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Guo Ren, David S. Miller

send_call_function_single_ipi() is the thing that sends IPIs at the bottom
of smp_call_function*() via either generic_exec_single() or
smp_call_function_many_cond(). Give it an IPI-related tracepoint.

Note that this ends up tracing any IPI sent via __smp_call_single_queue(),
which covers __ttwu_queue_wakelist() and irq_work_queue_on() "for free".

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 arch/arm/kernel/smp.c   | 3 ---
 arch/arm64/kernel/smp.c | 1 -
 kernel/sched/core.c     | 7 +++++--
 kernel/smp.c            | 4 ++++
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 978db2d96b44..3b280d55c1c4 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -48,9 +48,6 @@
 #include <asm/mach/arch.h>
 #include <asm/mpu.h>
 
-#define CREATE_TRACE_POINTS
-#include <trace/events/ipi.h>
-
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
  * so we need some other way of telling a new secondary core
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ffc5d76cf695..937d2623e06b 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -51,7 +51,6 @@
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
-#define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
 
 DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 60fdc0faf1c9..14e5e137172f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -81,6 +81,7 @@
 #include <linux/sched/rseq_api.h>
 #include <trace/events/sched.h>
 #undef CREATE_TRACE_POINTS
+#include <trace/events/ipi.h>
 
 #include "sched.h"
 #include "stats.h"
@@ -3753,10 +3754,12 @@ void send_call_function_single_ipi(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 
-	if (!set_nr_if_polling(rq->idle))
+	if (!set_nr_if_polling(rq->idle)) {
+		trace_ipi_send_cpu(_RET_IP_, cpu);
 		arch_send_call_function_single_ipi(cpu);
-	else
+	} else {
 		trace_sched_wake_idle_without_ipi(cpu);
+	}
 }
 
 /*
diff --git a/kernel/smp.c b/kernel/smp.c
index e8cdc025a046..7a7a22d69972 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -26,6 +26,10 @@
 #include <linux/sched/debug.h>
 #include <linux/jump_label.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/ipi.h>
+#undef CREATE_TRACE_POINTS
+
 #include "smpboot.h"
 #include "sched/smp.h"
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 3/5] smp: Add a multi-CPU variant to send_call_function_single_ipi()
  2022-10-07 15:41 [RFC PATCH 0/5] Generic IPI sending tracepoint Valentin Schneider
  2022-10-07 15:45 ` [RFC PATCH 1/5] trace: Add trace_ipi_send_{cpu, cpumask} Valentin Schneider
  2022-10-07 15:45 ` [RFC PATCH 2/5] sched, smp: Trace send_call_function_single_ipi() Valentin Schneider
@ 2022-10-07 15:45 ` Valentin Schneider
  2022-10-07 15:45 ` [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise() Valentin Schneider
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Valentin Schneider @ 2022-10-07 15:45 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86
  Cc: Paul E. McKenney, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Marcelo Tosatti, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Guo Ren, David S. Miller

This simply wraps around the arch function and prepends it with a
tracepoint, bringing parity with send_call_function_single_ipi().

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/smp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 7a7a22d69972..387735180aed 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -160,6 +160,12 @@ void __init call_function_init(void)
 	smpcfd_prepare_cpu(smp_processor_id());
 }
 
+static inline void send_call_function_ipi_mask(const struct cpumask *mask)
+{
+	trace_ipi_send_cpumask(_RET_IP_, mask);
+	arch_send_call_function_ipi_mask(mask);
+}
+
 #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
 
 static DEFINE_STATIC_KEY_FALSE(csdlock_debug_enabled);
@@ -970,7 +976,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		if (nr_cpus == 1)
 			send_call_function_single_ipi(last_cpu);
 		else if (likely(nr_cpus > 1))
-			arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
+			send_call_function_ipi_mask(cfd->cpumask_ipi);
 
 		cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->pinged, this_cpu, CFD_SEQ_NOCPU, CFD_SEQ_PINGED);
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise()
  2022-10-07 15:41 [RFC PATCH 0/5] Generic IPI sending tracepoint Valentin Schneider
                   ` (2 preceding siblings ...)
  2022-10-07 15:45 ` [RFC PATCH 3/5] smp: Add a multi-CPU variant to send_call_function_single_ipi() Valentin Schneider
@ 2022-10-07 15:45 ` Valentin Schneider
  2022-10-08 19:34   ` Steven Rostedt
  2022-10-07 15:45 ` [RFC PATCH 5/5] treewide: Rename and trace arch-definitions of smp_send_reschedule() Valentin Schneider
  2022-10-07 20:01 ` [RFC PATCH 0/5] Generic IPI sending tracepoint Marcelo Tosatti
  5 siblings, 1 reply; 16+ messages in thread
From: Valentin Schneider @ 2022-10-07 15:45 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86
  Cc: Paul E. McKenney, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Marcelo Tosatti, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Guo Ren, David S. Miller

Adding a tracepoint to send_call_function_single_ipi() covers
irq_work_queue_on() when the CPU isn't the local one - add a tracepoint to
irq_work_raise() to cover the other cases.

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/irq_work.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 7afa40fe5cc4..5a550b681878 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -22,6 +22,8 @@
 #include <asm/processor.h>
 #include <linux/kasan.h>
 
+#include <trace/events/ipi.h>
+
 static DEFINE_PER_CPU(struct llist_head, raised_list);
 static DEFINE_PER_CPU(struct llist_head, lazy_list);
 static DEFINE_PER_CPU(struct task_struct *, irq_workd);
@@ -74,6 +76,14 @@ void __weak arch_irq_work_raise(void)
 	 */
 }
 
+static inline void irq_work_raise(void)
+{
+	if (arch_irq_work_has_interrupt())
+		trace_ipi_send_cpu(_RET_IP_, smp_processor_id());
+
+	arch_irq_work_raise();
+}
+
 /* Enqueue on current CPU, work must already be claimed and preempt disabled */
 static void __irq_work_queue_local(struct irq_work *work)
 {
@@ -99,7 +109,7 @@ static void __irq_work_queue_local(struct irq_work *work)
 
 	/* If the work is "lazy", handle it from next tick if any */
 	if (!lazy_work || tick_nohz_tick_stopped())
-		arch_irq_work_raise();
+		irq_work_raise();
 }
 
 /* Enqueue the irq work @work on the current CPU */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 5/5] treewide: Rename and trace arch-definitions of smp_send_reschedule()
  2022-10-07 15:41 [RFC PATCH 0/5] Generic IPI sending tracepoint Valentin Schneider
                   ` (3 preceding siblings ...)
  2022-10-07 15:45 ` [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise() Valentin Schneider
@ 2022-10-07 15:45 ` Valentin Schneider
  2022-10-08  0:23   ` Guo Ren
  2022-10-07 20:01 ` [RFC PATCH 0/5] Generic IPI sending tracepoint Marcelo Tosatti
  5 siblings, 1 reply; 16+ messages in thread
From: Valentin Schneider @ 2022-10-07 15:45 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86
  Cc: Paul E. McKenney, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Marcelo Tosatti, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Guo Ren, David S. Miller

To be able to trace invocations of smp_send_reschedule(), rename the
arch-specific definitions of it to arch_smp_send_reschedule() and wrap it
into an smp_send_reschedule() that contains a tracepoint.

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 arch/alpha/kernel/smp.c          | 2 +-
 arch/arc/kernel/smp.c            | 2 +-
 arch/arm/kernel/smp.c            | 2 +-
 arch/arm64/kernel/smp.c          | 2 +-
 arch/csky/kernel/smp.c           | 2 +-
 arch/hexagon/kernel/smp.c        | 2 +-
 arch/ia64/kernel/smp.c           | 4 ++--
 arch/loongarch/include/asm/smp.h | 2 +-
 arch/mips/include/asm/smp.h      | 2 +-
 arch/openrisc/kernel/smp.c       | 2 +-
 arch/parisc/kernel/smp.c         | 4 ++--
 arch/powerpc/kernel/smp.c        | 4 ++--
 arch/riscv/kernel/smp.c          | 4 ++--
 arch/s390/kernel/smp.c           | 2 +-
 arch/sh/kernel/smp.c             | 2 +-
 arch/sparc/kernel/smp_32.c       | 2 +-
 arch/sparc/kernel/smp_64.c       | 2 +-
 arch/x86/include/asm/smp.h       | 2 +-
 arch/xtensa/kernel/smp.c         | 2 +-
 include/linux/smp.h              | 1 +
 kernel/smp.c                     | 6 ++++++
 21 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index f4e20f75438f..38637eb9eebd 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -562,7 +562,7 @@ handle_ipi(struct pt_regs *regs)
 }
 
 void
-smp_send_reschedule(int cpu)
+arch_smp_send_reschedule(int cpu)
 {
 #ifdef DEBUG_IPI_MSG
 	if (cpu == hard_smp_processor_id())
diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
index ab9e75e90f72..ae2e6a312361 100644
--- a/arch/arc/kernel/smp.c
+++ b/arch/arc/kernel/smp.c
@@ -292,7 +292,7 @@ static void ipi_send_msg(const struct cpumask *callmap, enum ipi_msg_type msg)
 		ipi_send_msg_one(cpu, msg);
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	ipi_send_msg_one(cpu, IPI_RESCHEDULE);
 }
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 3b280d55c1c4..f216ac890b6f 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -745,7 +745,7 @@ void __init set_smp_ipi_range(int ipi_base, int n)
 	ipi_setup(smp_processor_id());
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 937d2623e06b..8d108edc4a89 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -976,7 +976,7 @@ void __init set_smp_ipi_range(int ipi_base, int n)
 	ipi_setup(smp_processor_id());
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
diff --git a/arch/csky/kernel/smp.c b/arch/csky/kernel/smp.c
index 4b605aa2e1d6..fd7f81be16dd 100644
--- a/arch/csky/kernel/smp.c
+++ b/arch/csky/kernel/smp.c
@@ -140,7 +140,7 @@ void smp_send_stop(void)
 	on_each_cpu(ipi_stop, NULL, 1);
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
 }
diff --git a/arch/hexagon/kernel/smp.c b/arch/hexagon/kernel/smp.c
index 4ba93e59370c..4e8bee25b8c6 100644
--- a/arch/hexagon/kernel/smp.c
+++ b/arch/hexagon/kernel/smp.c
@@ -217,7 +217,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	}
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	send_ipi(cpumask_of(cpu), IPI_RESCHEDULE);
 }
diff --git a/arch/ia64/kernel/smp.c b/arch/ia64/kernel/smp.c
index e2cc59db86bc..ea4f009a232b 100644
--- a/arch/ia64/kernel/smp.c
+++ b/arch/ia64/kernel/smp.c
@@ -220,11 +220,11 @@ kdump_smp_send_init(void)
  * Called with preemption disabled.
  */
 void
-smp_send_reschedule (int cpu)
+arch_smp_send_reschedule (int cpu)
 {
 	ia64_send_ipi(cpu, IA64_IPI_RESCHEDULE, IA64_IPI_DM_INT, 0);
 }
-EXPORT_SYMBOL_GPL(smp_send_reschedule);
+EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
 
 /*
  * Called with preemption disabled.
diff --git a/arch/loongarch/include/asm/smp.h b/arch/loongarch/include/asm/smp.h
index 71189b28bfb2..3fcca134dfb1 100644
--- a/arch/loongarch/include/asm/smp.h
+++ b/arch/loongarch/include/asm/smp.h
@@ -83,7 +83,7 @@ extern void show_ipi_list(struct seq_file *p, int prec);
  * it goes straight through and wastes no time serializing
  * anything. Worst case is that we lose a reschedule ...
  */
-static inline void smp_send_reschedule(int cpu)
+static inline void arch_smp_send_reschedule(int cpu)
 {
 	loongson3_send_ipi_single(cpu, SMP_RESCHEDULE);
 }
diff --git a/arch/mips/include/asm/smp.h b/arch/mips/include/asm/smp.h
index 5d9ff61004ca..9806e79895d9 100644
--- a/arch/mips/include/asm/smp.h
+++ b/arch/mips/include/asm/smp.h
@@ -66,7 +66,7 @@ extern void calculate_cpu_foreign_map(void);
  * it goes straight through and wastes no time serializing
  * anything. Worst case is that we lose a reschedule ...
  */
-static inline void smp_send_reschedule(int cpu)
+static inline void arch_smp_send_reschedule(int cpu)
 {
 	extern const struct plat_smp_ops *mp_ops;	/* private */
 
diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
index e1419095a6f0..0a7a059e2dff 100644
--- a/arch/openrisc/kernel/smp.c
+++ b/arch/openrisc/kernel/smp.c
@@ -173,7 +173,7 @@ void handle_IPI(unsigned int ipi_msg)
 	}
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
diff --git a/arch/parisc/kernel/smp.c b/arch/parisc/kernel/smp.c
index 7dbd92cafae3..b7fc859fa87d 100644
--- a/arch/parisc/kernel/smp.c
+++ b/arch/parisc/kernel/smp.c
@@ -246,8 +246,8 @@ void kgdb_roundup_cpus(void)
 inline void 
 smp_send_stop(void)	{ send_IPI_allbutself(IPI_CPU_STOP); }
 
-void 
-smp_send_reschedule(int cpu) { send_IPI_single(cpu, IPI_RESCHEDULE); }
+void
+arch_smp_send_reschedule(int cpu) { send_IPI_single(cpu, IPI_RESCHEDULE); }
 
 void
 smp_send_all_nop(void)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 169703fead57..2d7b217392f2 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -364,12 +364,12 @@ static inline void do_message_pass(int cpu, int msg)
 #endif
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	if (likely(smp_ops))
 		do_message_pass(cpu, PPC_MSG_RESCHEDULE);
 }
-EXPORT_SYMBOL_GPL(smp_send_reschedule);
+EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
 
 void arch_send_call_function_single_ipi(int cpu)
 {
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 760a64518c58..213602e89a8b 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -235,8 +235,8 @@ void smp_send_stop(void)
 			   cpumask_pr_args(cpu_online_mask));
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	send_ipi_single(cpu, IPI_RESCHEDULE);
 }
-EXPORT_SYMBOL_GPL(smp_send_reschedule);
+EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 30c91d565933..9d1c36571106 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -542,7 +542,7 @@ void arch_send_call_function_single_ipi(int cpu)
  * it goes straight through and wastes no time serializing
  * anything. Worst case is that we lose a reschedule ...
  */
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	pcpu_ec_call(pcpu_devices + cpu, ec_schedule);
 }
diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index 65924d9ec245..5cf35a774dc7 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -256,7 +256,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	       (bogosum / (5000/HZ)) % 100);
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	mp_ops->send_ipi(cpu, SMP_MSG_RESCHEDULE);
 }
diff --git a/arch/sparc/kernel/smp_32.c b/arch/sparc/kernel/smp_32.c
index ad8094d955eb..87eaa7719fa2 100644
--- a/arch/sparc/kernel/smp_32.c
+++ b/arch/sparc/kernel/smp_32.c
@@ -120,7 +120,7 @@ void cpu_panic(void)
 
 struct linux_prom_registers smp_penguin_ctable = { 0 };
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	/*
 	 * CPU model dependent way of implementing IPI generation targeting
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index a55295d1b924..e5964d1d8b37 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1430,7 +1430,7 @@ static unsigned long send_cpu_poke(int cpu)
 	return hv_err;
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	if (cpu == smp_processor_id()) {
 		WARN_ON_ONCE(preemptible());
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index a73bced40e24..5ff5815149bd 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -99,7 +99,7 @@ static inline void play_dead(void)
 	smp_ops.play_dead();
 }
 
-static inline void smp_send_reschedule(int cpu)
+static inline void arch_smp_send_reschedule(int cpu)
 {
 	smp_ops.smp_send_reschedule(cpu);
 }
diff --git a/arch/xtensa/kernel/smp.c b/arch/xtensa/kernel/smp.c
index 4dc109dd6214..d95907b8e4d3 100644
--- a/arch/xtensa/kernel/smp.c
+++ b/arch/xtensa/kernel/smp.c
@@ -389,7 +389,7 @@ void arch_send_call_function_single_ipi(int cpu)
 	send_ipi_message(cpumask_of(cpu), IPI_CALL_FUNC);
 }
 
-void smp_send_reschedule(int cpu)
+void arch_smp_send_reschedule(int cpu)
 {
 	send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
 }
diff --git a/include/linux/smp.h b/include/linux/smp.h
index a80ab58ae3f1..a67e7aad17b9 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -125,6 +125,7 @@ extern void smp_send_stop(void);
 /*
  * sends a 'reschedule' event to another CPU:
  */
+extern void arch_smp_send_reschedule(int cpu);
 extern void smp_send_reschedule(int cpu);
 
 
diff --git a/kernel/smp.c b/kernel/smp.c
index 387735180aed..9dfe057424f8 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -166,6 +166,12 @@ static inline void send_call_function_ipi_mask(const struct cpumask *mask)
 	arch_send_call_function_ipi_mask(mask);
 }
 
+void smp_send_reschedule(int cpu)
+{
+	trace_ipi_send_cpu(_RET_IP_, cpu);
+	arch_smp_send_reschedule(cpu);
+}
+
 #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
 
 static DEFINE_STATIC_KEY_FALSE(csdlock_debug_enabled);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/5] Generic IPI sending tracepoint
  2022-10-07 15:41 [RFC PATCH 0/5] Generic IPI sending tracepoint Valentin Schneider
                   ` (4 preceding siblings ...)
  2022-10-07 15:45 ` [RFC PATCH 5/5] treewide: Rename and trace arch-definitions of smp_send_reschedule() Valentin Schneider
@ 2022-10-07 20:01 ` Marcelo Tosatti
  2022-10-08 19:40   ` Steven Rostedt
  2022-10-11 16:17   ` Valentin Schneider
  5 siblings, 2 replies; 16+ messages in thread
From: Marcelo Tosatti @ 2022-10-07 20:01 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86, Paul E. McKenney,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Frederic Weisbecker, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Marc Zyngier,
	Mark Rutland, Russell King, Nicholas Piggin, Guo Ren,
	David S. Miller

Hi Valentin,

On Fri, Oct 07, 2022 at 04:41:40PM +0100, Valentin Schneider wrote:
> Background
> ==========
> 
> Detecting IPI *reception* is relatively easy, e.g. using
> trace_irq_handler_{entry,exit} or even just function-trace
> flush_smp_call_function_queue() for SMP calls.  
> 
> Figuring out their *origin*, is trickier as there is no generic tracepoint tied
> to e.g. smp_call_function():
> 
> o AFAIA x86 has no tracepoint tied to sending IPIs, only receiving them
>   (cf. trace_call_function{_single}_entry()).
> o arm/arm64 do have trace_ipi_raise(), which gives us the target cpus but also a
>   mostly useless string (smp_calls will all be "Function call interrupts").
> o Other architectures don't seem to have any IPI-sending related tracepoint.  
> 
> I believe one reason those tracepoints used by arm/arm64 ended up as they were
> is because these archs used to handle IPIs differently from regular interrupts
> (the IRQ driver would directly invoke an IPI-handling routine), which meant they 
> never showed up in trace_irq_handler_{entry, exit}. The trace_ipi_{entry,exit}
> tracepoints gave a way to trace IPI reception but those have become redundant as
> of: 
> 
>       56afcd3dbd19 ("ARM: Allow IPIs to be handled as normal interrupts")
>       d3afc7f12987 ("arm64: Allow IPIs to be handled as normal interrupts")
> 
> which gave IPIs a "proper" handler function used through
> generic_handle_domain_irq(), which makes them show up via
> trace_irq_handler_{entry, exit}.
> 
> Changing stuff up
> =================
> 
> Per the above, it would make sense to reshuffle trace_ipi_raise() and move it
> into generic code. This also came up during Daniel's talk on Osnoise at the CPU
> isolation MC of LPC 2022 [1]. 
> 
> Now, to be useful, such a tracepoint needs to export:
> o targeted CPU(s)
> o calling context
> 
> The only way to get the calling context with trace_ipi_raise() is to trigger a
> stack dump, e.g. $(trace-cmd -e ipi* -T echo 42).
> 
> As for the targeted CPUs, the existing tracepoint does export them, albeit in
> cpumask form, which is quite inconvenient from a tooling perspective. For
> instance, as far as I'm aware, it's not possible to do event filtering on a
> cpumask via trace-cmd.

https://man7.org/linux/man-pages/man1/trace-cmd-set.1.html

       -f filter
           Specify a filter for the previous event. This must come after
           a -e. This will filter what events get recorded based on the
           content of the event. Filtering is passed to the kernel
           directly so what filtering is allowed may depend on what
           version of the kernel you have. Basically, it will let you
           use C notation to check if an event should be processed or
           not.

               ==, >=, <=, >, <, &, |, && and ||

           The above are usually safe to use to compare fields.

This looks overkill to me (consider large number of bits set in mask).

+#define trace_ipi_send_cpumask(callsite, mask) do {            \
+	if (static_key_false(&__tracepoint_ipi_send_cpu.key)) { \
+               int cpu;                                        \
+               for_each_cpu(cpu, mask)                         \
+                       trace_ipi_send_cpu(callsite, cpu);	\
+	}                                                       \
+} while (0)


> 
> Because of the above points, this is introducing a new tracepoint.
> 
> Patches
> =======
> 
> This results in having trace events for:
> 
> o smp_call_function*()
> o smp_send_reschedule()
> o irq_work_queue*()
> 
> This is incomplete, just looking at arm64 there's more IPI types that aren't covered:
> 
>   IPI_CPU_STOP,
>   IPI_CPU_CRASH_STOP,
>   IPI_TIMER,
>   IPI_WAKEUP,
> 
> ... But it feels like a good starting point.

Can't you have a single tracepoint (or variant with cpumask) that would
cover such cases as well?

Maybe (as parameters for tracepoint):

	* type (reschedule, smp_call_function, timer, wakeup, ...).

	* function address: valid for smp_call_function, irq_work_queue
	  types.

> Another thing worth mentioning is that depending on the callsite, the _RET_IP_
> fed to the tracepoint is not always useful - generic_exec_single() doesn't tell
> you much about the actual callback being sent via IPI, so there might be value
> in exploding the single tracepoint into at least one variant for smp_calls.

Not sure i grasp what you mean by "exploding the single tracepoint...",
but yes knowing the function or irq work function is very useful.

> 
> Links
> =====
> 
> [1]: https://youtu.be/5gT57y4OzBM?t=14234
> 
> Valentin Schneider (5):
>   trace: Add trace_ipi_send_{cpu, cpumask}
>   sched, smp: Trace send_call_function_single_ipi()
>   smp: Add a multi-CPU variant to send_call_function_single_ipi()
>   irq_work: Trace calls to arch_irq_work_raise()
>   treewide: Rename and trace arch-definitions of smp_send_reschedule()
> 
>  arch/alpha/kernel/smp.c          |  2 +-
>  arch/arc/kernel/smp.c            |  2 +-
>  arch/arm/kernel/smp.c            |  5 +----
>  arch/arm64/kernel/smp.c          |  3 +--
>  arch/csky/kernel/smp.c           |  2 +-
>  arch/hexagon/kernel/smp.c        |  2 +-
>  arch/ia64/kernel/smp.c           |  4 ++--
>  arch/loongarch/include/asm/smp.h |  2 +-
>  arch/mips/include/asm/smp.h      |  2 +-
>  arch/openrisc/kernel/smp.c       |  2 +-
>  arch/parisc/kernel/smp.c         |  4 ++--
>  arch/powerpc/kernel/smp.c        |  4 ++--
>  arch/riscv/kernel/smp.c          |  4 ++--
>  arch/s390/kernel/smp.c           |  2 +-
>  arch/sh/kernel/smp.c             |  2 +-
>  arch/sparc/kernel/smp_32.c       |  2 +-
>  arch/sparc/kernel/smp_64.c       |  2 +-
>  arch/x86/include/asm/smp.h       |  2 +-
>  arch/xtensa/kernel/smp.c         |  2 +-
>  include/linux/smp.h              |  1 +
>  include/trace/events/ipi.h       | 27 +++++++++++++++++++++++++++
>  kernel/irq_work.c                | 12 +++++++++++-
>  kernel/sched/core.c              |  7 +++++--
>  kernel/smp.c                     | 18 +++++++++++++++++-
>  24 files changed, 84 insertions(+), 31 deletions(-)
> 
> --
> 2.31.1
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 5/5] treewide: Rename and trace arch-definitions of smp_send_reschedule()
  2022-10-07 15:45 ` [RFC PATCH 5/5] treewide: Rename and trace arch-definitions of smp_send_reschedule() Valentin Schneider
@ 2022-10-08  0:23   ` Guo Ren
  0 siblings, 0 replies; 16+ messages in thread
From: Guo Ren @ 2022-10-08  0:23 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86, Paul E. McKenney,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Marcelo Tosatti, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	David S. Miller

On Fri, Oct 7, 2022 at 11:46 PM Valentin Schneider <vschneid@redhat.com> wrote:
>
> To be able to trace invocations of smp_send_reschedule(), rename the
> arch-specific definitions of it to arch_smp_send_reschedule() and wrap it
> into an smp_send_reschedule() that contains a tracepoint.
>
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
>  arch/alpha/kernel/smp.c          | 2 +-
>  arch/arc/kernel/smp.c            | 2 +-
>  arch/arm/kernel/smp.c            | 2 +-
>  arch/arm64/kernel/smp.c          | 2 +-
>  arch/csky/kernel/smp.c           | 2 +-
>  arch/hexagon/kernel/smp.c        | 2 +-
>  arch/ia64/kernel/smp.c           | 4 ++--
>  arch/loongarch/include/asm/smp.h | 2 +-
>  arch/mips/include/asm/smp.h      | 2 +-
>  arch/openrisc/kernel/smp.c       | 2 +-
>  arch/parisc/kernel/smp.c         | 4 ++--
>  arch/powerpc/kernel/smp.c        | 4 ++--
>  arch/riscv/kernel/smp.c          | 4 ++--
>  arch/s390/kernel/smp.c           | 2 +-
>  arch/sh/kernel/smp.c             | 2 +-
>  arch/sparc/kernel/smp_32.c       | 2 +-
>  arch/sparc/kernel/smp_64.c       | 2 +-
>  arch/x86/include/asm/smp.h       | 2 +-
>  arch/xtensa/kernel/smp.c         | 2 +-
>  include/linux/smp.h              | 1 +
>  kernel/smp.c                     | 6 ++++++
>  21 files changed, 30 insertions(+), 23 deletions(-)
>
> diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
> index f4e20f75438f..38637eb9eebd 100644
> --- a/arch/alpha/kernel/smp.c
> +++ b/arch/alpha/kernel/smp.c
> @@ -562,7 +562,7 @@ handle_ipi(struct pt_regs *regs)
>  }
>
>  void
> -smp_send_reschedule(int cpu)
> +arch_smp_send_reschedule(int cpu)
>  {
>  #ifdef DEBUG_IPI_MSG
>         if (cpu == hard_smp_processor_id())
> diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
> index ab9e75e90f72..ae2e6a312361 100644
> --- a/arch/arc/kernel/smp.c
> +++ b/arch/arc/kernel/smp.c
> @@ -292,7 +292,7 @@ static void ipi_send_msg(const struct cpumask *callmap, enum ipi_msg_type msg)
>                 ipi_send_msg_one(cpu, msg);
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         ipi_send_msg_one(cpu, IPI_RESCHEDULE);
>  }
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 3b280d55c1c4..f216ac890b6f 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -745,7 +745,7 @@ void __init set_smp_ipi_range(int ipi_base, int n)
>         ipi_setup(smp_processor_id());
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 937d2623e06b..8d108edc4a89 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -976,7 +976,7 @@ void __init set_smp_ipi_range(int ipi_base, int n)
>         ipi_setup(smp_processor_id());
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/csky/kernel/smp.c b/arch/csky/kernel/smp.c
> index 4b605aa2e1d6..fd7f81be16dd 100644
> --- a/arch/csky/kernel/smp.c
> +++ b/arch/csky/kernel/smp.c
> @@ -140,7 +140,7 @@ void smp_send_stop(void)
>         on_each_cpu(ipi_stop, NULL, 1);
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
For csky part, Acked-by: Guo Ren <guoren@kernel.org>

>  {
>         send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/hexagon/kernel/smp.c b/arch/hexagon/kernel/smp.c
> index 4ba93e59370c..4e8bee25b8c6 100644
> --- a/arch/hexagon/kernel/smp.c
> +++ b/arch/hexagon/kernel/smp.c
> @@ -217,7 +217,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>         }
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         send_ipi(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/ia64/kernel/smp.c b/arch/ia64/kernel/smp.c
> index e2cc59db86bc..ea4f009a232b 100644
> --- a/arch/ia64/kernel/smp.c
> +++ b/arch/ia64/kernel/smp.c
> @@ -220,11 +220,11 @@ kdump_smp_send_init(void)
>   * Called with preemption disabled.
>   */
>  void
> -smp_send_reschedule (int cpu)
> +arch_smp_send_reschedule (int cpu)
>  {
>         ia64_send_ipi(cpu, IA64_IPI_RESCHEDULE, IA64_IPI_DM_INT, 0);
>  }
> -EXPORT_SYMBOL_GPL(smp_send_reschedule);
> +EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
>
>  /*
>   * Called with preemption disabled.
> diff --git a/arch/loongarch/include/asm/smp.h b/arch/loongarch/include/asm/smp.h
> index 71189b28bfb2..3fcca134dfb1 100644
> --- a/arch/loongarch/include/asm/smp.h
> +++ b/arch/loongarch/include/asm/smp.h
> @@ -83,7 +83,7 @@ extern void show_ipi_list(struct seq_file *p, int prec);
>   * it goes straight through and wastes no time serializing
>   * anything. Worst case is that we lose a reschedule ...
>   */
> -static inline void smp_send_reschedule(int cpu)
> +static inline void arch_smp_send_reschedule(int cpu)
>  {
>         loongson3_send_ipi_single(cpu, SMP_RESCHEDULE);
>  }
> diff --git a/arch/mips/include/asm/smp.h b/arch/mips/include/asm/smp.h
> index 5d9ff61004ca..9806e79895d9 100644
> --- a/arch/mips/include/asm/smp.h
> +++ b/arch/mips/include/asm/smp.h
> @@ -66,7 +66,7 @@ extern void calculate_cpu_foreign_map(void);
>   * it goes straight through and wastes no time serializing
>   * anything. Worst case is that we lose a reschedule ...
>   */
> -static inline void smp_send_reschedule(int cpu)
> +static inline void arch_smp_send_reschedule(int cpu)
>  {
>         extern const struct plat_smp_ops *mp_ops;       /* private */
>
> diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
> index e1419095a6f0..0a7a059e2dff 100644
> --- a/arch/openrisc/kernel/smp.c
> +++ b/arch/openrisc/kernel/smp.c
> @@ -173,7 +173,7 @@ void handle_IPI(unsigned int ipi_msg)
>         }
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/parisc/kernel/smp.c b/arch/parisc/kernel/smp.c
> index 7dbd92cafae3..b7fc859fa87d 100644
> --- a/arch/parisc/kernel/smp.c
> +++ b/arch/parisc/kernel/smp.c
> @@ -246,8 +246,8 @@ void kgdb_roundup_cpus(void)
>  inline void
>  smp_send_stop(void)    { send_IPI_allbutself(IPI_CPU_STOP); }
>
> -void
> -smp_send_reschedule(int cpu) { send_IPI_single(cpu, IPI_RESCHEDULE); }
> +void
> +arch_smp_send_reschedule(int cpu) { send_IPI_single(cpu, IPI_RESCHEDULE); }
>
>  void
>  smp_send_all_nop(void)
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 169703fead57..2d7b217392f2 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -364,12 +364,12 @@ static inline void do_message_pass(int cpu, int msg)
>  #endif
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         if (likely(smp_ops))
>                 do_message_pass(cpu, PPC_MSG_RESCHEDULE);
>  }
> -EXPORT_SYMBOL_GPL(smp_send_reschedule);
> +EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
>
>  void arch_send_call_function_single_ipi(int cpu)
>  {
> diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
> index 760a64518c58..213602e89a8b 100644
> --- a/arch/riscv/kernel/smp.c
> +++ b/arch/riscv/kernel/smp.c
> @@ -235,8 +235,8 @@ void smp_send_stop(void)
>                            cpumask_pr_args(cpu_online_mask));
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         send_ipi_single(cpu, IPI_RESCHEDULE);
>  }
> -EXPORT_SYMBOL_GPL(smp_send_reschedule);
> +EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
> diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
> index 30c91d565933..9d1c36571106 100644
> --- a/arch/s390/kernel/smp.c
> +++ b/arch/s390/kernel/smp.c
> @@ -542,7 +542,7 @@ void arch_send_call_function_single_ipi(int cpu)
>   * it goes straight through and wastes no time serializing
>   * anything. Worst case is that we lose a reschedule ...
>   */
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         pcpu_ec_call(pcpu_devices + cpu, ec_schedule);
>  }
> diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
> index 65924d9ec245..5cf35a774dc7 100644
> --- a/arch/sh/kernel/smp.c
> +++ b/arch/sh/kernel/smp.c
> @@ -256,7 +256,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
>                (bogosum / (5000/HZ)) % 100);
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         mp_ops->send_ipi(cpu, SMP_MSG_RESCHEDULE);
>  }
> diff --git a/arch/sparc/kernel/smp_32.c b/arch/sparc/kernel/smp_32.c
> index ad8094d955eb..87eaa7719fa2 100644
> --- a/arch/sparc/kernel/smp_32.c
> +++ b/arch/sparc/kernel/smp_32.c
> @@ -120,7 +120,7 @@ void cpu_panic(void)
>
>  struct linux_prom_registers smp_penguin_ctable = { 0 };
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         /*
>          * CPU model dependent way of implementing IPI generation targeting
> diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
> index a55295d1b924..e5964d1d8b37 100644
> --- a/arch/sparc/kernel/smp_64.c
> +++ b/arch/sparc/kernel/smp_64.c
> @@ -1430,7 +1430,7 @@ static unsigned long send_cpu_poke(int cpu)
>         return hv_err;
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         if (cpu == smp_processor_id()) {
>                 WARN_ON_ONCE(preemptible());
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index a73bced40e24..5ff5815149bd 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -99,7 +99,7 @@ static inline void play_dead(void)
>         smp_ops.play_dead();
>  }
>
> -static inline void smp_send_reschedule(int cpu)
> +static inline void arch_smp_send_reschedule(int cpu)
>  {
>         smp_ops.smp_send_reschedule(cpu);
>  }
> diff --git a/arch/xtensa/kernel/smp.c b/arch/xtensa/kernel/smp.c
> index 4dc109dd6214..d95907b8e4d3 100644
> --- a/arch/xtensa/kernel/smp.c
> +++ b/arch/xtensa/kernel/smp.c
> @@ -389,7 +389,7 @@ void arch_send_call_function_single_ipi(int cpu)
>         send_ipi_message(cpumask_of(cpu), IPI_CALL_FUNC);
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
>         send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index a80ab58ae3f1..a67e7aad17b9 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -125,6 +125,7 @@ extern void smp_send_stop(void);
>  /*
>   * sends a 'reschedule' event to another CPU:
>   */
> +extern void arch_smp_send_reschedule(int cpu);
>  extern void smp_send_reschedule(int cpu);
>
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 387735180aed..9dfe057424f8 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -166,6 +166,12 @@ static inline void send_call_function_ipi_mask(const struct cpumask *mask)
>         arch_send_call_function_ipi_mask(mask);
>  }
>
> +void smp_send_reschedule(int cpu)
> +{
> +       trace_ipi_send_cpu(_RET_IP_, cpu);
> +       arch_smp_send_reschedule(cpu);
> +}
> +
>  #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
>
>  static DEFINE_STATIC_KEY_FALSE(csdlock_debug_enabled);
> --
> 2.31.1
>


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise()
  2022-10-07 15:45 ` [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise() Valentin Schneider
@ 2022-10-08 19:34   ` Steven Rostedt
  2022-10-11 15:02     ` Valentin Schneider
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2022-10-08 19:34 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86, Paul E. McKenney,
	Peter Zijlstra, Thomas Gleixner, Sebastian Andrzej Siewior,
	Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
	Frederic Weisbecker, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Marc Zyngier, Mark Rutland, Russell King,
	Nicholas Piggin, Guo Ren, David S. Miller

On Fri,  7 Oct 2022 16:45:32 +0100
Valentin Schneider <vschneid@redhat.com> wrote:
>  }
>  
> +static inline void irq_work_raise(void)
> +{
> +	if (arch_irq_work_has_interrupt())
> +		trace_ipi_send_cpu(_RET_IP_, smp_processor_id());

To save on the branch, let's make the above:

	if (trace_ipi_send_cpu_enabled() && arch_irq_work_has_interrupt())

As the "trace_*_enabled()" is a static branch that will make it a nop
when the tracepoint is not enabled.

-- Steve


> +
> +	arch_irq_work_raise();
> +}
> +
>  /* Enqueue on current CPU, work must already be claimed and preempt disabled */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/5] Generic IPI sending tracepoint
  2022-10-07 20:01 ` [RFC PATCH 0/5] Generic IPI sending tracepoint Marcelo Tosatti
@ 2022-10-08 19:40   ` Steven Rostedt
  2022-10-11 16:17   ` Valentin Schneider
  1 sibling, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2022-10-08 19:40 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Valentin Schneider, linux-alpha, linux-kernel, linux-snps-arc,
	linux-arm-kernel, linux-csky, linux-hexagon, linux-ia64,
	loongarch, linux-mips, openrisc, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, sparclinux, linux-xtensa, x86,
	Paul E. McKenney, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Frederic Weisbecker, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Marc Zyngier,
	Mark Rutland, Russell King, Nicholas Piggin, Guo Ren,
	David S. Miller

On Fri, 7 Oct 2022 17:01:33 -0300
Marcelo Tosatti <mtosatti@redhat.com> wrote:

> > As for the targeted CPUs, the existing tracepoint does export them, albeit in
> > cpumask form, which is quite inconvenient from a tooling perspective. For
> > instance, as far as I'm aware, it's not possible to do event filtering on a
> > cpumask via trace-cmd.  
> 
> https://man7.org/linux/man-pages/man1/trace-cmd-set.1.html
> 
>        -f filter
>            Specify a filter for the previous event. This must come after
>            a -e. This will filter what events get recorded based on the
>            content of the event. Filtering is passed to the kernel
>            directly so what filtering is allowed may depend on what
>            version of the kernel you have. Basically, it will let you
>            use C notation to check if an event should be processed or
>            not.
> 
>                ==, >=, <=, >, <, &, |, && and ||
> 
>            The above are usually safe to use to compare fields.

We could always add an "isset(x)" filter ;-)

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise()
  2022-10-08 19:34   ` Steven Rostedt
@ 2022-10-11 15:02     ` Valentin Schneider
  0 siblings, 0 replies; 16+ messages in thread
From: Valentin Schneider @ 2022-10-11 15:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86, Paul E. McKenney,
	Peter Zijlstra, Thomas Gleixner, Sebastian Andrzej Siewior,
	Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
	Frederic Weisbecker, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Marc Zyngier, Mark Rutland, Russell King,
	Nicholas Piggin, Guo Ren, David S. Miller

On 08/10/22 15:34, Steven Rostedt wrote:
> On Fri,  7 Oct 2022 16:45:32 +0100
> Valentin Schneider <vschneid@redhat.com> wrote:
>>  }
>>  
>> +static inline void irq_work_raise(void)
>> +{
>> +	if (arch_irq_work_has_interrupt())
>> +		trace_ipi_send_cpu(_RET_IP_, smp_processor_id());
>
> To save on the branch, let's make the above:
>
> 	if (trace_ipi_send_cpu_enabled() && arch_irq_work_has_interrupt())
>
> As the "trace_*_enabled()" is a static branch that will make it a nop
> when the tracepoint is not enabled.
>

Makes sense, thanks for the suggestion.

> -- Steve
>
>
>> +
>> +	arch_irq_work_raise();
>> +}
>> +
>>  /* Enqueue on current CPU, work must already be claimed and preempt disabled */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/5] Generic IPI sending tracepoint
  2022-10-07 20:01 ` [RFC PATCH 0/5] Generic IPI sending tracepoint Marcelo Tosatti
  2022-10-08 19:40   ` Steven Rostedt
@ 2022-10-11 16:17   ` Valentin Schneider
  2022-10-11 16:22     ` Daniel Bristot de Oliveira
  2022-10-11 20:34     ` Steven Rostedt
  1 sibling, 2 replies; 16+ messages in thread
From: Valentin Schneider @ 2022-10-11 16:17 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86, Paul E. McKenney,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Frederic Weisbecker, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Marc Zyngier,
	Mark Rutland, Russell King, Nicholas Piggin, Guo Ren,
	David S. Miller, Douglas RAILLARD

+Cc Douglas

On 07/10/22 17:01, Marcelo Tosatti wrote:
> Hi Valentin,
>
> On Fri, Oct 07, 2022 at 04:41:40PM +0100, Valentin Schneider wrote:
>> Background
>> ==========
>> 
>> As for the targeted CPUs, the existing tracepoint does export them, albeit in
>> cpumask form, which is quite inconvenient from a tooling perspective. For
>> instance, as far as I'm aware, it's not possible to do event filtering on a
>> cpumask via trace-cmd.
>
> https://man7.org/linux/man-pages/man1/trace-cmd-set.1.html
>
>        -f filter
>            Specify a filter for the previous event. This must come after
>            a -e. This will filter what events get recorded based on the
>            content of the event. Filtering is passed to the kernel
>            directly so what filtering is allowed may depend on what
>            version of the kernel you have. Basically, it will let you
>            use C notation to check if an event should be processed or
>            not.
>
>                ==, >=, <=, >, <, &, |, && and ||
>
>            The above are usually safe to use to compare fields.
>
> This looks overkill to me (consider large number of bits set in mask).
>
> +#define trace_ipi_send_cpumask(callsite, mask) do {            \
> +	if (static_key_false(&__tracepoint_ipi_send_cpu.key)) { \
> +               int cpu;                                        \
> +               for_each_cpu(cpu, mask)                         \
> +                       trace_ipi_send_cpu(callsite, cpu);	\
> +	}                                                       \
> +} while (0)
>

Indeed, I expected pushback on this :-)

I went for this due to how much simpler an int is to process/use compared
to a cpumask. There is the trigger example I listed above, but the
consumption of the trace event itself as well.

Consider this event collected on an arm64 QEMU instance (output from trace-cmd)

    <...>-234   [001]    37.251567: ipi_raise:            target_mask=00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000004 (Function call interrupts)

That sort of formatting has been an issue downstream for things like LISA
[1] where events are aggregated into Pandas tables, and we need to play
silly games for performance reason because bitmasks aren't a native Python
type.

I had a look at libtraceevent to see how this data is exposed and if the
answer would be better tooling:

tep_get_field_val() just yields an unsigned long long of value 0x200018,
which AFAICT is just the [length, offset] thing associated with dynamic
arrays. Not really usable, and I don't see anything exported in the lib to
extract and use those values.

tep_get_field_raw() is better, it handles the dynamic array for us and
yields a pointer to the cpumask array at the tail of the record. With that
it's easy to get an output such as: cpumask[size=32]=[4,0,0,0,]. Still,
this isn't a native type for many programming languages.

In contrast, this is immediately readable and consumable by userspace tools

<...>-234   [001]    37.250882: ipi_send_cpu:         callsite=__smp_call_single_queue+0x5c target_cpu=2

Thinking out loud, it makes way more sense to record a cpumask in the
tracepoint, but perhaps we could have a postprocessing step to transform
those into N events each targeting a single CPU?

[1]: https://github.com/ARM-software/lisa/blob/37b51243a94b27ea031ff62bb4ce818a59a7f6ef/lisa/trace.py#L4756

>
>> 
>> Because of the above points, this is introducing a new tracepoint.
>> 
>> Patches
>> =======
>> 
>> This results in having trace events for:
>> 
>> o smp_call_function*()
>> o smp_send_reschedule()
>> o irq_work_queue*()
>> 
>> This is incomplete, just looking at arm64 there's more IPI types that aren't covered:
>> 
>>   IPI_CPU_STOP,
>>   IPI_CPU_CRASH_STOP,
>>   IPI_TIMER,
>>   IPI_WAKEUP,
>> 
>> ... But it feels like a good starting point.
>
> Can't you have a single tracepoint (or variant with cpumask) that would
> cover such cases as well?
>
> Maybe (as parameters for tracepoint):
>
> 	* type (reschedule, smp_call_function, timer, wakeup, ...).
>
> 	* function address: valid for smp_call_function, irq_work_queue
> 	  types.
>

That's a good point, I wasn't sure about having a parameter serving as
discriminant for another, but the function address would be either valid or
NULL which is fine. So perhaps:
o callsite (i.e. _RET_IP_), serves as type
o address of callback tied to IPI, if any
o target CPUs

>> Another thing worth mentioning is that depending on the callsite, the _RET_IP_
>> fed to the tracepoint is not always useful - generic_exec_single() doesn't tell
>> you much about the actual callback being sent via IPI, so there might be value
>> in exploding the single tracepoint into at least one variant for smp_calls.
>
> Not sure i grasp what you mean by "exploding the single tracepoint...",
> but yes knowing the function or irq work function is very useful.
>

Sorry; I meant having several "specialized" tracepoints instead of a single one.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/5] Generic IPI sending tracepoint
  2022-10-11 16:17   ` Valentin Schneider
@ 2022-10-11 16:22     ` Daniel Bristot de Oliveira
  2022-10-11 16:40       ` Valentin Schneider
  2022-10-11 20:34     ` Steven Rostedt
  1 sibling, 1 reply; 16+ messages in thread
From: Daniel Bristot de Oliveira @ 2022-10-11 16:22 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86, Paul E. McKenney,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior, Juri Lelli, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Marcelo Tosatti, Guo Ren, David S. Miller, Douglas RAILLARD

On 10/11/22 18:17, Valentin Schneider wrote:
> Thinking out loud, it makes way more sense to record a cpumask in the
> tracepoint, but perhaps we could have a postprocessing step to transform
> those into N events each targeting a single CPU?

My approach on the tracers/rtla is to make the simple things in kernel, and beautify
things in user-space.

You could keep the tracepoint as a mask, and then make it pretty, like cpus=3-5,8
in user-space. For example with a trace-cmd/perf loadable plugin, libtracefs helper.

For rtla I was thinking to make a new tool to parse them. and make it pretty there.

-- Daniel


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/5] Generic IPI sending tracepoint
  2022-10-11 16:22     ` Daniel Bristot de Oliveira
@ 2022-10-11 16:40       ` Valentin Schneider
  2022-10-11 20:41         ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Valentin Schneider @ 2022-10-11 16:40 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira
  Cc: linux-alpha, linux-kernel, linux-snps-arc, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-ia64, loongarch, linux-mips,
	openrisc, linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	linux-sh, sparclinux, linux-xtensa, x86, Paul E. McKenney,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior, Juri Lelli, Frederic Weisbecker,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Marc Zyngier, Mark Rutland, Russell King, Nicholas Piggin,
	Marcelo Tosatti, Guo Ren, David S. Miller, Douglas RAILLARD

On 11/10/22 18:22, Daniel Bristot de Oliveira wrote:
> On 10/11/22 18:17, Valentin Schneider wrote:
>> Thinking out loud, it makes way more sense to record a cpumask in the
>> tracepoint, but perhaps we could have a postprocessing step to transform
>> those into N events each targeting a single CPU?
>
> My approach on the tracers/rtla is to make the simple things in kernel, and beautify
> things in user-space.
>
> You could keep the tracepoint as a mask, and then make it pretty, like cpus=3-5,8
> in user-space. For example with a trace-cmd/perf loadable plugin, libtracefs helper.
>

That's a nice idea, the one downside I see is that means registering an
event handler for all events with cpumasks rather than directly targeting
cpumask fields, but that doesn't look too horrible. I'll dig a bit in that
direction.

> For rtla I was thinking to make a new tool to parse them. and make it pretty there.
>
> -- Daniel


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/5] Generic IPI sending tracepoint
  2022-10-11 16:17   ` Valentin Schneider
  2022-10-11 16:22     ` Daniel Bristot de Oliveira
@ 2022-10-11 20:34     ` Steven Rostedt
  1 sibling, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2022-10-11 20:34 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Marcelo Tosatti, linux-alpha, linux-kernel, linux-snps-arc,
	linux-arm-kernel, linux-csky, linux-hexagon, linux-ia64,
	loongarch, linux-mips, openrisc, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, sparclinux, linux-xtensa, x86,
	Paul E. McKenney, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior, Juri Lelli,
	Daniel Bristot de Oliveira, Frederic Weisbecker, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Marc Zyngier,
	Mark Rutland, Russell King, Nicholas Piggin, Guo Ren,
	David S. Miller, Douglas RAILLARD

On Tue, 11 Oct 2022 17:17:04 +0100
Valentin Schneider <vschneid@redhat.com> wrote:

> tep_get_field_val() just yields an unsigned long long of value 0x200018,
> which AFAICT is just the [length, offset] thing associated with dynamic
> arrays. Not really usable, and I don't see anything exported in the lib to
> extract and use those values.

Correct.

> 
> tep_get_field_raw() is better, it handles the dynamic array for us and
> yields a pointer to the cpumask array at the tail of the record. With that
> it's easy to get an output such as: cpumask[size=32]=[4,0,0,0,]. Still,
> this isn't a native type for many programming languages.

Yeah, this is the interface that I would have used. And it would likely
require some kind of wrapper to make it into what you prefer.

Note, I've been complaining for some time now how much I hate the
libtraceevent interface, and want to rewrite it. (I just spoke to
someone that wants to implement it in Rust). But the question comes
down to how to make it backward compatible. Perhaps we don't and just
up the major version number (libtraceevent 2.0).

What would you recommend as an API to process cpumasks better?

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 0/5] Generic IPI sending tracepoint
  2022-10-11 16:40       ` Valentin Schneider
@ 2022-10-11 20:41         ` Steven Rostedt
  0 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2022-10-11 20:41 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Daniel Bristot de Oliveira, linux-alpha, linux-kernel,
	linux-snps-arc, linux-arm-kernel, linux-csky, linux-hexagon,
	linux-ia64, loongarch, linux-mips, openrisc, linux-parisc,
	linuxppc-dev, linux-riscv, linux-s390, linux-sh, sparclinux,
	linux-xtensa, x86, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli,
	Frederic Weisbecker, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Marc Zyngier, Mark Rutland, Russell King,
	Nicholas Piggin, Marcelo Tosatti, Guo Ren, David S. Miller,
	Douglas RAILLARD

On Tue, 11 Oct 2022 17:40:26 +0100
Valentin Schneider <vschneid@redhat.com> wrote:

> > You could keep the tracepoint as a mask, and then make it pretty, like cpus=3-5,8
> > in user-space. For example with a trace-cmd/perf loadable plugin, libtracefs helper.
> >  
> 
> That's a nice idea, the one downside I see is that means registering an
> event handler for all events with cpumasks rather than directly targeting
> cpumask fields, but that doesn't look too horrible. I'll dig a bit in that
> direction.

We could just make all all dynamic array's of unsigned long use that
format? I don't know of any other event that has dynamic arrays of
unsigned longs. And doing a search doesn't come up with any.

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-10-11 20:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-07 15:41 [RFC PATCH 0/5] Generic IPI sending tracepoint Valentin Schneider
2022-10-07 15:45 ` [RFC PATCH 1/5] trace: Add trace_ipi_send_{cpu, cpumask} Valentin Schneider
2022-10-07 15:45 ` [RFC PATCH 2/5] sched, smp: Trace send_call_function_single_ipi() Valentin Schneider
2022-10-07 15:45 ` [RFC PATCH 3/5] smp: Add a multi-CPU variant to send_call_function_single_ipi() Valentin Schneider
2022-10-07 15:45 ` [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise() Valentin Schneider
2022-10-08 19:34   ` Steven Rostedt
2022-10-11 15:02     ` Valentin Schneider
2022-10-07 15:45 ` [RFC PATCH 5/5] treewide: Rename and trace arch-definitions of smp_send_reschedule() Valentin Schneider
2022-10-08  0:23   ` Guo Ren
2022-10-07 20:01 ` [RFC PATCH 0/5] Generic IPI sending tracepoint Marcelo Tosatti
2022-10-08 19:40   ` Steven Rostedt
2022-10-11 16:17   ` Valentin Schneider
2022-10-11 16:22     ` Daniel Bristot de Oliveira
2022-10-11 16:40       ` Valentin Schneider
2022-10-11 20:41         ` Steven Rostedt
2022-10-11 20:34     ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).