* [PATCH v1 0/3] KVM: arm64: Support tracing virtual CPU ID
@ 2022-11-05  7:23 Leo Yan
  2022-11-05  7:23 ` [PATCH v1 1/3] KVM: arm64: Dynamically register callback for tracepoints Leo Yan
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Leo Yan @ 2022-11-05  7:23 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Catalin Marinas, Will Deacon,
	Arnaldo Carvalho de Melo, John Garry, James Clark, Mike Leach,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, kvmarm, kvmarm,
	linux-kernel, linux-perf-users
  Cc: Leo Yan

There have been previous efforts and discussions about supporting
tracing of the virtual CPU ID in Arm64 KVM, see [1][2].

The work was blocked by the main concern that we cannot change the
existing trace events without breaking the ABI.  So the question is
how to add new trace events that record the virtual CPU ID while
keeping backward compatibility.

This patch set restarts that work; it is inspired by Qais Yousef's
work on adding scheduler tracepoints [3].

The first patch switches to registering callbacks for tracepoints;
this allows multiple trace events to be driven from a single call
site and is a preparation for adding new trace events.

The second patch adds two new trace events, kvm_entry_v2 and
kvm_exit_v2, which contain the field "vcpu_id" for the virtual CPU
ID.

For a more complete view, the third patch contains the perf tool
change.  It dynamically detects the trace event nodes under sysfs and
decides whether to use the version 2 trace events or to fall back to
the original events.
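
Roughly speaking, the detection boils down to checking whether the v2
events are exposed by the kernel.  Below is an illustrative sketch
only (the actual patch uses perf's is_valid_tracepoint() helper, and
the tracefs mount point is assumed to be the default one):

  #include <limits.h>
  #include <stdbool.h>
  #include <stdio.h>
  #include <unistd.h>

  /* Probe tracefs for a kvm trace event, e.g. "kvm_entry_v2". */
  static bool kvm_event_exists(const char *name)
  {
  	char path[PATH_MAX];

  	snprintf(path, sizeof(path),
  		 "/sys/kernel/tracing/events/kvm/%s/format", name);
  	return access(path, R_OK) == 0;
  }

  int main(void)
  {
  	printf("v2 events: %s\n",
  	       kvm_event_exists("kvm_entry_v2") ? "yes" : "no");
  	return 0;
  }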

This patch set has been tested with the mainline kernel on an Arm64
Ampere Altra platform.

Note: I used checkpatch.pl to validate the patch format and observed
that it reports errors for the second patch, which adds trace events;
since trace event definitions use their own coding style, I have kept
them as they are.

[1] https://lore.kernel.org/lkml/1560330526-15468-2-git-send-email-yuzenghui@huawei.com/
[2] https://lore.kernel.org/lkml/20200917003645.689665-1-sergey.senozhatsky@gmail.com/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/kernel/sched/core.c?id=a056a5bed7fa67706574b00cf1122c38596b2be1


Leo Yan (3):
  KVM: arm64: Dynamically register callback for tracepoints
  KVM: arm64: Add trace events with field 'vcpu_id'
  perf arm64: Support virtual CPU ID for kvm-stat

 arch/arm64/kvm/Makefile               |  2 +-
 arch/arm64/kvm/arm.c                  |  4 +-
 arch/arm64/kvm/trace.c                | 35 +++++++++++++++++
 arch/arm64/kvm/trace_arm.h            | 53 ++++++++++++++++++++++++++
 tools/perf/arch/arm64/util/kvm-stat.c | 54 ++++++++++++++++++++++++---
 5 files changed, 140 insertions(+), 8 deletions(-)
 create mode 100644 arch/arm64/kvm/trace.c

-- 
2.34.1



* [PATCH v1 1/3] KVM: arm64: Dynamically register callback for tracepoints
  2022-11-05  7:23 [PATCH v1 0/3] KVM: arm64: Support tracing virtual CPU ID Leo Yan
@ 2022-11-05  7:23 ` Leo Yan
  2022-11-05  7:23 ` [PATCH v1 2/3] KVM: arm64: Add trace events with field 'vcpu_id' Leo Yan
  2022-11-05  7:23 ` [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat Leo Yan
  2 siblings, 0 replies; 8+ messages in thread
From: Leo Yan @ 2022-11-05  7:23 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Catalin Marinas, Will Deacon,
	Arnaldo Carvalho de Melo, John Garry, James Clark, Mike Leach,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, kvmarm, kvmarm,
	linux-kernel, linux-perf-users
  Cc: Leo Yan

This commit doesn't change any functionality; it is purely a
refactoring.

It registers callbacks for the tracepoints dynamically; this way, the
existing trace events (in this case kvm_entry and kvm_exit) are kept.
This is a preparation for adding new trace events in a later patch.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kvm/Makefile    |  2 +-
 arch/arm64/kvm/arm.c       |  4 ++--
 arch/arm64/kvm/trace.c     | 29 +++++++++++++++++++++++++++++
 arch/arm64/kvm/trace_arm.h |  8 ++++++++
 4 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/kvm/trace.c

diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 5e33c2d4645a..4e641d2df7ad 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -14,7 +14,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 inject_fault.o va_layout.o handle_exit.o \
 	 guest.o debug.o reset.o sys_regs.o stacktrace.o \
 	 vgic-sys-reg-v3.o fpsimd.o pkvm.o \
-	 arch_timer.o trng.o vmid.o \
+	 arch_timer.o trng.o vmid.o trace.o \
 	 vgic/vgic.o vgic/vgic-init.o \
 	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
 	 vgic/vgic-v3.o vgic/vgic-v4.o \
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 94d33e296e10..03ed5f6c92bc 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -917,7 +917,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		/**************************************************************
 		 * Enter the guest
 		 */
-		trace_kvm_entry(*vcpu_pc(vcpu));
+		trace_kvm_entry_tp(vcpu);
 		guest_timing_enter_irqoff();
 
 		ret = kvm_arm_vcpu_enter_exit(vcpu);
@@ -974,7 +974,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 		local_irq_enable();
 
-		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
+		trace_kvm_exit_tp(ret, vcpu);
 
 		/* Exit types that need handling before we can be preempted */
 		handle_exit_early(vcpu, ret);
diff --git a/arch/arm64/kvm/trace.c b/arch/arm64/kvm/trace.c
new file mode 100644
index 000000000000..d25a3db994e2
--- /dev/null
+++ b/arch/arm64/kvm/trace.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/init.h>
+#include <linux/tracepoint.h>
+
+#include <asm/kvm_emulate.h>
+
+#include "trace_arm.h"
+
+static void kvm_entry_tp(void *data, struct kvm_vcpu *vcpu)
+{
+	if (trace_kvm_entry_enabled())
+		trace_kvm_entry(*vcpu_pc(vcpu));
+}
+
+static void kvm_exit_tp(void *data, int ret, struct kvm_vcpu *vcpu)
+{
+	if (trace_kvm_exit_enabled())
+		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu),
+			       *vcpu_pc(vcpu));
+}
+
+static int __init kvm_tp_init(void)
+{
+	register_trace_kvm_entry_tp(kvm_entry_tp, NULL);
+	register_trace_kvm_exit_tp(kvm_exit_tp, NULL);
+	return 0;
+}
+
+core_initcall(kvm_tp_init)
diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
index 33e4e7dd2719..ef02ae93b28b 100644
--- a/arch/arm64/kvm/trace_arm.h
+++ b/arch/arm64/kvm/trace_arm.h
@@ -11,6 +11,10 @@
 /*
  * Tracepoints for entry/exit to guest
  */
+DECLARE_TRACE(kvm_entry_tp,
+	TP_PROTO(struct kvm_vcpu *vcpu),
+	TP_ARGS(vcpu));
+
 TRACE_EVENT(kvm_entry,
 	TP_PROTO(unsigned long vcpu_pc),
 	TP_ARGS(vcpu_pc),
@@ -26,6 +30,10 @@ TRACE_EVENT(kvm_entry,
 	TP_printk("PC: 0x%016lx", __entry->vcpu_pc)
 );
 
+DECLARE_TRACE(kvm_exit_tp,
+	TP_PROTO(int ret, struct kvm_vcpu *vcpu),
+	TP_ARGS(ret, vcpu));
+
 TRACE_EVENT(kvm_exit,
 	TP_PROTO(int ret, unsigned int esr_ec, unsigned long vcpu_pc),
 	TP_ARGS(ret, esr_ec, vcpu_pc),
-- 
2.34.1



* [PATCH v1 2/3] KVM: arm64: Add trace events with field 'vcpu_id'
  2022-11-05  7:23 [PATCH v1 0/3] KVM: arm64: Support tracing virtual CPU ID Leo Yan
  2022-11-05  7:23 ` [PATCH v1 1/3] KVM: arm64: Dynamically register callback for tracepoints Leo Yan
@ 2022-11-05  7:23 ` Leo Yan
  2022-11-05  7:23 ` [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat Leo Yan
  2 siblings, 0 replies; 8+ messages in thread
From: Leo Yan @ 2022-11-05  7:23 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Catalin Marinas, Will Deacon,
	Arnaldo Carvalho de Melo, John Garry, James Clark, Mike Leach,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, kvmarm, kvmarm,
	linux-kernel, linux-perf-users
  Cc: Leo Yan

The existing trace events kvm_entry and kvm_exit don't contain the
virtual CPU ID, so the perf tool has no way to produce statistics on
a per-virtual-CPU basis; and since the trace events are ABI, we
cannot change them without breaking it.

For the above reasons, this patch adds two trace events, kvm_entry_v2
and kvm_exit_v2, with a new field 'vcpu_id'.  To support both the old
and new events, the tracepoint callback checks whether each event is
enabled; if it is, the callback invokes the corresponding trace
event.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kvm/trace.c     |  6 +++++
 arch/arm64/kvm/trace_arm.h | 45 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/arch/arm64/kvm/trace.c b/arch/arm64/kvm/trace.c
index d25a3db994e2..d9b2587c77c3 100644
--- a/arch/arm64/kvm/trace.c
+++ b/arch/arm64/kvm/trace.c
@@ -10,6 +10,9 @@ static void kvm_entry_tp(void *data, struct kvm_vcpu *vcpu)
 {
 	if (trace_kvm_entry_enabled())
 		trace_kvm_entry(*vcpu_pc(vcpu));
+
+	if (trace_kvm_entry_v2_enabled())
+		trace_kvm_entry_v2(vcpu);
 }
 
 static void kvm_exit_tp(void *data, int ret, struct kvm_vcpu *vcpu)
@@ -17,6 +20,9 @@ static void kvm_exit_tp(void *data, int ret, struct kvm_vcpu *vcpu)
 	if (trace_kvm_exit_enabled())
 		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu),
 			       *vcpu_pc(vcpu));
+
+	if (trace_kvm_exit_v2_enabled())
+		trace_kvm_exit_v2(ret, vcpu);
 }
 
 static int __init kvm_tp_init(void)
diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
index ef02ae93b28b..932c9d0c36f3 100644
--- a/arch/arm64/kvm/trace_arm.h
+++ b/arch/arm64/kvm/trace_arm.h
@@ -4,6 +4,7 @@
 
 #include <kvm/arm_arch_timer.h>
 #include <linux/tracepoint.h>
+#include <asm/kvm_emulate.h>
 
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM kvm
@@ -30,6 +31,23 @@ TRACE_EVENT(kvm_entry,
 	TP_printk("PC: 0x%016lx", __entry->vcpu_pc)
 );
 
+TRACE_EVENT(kvm_entry_v2,
+	TP_PROTO(struct kvm_vcpu *vcpu),
+	TP_ARGS(vcpu),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	vcpu_id		)
+		__field(	unsigned long,	vcpu_pc		)
+	),
+
+	TP_fast_assign(
+		__entry->vcpu_id		= vcpu->vcpu_id;
+		__entry->vcpu_pc		= *vcpu_pc(vcpu);
+	),
+
+	TP_printk("vcpu: %u PC: 0x%016lx", __entry->vcpu_id, __entry->vcpu_pc)
+);
+
 DECLARE_TRACE(kvm_exit_tp,
 	TP_PROTO(int ret, struct kvm_vcpu *vcpu),
 	TP_ARGS(ret, vcpu));
@@ -57,6 +75,33 @@ TRACE_EVENT(kvm_exit,
 		  __entry->vcpu_pc)
 );
 
+TRACE_EVENT(kvm_exit_v2,
+	TP_PROTO(int ret, struct kvm_vcpu *vcpu),
+	TP_ARGS(ret, vcpu),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	vcpu_id		)
+		__field(	int,		ret		)
+		__field(	unsigned int,	esr_ec		)
+		__field(	unsigned long,	vcpu_pc		)
+	),
+
+	TP_fast_assign(
+		__entry->vcpu_id		= vcpu->vcpu_id;
+		__entry->ret			= ARM_EXCEPTION_CODE(ret);
+		__entry->esr_ec			= ARM_EXCEPTION_IS_TRAP(ret) ?
+						  kvm_vcpu_trap_get_class(vcpu): 0;
+		__entry->vcpu_pc		= *vcpu_pc(vcpu);
+	),
+
+	TP_printk("%s: vcpu: %u HSR_EC: 0x%04x (%s), PC: 0x%016lx",
+		  __print_symbolic(__entry->ret, kvm_arm_exception_type),
+		  __entry->vcpu_id,
+		  __entry->esr_ec,
+		  __print_symbolic(__entry->esr_ec, kvm_arm_exception_class),
+		  __entry->vcpu_pc)
+);
+
 TRACE_EVENT(kvm_guest_fault,
 	TP_PROTO(unsigned long vcpu_pc, unsigned long hsr,
 		 unsigned long hxfar,
-- 
2.34.1



* [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat
  2022-11-05  7:23 [PATCH v1 0/3] KVM: arm64: Support tracing virtual CPU ID Leo Yan
  2022-11-05  7:23 ` [PATCH v1 1/3] KVM: arm64: Dynamically register callback for tracepoints Leo Yan
  2022-11-05  7:23 ` [PATCH v1 2/3] KVM: arm64: Add trace events with field 'vcpu_id' Leo Yan
@ 2022-11-05  7:23 ` Leo Yan
  2022-11-05 13:28   ` Marc Zyngier
  2 siblings, 1 reply; 8+ messages in thread
From: Leo Yan @ 2022-11-05  7:23 UTC (permalink / raw)
  To: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Catalin Marinas, Will Deacon,
	Arnaldo Carvalho de Melo, John Garry, James Clark, Mike Leach,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, kvmarm, kvmarm,
	linux-kernel, linux-perf-users
  Cc: Leo Yan

Since the two trace events kvm_entry_v2/kvm_exit_v2 have been added,
we can use the "vcpu_id" field in the events to obtain the virtual
CPU ID.  To keep backward compatibility, we still need to rely on the
kvm_entry/kvm_exit trace events on old kernels.

This patch adds the Arm64 functions setup_kvm_events_tp() and
arm64__setup_kvm_tp(); by detecting the nodes under the sysfs folder,
they dynamically select the kvm_entry_v2/kvm_exit_v2 trace events
when the kernel provides them, otherwise falling back to the
kvm_entry/kvm_exit events for backward compatibility.

cpu_isa_init() is also made to invoke arm64__setup_kvm_tp(), which
allows the command "perf kvm stat report" to set up the trace events
dynamically as well.

Before:

  # perf kvm stat report --vcpu 27

  Analyze events for all VMs, VCPU 27:

               VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time

  Total Samples:0, Total events handled time:0.00us.

After:

  # perf kvm stat report --vcpu 27

  Analyze events for all VMs, VCPU 27:

               VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time

                 SYS64        808    98.54%    91.24%      0.00us    303.76us      3.46us ( +-  13.54% )
                   WFx         10     1.22%     7.79%      0.00us     69.48us     23.91us ( +-  25.91% )
                   IRQ          2     0.24%     0.97%      0.00us     22.64us     14.82us ( +-  52.77% )

  Total Samples:820, Total events handled time:3068.28us.
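
For reference, a typical way to collect such a report is the
following (the record options here are only an example):

  # perf kvm stat record -a -- sleep 10
  # perf kvm stat report --vcpu 27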

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/arch/arm64/util/kvm-stat.c | 54 ++++++++++++++++++++++++---
 1 file changed, 49 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/arm64/util/kvm-stat.c b/tools/perf/arch/arm64/util/kvm-stat.c
index 73d18e0ed6f6..1ba54ce3d7d8 100644
--- a/tools/perf/arch/arm64/util/kvm-stat.c
+++ b/tools/perf/arch/arm64/util/kvm-stat.c
@@ -3,6 +3,7 @@
 #include <memory.h>
 #include "../../../util/evsel.h"
 #include "../../../util/kvm-stat.h"
+#include "../../../util/tracepoint.h"
 #include "arm64_exception_types.h"
 #include "debug.h"
 
@@ -10,18 +11,28 @@ define_exit_reasons_table(arm64_exit_reasons, kvm_arm_exception_type);
 define_exit_reasons_table(arm64_trap_exit_reasons, kvm_arm_exception_class);
 
 const char *kvm_trap_exit_reason = "esr_ec";
-const char *vcpu_id_str = "id";
+const char *vcpu_id_str = "vcpu_id";
 const int decode_str_len = 20;
 const char *kvm_exit_reason = "ret";
-const char *kvm_entry_trace = "kvm:kvm_entry";
-const char *kvm_exit_trace = "kvm:kvm_exit";
+const char *kvm_entry_trace;
+const char *kvm_exit_trace;
 
-const char *kvm_events_tp[] = {
+#define NR_TPS	2
+
+static const char *kvm_events_tp_v1[NR_TPS + 1] = {
 	"kvm:kvm_entry",
 	"kvm:kvm_exit",
 	NULL,
 };
 
+static const char *kvm_events_tp_v2[NR_TPS + 1] = {
+	"kvm:kvm_entry_v2",
+	"kvm:kvm_exit_v2",
+	NULL,
+};
+
+const char *kvm_events_tp[NR_TPS + 1];
+
 static void event_get_key(struct evsel *evsel,
 			  struct perf_sample *sample,
 			  struct event_key *key)
@@ -78,8 +89,41 @@ const char * const kvm_skip_events[] = {
 	NULL,
 };
 
-int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid __maybe_unused)
+static int arm64__setup_kvm_tp(struct perf_kvm_stat *kvm)
 {
+	const char **kvm_events, **events_ptr;
+	int i, nr_tp = 0;
+
+	if (is_valid_tracepoint("kvm:kvm_entry_v2")) {
+		kvm_events = kvm_events_tp_v2;
+		kvm_entry_trace = "kvm:kvm_entry_v2";
+		kvm_exit_trace = "kvm:kvm_exit_v2";
+	} else {
+		kvm_events = kvm_events_tp_v1;
+		kvm_entry_trace = "kvm:kvm_entry";
+		kvm_exit_trace = "kvm:kvm_exit";
+	}
+
+	for (events_ptr = kvm_events; *events_ptr; events_ptr++) {
+		if (!is_valid_tracepoint(*events_ptr))
+			return -1;
+		nr_tp++;
+	}
+
+	for (i = 0; i < nr_tp; i++)
+		kvm_events_tp[i] = kvm_events[i];
+	kvm_events_tp[i] = NULL;
+
 	kvm->exit_reasons_isa = "arm64";
 	return 0;
 }
+
+int setup_kvm_events_tp(struct perf_kvm_stat *kvm)
+{
+	return arm64__setup_kvm_tp(kvm);
+}
+
+int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid __maybe_unused)
+{
+	return arm64__setup_kvm_tp(kvm);
+}
-- 
2.34.1



* Re: [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat
  2022-11-05  7:23 ` [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat Leo Yan
@ 2022-11-05 13:28   ` Marc Zyngier
  2022-11-07 14:47     ` Leo Yan
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2022-11-05 13:28 UTC (permalink / raw)
  To: Leo Yan
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Catalin Marinas, Will Deacon, Arnaldo Carvalho de Melo,
	John Garry, James Clark, Mike Leach, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-arm-kernel, kvmarm, kvmarm, linux-kernel, linux-perf-users

On Sat, 05 Nov 2022 07:23:11 +0000,
Leo Yan <leo.yan@linaro.org> wrote:
> 
> Since the two trace events kvm_entry_v2/kvm_exit_v2 are added, we can
> use the field "vcpu_id" in the events to get to know the virtual CPU ID.
> To keep backward compatibility, we still need to rely on the trace
> events kvm_entry/kvm_exit for old kernels.
> 
> This patch adds Arm64's functions setup_kvm_events_tp() and
> arm64__setup_kvm_tp(), by detecting the nodes under sysfs folder, it can
> dynamically register trace events kvm_entry_v2/kvm_exit_v2 when the
> kernel has provided them, otherwise, it rolls back to use events
> kvm_entry/kvm_exit for backward compatibility.
> 
> Let cpu_isa_init() to invoke arm64__setup_kvm_tp(), this can allow the
> command "perf kvm stat report" also to dynamically setup trace events.
> 
> Before:
> 
>   # perf kvm stat report --vcpu 27
> 
>   Analyze events for all VMs, VCPU 27:
> 
>                VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time
> 
>   Total Samples:0, Total events handled time:0.00us.
>
> After:
> 
>   # perf kvm stat report --vcpu 27
> 
>   Analyze events for all VMs, VCPU 27:
> 
>                VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time
> 
>                  SYS64        808    98.54%    91.24%      0.00us    303.76us      3.46us ( +-  13.54% )
>                    WFx         10     1.22%     7.79%      0.00us     69.48us     23.91us ( +-  25.91% )
>                    IRQ          2     0.24%     0.97%      0.00us     22.64us     14.82us ( +-  52.77% )
> 
>   Total Samples:820, Total events handled time:3068.28us.

Please educate me: how useful is it to filter on a vcpu number across
all VMs? What sense does it even make?

Conversely, what would be the purpose of filtering on a 5th thread of
any process irrespective of what the process does? To me, this is the
same level of non-sense.

AFAICT, this is just piling more arbitrary data extraction for no
particular reason other than "just because we can", and there is
absolutely no guarantee that this is fit for anyone else's purpose.

I'd rather you have a generic tracepoint taking the vcpu as a context
and a BPF program that spits out the information people actually need,
keeping things out of the kernel. Or even a tracehook (like the
scheduler does), and let people load a module to dump whatever
information they please.

But randomly adding new tracepoints to output a semi-useless field
without any consideration for future-proofing? No, thank you.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat
  2022-11-05 13:28   ` Marc Zyngier
@ 2022-11-07 14:47     ` Leo Yan
  2022-11-07 15:39       ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Leo Yan @ 2022-11-07 14:47 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Catalin Marinas, Will Deacon, Arnaldo Carvalho de Melo,
	John Garry, James Clark, Mike Leach, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-arm-kernel, kvmarm, kvmarm, linux-kernel, linux-perf-users

On Sat, Nov 05, 2022 at 01:28:40PM +0000, Marc Zyngier wrote:

[...]

> > Before:
> > 
> >   # perf kvm stat report --vcpu 27
> > 
> >   Analyze events for all VMs, VCPU 27:
> > 
> >                VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time
> > 
> >   Total Samples:0, Total events handled time:0.00us.
> >
> > After:
> > 
> >   # perf kvm stat report --vcpu 27
> > 
> >   Analyze events for all VMs, VCPU 27:
> > 
> >                VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time
> > 
> >                  SYS64        808    98.54%    91.24%      0.00us    303.76us      3.46us ( +-  13.54% )
> >                    WFx         10     1.22%     7.79%      0.00us     69.48us     23.91us ( +-  25.91% )
> >                    IRQ          2     0.24%     0.97%      0.00us     22.64us     14.82us ( +-  52.77% )
> > 
> >   Total Samples:820, Total events handled time:3068.28us.
> 
> Please educate me: how useful is it to filter on a vcpu number across
> all VMs? What sense does it even make?

Now "perf kvm" tool is not sophisticated since it doesn't capture VMID
and virtual CPU ID together.

I think one case is that we can pin a program to a specific virtual
CPU with taskset inside the VM; in this way we can check whether any
bottleneck is caused by VM entry/exit.  But I have to say that it's
inaccurate if we only filter on the VCPU ID; we should consider
tracing the VMID and VCPU ID together in a later enhancement.
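
For example, inside the guest something like:

  $ taskset -c 27 <workload>

and then "perf kvm stat report --vcpu 27" on the host side, assuming
the guest CPU numbering matches the VCPU IDs (and with the cross-VM
caveat you raised above).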

> Conversely, what would be the purpose of filtering on a 5th thread of
> any process irrespective of what the process does? To me, this is the
> same level of non-sense.

I agree.

> AFAICT, this is just piling more arbitrary data extraction for no
> particular reason other than "just because we can", and there is
> absolutely no guarantee that this is fit for anyone else's purpose.
> 
> I'd rather you have a generic tracepoint taking the vcpu as a context
> and a BPF program that spits out the information people actually need,
> keeping things out of the kernel. Or even a tracehook (like the
> scheduler does), and let people load a module to dump whatever
> information they please.

Actually I considered three options:

Option 1: Simply add new versions of the trace events to record more
info.  This is not flexible, and we even risk adding yet more event
versions if we later find that more data should be traced.

On the other hand, this approach is straightforward and the
implementation is simple.  This is the main reason why I finally
chose to add new trace events.

Option 2: Use kprobes to dynamically insert tracepoints; but this
means the user must have the corresponding vmlinux file, otherwise
the perf tool might inject the tracepoint at an incorrect address.
This is the main reason I didn't use kprobes to add dynamic
tracepoints.

Option 3: As you suggested, bind the KVM tracepoints to an eBPF
program and have the eBPF program record the perf events.

When I reviewed Arm64's kvm_entry / kvm_exit trace events, I saw that
they don't have the vcpu context in their arguments, which means I
would need to add new trace events to access the "vcpu" context.

Options 1 and 3 both need to add trace events; option 1 is the more
straightforward solution and this is why it was chosen in the current
patch set.

I realized that I made a mistake: actually we can modify the trace
event definitions for kvm_entry / kvm_exit.  Note that we only modify
the trace events' arguments; this changes the trace function's
definition but it does not break the ABI (the format is exactly the
same for user space).  The changes below demonstrate what I'm
proposing:

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 94d33e296e10..16f6b61abfec 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -917,7 +917,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
                /**************************************************************
                 * Enter the guest
                 */
-               trace_kvm_entry(*vcpu_pc(vcpu));
+               trace_kvm_entry(vcpu);
                guest_timing_enter_irqoff();
 
                ret = kvm_arm_vcpu_enter_exit(vcpu);
diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
index 33e4e7dd2719..9df4fd30093c 100644
--- a/arch/arm64/kvm/trace_arm.h
+++ b/arch/arm64/kvm/trace_arm.h
@@ -12,15 +12,15 @@
  * Tracepoints for entry/exit to guest
  */
 TRACE_EVENT(kvm_entry,
-       TP_PROTO(unsigned long vcpu_pc),
-       TP_ARGS(vcpu_pc),
+       TP_PROTO(struct kvm_vcpu *vcpu),
+       TP_ARGS(vcpu),
 
        TP_STRUCT__entry(
                __field(        unsigned long,  vcpu_pc         )
        ),
 
        TP_fast_assign(
-               __entry->vcpu_pc                = vcpu_pc;
+               __entry->vcpu_pc                = *vcpu_pc(vcpu);
        ),
 
        TP_printk("PC: 0x%016lx", __entry->vcpu_pc)

Please let me know your opinion; if you don't object, I can move
forward with this approach.

> But randomly adding new tracepoints to output a semi-useless field
> without any consideration for future-proofing? No, thank you.

Okay.  Thanks for review!

Leo


* Re: [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat
  2022-11-07 14:47     ` Leo Yan
@ 2022-11-07 15:39       ` Marc Zyngier
  2022-11-08 11:49         ` Leo Yan
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2022-11-07 15:39 UTC (permalink / raw)
  To: Leo Yan
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Catalin Marinas, Will Deacon, Arnaldo Carvalho de Melo,
	John Garry, James Clark, Mike Leach, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-arm-kernel, kvmarm, kvmarm, linux-kernel, linux-perf-users

On Mon, 07 Nov 2022 14:47:10 +0000,
Leo Yan <leo.yan@linaro.org> wrote:
> 
> On Sat, Nov 05, 2022 at 01:28:40PM +0000, Marc Zyngier wrote:
> 
> [...]
> 
> > > Before:
> > > 
> > >   # perf kvm stat report --vcpu 27
> > > 
> > >   Analyze events for all VMs, VCPU 27:
> > > 
> > >                VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time
> > > 
> > >   Total Samples:0, Total events handled time:0.00us.
> > >
> > > After:
> > > 
> > >   # perf kvm stat report --vcpu 27
> > > 
> > >   Analyze events for all VMs, VCPU 27:
> > > 
> > >                VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time
> > > 
> > >                  SYS64        808    98.54%    91.24%      0.00us    303.76us      3.46us ( +-  13.54% )
> > >                    WFx         10     1.22%     7.79%      0.00us     69.48us     23.91us ( +-  25.91% )
> > >                    IRQ          2     0.24%     0.97%      0.00us     22.64us     14.82us ( +-  52.77% )
> > > 
> > >   Total Samples:820, Total events handled time:3068.28us.
> > 
> > Please educate me: how useful is it to filter on a vcpu number across
> > all VMs? What sense does it even make?
> 
> Now "perf kvm" tool is not sophisticated since it doesn't capture VMID
> and virtual CPU ID together.

VMID is not a relevant indicator anyway, as it can change at any
point. But that's only to show that everybody has a different view on
what they need to collect. At which point, we need to provide an
infrastructure for data extraction, and not the data itself.

> I think a case is we can spin a program on a specific virtual CPU with
> taskset in VM, in this way we can check if any bottleneck is caused by
> VM entry/exit, but I have to say that it's inaccurate if we only filter
> on VCPU ID, we should consider tracing VMID and VCPU ID together in
> later's enhancement.
> 
> > Conversely, what would be the purpose of filtering on a 5th thread of
> > any process irrespective of what the process does? To me, this is the
> > same level of non-sense.
> 
> I agree.
> 
> > AFAICT, this is just piling more arbitrary data extraction for no
> > particular reason other than "just because we can", and there is
> > absolutely no guarantee that this is fit for anyone else's purpose.
> > 
> > I'd rather you have a generic tracepoint taking the vcpu as a context
> > and a BPF program that spits out the information people actually need,
> > keeping things out of the kernel. Or even a tracehook (like the
> > scheduler does), and let people load a module to dump whatever
> > information they please.
> 
> Actually I considered three options:
> 
> Option 1: Simply add new version's trace events for recording more info.
> This is not flexible and we even have risk to add more version's trace
> event if later we might find that more data should traced.
> 
> This approach is straightforward and the implementation would be
> simple.  This is main reason why finally I choosed to add new trace
> events.

But that doesn't scale at all.

> 
> Option 2: use Kprobe to dynamically insert tracepoints; but this means
> the user must have the corresponding vmlinux file, otherwise, perf
> tool might inject tracepoint at an incorrect address.  This is the
> main reason I didn't use Kprobe to add dynamic tracepoints.
>
> Option 3: As you suggested, I can bind KVM tracepoints with a eBPF
> program and the eBPF program records perf events.
> 
> When I reviewed Arm64's kvm_entry / kvm_exit trace events, they don't
> have vcpu context in the arguments, this means I need to add new trace
> events for accessing "vcpu" context.

I'm not opposed to adding new trace{point,hook}s if you demonstrate
that they are generic enough or will never need to evolve.

> 
> Option 1 and 3 both need to add trace events; option 1 is more
> straightforward solution and this is why it was choosed in current patch
> set.
> 
> I recognized that I made a mistake, actually we can modify the trace
> event's definition for kvm_entry / kvm_exit, note we only modify the
> trace event's arguments, this will change the trace function's
> definition but it will not break ABI (the format is exactly same for
> the user space).  Below changes demonstrate what's my proposing:
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 94d33e296e10..16f6b61abfec 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -917,7 +917,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>                 /**************************************************************
>                  * Enter the guest
>                  */
> -               trace_kvm_entry(*vcpu_pc(vcpu));
> +               trace_kvm_entry(vcpu);
>                 guest_timing_enter_irqoff();
>  
>                 ret = kvm_arm_vcpu_enter_exit(vcpu);
> diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
> index 33e4e7dd2719..9df4fd30093c 100644
> --- a/arch/arm64/kvm/trace_arm.h
> +++ b/arch/arm64/kvm/trace_arm.h
> @@ -12,15 +12,15 @@
>   * Tracepoints for entry/exit to guest
>   */
>  TRACE_EVENT(kvm_entry,
> -       TP_PROTO(unsigned long vcpu_pc),
> -       TP_ARGS(vcpu_pc),
> +       TP_PROTO(struct kvm_vcpu *vcpu),
> +       TP_ARGS(vcpu),
>  
>         TP_STRUCT__entry(
>                 __field(        unsigned long,  vcpu_pc         )
>         ),
>  
>         TP_fast_assign(
> -               __entry->vcpu_pc                = vcpu_pc;
> +               __entry->vcpu_pc                = *vcpu_pc(vcpu);
>         ),
>  
>         TP_printk("PC: 0x%016lx", __entry->vcpu_pc)
> 
> Please let me know your opinion, if you don't object, I can move
> forward with this approach.

I have no issue with this if this doesn't change anything else.

And if you can make use of this with a BPF program and get to the same
result as your initial patch, then please submit it for inclusion in
the kernel as an example. We can then point people to it next time
this crops up (probably before Xmas).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v1 3/3] perf arm64: Support virtual CPU ID for kvm-stat
  2022-11-07 15:39       ` Marc Zyngier
@ 2022-11-08 11:49         ` Leo Yan
  0 siblings, 0 replies; 8+ messages in thread
From: Leo Yan @ 2022-11-08 11:49 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Catalin Marinas, Will Deacon, Arnaldo Carvalho de Melo,
	John Garry, James Clark, Mike Leach, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-arm-kernel, kvmarm, kvmarm, linux-kernel, linux-perf-users

On Mon, Nov 07, 2022 at 03:39:07PM +0000, Marc Zyngier wrote:

[...]

> > > Please educate me: how useful is it to filter on a vcpu number across
> > > all VMs? What sense does it even make?
> > 
> > Now "perf kvm" tool is not sophisticated since it doesn't capture VMID
> > and virtual CPU ID together.
> 
> VMID is not a relevant indicator anyway, as it can change at any
> point.

Thanks for the correction.  IIUC, the VMID is not fixed for a virtual
machine; it can be re-allocated after an overflow.

> But that's only to show that everybody has a different view on
> what they need to collect. At which point, we need to provide an
> infrastructure for data extraction, and not the data itself.

Totally agree.

[...]

> > Option 3: As you suggested, I can bind KVM tracepoints with a eBPF
> > program and the eBPF program records perf events.
> > 
> > When I reviewed Arm64's kvm_entry / kvm_exit trace events, they don't
> > have vcpu context in the arguments, this means I need to add new trace
> > events for accessing "vcpu" context.
> 
> I'm not opposed to adding new trace{point,hook}s if you demonstrate
> that they are generic enough or will never need to evolve.
> 
> > 
> > Option 1 and 3 both need to add trace events; option 1 is more
> > straightforward solution and this is why it was choosed in current patch
> > set.
> > 
> > I recognized that I made a mistake, actually we can modify the trace
> > event's definition for kvm_entry / kvm_exit, note we only modify the
> > trace event's arguments, this will change the trace function's
> > definition but it will not break ABI (the format is exactly same for
> > the user space).  Below changes demonstrate what's my proposing:
> > 
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 94d33e296e10..16f6b61abfec 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -917,7 +917,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >                 /**************************************************************
> >                  * Enter the guest
> >                  */
> > -               trace_kvm_entry(*vcpu_pc(vcpu));
> > +               trace_kvm_entry(vcpu);
> >                 guest_timing_enter_irqoff();
> >  
> >                 ret = kvm_arm_vcpu_enter_exit(vcpu);
> > diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
> > index 33e4e7dd2719..9df4fd30093c 100644
> > --- a/arch/arm64/kvm/trace_arm.h
> > +++ b/arch/arm64/kvm/trace_arm.h
> > @@ -12,15 +12,15 @@
> >   * Tracepoints for entry/exit to guest
> >   */
> >  TRACE_EVENT(kvm_entry,
> > -       TP_PROTO(unsigned long vcpu_pc),
> > -       TP_ARGS(vcpu_pc),
> > +       TP_PROTO(struct kvm_vcpu *vcpu),
> > +       TP_ARGS(vcpu),
> >  
> >         TP_STRUCT__entry(
> >                 __field(        unsigned long,  vcpu_pc         )
> >         ),
> >  
> >         TP_fast_assign(
> > -               __entry->vcpu_pc                = vcpu_pc;
> > +               __entry->vcpu_pc                = *vcpu_pc(vcpu);
> >         ),
> >  
> >         TP_printk("PC: 0x%016lx", __entry->vcpu_pc)
> > 
> > Please let me know your opinion, if you don't object, I can move
> > forward with this approach.
> 
> I have no issue with this if this doesn't change anything else.

Thanks for confirmation.

> And if you can make use of this with a BPF program and get to the same
> result as your initial patch, then please submit it for inclusion in
> the kernel as an example. We can then point people to it next time
> this crop up (probably before Xmas).

Will do.

Just a heads up: if everything goes well, I will place the eBPF
program in the folder tools/perf/examples/bpf/, which makes it easy
to integrate and release it with the perf tool.
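
As a rough idea of the shape of that example (just a sketch: the map
and function names below are placeholders, and it assumes the
reworked kvm_entry tracepoint that passes the vcpu pointer, plus a
BTF-enabled kernel and a generated vmlinux.h):

// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* Per-vCPU guest entry counters, keyed by vcpu_id. */
struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, __u32);
	__type(value, __u64);
} entry_count SEC(".maps");

SEC("tp_btf/kvm_entry")
int BPF_PROG(on_kvm_entry, struct kvm_vcpu *vcpu)
{
	__u32 id = vcpu->vcpu_id;
	__u64 init = 1, *cnt;

	cnt = bpf_map_lookup_elem(&entry_count, &id);
	if (cnt)
		__sync_fetch_and_add(cnt, 1);
	else
		bpf_map_update_elem(&entry_count, &id, &init, BPF_ANY);

	return 0;
}

char LICENSE[] SEC("license") = "GPL";

User space could then dump the counters with something like
"bpftool map dump name entry_count".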

Thanks,
Leo
