* [PATCH v3 0/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
@ 2022-03-07 11:59 Like Xu
  2022-03-07 11:59 ` [PATCH v3 1/4] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Like Xu @ 2022-03-07 11:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, kvm, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Joerg Roedel, linux-kernel

Hi,

This is a successor to the previous patch set [1] from Jason Baron, which
converts kvm_pmu_ops to use static_call. A typical perf use case [2] for
an Intel guest shows good performance gains (results are in patch 0004).
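
For readers unfamiliar with static calls: static_call_update() patches
each call site from an indirect call into a direct call at runtime,
which avoids the pointer load and the retpoline-hardened indirect
branch. A minimal sketch of the pattern (the my_ops/my_flush names are
hypothetical, not code from this series):

#include <linux/static_call.h>

struct my_ops {
	void (*flush)(int cpu);
};

/* Define the key; the second argument only supplies the function type. */
DEFINE_STATIC_CALL_NULL(my_flush, *(((struct my_ops *)0)->flush));

static void my_setup(const struct my_ops *ops)
{
	/* Patch every static_call(my_flush) site into a direct call. */
	static_call_update(my_flush, ops->flush);
}

static void my_hot_path(int cpu)
{
	/* Emitted as a direct call, no pointer load plus indirect branch. */
	static_call(my_flush)(cpu);
}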

V2 -> V3 Changelog:
- Refine commit messages for __initdata; (Sean)
- Merge the logic of _defining_ and _updating_ the static calls; (Sean)
- Drop EXPORT_SYMBOL_GPL(kvm_pmu_ops); (Sean)
- Drop the _NULL() variant in the kvm-x86-*-ops.h; (Thanks to Paolo and Sean)
- Drop the export of kvm_pmu_is_valid_msr() for nVMX; (Thanks to Sean)
- Based on kvm/queue;

V1 -> V2 Changelog:
- Export kvm_pmu_is_valid_msr() for nVMX [Sean]
- Land memcpy() above kvm_ops_static_call_update() [Sean]
- Move the pmu_ops to kvm_x86_init_ops and tag it as __initdata. [Sean]
- Move the kvm_ops_static_call_update() to x86.c [Sean]
- Drop kvm_pmu_ops_static_call_update() [Sean]
- Fix WARNING that macros KVM_X86_OP should not use a trailing semicolon

Please note checkpatch.pl complains a lot about KVM_X86_*_OP macros:
- WARNING: macros should not use a trailing semicolon
- ERROR: Macros with multiple statements should be enclosed in a do - while loop
which could be addressed as a one-time follow-up if needed.
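
For reference, the do-while(0) idiom that checkpatch asks for would
look something like the sketch below. It is only legal where the macro
expands to statements (e.g. inside kvm_ops_static_call_update()); it
can't be used at file scope, where the same macros expand to
DECLARE/DEFINE invocations, and the semicolon-free invocation style in
kvm-x86-ops.h would need trailing semicolons, hence deferring it:

#define KVM_X86_OP(func)				\
do {							\
	WARN_ON(!kvm_x86_ops.func);			\
	__KVM_X86_OP(func);				\
} while (0)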

Previous:
https://lore.kernel.org/kvm/20211108111032.24457-1-likexu@tencent.com/

[1] https://lore.kernel.org/lkml/cover.1610680941.git.jbaron@akamai.com/
[2] perf record -e branch-instructions -e branch-misses \
-e cache-misses -e cache-references -e cpu-cycles \
-e instructions ./workload

Thanks,

Like Xu (4):
  KVM: x86: Move kvm_ops_static_call_update() to x86.c
  KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tag as __initdata
  KVM: x86: Use static calls to reduce kvm_pmu_ops overhead

 arch/x86/include/asm/kvm-x86-pmu-ops.h | 31 +++++++++++++++++
 arch/x86/include/asm/kvm_host.h        | 17 +--------
 arch/x86/kvm/pmu.c                     | 48 +++++++++++++++-----------
 arch/x86/kvm/pmu.h                     |  9 ++++-
 arch/x86/kvm/svm/pmu.c                 |  2 +-
 arch/x86/kvm/svm/svm.c                 |  2 +-
 arch/x86/kvm/vmx/pmu_intel.c           |  2 +-
 arch/x86/kvm/vmx/vmx.c                 |  2 +-
 arch/x86/kvm/x86.c                     | 23 ++++++++++++
 9 files changed, 94 insertions(+), 42 deletions(-)
 create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h

-- 
2.35.1



* [PATCH v3 1/4] KVM: x86: Move kvm_ops_static_call_update() to x86.c
  2022-03-07 11:59 [PATCH v3 0/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
@ 2022-03-07 11:59 ` Like Xu
  2022-03-29 22:59   ` Sean Christopherson
  2022-03-07 11:59 ` [PATCH v3 2/4] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Like Xu @ 2022-03-07 11:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, kvm, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Joerg Roedel, linux-kernel

From: Like Xu <likexu@tencent.com>

kvm_ops_static_call_update() is defined in kvm_host.h. That's
completely unnecessary, as it should have exactly one caller,
kvm_arch_hardware_setup().  As a prep step, move
kvm_ops_static_call_update() to x86.c so that subsequent patches can
reference kvm_pmu_ops from it.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm_host.h | 14 --------------
 arch/x86/kvm/x86.c              | 14 ++++++++++++++
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index da2f3a21e37b..fdb62aba73ef 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1541,20 +1541,6 @@ extern struct kvm_x86_ops kvm_x86_ops;
 #define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 
-static inline void kvm_ops_static_call_update(void)
-{
-#define __KVM_X86_OP(func) \
-	static_call_update(kvm_x86_##func, kvm_x86_ops.func);
-#define KVM_X86_OP(func) \
-	WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
-#define KVM_X86_OP_OPTIONAL __KVM_X86_OP
-#define KVM_X86_OP_OPTIONAL_RET0(func) \
-	static_call_update(kvm_x86_##func, (void *)kvm_x86_ops.func ? : \
-					   (void *)__static_call_return0);
-#include <asm/kvm-x86-ops.h>
-#undef __KVM_X86_OP
-}
-
 #define __KVM_HAVE_ARCH_VM_ALLOC
 static inline struct kvm *kvm_arch_alloc_vm(void)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f79bf4552082..7b4e84d80b57 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11507,6 +11507,20 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
+static inline void kvm_ops_static_call_update(void)
+{
+#define __KVM_X86_OP(func) \
+	static_call_update(kvm_x86_##func, kvm_x86_ops.func);
+#define KVM_X86_OP(func) \
+	WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
+#define KVM_X86_OP_OPTIONAL __KVM_X86_OP
+#define KVM_X86_OP_OPTIONAL_RET0(func) \
+	static_call_update(kvm_x86_##func, (void *)kvm_x86_ops.func ? : \
+					   (void *)__static_call_return0);
+#include <asm/kvm-x86-ops.h>
+#undef __KVM_X86_OP
+}
+
 int kvm_arch_hardware_setup(void *opaque)
 {
 	struct kvm_x86_init_ops *ops = opaque;
-- 
2.35.1



* [PATCH v3 2/4] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  2022-03-07 11:59 [PATCH v3 0/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
  2022-03-07 11:59 ` [PATCH v3 1/4] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
@ 2022-03-07 11:59 ` Like Xu
  2022-03-07 11:59 ` [PATCH v3 3/4] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tag as __initdata Like Xu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Like Xu @ 2022-03-07 11:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, kvm, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Joerg Roedel, linux-kernel

From: Like Xu <likexu@tencent.com>

Replace the kvm_pmu_ops pointer in common x86 with an instance of the
struct to save one pointer dereference when invoking functions. Copy the
struct by value to set the ops during kvm_init().
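
To illustrate the dereference being saved (a sketch of the two call
forms; the exact generated code will vary):

	/* Before: load kvm_x86_ops.pmu_ops, load ->refresh, then call. */
	kvm_x86_ops.pmu_ops->refresh(vcpu);

	/* After: load kvm_pmu_ops.refresh directly, then call. */
	kvm_pmu_ops.refresh(vcpu);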

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c | 44 +++++++++++++++++++++++---------------------
 arch/x86/kvm/pmu.h |  4 +++-
 arch/x86/kvm/x86.c |  1 +
 3 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index b1a02993782b..771edc4f4494 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -49,6 +49,8 @@
  *        * AMD:   [0 .. AMD64_NUM_COUNTERS-1] <=> gp counters
  */
 
+struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
+
 static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 {
 	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
@@ -215,7 +217,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |
 			  HSW_IN_TX_CHECKPOINTED))) {
-		config = kvm_x86_ops.pmu_ops->pmc_perf_hw_id(pmc);
+		config = kvm_pmu_ops.pmc_perf_hw_id(pmc);
 		if (config != PERF_COUNT_HW_MAX)
 			type = PERF_TYPE_HARDWARE;
 	}
@@ -267,7 +269,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 
 	pmc->current_config = (u64)ctrl;
 	pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
-			      kvm_x86_ops.pmu_ops->pmc_perf_hw_id(pmc),
+			      kvm_pmu_ops.pmc_perf_hw_id(pmc),
 			      !(en_field & 0x2), /* exclude user */
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
@@ -276,7 +278,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
 
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 {
-	struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, pmc_idx);
+	struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
 
 	if (!pmc)
 		return;
@@ -298,7 +300,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	int bit;
 
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
-		struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+		struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
 
 		if (unlikely(!pmc || !pmc->perf_event)) {
 			clear_bit(bit, pmu->reprogram_pmi);
@@ -320,7 +322,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 /* check if idx is a valid index to access PMU */
 bool kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
-	return kvm_x86_ops.pmu_ops->is_valid_rdpmc_ecx(vcpu, idx);
+	return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
 }
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -370,7 +372,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	if (is_vmware_backdoor_pmc(idx))
 		return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
 
-	pmc = kvm_x86_ops.pmu_ops->rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+	pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
 	if (!pmc)
 		return 1;
 
@@ -386,22 +388,22 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
 	if (lapic_in_kernel(vcpu)) {
-		if (kvm_x86_ops.pmu_ops->deliver_pmi)
-			kvm_x86_ops.pmu_ops->deliver_pmi(vcpu);
+		if (kvm_pmu_ops.deliver_pmi)
+			kvm_pmu_ops.deliver_pmi(vcpu);
 		kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
 	}
 }
 
 bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
-	return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
-		kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
+	return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
+		kvm_pmu_ops.is_valid_msr(vcpu, msr);
 }
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr);
+	struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
 
 	if (pmc)
 		__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -409,13 +411,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	return kvm_x86_ops.pmu_ops->get_msr(vcpu, msr_info);
+	return kvm_pmu_ops.get_msr(vcpu, msr_info);
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
-	return kvm_x86_ops.pmu_ops->set_msr(vcpu, msr_info);
+	return kvm_pmu_ops.set_msr(vcpu, msr_info);
 }
 
 /* refresh PMU settings. This function generally is called when underlying
@@ -424,7 +426,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  */
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
-	kvm_x86_ops.pmu_ops->refresh(vcpu);
+	kvm_pmu_ops.refresh(vcpu);
 }
 
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -432,7 +434,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	irq_work_sync(&pmu->irq_work);
-	kvm_x86_ops.pmu_ops->reset(vcpu);
+	kvm_pmu_ops.reset(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -440,7 +442,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	memset(pmu, 0, sizeof(*pmu));
-	kvm_x86_ops.pmu_ops->init(vcpu);
+	kvm_pmu_ops.init(vcpu);
 	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
 	pmu->event_count = 0;
 	pmu->need_cleanup = false;
@@ -472,14 +474,14 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 		      pmu->pmc_in_use, X86_PMC_IDX_MAX);
 
 	for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
-		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, i);
+		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
 
 		if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
 			pmc_stop_counter(pmc);
 	}
 
-	if (kvm_x86_ops.pmu_ops->cleanup)
-		kvm_x86_ops.pmu_ops->cleanup(vcpu);
+	if (kvm_pmu_ops.cleanup)
+		kvm_pmu_ops.cleanup(vcpu);
 
 	bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
 }
@@ -509,7 +511,7 @@ static inline bool eventsel_match_perf_hw_id(struct kvm_pmc *pmc,
 	unsigned int config;
 
 	pmc->eventsel &= (ARCH_PERFMON_EVENTSEL_EVENT | ARCH_PERFMON_EVENTSEL_UMASK);
-	config = kvm_x86_ops.pmu_ops->pmc_perf_hw_id(pmc);
+	config = kvm_pmu_ops.pmc_perf_hw_id(pmc);
 	pmc->eventsel = old_eventsel;
 	return config == perf_hw_id;
 }
@@ -537,7 +539,7 @@ void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id)
 	int i;
 
 	for_each_set_bit(i, pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX) {
-		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, i);
+		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
 
 		if (!pmc || !pmc_is_enabled(pmc) || !pmc_speculative_in_use(pmc))
 			continue;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7a7b8d5b775e..7032d3ebf8f4 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -17,6 +17,8 @@
 
 #define MAX_FIXED_COUNTERS	3
 
+extern struct kvm_pmu_ops kvm_pmu_ops;
+
 struct kvm_event_hw_type_mapping {
 	u8 eventsel;
 	u8 unit_mask;
@@ -90,7 +92,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
 
 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-	return kvm_x86_ops.pmu_ops->pmc_is_enabled(pmc);
+	return kvm_pmu_ops.pmc_is_enabled(pmc);
 }
 
 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7b4e84d80b57..dcaeedeef675 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11536,6 +11536,7 @@ int kvm_arch_hardware_setup(void *opaque)
 		return r;
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
+	memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
 	kvm_ops_static_call_update();
 
 	kvm_register_perf_callbacks(ops->handle_intel_pt_intr);
-- 
2.35.1



* [PATCH v3 3/4] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tag as __initdata
  2022-03-07 11:59 [PATCH v3 0/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
  2022-03-07 11:59 ` [PATCH v3 1/4] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
  2022-03-07 11:59 ` [PATCH v3 2/4] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
@ 2022-03-07 11:59 ` Like Xu
  2022-03-07 11:59 ` [PATCH v3 4/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
  2022-03-21 13:55 ` [PATCH v3 0/4] " Like Xu
  4 siblings, 0 replies; 9+ messages in thread
From: Like Xu @ 2022-03-07 11:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, kvm, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Joerg Roedel, linux-kernel

From: Like Xu <likexu@tencent.com>

Now that kvm_pmu_ops is copied by value during hardware setup, move the
source pmu_ops to kvm_x86_init_ops and tag it as __initdata. That'll
save those precious few bytes, and more importantly make the original
ops unreachable, i.e. make it harder to sneak in post-init modification
bugs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 3 +--
 arch/x86/kvm/svm/pmu.c          | 2 +-
 arch/x86/kvm/svm/svm.c          | 2 +-
 arch/x86/kvm/vmx/pmu_intel.c    | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 arch/x86/kvm/x86.c              | 2 +-
 6 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fdb62aba73ef..5d7297d1d71b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1448,8 +1448,6 @@ struct kvm_x86_ops {
 	int cpu_dirty_log_size;
 	void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
 
-	/* pmu operations of sub-arch */
-	const struct kvm_pmu_ops *pmu_ops;
 	const struct kvm_x86_nested_ops *nested_ops;
 
 	void (*vcpu_blocking)(struct kvm_vcpu *vcpu);
@@ -1520,6 +1518,7 @@ struct kvm_x86_init_ops {
 	unsigned int (*handle_intel_pt_intr)(void);
 
 	struct kvm_x86_ops *runtime_ops;
+	struct kvm_pmu_ops *pmu_ops;
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index d4de52409335..d4876e6708c5 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -320,7 +320,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 	}
 }
 
-struct kvm_pmu_ops amd_pmu_ops = {
+struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.pmc_perf_hw_id = amd_pmc_perf_hw_id,
 	.pmc_is_enabled = amd_pmc_is_enabled,
 	.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index fc5222a0f506..21d85c8929d5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4613,7 +4613,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 
 	.sched_in = svm_sched_in,
 
-	.pmu_ops = &amd_pmu_ops,
 	.nested_ops = &svm_nested_ops,
 
 	.deliver_interrupt = svm_deliver_interrupt,
@@ -4887,6 +4886,7 @@ static struct kvm_x86_init_ops svm_init_ops __initdata = {
 	.check_processor_compatibility = svm_check_processor_compat,
 
 	.runtime_ops = &svm_x86_ops,
+	.pmu_ops = &amd_pmu_ops,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4e5b1eeeb77c..2c783ad122b9 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -715,7 +715,7 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
 		intel_pmu_release_guest_lbr_event(vcpu);
 }
 
-struct kvm_pmu_ops intel_pmu_ops = {
+struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.pmc_perf_hw_id = intel_pmc_perf_hw_id,
 	.pmc_is_enabled = intel_pmc_is_enabled,
 	.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e8963f5af618..06088e26adae 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7823,7 +7823,6 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.cpu_dirty_log_size = PML_ENTITY_NUM,
 	.update_cpu_dirty_logging = vmx_update_cpu_dirty_logging,
 
-	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
 
 	.pi_update_irte = vmx_pi_update_irte,
@@ -8078,6 +8077,7 @@ static struct kvm_x86_init_ops vmx_init_ops __initdata = {
 	.handle_intel_pt_intr = NULL,
 
 	.runtime_ops = &vmx_x86_ops,
+	.pmu_ops = &intel_pmu_ops,
 };
 
 static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dcaeedeef675..0a76f7281e74 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11536,7 +11536,7 @@ int kvm_arch_hardware_setup(void *opaque)
 		return r;
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
-	memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
+	memcpy(&kvm_pmu_ops, ops->pmu_ops, sizeof(kvm_pmu_ops));
 	kvm_ops_static_call_update();
 
 	kvm_register_perf_callbacks(ops->handle_intel_pt_intr);
-- 
2.35.1



* [PATCH v3 4/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
  2022-03-07 11:59 [PATCH v3 0/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
                   ` (2 preceding siblings ...)
  2022-03-07 11:59 ` [PATCH v3 3/4] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tag as __initdata Like Xu
@ 2022-03-07 11:59 ` Like Xu
  2022-03-29 23:48   ` Sean Christopherson
  2022-03-21 13:55 ` [PATCH v3 0/4] " Like Xu
  4 siblings, 1 reply; 9+ messages in thread
From: Like Xu @ 2022-03-07 11:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, kvm, Jim Mattson, Wanpeng Li,
	Vitaly Kuznetsov, Joerg Roedel, linux-kernel

From: Like Xu <likexu@tencent.com>

Use static calls to improve kvm_pmu_ops performance. Introduce the
definitions that will be used by a subsequent patch to actualize the
savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
definition of static calls. This header is also intended to be
used to simplify the definition of amd_pmu_ops and intel_pmu_ops.
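
Roughly, each KVM_X86_PMU_OP(refresh) entry in the new header expands
as follows at its three include sites (a sketch assembled from the
hunks below):

	/* pmu.h: declare the static call. */
	DECLARE_STATIC_CALL(kvm_x86_pmu_refresh,
			    *(((struct kvm_pmu_ops *)0)->refresh));

	/* pmu.c: define the static call, initially patched to a NOP. */
	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_refresh,
				*(((struct kvm_pmu_ops *)0)->refresh));

	/* x86.c: point it at the vendor implementation during setup. */
	WARN_ON(!kvm_pmu_ops.refresh);
	static_call_update(kvm_x86_pmu_refresh, kvm_pmu_ops.refresh);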

Here are the worst-case fenced_rdtsc() cycle counts for the kvm_pmu_ops
functions that are called most often (up to seven-digit call counts) when running
a single perf test case in a guest on an ICX 2.70GHz host (mitigations=on):

		|	legacy	|	static call
------------------------------------------------------------
.pmc_idx_to_pmc	|	1304840	|	994872 (+23%)
.pmc_is_enabled	|	978670	|	1011750 (-3%)
.msr_idx_to_pmc	|	47828	|	41690 (+12%)
.is_valid_msr	|	28786	|	30108 (-4%)
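
For context, fenced_rdtsc() is presumably a serialized TSC read along
the lines of the sketch below (an assumption; the exact helper is not
part of this series):

static inline u64 fenced_rdtsc(void)
{
	unsigned long low, high;

	/* The lfences keep rdtsc from reordering around the measured code. */
	asm volatile("lfence; rdtsc; lfence" : "=a" (low), "=d" (high));

	return low | (high << 32);
}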

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm-x86-pmu-ops.h | 31 +++++++++++++++++
 arch/x86/kvm/pmu.c                     | 46 ++++++++++++++------------
 arch/x86/kvm/pmu.h                     |  7 +++-
 arch/x86/kvm/x86.c                     |  8 +++++
 4 files changed, 70 insertions(+), 22 deletions(-)
 create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
new file mode 100644
index 000000000000..fdfd8e06fee6
--- /dev/null
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_OPTIONAL)
+BUILD_BUG_ON(1)
+#endif
+
+/*
+ * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_OPTIONAL() are used to help generate
+ * both DECLARE/DEFINE_STATIC_CALL() invocations and
+ * "static_call_update()" calls.
+ *
+ * KVM_X86_PMU_OP_OPTIONAL() can be used for those functions that can have
+ * a NULL definition, for example if "static_call_cond()" will be used
+ * at the call sites.
+ */
+KVM_X86_PMU_OP(pmc_perf_hw_id)
+KVM_X86_PMU_OP(pmc_is_enabled)
+KVM_X86_PMU_OP(pmc_idx_to_pmc)
+KVM_X86_PMU_OP(rdpmc_ecx_to_pmc)
+KVM_X86_PMU_OP(msr_idx_to_pmc)
+KVM_X86_PMU_OP(is_valid_rdpmc_ecx)
+KVM_X86_PMU_OP(is_valid_msr)
+KVM_X86_PMU_OP(get_msr)
+KVM_X86_PMU_OP(set_msr)
+KVM_X86_PMU_OP(refresh)
+KVM_X86_PMU_OP(init)
+KVM_X86_PMU_OP(reset)
+KVM_X86_PMU_OP_OPTIONAL(deliver_pmi)
+KVM_X86_PMU_OP_OPTIONAL(cleanup)
+
+#undef KVM_X86_PMU_OP
+#undef KVM_X86_PMU_OP_OPTIONAL
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 771edc4f4494..4146be236506 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -51,6 +51,12 @@
 
 struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
 
+#define KVM_X86_PMU_OP(func)					     \
+	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func,			     \
+				*(((struct kvm_pmu_ops *)0)->func));
+#define KVM_X86_PMU_OP_OPTIONAL KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
 static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 {
 	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
@@ -217,7 +223,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |
 			  HSW_IN_TX_CHECKPOINTED))) {
-		config = kvm_pmu_ops.pmc_perf_hw_id(pmc);
+		config = static_call(kvm_x86_pmu_pmc_perf_hw_id)(pmc);
 		if (config != PERF_COUNT_HW_MAX)
 			type = PERF_TYPE_HARDWARE;
 	}
@@ -269,7 +275,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 
 	pmc->current_config = (u64)ctrl;
 	pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
-			      kvm_pmu_ops.pmc_perf_hw_id(pmc),
+			      static_call(kvm_x86_pmu_pmc_perf_hw_id)(pmc),
 			      !(en_field & 0x2), /* exclude user */
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
@@ -278,7 +284,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
 
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 {
-	struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
+	struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, pmc_idx);
 
 	if (!pmc)
 		return;
@@ -300,7 +306,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	int bit;
 
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
-		struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
+		struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, bit);
 
 		if (unlikely(!pmc || !pmc->perf_event)) {
 			clear_bit(bit, pmu->reprogram_pmi);
@@ -322,7 +328,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 /* check if idx is a valid index to access PMU */
 bool kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
-	return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
+	return static_call(kvm_x86_pmu_is_valid_rdpmc_ecx)(vcpu, idx);
 }
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -372,7 +378,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	if (is_vmware_backdoor_pmc(idx))
 		return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
 
-	pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+	pmc = static_call(kvm_x86_pmu_rdpmc_ecx_to_pmc)(vcpu, idx, &mask);
 	if (!pmc)
 		return 1;
 
@@ -388,22 +394,21 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
 	if (lapic_in_kernel(vcpu)) {
-		if (kvm_pmu_ops.deliver_pmi)
-			kvm_pmu_ops.deliver_pmi(vcpu);
+		static_call_cond(kvm_x86_pmu_deliver_pmi)(vcpu);
 		kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
 	}
 }
 
 bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
-	return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
-		kvm_pmu_ops.is_valid_msr(vcpu, msr);
+	return static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr) ||
+		static_call(kvm_x86_pmu_is_valid_msr)(vcpu, msr);
 }
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
+	struct kvm_pmc *pmc = static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr);
 
 	if (pmc)
 		__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -411,13 +416,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	return kvm_pmu_ops.get_msr(vcpu, msr_info);
+	return static_call(kvm_x86_pmu_get_msr)(vcpu, msr_info);
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
-	return kvm_pmu_ops.set_msr(vcpu, msr_info);
+	return static_call(kvm_x86_pmu_set_msr)(vcpu, msr_info);
 }
 
 /* refresh PMU settings. This function generally is called when underlying
@@ -426,7 +431,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  */
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
-	kvm_pmu_ops.refresh(vcpu);
+	static_call(kvm_x86_pmu_refresh)(vcpu);
 }
 
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -434,7 +439,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	irq_work_sync(&pmu->irq_work);
-	kvm_pmu_ops.reset(vcpu);
+	static_call(kvm_x86_pmu_reset)(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -442,7 +447,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	memset(pmu, 0, sizeof(*pmu));
-	kvm_pmu_ops.init(vcpu);
+	static_call(kvm_x86_pmu_init)(vcpu);
 	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
 	pmu->event_count = 0;
 	pmu->need_cleanup = false;
@@ -474,14 +479,13 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 		      pmu->pmc_in_use, X86_PMC_IDX_MAX);
 
 	for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
-		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
+		pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);
 
 		if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
 			pmc_stop_counter(pmc);
 	}
 
-	if (kvm_pmu_ops.cleanup)
-		kvm_pmu_ops.cleanup(vcpu);
+	static_call_cond(kvm_x86_pmu_cleanup)(vcpu);
 
 	bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
 }
@@ -511,7 +515,7 @@ static inline bool eventsel_match_perf_hw_id(struct kvm_pmc *pmc,
 	unsigned int config;
 
 	pmc->eventsel &= (ARCH_PERFMON_EVENTSEL_EVENT | ARCH_PERFMON_EVENTSEL_UMASK);
-	config = kvm_pmu_ops.pmc_perf_hw_id(pmc);
+	config = static_call(kvm_x86_pmu_pmc_perf_hw_id)(pmc);
 	pmc->eventsel = old_eventsel;
 	return config == perf_hw_id;
 }
@@ -539,7 +543,7 @@ void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id)
 	int i;
 
 	for_each_set_bit(i, pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX) {
-		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
+		pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);
 
 		if (!pmc || !pmc_is_enabled(pmc) || !pmc_speculative_in_use(pmc))
 			continue;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7032d3ebf8f4..9ccdfb4e838e 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -43,6 +43,11 @@ struct kvm_pmu_ops {
 	void (*cleanup)(struct kvm_vcpu *vcpu);
 };
 
+#define KVM_X86_PMU_OP(func) \
+	DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func));
+#define KVM_X86_PMU_OP_OPTIONAL KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
@@ -92,7 +97,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
 
 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-	return kvm_pmu_ops.pmc_is_enabled(pmc);
+	return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
 }
 
 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0a76f7281e74..ffa0b219c13e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11519,6 +11519,14 @@ static inline void kvm_ops_static_call_update(void)
 					   (void *)__static_call_return0);
 #include <asm/kvm-x86-ops.h>
 #undef __KVM_X86_OP
+
+#define __KVM_X86_PMU_OP(func) \
+	static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func);
+#define KVM_X86_PMU_OP(func) \
+	WARN_ON(!kvm_pmu_ops.func); __KVM_X86_PMU_OP(func)
+#define KVM_X86_PMU_OP_OPTIONAL __KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+#undef __KVM_X86_PMU_OP
 }
 
 int kvm_arch_hardware_setup(void *opaque)
-- 
2.35.1



* Re: [PATCH v3 0/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
  2022-03-07 11:59 [PATCH v3 0/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
                   ` (3 preceding siblings ...)
  2022-03-07 11:59 ` [PATCH v3 4/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
@ 2022-03-21 13:55 ` Like Xu
  4 siblings, 0 replies; 9+ messages in thread
From: Like Xu @ 2022-03-21 13:55 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Jim Mattson, Wanpeng Li, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

Knock knock, do we have any more comments on this patch set?

On 7/3/2022 7:59 pm, Like Xu wrote:
> Hi,
> 
> This is a successor to the previous patch set [1] from Jason Baron, which
> converts kvm_pmu_ops to use static_call. A typical perf use case [2] for
> an Intel guest shows good performance gains (results are in patch 0004).
> 
> V2 -> V3 Changelog:
> - Refine commit messages for __initdata; (Sean)
> - Merge the logic of _defining_ and _updating_ the static calls; (Sean)
> - Drop EXPORT_SYMBOL_GPL(kvm_pmu_ops); (Sean)
> - Drop the _NULL() variant in the kvm-x86-*-ops.h; (Thanks to Paolo and Sean)
> - Drop the export of kvm_pmu_is_valid_msr() for nVMX; (Thanks to Sean)
> - Based on kvm/queue;
> 
> V1 -> V2 Changelog:
> - Export kvm_pmu_is_valid_msr() for nVMX [Sean]
> - Land memcpy() above kvm_ops_static_call_update() [Sean]
> - Move the pmu_ops to kvm_x86_init_ops and tag it as __initdata. [Sean]
> - Move the kvm_ops_static_call_update() to x86.c [Sean]
> - Drop kvm_pmu_ops_static_call_update() [Sean]
> - Fix WARNING that macros KVM_X86_OP should not use a trailing semicolon
> 
> Please note checkpatch.pl complains a lot about KVM_X86_*_OP macros:
> - WARNING: macros should not use a trailing semicolon
> - ERROR: Macros with multiple statements should be enclosed in a do - while loop
> which could be addressed as a one-time follow-up if needed.
> 
> Previous:
> https://lore.kernel.org/kvm/20211108111032.24457-1-likexu@tencent.com/
> 
> [1] https://lore.kernel.org/lkml/cover.1610680941.git.jbaron@akamai.com/
> [2] perf record -e branch-instructions -e branch-misses \
> -e cache-misses -e cache-references -e cpu-cycles \
> -e instructions ./workload
> 
> Thanks,
> 
> Like Xu (4):
>    KVM: x86: Move kvm_ops_static_call_update() to x86.c
>    KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
>    KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tag as __initdata
>    KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
> 
>   arch/x86/include/asm/kvm-x86-pmu-ops.h | 31 +++++++++++++++++
>   arch/x86/include/asm/kvm_host.h        | 17 +--------
>   arch/x86/kvm/pmu.c                     | 48 +++++++++++++++-----------
>   arch/x86/kvm/pmu.h                     |  9 ++++-
>   arch/x86/kvm/svm/pmu.c                 |  2 +-
>   arch/x86/kvm/svm/svm.c                 |  2 +-
>   arch/x86/kvm/vmx/pmu_intel.c           |  2 +-
>   arch/x86/kvm/vmx/vmx.c                 |  2 +-
>   arch/x86/kvm/x86.c                     | 23 ++++++++++++
>   9 files changed, 94 insertions(+), 42 deletions(-)
>   create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h
> 


* Re: [PATCH v3 1/4] KVM: x86: Move kvm_ops_static_call_update() to x86.c
  2022-03-07 11:59 ` [PATCH v3 1/4] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
@ 2022-03-29 22:59   ` Sean Christopherson
  2022-03-29 23:26     ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2022-03-29 22:59 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Jim Mattson, Wanpeng Li, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

On Mon, Mar 07, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> kvm_ops_static_call_update() is defined in kvm_host.h. That's
> completely unnecessary, as it should have exactly one caller,
> kvm_arch_hardware_setup().  As a prep step, move
> kvm_ops_static_call_update() to x86.c so that subsequent patches can
> reference kvm_pmu_ops from it.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com>


* Re: [PATCH v3 1/4] KVM: x86: Move kvm_ops_static_call_update() to x86.c
  2022-03-29 22:59   ` Sean Christopherson
@ 2022-03-29 23:26     ` Sean Christopherson
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2022-03-29 23:26 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Jim Mattson, Wanpeng Li, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

On Tue, Mar 29, 2022, Sean Christopherson wrote:
> On Mon, Mar 07, 2022, Like Xu wrote:
> > From: Like Xu <likexu@tencent.com>
> > 
> > kvm_ops_static_call_update() is defined in kvm_host.h. That's
> > completely unnecessary, as it should have exactly one caller,
> > kvm_arch_hardware_setup().  As a prep step, move
> > kvm_ops_static_call_update() to x86.c so that subsequent patches can
> > reference kvm_pmu_ops from it.
> > 
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Like Xu <likexu@tencent.com>
> > ---
> 
> Reviewed-by: Sean Christopherson <seanjc@google.com>

Actually, I take that back.  If we pass in @ops and do some other minor tweaks
along the way, then we can make kvm_pmu_ops static.

I'll post a very compile-tested-only v3.1; trying to generate diffs against your
series is going to be painful due to conflicts.  The changes aren't big, just
annoying.

Below is the diff for this patch.  Then in patch 2, kvm_ops_update() adds a call
to kvm_pmu_ops_update().  Patch 3 just tweaks the call to use ops->pmu_ops instead
of ops->runtime_ops->pmu_ops.  Patch 4 becomes purely code shuffling (I think).

---
 arch/x86/kvm/x86.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9cb8672aab92..99aa2d16845a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11595,8 +11595,10 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }

-static inline void kvm_ops_static_call_update(void)
+static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
 {
+	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
+
 #define __KVM_X86_OP(func) \
 	static_call_update(kvm_x86_##func, kvm_x86_ops.func);
 #define KVM_X86_OP(func) \
@@ -11623,8 +11625,7 @@ int kvm_arch_hardware_setup(void *opaque)
 	if (r != 0)
 		return r;

-	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
-	kvm_ops_static_call_update();
+	kvm_ops_update(ops);

 	kvm_register_perf_callbacks(ops->handle_intel_pt_intr);


base-commit: bd6b09f0754bea388a189d544ce11d83206579a2
--



* Re: [PATCH v3 4/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
  2022-03-07 11:59 ` [PATCH v3 4/4] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
@ 2022-03-29 23:48   ` Sean Christopherson
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2022-03-29 23:48 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Jim Mattson, Wanpeng Li, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

On Mon, Mar 07, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Use static calls to improve kvm_pmu_ops performance. Introduce the
> definitions that will be used by a subsequent patch to actualize the
> savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
> definition of static calls. This header is also intended to be
> used to simplify the definition of amd_pmu_ops and intel_pmu_ops.

This is stale; there are no subsequent patches, and I think we decided not to
fill vendor ops with the ops.h headers.  I'll tweak it when sending v3.1.

