linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/7] Use static_call for kvm_pmu_ops
@ 2021-11-08 11:10 Like Xu
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

Hi,

This is a successor to a previous patch set from Jason Baron [1]. It converts
kvm_pmu_ops to use static_call, which shows good performance gains for
a typical perf use case [2] in the guest (results in patch 7/7).
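
For readers new to static_call, the conversion pattern used throughout the
series boils down to the following sketch (simplified from the later patches;
the key and member names are the real ones, but this is not a complete,
buildable excerpt):

/* Define the static_call key, typed after the kvm_pmu_ops member. */
DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_refresh,
			*(((struct kvm_pmu_ops *)0)->refresh));

/* During hardware setup, patch in the vendor implementation. */
static_call_update(kvm_x86_pmu_refresh, kvm_pmu_ops.refresh);

/* Callers then use a patchable direct call instead of an indirect one. */
static_call(kvm_x86_pmu_refresh)(vcpu);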

V1 -> V2 Changelog:
- Export kvm_pmu_is_valid_msr() for nVMX [Sean]
- Land memcpy() above kvm_ops_static_call_update() [Sean]
- Move the pmu_ops to kvm_x86_init_ops and tagged as __initdata. [Sean]
- Move the kvm_ops_static_call_update() to x86.c [Sean]
- Drop kvm_pmu_ops_static_call_update() [Sean]
- Fix WARNING that macros KVM_X86_OP should not use a trailing semicolon

Previous:
https://lore.kernel.org/kvm/20211103070310.43380-1-likexu@tencent.com/

[1] https://lore.kernel.org/lkml/cover.1610680941.git.jbaron@akamai.com/
[2] perf record -e branch-instructions -e branch-misses \
-e cache-misses -e cache-references -e cpu-cycles \
-e instructions ./workload

Thanks,

Like Xu (7):
  KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
  KVM: x86: Fix WARNING that macros should not use a trailing semicolon
  KVM: x86: Move kvm_ops_static_call_update() to x86.c
  KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
  KVM: x86: Introduce definitions to support static calls for
    kvm_pmu_ops
  KVM: x86: Use static calls to reduce kvm_pmu_ops overhead

 arch/x86/include/asm/kvm-x86-ops.h     | 218 ++++++++++++-------------
 arch/x86/include/asm/kvm-x86-pmu-ops.h |  32 ++++
 arch/x86/include/asm/kvm_host.h        |  14 +-
 arch/x86/kvm/pmu.c                     |  46 +++---
 arch/x86/kvm/pmu.h                     |   9 +-
 arch/x86/kvm/svm/pmu.c                 |   2 +-
 arch/x86/kvm/svm/svm.c                 |   2 +-
 arch/x86/kvm/vmx/nested.c              |   2 +-
 arch/x86/kvm/vmx/pmu_intel.c           |   2 +-
 arch/x86/kvm/vmx/vmx.c                 |   2 +-
 arch/x86/kvm/x86.c                     |  16 +-
 11 files changed, 199 insertions(+), 146 deletions(-)
 create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h

-- 
2.33.0



* [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 17:55   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon Like Xu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Let's export kvm_pmu_is_valid_msr() for nVMX,  instead of
exporting all kvm_pmu_ops for this one case. The reduced access
scope will help to optimize the kvm_x86_ops.pmu_ops stuff later.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c        | 1 +
 arch/x86/kvm/vmx/nested.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0772bad9165c..aa6ac9c7aff2 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -398,6 +398,7 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
 		kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
 }
+EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b4ee5e9f9e20..6c6bc8b2072a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4796,7 +4796,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 		return;
 
 	vmx = to_vmx(vcpu);
-	if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+	if (kvm_pmu_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
 		vmx->nested.msrs.entry_ctls_high |=
 				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 		vmx->nested.msrs.exit_ctls_high |=
-- 
2.33.0



* [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-11-08 11:10 ` [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

The scripts/checkpatch.pl complains about the macro KVM_X86_OP in this way:

WARNING: macros should not use a trailing semicolon
+#define KVM_X86_OP(func) \
+	DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
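
For context, the warning exists because a semicolon baked into a macro body
produces an extra empty statement at the expansion site, which can detach an
"else" from its "if".  A hypothetical illustration (not code from this patch):

#define BAD_HELPER(x)	do_thing(x);	/* trailing ';' in the expansion */

static void do_thing(int v) { (void)v; }
static void do_other(int v) { (void)v; }

static void demo(int flag, int v)
{
	if (flag)
		BAD_HELPER(v);	/* expands to "do_thing(v);;" -- the extra ';' */
	else			/* is an empty statement, so this "else" no    */
		do_other(v);	/* longer pairs with the "if"; the build fails */
}

Hence the semicolons are moved from the KVM_X86_OP() macro bodies to the
invocation sites in kvm-x86-ops.h.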

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm-x86-ops.h | 218 ++++++++++++++---------------
 arch/x86/include/asm/kvm_host.h    |   4 +-
 arch/x86/kvm/x86.c                 |   2 +-
 3 files changed, 112 insertions(+), 112 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index cefe1d81e2e8..8f1035cc2c06 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -12,115 +12,115 @@ BUILD_BUG_ON(1)
  * case where there is no definition or a function name that
  * doesn't match the typical naming convention is supplied.
  */
-KVM_X86_OP_NULL(hardware_enable)
-KVM_X86_OP_NULL(hardware_disable)
-KVM_X86_OP_NULL(hardware_unsetup)
-KVM_X86_OP_NULL(cpu_has_accelerated_tpr)
-KVM_X86_OP(has_emulated_msr)
-KVM_X86_OP(vcpu_after_set_cpuid)
-KVM_X86_OP(vm_init)
-KVM_X86_OP_NULL(vm_destroy)
-KVM_X86_OP(vcpu_create)
-KVM_X86_OP(vcpu_free)
-KVM_X86_OP(vcpu_reset)
-KVM_X86_OP(prepare_guest_switch)
-KVM_X86_OP(vcpu_load)
-KVM_X86_OP(vcpu_put)
-KVM_X86_OP(update_exception_bitmap)
-KVM_X86_OP(get_msr)
-KVM_X86_OP(set_msr)
-KVM_X86_OP(get_segment_base)
-KVM_X86_OP(get_segment)
-KVM_X86_OP(get_cpl)
-KVM_X86_OP(set_segment)
-KVM_X86_OP_NULL(get_cs_db_l_bits)
-KVM_X86_OP(set_cr0)
-KVM_X86_OP(is_valid_cr4)
-KVM_X86_OP(set_cr4)
-KVM_X86_OP(set_efer)
-KVM_X86_OP(get_idt)
-KVM_X86_OP(set_idt)
-KVM_X86_OP(get_gdt)
-KVM_X86_OP(set_gdt)
-KVM_X86_OP(sync_dirty_debug_regs)
-KVM_X86_OP(set_dr7)
-KVM_X86_OP(cache_reg)
-KVM_X86_OP(get_rflags)
-KVM_X86_OP(set_rflags)
-KVM_X86_OP(tlb_flush_all)
-KVM_X86_OP(tlb_flush_current)
-KVM_X86_OP_NULL(tlb_remote_flush)
-KVM_X86_OP_NULL(tlb_remote_flush_with_range)
-KVM_X86_OP(tlb_flush_gva)
-KVM_X86_OP(tlb_flush_guest)
-KVM_X86_OP(run)
-KVM_X86_OP_NULL(handle_exit)
-KVM_X86_OP_NULL(skip_emulated_instruction)
-KVM_X86_OP_NULL(update_emulated_instruction)
-KVM_X86_OP(set_interrupt_shadow)
-KVM_X86_OP(get_interrupt_shadow)
-KVM_X86_OP(patch_hypercall)
-KVM_X86_OP(set_irq)
-KVM_X86_OP(set_nmi)
-KVM_X86_OP(queue_exception)
-KVM_X86_OP(cancel_injection)
-KVM_X86_OP(interrupt_allowed)
-KVM_X86_OP(nmi_allowed)
-KVM_X86_OP(get_nmi_mask)
-KVM_X86_OP(set_nmi_mask)
-KVM_X86_OP(enable_nmi_window)
-KVM_X86_OP(enable_irq_window)
-KVM_X86_OP(update_cr8_intercept)
-KVM_X86_OP(check_apicv_inhibit_reasons)
-KVM_X86_OP(refresh_apicv_exec_ctrl)
-KVM_X86_OP(hwapic_irr_update)
-KVM_X86_OP(hwapic_isr_update)
-KVM_X86_OP_NULL(guest_apic_has_interrupt)
-KVM_X86_OP(load_eoi_exitmap)
-KVM_X86_OP(set_virtual_apic_mode)
-KVM_X86_OP_NULL(set_apic_access_page_addr)
-KVM_X86_OP(deliver_posted_interrupt)
-KVM_X86_OP_NULL(sync_pir_to_irr)
-KVM_X86_OP(set_tss_addr)
-KVM_X86_OP(set_identity_map_addr)
-KVM_X86_OP(get_mt_mask)
-KVM_X86_OP(load_mmu_pgd)
-KVM_X86_OP_NULL(has_wbinvd_exit)
-KVM_X86_OP(get_l2_tsc_offset)
-KVM_X86_OP(get_l2_tsc_multiplier)
-KVM_X86_OP(write_tsc_offset)
-KVM_X86_OP(write_tsc_multiplier)
-KVM_X86_OP(get_exit_info)
-KVM_X86_OP(check_intercept)
-KVM_X86_OP(handle_exit_irqoff)
-KVM_X86_OP_NULL(request_immediate_exit)
-KVM_X86_OP(sched_in)
-KVM_X86_OP_NULL(update_cpu_dirty_logging)
-KVM_X86_OP_NULL(pre_block)
-KVM_X86_OP_NULL(post_block)
-KVM_X86_OP_NULL(vcpu_blocking)
-KVM_X86_OP_NULL(vcpu_unblocking)
-KVM_X86_OP_NULL(update_pi_irte)
-KVM_X86_OP_NULL(start_assignment)
-KVM_X86_OP_NULL(apicv_post_state_restore)
-KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
-KVM_X86_OP_NULL(set_hv_timer)
-KVM_X86_OP_NULL(cancel_hv_timer)
-KVM_X86_OP(setup_mce)
-KVM_X86_OP(smi_allowed)
-KVM_X86_OP(enter_smm)
-KVM_X86_OP(leave_smm)
-KVM_X86_OP(enable_smi_window)
-KVM_X86_OP_NULL(mem_enc_op)
-KVM_X86_OP_NULL(mem_enc_reg_region)
-KVM_X86_OP_NULL(mem_enc_unreg_region)
-KVM_X86_OP(get_msr_feature)
-KVM_X86_OP(can_emulate_instruction)
-KVM_X86_OP(apic_init_signal_blocked)
-KVM_X86_OP_NULL(enable_direct_tlbflush)
-KVM_X86_OP_NULL(migrate_timers)
-KVM_X86_OP(msr_filter_changed)
-KVM_X86_OP_NULL(complete_emulated_msr)
+KVM_X86_OP_NULL(hardware_enable);
+KVM_X86_OP_NULL(hardware_disable);
+KVM_X86_OP_NULL(hardware_unsetup);
+KVM_X86_OP_NULL(cpu_has_accelerated_tpr);
+KVM_X86_OP(has_emulated_msr);
+KVM_X86_OP(vcpu_after_set_cpuid);
+KVM_X86_OP(vm_init);
+KVM_X86_OP_NULL(vm_destroy);
+KVM_X86_OP(vcpu_create);
+KVM_X86_OP(vcpu_free);
+KVM_X86_OP(vcpu_reset);
+KVM_X86_OP(prepare_guest_switch);
+KVM_X86_OP(vcpu_load);
+KVM_X86_OP(vcpu_put);
+KVM_X86_OP(update_exception_bitmap);
+KVM_X86_OP(get_msr);
+KVM_X86_OP(set_msr);
+KVM_X86_OP(get_segment_base);
+KVM_X86_OP(get_segment);
+KVM_X86_OP(get_cpl);
+KVM_X86_OP(set_segment);
+KVM_X86_OP_NULL(get_cs_db_l_bits);
+KVM_X86_OP(set_cr0);
+KVM_X86_OP(is_valid_cr4);
+KVM_X86_OP(set_cr4);
+KVM_X86_OP(set_efer);
+KVM_X86_OP(get_idt);
+KVM_X86_OP(set_idt);
+KVM_X86_OP(get_gdt);
+KVM_X86_OP(set_gdt);
+KVM_X86_OP(sync_dirty_debug_regs);
+KVM_X86_OP(set_dr7);
+KVM_X86_OP(cache_reg);
+KVM_X86_OP(get_rflags);
+KVM_X86_OP(set_rflags);
+KVM_X86_OP(tlb_flush_all);
+KVM_X86_OP(tlb_flush_current);
+KVM_X86_OP_NULL(tlb_remote_flush);
+KVM_X86_OP_NULL(tlb_remote_flush_with_range);
+KVM_X86_OP(tlb_flush_gva);
+KVM_X86_OP(tlb_flush_guest);
+KVM_X86_OP(run);
+KVM_X86_OP_NULL(handle_exit);
+KVM_X86_OP_NULL(skip_emulated_instruction);
+KVM_X86_OP_NULL(update_emulated_instruction);
+KVM_X86_OP(set_interrupt_shadow);
+KVM_X86_OP(get_interrupt_shadow);
+KVM_X86_OP(patch_hypercall);
+KVM_X86_OP(set_irq);
+KVM_X86_OP(set_nmi);
+KVM_X86_OP(queue_exception);
+KVM_X86_OP(cancel_injection);
+KVM_X86_OP(interrupt_allowed);
+KVM_X86_OP(nmi_allowed);
+KVM_X86_OP(get_nmi_mask);
+KVM_X86_OP(set_nmi_mask);
+KVM_X86_OP(enable_nmi_window);
+KVM_X86_OP(enable_irq_window);
+KVM_X86_OP(update_cr8_intercept);
+KVM_X86_OP(check_apicv_inhibit_reasons);
+KVM_X86_OP(refresh_apicv_exec_ctrl);
+KVM_X86_OP(hwapic_irr_update);
+KVM_X86_OP(hwapic_isr_update);
+KVM_X86_OP_NULL(guest_apic_has_interrupt);
+KVM_X86_OP(load_eoi_exitmap);
+KVM_X86_OP(set_virtual_apic_mode);
+KVM_X86_OP_NULL(set_apic_access_page_addr);
+KVM_X86_OP(deliver_posted_interrupt);
+KVM_X86_OP_NULL(sync_pir_to_irr);
+KVM_X86_OP(set_tss_addr);
+KVM_X86_OP(set_identity_map_addr);
+KVM_X86_OP(get_mt_mask);
+KVM_X86_OP(load_mmu_pgd);
+KVM_X86_OP_NULL(has_wbinvd_exit);
+KVM_X86_OP(get_l2_tsc_offset);
+KVM_X86_OP(get_l2_tsc_multiplier);
+KVM_X86_OP(write_tsc_offset);
+KVM_X86_OP(write_tsc_multiplier);
+KVM_X86_OP(get_exit_info);
+KVM_X86_OP(check_intercept);
+KVM_X86_OP(handle_exit_irqoff);
+KVM_X86_OP_NULL(request_immediate_exit);
+KVM_X86_OP(sched_in);
+KVM_X86_OP_NULL(update_cpu_dirty_logging);
+KVM_X86_OP_NULL(pre_block);
+KVM_X86_OP_NULL(post_block);
+KVM_X86_OP_NULL(vcpu_blocking);
+KVM_X86_OP_NULL(vcpu_unblocking);
+KVM_X86_OP_NULL(update_pi_irte);
+KVM_X86_OP_NULL(start_assignment);
+KVM_X86_OP_NULL(apicv_post_state_restore);
+KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt);
+KVM_X86_OP_NULL(set_hv_timer);
+KVM_X86_OP_NULL(cancel_hv_timer);
+KVM_X86_OP(setup_mce);
+KVM_X86_OP(smi_allowed);
+KVM_X86_OP(enter_smm);
+KVM_X86_OP(leave_smm);
+KVM_X86_OP(enable_smi_window);
+KVM_X86_OP_NULL(mem_enc_op);
+KVM_X86_OP_NULL(mem_enc_reg_region);
+KVM_X86_OP_NULL(mem_enc_unreg_region);
+KVM_X86_OP(get_msr_feature);
+KVM_X86_OP(can_emulate_instruction);
+KVM_X86_OP(apic_init_signal_blocked);
+KVM_X86_OP_NULL(enable_direct_tlbflush);
+KVM_X86_OP_NULL(migrate_timers);
+KVM_X86_OP(msr_filter_changed);
+KVM_X86_OP_NULL(complete_emulated_msr);
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_NULL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88fce6ab4bbd..c2a4a362f3e2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1532,14 +1532,14 @@ extern bool __read_mostly enable_apicv;
 extern struct kvm_x86_ops kvm_x86_ops;
 
 #define KVM_X86_OP(func) \
-	DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
+	DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func))
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 
 static inline void kvm_ops_static_call_update(void)
 {
 #define KVM_X86_OP(func) \
-	static_call_update(kvm_x86_##func, kvm_x86_ops.func);
+	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..775051070627 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -125,7 +125,7 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
 
 #define KVM_X86_OP(func)					     \
 	DEFINE_STATIC_CALL_NULL(kvm_x86_##func,			     \
-				*(((struct kvm_x86_ops *)0)->func));
+				*(((struct kvm_x86_ops *)0)->func))
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 EXPORT_STATIC_CALL_GPL(kvm_x86_get_cs_db_l_bits);
-- 
2.33.0



* [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
  2021-11-08 11:10 ` [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

kvm_ops_static_call_update() is currently defined in kvm_host.h. That's
completely unnecessary, as it should have exactly one caller,
kvm_arch_hardware_setup(). As a prep patch, move
kvm_ops_static_call_update() to x86.c so that it can reference the
kvm_pmu_ops stuff.
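
For reference, after the move each expansion pass over kvm-x86-ops.h still
boils down to one update per op, e.g. (illustrative expansion, not new code):

	static_call_update(kvm_x86_get_msr, kvm_x86_ops.get_msr);
	static_call_update(kvm_x86_set_msr, kvm_x86_ops.set_msr);
	/* ... one line for every op listed in kvm-x86-ops.h ... */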

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm_host.h | 8 --------
 arch/x86/kvm/x86.c              | 8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c2a4a362f3e2..c2d4ee2973c5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1536,14 +1536,6 @@ extern struct kvm_x86_ops kvm_x86_ops;
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 
-static inline void kvm_ops_static_call_update(void)
-{
-#define KVM_X86_OP(func) \
-	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
-#define KVM_X86_OP_NULL KVM_X86_OP
-#include <asm/kvm-x86-ops.h>
-}
-
 #define __KVM_HAVE_ARCH_VM_ALLOC
 static inline struct kvm *kvm_arch_alloc_vm(void)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 775051070627..0aee0a078d6f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11300,6 +11300,14 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
+static inline void kvm_ops_static_call_update(void)
+{
+#define KVM_X86_OP(func) \
+	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
+#define KVM_X86_OP_NULL KVM_X86_OP
+#include <asm/kvm-x86-ops.h>
+}
+
 int kvm_arch_hardware_setup(void *opaque)
 {
 	struct kvm_x86_init_ops *ops = opaque;
-- 
2.33.0



* [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (2 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 18:10   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Replace the kvm_pmu_ops pointer in common x86 with an instance of the
struct to save one pointer dereference when invoking functions. Copy the
struct by value to set the ops during kvm_init().
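
As a rough userspace analogy of the indirection being removed (a sketch for
illustration only, not kernel code):

#include <string.h>

struct pmu_ops { int (*get_msr)(int msr); };

static int vendor_get_msr(int msr) { return msr; }
static const struct pmu_ops vendor_ops = { .get_msr = vendor_get_msr };

static const struct pmu_ops *ops_ptr;	/* old: pointer to the vendor struct  */
static struct pmu_ops ops_copy;		/* new: copied by value at setup time */

static void setup(void)
{
	ops_ptr = &vendor_ops;
	memcpy(&ops_copy, &vendor_ops, sizeof(ops_copy));
}

/* Old style: load ops_ptr, load ->get_msr, then make the indirect call. */
static int call_via_pointer(int msr) { return ops_ptr->get_msr(msr); }

/* New style: only the function pointer itself is loaded before the call. */
static int call_via_copy(int msr) { return ops_copy.get_msr(msr); }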

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c | 41 ++++++++++++++++++++++-------------------
 arch/x86/kvm/pmu.h |  4 +++-
 arch/x86/kvm/x86.c |  1 +
 3 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index aa6ac9c7aff2..353989bf0102 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -47,6 +47,9 @@
  *        * AMD:   [0 .. AMD64_NUM_COUNTERS-1] <=> gp counters
  */
 
+struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
+EXPORT_SYMBOL_GPL(kvm_pmu_ops);
+
 static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 {
 	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
@@ -214,7 +217,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |
 			  HSW_IN_TX_CHECKPOINTED))) {
-		config = kvm_x86_ops.pmu_ops->find_arch_event(pmc_to_pmu(pmc),
+		config = kvm_pmu_ops.find_arch_event(pmc_to_pmu(pmc),
 						      event_select,
 						      unit_mask);
 		if (config != PERF_COUNT_HW_MAX)
@@ -268,7 +271,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 
 	pmc->current_config = (u64)ctrl;
 	pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
-			      kvm_x86_ops.pmu_ops->find_fixed_event(idx),
+			      kvm_pmu_ops.find_fixed_event(idx),
 			      !(en_field & 0x2), /* exclude user */
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
@@ -277,7 +280,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
 
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 {
-	struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, pmc_idx);
+	struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
 
 	if (!pmc)
 		return;
@@ -299,7 +302,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	int bit;
 
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
-		struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+		struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
 
 		if (unlikely(!pmc || !pmc->perf_event)) {
 			clear_bit(bit, pmu->reprogram_pmi);
@@ -321,7 +324,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 /* check if idx is a valid index to access PMU */
 int kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
-	return kvm_x86_ops.pmu_ops->is_valid_rdpmc_ecx(vcpu, idx);
+	return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
 }
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -371,7 +374,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	if (is_vmware_backdoor_pmc(idx))
 		return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
 
-	pmc = kvm_x86_ops.pmu_ops->rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+	pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
 	if (!pmc)
 		return 1;
 
@@ -387,23 +390,23 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
 	if (lapic_in_kernel(vcpu)) {
-		if (kvm_x86_ops.pmu_ops->deliver_pmi)
-			kvm_x86_ops.pmu_ops->deliver_pmi(vcpu);
+		if (kvm_pmu_ops.deliver_pmi)
+			kvm_pmu_ops.deliver_pmi(vcpu);
 		kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
 	}
 }
 
 bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
-	return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
-		kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
+	return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
+		kvm_pmu_ops.is_valid_msr(vcpu, msr);
 }
 EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr);
+	struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
 
 	if (pmc)
 		__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -411,13 +414,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	return kvm_x86_ops.pmu_ops->get_msr(vcpu, msr_info);
+	return kvm_pmu_ops.get_msr(vcpu, msr_info);
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
-	return kvm_x86_ops.pmu_ops->set_msr(vcpu, msr_info);
+	return kvm_pmu_ops.set_msr(vcpu, msr_info);
 }
 
 /* refresh PMU settings. This function generally is called when underlying
@@ -426,7 +429,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  */
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
-	kvm_x86_ops.pmu_ops->refresh(vcpu);
+	kvm_pmu_ops.refresh(vcpu);
 }
 
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -434,7 +437,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	irq_work_sync(&pmu->irq_work);
-	kvm_x86_ops.pmu_ops->reset(vcpu);
+	kvm_pmu_ops.reset(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -442,7 +445,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	memset(pmu, 0, sizeof(*pmu));
-	kvm_x86_ops.pmu_ops->init(vcpu);
+	kvm_pmu_ops.init(vcpu);
 	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
 	pmu->event_count = 0;
 	pmu->need_cleanup = false;
@@ -474,14 +477,14 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 		      pmu->pmc_in_use, X86_PMC_IDX_MAX);
 
 	for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
-		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, i);
+		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
 
 		if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
 			pmc_stop_counter(pmc);
 	}
 
-	if (kvm_x86_ops.pmu_ops->cleanup)
-		kvm_x86_ops.pmu_ops->cleanup(vcpu);
+	if (kvm_pmu_ops.cleanup)
+		kvm_pmu_ops.cleanup(vcpu);
 
 	bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
 }
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0e4f2b1fa9fb..b2fe135d395a 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -17,6 +17,8 @@
 
 #define MAX_FIXED_COUNTERS	3
 
+extern struct kvm_pmu_ops kvm_pmu_ops;
+
 struct kvm_event_hw_type_mapping {
 	u8 eventsel;
 	u8 unit_mask;
@@ -92,7 +94,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
 
 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-	return kvm_x86_ops.pmu_ops->pmc_is_enabled(pmc);
+	return kvm_pmu_ops.pmc_is_enabled(pmc);
 }
 
 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0aee0a078d6f..ca9a76abb6ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11323,6 +11323,7 @@ int kvm_arch_hardware_setup(void *opaque)
 		return r;
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
+	memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
 	kvm_ops_static_call_update();
 
 	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-- 
2.33.0



* [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (3 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 18:17   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
  2021-11-08 11:10 ` [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

The pmu_ops should be moved to kvm_x86_init_ops and tagged as __initdata.
That'll save those precious few bytes, and more importantly make
the original ops unreachable, i.e. make it harder to sneak in post-init
modification bugs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm_host.h | 4 ++--
 arch/x86/kvm/svm/pmu.c          | 2 +-
 arch/x86/kvm/svm/svm.c          | 2 +-
 arch/x86/kvm/vmx/pmu_intel.c    | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 arch/x86/kvm/x86.c              | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c2d4ee2973c5..00760a3ac88c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1436,8 +1436,7 @@ struct kvm_x86_ops {
 	int cpu_dirty_log_size;
 	void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
 
-	/* pmu operations of sub-arch */
-	const struct kvm_pmu_ops *pmu_ops;
+	/* nested operations of sub-arch */
 	const struct kvm_x86_nested_ops *nested_ops;
 
 	/*
@@ -1516,6 +1515,7 @@ struct kvm_x86_init_ops {
 	int (*hardware_setup)(void);
 
 	struct kvm_x86_ops *runtime_ops;
+	struct kvm_pmu_ops *pmu_ops;
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index fdf587f19c5f..4554cbc3820c 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -319,7 +319,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 	}
 }
 
-struct kvm_pmu_ops amd_pmu_ops = {
+struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.find_arch_event = amd_find_arch_event,
 	.find_fixed_event = amd_find_fixed_event,
 	.pmc_is_enabled = amd_pmc_is_enabled,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 21bb81710e0f..8834d7d2b991 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4681,7 +4681,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 
 	.sched_in = svm_sched_in,
 
-	.pmu_ops = &amd_pmu_ops,
 	.nested_ops = &svm_nested_ops,
 
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
@@ -4717,6 +4716,7 @@ static struct kvm_x86_init_ops svm_init_ops __initdata = {
 	.check_processor_compatibility = svm_check_processor_compat,
 
 	.runtime_ops = &svm_x86_ops,
+	.pmu_ops = &amd_pmu_ops,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b8e0d21b7c8a..c0b905d032c8 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -703,7 +703,7 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
 		intel_pmu_release_guest_lbr_event(vcpu);
 }
 
-struct kvm_pmu_ops intel_pmu_ops = {
+struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.find_arch_event = intel_find_arch_event,
 	.find_fixed_event = intel_find_fixed_event,
 	.pmc_is_enabled = intel_pmc_is_enabled,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 71f54d85f104..ce787d2e8e08 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7680,7 +7680,6 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.pre_block = vmx_pre_block,
 	.post_block = vmx_post_block,
 
-	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
 
 	.update_pi_irte = pi_update_irte,
@@ -7922,6 +7921,7 @@ static struct kvm_x86_init_ops vmx_init_ops __initdata = {
 	.hardware_setup = hardware_setup,
 
 	.runtime_ops = &vmx_x86_ops,
+	.pmu_ops = &intel_pmu_ops,
 };
 
 static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca9a76abb6ba..70dc8f41329c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11323,7 +11323,7 @@ int kvm_arch_hardware_setup(void *opaque)
 		return r;
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
-	memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
+	memcpy(&kvm_pmu_ops, ops->pmu_ops, sizeof(kvm_pmu_ops));
 	kvm_ops_static_call_update();
 
 	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-- 
2.33.0



* [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (4 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 18:35   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Use static calls to improve kvm_pmu_ops performance. Introduce the
definitions that will be used by a subsequent patch to actualize the
savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
definition of static calls. This header is also intended to be
used to simplify the definition of amd_pmu_ops and intel_pmu_ops.

Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
static calls in a similar manner for insignificant but not
negligible performance impact, especially on older models.
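
The new header reuses the X-macro idea already used by kvm-x86-ops.h: one
list of op names that is expanded under different definitions of the
KVM_X86_PMU_OP() macro.  A tiny self-contained userspace analogue of the
pattern (not the kernel code, which re-#includes a header instead of using a
list macro):

#include <stdio.h>

/* Stand-in for kvm-x86-pmu-ops.h: the single list of op names. */
#define PMU_OP_LIST(OP)	\
	OP(refresh)	\
	OP(reset)

/* First expansion: one stub per op (cf. DEFINE_STATIC_CALL_NULL). */
#define DEFINE_STUB(name) static void pmu_##name(void) { puts(#name); }
PMU_OP_LIST(DEFINE_STUB)
#undef DEFINE_STUB

int main(void)
{
	/* Second expansion: one call per op (cf. static_call_update). */
#define CALL_STUB(name) pmu_##name();
	PMU_OP_LIST(CALL_STUB)
#undef CALL_STUB
	return 0;
}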

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm-x86-pmu-ops.h | 32 ++++++++++++++++++++++++++
 arch/x86/kvm/pmu.c                     |  6 +++++
 arch/x86/kvm/pmu.h                     |  5 ++++
 3 files changed, 43 insertions(+)
 create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
new file mode 100644
index 000000000000..b7713b16d21d
--- /dev/null
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
+BUILD_BUG_ON(1)
+#endif
+
+/*
+ * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to
+ * help generate "static_call()"s. They are also intended for use when defining
+ * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used
+ * for those functions that follow the [amd|intel]_func_name convention.
+ * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the
+ * case where there is no definition or a function name that
+ * doesn't match the typical naming convention is supplied.
+ */
+KVM_X86_PMU_OP(find_arch_event);
+KVM_X86_PMU_OP(find_fixed_event);
+KVM_X86_PMU_OP(pmc_is_enabled);
+KVM_X86_PMU_OP(pmc_idx_to_pmc);
+KVM_X86_PMU_OP(rdpmc_ecx_to_pmc);
+KVM_X86_PMU_OP(msr_idx_to_pmc);
+KVM_X86_PMU_OP(is_valid_rdpmc_ecx);
+KVM_X86_PMU_OP(is_valid_msr);
+KVM_X86_PMU_OP(get_msr);
+KVM_X86_PMU_OP(set_msr);
+KVM_X86_PMU_OP(refresh);
+KVM_X86_PMU_OP(init);
+KVM_X86_PMU_OP(reset);
+KVM_X86_PMU_OP_NULL(deliver_pmi);
+KVM_X86_PMU_OP_NULL(cleanup);
+
+#undef KVM_X86_PMU_OP
+#undef KVM_X86_PMU_OP_NULL
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 353989bf0102..bfdd9f2bc0fa 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -50,6 +50,12 @@
 struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_pmu_ops);
 
+#define	KVM_X86_PMU_OP(func)	\
+	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func,	\
+				*(((struct kvm_pmu_ops *)0)->func))
+#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
 static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 {
 	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index b2fe135d395a..40e0b523637b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -45,6 +45,11 @@ struct kvm_pmu_ops {
 	void (*cleanup)(struct kvm_vcpu *vcpu);
 };
 
+#define	KVM_X86_PMU_OP(func)	\
+	DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
+#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-- 
2.33.0



* [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (5 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
@ 2021-11-08 11:10 ` Like Xu
  6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Convert kvm_pmu_ops to use static calls.

Here are the worst sched_clock() nanosecond numbers for the kvm_pmu_ops
functions that is most often called (up to 7 digits of calls) when running
a single perf test case in a guest on an ICX 2.70GHz host (mitigations=on):

		|	legacy	|	static call
------------------------------------------------------------
.pmc_idx_to_pmc	|	10946	|	10047 (8%)
.pmc_is_enabled	|	11291	|	11175 (1%)
.msr_idx_to_pmc	|	13526	|	12346 (8%)
.is_valid_msr	|	10895	|	10484 (3%)
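
Note that optional callbacks are converted to static_call_cond(), which for
void-returning ops makes the NULL check implicit; i.e. the two forms below
are intended to behave identically:

	if (kvm_pmu_ops.cleanup)
		kvm_pmu_ops.cleanup(vcpu);

	static_call_cond(kvm_x86_pmu_cleanup)(vcpu);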

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c | 36 +++++++++++++++++-------------------
 arch/x86/kvm/pmu.h |  2 +-
 arch/x86/kvm/x86.c |  5 +++++
 3 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index bfdd9f2bc0fa..c86ff3057e2c 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -223,7 +223,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |
 			  HSW_IN_TX_CHECKPOINTED))) {
-		config = kvm_pmu_ops.find_arch_event(pmc_to_pmu(pmc),
+		config = static_call(kvm_x86_pmu_find_arch_event)(pmc_to_pmu(pmc),
 						      event_select,
 						      unit_mask);
 		if (config != PERF_COUNT_HW_MAX)
@@ -277,7 +277,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 
 	pmc->current_config = (u64)ctrl;
 	pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
-			      kvm_pmu_ops.find_fixed_event(idx),
+			      static_call(kvm_x86_pmu_find_fixed_event)(idx),
 			      !(en_field & 0x2), /* exclude user */
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
@@ -286,7 +286,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
 
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 {
-	struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
+	struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, pmc_idx);
 
 	if (!pmc)
 		return;
@@ -308,7 +308,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	int bit;
 
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
-		struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
+		struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, bit);
 
 		if (unlikely(!pmc || !pmc->perf_event)) {
 			clear_bit(bit, pmu->reprogram_pmi);
@@ -330,7 +330,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 /* check if idx is a valid index to access PMU */
 int kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
-	return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
+	return static_call(kvm_x86_pmu_is_valid_rdpmc_ecx)(vcpu, idx);
 }
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -380,7 +380,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	if (is_vmware_backdoor_pmc(idx))
 		return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
 
-	pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+	pmc = static_call(kvm_x86_pmu_rdpmc_ecx_to_pmc)(vcpu, idx, &mask);
 	if (!pmc)
 		return 1;
 
@@ -396,23 +396,22 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
 	if (lapic_in_kernel(vcpu)) {
-		if (kvm_pmu_ops.deliver_pmi)
-			kvm_pmu_ops.deliver_pmi(vcpu);
+		static_call_cond(kvm_x86_pmu_deliver_pmi)(vcpu);
 		kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
 	}
 }
 
 bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
-	return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
-		kvm_pmu_ops.is_valid_msr(vcpu, msr);
+	return static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr) ||
+		static_call(kvm_x86_pmu_is_valid_msr)(vcpu, msr);
 }
 EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
+	struct kvm_pmc *pmc = static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr);
 
 	if (pmc)
 		__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -420,13 +419,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	return kvm_pmu_ops.get_msr(vcpu, msr_info);
+	return static_call(kvm_x86_pmu_get_msr)(vcpu, msr_info);
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
-	return kvm_pmu_ops.set_msr(vcpu, msr_info);
+	return static_call(kvm_x86_pmu_set_msr)(vcpu, msr_info);
 }
 
 /* refresh PMU settings. This function generally is called when underlying
@@ -435,7 +434,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  */
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
-	kvm_pmu_ops.refresh(vcpu);
+	static_call(kvm_x86_pmu_refresh)(vcpu);
 }
 
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -443,7 +442,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	irq_work_sync(&pmu->irq_work);
-	kvm_pmu_ops.reset(vcpu);
+	static_call(kvm_x86_pmu_reset)(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -451,7 +450,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	memset(pmu, 0, sizeof(*pmu));
-	kvm_pmu_ops.init(vcpu);
+	static_call(kvm_x86_pmu_init)(vcpu);
 	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
 	pmu->event_count = 0;
 	pmu->need_cleanup = false;
@@ -483,14 +482,13 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 		      pmu->pmc_in_use, X86_PMC_IDX_MAX);
 
 	for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
-		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
+		pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);
 
 		if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
 			pmc_stop_counter(pmc);
 	}
 
-	if (kvm_pmu_ops.cleanup)
-		kvm_pmu_ops.cleanup(vcpu);
+	static_call_cond(kvm_x86_pmu_cleanup)(vcpu);
 
 	bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
 }
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 40e0b523637b..a4bfd4200d67 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
 
 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-	return kvm_pmu_ops.pmc_is_enabled(pmc);
+	return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
 }
 
 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 70dc8f41329c..c5db444d5f4a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11306,6 +11306,11 @@ static inline void kvm_ops_static_call_update(void)
 	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
+
+#define	KVM_X86_PMU_OP(func)	\
+	static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func)
+#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
 }
 
 int kvm_arch_hardware_setup(void *opaque)
-- 
2.33.0



* Re: [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
@ 2021-12-08 17:55   ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 17:55 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

This is doing more than exporting a function; the export isn't even the focal
point of the patch.

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Let's export kvm_pmu_is_valid_msr() for nVMX,  instead of

Please wrap at ~75 chars.

> exporting all kvm_pmu_ops for this one case.

kvm_pmu_ops doesn't exist as of this patch, it comes later in the series.

> The reduced access scope will help to optimize the kvm_x86_ops.pmu_ops stuff
> later.

The changelog needs to explain why it's ok to add the msr_idx_to_pmc() check.

Something like:

  KVM: nVMX: Use kvm_pmu_is_valid_msr() to check for PERF_GLOBAL_CTRL support

  Use the generic kvm_pmu_is_valid_msr() helper when determining whether or not
  PERF_GLOBAL_CTRL is exposed to the guest and thus can be loaded on nested
  VM-Enter/VM-Exit.  The extra (indirect!) call to msr_idx_to_pmc() that comes
  with the helper is unnecessary, but harmless, as it's guaranteed to return
  false for MSR_CORE_PERF_GLOBAL_CTRL and this is already a very slow path.

  Using the helper will allow future code to use static_call() for the PMU ops
  without having to export any static_call definitions.

  Export kvm_pmu_is_valid_msr() as necessary.

All that said, looking at this whole thing again, I think I'd prefer:


From d217914c9897d9a2cfd01073284a933ace4709b7 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 8 Dec 2021 09:48:20 -0800
Subject: [PATCH] KVM: nVMX: Refactor PMU refresh to avoid referencing
 kvm_x86_ops.pmu_ops

Refactor the nested VMX PMU refresh helper to pass it a flag stating
whether or not the vCPU has PERF_GLOBAL_CTRL instead of having the nVMX
helper query the information by bouncing through kvm_x86_ops.pmu_ops.
This will allow a future patch to use static_call() for the PMU ops
without having to export any static call definitions from common x86.

Alternatively, nVMX could call kvm_pmu_is_valid_msr() to indirectly use
kvm_x86_ops.pmu_ops, but that would require exporting kvm_pmu_is_valid_msr()
and incur an extra layer of indirection.

Opportunistically rename the helper to keep line lengths somewhat
reasonable, and to better capture its high-level role.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c    | 5 +++--
 arch/x86/kvm/vmx/nested.h    | 3 ++-
 arch/x86/kvm/vmx/pmu_intel.c | 3 ++-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 08e785871985..c87a81071288 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4773,7 +4773,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 	return 0;
 }
 
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+			    bool vcpu_has_perf_global_ctrl)
 {
 	struct vcpu_vmx *vmx;
 
@@ -4781,7 +4782,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 		return;
 
 	vmx = to_vmx(vcpu);
-	if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+	if (vcpu_has_perf_global_ctrl) {
 		vmx->nested.msrs.entry_ctls_high |=
 				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 		vmx->nested.msrs.exit_ctls_high |=
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index b69a80f43b37..c92cea0b8ccc 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -32,7 +32,8 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
 int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 			u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+			    bool vcpu_has_perf_global_ctrl);
 void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
 bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
 				 int size);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 1b7456b2177b..7ca870d0249d 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -529,7 +529,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
-	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
+	nested_vmx_pmu_refresh(vcpu,
+			       intel_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL));
 
 	if (intel_pmu_lbr_is_compatible(vcpu))
 		x86_perf_get_lbr(&lbr_desc->records);
-- 
2.34.1.400.ga245620fadb-goog


* Re: [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
@ 2021-12-08 18:10   ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:10 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Replace the kvm_pmu_ops pointer in common x86 with an instance of the
> struct to save one pointer dereference when invoking functions. Copy the
> struct by value to set the ops during kvm_init().
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/pmu.c | 41 ++++++++++++++++++++++-------------------
>  arch/x86/kvm/pmu.h |  4 +++-
>  arch/x86/kvm/x86.c |  1 +
>  3 files changed, 26 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index aa6ac9c7aff2..353989bf0102 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -47,6 +47,9 @@
>   *        * AMD:   [0 .. AMD64_NUM_COUNTERS-1] <=> gp counters
>   */
>  
> +struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
> +EXPORT_SYMBOL_GPL(kvm_pmu_ops);

This export isn't necessary.


* Re: [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
  2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
@ 2021-12-08 18:17   ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:17 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

s/tagged/tag

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> The pmu_ops should be moved to kvm_x86_init_ops and tagged as __initdata.

State what the patch does, not what "should" be done.

  Now that pmu_ops is copied by value during kvm_arch_hardware_setup(), move
  the pointer to kvm_x86_init_ops and tag implementations as __initdata to make
  the implementations unreachable once KVM is loaded, e.g. to make it harder to
  sneak in post-init modification bugs.

> That'll save those precious few bytes, and more importantly make
> the original ops unreachable, i.e. make it harder to sneak in post-init
> modification bugs.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 4 ++--
>  arch/x86/kvm/svm/pmu.c          | 2 +-
>  arch/x86/kvm/svm/svm.c          | 2 +-
>  arch/x86/kvm/vmx/pmu_intel.c    | 2 +-
>  arch/x86/kvm/vmx/vmx.c          | 2 +-
>  arch/x86/kvm/x86.c              | 2 +-
>  6 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c2d4ee2973c5..00760a3ac88c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1436,8 +1436,7 @@ struct kvm_x86_ops {
>  	int cpu_dirty_log_size;
>  	void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
>  
> -	/* pmu operations of sub-arch */
> -	const struct kvm_pmu_ops *pmu_ops;
> +	/* nested operations of sub-arch */

No need for the new comment.

>  	const struct kvm_x86_nested_ops *nested_ops;
>  
>  	/*

Nits aside,

Reviewed-by: Sean Christopherson <seanjc@google.com>

I'd also be a-ok squashing this with the copy-by-value patch.


* Re: [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
  2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
@ 2021-12-08 18:35   ` Sean Christopherson
  2021-12-09 12:50     ` Like Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:35 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Use static calls to improve kvm_pmu_ops performance. Introduce the
> definitions that will be used by a subsequent patch to actualize the
> savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
> definition of static calls. This header is also intended to be
> used to simplify the definition of amd_pmu_ops and intel_pmu_ops.
> 
> Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
> static calls in a similar manner for insignificant but not
> negligible performance impact, especially on older models.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---

This absolutely shouldn't be separated from patch 7/7.  By _defining_ the static
calls but not providing the logic to actually _update_ the calls, it's entirely
possible to add static_call() invocations that will compile cleanly without any
chance of doing the right thing at runtime.  

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0236c1a953d0..804f98b5552e 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)

 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-       return kvm_pmu_ops.pmc_is_enabled(pmc);
+       return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
 }

 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,

> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
> +BUILD_BUG_ON(1)
> +#endif
> +
> +/*
> + * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to

Please use all 80 chars.

> + * help generate "static_call()"s. They are also intended for use when defining
> + * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used

AMD/Intel since this is referring to the vendor and not to function names (like
the below reference).

> + * for those functions that follow the [amd|intel]_func_name convention.
> + * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the

As below, please drop the _NULL() variant.

> + * case where there is no definition or a function name that
> + * doesn't match the typical naming convention is supplied.
> + */

...

> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 353989bf0102..bfdd9f2bc0fa 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -50,6 +50,12 @@
>  struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
>  EXPORT_SYMBOL_GPL(kvm_pmu_ops);
>  
> +#define	KVM_X86_PMU_OP(func)	\
> +	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func,	\
> +				*(((struct kvm_pmu_ops *)0)->func))
> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
> +#include <asm/kvm-x86-pmu-ops.h>
> +
>  static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
>  {
>  	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index b2fe135d395a..40e0b523637b 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -45,6 +45,11 @@ struct kvm_pmu_ops {
>  	void (*cleanup)(struct kvm_vcpu *vcpu);
>  };
>  
> +#define	KVM_X86_PMU_OP(func)	\
> +	DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP

I don't want to proliferate the pointless and bitrot-prone KVM_X86_OP_NULL macro,
just omit this.  I'll send a patch to drop KVM_X86_OP_NULL.

> +#include <asm/kvm-x86-pmu-ops.h>
> +
>  static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
>  {
>  	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> -- 
> 2.33.0
> 


* Re: [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
  2021-12-08 18:35   ` Sean Christopherson
@ 2021-12-09 12:50     ` Like Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-12-09 12:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

Hi Sean,

On 9/12/2021 2:35 am, Sean Christopherson wrote:
> On Mon, Nov 08, 2021, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> Use static calls to improve kvm_pmu_ops performance. Introduce the
>> definitions that will be used by a subsequent patch to actualize the
>> savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
>> definition of static calls. This header is also intended to be
>> used to simplify the definition of amd_pmu_ops and intel_pmu_ops.
>>
>> Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
>> static calls in a similar manner for insignificant but not
>> negligible performance impact, especially on older models.
>>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
> 
> This absolutely shouldn't be separated from patch 7/7.  By _defining_ the static
> calls but not providing the logic to actually _update_ the calls, it's entirely
> possible to add static_call() invocations that will compile cleanly without any
> chance of doing the right thing at runtime.
> 
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 0236c1a953d0..804f98b5552e 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
> 
>   static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
>   {
> -       return kvm_pmu_ops.pmc_is_enabled(pmc);
> +       return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
>   }
> 
>   static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
> 
>> @@ -0,0 +1,32 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
>> +BUILD_BUG_ON(1)
>> +#endif
>> +
>> +/*
>> + * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to
> 
> Please use all 80 chars.
> 
>> + * help generate "static_call()"s. They are also intended for use when defining
>> + * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used
> 
> AMD/Intel since this is referring to the vendor and not to function names (like
> the below reference).
> 
>> + * for those functions that follow the [amd|intel]_func_name convention.
>> + * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the
> 
> As below, please drop the _NULL() variant.
> 
>> + * case where there is no definition or a function name that
>> + * doesn't match the typical naming convention is supplied.
>> + */
> 
> ...
> 
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 353989bf0102..bfdd9f2bc0fa 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -50,6 +50,12 @@
>>   struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
>>   EXPORT_SYMBOL_GPL(kvm_pmu_ops);
>>   
>> +#define	KVM_X86_PMU_OP(func)	\
>> +	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func,	\
>> +				*(((struct kvm_pmu_ops *)0)->func))
>> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
>> +#include <asm/kvm-x86-pmu-ops.h>
>> +
>>   static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
>>   {
>>   	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
>> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
>> index b2fe135d395a..40e0b523637b 100644
>> --- a/arch/x86/kvm/pmu.h
>> +++ b/arch/x86/kvm/pmu.h
>> @@ -45,6 +45,11 @@ struct kvm_pmu_ops {
>>   	void (*cleanup)(struct kvm_vcpu *vcpu);
>>   };
>>   
>> +#define	KVM_X86_PMU_OP(func)	\
>> +	DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
>> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
> 
> I don't want to proliferate the pointless and bitrot-prone KVM_X86_OP_NULL macro,
> just omit this.  I'll send a patch to drop KVM_X86_OP_NULL.

Thanks for your clear comments on this patch set.

I will send out V3 once KVM_X86_OP_NULL is dropped and the comment
in arch/x86/include/asm/kvm-x86-ops.h is updated.

> 
>> +#include <asm/kvm-x86-pmu-ops.h>
>> +
>>   static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
>>   {
>>   	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>> -- 
>> 2.33.0
>>
> 


end of thread

Thread overview: 13+ messages
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
2021-12-08 17:55   ` Sean Christopherson
2021-11-08 11:10 ` [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon Like Xu
2021-11-08 11:10 ` [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
2021-12-08 18:10   ` Sean Christopherson
2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
2021-12-08 18:17   ` Sean Christopherson
2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
2021-12-08 18:35   ` Sean Christopherson
2021-12-09 12:50     ` Like Xu
2021-11-08 11:10 ` [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
