* [PATCH v2 0/7] Use static_call for kvm_pmu_ops
@ 2021-11-08 11:10 Like Xu
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

Hi,

This is a successor to a previous patch set from Jason Baron [1]. Let's convert
kvm_pmu_ops to use static_call(). It shows good performance gains for
a typical perf use case [2] in the guest (results in patch 7/7).
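
For readers less familiar with static_call(): it turns an ops-table callback
into a patched direct call, avoiding the indirect branch (and any retpoline)
on every invocation. A rough sketch of the pattern, using made-up names
(my_op, struct my_ops, vendor_ops) purely for illustration:

  /* define the call site, initially with no target */
  DEFINE_STATIC_CALL_NULL(my_op, *(((struct my_ops *)0)->op));

  /* patch in the real callback once, e.g. at module init */
  static_call_update(my_op, vendor_ops.op);

  /* invoke: a direct call, no pointer chase through an ops struct */
  static_call(my_op)(arg);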

V1 -> V2 Changelog:
- Export kvm_pmu_is_valid_msr() for nVMX [Sean]
- Land memcpy() above kvm_ops_static_call_update() [Sean]
- Move .pmu_ops to kvm_x86_init_ops and tag it as __initdata [Sean]
- Move the kvm_ops_static_call_update() to x86.c [Sean]
- Drop kvm_pmu_ops_static_call_update() [Sean]
- Fix the checkpatch WARNING that the KVM_X86_OP macros should not use a trailing semicolon

Previous:
https://lore.kernel.org/kvm/20211103070310.43380-1-likexu@tencent.com/

[1] https://lore.kernel.org/lkml/cover.1610680941.git.jbaron@akamai.com/
[2] perf record -e branch-instructions -e branch-misses \
-e cache-misses -e cache-references -e cpu-cycles \
-e instructions ./workload

Thanks,

Like Xu (7):
  KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
  KVM: x86: Fix WARNING that macros should not use a trailing semicolon
  KVM: x86: Move kvm_ops_static_call_update() to x86.c
  KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
  KVM: x86: Introduce definitions to support static calls for
    kvm_pmu_ops
  KVM: x86: Use static calls to reduce kvm_pmu_ops overhead

 arch/x86/include/asm/kvm-x86-ops.h     | 218 ++++++++++++-------------
 arch/x86/include/asm/kvm-x86-pmu-ops.h |  32 ++++
 arch/x86/include/asm/kvm_host.h        |  14 +-
 arch/x86/kvm/pmu.c                     |  46 +++---
 arch/x86/kvm/pmu.h                     |   9 +-
 arch/x86/kvm/svm/pmu.c                 |   2 +-
 arch/x86/kvm/svm/svm.c                 |   2 +-
 arch/x86/kvm/vmx/nested.c              |   2 +-
 arch/x86/kvm/vmx/pmu_intel.c           |   2 +-
 arch/x86/kvm/vmx/vmx.c                 |   2 +-
 arch/x86/kvm/x86.c                     |  16 +-
 11 files changed, 199 insertions(+), 146 deletions(-)
 create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h

-- 
2.33.0



* [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 17:55   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon Like Xu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Let's export kvm_pmu_is_valid_msr() for nVMX,  instead of
exporting all kvm_pmu_ops for this one case. The reduced access
scope will help to optimize the kvm_x86_ops.pmu_ops stuff later.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c        | 1 +
 arch/x86/kvm/vmx/nested.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0772bad9165c..aa6ac9c7aff2 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -398,6 +398,7 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
 		kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
 }
+EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b4ee5e9f9e20..6c6bc8b2072a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4796,7 +4796,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 		return;
 
 	vmx = to_vmx(vcpu);
-	if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+	if (kvm_pmu_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
 		vmx->nested.msrs.entry_ctls_high |=
 				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 		vmx->nested.msrs.exit_ctls_high |=
-- 
2.33.0



* [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-11-08 11:10 ` [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

scripts/checkpatch.pl complains about the KVM_X86_OP macro as follows:

WARNING: macros should not use a trailing semicolon
+#define KVM_X86_OP(func) \
+	DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
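
For context, the hazard checkpatch guards against looks roughly like the
hypothetical snippet below (not from this patch; here the macro expands to
file-scope declarations, so the practical fallout is only stray empty
statements, but the rule is applied uniformly):

  #define FROB(x) do_frob(x);     /* trailing ';' baked into the macro */

  if (cond)
          FROB(a);                /* expands to "do_frob(a);;" ... */
  else                            /* ... so this 'else' no longer pairs with the 'if' */
          undo_frob(a);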

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm-x86-ops.h | 218 ++++++++++++++---------------
 arch/x86/include/asm/kvm_host.h    |   4 +-
 arch/x86/kvm/x86.c                 |   2 +-
 3 files changed, 112 insertions(+), 112 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index cefe1d81e2e8..8f1035cc2c06 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -12,115 +12,115 @@ BUILD_BUG_ON(1)
  * case where there is no definition or a function name that
  * doesn't match the typical naming convention is supplied.
  */
-KVM_X86_OP_NULL(hardware_enable)
-KVM_X86_OP_NULL(hardware_disable)
-KVM_X86_OP_NULL(hardware_unsetup)
-KVM_X86_OP_NULL(cpu_has_accelerated_tpr)
-KVM_X86_OP(has_emulated_msr)
-KVM_X86_OP(vcpu_after_set_cpuid)
-KVM_X86_OP(vm_init)
-KVM_X86_OP_NULL(vm_destroy)
-KVM_X86_OP(vcpu_create)
-KVM_X86_OP(vcpu_free)
-KVM_X86_OP(vcpu_reset)
-KVM_X86_OP(prepare_guest_switch)
-KVM_X86_OP(vcpu_load)
-KVM_X86_OP(vcpu_put)
-KVM_X86_OP(update_exception_bitmap)
-KVM_X86_OP(get_msr)
-KVM_X86_OP(set_msr)
-KVM_X86_OP(get_segment_base)
-KVM_X86_OP(get_segment)
-KVM_X86_OP(get_cpl)
-KVM_X86_OP(set_segment)
-KVM_X86_OP_NULL(get_cs_db_l_bits)
-KVM_X86_OP(set_cr0)
-KVM_X86_OP(is_valid_cr4)
-KVM_X86_OP(set_cr4)
-KVM_X86_OP(set_efer)
-KVM_X86_OP(get_idt)
-KVM_X86_OP(set_idt)
-KVM_X86_OP(get_gdt)
-KVM_X86_OP(set_gdt)
-KVM_X86_OP(sync_dirty_debug_regs)
-KVM_X86_OP(set_dr7)
-KVM_X86_OP(cache_reg)
-KVM_X86_OP(get_rflags)
-KVM_X86_OP(set_rflags)
-KVM_X86_OP(tlb_flush_all)
-KVM_X86_OP(tlb_flush_current)
-KVM_X86_OP_NULL(tlb_remote_flush)
-KVM_X86_OP_NULL(tlb_remote_flush_with_range)
-KVM_X86_OP(tlb_flush_gva)
-KVM_X86_OP(tlb_flush_guest)
-KVM_X86_OP(run)
-KVM_X86_OP_NULL(handle_exit)
-KVM_X86_OP_NULL(skip_emulated_instruction)
-KVM_X86_OP_NULL(update_emulated_instruction)
-KVM_X86_OP(set_interrupt_shadow)
-KVM_X86_OP(get_interrupt_shadow)
-KVM_X86_OP(patch_hypercall)
-KVM_X86_OP(set_irq)
-KVM_X86_OP(set_nmi)
-KVM_X86_OP(queue_exception)
-KVM_X86_OP(cancel_injection)
-KVM_X86_OP(interrupt_allowed)
-KVM_X86_OP(nmi_allowed)
-KVM_X86_OP(get_nmi_mask)
-KVM_X86_OP(set_nmi_mask)
-KVM_X86_OP(enable_nmi_window)
-KVM_X86_OP(enable_irq_window)
-KVM_X86_OP(update_cr8_intercept)
-KVM_X86_OP(check_apicv_inhibit_reasons)
-KVM_X86_OP(refresh_apicv_exec_ctrl)
-KVM_X86_OP(hwapic_irr_update)
-KVM_X86_OP(hwapic_isr_update)
-KVM_X86_OP_NULL(guest_apic_has_interrupt)
-KVM_X86_OP(load_eoi_exitmap)
-KVM_X86_OP(set_virtual_apic_mode)
-KVM_X86_OP_NULL(set_apic_access_page_addr)
-KVM_X86_OP(deliver_posted_interrupt)
-KVM_X86_OP_NULL(sync_pir_to_irr)
-KVM_X86_OP(set_tss_addr)
-KVM_X86_OP(set_identity_map_addr)
-KVM_X86_OP(get_mt_mask)
-KVM_X86_OP(load_mmu_pgd)
-KVM_X86_OP_NULL(has_wbinvd_exit)
-KVM_X86_OP(get_l2_tsc_offset)
-KVM_X86_OP(get_l2_tsc_multiplier)
-KVM_X86_OP(write_tsc_offset)
-KVM_X86_OP(write_tsc_multiplier)
-KVM_X86_OP(get_exit_info)
-KVM_X86_OP(check_intercept)
-KVM_X86_OP(handle_exit_irqoff)
-KVM_X86_OP_NULL(request_immediate_exit)
-KVM_X86_OP(sched_in)
-KVM_X86_OP_NULL(update_cpu_dirty_logging)
-KVM_X86_OP_NULL(pre_block)
-KVM_X86_OP_NULL(post_block)
-KVM_X86_OP_NULL(vcpu_blocking)
-KVM_X86_OP_NULL(vcpu_unblocking)
-KVM_X86_OP_NULL(update_pi_irte)
-KVM_X86_OP_NULL(start_assignment)
-KVM_X86_OP_NULL(apicv_post_state_restore)
-KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
-KVM_X86_OP_NULL(set_hv_timer)
-KVM_X86_OP_NULL(cancel_hv_timer)
-KVM_X86_OP(setup_mce)
-KVM_X86_OP(smi_allowed)
-KVM_X86_OP(enter_smm)
-KVM_X86_OP(leave_smm)
-KVM_X86_OP(enable_smi_window)
-KVM_X86_OP_NULL(mem_enc_op)
-KVM_X86_OP_NULL(mem_enc_reg_region)
-KVM_X86_OP_NULL(mem_enc_unreg_region)
-KVM_X86_OP(get_msr_feature)
-KVM_X86_OP(can_emulate_instruction)
-KVM_X86_OP(apic_init_signal_blocked)
-KVM_X86_OP_NULL(enable_direct_tlbflush)
-KVM_X86_OP_NULL(migrate_timers)
-KVM_X86_OP(msr_filter_changed)
-KVM_X86_OP_NULL(complete_emulated_msr)
+KVM_X86_OP_NULL(hardware_enable);
+KVM_X86_OP_NULL(hardware_disable);
+KVM_X86_OP_NULL(hardware_unsetup);
+KVM_X86_OP_NULL(cpu_has_accelerated_tpr);
+KVM_X86_OP(has_emulated_msr);
+KVM_X86_OP(vcpu_after_set_cpuid);
+KVM_X86_OP(vm_init);
+KVM_X86_OP_NULL(vm_destroy);
+KVM_X86_OP(vcpu_create);
+KVM_X86_OP(vcpu_free);
+KVM_X86_OP(vcpu_reset);
+KVM_X86_OP(prepare_guest_switch);
+KVM_X86_OP(vcpu_load);
+KVM_X86_OP(vcpu_put);
+KVM_X86_OP(update_exception_bitmap);
+KVM_X86_OP(get_msr);
+KVM_X86_OP(set_msr);
+KVM_X86_OP(get_segment_base);
+KVM_X86_OP(get_segment);
+KVM_X86_OP(get_cpl);
+KVM_X86_OP(set_segment);
+KVM_X86_OP_NULL(get_cs_db_l_bits);
+KVM_X86_OP(set_cr0);
+KVM_X86_OP(is_valid_cr4);
+KVM_X86_OP(set_cr4);
+KVM_X86_OP(set_efer);
+KVM_X86_OP(get_idt);
+KVM_X86_OP(set_idt);
+KVM_X86_OP(get_gdt);
+KVM_X86_OP(set_gdt);
+KVM_X86_OP(sync_dirty_debug_regs);
+KVM_X86_OP(set_dr7);
+KVM_X86_OP(cache_reg);
+KVM_X86_OP(get_rflags);
+KVM_X86_OP(set_rflags);
+KVM_X86_OP(tlb_flush_all);
+KVM_X86_OP(tlb_flush_current);
+KVM_X86_OP_NULL(tlb_remote_flush);
+KVM_X86_OP_NULL(tlb_remote_flush_with_range);
+KVM_X86_OP(tlb_flush_gva);
+KVM_X86_OP(tlb_flush_guest);
+KVM_X86_OP(run);
+KVM_X86_OP_NULL(handle_exit);
+KVM_X86_OP_NULL(skip_emulated_instruction);
+KVM_X86_OP_NULL(update_emulated_instruction);
+KVM_X86_OP(set_interrupt_shadow);
+KVM_X86_OP(get_interrupt_shadow);
+KVM_X86_OP(patch_hypercall);
+KVM_X86_OP(set_irq);
+KVM_X86_OP(set_nmi);
+KVM_X86_OP(queue_exception);
+KVM_X86_OP(cancel_injection);
+KVM_X86_OP(interrupt_allowed);
+KVM_X86_OP(nmi_allowed);
+KVM_X86_OP(get_nmi_mask);
+KVM_X86_OP(set_nmi_mask);
+KVM_X86_OP(enable_nmi_window);
+KVM_X86_OP(enable_irq_window);
+KVM_X86_OP(update_cr8_intercept);
+KVM_X86_OP(check_apicv_inhibit_reasons);
+KVM_X86_OP(refresh_apicv_exec_ctrl);
+KVM_X86_OP(hwapic_irr_update);
+KVM_X86_OP(hwapic_isr_update);
+KVM_X86_OP_NULL(guest_apic_has_interrupt);
+KVM_X86_OP(load_eoi_exitmap);
+KVM_X86_OP(set_virtual_apic_mode);
+KVM_X86_OP_NULL(set_apic_access_page_addr);
+KVM_X86_OP(deliver_posted_interrupt);
+KVM_X86_OP_NULL(sync_pir_to_irr);
+KVM_X86_OP(set_tss_addr);
+KVM_X86_OP(set_identity_map_addr);
+KVM_X86_OP(get_mt_mask);
+KVM_X86_OP(load_mmu_pgd);
+KVM_X86_OP_NULL(has_wbinvd_exit);
+KVM_X86_OP(get_l2_tsc_offset);
+KVM_X86_OP(get_l2_tsc_multiplier);
+KVM_X86_OP(write_tsc_offset);
+KVM_X86_OP(write_tsc_multiplier);
+KVM_X86_OP(get_exit_info);
+KVM_X86_OP(check_intercept);
+KVM_X86_OP(handle_exit_irqoff);
+KVM_X86_OP_NULL(request_immediate_exit);
+KVM_X86_OP(sched_in);
+KVM_X86_OP_NULL(update_cpu_dirty_logging);
+KVM_X86_OP_NULL(pre_block);
+KVM_X86_OP_NULL(post_block);
+KVM_X86_OP_NULL(vcpu_blocking);
+KVM_X86_OP_NULL(vcpu_unblocking);
+KVM_X86_OP_NULL(update_pi_irte);
+KVM_X86_OP_NULL(start_assignment);
+KVM_X86_OP_NULL(apicv_post_state_restore);
+KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt);
+KVM_X86_OP_NULL(set_hv_timer);
+KVM_X86_OP_NULL(cancel_hv_timer);
+KVM_X86_OP(setup_mce);
+KVM_X86_OP(smi_allowed);
+KVM_X86_OP(enter_smm);
+KVM_X86_OP(leave_smm);
+KVM_X86_OP(enable_smi_window);
+KVM_X86_OP_NULL(mem_enc_op);
+KVM_X86_OP_NULL(mem_enc_reg_region);
+KVM_X86_OP_NULL(mem_enc_unreg_region);
+KVM_X86_OP(get_msr_feature);
+KVM_X86_OP(can_emulate_instruction);
+KVM_X86_OP(apic_init_signal_blocked);
+KVM_X86_OP_NULL(enable_direct_tlbflush);
+KVM_X86_OP_NULL(migrate_timers);
+KVM_X86_OP(msr_filter_changed);
+KVM_X86_OP_NULL(complete_emulated_msr);
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_NULL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88fce6ab4bbd..c2a4a362f3e2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1532,14 +1532,14 @@ extern bool __read_mostly enable_apicv;
 extern struct kvm_x86_ops kvm_x86_ops;
 
 #define KVM_X86_OP(func) \
-	DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
+	DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func))
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 
 static inline void kvm_ops_static_call_update(void)
 {
 #define KVM_X86_OP(func) \
-	static_call_update(kvm_x86_##func, kvm_x86_ops.func);
+	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..775051070627 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -125,7 +125,7 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
 
 #define KVM_X86_OP(func)					     \
 	DEFINE_STATIC_CALL_NULL(kvm_x86_##func,			     \
-				*(((struct kvm_x86_ops *)0)->func));
+				*(((struct kvm_x86_ops *)0)->func))
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 EXPORT_STATIC_CALL_GPL(kvm_x86_get_cs_db_l_bits);
-- 
2.33.0



* [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
  2021-11-08 11:10 ` [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

kvm_ops_static_call_update() is currently defined in kvm_host.h. That's
completely unnecessary since it should have exactly one caller,
kvm_arch_hardware_setup().  As a prep step, move
kvm_ops_static_call_update() to x86.c so that it can later reference
the kvm_pmu_ops stuff.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm_host.h | 8 --------
 arch/x86/kvm/x86.c              | 8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c2a4a362f3e2..c2d4ee2973c5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1536,14 +1536,6 @@ extern struct kvm_x86_ops kvm_x86_ops;
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 
-static inline void kvm_ops_static_call_update(void)
-{
-#define KVM_X86_OP(func) \
-	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
-#define KVM_X86_OP_NULL KVM_X86_OP
-#include <asm/kvm-x86-ops.h>
-}
-
 #define __KVM_HAVE_ARCH_VM_ALLOC
 static inline struct kvm *kvm_arch_alloc_vm(void)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 775051070627..0aee0a078d6f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11300,6 +11300,14 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
+static inline void kvm_ops_static_call_update(void)
+{
+#define KVM_X86_OP(func) \
+	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
+#define KVM_X86_OP_NULL KVM_X86_OP
+#include <asm/kvm-x86-ops.h>
+}
+
 int kvm_arch_hardware_setup(void *opaque)
 {
 	struct kvm_x86_init_ops *ops = opaque;
-- 
2.33.0



* [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (2 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 18:10   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Replace the kvm_pmu_ops pointer in common x86 with an instance of the
struct to save one pointer dereference when invoking functions. Copy the
struct by value to set the ops during kvm_init().
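
In other words, each call site drops one pointer chase; a before/after sketch
of one of the hunks below:

  /* before: load kvm_x86_ops.pmu_ops, then load ->refresh, then call */
  kvm_x86_ops.pmu_ops->refresh(vcpu);

  /* after: the ops live by value in common x86, so only the function
   * pointer itself is loaded before the call
   */
  kvm_pmu_ops.refresh(vcpu);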

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c | 41 ++++++++++++++++++++++-------------------
 arch/x86/kvm/pmu.h |  4 +++-
 arch/x86/kvm/x86.c |  1 +
 3 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index aa6ac9c7aff2..353989bf0102 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -47,6 +47,9 @@
  *        * AMD:   [0 .. AMD64_NUM_COUNTERS-1] <=> gp counters
  */
 
+struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
+EXPORT_SYMBOL_GPL(kvm_pmu_ops);
+
 static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 {
 	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
@@ -214,7 +217,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |
 			  HSW_IN_TX_CHECKPOINTED))) {
-		config = kvm_x86_ops.pmu_ops->find_arch_event(pmc_to_pmu(pmc),
+		config = kvm_pmu_ops.find_arch_event(pmc_to_pmu(pmc),
 						      event_select,
 						      unit_mask);
 		if (config != PERF_COUNT_HW_MAX)
@@ -268,7 +271,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 
 	pmc->current_config = (u64)ctrl;
 	pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
-			      kvm_x86_ops.pmu_ops->find_fixed_event(idx),
+			      kvm_pmu_ops.find_fixed_event(idx),
 			      !(en_field & 0x2), /* exclude user */
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
@@ -277,7 +280,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
 
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 {
-	struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, pmc_idx);
+	struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
 
 	if (!pmc)
 		return;
@@ -299,7 +302,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	int bit;
 
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
-		struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+		struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
 
 		if (unlikely(!pmc || !pmc->perf_event)) {
 			clear_bit(bit, pmu->reprogram_pmi);
@@ -321,7 +324,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 /* check if idx is a valid index to access PMU */
 int kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
-	return kvm_x86_ops.pmu_ops->is_valid_rdpmc_ecx(vcpu, idx);
+	return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
 }
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -371,7 +374,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	if (is_vmware_backdoor_pmc(idx))
 		return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
 
-	pmc = kvm_x86_ops.pmu_ops->rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+	pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
 	if (!pmc)
 		return 1;
 
@@ -387,23 +390,23 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
 	if (lapic_in_kernel(vcpu)) {
-		if (kvm_x86_ops.pmu_ops->deliver_pmi)
-			kvm_x86_ops.pmu_ops->deliver_pmi(vcpu);
+		if (kvm_pmu_ops.deliver_pmi)
+			kvm_pmu_ops.deliver_pmi(vcpu);
 		kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
 	}
 }
 
 bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
-	return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
-		kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
+	return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
+		kvm_pmu_ops.is_valid_msr(vcpu, msr);
 }
 EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr);
+	struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
 
 	if (pmc)
 		__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -411,13 +414,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	return kvm_x86_ops.pmu_ops->get_msr(vcpu, msr_info);
+	return kvm_pmu_ops.get_msr(vcpu, msr_info);
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
-	return kvm_x86_ops.pmu_ops->set_msr(vcpu, msr_info);
+	return kvm_pmu_ops.set_msr(vcpu, msr_info);
 }
 
 /* refresh PMU settings. This function generally is called when underlying
@@ -426,7 +429,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  */
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
-	kvm_x86_ops.pmu_ops->refresh(vcpu);
+	kvm_pmu_ops.refresh(vcpu);
 }
 
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -434,7 +437,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	irq_work_sync(&pmu->irq_work);
-	kvm_x86_ops.pmu_ops->reset(vcpu);
+	kvm_pmu_ops.reset(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -442,7 +445,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	memset(pmu, 0, sizeof(*pmu));
-	kvm_x86_ops.pmu_ops->init(vcpu);
+	kvm_pmu_ops.init(vcpu);
 	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
 	pmu->event_count = 0;
 	pmu->need_cleanup = false;
@@ -474,14 +477,14 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 		      pmu->pmc_in_use, X86_PMC_IDX_MAX);
 
 	for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
-		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, i);
+		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
 
 		if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
 			pmc_stop_counter(pmc);
 	}
 
-	if (kvm_x86_ops.pmu_ops->cleanup)
-		kvm_x86_ops.pmu_ops->cleanup(vcpu);
+	if (kvm_pmu_ops.cleanup)
+		kvm_pmu_ops.cleanup(vcpu);
 
 	bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
 }
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0e4f2b1fa9fb..b2fe135d395a 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -17,6 +17,8 @@
 
 #define MAX_FIXED_COUNTERS	3
 
+extern struct kvm_pmu_ops kvm_pmu_ops;
+
 struct kvm_event_hw_type_mapping {
 	u8 eventsel;
 	u8 unit_mask;
@@ -92,7 +94,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
 
 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-	return kvm_x86_ops.pmu_ops->pmc_is_enabled(pmc);
+	return kvm_pmu_ops.pmc_is_enabled(pmc);
 }
 
 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0aee0a078d6f..ca9a76abb6ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11323,6 +11323,7 @@ int kvm_arch_hardware_setup(void *opaque)
 		return r;
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
+	memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
 	kvm_ops_static_call_update();
 
 	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-- 
2.33.0



* [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (3 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 18:17   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
  2021-11-08 11:10 ` [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

The pmu_ops should be moved to kvm_x86_init_ops and tagged as __initdata.
That'll save those precious few bytes, and more importantly make
the original ops unreachable, i.e. make it harder to sneak in post-init
modification bugs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm_host.h | 4 ++--
 arch/x86/kvm/svm/pmu.c          | 2 +-
 arch/x86/kvm/svm/svm.c          | 2 +-
 arch/x86/kvm/vmx/pmu_intel.c    | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 arch/x86/kvm/x86.c              | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c2d4ee2973c5..00760a3ac88c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1436,8 +1436,7 @@ struct kvm_x86_ops {
 	int cpu_dirty_log_size;
 	void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
 
-	/* pmu operations of sub-arch */
-	const struct kvm_pmu_ops *pmu_ops;
+	/* nested operations of sub-arch */
 	const struct kvm_x86_nested_ops *nested_ops;
 
 	/*
@@ -1516,6 +1515,7 @@ struct kvm_x86_init_ops {
 	int (*hardware_setup)(void);
 
 	struct kvm_x86_ops *runtime_ops;
+	struct kvm_pmu_ops *pmu_ops;
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index fdf587f19c5f..4554cbc3820c 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -319,7 +319,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 	}
 }
 
-struct kvm_pmu_ops amd_pmu_ops = {
+struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.find_arch_event = amd_find_arch_event,
 	.find_fixed_event = amd_find_fixed_event,
 	.pmc_is_enabled = amd_pmc_is_enabled,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 21bb81710e0f..8834d7d2b991 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4681,7 +4681,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 
 	.sched_in = svm_sched_in,
 
-	.pmu_ops = &amd_pmu_ops,
 	.nested_ops = &svm_nested_ops,
 
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
@@ -4717,6 +4716,7 @@ static struct kvm_x86_init_ops svm_init_ops __initdata = {
 	.check_processor_compatibility = svm_check_processor_compat,
 
 	.runtime_ops = &svm_x86_ops,
+	.pmu_ops = &amd_pmu_ops,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b8e0d21b7c8a..c0b905d032c8 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -703,7 +703,7 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
 		intel_pmu_release_guest_lbr_event(vcpu);
 }
 
-struct kvm_pmu_ops intel_pmu_ops = {
+struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.find_arch_event = intel_find_arch_event,
 	.find_fixed_event = intel_find_fixed_event,
 	.pmc_is_enabled = intel_pmc_is_enabled,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 71f54d85f104..ce787d2e8e08 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7680,7 +7680,6 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.pre_block = vmx_pre_block,
 	.post_block = vmx_post_block,
 
-	.pmu_ops = &intel_pmu_ops,
 	.nested_ops = &vmx_nested_ops,
 
 	.update_pi_irte = pi_update_irte,
@@ -7922,6 +7921,7 @@ static struct kvm_x86_init_ops vmx_init_ops __initdata = {
 	.hardware_setup = hardware_setup,
 
 	.runtime_ops = &vmx_x86_ops,
+	.pmu_ops = &intel_pmu_ops,
 };
 
 static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca9a76abb6ba..70dc8f41329c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11323,7 +11323,7 @@ int kvm_arch_hardware_setup(void *opaque)
 		return r;
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
-	memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
+	memcpy(&kvm_pmu_ops, ops->pmu_ops, sizeof(kvm_pmu_ops));
 	kvm_ops_static_call_update();
 
 	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-- 
2.33.0



* [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (4 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
@ 2021-11-08 11:10 ` Like Xu
  2021-12-08 18:35   ` Sean Christopherson
  2021-11-08 11:10 ` [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
  6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Use static calls to improve kvm_pmu_ops performance. Introduce the
definitions that will be used by a subsequent patch to actualize the
savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
definition of static calls. This header is also intended to be
used to simplify the definition of amd_pmu_ops and intel_pmu_ops.

Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
static calls in a similar manner for insignificant but not
negligible performance impact, especially on older models.
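
As a sketch of that second, future use (not part of this patch): the same
header could in principle generate a vendor ops struct, assuming every
callback followed the intel_<func>/amd_<func> naming convention (today a
few do not, so only the static_call plumbing is added here):

  struct kvm_pmu_ops intel_pmu_ops __initdata = {
  #define KVM_X86_PMU_OP(func)      .func = intel_##func,
  #define KVM_X86_PMU_OP_NULL(func)
  #include <asm/kvm-x86-pmu-ops.h>
  };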

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/include/asm/kvm-x86-pmu-ops.h | 32 ++++++++++++++++++++++++++
 arch/x86/kvm/pmu.c                     |  6 +++++
 arch/x86/kvm/pmu.h                     |  5 ++++
 3 files changed, 43 insertions(+)
 create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
new file mode 100644
index 000000000000..b7713b16d21d
--- /dev/null
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
+BUILD_BUG_ON(1)
+#endif
+
+/*
+ * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to
+ * help generate "static_call()"s. They are also intended for use when defining
+ * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used
+ * for those functions that follow the [amd|intel]_func_name convention.
+ * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the
+ * case where there is no definition or a function name that
+ * doesn't match the typical naming convention is supplied.
+ */
+KVM_X86_PMU_OP(find_arch_event);
+KVM_X86_PMU_OP(find_fixed_event);
+KVM_X86_PMU_OP(pmc_is_enabled);
+KVM_X86_PMU_OP(pmc_idx_to_pmc);
+KVM_X86_PMU_OP(rdpmc_ecx_to_pmc);
+KVM_X86_PMU_OP(msr_idx_to_pmc);
+KVM_X86_PMU_OP(is_valid_rdpmc_ecx);
+KVM_X86_PMU_OP(is_valid_msr);
+KVM_X86_PMU_OP(get_msr);
+KVM_X86_PMU_OP(set_msr);
+KVM_X86_PMU_OP(refresh);
+KVM_X86_PMU_OP(init);
+KVM_X86_PMU_OP(reset);
+KVM_X86_PMU_OP_NULL(deliver_pmi);
+KVM_X86_PMU_OP_NULL(cleanup);
+
+#undef KVM_X86_PMU_OP
+#undef KVM_X86_PMU_OP_NULL
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 353989bf0102..bfdd9f2bc0fa 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -50,6 +50,12 @@
 struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_pmu_ops);
 
+#define	KVM_X86_PMU_OP(func)	\
+	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func,	\
+				*(((struct kvm_pmu_ops *)0)->func))
+#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
 static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 {
 	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index b2fe135d395a..40e0b523637b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -45,6 +45,11 @@ struct kvm_pmu_ops {
 	void (*cleanup)(struct kvm_vcpu *vcpu);
 };
 
+#define	KVM_X86_PMU_OP(func)	\
+	DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
+#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-- 
2.33.0



* [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
  2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
                   ` (5 preceding siblings ...)
  2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
@ 2021-11-08 11:10 ` Like Xu
  6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

From: Like Xu <likexu@tencent.com>

Convert kvm_pmu_ops to use static calls.

Here are the worst-case sched_clock() nanosecond numbers for the most
frequently called kvm_pmu_ops functions (call counts up to seven digits)
when running a single perf test case in a guest on an ICX 2.70GHz host
(mitigations=on):

		|	legacy	|	static call
------------------------------------------------------------
.pmc_idx_to_pmc	|	10946	|	10047 (8%)
.pmc_is_enabled	|	11291	|	11175 (1%)
.msr_idx_to_pmc	|	13526	|	12346 (8%)
.is_valid_msr	|	10895	|	10484 (3%)
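
Two call flavors show up in the conversion below: static_call() for hooks
every vendor implements, and static_call_cond() for optional hooks, which
effectively becomes a NOP when the callback is NULL and thus replaces the
old "if (kvm_pmu_ops.xxx)" checks, e.g.:

  static_call(kvm_x86_pmu_refresh)(vcpu);       /* mandatory hook */
  static_call_cond(kvm_x86_pmu_cleanup)(vcpu);  /* optional, skipped if NULL */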

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c | 36 +++++++++++++++++-------------------
 arch/x86/kvm/pmu.h |  2 +-
 arch/x86/kvm/x86.c |  5 +++++
 3 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index bfdd9f2bc0fa..c86ff3057e2c 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -223,7 +223,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |
 			  HSW_IN_TX_CHECKPOINTED))) {
-		config = kvm_pmu_ops.find_arch_event(pmc_to_pmu(pmc),
+		config = static_call(kvm_x86_pmu_find_arch_event)(pmc_to_pmu(pmc),
 						      event_select,
 						      unit_mask);
 		if (config != PERF_COUNT_HW_MAX)
@@ -277,7 +277,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 
 	pmc->current_config = (u64)ctrl;
 	pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
-			      kvm_pmu_ops.find_fixed_event(idx),
+			      static_call(kvm_x86_pmu_find_fixed_event)(idx),
 			      !(en_field & 0x2), /* exclude user */
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
@@ -286,7 +286,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
 
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 {
-	struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
+	struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, pmc_idx);
 
 	if (!pmc)
 		return;
@@ -308,7 +308,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	int bit;
 
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
-		struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
+		struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, bit);
 
 		if (unlikely(!pmc || !pmc->perf_event)) {
 			clear_bit(bit, pmu->reprogram_pmi);
@@ -330,7 +330,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 /* check if idx is a valid index to access PMU */
 int kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
-	return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
+	return static_call(kvm_x86_pmu_is_valid_rdpmc_ecx)(vcpu, idx);
 }
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -380,7 +380,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	if (is_vmware_backdoor_pmc(idx))
 		return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
 
-	pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+	pmc = static_call(kvm_x86_pmu_rdpmc_ecx_to_pmc)(vcpu, idx, &mask);
 	if (!pmc)
 		return 1;
 
@@ -396,23 +396,22 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
 	if (lapic_in_kernel(vcpu)) {
-		if (kvm_pmu_ops.deliver_pmi)
-			kvm_pmu_ops.deliver_pmi(vcpu);
+		static_call_cond(kvm_x86_pmu_deliver_pmi)(vcpu);
 		kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
 	}
 }
 
 bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
-	return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
-		kvm_pmu_ops.is_valid_msr(vcpu, msr);
+	return static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr) ||
+		static_call(kvm_x86_pmu_is_valid_msr)(vcpu, msr);
 }
 EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
 
 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
+	struct kvm_pmc *pmc = static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr);
 
 	if (pmc)
 		__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -420,13 +419,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
-	return kvm_pmu_ops.get_msr(vcpu, msr_info);
+	return static_call(kvm_x86_pmu_get_msr)(vcpu, msr_info);
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
-	return kvm_pmu_ops.set_msr(vcpu, msr_info);
+	return static_call(kvm_x86_pmu_set_msr)(vcpu, msr_info);
 }
 
 /* refresh PMU settings. This function generally is called when underlying
@@ -435,7 +434,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  */
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
-	kvm_pmu_ops.refresh(vcpu);
+	static_call(kvm_x86_pmu_refresh)(vcpu);
 }
 
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -443,7 +442,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	irq_work_sync(&pmu->irq_work);
-	kvm_pmu_ops.reset(vcpu);
+	static_call(kvm_x86_pmu_reset)(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -451,7 +450,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	memset(pmu, 0, sizeof(*pmu));
-	kvm_pmu_ops.init(vcpu);
+	static_call(kvm_x86_pmu_init)(vcpu);
 	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
 	pmu->event_count = 0;
 	pmu->need_cleanup = false;
@@ -483,14 +482,13 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 		      pmu->pmc_in_use, X86_PMC_IDX_MAX);
 
 	for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
-		pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
+		pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);
 
 		if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
 			pmc_stop_counter(pmc);
 	}
 
-	if (kvm_pmu_ops.cleanup)
-		kvm_pmu_ops.cleanup(vcpu);
+	static_call_cond(kvm_x86_pmu_cleanup)(vcpu);
 
 	bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
 }
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 40e0b523637b..a4bfd4200d67 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
 
 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-	return kvm_pmu_ops.pmc_is_enabled(pmc);
+	return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
 }
 
 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 70dc8f41329c..c5db444d5f4a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11306,6 +11306,11 @@ static inline void kvm_ops_static_call_update(void)
 	static_call_update(kvm_x86_##func, kvm_x86_ops.func)
 #define KVM_X86_OP_NULL KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
+
+#define	KVM_X86_PMU_OP(func)	\
+	static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func)
+#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
 }
 
 int kvm_arch_hardware_setup(void *opaque)
-- 
2.33.0



* Re: [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
  2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
@ 2021-12-08 17:55   ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 17:55 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

This is doing more than exporting a function; the export isn't even the focal
point of the patch.

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Let's export kvm_pmu_is_valid_msr() for nVMX,  instead of

Please wrap at ~75 chars.

> exporting all kvm_pmu_ops for this one case.

kvm_pmu_ops doesn't exist as of this patch, it comes later in the series.

> The reduced access scope will help to optimize the kvm_x86_ops.pmu_ops stuff
> later.

The changelog needs to explain why it's ok to add the msr_idx_to_pmc() check.

Something like:

  KVM: nVMX: Use kvm_pmu_is_valid_msr() to check for PERF_GLOBAL_CTRL support

  Use the generic kvm_pmu_is_valid_msr() helper when determining whether or not
  PERF_GLOBAL_CTRL is exposed to the guest and thus can be loaded on nested
  VM-Enter/VM-Exit.  The extra (indirect!) call to msr_idx_to_pmc() that comes
  with the helper is unnecessary, but harmless, as it's guaranteed to return
  false for MSR_CORE_PERF_GLOBAL_CTRL and this is a already a very slow path.

  Using the helper will allow future code to use static_call() for the PMU ops
  without having to export any static_call definitions.

  Export kvm_pmu_is_valid_msr() as necessary.

All that said, looking at this whole thing again, I think I'd prefer:


From d217914c9897d9a2cfd01073284a933ace4709b7 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 8 Dec 2021 09:48:20 -0800
Subject: [PATCH] KVM: nVMX: Refactor PMU refresh to avoid referencing
 kvm_x86_ops.pmu_ops

Refactor the nested VMX PMU refresh helper to pass it a flag stating
whether or not the vCPU has PERF_GLOBAL_CTRL instead of having the nVMX
helper query the information by bouncing through kvm_x86_ops.pmu_ops.
This will allow a future patch to use static_call() for the PMU ops
without having to export any static call definitions from common x86.

Alternatively, nVMX could call kvm_pmu_is_valid_msr() to indirectly use
kvm_x86_ops.pmu_ops, but that would require exporting kvm_pmu_is_valid_msr()
and incur an extra layer of indirection.

Opportunistically rename the helper to keep line lengths somewhat
reasonable, and to better capture its high-level role.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c    | 5 +++--
 arch/x86/kvm/vmx/nested.h    | 3 ++-
 arch/x86/kvm/vmx/pmu_intel.c | 3 ++-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 08e785871985..c87a81071288 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4773,7 +4773,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 	return 0;
 }
 
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+			    bool vcpu_has_perf_global_ctrl)
 {
 	struct vcpu_vmx *vmx;
 
@@ -4781,7 +4782,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 		return;
 
 	vmx = to_vmx(vcpu);
-	if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+	if (vcpu_has_perf_global_ctrl) {
 		vmx->nested.msrs.entry_ctls_high |=
 				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 		vmx->nested.msrs.exit_ctls_high |=
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index b69a80f43b37..c92cea0b8ccc 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -32,7 +32,8 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
 int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 			u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+			    bool vcpu_has_perf_global_ctrl);
 void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
 bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
 				 int size);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 1b7456b2177b..7ca870d0249d 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -529,7 +529,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
-	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
+	nested_vmx_pmu_refresh(vcpu,
+			       intel_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL));
 
 	if (intel_pmu_lbr_is_compatible(vcpu))
 		x86_perf_get_lbr(&lbr_desc->records);
-- 
2.34.1.400.ga245620fadb-goog


* Re: [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
  2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
@ 2021-12-08 18:10   ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:10 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Replace the kvm_pmu_ops pointer in common x86 with an instance of the
> struct to save one pointer dereference when invoking functions. Copy the
> struct by value to set the ops during kvm_init().
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/pmu.c | 41 ++++++++++++++++++++++-------------------
>  arch/x86/kvm/pmu.h |  4 +++-
>  arch/x86/kvm/x86.c |  1 +
>  3 files changed, 26 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index aa6ac9c7aff2..353989bf0102 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -47,6 +47,9 @@
>   *        * AMD:   [0 .. AMD64_NUM_COUNTERS-1] <=> gp counters
>   */
>  
> +struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
> +EXPORT_SYMBOL_GPL(kvm_pmu_ops);

This export isn't necessary.


* Re: [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
  2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
@ 2021-12-08 18:17   ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:17 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

s/tagged/tag

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> The pmu_ops should be moved to kvm_x86_init_ops and tagged as __initdata.

State what the patch does, not what "should" be done.

  Now that pmu_ops is copied by value during kvm_arch_hardware_setup(), move
  the pointer to kvm_x86_init_ops and tag implementations as __initdata to make
  the implementations unreachable once KVM is loaded, e.g. to make it harder to
  sneak in post-init modification bugs.

> That'll save those precious few bytes, and more importantly make
> the original ops unreachable, i.e. make it harder to sneak in post-init
> modification bugs.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 4 ++--
>  arch/x86/kvm/svm/pmu.c          | 2 +-
>  arch/x86/kvm/svm/svm.c          | 2 +-
>  arch/x86/kvm/vmx/pmu_intel.c    | 2 +-
>  arch/x86/kvm/vmx/vmx.c          | 2 +-
>  arch/x86/kvm/x86.c              | 2 +-
>  6 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c2d4ee2973c5..00760a3ac88c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1436,8 +1436,7 @@ struct kvm_x86_ops {
>  	int cpu_dirty_log_size;
>  	void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
>  
> -	/* pmu operations of sub-arch */
> -	const struct kvm_pmu_ops *pmu_ops;
> +	/* nested operations of sub-arch */

No need for the new comment.

>  	const struct kvm_x86_nested_ops *nested_ops;
>  
>  	/*

Nits aside,

Reviewed-by: Sean Christopherson <seanjc@google.com>

I'd also be a-ok squashing this with the copy-by-value patch.


* Re: [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
  2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
@ 2021-12-08 18:35   ` Sean Christopherson
  2021-12-09 12:50     ` Like Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:35 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Use static calls to improve kvm_pmu_ops performance. Introduce the
> definitions that will be used by a subsequent patch to actualize the
> savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
> definition of static calls. This header is also intended to be
> used to simplify the definition of amd_pmu_ops and intel_pmu_ops.
> 
> Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
> static calls in a similar manner for insignificant but not
> negligible performance impact, especially on older models.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---

This absolutely shouldn't be separated from patch 7/7.  By _defining_ the static
calls but not providing the logic to actually _update_ the calls, it's entirely
possible to add static_call() invocations that will compile cleanly without any
chance of doing the right thing at runtime.  

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0236c1a953d0..804f98b5552e 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)

 static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
 {
-       return kvm_pmu_ops.pmc_is_enabled(pmc);
+       return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
 }

 static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,

> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
> +BUILD_BUG_ON(1)
> +#endif
> +
> +/*
> + * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to

Please use all 80 chars.

> + * help generate "static_call()"s. They are also intended for use when defining
> + * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used

AMD/Intel since this is referring to the vendor and not to function names (like
the below reference).

> + * for those functions that follow the [amd|intel]_func_name convention.
> + * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the

As below, please drop the _NULL() variant.

> + * case where there is no definition or a function name that
> + * doesn't match the typical naming convention is supplied.
> + */

...

> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 353989bf0102..bfdd9f2bc0fa 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -50,6 +50,12 @@
>  struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
>  EXPORT_SYMBOL_GPL(kvm_pmu_ops);
>  
> +#define	KVM_X86_PMU_OP(func)	\
> +	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func,	\
> +				*(((struct kvm_pmu_ops *)0)->func))
> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
> +#include <asm/kvm-x86-pmu-ops.h>
> +
>  static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
>  {
>  	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index b2fe135d395a..40e0b523637b 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -45,6 +45,11 @@ struct kvm_pmu_ops {
>  	void (*cleanup)(struct kvm_vcpu *vcpu);
>  };
>  
> +#define	KVM_X86_PMU_OP(func)	\
> +	DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP

I don't want to proliferate the pointless and bitrot-prone KVM_X86_OP_NULL macro,
just omit this.  I'll send a patch to drop KVM_X86_OP_NULL.

> +#include <asm/kvm-x86-pmu-ops.h>
> +
>  static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
>  {
>  	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> -- 
> 2.33.0
> 


* Re: [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
  2021-12-08 18:35   ` Sean Christopherson
@ 2021-12-09 12:50     ` Like Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-12-09 12:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
	Joerg Roedel, linux-kernel

Hi Sean,

On 9/12/2021 2:35 am, Sean Christopherson wrote:
> On Mon, Nov 08, 2021, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> Use static calls to improve kvm_pmu_ops performance. Introduce the
>> definitions that will be used by a subsequent patch to actualize the
>> savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
>> definition of static calls. This header is also intended to be
>> used to simplify the definition of amd_pmu_ops and intel_pmu_ops.
>>
>> Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
>> static calls in a similar manner for insignificant but not
>> negligible performance impact, especially on older models.
>>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
> 
> This absolutely shouldn't be separated from patch 7/7.  By _defining_ the static
> calls but not providing the logic to actually _update_ the calls, it's entirely
> possible to add static_call() invocations that will compile cleanly without any
> chance of doing the right thing at runtime.
> 
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 0236c1a953d0..804f98b5552e 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
> 
>   static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
>   {
> -       return kvm_pmu_ops.pmc_is_enabled(pmc);
> +       return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
>   }
> 
>   static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
> 
>> @@ -0,0 +1,32 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
>> +BUILD_BUG_ON(1)
>> +#endif
>> +
>> +/*
>> + * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to
> 
> Please use all 80 chars.
> 
>> + * help generate "static_call()"s. They are also intended for use when defining
>> + * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used
> 
> AMD/Intel since this is referring to the vendor and not to function names (like
> the below reference).
> 
>> + * for those functions that follow the [amd|intel]_func_name convention.
>> + * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the
> 
> As below, please drop the _NULL() variant.
> 
>> + * case where there is no definition or a function name that
>> + * doesn't match the typical naming convention is supplied.
>> + */
> 
> ...
> 
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 353989bf0102..bfdd9f2bc0fa 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -50,6 +50,12 @@
>>   struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
>>   EXPORT_SYMBOL_GPL(kvm_pmu_ops);
>>   
>> +#define	KVM_X86_PMU_OP(func)	\
>> +	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func,	\
>> +				*(((struct kvm_pmu_ops *)0)->func))
>> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
>> +#include <asm/kvm-x86-pmu-ops.h>
>> +
>>   static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
>>   {
>>   	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
>> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
>> index b2fe135d395a..40e0b523637b 100644
>> --- a/arch/x86/kvm/pmu.h
>> +++ b/arch/x86/kvm/pmu.h
>> @@ -45,6 +45,11 @@ struct kvm_pmu_ops {
>>   	void (*cleanup)(struct kvm_vcpu *vcpu);
>>   };
>>   
>> +#define	KVM_X86_PMU_OP(func)	\
>> +	DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
>> +#define	KVM_X86_PMU_OP_NULL	KVM_X86_PMU_OP
> 
> I don't want to proliferate the pointless and bitrot-prone KVM_X86_OP_NULL macro,
> just omit this.  I'll send a patch to drop KVM_X86_OP_NULL.

Thanks for your clear comments on this patch set.

I will send out V3 once KVM_X86_OP_NULL is dropped and the comment
in arch/x86/include/asm/kvm-x86-ops.h is updated.

> 
>> +#include <asm/kvm-x86-pmu-ops.h>
>> +
>>   static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
>>   {
>>   	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>> -- 
>> 2.33.0
>>
> 
