kvm.vger.kernel.org archive mirror
* [PATCH 00/14] KVM monolithic v2
@ 2019-09-28 17:23 Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko Andrea Arcangeli
                   ` (12 more replies)
  0 siblings, 13 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

Hello,

As usual the last 4 patches could be split off, but I did more
measurements in that area and I altered the commit headers. They're
fairly small commits compared to the rest of the series, so I kept
them here considering they're needed to benchmark the earlier part.

The KVM monolithic work is easy to identify by checking for the word
"monolithic" in the subject, so there's no confusion about where that
work stops.

This renames all the affected functions in place, wherever they
happened to live in svm.c or vmx.c. Where a direct call needs a
declaration, the function is now declared extern right before the
kvm_x86_ops structure.
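
A minimal sketch of the pattern, using kvm_x86_tlb_flush_gva (one of
the methods touched later in the series) purely as an illustration;
the exact declarations in the patches may differ slightly:

    /* before: indirect call through the method table */
    kvm_x86_ops->tlb_flush_gva(vcpu, gva);

    /* after: the vmx.c/svm.c implementation keeps its body but takes
     * the kvm_x86_* external name, declared in kvm_host.h right
     * before the kvm_x86_ops structure, and callers invoke it
     * directly (no pointer load, no retpoline):
     */
    extern void kvm_x86_tlb_flush_gva(struct kvm_vcpu *vcpu, gva_t gva);

    kvm_x86_tlb_flush_gva(vcpu, gva);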

Converting those small kvm_x86 functions to inlines requires more
Makefile work and header restructuring, so it's left for later.

The removal of kvm_x86_ops itself is also left for later because it
requires lots of logic changes scattered all over the KVM code. This
patchset tries to do the conversion with as few logic changes as
possible, so the code behaves the same and only the implementation
and the performance improve.

Doing the conversion and the logic changes at the same time would
risk making it impossible to tell through bisection whether a
regression is caused by a bug in the conversion or by one of the
small commits that alter the logic to remove the need for one more
pointer to function.

It's best if each removal of a pointer to function is done in its
own small commit. After all those small commits are applied
incrementally on top of this patchset, the kvm_x86_ops structure can
be deleted.

https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/log/?h=kvm-mono2

Thanks,
Andrea

Andrea Arcangeli (14):
  KVM: monolithic: x86: remove kvm.ko
  KVM: monolithic: x86: disable linking vmx and svm at the same time
    into the kernel
  KVM: monolithic: x86: convert the kvm_x86_ops and kvm_pmu_ops methods
    to external functions
  KVM: monolithic: x86: handle the request_immediate_exit variation
  KVM: monolithic: add more section prefixes in the KVM common code
  KVM: monolithic: x86: remove __exit section prefix from
    machine_unsetup
  KVM: monolithic: x86: remove __init section prefix from
    kvm_x86_cpu_has_kvm_support
  KVM: monolithic: x86: remove exports
  KVM: monolithic: remove exports from KVM common code
  KVM: monolithic: x86: drop the kvm_pmu_ops structure
  KVM: x86: optimize more exit handlers in vmx.c
  KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  KVM: retpolines: x86: eliminate retpoline from svm.c exit handlers
  x86: retpolines: eliminate retpoline from msr event handlers

 arch/x86/events/intel/core.c    |  11 +
 arch/x86/include/asm/kvm_host.h | 205 +++++++-
 arch/x86/kvm/Kconfig            |  24 +-
 arch/x86/kvm/Makefile           |   5 +-
 arch/x86/kvm/cpuid.c            |  27 +-
 arch/x86/kvm/hyperv.c           |   8 +-
 arch/x86/kvm/irq.c              |   4 -
 arch/x86/kvm/irq_comm.c         |   2 -
 arch/x86/kvm/kvm_cache_regs.h   |  10 +-
 arch/x86/kvm/lapic.c            |  46 +-
 arch/x86/kvm/mmu.c              |  50 +-
 arch/x86/kvm/mmu.h              |   4 +-
 arch/x86/kvm/mtrr.c             |   2 -
 arch/x86/kvm/pmu.c              |  27 +-
 arch/x86/kvm/pmu.h              |  37 +-
 arch/x86/kvm/pmu_amd.c          |  43 +-
 arch/x86/kvm/svm.c              | 682 ++++++++++++++++-----------
 arch/x86/kvm/trace.h            |   4 +-
 arch/x86/kvm/vmx/nested.c       |  84 ++--
 arch/x86/kvm/vmx/pmu_intel.c    |  46 +-
 arch/x86/kvm/vmx/vmx.c          | 795 ++++++++++++++++++--------------
 arch/x86/kvm/vmx/vmx.h          |  39 +-
 arch/x86/kvm/x86.c              | 418 +++++++----------
 arch/x86/kvm/x86.h              |   2 +-
 include/linux/kvm_host.h        |   4 +-
 virt/kvm/eventfd.c              |   1 -
 virt/kvm/kvm_main.c             |  71 +--
 27 files changed, 1413 insertions(+), 1238 deletions(-)


* [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-10-15  1:31   ` Sean Christopherson
  2019-09-28 17:23 ` [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel Andrea Arcangeli
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

This is the first commit of a patch series that aims to replace the
modular kvm.ko kernel module with a monolithic kvm-intel/kvm-amd
model. The only possible con of this change is that it wastes some
disk space in /lib/modules/. The pros are that it saves CPU cycles
and a minor amount of RAM, which are scarcer resources than disk
space.

The pointer-to-function virtual template model cannot provide any
runtime benefit because kvm-intel and kvm-amd can't be loaded at the
same time.

This removes kvm.ko and instead links (and thus duplicates) all
former kvm.ko objects into both kvm-amd and kvm-intel.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/Makefile | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 31ecf7a76d5a..68b81f381369 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -12,9 +12,8 @@ kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
 			   i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
 			   hyperv.o page_track.o debugfs.o
 
-kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o vmx/evmcs.o vmx/nested.o
-kvm-amd-y		+= svm.o pmu_amd.o
+kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o vmx/evmcs.o vmx/nested.o $(kvm-y)
+kvm-amd-y		+= svm.o pmu_amd.o $(kvm-y)
 
-obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o


* [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-10-15  3:16   ` Sean Christopherson
  2019-09-28 17:23 ` [PATCH 04/14] KVM: monolithic: x86: handle the request_immediate_exit variation Andrea Arcangeli
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

Linking both vmx and svm into the kernel at the same time isn't
possible anymore, because the kvm_x86/kvm_x86_pmu external function
names would collide.
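
To make the collision concrete (a sketch only, the real functions
carry their full implementations):

    /* arch/x86/kvm/vmx/vmx.c */
    int kvm_x86_hardware_enable(void) { /* VMX implementation */ }

    /* arch/x86/kvm/svm.c */
    int kvm_x86_hardware_enable(void) { /* SVM implementation */ }

With CONFIG_KVM_INTEL=y and CONFIG_KVM_AMD=y the linker would see two
definitions of kvm_x86_hardware_enable() (and of every other kvm_x86_*
and kvm_x86_pmu_* function) and fail with multiple definition errors,
hence the Kconfig choice below.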

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/Kconfig | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 840e12583b85..e1601c54355e 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -59,9 +59,29 @@ config KVM
 
 	  If unsure, say N.
 
+if KVM=y
+
+choice
+	prompt "To link KVM statically into the kernel you need to choose"
+	help
+	  In order to build a kernel with support for both AMD and Intel
+	  CPUs, you need to set CONFIG_KVM=m.
+
+config KVM_AMD_STATIC
+	select KVM_AMD
+	bool "Link KVM AMD statically into the kernel"
+
+config KVM_INTEL_STATIC
+	select KVM_INTEL
+	bool "Link KVM Intel statically into the kernel"
+
+endchoice
+
+endif
+
 config KVM_INTEL
 	tristate "KVM for Intel processors support"
-	depends on KVM
+	depends on (KVM && !KVM_AMD_STATIC) || KVM_INTEL_STATIC
 	# for perf_guest_get_msrs():
 	depends on CPU_SUP_INTEL
 	---help---
@@ -73,7 +93,7 @@ config KVM_INTEL
 
 config KVM_AMD
 	tristate "KVM for AMD processors support"
-	depends on KVM
+	depends on (KVM && !KVM_INTEL_STATIC) || KVM_AMD_STATIC
 	---help---
 	  Provides support for KVM on AMD processors equipped with the AMD-V
 	  (SVM) extensions.


* [PATCH 04/14] KVM: monolithic: x86: handle the request_immediate_exit variation
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 05/14] KVM: monolithic: add more section prefixes in the KVM common code Andrea Arcangeli
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

request_immediate_exit is one of the few cases where the method's
pointer to function isn't fixed at build time; it requires special
handling because hardware_setup() may override it at runtime.
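
For reference, the runtime override with the old method table looked
roughly like the following (a sketch, not the literal upstream code):

    /* vmx.c hardware_setup(), before this series: if the VMX
     * preemption timer can't be used, the method pointer itself was
     * re-pointed at runtime to the generic fallback.
     */
    if (!enable_preemption_timer)
        kvm_x86_ops->request_immediate_exit = __kvm_request_immediate_exit;

With the monolithic model there's no pointer left to override, so the
equivalent decision moves into kvm_x86_request_immediate_exit()
itself, as the hunk below does.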

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bb122ab4b96c..aad44e62b20a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7022,7 +7022,10 @@ void kvm_x86_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
 
 void kvm_x86_request_immediate_exit(struct kvm_vcpu *vcpu)
 {
-	to_vmx(vcpu)->req_immediate_exit = true;
+	if (likely(enable_preemption_timer))
+		to_vmx(vcpu)->req_immediate_exit = true;
+	else
+		__kvm_request_immediate_exit(vcpu);
 }
 
 int kvm_x86_check_intercept(struct kvm_vcpu *vcpu,


* [PATCH 05/14] KVM: monolithic: add more section prefixes in the KVM common code
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (2 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 04/14] KVM: monolithic: x86: handle the request_immediate_exit variation Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 06/14] KVM: monolithic: x86: remove __exit section prefix from machine_unsetup Andrea Arcangeli
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

Add more section prefixes to some KVM common code functions: with
the monolithic KVM model the section checker can now do a more
accurate static analysis at build time, and this allows building
without CONFIG_SECTION_MISMATCH_WARN_ONLY=n.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/x86.c       | 4 ++--
 include/linux/kvm_host.h | 4 ++--
 virt/kvm/kvm_main.c      | 6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8e593d28ff95..601190de4f87 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9206,7 +9206,7 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
-int kvm_arch_hardware_setup(void)
+__init int kvm_arch_hardware_setup(void)
 {
 	int r;
 
@@ -9237,7 +9237,7 @@ void kvm_arch_hardware_unsetup(void)
 	kvm_x86_hardware_unsetup();
 }
 
-int kvm_arch_check_processor_compat(void)
+__init int kvm_arch_check_processor_compat(void)
 {
 	return kvm_x86_check_processor_compatibility();
 }
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index fcb46b3374c6..8621916998e1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -867,9 +867,9 @@ void kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu);
 
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
-int kvm_arch_hardware_setup(void);
+__init int kvm_arch_hardware_setup(void);
 void kvm_arch_hardware_unsetup(void);
-int kvm_arch_check_processor_compat(void);
+__init int kvm_arch_check_processor_compat(void);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e6de3159e682..9aa448ea688f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4235,13 +4235,13 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 	kvm_arch_vcpu_put(vcpu);
 }
 
-static void check_processor_compat(void *rtn)
+static __init void check_processor_compat(void *rtn)
 {
 	*(int *)rtn = kvm_arch_check_processor_compat();
 }
 
-int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
-		  struct module *module)
+__init int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
+		    struct module *module)
 {
 	int r;
 	int cpu;


* [PATCH 06/14] KVM: monolithic: x86: remove __exit section prefix from machine_unsetup
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (3 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 05/14] KVM: monolithic: add more section prefixes in the KVM common code Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 07/14] KVM: monolithic: x86: remove __init section prefix from kvm_x86_cpu_has_kvm_support Andrea Arcangeli
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

Adjust the section prefixes of some KVM x86 functions: with the
monolithic KVM model the section checker can now do a more accurate
static analysis at build time, and it found a potentially
kernel-crashing bug. This also allows building without
CONFIG_SECTION_MISMATCH_WARN_ONLY=n.

The __exit is removed from machine_unsetup because
kvm_arch_hardware_unsetup() is called by kvm_init(), which is in the
__init section. The __init section isn't allowed to call a function
located in the __exit section, which is dropped during the kernel
link: the kernel would crash if that call were made.
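
A minimal sketch of the section rules involved (hypothetical
functions, only to illustrate why such a call would crash):

    static void __exit teardown(void)
    {
        /* .exit.text: discarded at link time for built-in code, it
         * only survives in modules to support module unload.
         */
    }

    static int __init setup(void)
    {
        /* .init.text: runs once during boot. Calling teardown() from
         * here in a built-in configuration would jump into code that
         * was never linked in, which is what modpost now flags.
         */
        teardown();
        return 0;
    }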

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 4 ++--
 arch/x86/kvm/svm.c              | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 47eeb92d4b4a..0ae65148e5ed 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1008,7 +1008,7 @@ extern int kvm_x86_hardware_enable(void);
 extern void kvm_x86_hardware_disable(void);
 extern __init int kvm_x86_check_processor_compatibility(void);
 extern __init int kvm_x86_hardware_setup(void);
-extern __exit void kvm_x86_hardware_unsetup(void);
+extern void kvm_x86_hardware_unsetup(void);
 extern bool kvm_x86_cpu_has_accelerated_tpr(void);
 extern bool kvm_x86_has_emulated_msr(int index);
 extern void kvm_x86_cpuid_update(struct kvm_vcpu *vcpu);
@@ -1199,7 +1199,7 @@ struct kvm_x86_ops {
 	void (*hardware_disable)(void);
 	int (*check_processor_compatibility)(void);/* __init */
 	int (*hardware_setup)(void);               /* __init */
-	void (*hardware_unsetup)(void);            /* __exit */
+	void (*hardware_unsetup)(void);
 	bool (*cpu_has_accelerated_tpr)(void);
 	bool (*has_emulated_msr)(int index);
 	void (*cpuid_update)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index aa8c0efdc441..057ba1f8d7b3 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1405,7 +1405,7 @@ __init int kvm_x86_hardware_setup(void)
 	return r;
 }
 
-__exit void kvm_x86_hardware_unsetup(void)
+void kvm_x86_hardware_unsetup(void)
 {
 	int cpu;
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index aad44e62b20a..2ae162eb082e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7645,7 +7645,7 @@ __init int kvm_x86_hardware_setup(void)
 	return r;
 }
 
-__exit void kvm_x86_hardware_unsetup(void)
+void kvm_x86_hardware_unsetup(void)
 {
 	if (nested)
 		nested_vmx_hardware_unsetup();


* [PATCH 07/14] KVM: monolithic: x86: remove __init section prefix from kvm_x86_cpu_has_kvm_support
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (4 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 06/14] KVM: monolithic: x86: remove __exit section prefix from machine_unsetup Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 08/14] KVM: monolithic: x86: remove exports Andrea Arcangeli
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

Adjust the section prefixes of some KVM x86 functions: with the
monolithic KVM model the section checker can now do a more accurate
static analysis at build time. This also allows building without
CONFIG_SECTION_MISMATCH_WARN_ONLY=n.

The __init needs to be removed from the vmx implementation even
though only svm calls it, from kvm_x86_hardware_enable() which is
eventually called by hardware_enable_nolock(); otherwise there's a
(potentially false positive) warning, false positive because this
function isn't called in the vmx case. If the call isn't needed, the
right cleanup isn't to put the function in the __init section but to
drop it. As long as it's defined in vmx as a kvm_x86 operation, it
can be expected to be eventually called at runtime while hot plugging
new CPUs.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0ae65148e5ed..75affbf7861b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1002,7 +1002,7 @@ struct kvm_lapic_irq {
 	bool msi_redir_hint;
 };
 
-extern __init int kvm_x86_cpu_has_kvm_support(void);
+extern int kvm_x86_cpu_has_kvm_support(void);
 extern __init int kvm_x86_disabled_by_bios(void);
 extern int kvm_x86_hardware_enable(void);
 extern void kvm_x86_hardware_disable(void);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2ae162eb082e..faccffc4709e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2094,7 +2094,7 @@ void kvm_x86_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 	}
 }
 
-__init int kvm_x86_cpu_has_kvm_support(void)
+int kvm_x86_cpu_has_kvm_support(void)
 {
 	return cpu_has_vmx();
 }


* [PATCH 08/14] KVM: monolithic: x86: remove exports
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (5 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 07/14] KVM: monolithic: x86: remove __init section prefix from kvm_x86_cpu_has_kvm_support Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 09/14] KVM: monolithic: remove exports from KVM common code Andrea Arcangeli
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

The exports would be duplicated across kvm-amd and kvm-intel if they
were kept; removing them cleans up various harmless warnings about
it.

The following warnings remain for now so the kvmgt driver can still
be loaded. These remaining warnings can be handled later.

WARNING: arch/x86/kvm/kvm-amd: 'kvm_slot_page_track_add_page' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'kvm_slot_page_track_remove_page' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'kvm_page_track_register_notifier' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'kvm_page_track_unregister_notifier' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/cpuid.c    |   5 --
 arch/x86/kvm/hyperv.c   |   2 -
 arch/x86/kvm/irq.c      |   4 --
 arch/x86/kvm/irq_comm.c |   2 -
 arch/x86/kvm/lapic.c    |  16 ------
 arch/x86/kvm/mmu.c      |  24 ---------
 arch/x86/kvm/mtrr.c     |   2 -
 arch/x86/kvm/pmu.c      |   3 --
 arch/x86/kvm/x86.c      | 106 ----------------------------------------
 9 files changed, 164 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2eacf9cea254..3c83921eb8b6 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -50,7 +50,6 @@ bool kvm_mpx_supported(void)
 	return ((host_xcr0 & (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR))
 		 && kvm_x86_mpx_supported());
 }
-EXPORT_SYMBOL_GPL(kvm_mpx_supported);
 
 u64 kvm_supported_xcr0(void)
 {
@@ -192,7 +191,6 @@ int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu)
 not_found:
 	return 36;
 }
-EXPORT_SYMBOL_GPL(cpuid_query_maxphyaddr);
 
 /* when an old userspace process fills a new kernel module */
 int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
@@ -960,7 +958,6 @@ struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
 	}
 	return best;
 }
-EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry);
 
 /*
  * If no match is found, check whether we exceed the vCPU's limit
@@ -1011,7 +1008,6 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, entry_found);
 	return entry_found;
 }
-EXPORT_SYMBOL_GPL(kvm_cpuid);
 
 int kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
 {
@@ -1029,4 +1025,3 @@ int kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
 	kvm_rdx_write(vcpu, edx);
 	return kvm_skip_emulated_instruction(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 1621bd0ce00c..07b2404cd469 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -711,7 +711,6 @@ bool kvm_hv_assist_page_enabled(struct kvm_vcpu *vcpu)
 		return false;
 	return vcpu->arch.pv_eoi.msr_val & KVM_MSR_ENABLED;
 }
-EXPORT_SYMBOL_GPL(kvm_hv_assist_page_enabled);
 
 bool kvm_hv_get_assist_page(struct kvm_vcpu *vcpu,
 			    struct hv_vp_assist_page *assist_page)
@@ -721,7 +720,6 @@ bool kvm_hv_get_assist_page(struct kvm_vcpu *vcpu,
 	return !kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.pv_eoi.data,
 				      assist_page, sizeof(*assist_page));
 }
-EXPORT_SYMBOL_GPL(kvm_hv_get_assist_page);
 
 static void stimer_prepare_msg(struct kvm_vcpu_hv_stimer *stimer)
 {
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index e330e7d125f7..ba4300f36a32 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -26,7 +26,6 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 
 	return 0;
 }
-EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
 
 /*
  * check if there is a pending userspace external interrupt
@@ -109,7 +108,6 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
 
 	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
 }
-EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
 
 /*
  * Read pending interrupt(from non-APIC source)
@@ -146,14 +144,12 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
 
 	return kvm_get_apic_interrupt(v);	/* APIC */
 }
-EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
 
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
 {
 	if (lapic_in_kernel(vcpu))
 		kvm_inject_apic_timer_irqs(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_inject_pending_timer_irqs);
 
 void __kvm_migrate_timers(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 8ecd48d31800..64a13d5fcc9f 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -122,7 +122,6 @@ void kvm_set_msi_irq(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
 	irq->level = 1;
 	irq->shorthand = 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_msi_irq);
 
 static inline bool kvm_msi_route_invalid(struct kvm *kvm,
 		struct kvm_kernel_irq_routing_entry *e)
@@ -346,7 +345,6 @@ bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
 
 	return r == 1;
 }
-EXPORT_SYMBOL_GPL(kvm_intr_is_single_vcpu);
 
 #define IOAPIC_ROUTING_ENTRY(irq) \
 	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index a588907f07c6..7ddea52674a4 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -122,7 +122,6 @@ bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
 {
 	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_can_post_timer_interrupt);
 
 static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
 {
@@ -412,7 +411,6 @@ bool __kvm_apic_update_irr(u32 *pir, void *regs, int *max_irr)
 	return ((max_updated_irr != -1) &&
 		(max_updated_irr == *max_irr));
 }
-EXPORT_SYMBOL_GPL(__kvm_apic_update_irr);
 
 bool kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir, int *max_irr)
 {
@@ -420,7 +418,6 @@ bool kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir, int *max_irr)
 
 	return __kvm_apic_update_irr(pir, apic->regs, max_irr);
 }
-EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
 
 static inline int apic_search_irr(struct kvm_lapic *apic)
 {
@@ -544,7 +541,6 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
 	 */
 	return apic_find_highest_irr(vcpu->arch.apic);
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_find_highest_irr);
 
 static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 			     int vector, int level, int trig_mode,
@@ -712,7 +708,6 @@ void kvm_apic_update_ppr(struct kvm_vcpu *vcpu)
 {
 	apic_update_ppr(vcpu->arch.apic);
 }
-EXPORT_SYMBOL_GPL(kvm_apic_update_ppr);
 
 static void apic_set_tpr(struct kvm_lapic *apic, u32 tpr)
 {
@@ -823,7 +818,6 @@ bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
 		return false;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_apic_match_dest);
 
 int kvm_vector_to_index(u32 vector, u32 dest_vcpus,
 		       const unsigned long *bitmap, u32 bitmap_size)
@@ -1196,7 +1190,6 @@ void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector)
 	kvm_ioapic_send_eoi(apic, vector);
 	kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_apic_set_eoi_accelerated);
 
 static void apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
 {
@@ -1355,7 +1348,6 @@ int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_reg_read);
 
 static int apic_mmio_in_range(struct kvm_lapic *apic, gpa_t addr)
 {
@@ -1533,7 +1525,6 @@ void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
 	if (lapic_timer_int_injected(vcpu))
 		__kvm_wait_lapic_expire(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_wait_lapic_expire);
 
 static void kvm_apic_inject_pending_timer_irqs(struct kvm_lapic *apic)
 {
@@ -1698,7 +1689,6 @@ bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
 
 	return vcpu->arch.apic->lapic_timer.hv_timer_in_use;
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_hv_timer_in_use);
 
 static void cancel_hv_timer(struct kvm_lapic *apic)
 {
@@ -1799,13 +1789,11 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)
 out:
 	preempt_enable();
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_expired_hv_timer);
 
 void kvm_lapic_switch_to_hv_timer(struct kvm_vcpu *vcpu)
 {
 	restart_apic_timer(vcpu->arch.apic);
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_switch_to_hv_timer);
 
 void kvm_lapic_switch_to_sw_timer(struct kvm_vcpu *vcpu)
 {
@@ -1817,7 +1805,6 @@ void kvm_lapic_switch_to_sw_timer(struct kvm_vcpu *vcpu)
 		start_sw_timer(apic);
 	preempt_enable();
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_switch_to_sw_timer);
 
 void kvm_lapic_restart_hv_timer(struct kvm_vcpu *vcpu)
 {
@@ -1987,7 +1974,6 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_reg_write);
 
 static int apic_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this,
 			    gpa_t address, int len, const void *data)
@@ -2026,7 +2012,6 @@ void kvm_lapic_set_eoi(struct kvm_vcpu *vcpu)
 {
 	kvm_lapic_reg_write(vcpu->arch.apic, APIC_EOI, 0);
 }
-EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi);
 
 /* emulate APIC access in a trap manner */
 void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
@@ -2041,7 +2026,6 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
 	/* TODO: optimize to just emulate side effect w/o one more write */
 	kvm_lapic_reg_write(vcpu->arch.apic, offset, val);
 }
-EXPORT_SYMBOL_GPL(kvm_apic_write_nodecode);
 
 void kvm_free_lapic(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 80124d00c504..d2b56dba7c77 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -308,7 +308,6 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask, u64 mmio_value, u64 access_mask)
 	shadow_mmio_mask = mmio_mask | SPTE_SPECIAL_MASK;
 	shadow_mmio_access_mask = access_mask;
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
 
 static bool is_mmio_spte(u64 spte)
 {
@@ -474,7 +473,6 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
 	shadow_acc_track_mask = acc_track_mask;
 	shadow_me_mask = me_mask;
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_set_mask_ptes);
 
 static u8 kvm_get_shadow_phys_bits(void)
 {
@@ -1702,7 +1700,6 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
 		mask &= mask - 1;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_clear_dirty_pt_masked);
 
 /**
  * kvm_arch_mmu_enable_log_dirty_pt_masked - enable dirty logging for selected
@@ -2853,7 +2850,6 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 
 	return r;
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page);
 
 static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
@@ -3621,7 +3617,6 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	kvm_mmu_commit_zap_page(vcpu->kvm, &invalid_list);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_free_roots);
 
 static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn)
 {
@@ -3846,7 +3841,6 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 	kvm_mmu_audit(vcpu, AUDIT_POST_SYNC);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_sync_roots);
 
 static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
 				  u32 access, struct x86_exception *exception)
@@ -4114,7 +4108,6 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 	}
 	return r;
 }
-EXPORT_SYMBOL_GPL(kvm_handle_page_fault);
 
 static bool
 check_hugepage_cache_consistency(struct kvm_vcpu *vcpu, gfn_t gfn, int level)
@@ -4294,7 +4287,6 @@ void kvm_mmu_new_cr3(struct kvm_vcpu *vcpu, gpa_t new_cr3, bool skip_tlb_flush)
 	__kvm_mmu_new_cr3(vcpu, new_cr3, kvm_mmu_calc_root_page_role(vcpu),
 			  skip_tlb_flush);
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_new_cr3);
 
 static unsigned long get_cr3(struct kvm_vcpu *vcpu)
 {
@@ -4534,7 +4526,6 @@ reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 	}
 
 }
-EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask);
 
 static inline bool boot_cpu_is_amd(void)
 {
@@ -4954,7 +4945,6 @@ void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu)
 	context->mmu_role.as_u64 = new_role.as_u64;
 	reset_shadow_zero_bits_mask(vcpu, context);
 }
-EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu);
 
 static union kvm_mmu_role
 kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
@@ -5018,7 +5008,6 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 	reset_rsvds_bits_mask_ept(vcpu, context, execonly);
 	reset_ept_shadow_zero_bits_mask(vcpu, context, execonly);
 }
-EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
 
 static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
 {
@@ -5098,7 +5087,6 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu, bool reset_roots)
 	else
 		init_kvm_softmmu(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_init_mmu);
 
 static union kvm_mmu_page_role
 kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu)
@@ -5118,7 +5106,6 @@ void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
 	kvm_mmu_unload(vcpu);
 	kvm_init_mmu(vcpu, true);
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_reset_context);
 
 int kvm_mmu_load(struct kvm_vcpu *vcpu)
 {
@@ -5136,7 +5123,6 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 out:
 	return r;
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_load);
 
 void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 {
@@ -5145,7 +5131,6 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 	kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
 	WARN_ON(VALID_PAGE(vcpu->arch.guest_mmu.root_hpa));
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_unload);
 
 static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu_page *sp, u64 *spte,
@@ -5357,7 +5342,6 @@ int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
 
 	return r;
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page_virt);
 
 static int make_mmu_pages_available(struct kvm_vcpu *vcpu)
 {
@@ -5464,7 +5448,6 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u64 error_code,
 		BUG();
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_page_fault);
 
 void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva)
 {
@@ -5495,7 +5478,6 @@ void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva)
 	kvm_x86_tlb_flush_gva(vcpu, gva);
 	++vcpu->stat.invlpg;
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_invlpg);
 
 void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
 {
@@ -5527,19 +5509,16 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
 	 * for them.
 	 */
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_invpcid_gva);
 
 void kvm_enable_tdp(void)
 {
 	tdp_enabled = true;
 }
-EXPORT_SYMBOL_GPL(kvm_enable_tdp);
 
 void kvm_disable_tdp(void)
 {
 	tdp_enabled = false;
 }
-EXPORT_SYMBOL_GPL(kvm_disable_tdp);
 
 
 /* The return value indicates if tlb flush on all vcpus is needed. */
@@ -5920,7 +5899,6 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 		kvm_flush_remote_tlbs_with_address(kvm, memslot->base_gfn,
 				memslot->npages);
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_slot_leaf_clear_dirty);
 
 void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
 					struct kvm_memory_slot *memslot)
@@ -5939,7 +5917,6 @@ void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
 		kvm_flush_remote_tlbs_with_address(kvm, memslot->base_gfn,
 				memslot->npages);
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_slot_largepage_remove_write_access);
 
 void kvm_mmu_slot_set_dirty(struct kvm *kvm,
 			    struct kvm_memory_slot *memslot)
@@ -5957,7 +5934,6 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm,
 		kvm_flush_remote_tlbs_with_address(kvm, memslot->base_gfn,
 				memslot->npages);
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_slot_set_dirty);
 
 static void __kvm_mmu_zap_all(struct kvm *kvm, bool mmio_only)
 {
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 25ce3edd1872..477f7141f793 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -91,7 +91,6 @@ bool kvm_mtrr_valid(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 
 	return true;
 }
-EXPORT_SYMBOL_GPL(kvm_mtrr_valid);
 
 static bool mtrr_is_enabled(struct kvm_mtrr *mtrr_state)
 {
@@ -686,7 +685,6 @@ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn)
 
 	return type;
 }
-EXPORT_SYMBOL_GPL(kvm_mtrr_get_guest_memory_type);
 
 bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 					  int page_num)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 144e5d0c25ff..0ac70bad4b31 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -200,7 +200,6 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 			      (eventsel & HSW_IN_TX),
 			      (eventsel & HSW_IN_TX_CHECKPOINTED));
 }
-EXPORT_SYMBOL_GPL(reprogram_gp_counter);
 
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 {
@@ -230,7 +229,6 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
 }
-EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
 
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 {
@@ -248,7 +246,6 @@ void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
 		reprogram_fixed_counter(pmc, ctrl, idx);
 	}
 }
-EXPORT_SYMBOL_GPL(reprogram_counter);
 
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 601190de4f87..b2b7a22d8503 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -76,7 +76,6 @@
 #define MAX_IO_MSRS 256
 #define KVM_MAX_MCE_BANKS 32
 u64 __read_mostly kvm_mce_cap_supported = MCG_CTL_P | MCG_SER_P;
-EXPORT_SYMBOL_GPL(kvm_mce_cap_supported);
 
 #define emul_to_vcpu(ctxt) \
 	container_of(ctxt, struct kvm_vcpu, arch.emulate_ctxt)
@@ -106,7 +105,6 @@ static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
 
 struct kvm_x86_ops *kvm_x86_ops __read_mostly;
-EXPORT_SYMBOL_GPL(kvm_x86_ops);
 
 static bool __read_mostly ignore_msrs = 0;
 module_param(ignore_msrs, bool, S_IRUGO | S_IWUSR);
@@ -121,15 +119,10 @@ static bool __read_mostly kvmclock_periodic_sync = true;
 module_param(kvmclock_periodic_sync, bool, S_IRUGO);
 
 bool __read_mostly kvm_has_tsc_control;
-EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
 u32  __read_mostly kvm_max_guest_tsc_khz;
-EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
 u8   __read_mostly kvm_tsc_scaling_ratio_frac_bits;
-EXPORT_SYMBOL_GPL(kvm_tsc_scaling_ratio_frac_bits);
 u64  __read_mostly kvm_max_tsc_scaling_ratio;
-EXPORT_SYMBOL_GPL(kvm_max_tsc_scaling_ratio);
 u64 __read_mostly kvm_default_tsc_scaling_ratio;
-EXPORT_SYMBOL_GPL(kvm_default_tsc_scaling_ratio);
 
 /* tsc tolerance in parts per million - default to 1/2 of the NTP threshold */
 static u32 __read_mostly tsc_tolerance_ppm = 250;
@@ -149,7 +142,6 @@ module_param(vector_hashing, bool, S_IRUGO);
 
 bool __read_mostly enable_vmware_backdoor = false;
 module_param(enable_vmware_backdoor, bool, S_IRUGO);
-EXPORT_SYMBOL_GPL(enable_vmware_backdoor);
 
 static bool __read_mostly force_emulation_prefix = false;
 module_param(force_emulation_prefix, bool, S_IRUGO);
@@ -221,7 +213,6 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 u64 __read_mostly host_xcr0;
 
 struct kmem_cache *x86_fpu_cache;
-EXPORT_SYMBOL_GPL(x86_fpu_cache);
 
 static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt);
 
@@ -283,7 +274,6 @@ void kvm_define_shared_msr(unsigned slot, u32 msr)
 	if (slot >= shared_msrs_global.nr)
 		shared_msrs_global.nr = slot + 1;
 }
-EXPORT_SYMBOL_GPL(kvm_define_shared_msr);
 
 static void kvm_shared_msr_cpu_online(void)
 {
@@ -313,7 +303,6 @@ int kvm_set_shared_msr(unsigned slot, u64 value, u64 mask)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_shared_msr);
 
 static void drop_user_return_notifiers(void)
 {
@@ -328,13 +317,11 @@ u64 kvm_get_apic_base(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.apic_base;
 }
-EXPORT_SYMBOL_GPL(kvm_get_apic_base);
 
 enum lapic_mode kvm_get_apic_mode(struct kvm_vcpu *vcpu)
 {
 	return kvm_apic_mode(kvm_get_apic_base(vcpu));
 }
-EXPORT_SYMBOL_GPL(kvm_get_apic_mode);
 
 int kvm_set_apic_base(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
@@ -355,14 +342,12 @@ int kvm_set_apic_base(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	kvm_lapic_set_base(vcpu, msr_info->data);
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_apic_base);
 
 asmlinkage __visible void kvm_spurious_fault(void)
 {
 	/* Fault while not rebooting.  We want the trace. */
 	BUG();
 }
-EXPORT_SYMBOL_GPL(kvm_spurious_fault);
 
 #define EXCPT_BENIGN		0
 #define EXCPT_CONTRIBUTORY	1
@@ -450,7 +435,6 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu)
 	vcpu->arch.exception.has_payload = false;
 	vcpu->arch.exception.payload = 0;
 }
-EXPORT_SYMBOL_GPL(kvm_deliver_exception_payload);
 
 static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 		unsigned nr, bool has_error, u32 error_code,
@@ -544,13 +528,11 @@ void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr)
 {
 	kvm_multiple_exception(vcpu, nr, false, 0, false, 0, false);
 }
-EXPORT_SYMBOL_GPL(kvm_queue_exception);
 
 void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr)
 {
 	kvm_multiple_exception(vcpu, nr, false, 0, false, 0, true);
 }
-EXPORT_SYMBOL_GPL(kvm_requeue_exception);
 
 static void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr,
 				  unsigned long payload)
@@ -574,7 +556,6 @@ int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err)
 
 	return 1;
 }
-EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
 
 void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
 {
@@ -589,7 +570,6 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
 					fault->address);
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_inject_page_fault);
 
 static bool kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
 {
@@ -606,19 +586,16 @@ void kvm_inject_nmi(struct kvm_vcpu *vcpu)
 	atomic_inc(&vcpu->arch.nmi_queued);
 	kvm_make_request(KVM_REQ_NMI, vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_inject_nmi);
 
 void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
 {
 	kvm_multiple_exception(vcpu, nr, true, error_code, false, 0, false);
 }
-EXPORT_SYMBOL_GPL(kvm_queue_exception_e);
 
 void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
 {
 	kvm_multiple_exception(vcpu, nr, true, error_code, false, 0, true);
 }
-EXPORT_SYMBOL_GPL(kvm_requeue_exception_e);
 
 /*
  * Checks if cpl <= required_cpl; if true, return true.  Otherwise queue
@@ -631,7 +608,6 @@ bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl)
 	kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
 	return false;
 }
-EXPORT_SYMBOL_GPL(kvm_require_cpl);
 
 bool kvm_require_dr(struct kvm_vcpu *vcpu, int dr)
 {
@@ -641,7 +617,6 @@ bool kvm_require_dr(struct kvm_vcpu *vcpu, int dr)
 	kvm_queue_exception(vcpu, UD_VECTOR);
 	return false;
 }
-EXPORT_SYMBOL_GPL(kvm_require_dr);
 
 /*
  * This function will be used to read from the physical memory of the currently
@@ -665,7 +640,6 @@ int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 
 	return kvm_vcpu_read_guest_page(vcpu, real_gfn, data, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_read_guest_page_mmu);
 
 static int kvm_read_nested_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn,
 			       void *data, int offset, int len, u32 access)
@@ -716,7 +690,6 @@ int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3)
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(load_pdptrs);
 
 bool pdptrs_changed(struct kvm_vcpu *vcpu)
 {
@@ -744,7 +717,6 @@ bool pdptrs_changed(struct kvm_vcpu *vcpu)
 
 	return changed;
 }
-EXPORT_SYMBOL_GPL(pdptrs_changed);
 
 int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 {
@@ -803,13 +775,11 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_cr0);
 
 void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw)
 {
 	(void)kvm_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~0x0eul) | (msw & 0x0f));
 }
-EXPORT_SYMBOL_GPL(kvm_lmsw);
 
 void kvm_load_guest_xcr0(struct kvm_vcpu *vcpu)
 {
@@ -821,7 +791,6 @@ void kvm_load_guest_xcr0(struct kvm_vcpu *vcpu)
 		vcpu->guest_xcr0_loaded = 1;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_load_guest_xcr0);
 
 void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
 {
@@ -831,7 +800,6 @@ void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
 		vcpu->guest_xcr0_loaded = 0;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_put_guest_xcr0);
 
 static int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 {
@@ -882,7 +850,6 @@ int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_xcr);
 
 int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
@@ -944,7 +911,6 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_cr4);
 
 int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
@@ -979,7 +945,6 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_cr3);
 
 int kvm_set_cr8(struct kvm_vcpu *vcpu, unsigned long cr8)
 {
@@ -991,7 +956,6 @@ int kvm_set_cr8(struct kvm_vcpu *vcpu, unsigned long cr8)
 		vcpu->arch.cr8 = cr8;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_cr8);
 
 unsigned long kvm_get_cr8(struct kvm_vcpu *vcpu)
 {
@@ -1000,7 +964,6 @@ unsigned long kvm_get_cr8(struct kvm_vcpu *vcpu)
 	else
 		return vcpu->arch.cr8;
 }
-EXPORT_SYMBOL_GPL(kvm_get_cr8);
 
 static void kvm_update_dr0123(struct kvm_vcpu *vcpu)
 {
@@ -1079,7 +1042,6 @@ int kvm_set_dr(struct kvm_vcpu *vcpu, int dr, unsigned long val)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_dr);
 
 int kvm_get_dr(struct kvm_vcpu *vcpu, int dr, unsigned long *val)
 {
@@ -1103,7 +1065,6 @@ int kvm_get_dr(struct kvm_vcpu *vcpu, int dr, unsigned long *val)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_get_dr);
 
 bool kvm_rdpmc(struct kvm_vcpu *vcpu)
 {
@@ -1118,7 +1079,6 @@ bool kvm_rdpmc(struct kvm_vcpu *vcpu)
 	kvm_rdx_write(vcpu, data >> 32);
 	return err;
 }
-EXPORT_SYMBOL_GPL(kvm_rdpmc);
 
 /*
  * List of msr numbers which we expose to userspace through KVM_GET_MSRS
@@ -1325,7 +1285,6 @@ bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
 
 	return __kvm_valid_efer(vcpu, efer);
 }
-EXPORT_SYMBOL_GPL(kvm_valid_efer);
 
 static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
@@ -1360,7 +1319,6 @@ void kvm_enable_efer_bits(u64 mask)
 {
        efer_reserved_bits &= ~mask;
 }
-EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
 
 /*
  * Write @data into the MSR specified by @index.  Select MSR specific fault
@@ -1431,13 +1389,11 @@ int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data)
 {
 	return __kvm_get_msr(vcpu, index, data, false);
 }
-EXPORT_SYMBOL_GPL(kvm_get_msr);
 
 int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data)
 {
 	return __kvm_set_msr(vcpu, index, data, false);
 }
-EXPORT_SYMBOL_GPL(kvm_set_msr);
 
 int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
 {
@@ -1456,7 +1412,6 @@ int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
 	kvm_rdx_write(vcpu, (data >> 32) & -1u);
 	return kvm_skip_emulated_instruction(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_rdmsr);
 
 int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
 {
@@ -1472,7 +1427,6 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
 	trace_kvm_msr_write(ecx, data);
 	return kvm_skip_emulated_instruction(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
 
 /*
  * Adapt set_msr() to msr_io()'s calling convention
@@ -1771,7 +1725,6 @@ u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
 
 	return _tsc;
 }
-EXPORT_SYMBOL_GPL(kvm_scale_tsc);
 
 static u64 kvm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 {
@@ -1788,7 +1741,6 @@ u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
 
 	return tsc_offset + kvm_scale_tsc(vcpu, host_tsc);
 }
-EXPORT_SYMBOL_GPL(kvm_read_l1_tsc);
 
 static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
 {
@@ -1911,7 +1863,6 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	spin_unlock(&kvm->arch.pvclock_gtod_sync_lock);
 }
 
-EXPORT_SYMBOL_GPL(kvm_write_tsc);
 
 static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
 					   s64 adjustment)
@@ -2821,7 +2772,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_set_msr_common);
 
 static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
 {
@@ -3060,7 +3010,6 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
@@ -5299,7 +5248,6 @@ int kvm_read_guest_virt(struct kvm_vcpu *vcpu,
 	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, access,
 					  exception);
 }
-EXPORT_SYMBOL_GPL(kvm_read_guest_virt);
 
 static int emulator_read_std(struct x86_emulate_ctxt *ctxt,
 			     gva_t addr, void *val, unsigned int bytes,
@@ -5384,7 +5332,6 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
 	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
 					   PFERR_WRITE_MASK, exception);
 }
-EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);
 
 int handle_ud(struct kvm_vcpu *vcpu)
 {
@@ -5408,7 +5355,6 @@ int handle_ud(struct kvm_vcpu *vcpu)
 		kvm_queue_exception(vcpu, UD_VECTOR);
 	return 1;
 }
-EXPORT_SYMBOL_GPL(handle_ud);
 
 static int vcpu_is_mmio_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
 			    gpa_t gpa, bool write)
@@ -5848,7 +5794,6 @@ int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
 	kvm_emulate_wbinvd_noskip(vcpu);
 	return kvm_skip_emulated_instruction(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_wbinvd);
 
 
 
@@ -6249,7 +6194,6 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
 
 	return EMULATE_DONE;
 }
-EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
 
 static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
 {
@@ -6461,7 +6405,6 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
 		kvm_vcpu_do_singlestep(vcpu, &r);
 	return r == EMULATE_DONE;
 }
-EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
 
 static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
 {
@@ -6691,14 +6634,12 @@ int kvm_emulate_instruction(struct kvm_vcpu *vcpu, int emulation_type)
 {
 	return x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0);
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_instruction);
 
 int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu,
 					void *insn, int insn_len)
 {
 	return x86_emulate_instruction(vcpu, 0, 0, insn, insn_len);
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_instruction_from_buffer);
 
 static int complete_fast_pio_out_port_0x7e(struct kvm_vcpu *vcpu)
 {
@@ -6799,7 +6740,6 @@ int kvm_fast_pio(struct kvm_vcpu *vcpu, int size, unsigned short port, int in)
 		ret = kvm_fast_pio_out(vcpu, size, port);
 	return ret && kvm_skip_emulated_instruction(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_fast_pio);
 
 static int kvmclock_cpu_down_prep(unsigned int cpu)
 {
@@ -6986,7 +6926,6 @@ static void kvm_timer_init(void)
 }
 
 DEFINE_PER_CPU(struct kvm_vcpu *, current_vcpu);
-EXPORT_PER_CPU_SYMBOL_GPL(current_vcpu);
 
 int kvm_is_in_guest(void)
 {
@@ -7190,7 +7129,6 @@ int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_halt);
 
 int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 {
@@ -7201,7 +7139,6 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 	 */
 	return kvm_vcpu_halt(vcpu) && ret;
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_halt);
 
 #ifdef CONFIG_X86_64
 static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
@@ -7345,7 +7282,6 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 	++vcpu->stat.hypercalls;
 	return kvm_skip_emulated_instruction(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
 
 static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
 {
@@ -7851,13 +7787,11 @@ void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
 	 */
 	put_page(page);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_reload_apic_access_page);
 
 void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu)
 {
 	smp_send_reschedule(vcpu->cpu);
 }
-EXPORT_SYMBOL_GPL(__kvm_request_immediate_exit);
 
 /*
  * Returns 1 to let vcpu_run() continue the guest execution loop without
@@ -8533,7 +8467,6 @@ void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
 	*db = cs.db;
 	*l = cs.l;
 }
-EXPORT_SYMBOL_GPL(kvm_get_cs_db_l_bits);
 
 static void __get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 {
@@ -8645,7 +8578,6 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 	return EMULATE_DONE;
 }
-EXPORT_SYMBOL_GPL(kvm_task_switch);
 
 static int kvm_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 {
@@ -9246,7 +9178,6 @@ bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
 {
 	return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id;
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_is_reset_bsp);
 
 bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu)
 {
@@ -9254,7 +9185,6 @@ bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu)
 }
 
 struct static_key kvm_no_apic_vcpu __read_mostly;
-EXPORT_SYMBOL_GPL(kvm_no_apic_vcpu);
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
@@ -9476,7 +9406,6 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(__x86_set_memory_region);
 
 int x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
 {
@@ -9488,7 +9417,6 @@ int x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
 
 	return r;
 }
-EXPORT_SYMBOL_GPL(x86_set_memory_region);
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
@@ -9805,13 +9733,11 @@ unsigned long kvm_get_linear_rip(struct kvm_vcpu *vcpu)
 	return (u32)(get_segment_base(vcpu, VCPU_SREG_CS) +
 		     kvm_rip_read(vcpu));
 }
-EXPORT_SYMBOL_GPL(kvm_get_linear_rip);
 
 bool kvm_is_linear_rip(struct kvm_vcpu *vcpu, unsigned long linear_rip)
 {
 	return kvm_get_linear_rip(vcpu) == linear_rip;
 }
-EXPORT_SYMBOL_GPL(kvm_is_linear_rip);
 
 unsigned long kvm_get_rflags(struct kvm_vcpu *vcpu)
 {
@@ -9822,7 +9748,6 @@ unsigned long kvm_get_rflags(struct kvm_vcpu *vcpu)
 		rflags &= ~X86_EFLAGS_TF;
 	return rflags;
 }
-EXPORT_SYMBOL_GPL(kvm_get_rflags);
 
 static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
 {
@@ -9837,7 +9762,6 @@ void kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
 	__kvm_set_rflags(vcpu, rflags);
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_set_rflags);
 
 void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 {
@@ -10044,37 +9968,31 @@ void kvm_arch_start_assignment(struct kvm *kvm)
 {
 	atomic_inc(&kvm->arch.assigned_device_count);
 }
-EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);
 
 void kvm_arch_end_assignment(struct kvm *kvm)
 {
 	atomic_dec(&kvm->arch.assigned_device_count);
 }
-EXPORT_SYMBOL_GPL(kvm_arch_end_assignment);
 
 bool kvm_arch_has_assigned_device(struct kvm *kvm)
 {
 	return atomic_read(&kvm->arch.assigned_device_count);
 }
-EXPORT_SYMBOL_GPL(kvm_arch_has_assigned_device);
 
 void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
 {
 	atomic_inc(&kvm->arch.noncoherent_dma_count);
 }
-EXPORT_SYMBOL_GPL(kvm_arch_register_noncoherent_dma);
 
 void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
 {
 	atomic_dec(&kvm->arch.noncoherent_dma_count);
 }
-EXPORT_SYMBOL_GPL(kvm_arch_unregister_noncoherent_dma);
 
 bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
 {
 	return atomic_read(&kvm->arch.noncoherent_dma_count);
 }
-EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);
 
 bool kvm_arch_has_irq_bypass(void)
 {
@@ -10125,32 +10043,8 @@ bool kvm_vector_hashing_enabled(void)
 {
 	return vector_hashing;
 }
-EXPORT_SYMBOL_GPL(kvm_vector_hashing_enabled);
 
 bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 {
 	return (vcpu->arch.msr_kvm_poll_control & 1) == 0;
 }
-EXPORT_SYMBOL_GPL(kvm_arch_no_poll);
-
-
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_msr);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_cr);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmrun);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit_inject);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intr_vmexit);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmenter_failed);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_invlpga);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_skinit);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_intercepts);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_write_tsc_offset);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_ple_window_update);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_pml_full);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_pi_irte_update);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_unaccelerated_access);
-EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_incomplete_ipi);

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/14] KVM: monolithic: remove exports from KVM common code
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (6 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 08/14] KVM: monolithic: x86: remove exports Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 10/14] KVM: monolithic: x86: drop the kvm_pmu_ops structure Andrea Arcangeli
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

If the exports were kept they would be duplicated across kvm-amd and
kvm-intel, and that would cause various harmless warnings.

The warnings aren't particularly concerning because the two modules
can't load at the same time, but it's cleaner to remove the warnings
by removing the exports.

This commit might break non-x86 archs, but it should be simple to make
them monolithic too (if they're not already).

In the unlikely case there's a legit reason not to go monolithic in
any arch and to keep kvm.ko around, we'll need a way to retain the
exports. In that case this commit would need to be reverted and the
exports in the kvm common code would then have to be made conditional
on a new opt-in per-arch CONFIG option.
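
As a rough sketch of what that opt-in could look like (the CONFIG
symbol and the wrapper macro below are made-up names, only meant as an
illustration and not part of this series), the common code could wrap
its exports so they compile away in the monolithic case:

#include <linux/export.h>

#ifdef CONFIG_KVM_COMMON_EXPORTS	/* hypothetical opt-in CONFIG symbol */
#define KVM_EXPORT_SYMBOL_GPL(sym)	EXPORT_SYMBOL_GPL(sym)
#else
#define KVM_EXPORT_SYMBOL_GPL(sym)
#endif

/* virt/kvm/kvm_main.c would then keep lines like: */
KVM_EXPORT_SYMBOL_GPL(kvm_vcpu_cache);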

The following warnings remain for now so that the kvmgt driver can
still be loaded. These remaining warnings can be handled later.

WARNING: arch/x86/kvm/kvm-amd: 'kvm_get_kvm' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'kvm_put_kvm' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'gfn_to_memslot' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'kvm_is_visible_gfn' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'gfn_to_pfn' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'kvm_read_guest' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko
WARNING: arch/x86/kvm/kvm-amd: 'kvm_write_guest' exported twice. Previous export was in arch/x86/kvm/kvm-intel.ko

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 virt/kvm/eventfd.c  |  1 -
 virt/kvm/kvm_main.c | 65 ---------------------------------------------
 2 files changed, 66 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 67b6fc153e9c..4c1a8abd1458 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -462,7 +462,6 @@ bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
 
 	return false;
 }
-EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
 
 void kvm_notify_acked_gsi(struct kvm *kvm, int gsi)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9aa448ea688f..1afbb387001a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -72,22 +72,18 @@ MODULE_LICENSE("GPL");
 /* Architectures should define their poll value according to the halt latency */
 unsigned int halt_poll_ns = KVM_HALT_POLL_NS_DEFAULT;
 module_param(halt_poll_ns, uint, 0644);
-EXPORT_SYMBOL_GPL(halt_poll_ns);
 
 /* Default doubles per-vcpu halt_poll_ns. */
 unsigned int halt_poll_ns_grow = 2;
 module_param(halt_poll_ns_grow, uint, 0644);
-EXPORT_SYMBOL_GPL(halt_poll_ns_grow);
 
 /* The start value to grow halt_poll_ns from */
 unsigned int halt_poll_ns_grow_start = 10000; /* 10us */
 module_param(halt_poll_ns_grow_start, uint, 0644);
-EXPORT_SYMBOL_GPL(halt_poll_ns_grow_start);
 
 /* Default resets per-vcpu halt_poll_ns . */
 unsigned int halt_poll_ns_shrink;
 module_param(halt_poll_ns_shrink, uint, 0644);
-EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
 
 /*
  * Ordering of locks:
@@ -104,12 +100,10 @@ static int kvm_usage_count;
 static atomic_t hardware_enable_failed;
 
 struct kmem_cache *kvm_vcpu_cache;
-EXPORT_SYMBOL_GPL(kvm_vcpu_cache);
 
 static __read_mostly struct preempt_ops kvm_preempt_ops;
 
 struct dentry *kvm_debugfs_dir;
-EXPORT_SYMBOL_GPL(kvm_debugfs_dir);
 
 static int kvm_debugfs_num_entries;
 static const struct file_operations *stat_fops_per_vm[];
@@ -133,7 +127,6 @@ static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 static void mark_page_dirty_in_slot(struct kvm_memory_slot *memslot, gfn_t gfn);
 
 __visible bool kvm_rebooting;
-EXPORT_SYMBOL_GPL(kvm_rebooting);
 
 static bool largepages_enabled = true;
 
@@ -167,7 +160,6 @@ void vcpu_load(struct kvm_vcpu *vcpu)
 	kvm_arch_vcpu_load(vcpu, cpu);
 	put_cpu();
 }
-EXPORT_SYMBOL_GPL(vcpu_load);
 
 void vcpu_put(struct kvm_vcpu *vcpu)
 {
@@ -176,7 +168,6 @@ void vcpu_put(struct kvm_vcpu *vcpu)
 	preempt_notifier_unregister(&vcpu->preempt_notifier);
 	preempt_enable();
 }
-EXPORT_SYMBOL_GPL(vcpu_put);
 
 /* TODO: merge with kvm_arch_vcpu_should_kick */
 static bool kvm_request_needs_ipi(struct kvm_vcpu *vcpu, unsigned req)
@@ -280,7 +271,6 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 		++kvm->stat.remote_tlb_flush;
 	cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
 }
-EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs);
 #endif
 
 void kvm_reload_remote_mmus(struct kvm *kvm)
@@ -326,7 +316,6 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 fail:
 	return r;
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_init);
 
 void kvm_vcpu_uninit(struct kvm_vcpu *vcpu)
 {
@@ -339,7 +328,6 @@ void kvm_vcpu_uninit(struct kvm_vcpu *vcpu)
 	kvm_arch_vcpu_uninit(vcpu);
 	free_page((unsigned long)vcpu->run);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_uninit);
 
 #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
 static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)
@@ -1076,7 +1064,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
 out:
 	return r;
 }
-EXPORT_SYMBOL_GPL(__kvm_set_memory_region);
 
 int kvm_set_memory_region(struct kvm *kvm,
 			  const struct kvm_userspace_memory_region *mem)
@@ -1088,7 +1075,6 @@ int kvm_set_memory_region(struct kvm *kvm,
 	mutex_unlock(&kvm->slots_lock);
 	return r;
 }
-EXPORT_SYMBOL_GPL(kvm_set_memory_region);
 
 static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 					  struct kvm_userspace_memory_region *mem)
@@ -1130,7 +1116,6 @@ int kvm_get_dirty_log(struct kvm *kvm,
 		*is_dirty = 1;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
 
 #ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
 /**
@@ -1216,7 +1201,6 @@ int kvm_get_dirty_log_protect(struct kvm *kvm,
 		return -EFAULT;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_get_dirty_log_protect);
 
 /**
  * kvm_clear_dirty_log_protect - clear dirty bits in the bitmap
@@ -1290,7 +1274,6 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_clear_dirty_log_protect);
 #endif
 
 bool kvm_largepages_enabled(void)
@@ -1302,7 +1285,6 @@ void kvm_disable_largepages(void)
 {
 	largepages_enabled = false;
 }
-EXPORT_SYMBOL_GPL(kvm_disable_largepages);
 
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 {
@@ -1382,19 +1364,16 @@ unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot,
 {
 	return gfn_to_hva_many(slot, gfn, NULL);
 }
-EXPORT_SYMBOL_GPL(gfn_to_hva_memslot);
 
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL);
 }
-EXPORT_SYMBOL_GPL(gfn_to_hva);
 
 unsigned long kvm_vcpu_gfn_to_hva(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
 	return gfn_to_hva_many(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn, NULL);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_hva);
 
 /*
  * Return the hva of a @gfn and the R/W attribute if possible.
@@ -1656,7 +1635,6 @@ kvm_pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn,
 	return hva_to_pfn(addr, atomic, async, write_fault,
 			  writable);
 }
-EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot);
 
 kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable)
@@ -1664,31 +1642,26 @@ kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, NULL,
 				    write_fault, writable);
 }
-EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);
 
 kvm_pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 {
 	return __gfn_to_pfn_memslot(slot, gfn, false, NULL, true, NULL);
 }
-EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot);
 
 kvm_pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
 {
 	return __gfn_to_pfn_memslot(slot, gfn, true, NULL, true, NULL);
 }
-EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
 
 kvm_pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn_to_pfn_memslot_atomic(gfn_to_memslot(kvm, gfn), gfn);
 }
-EXPORT_SYMBOL_GPL(gfn_to_pfn_atomic);
 
 kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
 	return gfn_to_pfn_memslot_atomic(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_pfn_atomic);
 
 kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
 {
@@ -1700,7 +1673,6 @@ kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
 	return gfn_to_pfn_memslot(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_pfn);
 
 int gfn_to_page_many_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
 			    struct page **pages, int nr_pages)
@@ -1717,7 +1689,6 @@ int gfn_to_page_many_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
 
 	return __get_user_pages_fast(addr, nr_pages, 1, pages);
 }
-EXPORT_SYMBOL_GPL(gfn_to_page_many_atomic);
 
 static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
 {
@@ -1740,7 +1711,6 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 
 	return kvm_pfn_to_page(pfn);
 }
-EXPORT_SYMBOL_GPL(gfn_to_page);
 
 static int __kvm_map_gfn(struct kvm_memory_slot *slot, gfn_t gfn,
 			 struct kvm_host_map *map)
@@ -1780,7 +1750,6 @@ int kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
 {
 	return __kvm_map_gfn(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn, map);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_map);
 
 void kvm_vcpu_unmap(struct kvm_vcpu *vcpu, struct kvm_host_map *map,
 		    bool dirty)
@@ -1808,7 +1777,6 @@ void kvm_vcpu_unmap(struct kvm_vcpu *vcpu, struct kvm_host_map *map,
 	map->hva = NULL;
 	map->page = NULL;
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_unmap);
 
 struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
@@ -1818,7 +1786,6 @@ struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn)
 
 	return kvm_pfn_to_page(pfn);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_page);
 
 void kvm_release_page_clean(struct page *page)
 {
@@ -1826,14 +1793,12 @@ void kvm_release_page_clean(struct page *page)
 
 	kvm_release_pfn_clean(page_to_pfn(page));
 }
-EXPORT_SYMBOL_GPL(kvm_release_page_clean);
 
 void kvm_release_pfn_clean(kvm_pfn_t pfn)
 {
 	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
 		put_page(pfn_to_page(pfn));
 }
-EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
 
 void kvm_release_page_dirty(struct page *page)
 {
@@ -1841,14 +1806,12 @@ void kvm_release_page_dirty(struct page *page)
 
 	kvm_release_pfn_dirty(page_to_pfn(page));
 }
-EXPORT_SYMBOL_GPL(kvm_release_page_dirty);
 
 void kvm_release_pfn_dirty(kvm_pfn_t pfn)
 {
 	kvm_set_pfn_dirty(pfn);
 	kvm_release_pfn_clean(pfn);
 }
-EXPORT_SYMBOL_GPL(kvm_release_pfn_dirty);
 
 void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 {
@@ -1858,21 +1821,18 @@ void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 		SetPageDirty(page);
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
 
 void kvm_set_pfn_accessed(kvm_pfn_t pfn)
 {
 	if (!kvm_is_reserved_pfn(pfn))
 		mark_page_accessed(pfn_to_page(pfn));
 }
-EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
 
 void kvm_get_pfn(kvm_pfn_t pfn)
 {
 	if (!kvm_is_reserved_pfn(pfn))
 		get_page(pfn_to_page(pfn));
 }
-EXPORT_SYMBOL_GPL(kvm_get_pfn);
 
 static int next_segment(unsigned long len, int offset)
 {
@@ -1904,7 +1864,6 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 
 	return __kvm_read_guest_page(slot, gfn, data, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_read_guest_page);
 
 int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data,
 			     int offset, int len)
@@ -1913,7 +1872,6 @@ int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data,
 
 	return __kvm_read_guest_page(slot, gfn, data, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_page);
 
 int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len)
 {
@@ -1953,7 +1911,6 @@ int kvm_vcpu_read_guest(struct kvm_vcpu *vcpu, gpa_t gpa, void *data, unsigned l
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest);
 
 static int __kvm_read_guest_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
 			           void *data, int offset, unsigned long len)
@@ -1981,7 +1938,6 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data,
 
 	return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_read_guest_atomic);
 
 int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa,
 			       void *data, unsigned long len)
@@ -1992,7 +1948,6 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa,
 
 	return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
 
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
 			          const void *data, int offset, int len)
@@ -2017,7 +1972,6 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn,
 
 	return __kvm_write_guest_page(slot, gfn, data, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_write_guest_page);
 
 int kvm_vcpu_write_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn,
 			      const void *data, int offset, int len)
@@ -2026,7 +1980,6 @@ int kvm_vcpu_write_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn,
 
 	return __kvm_write_guest_page(slot, gfn, data, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest_page);
 
 int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
 		    unsigned long len)
@@ -2068,7 +2021,6 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest);
 
 static int __kvm_gfn_to_hva_cache_init(struct kvm_memslots *slots,
 				       struct gfn_to_hva_cache *ghc,
@@ -2114,7 +2066,6 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	return __kvm_gfn_to_hva_cache_init(slots, ghc, gpa, len);
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_hva_cache_init);
 
 int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 				  void *data, unsigned int offset,
@@ -2142,14 +2093,12 @@ int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_write_guest_offset_cached);
 
 int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			   void *data, unsigned long len)
 {
 	return kvm_write_guest_offset_cached(kvm, ghc, data, 0, len);
 }
-EXPORT_SYMBOL_GPL(kvm_write_guest_cached);
 
 int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			   void *data, unsigned long len)
@@ -2174,7 +2123,6 @@ int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_read_guest_cached);
 
 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 {
@@ -2182,7 +2130,6 @@ int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 
 	return kvm_write_guest_page(kvm, gfn, zero_page, offset, len);
 }
-EXPORT_SYMBOL_GPL(kvm_clear_guest_page);
 
 int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len)
 {
@@ -2201,7 +2148,6 @@ int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_clear_guest);
 
 static void mark_page_dirty_in_slot(struct kvm_memory_slot *memslot,
 				    gfn_t gfn)
@@ -2220,7 +2166,6 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 	memslot = gfn_to_memslot(kvm, gfn);
 	mark_page_dirty_in_slot(memslot, gfn);
 }
-EXPORT_SYMBOL_GPL(mark_page_dirty);
 
 void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
@@ -2229,7 +2174,6 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
 	memslot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 	mark_page_dirty_in_slot(memslot, gfn);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);
 
 void kvm_sigset_activate(struct kvm_vcpu *vcpu)
 {
@@ -2377,7 +2321,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
 	kvm_arch_vcpu_block_finish(vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_block);
 
 bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 {
@@ -2393,7 +2336,6 @@ bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 
 	return false;
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
 
 #ifndef CONFIG_S390
 /*
@@ -2413,7 +2355,6 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 			smp_send_reschedule(cpu);
 	put_cpu();
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
 #endif /* !CONFIG_S390 */
 
 int kvm_vcpu_yield_to(struct kvm_vcpu *target)
@@ -2434,7 +2375,6 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target)
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_yield_to);
 
 /*
  * Helper that checks whether a VCPU is eligible for directed yield.
@@ -2551,7 +2491,6 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 	/* Ensure vcpu is not eligible during next spinloop */
 	kvm_vcpu_set_dy_eligible(me, false);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);
 
 static vm_fault_t kvm_vcpu_fault(struct vm_fault *vmf)
 {
@@ -3735,7 +3674,6 @@ int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
 	r = __kvm_io_bus_write(vcpu, bus, &range, val);
 	return r < 0 ? r : 0;
 }
-EXPORT_SYMBOL_GPL(kvm_io_bus_write);
 
 /* kvm_io_bus_write_cookie - called under kvm->slots_lock */
 int kvm_io_bus_write_cookie(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
@@ -3912,7 +3850,6 @@ struct kvm_io_device *kvm_io_bus_get_dev(struct kvm *kvm, enum kvm_bus bus_idx,
 
 	return iodev;
 }
-EXPORT_SYMBOL_GPL(kvm_io_bus_get_dev);
 
 static int kvm_debugfs_open(struct inode *inode, struct file *file,
 			   int (*get)(void *, u64 *), int (*set)(void *, u64),
@@ -4341,7 +4278,6 @@ __init int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 out_fail:
 	return r;
 }
-EXPORT_SYMBOL_GPL(kvm_init);
 
 void kvm_exit(void)
 {
@@ -4359,4 +4295,3 @@ void kvm_exit(void)
 	free_cpumask_var(cpus_hardware_enabled);
 	kvm_vfio_ops_exit();
 }
-EXPORT_SYMBOL_GPL(kvm_exit);

* [PATCH 10/14] KVM: monolithic: x86: drop the kvm_pmu_ops structure
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (7 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 09/14] KVM: monolithic: remove exports from KVM common code Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 11/14] KVM: x86: optimize more exit handlers in vmx.c Andrea Arcangeli
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

Clean up now that the structure has finally been left completely unused.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  3 ---
 arch/x86/kvm/pmu.h              | 20 --------------------
 arch/x86/kvm/pmu_amd.c          | 15 ---------------
 arch/x86/kvm/svm.c              |  1 -
 arch/x86/kvm/vmx/pmu_intel.c    | 15 ---------------
 arch/x86/kvm/vmx/vmx.c          |  2 --
 6 files changed, 56 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 75affbf7861b..0481648852c0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1348,9 +1348,6 @@ struct kvm_x86_ops {
 					   gfn_t offset, unsigned long mask);
 	int (*write_log_dirty)(struct kvm_vcpu *vcpu);
 
-	/* pmu operations of sub-arch */
-	const struct kvm_pmu_ops *pmu_ops;
-
 	/*
 	 * Architecture specific hooks for vCPU blocking due to
 	 * HLT instruction.
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 82f07e3492df..c74d4ab30f66 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -36,23 +36,6 @@ extern void kvm_x86_pmu_refresh(struct kvm_vcpu *vcpu);
 extern void kvm_x86_pmu_init(struct kvm_vcpu *vcpu);
 extern void kvm_x86_pmu_reset(struct kvm_vcpu *vcpu);
 
-struct kvm_pmu_ops {
-	unsigned (*find_arch_event)(struct kvm_pmu *pmu, u8 event_select,
-				    u8 unit_mask);
-	unsigned (*find_fixed_event)(int idx);
-	bool (*pmc_is_enabled)(struct kvm_pmc *pmc);
-	struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
-	struct kvm_pmc *(*msr_idx_to_pmc)(struct kvm_vcpu *vcpu, unsigned idx,
-					  u64 *mask);
-	int (*is_valid_msr_idx)(struct kvm_vcpu *vcpu, unsigned idx);
-	bool (*is_valid_msr)(struct kvm_vcpu *vcpu, u32 msr);
-	int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
-	int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
-	void (*refresh)(struct kvm_vcpu *vcpu);
-	void (*init)(struct kvm_vcpu *vcpu);
-	void (*reset)(struct kvm_vcpu *vcpu);
-};
-
 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
@@ -138,7 +121,4 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp);
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx);
-
-extern struct kvm_pmu_ops intel_pmu_ops;
-extern struct kvm_pmu_ops amd_pmu_ops;
 #endif /* __KVM_X86_PMU_H */
diff --git a/arch/x86/kvm/pmu_amd.c b/arch/x86/kvm/pmu_amd.c
index 7ea588023949..1b09ae337516 100644
--- a/arch/x86/kvm/pmu_amd.c
+++ b/arch/x86/kvm/pmu_amd.c
@@ -300,18 +300,3 @@ void kvm_x86_pmu_reset(struct kvm_vcpu *vcpu)
 		pmc->counter = pmc->eventsel = 0;
 	}
 }
-
-struct kvm_pmu_ops amd_pmu_ops = {
-	.find_arch_event = kvm_x86_pmu_find_arch_event,
-	.find_fixed_event = kvm_x86_pmu_find_fixed_event,
-	.pmc_is_enabled = kvm_x86_pmu_pmc_is_enabled,
-	.pmc_idx_to_pmc = kvm_x86_pmu_pmc_idx_to_pmc,
-	.msr_idx_to_pmc = kvm_x86_pmu_msr_idx_to_pmc,
-	.is_valid_msr_idx = kvm_x86_pmu_is_valid_msr_idx,
-	.is_valid_msr = kvm_x86_pmu_is_valid_msr,
-	.get_msr = kvm_x86_pmu_get_msr,
-	.set_msr = kvm_x86_pmu_set_msr,
-	.refresh = kvm_x86_pmu_refresh,
-	.init = kvm_x86_pmu_init,
-	.reset = kvm_x86_pmu_reset,
-};
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 057ba1f8d7b3..50c57112c0ce 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7305,7 +7305,6 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 
 	.sched_in = kvm_x86_sched_in,
 
-	.pmu_ops = &amd_pmu_ops,
 	.deliver_posted_interrupt = kvm_x86_deliver_posted_interrupt,
 	.dy_apicv_has_pending_interrupt = kvm_x86_dy_apicv_has_pending_interrupt,
 	.update_pi_irte = kvm_x86_update_pi_irte,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 758d6dbdbed2..530ca9942ecd 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -356,18 +356,3 @@ void kvm_x86_pmu_reset(struct kvm_vcpu *vcpu)
 	pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status =
 		pmu->global_ovf_ctrl = 0;
 }
-
-struct kvm_pmu_ops intel_pmu_ops = {
-	.find_arch_event = kvm_x86_pmu_find_arch_event,
-	.find_fixed_event = kvm_x86_pmu_find_fixed_event,
-	.pmc_is_enabled = kvm_x86_pmu_pmc_is_enabled,
-	.pmc_idx_to_pmc = kvm_x86_pmu_pmc_idx_to_pmc,
-	.msr_idx_to_pmc = kvm_x86_pmu_msr_idx_to_pmc,
-	.is_valid_msr_idx = kvm_x86_pmu_is_valid_msr_idx,
-	.is_valid_msr = kvm_x86_pmu_is_valid_msr,
-	.get_msr = kvm_x86_pmu_get_msr,
-	.set_msr = kvm_x86_pmu_set_msr,
-	.refresh = kvm_x86_pmu_refresh,
-	.init = kvm_x86_pmu_init,
-	.reset = kvm_x86_pmu_reset,
-};
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index faccffc4709e..6e995e37a8c8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7805,8 +7805,6 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.pre_block = kvm_x86_pre_block,
 	.post_block = kvm_x86_post_block,
 
-	.pmu_ops = &intel_pmu_ops,
-
 	.update_pi_irte = kvm_x86_update_pi_irte,
 
 #ifdef CONFIG_X86_64

* [PATCH 11/14] KVM: x86: optimize more exit handlers in vmx.c
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (8 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 10/14] KVM: monolithic: x86: drop the kvm_pmu_ops structure Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers Andrea Arcangeli
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

Eliminate the wasteful call/ret in the non-RETPOLINE case and the
unnecessary fentry dynamic tracing hook points.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 30 +++++-------------------------
 1 file changed, 5 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6e995e37a8c8..de3ae2246205 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4589,7 +4589,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static int handle_external_interrupt(struct kvm_vcpu *vcpu)
+static __always_inline int handle_external_interrupt(struct kvm_vcpu *vcpu)
 {
 	++vcpu->stat.irq_exits;
 	return 1;
@@ -4860,21 +4860,6 @@ void kvm_x86_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
 	vmcs_writel(GUEST_DR7, val);
 }
 
-static int handle_cpuid(struct kvm_vcpu *vcpu)
-{
-	return kvm_emulate_cpuid(vcpu);
-}
-
-static int handle_rdmsr(struct kvm_vcpu *vcpu)
-{
-	return kvm_emulate_rdmsr(vcpu);
-}
-
-static int handle_wrmsr(struct kvm_vcpu *vcpu)
-{
-	return kvm_emulate_wrmsr(vcpu);
-}
-
 static int handle_tpr_below_threshold(struct kvm_vcpu *vcpu)
 {
 	kvm_apic_update_ppr(vcpu);
@@ -4891,11 +4876,6 @@ static int handle_interrupt_window(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
-static int handle_halt(struct kvm_vcpu *vcpu)
-{
-	return kvm_emulate_halt(vcpu);
-}
-
 static int handle_vmcall(struct kvm_vcpu *vcpu)
 {
 	return kvm_emulate_hypercall(vcpu);
@@ -5487,11 +5467,11 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[EXIT_REASON_IO_INSTRUCTION]          = handle_io,
 	[EXIT_REASON_CR_ACCESS]               = handle_cr,
 	[EXIT_REASON_DR_ACCESS]               = handle_dr,
-	[EXIT_REASON_CPUID]                   = handle_cpuid,
-	[EXIT_REASON_MSR_READ]                = handle_rdmsr,
-	[EXIT_REASON_MSR_WRITE]               = handle_wrmsr,
+	[EXIT_REASON_CPUID]                   = kvm_emulate_cpuid,
+	[EXIT_REASON_MSR_READ]                = kvm_emulate_rdmsr,
+	[EXIT_REASON_MSR_WRITE]               = kvm_emulate_wrmsr,
 	[EXIT_REASON_PENDING_INTERRUPT]       = handle_interrupt_window,
-	[EXIT_REASON_HLT]                     = handle_halt,
+	[EXIT_REASON_HLT]                     = kvm_emulate_halt,
 	[EXIT_REASON_INVD]		      = handle_invd,
 	[EXIT_REASON_INVLPG]		      = handle_invlpg,
 	[EXIT_REASON_RDPMC]                   = handle_rdpmc,

* [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (9 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 11/14] KVM: x86: optimize more exit handlers in vmx.c Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-10-15  8:28   ` Paolo Bonzini
  2019-09-28 17:23 ` [PATCH 13/14] KVM: retpolines: x86: eliminate retpoline from svm.c " Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 14/14] x86: retpolines: eliminate retpoline from msr event handlers Andrea Arcangeli
  12 siblings, 1 reply; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

It's enough to check the exit value and issue a direct call to avoid
the retpoline for all the common vmexit reasons.

Reducing this list to only EXIT_REASON_MSR_WRITE,
EXIT_REASON_PREEMPTION_TIMER, EXIT_REASON_EPT_MISCONFIG and
EXIT_REASON_IO_INSTRUCTION increases the computation time of the
hrtimer guest testcase on a Haswell i5-4670T CPU @ 2.30GHz by 7% with
the default spectre v2 mitigation enabled in the host and guest. On
Skylake, by contrast, there's no measurable difference with the short
list. To put things in perspective, on Haswell the same hrtimer
workload (note: it never calls cpuid and it never attempts to trigger
more vmexits on purpose) in the guest takes 16.3% longer to compute on
upstream KVM running in the host than with the KVM mono v1 patchset
applied to the host kernel, while on Skylake the same takes only 5.4%
more time (both with the default mitigations enabled in guest and
host).
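
For reference, a minimal sketch of this kind of hrtimer flood guest
workload (just an illustration of the workload class, not necessarily
the exact testcase used for the numbers above):

#include <stdint.h>
#include <stdio.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	int fd = timerfd_create(CLOCK_MONOTONIC, 0);
	struct itimerspec its = {
		.it_value.tv_nsec = 100000,	/* 100us one-shot */
	};
	uint64_t expirations;

	if (fd < 0) {
		perror("timerfd_create");
		return 1;
	}

	/*
	 * Each iteration re-programs the guest timer (typically a
	 * TSC-deadline MSR write) and then blocks until it fires, so
	 * the host sees a steady stream of MSR_WRITE, PENDING_INTERRUPT
	 * and PREEMPTION_TIMER vmexits.
	 */
	for (;;) {
		if (timerfd_settime(fd, 0, &its, NULL) < 0)
			return 1;
		if (read(fd, &expirations, sizeof(expirations)) < 0)
			return 1;
	}
}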

It's also unclear why EXIT_REASON_IO_INSTRUCTION should be included.

Of course CONFIG_RETPOLINE already forbids gcc from emitting indirect
jumps while compiling switch() statements, and switch() would still
allow the compiler to bisect the value; however, if anything that
seems to run slower, and the reason is that it's better to prioritize
and do the minimal possible number of checks for the most common
vmexits.

The halt and pause loop exiting may be slow paths from the point of
view of the guest, but not necessarily so from the point of view of
the host. There can be a flood of halt exit reasons (in fact that's
why the cpuidle guest haltpoll support was recently merged, and we
can't rely on it here because there are older kernels and other OSes
that must also perform optimally). All it takes is a pipe ping-pong
with a different host CPU, with the host CPUs running at full
capacity; a sketch of such a workload is below.
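
The following is only an illustration of that kind of pipe ping-pong
workload (the vCPU pinning is an assumption, and this is not a
testcase from this series):

#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/*
 * Two processes bounce one byte over a pair of pipes. Pinned to
 * different vCPUs (backed by different host CPUs), each side blocks
 * briefly in read() on every bounce, so the guest keeps executing HLT
 * and the host sees a flood of halt exits even though the host CPUs
 * stay fully busy.
 */
static void pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	sched_setaffinity(0, sizeof(set), &set);
}

int main(void)
{
	int a[2], b[2];
	char byte = 0;

	if (pipe(a) || pipe(b))
		return 1;

	if (fork() == 0) {
		pin_to_cpu(1);		/* child on vCPU 1 */
		for (;;) {
			if (read(a[0], &byte, 1) != 1 ||
			    write(b[1], &byte, 1) != 1)
				return 1;
		}
	}

	pin_to_cpu(0);			/* parent on vCPU 0 */
	for (;;) {
		if (write(a[1], &byte, 1) != 1 ||
		    read(b[0], &byte, 1) != 1)
			return 1;
	}
}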

The same consideration applies to the pause loop exiting exit reason:
if there's heavy host overcommit that collides heavily on a spinlock,
the same may happen.

In the common case of a fully idle host, the halt and pause loop
exiting can't help, but adding them doesn't hurt the common case, and
the expectation here is that if they ever become measurable, it would
be because they are increasing (and not decreasing) performance.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index de3ae2246205..2bd57a7d2be1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5846,9 +5846,29 @@ int kvm_x86_handle_exit(struct kvm_vcpu *vcpu)
 	}
 
 	if (exit_reason < kvm_vmx_max_exit_handlers
-	    && kvm_vmx_exit_handlers[exit_reason])
+	    && kvm_vmx_exit_handlers[exit_reason]) {
+#ifdef CONFIG_RETPOLINE
+		if (exit_reason == EXIT_REASON_MSR_WRITE)
+			return kvm_emulate_wrmsr(vcpu);
+		else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
+			return handle_preemption_timer(vcpu);
+		else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT)
+			return handle_interrupt_window(vcpu);
+		else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+			return handle_external_interrupt(vcpu);
+		else if (exit_reason == EXIT_REASON_HLT)
+			return kvm_emulate_halt(vcpu);
+		else if (exit_reason == EXIT_REASON_PAUSE_INSTRUCTION)
+			return handle_pause(vcpu);
+		else if (exit_reason == EXIT_REASON_MSR_READ)
+			return kvm_emulate_rdmsr(vcpu);
+		else if (exit_reason == EXIT_REASON_CPUID)
+			return kvm_emulate_cpuid(vcpu);
+		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
+			return handle_ept_misconfig(vcpu);
+#endif
 		return kvm_vmx_exit_handlers[exit_reason](vcpu);
-	else {
+	} else {
 		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
 				exit_reason);
 		dump_vmcs();

* [PATCH 13/14] KVM: retpolines: x86: eliminate retpoline from svm.c exit handlers
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (10 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  2019-09-28 17:23 ` [PATCH 14/14] x86: retpolines: eliminate retpoline from msr event handlers Andrea Arcangeli
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

It's enough to check the exit value and issue a direct call to avoid
the retpoline for all the common vmexit reasons.

After this commit is applied, here are the most common retpolines
executed under a high resolution timer workload in the guest on an
SVM host:

[..]
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    ktime_get_update_offsets_now+70
    hrtimer_interrupt+131
    smp_apic_timer_interrupt+106
    apic_timer_interrupt+15
    start_sw_timer+359
    restart_apic_timer+85
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 1940
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_r12+33
    force_qs_rnp+217
    rcu_gp_kthread+1270
    kthread+268
    ret_from_fork+34
]: 4644
@[]: 25095
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    lapic_next_event+28
    clockevents_program_event+148
    hrtimer_start_range_ns+528
    start_sw_timer+356
    restart_apic_timer+85
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 41474
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    clockevents_program_event+148
    hrtimer_start_range_ns+528
    start_sw_timer+356
    restart_apic_timer+85
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 41474
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    ktime_get+58
    clockevents_program_event+84
    hrtimer_start_range_ns+528
    start_sw_timer+356
    restart_apic_timer+85
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 41887
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    lapic_next_event+28
    clockevents_program_event+148
    hrtimer_try_to_cancel+168
    hrtimer_cancel+21
    kvm_set_lapic_tscdeadline_msr+43
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 42723
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    clockevents_program_event+148
    hrtimer_try_to_cancel+168
    hrtimer_cancel+21
    kvm_set_lapic_tscdeadline_msr+43
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 42766
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    ktime_get+58
    clockevents_program_event+84
    hrtimer_try_to_cancel+168
    hrtimer_cancel+21
    kvm_set_lapic_tscdeadline_msr+43
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 42848
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    ktime_get+58
    start_sw_timer+279
    restart_apic_timer+85
    kvm_set_msr_common+1497
    msr_interception+142
    vcpu_enter_guest+684
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 499845

@total: 1780243

SVM has no TSC-based programmable preemption timer, so KVM ends up
invoking ktime_get() frequently.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/kvm/svm.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 50c57112c0ce..4d8370fcd212 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4989,6 +4989,20 @@ int kvm_x86_handle_exit(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
+#ifdef CONFIG_RETPOLINE
+	if (exit_code == SVM_EXIT_MSR)
+		return msr_interception(svm);
+	else if (exit_code == SVM_EXIT_VINTR)
+		return interrupt_window_interception(svm);
+	else if (exit_code == SVM_EXIT_INTR)
+		return intr_interception(svm);
+	else if (exit_code == SVM_EXIT_HLT)
+		return halt_interception(svm);
+	else if (exit_code == SVM_EXIT_NPF)
+		return npf_interception(svm);
+	else if (exit_code == SVM_EXIT_CPUID)
+		return cpuid_interception(svm);
+#endif
 	return svm_exit_handlers[exit_code](svm);
 }
 

* [PATCH 14/14] x86: retpolines: eliminate retpoline from msr event handlers
  2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
                   ` (11 preceding siblings ...)
  2019-09-28 17:23 ` [PATCH 13/14] KVM: retpolines: x86: eliminate retpoline from svm.c " Andrea Arcangeli
@ 2019-09-28 17:23 ` Andrea Arcangeli
  12 siblings, 0 replies; 29+ messages in thread
From: Andrea Arcangeli @ 2019-09-28 17:23 UTC (permalink / raw)
  To: kvm, linux-kernel; +Cc: Paolo Bonzini, Vitaly Kuznetsov, Sean Christopherson

It's enough to check the function pointer value and issue the direct call.

After this commit is applied, here are the most common retpolines
executed under a high resolution timer workload in the guest on a VMX
host:

[..]
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 267
@[]: 2256
@[
    trace_retpoline+1
    __trace_retpoline+30
    __x86_indirect_thunk_rax+33
    __kvm_wait_lapic_expire+284
    vmx_vcpu_run.part.97+1091
    vcpu_enter_guest+377
    kvm_arch_vcpu_ioctl_run+261
    kvm_vcpu_ioctl+559
    do_vfs_ioctl+164
    ksys_ioctl+96
    __x64_sys_ioctl+22
    do_syscall_64+89
    entry_SYSCALL_64_after_hwframe+68
]: 2390
@[]: 33410

@total: 315707

Note the highest hit above is in __delay, so it's probably not worth
optimizing even though it is more frequent than 2k hits per sec.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/events/intel/core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 27ee47a7be66..65b383d5e062 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3323,8 +3323,19 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	return 0;
 }
 
+#ifdef CONFIG_RETPOLINE
+static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr);
+static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr);
+#endif
+
 struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
 {
+#ifdef CONFIG_RETPOLINE
+	if (x86_pmu.guest_get_msrs == intel_guest_get_msrs)
+		return intel_guest_get_msrs(nr);
+	else if (x86_pmu.guest_get_msrs == core_guest_get_msrs)
+		return core_guest_get_msrs(nr);
+#endif
 	if (x86_pmu.guest_get_msrs)
 		return x86_pmu.guest_get_msrs(nr);
 	*nr = 0;

* Re: [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko
  2019-09-28 17:23 ` [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko Andrea Arcangeli
@ 2019-10-15  1:31   ` Sean Christopherson
  2019-10-15  3:18     ` Sean Christopherson
  0 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2019-10-15  1:31 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm, linux-kernel, Paolo Bonzini, Vitaly Kuznetsov

On Sat, Sep 28, 2019 at 01:23:10PM -0400, Andrea Arcangeli wrote:
> This is the first commit of a patch series that aims to replace the
> modular kvm.ko kernel module with a monolithic kvm-intel/kvm-amd
> model. This change has the only possible con of wasting some disk
> space in /lib/modules/. The pros are that it saves CPU and some minor
> RAM, which are more scarce resources than disk space.
> 
> The pointer to function virtual template model cannot provide any
> runtime benefit because kvm-intel and kvm-amd can't be loaded at the
> same time.
> 
> This removes kvm.ko and it links and duplicates all kvm.ko objects to
> both kvm-amd and kvm-intel.

The KVM config option should be changed to a bool and its help text
updated.  Maybe something similar to the help for VIRTUALIZATION to make
it clear that enabling KVM on its own does nothing.

> 
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  arch/x86/kvm/Makefile | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> index 31ecf7a76d5a..68b81f381369 100644
> --- a/arch/x86/kvm/Makefile
> +++ b/arch/x86/kvm/Makefile
> @@ -12,9 +12,8 @@ kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
>  			   i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
>  			   hyperv.o page_track.o debugfs.o
>  
> -kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o vmx/evmcs.o vmx/nested.o
> -kvm-amd-y		+= svm.o pmu_amd.o
> +kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o vmx/evmcs.o vmx/nested.o $(kvm-y)
> +kvm-amd-y		+= svm.o pmu_amd.o $(kvm-y)
>  
> -obj-$(CONFIG_KVM)	+= kvm.o
>  obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
>  obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o

* Re: [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel
  2019-09-28 17:23 ` [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel Andrea Arcangeli
@ 2019-10-15  3:16   ` Sean Christopherson
  2019-10-15  8:21     ` Paolo Bonzini
  0 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2019-10-15  3:16 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm, linux-kernel, Paolo Bonzini, Vitaly Kuznetsov

On Sat, Sep 28, 2019 at 01:23:11PM -0400, Andrea Arcangeli wrote:
> Linking both vmx and svm into the kernel at the same time isn't
> possible anymore or the kvm_x86/kvm_x86_pmu external function names
> would collide.
> 
> Reported-by: kbuild test robot <lkp@intel.com>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  arch/x86/kvm/Kconfig | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 840e12583b85..e1601c54355e 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -59,9 +59,29 @@ config KVM
>  
>  	  If unsure, say N.
>  
> +if KVM=y

Hmm, I see why the previous patch left KVM as a tristate.  I tried a
variety of hacks to let KVM be a bool but nothing worked.

> +
> +choice
> +	prompt "To link KVM statically into the kernel you need to choose"
> +	help
> +	  In order to build a kernel with support for both AMD and Intel
> +	  CPUs, you need to set CONFIG_KVM=m.
> +
> +config KVM_AMD_STATIC
> +	select KVM_AMD
> +	bool "Link KVM AMD statically into the kernel"
> +
> +config KVM_INTEL_STATIC
> +	select KVM_INTEL
> +	bool "Link KVM Intel statically into the kernel"

The prompt and choice text is way too long, e.g. in my usual window it
cuts off at:

  To link KVM statically into the kernel you need to choose (Link KVM Intel statically into

Without the full text (the -> at the end), it's not obvious it's an option
menu (AMD was selected by default for me and it took me a second to figure
out what to hit enter on).

I think short and sweet is enough for the prompt, with the details of how
to build both buried in the help text.

choice
	prompt "KVM built-in support"
	help
	  Here be a long and detailed help text.

config KVM_AMD_STATIC
	select KVM_AMD
	bool "KVM AMD"

config KVM_INTEL_STATIC
	select KVM_INTEL
	bool "KVM Intel"

endchoice


That ends up looking like:

   <*>   Kernel-based Virtual Machine (KVM) support
           KVM built-in support (KVM Intel)  --->
   -*-   KVM for Intel processors support

> +
> +endchoice
> +
> +endif
> +
>  config KVM_INTEL
>  	tristate "KVM for Intel processors support"
> -	depends on KVM
> +	depends on (KVM && !KVM_AMD_STATIC) || KVM_INTEL_STATIC
>  	# for perf_guest_get_msrs():
>  	depends on CPU_SUP_INTEL
>  	---help---
> @@ -73,7 +93,7 @@ config KVM_INTEL
>  
>  config KVM_AMD
>  	tristate "KVM for AMD processors support"
> -	depends on KVM
> +	depends on (KVM && !KVM_INTEL_STATIC) || KVM_AMD_STATIC
>  	---help---
>  	  Provides support for KVM on AMD processors equipped with the AMD-V
>  	  (SVM) extensions.

* Re: [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko
  2019-10-15  1:31   ` Sean Christopherson
@ 2019-10-15  3:18     ` Sean Christopherson
  2019-10-15  8:32       ` Paolo Bonzini
  0 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2019-10-15  3:18 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm, linux-kernel, Paolo Bonzini, Vitaly Kuznetsov

On Mon, Oct 14, 2019 at 06:31:44PM -0700, Sean Christopherson wrote:
> On Sat, Sep 28, 2019 at 01:23:10PM -0400, Andrea Arcangeli wrote:
> > This is the first commit of a patch series that aims to replace the
> > modular kvm.ko kernel module with a monolithic kvm-intel/kvm-amd
> > model. This change has the only possible con of wasting some disk
> > space in /lib/modules/. The pros are that it saves CPU and some minor
> > RAM, which are more scarce resources than disk space.
> > 
> > The pointer to function virtual template model cannot provide any
> > runtime benefit because kvm-intel and kvm-amd can't be loaded at the
> > same time.
> > 
> > This removes kvm.ko and it links and duplicates all kvm.ko objects to
> > both kvm-amd and kvm-intel.
> 
> The KVM config option should be changed to a bool and its help text
> updated.  Maybe something similar to the help for VIRTUALIZATION to make
> it clear that enabling KVM on its own does nothing.

Making KVM a bool doesn't work well; keeping it a tristate and keying off
KVM=y to force Intel or AMD (as done in the next patch) looks like the
cleanest implementation.

The help text should still be updated though.

> > 
> > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> > ---
> >  arch/x86/kvm/Makefile | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> > index 31ecf7a76d5a..68b81f381369 100644
> > --- a/arch/x86/kvm/Makefile
> > +++ b/arch/x86/kvm/Makefile
> > @@ -12,9 +12,8 @@ kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
> >  			   i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
> >  			   hyperv.o page_track.o debugfs.o
> >  
> > -kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o vmx/evmcs.o vmx/nested.o
> > -kvm-amd-y		+= svm.o pmu_amd.o
> > +kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o vmx/evmcs.o vmx/nested.o $(kvm-y)
> > +kvm-amd-y		+= svm.o pmu_amd.o $(kvm-y)
> >  
> > -obj-$(CONFIG_KVM)	+= kvm.o
> >  obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
> >  obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o

* Re: [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel
  2019-10-15  3:16   ` Sean Christopherson
@ 2019-10-15  8:21     ` Paolo Bonzini
  2019-10-15 15:23       ` Sean Christopherson
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2019-10-15  8:21 UTC (permalink / raw)
  To: Sean Christopherson, Andrea Arcangeli; +Cc: kvm, linux-kernel, Vitaly Kuznetsov

On 15/10/19 05:16, Sean Christopherson wrote:
> I think short and sweet is enough for the prompt, with the details of how
> build both buried in the help text.
> 
> choice
> 	prompt "KVM built-in support"
> 	help
> 	  Here be a long and detailed help text.
> 
> config KVM_AMD_STATIC
> 	select KVM_AMD
> 	bool "KVM AMD"
> 
> config KVM_INTEL_STATIC
> 	select KVM_INTEL
> 	bool "KVM Intel"

Or even just

	bool "AMD"
	...
	bool "Intel"

> endchoice
> 
> The ends up looking like:
> 
>    <*>   Kernel-based Virtual Machine (KVM) support
>            KVM built-in support (KVM Intel)  --->
>    -*-   KVM for Intel processors support

On top of this, it's also nice to hide the KVM_INTEL/KVM_AMD prompts if
linking statically.  You can achieve that with

config KVM_INTEL
    tristate
    prompt "KVM for Intel processors support" if KVM=m
    depends on (KVM=m && m) || KVM_INTEL_STATIC

config KVM_AMD
    tristate
    prompt "KVM for AMD processors support" if KVM=m
    depends on (KVM=m && m) || KVM_AMD_STATIC

The left side of the "||" ensures that, if KVM=m, you can only choose
module build for both KVM_INTEL and KVM_AMD.  Having just "depends on
KVM" would allow a pre-existing .config to choose the now-invalid
combination

	CONFIG_KVM=y
	CONFIG_KVM_INTEL=y
	CONFIG_KVM_AMD=y

The right side of the "||" is just for documentation, to avoid ending
up with a selected symbol that does not satisfy its dependencies.

Thanks,

Paolo

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-09-28 17:23 ` [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers Andrea Arcangeli
@ 2019-10-15  8:28   ` Paolo Bonzini
  2019-10-15 16:49     ` Andrea Arcangeli
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2019-10-15  8:28 UTC (permalink / raw)
  To: Andrea Arcangeli, kvm, linux-kernel; +Cc: Vitaly Kuznetsov, Sean Christopherson

On 28/09/19 19:23, Andrea Arcangeli wrote:
> Reducing this list to only EXIT_REASON_MSR_WRITE,
> EXIT_REASON_PREEMPTION_TIMER, EXIT_REASON_EPT_MISCONFIG,
> EXIT_REASON_IO_INSTRUCTION increases the computation time of the
> hrtimer guest testcase on Haswell i5-4670T CPU @ 2.30GHz by 7% with
> the default spectre v2 mitigation enabled in the host and guest. On
> skylake as opposed there's no measurable difference with the short
> list. To put things in prospective on Haswell the same hrtimer
> workload (note: it never calls cpuid and it never attempts to trigger
> more vmexit on purpose) in guest takes 16.3% longer to compute on
> upstream KVM running in the host than with the KVM mono v1 patchset
> applied to the host kernel, while on skylake the same takes only 5.4%
> more time (both with the default mitigations enabled in guest and
> host).
> 
> It's also unclear why EXIT_REASON_IO_INSTRUCTION should be included.

If you're including EXIT_REASON_EPT_MISCONFIG (MMIO access) then you
should include EXIT_REASON_IO_INSTRUCTION too.  Depending on the devices
that are in the guest, the doorbell register might be MMIO or PIO.

> +		if (exit_reason == EXIT_REASON_MSR_WRITE)
> +			return kvm_emulate_wrmsr(vcpu);
> +		else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
> +			return handle_preemption_timer(vcpu);
> +		else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT)
> +			return handle_interrupt_window(vcpu);
> +		else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> +			return handle_external_interrupt(vcpu);
> +		else if (exit_reason == EXIT_REASON_HLT)
> +			return kvm_emulate_halt(vcpu);
> +		else if (exit_reason == EXIT_REASON_PAUSE_INSTRUCTION)
> +			return handle_pause(vcpu);
> +		else if (exit_reason == EXIT_REASON_MSR_READ)
> +			return kvm_emulate_rdmsr(vcpu);
> +		else if (exit_reason == EXIT_REASON_CPUID)
> +			return kvm_emulate_cpuid(vcpu);
> +		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
> +			return handle_ept_misconfig(vcpu);

So, the difference between my suggested list (which I admit is just
based on conjecture, not benchmarking) is that you add
EXIT_REASON_PAUSE_INSTRUCTION, EXIT_REASON_PENDING_INTERRUPT,
EXIT_REASON_EXTERNAL_INTERRUPT, EXIT_REASON_HLT, EXIT_REASON_MSR_READ,
EXIT_REASON_CPUID.

Which of these make a difference for the hrtimer testcase?  It's of
course totally fine to use benchmarks to prove that my intuition was
bad---but you must also use them to show why your intuition is right. :)

Paolo

* Re: [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko
  2019-10-15  3:18     ` Sean Christopherson
@ 2019-10-15  8:32       ` Paolo Bonzini
  0 siblings, 0 replies; 29+ messages in thread
From: Paolo Bonzini @ 2019-10-15  8:32 UTC (permalink / raw)
  To: Sean Christopherson, Andrea Arcangeli; +Cc: kvm, linux-kernel, Vitaly Kuznetsov

On 15/10/19 05:18, Sean Christopherson wrote:
>> The KVM config option should be changed to a bool and its help text
>> updated.  Maybe something similar to the help for VIRTUALIZATION to make
>> it clear that enabling KVM on its own does nothing.
> Making KVM a bool doesn't work well, keeping it a tristate and keying off
> KVM=y to force Intel or AMD (as done in the next patch) looks like the
> cleanest implementation.

Indeed, keeping the KVM option as a tristate helps show the right
suboptions, similar to what Andrea did in patch 2.  However, this patch
already breaks the CONFIG_KVM_INTEL=y && CONFIG_KVM_AMD=y case I think,
so it should be squashed with "KVM: monolithic: x86: disable linking vmx
and svm at the same time into the kernel".

> The help text should still be updated though.

The patch doesn't change the fact that enabling KVM on its own does
nothing, so the help text can be updated independently (patch welcome :)).

Thanks,

Paolo


* Re: [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel
  2019-10-15  8:21     ` Paolo Bonzini
@ 2019-10-15 15:23       ` Sean Christopherson
  0 siblings, 0 replies; 29+ messages in thread
From: Sean Christopherson @ 2019-10-15 15:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Andrea Arcangeli, kvm, linux-kernel, Vitaly Kuznetsov

On Tue, Oct 15, 2019 at 10:21:59AM +0200, Paolo Bonzini wrote:
> On 15/10/19 05:16, Sean Christopherson wrote:
> > I think short and sweet is enough for the prompt, with the details of how
> > build both buried in the help text.
> > 
> > choice
> > 	prompt "KVM built-in support"
> > 	help
> > 	  Here be a long and detailed help text.
> > 
> > config KVM_AMD_STATIC
> > 	select KVM_AMD
> > 	bool "KVM AMD"
> > 
> > config KVM_INTEL_STATIC
> > 	select KVM_INTEL
> > 	bool "KVM Intel"
> 
> Or even just
> 
> 	bool "AMD"
> 	...
> 	bool "Intel"

Ya.

> > endchoice
> > 
> > The ends up looking like:
> > 
> >    <*>   Kernel-based Virtual Machine (KVM) support
> >            KVM built-in support (KVM Intel)  --->
> >    -*-   KVM for Intel processors support
> 
> On top of this, it's also nice to hide the KVM_INTEL/KVM_AMD prompts if
> linking statically.  You can achieve that with
> 
> config KVM_INTEL
>     tristate
>     prompt "KVM for Intel processors support" if KVM=m

That's painfully obvious now that I see it.  I always forget about putting
conditionals at the end...

>     depends on (KVM=m && m) || KVM_INTEL_STATIC
> 
> config KVM_AMD
>     tristate
>     prompt "KVM for AMD processors support" if KVM=m
>     depends on (KVM=m && m) || KVM_AMD_STATIC
> 
> The left side of the "||" ensures that, if KVM=m, you can only choose
> module build for both KVM_INTEL and KVM_AMD.  Having just "depends on
> KVM" would allow a pre-existing .config to choose the now-invalid
> combination
> 
> 	CONFIG_KVM=y
> 	CONFIG_KVM_INTEL=y
> 	CONFIG_KVM_AMD=y
> 
> The right side of the "||" part is just for documentation, to avoid that
> a selected symbol does not satisfy its dependencies.
> 
> Thanks,
> 
> Paolo

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-15  8:28   ` Paolo Bonzini
@ 2019-10-15 16:49     ` Andrea Arcangeli
  2019-10-15 19:46       ` Paolo Bonzini
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Arcangeli @ 2019-10-15 16:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On Tue, Oct 15, 2019 at 10:28:39AM +0200, Paolo Bonzini wrote:
> If you're including EXIT_REASON_EPT_MISCONFIG (MMIO access) then you
> should include EXIT_REASON_IO_INSTRUCTION too.  Depending on the devices
> that are in the guest, the doorbell register might be MMIO or PIO.

The fact that outb/inb devices exist isn't the question here. The
question you should clarify is: which of the PIO devices is as
performance critical as MMIO with virtio/vhost? I mean, even on real
hardware those devices aren't performance critical. I haven't run into
PIO drivers with properly configured guests.

> So, the difference between my suggested list (which I admit is just
> based on conjecture, not benchmarking) is that you add
> EXIT_REASON_PAUSE_INSTRUCTION, EXIT_REASON_PENDING_INTERRUPT,
> EXIT_REASON_EXTERNAL_INTERRUPT, EXIT_REASON_HLT, EXIT_REASON_MSR_READ,
> EXIT_REASON_CPUID.
> 
> Which of these make a difference for the hrtimer testcase?  It's of
> course totally fine to use benchmarks to prove that my intuition was
> bad---but you must also use them to show why your intuition is right. :)

The hrtimer flood hits on this:

           MSR_WRITE     338793    56.54%     5.51%      0.33us     34.44us      0.44us ( +-   0.20% )
   PENDING_INTERRUPT     168431    28.11%     2.52%      0.36us     32.06us      0.40us ( +-   0.28% )
    PREEMPTION_TIMER      91723    15.31%     1.32%      0.34us     30.51us      0.39us ( +-   0.41% )
  EXTERNAL_INTERRUPT        234     0.04%     0.00%      0.25us      5.53us      0.43us ( +-   5.67% )
                 HLT         65     0.01%    90.64%      0.49us 319933.79us  37562.71us ( +-  21.68% )
            MSR_READ          6     0.00%     0.00%      0.67us      1.96us      1.06us ( +-  17.97% )
       EPT_MISCONFIG          6     0.00%     0.01%      3.09us    105.50us     26.76us ( +-  62.10% )

PENDING_INTERRUPT is the big missing thing in your list. It probably
accounts for the bulk of the slowdown with your list.  However, I could
imagine other loads with higher external interrupt/hlt/rdmsr rates than
the hrtimer one, so I didn't drop those. Other loads hit a flood of HLT,
and from the host's standpoint that's not a slow path. Not all OSes have
the cpuidle haltpoll governor to mitigate the HLT frequency.

I'm pretty sure HLT/EXTERNAL_INTERRUPT/PENDING_INTERRUPT should be
included.

The least useful are PAUSE, CPUID and MSR_READ; we could consider
dropping some of those (in the short term cpuid helps benchmarking, to
more accurately measure the performance improvement of not hitting the
retpoline there). I simply could imagine some load hitting those
frequently too, so I didn't drop them.

I also wonder if VMCALL should be added: certain loads hit fairly
frequent VMCALL, but none of the ones I benchmarked did.
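
Just so we're looking at the same thing, the shape of the change is
roughly the following (a sketch only, not the exact hunk; the handler
names are the usual vmx.c ones and the exact list of exit reasons is
precisely what we're discussing here):

/*
 * Check the hottest exit reasons with direct calls, before falling
 * back to the retpolined indirect call through kvm_vmx_exit_handlers[].
 */
if (exit_reason == EXIT_REASON_MSR_WRITE)
	return handle_wrmsr(vcpu);
else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
	return handle_preemption_timer(vcpu);
else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT)
	return handle_interrupt_window(vcpu);
else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
	return handle_external_interrupt(vcpu);
else if (exit_reason == EXIT_REASON_HLT)
	return handle_halt(vcpu);
else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
	return handle_ept_misconfig(vcpu);
/* ... everything else keeps taking the indirect path ... */
return kvm_vmx_exit_handlers[exit_reason](vcpu);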

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-15 16:49     ` Andrea Arcangeli
@ 2019-10-15 19:46       ` Paolo Bonzini
  2019-10-15 20:35         ` Andrea Arcangeli
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2019-10-15 19:46 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On 15/10/19 18:49, Andrea Arcangeli wrote:
> On Tue, Oct 15, 2019 at 10:28:39AM +0200, Paolo Bonzini wrote:
>> If you're including EXIT_REASON_EPT_MISCONFIG (MMIO access) then you
>> should include EXIT_REASON_IO_INSTRUCTION too.  Depending on the devices
>> that are in the guest, the doorbell register might be MMIO or PIO.
> 
> The fact that outb/inb devices exist isn't the question here. The
> question you should clarify is: which of the PIO devices is as
> performance critical as MMIO with virtio/vhost?

virtio 0.9 uses PIO.

> I mean even on real hardware those devices aren't performance critical.

On virtual machines they're actually faster than MMIO because they don't
need to go through page table walks.

>> So, the difference between my suggested list (which I admit is just
>> based on conjecture, not benchmarking) is that you add
>> EXIT_REASON_PAUSE_INSTRUCTION, EXIT_REASON_PENDING_INTERRUPT,
>> EXIT_REASON_EXTERNAL_INTERRUPT, EXIT_REASON_HLT, EXIT_REASON_MSR_READ,
>> EXIT_REASON_CPUID.
>>
>> Which of these make a difference for the hrtimer testcase?  It's of
>> course totally fine to use benchmarks to prove that my intuition was
>> bad---but you must also use them to show why your intuition is right. :)
> 
> The hrtimer flood hits on this:
> 
>            MSR_WRITE     338793    56.54%     5.51%      0.33us     34.44us      0.44us ( +-   0.20% )
>    PENDING_INTERRUPT     168431    28.11%     2.52%      0.36us     32.06us      0.40us ( +-   0.28% )
>     PREEMPTION_TIMER      91723    15.31%     1.32%      0.34us     30.51us      0.39us ( +-   0.41% )
>   EXTERNAL_INTERRUPT        234     0.04%     0.00%      0.25us      5.53us      0.43us ( +-   5.67% )
>                  HLT         65     0.01%    90.64%      0.49us 319933.79us  37562.71us ( +-  21.68% )
>             MSR_READ          6     0.00%     0.00%      0.67us      1.96us      1.06us ( +-  17.97% )
>        EPT_MISCONFIG          6     0.00%     0.01%      3.09us    105.50us     26.76us ( +-  62.10% )
> 
> PENDING_INTERRUPT is the big missing thing in your list. It probably
> accounts for the bulk of slowdown from your list.

Makes sense.

> However, I could imagine other loads with higher external
> interrupt/hlt/rdmsr rates than the hrtimer one, so I didn't drop those.

External interrupts should only tick at 1 Hz on nohz_full kernels,
and even at 1000 Hz (if physical CPUs are not isolated) it should not
really matter.  We can include it, since the handler is so short that
the retpoline cost is, percentage-wise, much higher than for other exits.

HLT is certainly a slow path: the guest only invokes it if things such
as NAPI interrupt mitigation have failed.  As long as the guest stays
in the halted state for a microsecond or so, the cost of the retpoline
will all but disappear.

RDMSR again shouldn't be there: guests sometimes read the PMTimer
(which is an I/O port) or the TSC, but for example they do not really
ever read the APIC TMCCT.

> I'm pretty sure HLT/EXTERNAL_INTERRUPT/PENDING_INTERRUPT should be
> included.
> I also wonder if VMCALL should be added, certain loads hit on fairly
> frequent VMCALL, but none of the one I benchmarked.

I agree for external interrupt and pending interrupt, and VMCALL is fine
too.  In addition I'd add I/O instructions which are useful for some
guests and also for benchmarking (e.g. vmexit.flat has both IN and OUT
tests).

Paolo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-15 19:46       ` Paolo Bonzini
@ 2019-10-15 20:35         ` Andrea Arcangeli
  2019-10-15 22:22           ` Paolo Bonzini
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Arcangeli @ 2019-10-15 20:35 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On Tue, Oct 15, 2019 at 09:46:58PM +0200, Paolo Bonzini wrote:
> On 15/10/19 18:49, Andrea Arcangeli wrote:
> > On Tue, Oct 15, 2019 at 10:28:39AM +0200, Paolo Bonzini wrote:
> >> If you're including EXIT_REASON_EPT_MISCONFIG (MMIO access) then you
> >> should include EXIT_REASON_IO_INSTRUCTION too.  Depending on the devices
> >> that are in the guest, the doorbell register might be MMIO or PIO.
> > 
> > The fact that outb/inb devices exist isn't the question here. The
> > question you should clarify is: which of the PIO devices is as
> > performance critical as MMIO with virtio/vhost?
> 
> virtio 0.9 uses PIO.

0.9 is a 12-year-old protocol that was replaced several years ago.
Anybody who needs high performance won't be running it, and the others
can't perform well to begin with, so I'm not sure exactly how it's
relevant in this microoptimization context. We're not optimizing for
emulated devices or other old stuff either.

> On virtual machines they're actually faster than MMIO because they don't
> need to go through page table walks.

And how does it help that they're faster if current virtio stopped
using them and nothing else recent uses PIO?

> HLT is certainly a slow path, the guest only invokes if things such as

Your idea that HLT is certainly a slow path is only correct if
you assume the host is IDLE, but the host is never idle if you use
virt for consolidation.

From the point of view of the host, HLT is like every other vmexit.

> NAPI interrupt mitigation have failed.  As long as the guest stays in
> halted state for a microsecond or so, the cost of retpoline will all but
> disappear.

The only thing that matters is the number of HLT vmexits per second,
and you just need to measure the number of HLT vmexits to tell whether
the optimization is needed.

I have several workloads, including eBPF tracing, not related to
interrupts (which in turn cannot be mitigated by NAPI), that schedule
frequently and hit 100k+ HLT vmexits per second while the host is
anything but idle. There's no need for a hardware interrupt to wake up
tasks and schedule in the guest: scheduler IPIs and timers are more
than enough.

The only thing that can mitigate that is the cpuidle haltpoll driver,
but it only hit upstream a few months ago, so most recent enterprise
guest OSes won't have it yet.

All that matters is how many vmexits per second there are; everything
else, including "why" they happen and what those vmexits mean for the
guest, is irrelevant. It would be relevant only if the host were
guaranteed to be idle, but there's no such guarantee.

If the host is using all otherwise-idle CPUs to compute in the
background (e.g. building the kernel) with SCHED_IDLE, the HLT
retpoline cost will not be any different from any other vmexit
retpoline cost, and an easy 100k+ HLT exits per second certainly puts
it in measurable territory.

Here's a random example:

             VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time 

                 HLT     101128    75.33%    99.66%      0.43us 901000.66us    310.88us ( +-   8.46% )
              VMCALL      14089    10.50%     0.10%      1.32us     84.99us      2.14us ( +-   0.90% )
           MSR_WRITE       8246     6.14%     0.03%      0.33us     32.79us      1.05us ( +-   1.51% )
       EPT_VIOLATION       6312     4.70%     0.18%      0.50us  26426.07us      8.90us ( +-  48.58% )
    PREEMPTION_TIMER       1730     1.29%     0.01%      0.55us     26.81us      1.60us ( +-   3.48% )
  EXTERNAL_INTERRUPT       1329     0.99%     0.03%      0.27us    944.88us      6.04us ( +-  20.52% )
       EPT_MISCONFIG        982     0.73%     0.01%      0.42us    137.68us      2.05us ( +-   9.88% )
   PENDING_INTERRUPT        308     0.23%     0.00%      0.44us      4.32us      0.73us ( +-   2.57% )
   PAUSE_INSTRUCTION         58     0.04%     0.00%      0.32us     18.55us      1.48us ( +-  23.12% )
            MSR_READ         35     0.03%     0.00%      0.78us      5.55us      2.07us ( +-  10.74% )
               CPUID         24     0.02%     0.00%      0.27us      2.20us      0.59us ( +-  13.43% )

# careful: despite the verifier's promise that eBPF shouldn't be kernel
# crashing, this may be kernel crashing, because there's no verifier at
# all checking that the eBPF function calls available for a given
# hooking point can actually be invoked from the kernel hooking points
# they're invoked from. This is why I tested it in a VM.
bpftrace -e 'kprobe:*interrupt* { @ = count() }'

Another example, with a pipe loop that just bounces a byte across a
pipe between two processes:

             VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time 

           MSR_WRITE     498945    80.49%     4.10%      0.33us     42.73us      0.44us ( +-   0.12% )
                 HLT     118474    19.11%    95.88%      0.33us 707693.05us     43.56us ( +-  24.23% )
    PREEMPTION_TIMER       1004     0.16%     0.01%      0.38us     25.47us      0.67us ( +-   5.69% )
   PENDING_INTERRUPT        894     0.14%     0.01%      0.37us     20.98us      0.49us ( +-   4.94% )
  EXTERNAL_INTERRUPT        518     0.08%     0.00%      0.26us     20.59us      0.51us ( +-   8.09% )
            MSR_READ          8     0.00%     0.00%      0.66us      1.37us      0.92us ( +-   9.19% )
       EPT_MISCONFIG          6     0.00%     0.00%      3.18us     32.71us     12.60us ( +-  43.58% )
   PAUSE_INSTRUCTION          3     0.00%     0.00%      0.59us      1.69us      1.07us ( +-  30.38% )

We wouldn't need to apply the cpuidle-haltpoll driver at all if HLT
weren't such a frequent vmexit; it deserves not to have its retpoline
cost multiplied by 100000 times per second.

Over time, if everything turns out to use the cpuidle-haltpoll driver
by default (which however can increase the host CPU usage on laptops),
we can consider removing the HLT optimization, but we're not remotely
there yet.

> RDMSR again shouldn't be there, guests sometimes read the PMTimer (which
> is an I/O port) or TSC but for example do not really ever read the APIC
> TMCCT.

We can try to drop RDMSR and see if it's measurable. I already tried
to re-add some of those retpolines, but it was slower and this was the
fastest combination that I got. I don't recall if I tried with RDMSR
and PAUSE alone, but I can try again.

> > I'm pretty sure HLT/EXTERNAL_INTERRUPT/PENDING_INTERRUPT should be
> > included.
> > I also wonder if VMCALL should be added, certain loads hit on fairly
> > frequent VMCALL, but none of the one I benchmarked.
> 
> I agree for external interrupt and pending interrupt, and VMCALL is fine
> too.  In addition I'd add I/O instructions which are useful for some
> guests and also for benchmarking (e.g. vmexit.flat has both IN and OUT
> tests).

Isn't it faster to use cpuid for benchmarking? I mean, we don't want
to pay for more than one branch just for benchmarking (even cpuid is
questionable in the long term, but for now it's handy to have), and
unlike inb/outb, cpuid runs occasionally in all real life workloads
(including in guest userland), so between inb/outb and cpuid I'd
prefer cpuid as the benchmark vector, because at least it has a chance
to help real workloads a bit too.

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-15 20:35         ` Andrea Arcangeli
@ 2019-10-15 22:22           ` Paolo Bonzini
  2019-10-15 23:42             ` Andrea Arcangeli
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2019-10-15 22:22 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On 15/10/19 22:35, Andrea Arcangeli wrote:
> On Tue, Oct 15, 2019 at 09:46:58PM +0200, Paolo Bonzini wrote:
>> On 15/10/19 18:49, Andrea Arcangeli wrote:
>>> On Tue, Oct 15, 2019 at 10:28:39AM +0200, Paolo Bonzini wrote:
>>>> If you're including EXIT_REASON_EPT_MISCONFIG (MMIO access) then you
>>>> should include EXIT_REASON_IO_INSTRUCTION too.  Depending on the devices
>>>> that are in the guest, the doorbell register might be MMIO or PIO.
>>>
>>> The fact that outb/inb devices exist isn't the question here. The
>>> question you should clarify is: which of the PIO devices is as
>>> performance critical as MMIO with virtio/vhost?
>>
>> virtio 0.9 uses PIO.
> 
> 0.9 is a 12-year-old protocol that was replaced several years ago.

Oh come on.  0.9 is not 12 years old.  virtio 1.0 is 3.5 years old
(March 2016).  Anything older than 2017 is going to use 0.9.

> Your idea that HLT is certainly a slow path is only correct if
> you assume the host is IDLE, but the host is never idle if you use
> virt for consolidation.
>
> I've several workloads including eBPF tracing, not related to
> interrupts (that in turn cannot be mitigated by NAPI) that schedule
> frequently and hit 100k+ of HLT vmexits per second and the host is all
> but idle. There's no need of hardware interrupt to wake up tasks and
> schedule in the guest, scheduler IPIs and timers are more than enough.
>
> All that matters is how many vmexits per second there are; everything
> else, including "why" they happen and what those vmexits mean for the
> guest, is irrelevant. It would be relevant only if the host were
> guaranteed to be idle, but there's no such guarantee.

Your tables give:

	Samples	  Samples%  Time%     Min Time  Max time       Avg time
HLT     101128    75.33%    99.66%    0.43us    901000.66us    310.88us
HLT     118474    19.11%    95.88%    0.33us    707693.05us    43.56us

If "avg time" means the average time to serve an HLT vmexit, I don't
understand how you can have an average time of 0.3ms (1/3000th of a
second) and 100000 samples per second.  Can you explain that to me?

Anyway, if the average time is indeed 310us and 43us, it is orders of
magnitude more than the time spent executing a retpoline.  That time
will be spent in an indirect branch miss (retpoline) instead of doing
while(!kvm_vcpu_check_block()), but it doesn't change anything.

>>> I'm pretty sure HLT/EXTERNAL_INTERRUPT/PENDING_INTERRUPT should be
>>> included.
>>> I also wonder if VMCALL should be added, certain loads hit on fairly
>>> frequent VMCALL, but none of the one I benchmarked.
>>
>> I agree for external interrupt and pending interrupt, and VMCALL is fine
>> too.  In addition I'd add I/O instructions which are useful for some
>> guests and also for benchmarking (e.g. vmexit.flat has both IN and OUT
>> tests).
> 
> Isn't it faster to use cpuid for benchmarking? I mean we don't want to
> pay for more than one branch for benchmarking (even cpuid is
> questionable in the long term, but for now it's handy to have),

outl is more or less the same as cpuid and vmcall.  You can measure it
with vmexit.flat.  inl is slower.

> and unlike inb/outb, cpuid runs occasionally in all real life workloads
> (including in guest userland) so between inb/outb, I'd rather prefer
> to use cpuid as the benchmark vector because at least it has a chance
> to help real workloads a bit too.

Again: what is the real workload that does thousands of CPUIDs per second?

Paolo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-15 22:22           ` Paolo Bonzini
@ 2019-10-15 23:42             ` Andrea Arcangeli
  2019-10-16  7:07               ` Paolo Bonzini
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Arcangeli @ 2019-10-15 23:42 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On Wed, Oct 16, 2019 at 12:22:31AM +0200, Paolo Bonzini wrote:
> Oh come on.  0.9 is not 12-years old.  virtio 1.0 is 3.5 years old
> (March 2016).  Anything older than 2017 is going to use 0.9.

Sorry if I got the date wrong, but I still don't see the point in
optimizing for legacy virtio. I can't justify forcing everyone to
execute that additional branch for inb/outb in an attempt to speed up
legacy virtio, which nobody should be using in combination with
bleeding edge KVM in the host.

> Your tables give:
> 
> 	Samples	  Samples%  Time%     Min Time  Max time       Avg time
> HLT     101128    75.33%    99.66%    0.43us    901000.66us    310.88us
> HLT     118474    19.11%    95.88%    0.33us    707693.05us    43.56us
> 
> If "avg time" means the average time to serve an HLT vmexit, I don't
> understand how you can have an average time of 0.3ms (1/3000th of a
> second) and 100000 samples per second.  Can you explain that to me?

I described it wrong: the bpftrace record was a sleep 5, not a sleep
1. The pipe loop was definitely a sleep 1.

I just wanted to show how, even with things where you wouldn't even
expect to get HLT, like the bpftrace case which is pure guest CPU
load, you still get 100k of them (over 5 sec).

The issue is that in production you get a much bigger flood of those
with hundreds of CPUs, so the exact number doesn't move the needle.

> Anyway, if the average time is indeed 310us and 43us, it is orders of
> magnitude more than the time spent executing a retpoline.  That time
> will be spent in an indirect branch miss (retpoline) instead of doing
> while(!kvm_vcpu_check_block()), but it doesn't change anything.

Doesn't cpuidle haltpoll disable that loop? Ideally there should be
fewer HLT vmexits then, but I don't know how many fewer. This just
needs to be frequent enough that the branch cost pays for itself, but
the sure thing is that HLT vmexits will not go away unless you execute
mwait in guest mode by isolating the CPU in the host.

> Again: what is the real workload that does thousands of CPUIDs per second?

None, but there are always background CPUID vmexits while there are
never inb/outb vmexits.

So the cpuid retpoline removal has a slight chance to pay for the
cost of the branch, while the inb/outb retpoline removal cannot pay
off the cost of the branch.

This is why I prefer cpuid as the benchmark gadget for the short
term, unless inb/outb offers other benchmark-related benefits.

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-15 23:42             ` Andrea Arcangeli
@ 2019-10-16  7:07               ` Paolo Bonzini
  2019-10-16 16:50                 ` Andrea Arcangeli
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2019-10-16  7:07 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On 16/10/19 01:42, Andrea Arcangeli wrote:
> On Wed, Oct 16, 2019 at 12:22:31AM +0200, Paolo Bonzini wrote:
>> Oh come on.  0.9 is not 12-years old.  virtio 1.0 is 3.5 years old
>> (March 2016).  Anything older than 2017 is going to use 0.9.
> 
> Sorry if I got the date wrong, but still I don't see the point in
> optimizing for legacy virtio. I can't justify forcing everyone to
> execute that additional branch for inb/outb, in the attempt to make
> legacy virtio faster that nobody should use in combination with
> bleeding edge KVM in the host.

Yet you would add CPUID to the list even though it is not even there in
your benchmarks, and is *never* invoked in a hot path by *any* sane
program? Some OSes have never gotten virtio 1.0 drivers.  OpenBSD only
got it earlier this year.

>> Your tables give:
>>
>> 	Samples	  Samples%  Time%     Min Time  Max time       Avg time
>> HLT     101128    75.33%    99.66%    0.43us    901000.66us    310.88us
>> HLT     118474    19.11%    95.88%    0.33us    707693.05us    43.56us
>>
>> If "avg time" means the average time to serve an HLT vmexit, I don't
>> understand how you can have an average time of 0.3ms (1/3000th of a
>> second) and 100000 samples per second.  Can you explain that to me?
> 
> I described it wrong, the bpftrace record was a sleep 5, not a sleep
> 1. The pipe loop was sure a sleep 1.

It still doesn't add up.  0.3ms / 5 is 1/15000th of a second; 43us is
1/25000th of a second.  Do you have multiple vCPUs perhaps?

> The issue is that in production you get a flood more of those with
> hundred of CPUs, so the exact number doesn't move the needle.
> This just needs to be frequent enough that the branch cost pay itself off,
> but the sure thing is that HLT vmexit will not go away unless you execute
> mwait in guest mode by isolating the CPU in the host.

The number of vmexits doesn't count (for HLT).  What counts is how long
they take to be serviced, and as long as it's 1us or more the
optimization is pointless.

Consider these pictures

         w/o optimization                   with optimization
         ----------------------             -------------------------
0us      vmexit                             vmexit
500ns    retpoline                          call vmexit handler directly
600ns    retpoline                          kvm_vcpu_check_block()
700ns    retpoline                          kvm_vcpu_check_block()
800ns    kvm_vcpu_check_block()             kvm_vcpu_check_block()
900ns    kvm_vcpu_check_block()             kvm_vcpu_check_block()
...
39900ns  kvm_vcpu_check_block()             kvm_vcpu_check_block()

                            <interrupt arrives>

40000ns  kvm_vcpu_check_block()             kvm_vcpu_check_block()


Unless the interrupt arrives exactly in the few nanoseconds that it
takes to execute the retpoline, a direct handling of HLT vmexits makes
*absolutely no difference*.

>> Again: what is the real workload that does thousands of CPUIDs per second?
> 
> None, but there are always background CPUID vmexits while there are
> never inb/outb vmexits.
> 
> So the cpuid retpoline removal has a slight chance to pay for the cost
> of the branch, the inb/outb retpoline removal cannot pay off the cost
> of the branch.

Please stop considering only the exact configuration of your benchmarks.
 There are known, valid configurations where outb is a very hot vmexit.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-16  7:07               ` Paolo Bonzini
@ 2019-10-16 16:50                 ` Andrea Arcangeli
  2019-10-16 17:01                   ` Paolo Bonzini
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Arcangeli @ 2019-10-16 16:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On Wed, Oct 16, 2019 at 09:07:39AM +0200, Paolo Bonzini wrote:
> Yet you would add CPUID to the list even though it is not even there in
> your benchmarks, and is *never* invoked in a hot path by *any* sane

I justified CPUID as a "short term" benchmark gadget; it's one of
those that shouldn't be a problem at all to remove, and I couldn't
possibly be against removing it. I only pointed out the fact that
cpuid on any modern Linux guest is going to run more frequently than
any inb/outb, so if I had to pick a benchmark gadget, that remains my
favorite one.

> program? Some OSes have never gotten virtio 1.0 drivers.  OpenBSD only
> got it earlier this year.

If the target is an optimization for a cranky OS that can't upgrade
virtio to obtain the full performance benefit from the retpoline
removal too (I don't know the specifics just from reading the above),
then it's a better argument. At least it sounds fair enough not to
unfairly penalize the cranky OS forced to run obsolete protocols that
nobody can update or has the time to update.

I mean, until you said there's some OS that cannot upgrade to virtio
1.0, I thought it was perfectly fine to say "if you want to run a
guest with the full benefit of virtio 1.0 on KVM, you should upgrade
to virtio 1.0 and not stick to whatever 3 year old protocol, then also
the inb/outb retpoline will go away if you upgrade the host because
the inb/outb will go away in the first place".

> It still doesn't add up.  0.3ms / 5 is 1/15000th of a second; 43us is
> 1/25000th of a second.  Do you have multiple vCPU perhaps?

Why would I run any test on UP guests? Rather than spending time doing
the math on my results, it's probably quicker if you run it yourself:

https://lkml.kernel.org/r/20190109034941.28759-1-aarcange@redhat.com/

Marcelo should have better reproducers for frequent HLT, which is a
real workload we have to pass. I reported the first two random things
I had around that showed fairly frequent HLT. The pipe loop load is
similar to local network I/O.

> The number of vmexits doesn't count (for HLT).  What counts is how long
> they take to be serviced, and as long as it's 1us or more the
> optimization is pointless.
> 
> Consider these pictures
> 
>          w/o optimization                   with optimization
>          ----------------------             -------------------------
> 0us      vmexit                             vmexit
> 500ns    retpoline                          call vmexit handler directly
> 600ns    retpoline                          kvm_vcpu_check_block()
> 700ns    retpoline                          kvm_vcpu_check_block()
> 800ns    kvm_vcpu_check_block()             kvm_vcpu_check_block()
> 900ns    kvm_vcpu_check_block()             kvm_vcpu_check_block()
> ...
> 39900ns  kvm_vcpu_check_block()             kvm_vcpu_check_block()
> 
>                             <interrupt arrives>
> 
> 40000ns  kvm_vcpu_check_block()             kvm_vcpu_check_block()
> 
> 
> Unless the interrupt arrives exactly in the few nanoseconds that it
> takes to execute the retpoline, a direct handling of HLT vmexits makes
> *absolutely no difference*.
> 

You keep focusing on what happens if the host is completely idle (in
which case guest HLT is a slow path) and you keep ignoring the case
where the host isn't completely idle (in which case guest HLT is not a
slow path).

Please note the single_task_running() check, which immediately breaks
out of the kvm_vcpu_check_block() loop if there's even a single other
task that can be scheduled in the runqueue of the host CPU.
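
For reference, the polling loop in kvm_vcpu_block() looks roughly like
this (paraphrased from memory, so treat it as a sketch rather than the
exact virt/kvm/kvm_main.c code):

ktime_t cur, stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);

do {
	/* a wakeup condition ends the poll without ever scheduling */
	if (kvm_vcpu_check_block(vcpu) < 0)
		goto out;
	cur = ktime_get();
} while (single_task_running() && ktime_before(cur, stop));

/*
 * If any other task is runnable on this host CPU,
 * single_task_running() is false and we fall through to
 * schedule() almost immediately.
 */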

What happens when the host is not idle is shown below:

         w/o optimization                   with optimization
         ----------------------             -------------------------
0us      vmexit                             vmexit
500ns    retpoline                          call vmexit handler directly
600ns    retpoline                          kvm_vcpu_check_block()
700ns    retpoline                          schedule()
800ns    kvm_vcpu_check_block()
900ns    schedule()
...

Disclaimer: the numbers on the left are arbitrary and I just cut and
pasted them from yours, no idea how far off they are.

To be clear, I would find it very reasonable to be asked to prove the
benefit of the HLT optimization with benchmarks specific to that
single one liner, but until then, the idea that we can drop the
retpoline optimization from the HLT vmexit by just thinking about it
still doesn't make sense to me, because by thinking about it I come to
the opposite conclusion.

The lack of a single_task_running() check in the guest driver is also
why the guest cpuidle haltpoll risks wasting some CPU with host
overcommit or with the host loaded at full capacity, and why we may
not assume it to be universally enabled.

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers
  2019-10-16 16:50                 ` Andrea Arcangeli
@ 2019-10-16 17:01                   ` Paolo Bonzini
  0 siblings, 0 replies; 29+ messages in thread
From: Paolo Bonzini @ 2019-10-16 17:01 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: kvm, linux-kernel, Vitaly Kuznetsov, Sean Christopherson

On 16/10/19 18:50, Andrea Arcangeli wrote:
>> It still doesn't add up.  0.3ms / 5 is 1/15000th of a second; 43us is
>> 1/25000th of a second.  Do you have multiple vCPU perhaps?
> 
> Why would I run any test on UP guests? Rather than spending time doing
> the math on my results, it's probably quicker if you run it yourself:

I don't know, but if you don't say how many vCPUs you have, I cannot do
the math and review the patch.

>> The number of vmexits doesn't count (for HLT).  What counts is how long
>> they take to be serviced, and as long as it's 1us or more the
>> optimization is pointless.
>
> Please note the single_task_running() check which immediately breaks
> the kvm_vcpu_check_block() loop if there's even a single other task
> that can be scheduled in the runqueue of the host CPU.
> 
> What happen when the host is not idle is quoted below:
> 
>          w/o optimization                   with optimization
>          ----------------------             -------------------------
> 0us      vmexit                             vmexit
> 500ns    retpoline                          call vmexit handler directly
> 600ns    retpoline                          kvm_vcpu_check_block()
> 700ns    retpoline                          schedule()
> 800ns    kvm_vcpu_check_block()
> 900ns    schedule()
> ...
> 
> Disclaimer: the numbers on the left are arbitrary and I just cut and
> pasted them from yours, no idea how far off they are.

Yes, of course.  But the idea is the same: yes, because of the retpoline
you run the guest for perhaps 300ns more before schedule()ing, but does
that really matter?  300ns * 20000 times/second is a 0.6% performance
impact, and 300ns is already very generous.  I am not sure it would be
measurable at all.

Paolo

> To be clear, I would find it very reasonable to be asked to prove the
> benefit of the HLT optimization with benchmarks specific to that
> single one liner, but until then, the idea that we can drop the
> retpoline optimization from the HLT vmexit by just thinking about it
> still doesn't make sense to me, because by thinking about it I come to
> the opposite conclusion.
> 
> The lack of a single_task_running() check in the guest driver is also
> why the guest cpuidle haltpoll risks wasting some CPU with host
> overcommit or with the host loaded at full capacity, and why we may
> not assume it to be universally enabled.
> 
> Thanks,
> Andrea
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2019-10-16 17:01 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-28 17:23 [PATCH 00/14] KVM monolithic v2 Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 01/14] KVM: monolithic: x86: remove kvm.ko Andrea Arcangeli
2019-10-15  1:31   ` Sean Christopherson
2019-10-15  3:18     ` Sean Christopherson
2019-10-15  8:32       ` Paolo Bonzini
2019-09-28 17:23 ` [PATCH 02/14] KVM: monolithic: x86: disable linking vmx and svm at the same time into the kernel Andrea Arcangeli
2019-10-15  3:16   ` Sean Christopherson
2019-10-15  8:21     ` Paolo Bonzini
2019-10-15 15:23       ` Sean Christopherson
2019-09-28 17:23 ` [PATCH 04/14] KVM: monolithic: x86: handle the request_immediate_exit variation Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 05/14] KVM: monolithic: add more section prefixes in the KVM common code Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 06/14] KVM: monolithic: x86: remove __exit section prefix from machine_unsetup Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 07/14] KVM: monolithic: x86: remove __init section prefix from kvm_x86_cpu_has_kvm_support Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 08/14] KVM: monolithic: x86: remove exports Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 09/14] KVM: monolithic: remove exports from KVM common code Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 10/14] KVM: monolithic: x86: drop the kvm_pmu_ops structure Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 11/14] KVM: x86: optimize more exit handlers in vmx.c Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 12/14] KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers Andrea Arcangeli
2019-10-15  8:28   ` Paolo Bonzini
2019-10-15 16:49     ` Andrea Arcangeli
2019-10-15 19:46       ` Paolo Bonzini
2019-10-15 20:35         ` Andrea Arcangeli
2019-10-15 22:22           ` Paolo Bonzini
2019-10-15 23:42             ` Andrea Arcangeli
2019-10-16  7:07               ` Paolo Bonzini
2019-10-16 16:50                 ` Andrea Arcangeli
2019-10-16 17:01                   ` Paolo Bonzini
2019-09-28 17:23 ` [PATCH 13/14] KVM: retpolines: x86: eliminate retpoline from svm.c " Andrea Arcangeli
2019-09-28 17:23 ` [PATCH 14/14] x86: retpolines: eliminate retpoline from msr event handlers Andrea Arcangeli
