From: Paolo Bonzini <pbonzini@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>,
	kvm list <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	X86 ML <x86@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Borislav Petkov <bpetkov@suse.de>
Subject: Re: RFC: Getting rid of LTR in VMX
Date: Mon, 20 Feb 2017 12:05:13 +0100	[thread overview]
Message-ID: <38d613dc-5b46-1f4f-04d1-53c01932e6d6@redhat.com> (raw)
In-Reply-To: <CALCETrVnHc1_M0_zEL=EBvMNGBj-ZqzvhrGBw9jKJ6HmL-qKDg@mail.gmail.com>



On 18/02/2017 04:29, Andy Lutomirski wrote:
> There's no code here because the patch is trivial, but I want to run
> the idea by you all first to see if there are any issues.
> 
> VMX is silly and forces the TSS limit to the minimum on VM exits.  KVM
> wastes lots of cycles bumping it back up to accommodate the io bitmap.

Actually looked at the code now...

reload_tss() is only invoked on exits to userspace, so removing it is a
nice-to-have that wouldn't show up on most workloads.  Still, it does
save 150-200 clock cycles (I tested by simply commenting out
reload_tss() in __vmx_load_host_state).

Another 100-150 cycles could be saved by using rdgsbase/wrgsbase,
instead of rdmsr/wrmsr, to read and write the kernel GS base.  Super
hacky patch after my signature.

vmx_save_host_state is really slow...  It would be nice if Intel
defined an XSAVES state component for it (similar to AMD's
vmload/vmsave).

Paolo

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9856b73a21ad..e76bfec463bc 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2138,9 +2138,12 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 #endif
 
 #ifdef CONFIG_X86_64
-	rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
+	local_irq_disable();
+	asm volatile("swapgs; rdgsbase %0" : "=r" (vmx->msr_host_kernel_gs_base));
 	if (is_long_mode(&vmx->vcpu))
-		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
+		asm volatile("wrgsbase %0" : : "r" (vmx->msr_guest_kernel_gs_base));
+	asm volatile("swapgs");
+	local_irq_enable();
 #endif
 	if (boot_cpu_has(X86_FEATURE_MPX))
 		rdmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
@@ -2152,14 +2155,20 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 
 static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 {
+	unsigned long flags;
+
 	if (!vmx->host_state.loaded)
 		return;
 
 	++vmx->vcpu.stat.host_state_reload;
 	vmx->host_state.loaded = 0;
 #ifdef CONFIG_X86_64
+	local_irq_save(flags);
+	asm("swapgs");
 	if (is_long_mode(&vmx->vcpu))
-		rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
+		asm volatile("rdgsbase %0" : "=r" (vmx->msr_guest_kernel_gs_base));
+	asm volatile("wrgsbase %0; swapgs" : : "r" (vmx->msr_host_kernel_gs_base));
+	local_irq_restore(flags);
 #endif
 	if (vmx->host_state.gs_ldt_reload_needed) {
 		kvm_load_ldt(vmx->host_state.ldt_sel);
@@ -2177,10 +2186,7 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 		loadsegment(es, vmx->host_state.es_sel);
 	}
 #endif
-	reload_tss();
-#ifdef CONFIG_X86_64
-	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
-#endif
+	//reload_tss();
 	if (vmx->host_state.msr_host_bndcfgs)
 		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
 	load_gdt(this_cpu_ptr(&host_gdt));
@@ -3469,6 +3475,7 @@ static int hardware_enable(void)
 		wrmsrl(MSR_IA32_FEATURE_CONTROL, old | test_bits);
 	}
 	cr4_set_bits(X86_CR4_VMXE);
+	cr4_set_bits(X86_CR4_FSGSBASE);
 
 	if (vmm_exclusive) {
 		kvm_cpu_vmxon(phys_addr);

> I propose that we rework this.  Add a percpu variable that indicates
> whether the TSS limit needs to be refreshed.  On task switch, if the
> new task has TIF_IO_BITMAP set, then check that flag and, if set,
> refresh TR and clear the flag.  On VMX exit, set the flag.
> 
> The TSS limit is (phew!) invisible to userspace, so we don't have ABI
> issues to worry about here.  We also shouldn't have security issues
> because a too-low TSS limit just results in unprivileged IO operations
> generating #GP, which is exactly what we want.
> 
> What do you all think?  I expect a speedup of a couple hundred cycles
> on each VM exit.

Thread overview: 8+ messages
2017-02-18  3:29 RFC: Getting rid of LTR in VMX Andy Lutomirski
2017-02-18 15:19 ` Paolo Bonzini
2017-02-20 11:05 ` Paolo Bonzini [this message]
2017-02-20 16:46   ` Andy Lutomirski
2017-02-20 16:51     ` Paolo Bonzini
2017-02-20 16:55       ` Andy Lutomirski
2017-02-20 22:02         ` Paolo Bonzini
2017-02-20 22:23           ` hpa
