From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Nadav Har'El"
Subject: [PATCH 26/31] nVMX: Handling of CR0 and CR4 modifying instructions
Date: Mon, 16 May 2011 22:57:14 +0300
Message-ID: <201105161957.p4GJvEIx002033@rice.haifa.ibm.com>
References: <1305575004-nyh@il.ibm.com>
Cc: gleb@redhat.com, avi@redhat.com
To: kvm@vger.kernel.org
Sender: kvm-owner@vger.kernel.org

When L2 tries to modify CR0 or CR4 (with mov or clts), and modifies a bit
which L1 asked to shadow (via CR[04]_GUEST_HOST_MASK), we already do the right
thing: we let L1 handle the trap (see nested_vmx_exit_handled_cr() in a
previous patch).
When L2 modifies bits that L1 doesn't care about, we let it think (via
CR[04]_READ_SHADOW) that it did these modifications, while only changing
(in GUEST_CR[04]) the bits that L0 doesn't shadow.

This is needed for correct handling of CR0.TS for lazy FPU loading: L0 may
want to leave TS on, while pretending to allow the guest to change it.

Signed-off-by: Nadav Har'El
---
 arch/x86/kvm/vmx.c |   58 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 55 insertions(+), 3 deletions(-)

--- .before/arch/x86/kvm/vmx.c	2011-05-16 22:36:50.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-05-16 22:36:50.000000000 +0300
@@ -4153,6 +4153,58 @@ vmx_patch_hypercall(struct kvm_vcpu *vcp
 	hypercall[2] = 0xc1;
 }
 
+/* called to set cr0 as appropriate for a mov-to-cr0 exit. */
+static int handle_set_cr0(struct kvm_vcpu *vcpu, unsigned long val)
+{
+	if (to_vmx(vcpu)->nested.vmxon &&
+	    ((val & VMXON_CR0_ALWAYSON) != VMXON_CR0_ALWAYSON))
+		return 1;
+
+	if (is_guest_mode(vcpu)) {
+		/*
+		 * We get here when L2 changed cr0 in a way that did not change
+		 * any of L1's shadowed bits (see nested_vmx_exit_handled_cr),
+		 * but did change L0 shadowed bits. This can currently happen
+		 * with the TS bit: L0 may want to leave TS on (for lazy fpu
+		 * loading) while pretending to allow the guest to change it.
+		 */
+		if (kvm_set_cr0(vcpu, (val & vcpu->arch.cr0_guest_owned_bits) |
+			 (vcpu->arch.cr0 & ~vcpu->arch.cr0_guest_owned_bits)))
+			return 1;
+		vmcs_writel(CR0_READ_SHADOW, val);
+		return 0;
+	} else
+		return kvm_set_cr0(vcpu, val);
+}
+
+static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
+{
+	if (is_guest_mode(vcpu)) {
+		if (kvm_set_cr4(vcpu, (val & vcpu->arch.cr4_guest_owned_bits) |
+			 (vcpu->arch.cr4 & ~vcpu->arch.cr4_guest_owned_bits)))
+			return 1;
+		vmcs_writel(CR4_READ_SHADOW, val);
+		return 0;
+	} else
+		return kvm_set_cr4(vcpu, val);
+}
+
+/* called to set cr0 as appropriate for clts instruction exit. */
+static void handle_clts(struct kvm_vcpu *vcpu)
+{
+	if (is_guest_mode(vcpu)) {
+		/*
+		 * We get here when L2 did CLTS, and L1 didn't shadow CR0.TS
+		 * but we did (!fpu_active). We need to keep GUEST_CR0.TS on,
+		 * just pretend it's off (also in arch.cr0 for fpu_activate).
+		 */
+		vmcs_writel(CR0_READ_SHADOW,
+			vmcs_readl(CR0_READ_SHADOW) & ~X86_CR0_TS);
+		vcpu->arch.cr0 &= ~X86_CR0_TS;
+	} else
+		vmx_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
+}
+
 static int handle_cr(struct kvm_vcpu *vcpu)
 {
 	unsigned long exit_qualification, val;
@@ -4169,7 +4221,7 @@ static int handle_cr(struct kvm_vcpu *vc
 		trace_kvm_cr_write(cr, val);
 		switch (cr) {
 		case 0:
-			err = kvm_set_cr0(vcpu, val);
+			err = handle_set_cr0(vcpu, val);
 			kvm_complete_insn_gp(vcpu, err);
 			return 1;
 		case 3:
@@ -4177,7 +4229,7 @@ static int handle_cr(struct kvm_vcpu *vc
 			kvm_complete_insn_gp(vcpu, err);
 			return 1;
 		case 4:
-			err = kvm_set_cr4(vcpu, val);
+			err = handle_set_cr4(vcpu, val);
 			kvm_complete_insn_gp(vcpu, err);
 			return 1;
 		case 8: {
@@ -4195,7 +4247,7 @@ static int handle_cr(struct kvm_vcpu *vc
 		};
 		break;
 	case 2: /* clts */
-		vmx_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
+		handle_clts(vcpu);
 		trace_kvm_cr_write(0, kvm_read_cr0(vcpu));
 		skip_emulated_instruction(vcpu);
 		vmx_fpu_activate(vcpu);
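
For readers who want to see the bit arithmetic in isolation, here is a minimal user-space sketch of the merge that handle_set_cr0() and handle_clts() perform in guest mode. It is not kernel code: struct toy_vcpu, the toy_* functions and the TOY_CR0_* constants are hypothetical names invented for this illustration, a single guest_cr0 field stands in for both vcpu->arch.cr0 and GUEST_CR0, and treating MP as a guest-owned bit is an assumption made only to have something to change.

```c
/* Architectural CR0 bit positions; the TOY_ prefix marks them as local to this sketch. */
#define TOY_CR0_PE (1UL << 0)
#define TOY_CR0_MP (1UL << 1)
#define TOY_CR0_TS (1UL << 3)

/* Hypothetical stand-in for the few pieces of vcpu/VMCS state involved. */
struct toy_vcpu {
	unsigned long guest_cr0;       /* value the guest really runs with */
	unsigned long cr0_read_shadow; /* value the guest sees when it reads CR0 */
	unsigned long cr0_guest_owned; /* bits the guest may change directly */
};

/* Take guest-owned bits from val, keep every other bit from cur. */
unsigned long toy_merge(unsigned long cur, unsigned long val,
			unsigned long owned)
{
	return (val & owned) | (cur & ~owned);
}

/* Models the guest-mode branch of handle_set_cr0(), without error checks. */
void toy_set_cr0(struct toy_vcpu *v, unsigned long val)
{
	v->guest_cr0 = toy_merge(v->guest_cr0, val, v->cr0_guest_owned);
	v->cr0_read_shadow = val; /* the guest believes the full write happened */
}

/*
 * Models the guest-mode branch of handle_clts(): TS is cleared only in
 * the read shadow, while the real register keeps TS set. The arch.cr0
 * bookkeeping done by the patch is omitted here.
 */
void toy_clts(struct toy_vcpu *v)
{
	v->cr0_read_shadow &= ~TOY_CR0_TS;
}
```

Starting from guest_cr0 = PE|TS with only MP guest-owned, a write of PE|MP leaves guest_cr0 = PE|MP|TS, while the read shadow reports PE|MP. TS stays on for lazy FPU loading, yet the guest sees it as cleared.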