* [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding
@ 2017-11-06 0:54 Wanpeng Li
2017-11-06 0:54 ` [PATCH v6 2/3] KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry Wanpeng Li
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Wanpeng Li @ 2017-11-06 0:54 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Radim Krčmář,
Wanpeng Li, Nadav Amit, Pedro Fonseca
From: Wanpeng Li <wanpeng.li@hotmail.com>
Pedro reported:
During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
instruction under KVM produces different results on both memory and the SP
register depending on whether EPT support is enabled. With EPT the SP is
reduced by 4 bytes (and the written value is 0-padded) but without EPT support
it is only reduced by 2 bytes. The difference can be observed when the CS.DB
field is 1 (32-bit) but not when it's 0 (16-bit).
The internal segment descriptor cache exists even in real/vm8086 mode. CS.D
should therefore also be respected during instruction decoding, rather than
relying only on the default operand/address size and the 66H/67H prefixes.
This patch fixes it by also adjusting the operand/address size according to CS.D.
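For illustration only, the size selection described above can be sketched
as standalone C. The enum, the function names and the SP helper below are
hypothetical stand-ins, not the emulator's own definitions; the sketch
assumes only what the description states: real/vm8086 mode defaults to 2
bytes unless CS.D is set, and a 66H prefix switches between 2 and 4 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the emulator's mode constants. */
enum emul_mode { MODE_REAL, MODE_VM86, MODE_PROT16, MODE_PROT32 };

/* Default operand/address size in bytes for the 16/32-bit modes. */
static int default_size(enum emul_mode mode, bool cs_d)
{
	switch (mode) {
	case MODE_REAL:
	case MODE_VM86:
		/* The cached CS descriptor can have D=1 even here. */
		return cs_d ? 4 : 2;
	case MODE_PROT16:
		/* Assumes the caller already derived this mode from CS.D=0. */
		return 2;
	case MODE_PROT32:
		return 4;
	}
	assert(!"unknown mode");
	return 0;
}

/* A 66H prefix switches the operand size between 2 and 4 bytes. */
static int effective_op_size(int def_bytes, bool prefix_66)
{
	return prefix_66 ? (def_bytes ^ 6) : def_bytes;
}

/* A push lowers the stack pointer by the effective operand size. */
static uint32_t sp_after_push(uint32_t sp, int op_bytes)
{
	return sp - (uint32_t)op_bytes;
}
```

With CS.D set in real mode, default_size() yields 4, so a push lowers SP
by 4, which matches the behaviour Pedro observed with EPT enabled.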
Reported-by: Pedro Fonseca <pfonseca@cs.washington.edu>
Tested-by: Pedro Fonseca <pfonseca@cs.washington.edu>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Pedro Fonseca <pfonseca@cs.washington.edu>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v4 -> v5:
* cleanup patch subject/description
v3 -> v4:
* def_ad_bytes must be changed to 4
* separate X86EMUL_MODE_PROT16 altogether from the others
v2 -> v3:
* clean up the code
v1 -> v2:
* respect cs.d for real/vm8086, other modes have already
been considered in init_emulate_ctxt().
arch/x86/kvm/emulate.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 8079d14..b4a87de 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5000,6 +5000,8 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
bool op_prefix = false;
bool has_seg_override = false;
struct opcode opcode;
+ u16 dummy;
+ struct desc_struct desc;
ctxt->memop.type = OP_NONE;
ctxt->memopp = NULL;
@@ -5018,6 +5020,11 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
switch (mode) {
case X86EMUL_MODE_REAL:
case X86EMUL_MODE_VM86:
+ def_op_bytes = def_ad_bytes = 2;
+ ctxt->ops->get_segment(ctxt, &dummy, &desc, NULL, VCPU_SREG_CS);
+ if (desc.d)
+ def_op_bytes = def_ad_bytes = 4;
+ break;
case X86EMUL_MODE_PROT16:
def_op_bytes = def_ad_bytes = 2;
break;
--
2.7.4
* [PATCH v6 2/3] KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry
2017-11-06 0:54 [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding Wanpeng Li
@ 2017-11-06 0:54 ` Wanpeng Li
2017-11-06 12:16 ` Paolo Bonzini
2017-11-06 0:54 ` [PATCH v6 3/3] KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure Wanpeng Li
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Wanpeng Li @ 2017-11-06 0:54 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li, Jim Mattson
From: Wanpeng Li <wanpeng.li@hotmail.com>
According to the SDM, if the "load IA32_BNDCFGS" VM-entry control is 1, the
following checks are performed on the field for the IA32_BNDCFGS MSR:
- Bits reserved in the IA32_BNDCFGS MSR must be 0.
- The linear address in bits 63:12 must be canonical.
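As a rough standalone sketch of the two checks above (not the kernel
helpers themselves): the reserved mask assumes the IA32_BNDCFGS layout in
which bits 11:2 are reserved, and the virtual-address width is passed in
as a parameter. The names BNDCFGS_RSVD, is_canonical() and bndcfgs_valid()
are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bit 0 = EN, bit 1 = BNDPRESERVE, bits 11:2 reserved. */
#define BNDCFGS_RSVD      0x0000000000000ffcULL
#define BNDCFGS_BASE_MASK (~0xfffULL)

/* Canonical means bits 63:(vaddr_bits - 1) are all zero or all one. */
static bool is_canonical(uint64_t la, unsigned int vaddr_bits)
{
	assert(vaddr_bits >= 1 && vaddr_bits <= 64);

	uint64_t upper = la >> (vaddr_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (vaddr_bits - 1);

	return upper == 0 || upper == all_ones;
}

/* True if the value passes both checks listed above. */
static bool bndcfgs_valid(uint64_t bndcfgs, unsigned int vaddr_bits)
{
	if (bndcfgs & BNDCFGS_RSVD)
		return false;
	return is_canonical(bndcfgs & BNDCFGS_BASE_MASK, vaddr_bits);
}
```

A caller would run the check only when the "load IA32_BNDCFGS" control is
set, and fail the nested VM-entry if bndcfgs_valid() returns false.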
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v5 -> v6:
* keep the right conjunct
v3 -> v4:
* simplify condition
* use && instead of nested "if"s
arch/x86/kvm/vmx.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e6c8ffa..6cf3972 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10805,6 +10805,11 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
return 1;
}
+ if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS) &&
+ (is_noncanonical_address(vmcs12->guest_bndcfgs & PAGE_MASK, vcpu) ||
+ (vmcs12->guest_bndcfgs & MSR_IA32_BNDCFGS_RSVD)))
+ return 1;
+
return 0;
}
--
2.7.4
* [PATCH v6 3/3] KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure
2017-11-06 0:54 [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding Wanpeng Li
2017-11-06 0:54 ` [PATCH v6 2/3] KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry Wanpeng Li
@ 2017-11-06 0:54 ` Wanpeng Li
2017-11-06 12:17 ` Paolo Bonzini
2017-11-06 12:16 ` [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding Paolo Bonzini
2017-11-10 21:40 ` Radim Krčmář
3 siblings, 1 reply; 7+ messages in thread
From: Wanpeng Li @ 2017-11-06 0:54 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li, Jim Mattson
From: Wanpeng Li <wanpeng.li@hotmail.com>
Commit 4f350c6dbcb (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure
properly) can result in a null pointer dereference in L1 (when running
kvm-unit-tests/run_tests.sh vmx_controls in L1) and also a call trace in L0
when EPT=0 on both L0 and L1.
In L1:
BUG: unable to handle kernel paging request at ffffffffc015bf8f
IP: vmx_vcpu_run+0x202/0x510 [kvm_intel]
PGD 146e13067 P4D 146e13067 PUD 146e15067 PMD 3d2686067 PTE 3d4af9161
Oops: 0003 [#1] PREEMPT SMP
CPU: 2 PID: 1798 Comm: qemu-system-x86 Not tainted 4.14.0-rc4+ #6
RIP: 0010:vmx_vcpu_run+0x202/0x510 [kvm_intel]
Call Trace:
WARNING: kernel stack frame pointer at ffffb86f4988bc18 in qemu-system-x86:1798 has bad value 0000000000000002
In L0:
-----------[ cut here ]------------
WARNING: CPU: 6 PID: 4460 at /home/kernel/linux/arch/x86/kvm//vmx.c:9845 vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
CPU: 6 PID: 4460 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc7+ #25
RIP: 0010:vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
Call Trace:
paging64_page_fault+0x500/0xde0 [kvm]
? paging32_gva_to_gpa_nested+0x120/0x120 [kvm]
? nonpaging_page_fault+0x3b0/0x3b0 [kvm]
? __asan_storeN+0x12/0x20
? paging64_gva_to_gpa+0xb0/0x120 [kvm]
? paging64_walk_addr_generic+0x11a0/0x11a0 [kvm]
? lock_acquire+0x2c0/0x2c0
? vmx_read_guest_seg_ar+0x97/0x100 [kvm_intel]
? vmx_get_segment+0x2a6/0x310 [kvm_intel]
? sched_clock+0x1f/0x30
? check_chain_key+0x137/0x1e0
? __lock_acquire+0x83c/0x2420
? kvm_multiple_exception+0xf2/0x220 [kvm]
? debug_check_no_locks_freed+0x240/0x240
? debug_smp_processor_id+0x17/0x20
? __lock_is_held+0x9e/0x100
kvm_mmu_page_fault+0x90/0x180 [kvm]
kvm_handle_page_fault+0x15c/0x310 [kvm]
? __lock_is_held+0x9e/0x100
handle_exception+0x3c7/0x4d0 [kvm_intel]
vmx_handle_exit+0x103/0x1010 [kvm_intel]
? kvm_arch_vcpu_ioctl_run+0x1628/0x2e20 [kvm]
That commit avoids loading the host state of vmcs12 as vmcs01's guest state,
since vmcs12 is not modified (except for the VM-instruction error field)
if the check of the VMCS control area fails. However, the mmu context is
switched to the nested mmu in prepare_vmcs02() and it will not be reloaded,
since load_vmcs12_host_state() is skipped when nested VMLAUNCH/VMRESUME
fails. This patch fixes it by reloading the mmu context when nested
VMLAUNCH/VMRESUME fails.
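The shape of the fix can be shown with a toy model. Everything below is
hypothetical (the struct, the handlers and the toy_* functions are not
KVM's), and the deferred failure detection is collapsed into a single
function for brevity. It only demonstrates the idea of sharing one
MMU-restore helper between the normal exit path and the early-failure path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are KVM's own structures. */
typedef int (*fault_fn)(void);

static int inject_fault_l1(void) { return 1; }     /* handler used for L1 */
static int inject_fault_nested(void) { return 2; } /* handler used for L2 */

struct toy_vcpu {
	fault_fn inject_page_fault;
	unsigned long cr3;
};

/* Entry preparation switches the MMU context to the nested one. */
static void toy_prepare_entry(struct toy_vcpu *v, unsigned long guest_cr3)
{
	assert(v != NULL);
	v->inject_page_fault = inject_fault_nested;
	v->cr3 = guest_cr3;
}

/* Shared helper: undo the MMU part of entry preparation. */
static void toy_load_mmu_host_state(struct toy_vcpu *v, unsigned long host_cr3)
{
	assert(v != NULL);
	v->cr3 = host_cr3;
	v->inject_page_fault = inject_fault_l1;
}

/* Returns true if L2 would run, false if the entry failed early. */
static bool toy_enter(struct toy_vcpu *v, unsigned long host_cr3,
		      unsigned long guest_cr3, bool controls_ok)
{
	toy_prepare_entry(v, guest_cr3);
	if (!controls_ok) {
		/*
		 * The full host-state load is skipped on this path, but
		 * the MMU context must still be switched back.
		 */
		toy_load_mmu_host_state(v, host_cr3);
		return false;
	}
	return true;
}
```

Without the call on the failure path, the toy vcpu would keep the nested
fault handler and the guest cr3 after a failed entry, which is the kind of
stale state the traces above point to.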
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v3 -> v4:
* move it to a new function load_vmcs12_mmu_host_state
arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6cf3972..8aefb91 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11259,6 +11259,24 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
kvm_clear_interrupt_queue(vcpu);
}
+static void load_vmcs12_mmu_host_state(struct kvm_vcpu *vcpu,
+ struct vmcs12 *vmcs12)
+{
+ u32 entry_failure_code;
+
+ nested_ept_uninit_mmu_context(vcpu);
+
+ /*
+ * Only PDPTE load can fail as the value of cr3 was checked on entry and
+ * couldn't have changed.
+ */
+ if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &entry_failure_code))
+ nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
+
+ if (!enable_ept)
+ vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
+}
+
/*
* A part of what we need to when the nested L2 guest exits and we want to
* run its L1 parent, is to reset L1's guest state to the host state specified
@@ -11272,7 +11290,6 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
struct vmcs12 *vmcs12)
{
struct kvm_segment seg;
- u32 entry_failure_code;
if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER)
vcpu->arch.efer = vmcs12->host_ia32_efer;
@@ -11299,17 +11316,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
vcpu->arch.cr4_guest_owned_bits = ~vmcs_readl(CR4_GUEST_HOST_MASK);
vmx_set_cr4(vcpu, vmcs12->host_cr4);
- nested_ept_uninit_mmu_context(vcpu);
-
- /*
- * Only PDPTE load can fail as the value of cr3 was checked on entry and
- * couldn't have changed.
- */
- if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &entry_failure_code))
- nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
-
- if (!enable_ept)
- vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
+ load_vmcs12_mmu_host_state(vcpu, vmcs12);
if (enable_vpid) {
/*
@@ -11539,6 +11546,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
* accordingly.
*/
nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
+
+ load_vmcs12_mmu_host_state(vcpu, vmcs12);
+
/*
* The emulated instruction was already skipped in
* nested_vmx_run, but the updated RIP was never
--
2.7.4
* Re: [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding
2017-11-06 0:54 [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding Wanpeng Li
2017-11-06 0:54 ` [PATCH v6 2/3] KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry Wanpeng Li
2017-11-06 0:54 ` [PATCH v6 3/3] KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure Wanpeng Li
@ 2017-11-06 12:16 ` Paolo Bonzini
2017-11-10 21:40 ` Radim Krčmář
3 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2017-11-06 12:16 UTC (permalink / raw)
To: Wanpeng Li, linux-kernel, kvm
Cc: Radim Krčmář, Wanpeng Li, Nadav Amit, Pedro Fonseca
On 06/11/2017 01:54, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> Pedro reported:
> During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
> instruction under KVM produces different results on both memory and the SP
> register depending on whether EPT support is enabled. With EPT the SP is
> reduced by 4 bytes (and the written value is 0-padded) but without EPT support
> it is only reduced by 2 bytes. The difference can be observed when the CS.DB
> field is 1 (32-bit) but not when it's 0 (16-bit).
>
> The internal segment descriptor cache exists even in real/vm8086 mode. CS.D
> should therefore also be respected during instruction decoding, rather than
> relying only on the default operand/address size and the 66H/67H prefixes.
> This patch fixes it by also adjusting the operand/address size according to CS.D.
>
> Reported-by: Pedro Fonseca <pfonseca@cs.washington.edu>
> Tested-by: Pedro Fonseca <pfonseca@cs.washington.edu>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Nadav Amit <nadav.amit@gmail.com>
> Cc: Pedro Fonseca <pfonseca@cs.washington.edu>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> v4 -> v5:
> * cleanup patch subject/description
> v3 -> v4:
> * def_ad_bytes must be changed to 4
> * separate X86EMUL_MODE_PROT16 altogether from the others
> v2 -> v3:
> * cleanup the codes
> v1 -> v2:
> * respect cs.d for real/vm8086, other modes have already
> been considered in init_emulate_ctxt().
>
> arch/x86/kvm/emulate.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 8079d14..b4a87de 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -5000,6 +5000,8 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
> bool op_prefix = false;
> bool has_seg_override = false;
> struct opcode opcode;
> + u16 dummy;
> + struct desc_struct desc;
>
> ctxt->memop.type = OP_NONE;
> ctxt->memopp = NULL;
> @@ -5018,6 +5020,11 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
> switch (mode) {
> case X86EMUL_MODE_REAL:
> case X86EMUL_MODE_VM86:
> + def_op_bytes = def_ad_bytes = 2;
> + ctxt->ops->get_segment(ctxt, &dummy, &desc, NULL, VCPU_SREG_CS);
> + if (desc.d)
> + def_op_bytes = def_ad_bytes = 4;
> + break;
> case X86EMUL_MODE_PROT16:
> def_op_bytes = def_ad_bytes = 2;
> break;
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* Re: [PATCH v6 2/3] KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry
2017-11-06 0:54 ` [PATCH v6 2/3] KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry Wanpeng Li
@ 2017-11-06 12:16 ` Paolo Bonzini
0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2017-11-06 12:16 UTC (permalink / raw)
To: Wanpeng Li, linux-kernel, kvm
Cc: Radim Krčmář, Wanpeng Li, Jim Mattson
On 06/11/2017 01:54, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> According to the SDM, if the "load IA32_BNDCFGS" VM-entry control is 1, the
> following checks are performed on the field for the IA32_BNDCFGS MSR:
> - Bits reserved in the IA32_BNDCFGS MSR must be 0.
> - The linear address in bits 63:12 must be canonical.
>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Jim Mattson <jmattson@google.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> v5 -> v6:
> * keep the right conjunct
> v3 -> v4:
> * simplify condition
> * use && instead of nested "if"s
>
> arch/x86/kvm/vmx.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index e6c8ffa..6cf3972 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -10805,6 +10805,11 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> return 1;
> }
>
> + if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS) &&
> + (is_noncanonical_address(vmcs12->guest_bndcfgs & PAGE_MASK, vcpu) ||
> + (vmcs12->guest_bndcfgs & MSR_IA32_BNDCFGS_RSVD)))
> + return 1;
> +
> return 0;
> }
>
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* Re: [PATCH v6 3/3] KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure
2017-11-06 0:54 ` [PATCH v6 3/3] KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure Wanpeng Li
@ 2017-11-06 12:17 ` Paolo Bonzini
0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2017-11-06 12:17 UTC (permalink / raw)
To: Wanpeng Li, linux-kernel, kvm
Cc: Radim Krčmář, Wanpeng Li, Jim Mattson
On 06/11/2017 01:54, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> Commit 4f350c6dbcb (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure
> properly) can result in a null pointer dereference in L1 (when running
> kvm-unit-tests/run_tests.sh vmx_controls in L1) and also a call trace in L0
> when EPT=0 on both L0 and L1.
>
> In L1:
>
> BUG: unable to handle kernel paging request at ffffffffc015bf8f
> IP: vmx_vcpu_run+0x202/0x510 [kvm_intel]
> PGD 146e13067 P4D 146e13067 PUD 146e15067 PMD 3d2686067 PTE 3d4af9161
> Oops: 0003 [#1] PREEMPT SMP
> CPU: 2 PID: 1798 Comm: qemu-system-x86 Not tainted 4.14.0-rc4+ #6
> RIP: 0010:vmx_vcpu_run+0x202/0x510 [kvm_intel]
> Call Trace:
> WARNING: kernel stack frame pointer at ffffb86f4988bc18 in qemu-system-x86:1798 has bad value 0000000000000002
>
> In L0:
>
> -----------[ cut here ]------------
> WARNING: CPU: 6 PID: 4460 at /home/kernel/linux/arch/x86/kvm//vmx.c:9845 vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
> CPU: 6 PID: 4460 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc7+ #25
> RIP: 0010:vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
> Call Trace:
> paging64_page_fault+0x500/0xde0 [kvm]
> ? paging32_gva_to_gpa_nested+0x120/0x120 [kvm]
> ? nonpaging_page_fault+0x3b0/0x3b0 [kvm]
> ? __asan_storeN+0x12/0x20
> ? paging64_gva_to_gpa+0xb0/0x120 [kvm]
> ? paging64_walk_addr_generic+0x11a0/0x11a0 [kvm]
> ? lock_acquire+0x2c0/0x2c0
> ? vmx_read_guest_seg_ar+0x97/0x100 [kvm_intel]
> ? vmx_get_segment+0x2a6/0x310 [kvm_intel]
> ? sched_clock+0x1f/0x30
> ? check_chain_key+0x137/0x1e0
> ? __lock_acquire+0x83c/0x2420
> ? kvm_multiple_exception+0xf2/0x220 [kvm]
> ? debug_check_no_locks_freed+0x240/0x240
> ? debug_smp_processor_id+0x17/0x20
> ? __lock_is_held+0x9e/0x100
> kvm_mmu_page_fault+0x90/0x180 [kvm]
> kvm_handle_page_fault+0x15c/0x310 [kvm]
> ? __lock_is_held+0x9e/0x100
> handle_exception+0x3c7/0x4d0 [kvm_intel]
> vmx_handle_exit+0x103/0x1010 [kvm_intel]
> ? kvm_arch_vcpu_ioctl_run+0x1628/0x2e20 [kvm]
>
> That commit avoids loading the host state of vmcs12 as vmcs01's guest state,
> since vmcs12 is not modified (except for the VM-instruction error field)
> if the check of the VMCS control area fails. However, the mmu context is
> switched to the nested mmu in prepare_vmcs02() and it will not be reloaded,
> since load_vmcs12_host_state() is skipped when nested VMLAUNCH/VMRESUME
> fails. This patch fixes it by reloading the mmu context when nested
> VMLAUNCH/VMRESUME fails.
>
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Jim Mattson <jmattson@google.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> v3 -> v4:
> * move it to a new function load_vmcs12_mmu_host_state
>
> arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++++------------
> 1 file changed, 22 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 6cf3972..8aefb91 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11259,6 +11259,24 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> kvm_clear_interrupt_queue(vcpu);
> }
>
> +static void load_vmcs12_mmu_host_state(struct kvm_vcpu *vcpu,
> + struct vmcs12 *vmcs12)
> +{
> + u32 entry_failure_code;
> +
> + nested_ept_uninit_mmu_context(vcpu);
> +
> + /*
> + * Only PDPTE load can fail as the value of cr3 was checked on entry and
> + * couldn't have changed.
> + */
> + if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &entry_failure_code))
> + nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
> +
> + if (!enable_ept)
> + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> +}
> +
> /*
> * A part of what we need to when the nested L2 guest exits and we want to
> * run its L1 parent, is to reset L1's guest state to the host state specified
> @@ -11272,7 +11290,6 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> struct vmcs12 *vmcs12)
> {
> struct kvm_segment seg;
> - u32 entry_failure_code;
>
> if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER)
> vcpu->arch.efer = vmcs12->host_ia32_efer;
> @@ -11299,17 +11316,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> vcpu->arch.cr4_guest_owned_bits = ~vmcs_readl(CR4_GUEST_HOST_MASK);
> vmx_set_cr4(vcpu, vmcs12->host_cr4);
>
> - nested_ept_uninit_mmu_context(vcpu);
> -
> - /*
> - * Only PDPTE load can fail as the value of cr3 was checked on entry and
> - * couldn't have changed.
> - */
> - if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &entry_failure_code))
> - nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
> -
> - if (!enable_ept)
> - vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> + load_vmcs12_mmu_host_state(vcpu, vmcs12);
>
> if (enable_vpid) {
> /*
> @@ -11539,6 +11546,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
> * accordingly.
> */
> nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
> +
> + load_vmcs12_mmu_host_state(vcpu, vmcs12);
> +
> /*
> * The emulated instruction was already skipped in
> * nested_vmx_run, but the updated RIP was never
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* Re: [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding
2017-11-06 0:54 [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding Wanpeng Li
` (2 preceding siblings ...)
2017-11-06 12:16 ` [PATCH v6 1/3] KVM: X86: Fix operand/address-size during instruction decoding Paolo Bonzini
@ 2017-11-10 21:40 ` Radim Krčmář
3 siblings, 0 replies; 7+ messages in thread
From: Radim Krčmář @ 2017-11-10 21:40 UTC (permalink / raw)
To: Wanpeng Li
Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li, Nadav Amit, Pedro Fonseca
Applied all three, thanks.