* [PATCH] x86/vvmx: Improvements to INVEPT instruction handling
From: Andrew Cooper @ 2017-02-06 16:55 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, Jan Beulich

 * Latch current once at the start.
 * Avoid the memory operand read for INVEPT_ALL_CONTEXT.  Experimentally, this
   is how hardware behaves, and avoids an unnecessary pagewalk.
 * Reject Reg/Reg encodings of the instruction.
 * Audit eptp against maxphysaddr.
 * Introduce and use VMX_INSN_INVALID_INV_OPERAND to correct the vmfail
   semantics.
 * Add extra newlines for clarity

Also, introduce some TODOs for further checks which should be performed.
These checks are hard to perform at the moment, as there is no easy way to see
which MSR values were given to the guest.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>

This patch conflicts textually but not functionally with Sergey's "x86/vmx:
introduce vmwrite_safe()".

The MSR checks will be resolved once the MSR Levelling work is started and the
configuration can be read out of the MSR policy block.
---
 xen/arch/x86/hvm/vmx/vvmx.c        | 43 ++++++++++++++++++++++++++++++++------
 xen/include/asm-x86/hvm/vmx/vmcs.h |  1 +
 2 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 9caebe5..dfcc8a7 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1770,33 +1770,64 @@ int nvmx_handle_vmwrite(struct cpu_user_regs *regs)
 
 int nvmx_handle_invept(struct cpu_user_regs *regs)
 {
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
     struct vmx_inst_decoded decode;
-    unsigned long eptp;
     int ret;
 
-    if ( (ret = decode_vmx_inst(regs, &decode, &eptp, 0)) != X86EMUL_OKAY )
+    if ( (ret = decode_vmx_inst(regs, &decode, NULL, 0)) != X86EMUL_OKAY )
         return ret;
 
+    /* TODO - reject instruction completely if not configured. */
+
+    /* Instruction must have a memory operand. */
+    if ( decode.type != VMX_INST_MEMREG_TYPE_MEMORY )
+    {
+        hvm_inject_hw_exception(TRAP_invalid_op, 0);
+        return X86EMUL_EXCEPTION;
+    }
+
     switch ( reg_read(regs, decode.reg2) )
     {
     case INVEPT_SINGLE_CONTEXT:
     {
-        struct p2m_domain *p2m = p2m_get_nestedp2m(current, eptp);
+        struct p2m_domain *p2m;
+        pagefault_info_t pfinfo;
+        unsigned long eptp;
+
+        /* TODO - reject SINGLE_CONTEXT if not configured. */
+
+        ret = hvm_copy_from_guest_linear(&eptp, decode.mem, 8, 0, &pfinfo);
+        if ( ret == HVMCOPY_bad_gva_to_gfn )
+            hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
+        if ( ret != HVMCOPY_okay )
+            return X86EMUL_EXCEPTION;
+
+        if ( eptp >> currd->arch.cpuid->extd.maxphysaddr )
+            goto reject;
+
+        /* TODO - audit lower eptp metadata against configuration. */
+
+        p2m = p2m_get_nestedp2m(curr, eptp);
         if ( p2m )
         {
-            p2m_flush(current, p2m);
+            p2m_flush(curr, p2m);
             ept_sync_domain(p2m);
         }
         break;
     }
+
     case INVEPT_ALL_CONTEXT:
-        p2m_flush_nestedp2m(current->domain);
+        p2m_flush_nestedp2m(curr->domain);
         __invept(INVEPT_ALL_CONTEXT, 0, 0);
         break;
+
+    reject:
     default:
-        vmfail_invalid(regs);
+        vmfail(regs, VMX_INSN_INVALID_INV_OPERAND);
         return X86EMUL_OKAY;
     }
+
     vmsucceed(regs);
     return X86EMUL_OKAY;
 }
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index d71de04..309eaf0 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -526,6 +526,7 @@ enum vmx_insn_errno
     VMX_INSN_VMPTRLD_INVALID_PHYADDR       = 9,
     VMX_INSN_UNSUPPORTED_VMCS_COMPONENT    = 12,
     VMX_INSN_VMXON_IN_VMX_ROOT             = 15,
+    VMX_INSN_INVALID_INV_OPERAND           = 28,
 };
 
 void vmx_disable_intercept_for_msr(struct vcpu *v, u32 msr, int type);
-- 
2.1.4



* Re: [PATCH] x86/vvmx: Improvements to INVEPT instruction handling
From: Jan Beulich @ 2017-02-07 10:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Kevin Tian, Jun Nakajima, Xen-devel

 >>> On 06.02.17 at 17:55, <andrew.cooper3@citrix.com> wrote:
> * Latch current once at the start.
>  * Avoid the memory operand read for INVEPT_ALL_CONTEXT.  Experimentally, this
>    is how hardware behaves, and avoids an unnecessary pagewalk.

Hmm, so you say #GP is being raised for all possible reasons, but
#PF can't result here? That would be pretty unusual instruction
semantics. But if it's that way (to be confirmed by Intel), and ...

>  * Reject Reg/Reg encodings of the instruction.

... if this is possible to occur at all (I'd expect #UD to result instead
of a VM exit), then ...

>  * Audit eptp against maxphysaddr.
>  * Introduce and use VMX_INSN_INVALID_INV_OPERAND to correct the vmfail
>    semantics.
>  * Add extra newlines for clarity
> 
> Also, introduce some TODOs for further checks which should be performed.
> These checks are hard to perform at the moment, as there is no easy way to see
> which MSR values were given to the guest.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

with one minor adjustment request:

> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -1770,33 +1770,64 @@ int nvmx_handle_vmwrite(struct cpu_user_regs *regs)
>  
>  int nvmx_handle_invept(struct cpu_user_regs *regs)
>  {
> +    struct vcpu *curr = current;
> +    struct domain *currd = curr->domain;
>      struct vmx_inst_decoded decode;
> -    unsigned long eptp;
>      int ret;
>  
> -    if ( (ret = decode_vmx_inst(regs, &decode, &eptp, 0)) != X86EMUL_OKAY )
> +    if ( (ret = decode_vmx_inst(regs, &decode, NULL, 0)) != X86EMUL_OKAY )
>          return ret;
>  
> +    /* TODO - reject instruction completely if not configured. */
> +
> +    /* Instruction must have a memory operand. */
> +    if ( decode.type != VMX_INST_MEMREG_TYPE_MEMORY )
> +    {
> +        hvm_inject_hw_exception(TRAP_invalid_op, 0);
> +        return X86EMUL_EXCEPTION;
> +    }
> +
>      switch ( reg_read(regs, decode.reg2) )
>      {
>      case INVEPT_SINGLE_CONTEXT:
>      {
> -        struct p2m_domain *p2m = p2m_get_nestedp2m(current, eptp);
> +        struct p2m_domain *p2m;
> +        pagefault_info_t pfinfo;
> +        unsigned long eptp;
> +
> +        /* TODO - reject SINGLE_CONTEXT if not configured. */
> +
> +        ret = hvm_copy_from_guest_linear(&eptp, decode.mem, 8, 0, &pfinfo);

Please use sizeof(eptp) here.

Jan



* Re: [PATCH] x86/vvmx: Improvements to INVEPT instruction handling
From: Andrew Cooper @ 2017-02-07 10:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Kevin Tian, Jun Nakajima, Xen-devel

On 07/02/17 10:19, Jan Beulich wrote:
>  >>> On 06.02.17 at 17:55, <andrew.cooper3@citrix.com> wrote:
>> * Latch current once at the start.
>>  * Avoid the memory operand read for INVEPT_ALL_CONTEXT.  Experimentally, this
>>    is how hardware behaves, and avoids an unnecessary pagewalk.
> Hmm, so you say #GP is being raised for all possible reasons, but
> #PF can't result here? That would be pretty unusual instruction
> semantics. But if it's that way (to be confirmed by Intel), and ...

No.  The memory operand is entirely ignored.  No #PF, #GP or #SS for
bad segments or non-canonical addresses.

>
>>  * Reject Reg/Reg encodings of the instruction.
> ... if this is possible to occur at all (I'd expect #UD to result instead
> of a VM exit), then ...

I'd hope so as well.  This addition is partly from an entirely emulation
point of view, as well as a proper sanity check of the decode.mem union
before use.

>
>>  * Audit eptp against maxphysaddr.
>>  * Introduce and use VMX_INSN_INVALID_INV_OPERAND to correct the vmfail
>>    semantics.
>>  * Add extra newlines for clarity
>>
>> Also, introduce some TODOs for further checks which should be performed.
>> These checks are hard to perform at the moment, as there is no easy way to see
>> which MSR values were given to the guest.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> with one minor adjustment request:
>
>> --- a/xen/arch/x86/hvm/vmx/vvmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
>> @@ -1770,33 +1770,64 @@ int nvmx_handle_vmwrite(struct cpu_user_regs *regs)
>>  
>>  int nvmx_handle_invept(struct cpu_user_regs *regs)
>>  {
>> +    struct vcpu *curr = current;
>> +    struct domain *currd = curr->domain;
>>      struct vmx_inst_decoded decode;
>> -    unsigned long eptp;
>>      int ret;
>>  
>> -    if ( (ret = decode_vmx_inst(regs, &decode, &eptp, 0)) != X86EMUL_OKAY )
>> +    if ( (ret = decode_vmx_inst(regs, &decode, NULL, 0)) != X86EMUL_OKAY )
>>          return ret;
>>  
>> +    /* TODO - reject instruction completely if not configured. */
>> +
>> +    /* Instruction must have a memory operand. */
>> +    if ( decode.type != VMX_INST_MEMREG_TYPE_MEMORY )
>> +    {
>> +        hvm_inject_hw_exception(TRAP_invalid_op, 0);
>> +        return X86EMUL_EXCEPTION;
>> +    }
>> +
>>      switch ( reg_read(regs, decode.reg2) )
>>      {
>>      case INVEPT_SINGLE_CONTEXT:
>>      {
>> -        struct p2m_domain *p2m = p2m_get_nestedp2m(current, eptp);
>> +        struct p2m_domain *p2m;
>> +        pagefault_info_t pfinfo;
>> +        unsigned long eptp;
>> +
>> +        /* TODO - reject SINGLE_CONTEXT if not configured. */
>> +
>> +        ret = hvm_copy_from_guest_linear(&eptp, decode.mem, 8, 0, &pfinfo);
> Please use sizeof(eptp) here.

Ok, but in that case eptp needs to become uint64_t to match the ABI.  In
fact, I probably need to read all 128 bits and perform the MBZ check on
the 2nd word.
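
As a rough sketch (the struct name is illustrative only, and whether a
non-zero reserved word should vmfail or fault still wants checking against
the SDM), that would look something like:

    struct invept_desc {
        uint64_t eptp;
        uint64_t rsvd;       /* Second word of the descriptor, MBZ. */
    } desc;

    /* Read the full 128-bit descriptor rather than just the low 64 bits. */
    ret = hvm_copy_from_guest_linear(&desc, decode.mem, sizeof(desc), 0, &pfinfo);
    if ( ret == HVMCOPY_bad_gva_to_gfn )
        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
    if ( ret != HVMCOPY_okay )
        return X86EMUL_EXCEPTION;

    /* Reject a non-zero reserved word, or an eptp above maxphysaddr. */
    if ( desc.rsvd || (desc.eptp >> currd->arch.cpuid->extd.maxphysaddr) )
        goto reject;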

~Andrew


* Re: [PATCH] x86/vvmx: Improvements to INVEPT instruction handling
From: Jan Beulich @ 2017-02-07 10:52 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Kevin Tian, Jun Nakajima, Xen-devel

>>> On 07.02.17 at 11:39, <andrew.cooper3@citrix.com> wrote:
> On 07/02/17 10:19, Jan Beulich wrote:
>>  >>> On 06.02.17 at 17:55, <andrew.cooper3@citrix.com> wrote:
>>> * Latch current once at the start.
>>>  * Avoid the memory operand read for INVEPT_ALL_CONTEXT.  Experimentally, this
>>>    is how hardware behaves, and avoids an unnecessary pagewalk.
>> Hmm, so you say #GP is being raised for all possible reasons, but
>> #PF can't result here? That would be pretty unusual instruction
>> semantics. But if it's that way (to be confirmed by Intel), and ...
> 
> No.  The memory operand is entirely ignored.  No #PF, or #GP or #SS for
> bad segment or canonical setups.

But then the #GP related checks in decode_vmx_inst() would also
need skipping.

>>>  * Reject Reg/Reg encodings of the instruction.
>> ... if this is possible to occur at all (I'd expect #UD to result instead
>> of a VM exit), then ...
> 
> I'd hope so as well.  This addition is partly from an entirely emulation
> point of view, as well as a proper sanity check of the decode.mem union
> before use.

The "entirely emulation point of view" is not realy applicable here,
as we don't decode the instruction a 2nd time (after the hardware
had done so). Sanity checking hardware produced values of course
is reasonable in places like this; I'm just not sure whether reporting
such issues as #UD to the guest is appropriate - I'd rather see the
guest killed if hardware doesn't behave itself.
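
Just to illustrate what I have in mind (a sketch only, reusing the
helpers we already have, rather than a request for this exact code):

    /* Hardware shouldn't hand us a register-only decode for INVEPT. */
    if ( decode.type != VMX_INST_MEMREG_TYPE_MEMORY )
    {
        ASSERT_UNREACHABLE();
        domain_crash(currd);
        return X86EMUL_UNHANDLEABLE;
    }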

>>> --- a/xen/arch/x86/hvm/vmx/vvmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
>>> @@ -1770,33 +1770,64 @@ int nvmx_handle_vmwrite(struct cpu_user_regs *regs)
>>>  
>>>  int nvmx_handle_invept(struct cpu_user_regs *regs)
>>>  {
>>> +    struct vcpu *curr = current;
>>> +    struct domain *currd = curr->domain;
>>>      struct vmx_inst_decoded decode;
>>> -    unsigned long eptp;
>>>      int ret;
>>>  
>>> -    if ( (ret = decode_vmx_inst(regs, &decode, &eptp, 0)) != X86EMUL_OKAY )
>>> +    if ( (ret = decode_vmx_inst(regs, &decode, NULL, 0)) != X86EMUL_OKAY )
>>>          return ret;
>>>  
>>> +    /* TODO - reject instruction completely if not configured. */
>>> +
>>> +    /* Instruction must have a memory operand. */
>>> +    if ( decode.type != VMX_INST_MEMREG_TYPE_MEMORY )
>>> +    {
>>> +        hvm_inject_hw_exception(TRAP_invalid_op, 0);
>>> +        return X86EMUL_EXCEPTION;
>>> +    }
>>> +
>>>      switch ( reg_read(regs, decode.reg2) )
>>>      {
>>>      case INVEPT_SINGLE_CONTEXT:
>>>      {
>>> -        struct p2m_domain *p2m = p2m_get_nestedp2m(current, eptp);
>>> +        struct p2m_domain *p2m;
>>> +        pagefault_info_t pfinfo;
>>> +        unsigned long eptp;
>>> +
>>> +        /* TODO - reject SINGLE_CONTEXT if not configured. */
>>> +
>>> +        ret = hvm_copy_from_guest_linear(&eptp, decode.mem, 8, 0, &pfinfo);
>> Please use sizeof(eptp) here.
> 
> Ok, but in that case eptp needs to become uint64_t to match the ABI.  In
> fact, I probably need to read all 128 bits and perform the MBZ check on
> the 2nd word.

Oh, indeed, all the more so since this may also affect eventual
exceptions needing to be raised.

Jan



* Re: [PATCH] x86/vvmx: Improvements to INVEPT instruction handling
From: Tian, Kevin @ 2017-02-08  7:46 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Nakajima, Jun, Jan Beulich

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Tuesday, February 07, 2017 12:55 AM
> 
>  * Latch current once at the start.
>  * Avoid the memory operand read for INVEPT_ALL_CONTEXT.  Experimentally, this
>    is how hardware behaves, and avoids an unnecessary pagewalk.
>  * Reject Reg/Reg encodings of the instruction.
>  * Audit eptp against maxphysaddr.
>  * Introduce and use VMX_INSN_INVALID_INV_OPERAND to correct the vmfail
>    semantics.
>  * Add extra newlines for clarity
> 
> Also, introduce some TODOs for further checks which should be performed.
> These checks are hard to perform at the moment, as there is no easy way to see
> which MSR values were given to the guest.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH] x86/vvmx: Improvements to INVEPT instruction handling
From: Andrew Cooper @ 2017-05-15 15:02 UTC (permalink / raw)
  To: Tian, Kevin, Xen-devel; +Cc: Nakajima, Jun, Jan Beulich

On 08/02/17 07:46, Tian, Kevin wrote:
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: Tuesday, February 07, 2017 12:55 AM
>>
>>  * Latch current once at the start.
>>  * Avoid the memory operand read for INVEPT_ALL_CONTEXT.  Experimentally, this
>>    is how hardware behaves, and avoids an unnecessary pagewalk.
>>  * Reject Reg/Reg encodings of the instruction.
>>  * Audit eptp against maxphysaddr.
>>  * Introduce and use VMX_INSN_INVALID_INV_OPERAND to correct the vmfail
>>    semantics.
>>  * Add extra newlines for clarity
>>
>> Also, introduce some TODOs for further checks which should be performed.
>> These checks are hard to perform at the moment, as there is no easy way to see
>> which MSR values were given to the guest.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Kevin Tian <kevin.tian@intel.com>

Actually, it turns out that the combination of 2b2793d3 and f438b1c5 is
entirely broken for 32-bit hypervisors, and this patch was an accidental
bugfix.

decode_vmx_inst() reads using the default memory operand size, meaning
that a 32-bit code segment executing INVEPT only fills in the bottom half
of &eptp.
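
As a purely hypothetical illustration of the failure mode (simplified;
not the literal pre-patch call chain):

    unsigned long eptp;      /* 64 bits wide on x86_64, left uninitialised */
    unsigned int bytes = 4;  /* default operand size in a 32-bit code segment */

    /* Old flow, simplified: only 'bytes' bytes of the operand get copied. */
    hvm_copy_from_guest_linear(&eptp, decode.mem, bytes, 0, &pfinfo);

    /*
     * Bits 63:32 of eptp are now whatever happened to be on the stack, so
     * the p2m_get_nestedp2m(current, eptp) lookup which follows uses a
     * junk EPTP.
     */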

~Andrew

