* Emulation and active (valid) interrupts
@ 2018-08-08 14:26 Razvan Cojocaru
  2018-08-08 15:01 ` Razvan Cojocaru
  2018-08-09  7:54 ` Jan Beulich
  0 siblings, 2 replies; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-08 14:26 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

[-- Attachment #1: Type: text/plain, Size: 2584 bytes --]

Hello,

I'm seeing rare occasions where this backtrace occurs at a point in
__vmx_inject_exception() where there's already a valid interrupt set up
in VM_ENTRY_INTR_INFO (that is, once the value is __vmread(), intr_info
& INTR_INFO_VALID_MASK is non-zero):

Xen call trace:
   [<ffff82d0801fe9c7>] vmx.c#__vmx_inject_exception+0x47/0xd9
   [<ffff82d080201c6a>] vmx.c#vmx_inject_trap+0x1e1/0x29f
   [<ffff82d0801d6b8a>] hvm_inject_trap+0xb0/0xb5
   [<ffff82d0801d6be7>] hvm_inject_page_fault+0x2a/0x2c
   [<ffff82d0801d6cc6>] hvm.c#__hvm_copy+0xdd/0x37f
   [<ffff82d0801d88fa>] hvm_copy_from_guest_virt+0x14/0x16
   [<ffff82d0801d28e3>] emulate.c#__hvmemul_read+0x12f/0x19f
   [<ffff82d0801d2a2c>] emulate.c#hvmemul_read+0x28/0x2a
   [<ffff82d0801a931c>] x86_emulate.c#read_ulong+0x13/0x15
   [<ffff82d0801ab412>] x86_emulate+0x16b1/0x11405
   [<ffff82d0801d180a>] emulate.c#_hvm_emulate_one+0x188/0x2bc
   [<ffff82d0801d1a34>] hvm_emulate_one+0x10/0x12
   [<ffff82d0801d2d39>] hvm_mem_access_emulate_one+0x7a/0xdd
   [<ffff82d0801dbe50>] hvm_do_resume+0x246/0x3d3
   [<ffff82d0801fbddb>] vmx_do_resume+0x102/0x119
   [<ffff82d080170ba3>] context_switch+0xf52/0xfad
   [<ffff82d08013182c>] schedule.c#schedule+0x753/0x792
   [<ffff82d080134f05>] softirq.c#__do_softirq+0x85/0x90
   [<ffff82d080134f5a>] do_softirq+0x13/0x15
   [<ffff82d08016b5f2>] domain.c#idle_loop+0x61/0x6e

(this is a Xen 4.7.5 trace, but it applies to staging as well).

This was the initial culprit:

[<ffff82d08031b55d>] vmx.c#__vmx_inject_exception+0xa1/0xda
[<ffff82d08031eb5c>] vmx_inject_extint+0x94/0x9f
[<ffff82d080315a0a>] vmx_intr_assist+0x4ee/0x5ad
[<ffff82d0803258ff>] vmx_asm_vmexit_handler+0xff/0x270

and I thought I'd fixed it by treating the emulation in a similar manner
to the single-step case: have vmx_intr_assist() block interrupts for the
duration of the emulation (please see the attached patch for staging).
However, a bit more rarely this time, we still end up overwriting an
interrupt via the above code path.

Obviously this works only if nothing has been written in
VM_ENTRY_INTR_INFO _before_ we block (return) in vmx_intr_assist().

My questions are:

1. Is it possible to already have a valid interrupt written in
VM_ENTRY_INTR_INFO at EXIT_REASON_EPT_VIOLATION-time in
vmx_vmexit_handler()?

2. Is it possible that something else in that path writes into
VM_ENTRY_INTR_INFO (which the vmx_intr_assist() logic can't possibly
prevent)? For example, in my Xen 4.7.5 sources, there's a
pt_restore_timer(v); call in hvm_do_resume() before the vm_event
emulation code.


Thanks,
Razvan

[-- Attachment #2: emulate_intr_block.patch --]
[-- Type: text/x-patch, Size: 2549 bytes --]

diff --git a/xen/arch/x86/hvm/vm_event.c b/xen/arch/x86/hvm/vm_event.c
index 28d08a6..8e7c737 100644
--- a/xen/arch/x86/hvm/vm_event.c
+++ b/xen/arch/x86/hvm/vm_event.c
@@ -124,6 +124,8 @@ void hvm_vm_event_do_resume(struct vcpu *v)
 
         w->do_write.msr = 0;
     }
+
+    v->arch.vm_event->intr_block = false;
 }
 
 /*
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index eb9b288..97cecbf 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -22,6 +22,7 @@
 #include <xen/errno.h>
 #include <xen/trace.h>
 #include <xen/event.h>
+#include <xen/vm_event.h>
 #include <asm/apicdef.h>
 #include <asm/current.h>
 #include <asm/cpufeature.h>
@@ -37,6 +38,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/trace.h>
+#include <asm/vm_event.h>
 
 /*
  * A few notes on virtual NMI and INTR delivery, and interactions with
@@ -231,6 +233,11 @@ void vmx_intr_assist(void)
     enum hvm_intblk intblk;
     int pt_vector;
 
+    if ( unlikely(v->arch.vm_event) &&
+         vm_event_check_ring(v->domain->vm_event_monitor) &&
+         v->arch.vm_event->intr_block )
+        return;
+
     /* Block event injection when single step with MTF. */
     if ( unlikely(v->arch.hvm_vcpu.single_step) )
     {
diff --git a/xen/common/monitor.c b/xen/common/monitor.c
index c606683..4263929 100644
--- a/xen/common/monitor.c
+++ b/xen/common/monitor.c
@@ -113,6 +113,12 @@ int monitor_traps(struct vcpu *v, bool sync, vm_event_request_t *req)
     if ( sync )
     {
         req->flags |= VM_EVENT_FLAG_VCPU_PAUSED;
+        /*
+         * It only makes sense to block interrupts for the duration of
+         * processing blocking vm_events, since that is the only case where
+         * emulating the current instruction really applies.
+         */
+        v->arch.vm_event->intr_block = true;
         vm_event_vcpu_pause(v);
         rc = 1;
     }
diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h
index 39e73c8..2b36614 100644
--- a/xen/include/asm-x86/vm_event.h
+++ b/xen/include/asm-x86/vm_event.h
@@ -34,6 +34,12 @@ struct arch_vm_event {
     struct monitor_write_data write_data;
     struct vm_event_regs_x86 gprs;
     bool set_gprs;
+    /*
+     * Block interrupts until this vm_event is done handling (after the
+     * fashion of single-step). Meant for the cases where the vm_event
+     * reply asks for emulation of the current instruction.
+     */
+    bool intr_block;
 };
 
 int vm_event_init_domain(struct domain *d);

[-- Attachment #3: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: Emulation and active (valid) interrupts
  2018-08-08 14:26 Emulation and active (valid) interrupts Razvan Cojocaru
@ 2018-08-08 15:01 ` Razvan Cojocaru
  2018-08-09  7:54 ` Jan Beulich
  1 sibling, 0 replies; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-08 15:01 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

> 2. Is it possible that something else in that path writes into
> VM_ENTRY_INTR_INFO (which the vmx_intr_assist() logic can't possibly
> prevent)? For example, in my Xen 4.7.5 sources, there's a
> pt_restore_timer(v); call in hvm_do_resume() before the vm_event
> emulation code.

Actually even handle_hvm_io_completion(v) could theoretically cause a
write into VM_ENTRY_INTR_INFO, because it emulates. I'm not sure how
probable this is, but theoretically the same issue with vm_event
emulation applies here: external interrupts could be lost.


Thanks,
Razvan


* Re: Emulation and active (valid) interrupts
  2018-08-08 14:26 Emulation and active (valid) interrupts Razvan Cojocaru
  2018-08-08 15:01 ` Razvan Cojocaru
@ 2018-08-09  7:54 ` Jan Beulich
  2018-08-09  8:20   ` Razvan Cojocaru
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-08-09  7:54 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Xen-devel

>>> On 08.08.18 at 16:26, <rcojocaru@bitdefender.com> wrote:
> 1. Is it possible to already have a valid interrupt written in
> VM_ENTRY_INTR_INFO at EXIT_REASON_EPT_VIOLATION-time in
> vmx_vmexit_handler()?

You mean right after the exit? Where would that come from? I'm
afraid I don't see the connection to your issue (or the call traces
you've provided).

I also can't help but thinking that we've had a similar discussion
before, with the (iirc) clear outcome that injection of the various
kinds of events needs to be re-arranged to strictly follow
architectural definitions. That is, for example, no interrupt may
be injected before _all_ exception injection sources for the
current insn have been dealt with. That's because real hardware
also only ever checks for interrupts at instruction boundaries,
not in the middle of processing an instruction.

Jan




* Re: Emulation and active (valid) interrupts
  2018-08-09  7:54 ` Jan Beulich
@ 2018-08-09  8:20   ` Razvan Cojocaru
  2018-08-09  8:35     ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-09  8:20 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Xen-devel

On 8/9/18 10:54 AM, Jan Beulich wrote:
>>>> On 08.08.18 at 16:26, <rcojocaru@bitdefender.com> wrote:
>> 1. Is it possible to already have a valid interrupt written in
>> VM_ENTRY_INTR_INFO at EXIT_REASON_EPT_VIOLATION-time in
>> vmx_vmexit_handler()?
> 
> You mean right after the exit? Where would that come from? I'm
> afraid I don't see the connection to your issue (or the call traces
> you've provided).

I mean right before the exit - so a scenario where VM_ENTRY_INTR_INFO
has been written into and then ept_handle_violation() gets called - in
which case it's too late to block interrupts in vmx_intr_assist() since
one is already pending (somehow) when we get to hvm_do_resume().

The call traces I've provided indeed do not illustrate this case.
They've been posted as proof of what can (and did) happen with emulation
and interrupts (I did not have such specific information the last time we
discussed this).

> I also can't help but thinking that we've had a similar discussion
> before, with the (iirc) clear outcome that injection of the various
> kinds of events needs to be re-arranged to strictly follow
> architectural definitions. That is, for example, no interrupt may
> be injected before _all_ exception injection sources for the
> current insn have been dealt with. That's because real hardware
> also only ever checks for interrupts at instruction boundaries,
> not in the middle of processing an instruction.

We did discuss that, and that's what I've also understood the conclusion
to be. However there's no clear action plan for achieving that at this
time (that I am aware of), and since that's sensitive and somewhat
complex code I thought it would be nice to at least discuss general
guidelines of how to go about this.

It hasn't been a priority so far because in the past we've only seen
this when our agent injected a #UD (which gets emulated with upstream
Xen, but is re-executed with the help of the Monitor Trap Flag in
XenServer). We could do this because that can only happen for execute
faults (on pages marked rw-), and for those faults we don't emulate but
run the instruction on hardware.

Also, in the UD case, we had more control, as we were explicitly calling
hvm_inject_event(&ctx.ctxt.event); in hvm_emulate_one_vm_event(). With
this backtrace, however, hvm_inject_page_fault() gets called further
down the line by hvm_emulate_one() and we can't control when or if that
happens.

Long story short, should we approach this or are there other plans for
this to be fixed? If we should approach this, any pointers / suggestions
with regard to the current Xen code and most desirable design approach?


Thanks,
Razvan


* Re: Emulation and active (valid) interrupts
  2018-08-09  8:20   ` Razvan Cojocaru
@ 2018-08-09  8:35     ` Jan Beulich
  2018-08-13 11:55       ` Razvan Cojocaru
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-08-09  8:35 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Xen-devel

>>> On 09.08.18 at 10:20, <rcojocaru@bitdefender.com> wrote:
> On 8/9/18 10:54 AM, Jan Beulich wrote:
>>>>> On 08.08.18 at 16:26, <rcojocaru@bitdefender.com> wrote:
>>> 1. Is it possible to already have a valid interrupt written in
>>> VM_ENTRY_INTR_INFO at EXIT_REASON_EPT_VIOLATION-time in
>>> vmx_vmexit_handler()?
>> 
>> You mean right after the exit? Where would that come from? I'm
>> afraid I don't see the connection to your issue (or the call traces
>> you've provided).
> 
> I mean right before the exit

Before? Iirc the CPU doesn't itself write VM_ENTRY_* fields,
other than to clear them (presumably during VM exit processing).

> - so a scenario where VM_ENTRY_INTR_INFO
> has been written into and then ept_handle_violation() gets called - in
> which case it's too late to block interrupts in vmx_intr_assist() since
> one is already pending (somehow) when we get to hvm_do_resume().
> 
> The call traces I've provided do not indeed illustrate this case.
> They've been posted as proof of what can (and did happen) with emulation
> and interrupts (I did not have such specific information last time we've
> discussed this).
> 
>> I also can't help but thinking that we've had a similar discussion
>> before, with the (iirc) clear outcome that injection of the various
>> kinds of events needs to be re-arranged to strictly follow
>> architectural definitions. That is, for example, no interrupt may
>> be injected before _all_ exception injection sources for the
>> current insn have been dealt with. That's because real hardware
>> also only ever checks for interrupts at instruction boundaries,
>> not in the middle of processing an instruction.
> 
> We did discuss that, and that's what I've also understood the conclusion
> to be. However there's no clear action plan for achieving that at this
> time (that I am aware of), and since that's sensitive and somewhat
> complex code I thought it would be nice to at least discuss general
> guidelines of how to go about this.
> 
> It hasn't been a priority so far because in the past we've only seen
> this when our agent injected an UD (which gets emulated with upstream
> Xen, but is re-executed with the help of the Monitor Trap Flag in
> XenServer). We could do this because that can only happen for execute
> faults (on pages marked rw-), and for those faults we don't emulate but
> run the instruction on hardware.
> 
> Also, in the UD case, we had more control, as we were explicitly calling
> hvm_inject_event(&ctx.ctxt.event); in hvm_emulate_one_vm_event(). With
> this backtrace, however, hvm_inject_page_fault() gets called further
> down the line by hvm_emulate_one() and we can't control when or if that
> happens.

Well, as said - interrupt injection has to happen after exception
injection, and only if there was no exception. Hence it doesn't really
matter how deep in the call tree the exception injection sits.
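
To make the ordering rule above concrete, here is a minimal C sketch. It
is not Xen code: struct pending_events, struct injection and
choose_injection() are hypothetical names invented for this illustration.
It only models the decision taken once all exception sources for the
current instruction have run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one vCPU; these are not Xen structures. */
enum event_kind { EVENT_NONE, EVENT_EXCEPTION, EVENT_INTERRUPT };

struct pending_events {
    bool exception_pending;   /* raised while handling the current insn */
    uint8_t exception_vector;
    bool interrupt_pending;   /* external interrupt awaiting delivery */
    uint8_t interrupt_vector;
};

struct injection {
    enum event_kind kind;
    uint8_t vector;
};

/*
 * Pick the event to inject at the instruction boundary.  An exception
 * always takes precedence; an interrupt is considered only when the
 * instruction raised no exception.  The interrupt stays pending in the
 * caller's state if it is not chosen, so it is never lost.
 */
static struct injection choose_injection(const struct pending_events *p)
{
    struct injection inj = { EVENT_NONE, 0 };

    assert(p != NULL);

    if (p->exception_pending) {
        inj.kind = EVENT_EXCEPTION;
        inj.vector = p->exception_vector;
    } else if (p->interrupt_pending) {
        inj.kind = EVENT_INTERRUPT;
        inj.vector = p->interrupt_vector;
    }

    return inj;
}
```

The point of the sketch is that the choice is made in one place, after
emulation, so it does not matter how deep in the call tree an exception
was raised.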

> Long story short, should we approach this or are there other plans for
> this to be fixed?

I for one have no plans (or time) for this. You may want to see
whether Andrew has intentions here, but I guess he'll be quite
busy with other stuff for the foreseeable future.

> If we should approach this, any pointers / suggestions
> with regard to the current Xen code and most desirable design approach?

Well, first of all get a clear understanding of what violations there
are in current code. Then it may become more clear whether some
simple re-ordering might already do the job, or whether more
intrusive changes are going to be needed.

Jan




* Re: Emulation and active (valid) interrupts
  2018-08-09  8:35     ` Jan Beulich
@ 2018-08-13 11:55       ` Razvan Cojocaru
  2018-08-13 12:22         ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-13 11:55 UTC (permalink / raw)
  To: Jan Beulich, Tamas K Lengyel; +Cc: Andrew Cooper, Xen-devel

On 8/9/18 11:35 AM, Jan Beulich wrote:
>>>> On 09.08.18 at 10:20, <rcojocaru@bitdefender.com> wrote:
>> On 8/9/18 10:54 AM, Jan Beulich wrote:
>>>>>> On 08.08.18 at 16:26, <rcojocaru@bitdefender.com> wrote:
>>>> 1. Is it possible to already have a valid interrupt written in
>>>> VM_ENTRY_INTR_INFO at EXIT_REASON_EPT_VIOLATION-time in
>>>> vmx_vmexit_handler()?
>>>
>>> You mean right after the exit? Where would that come from? I'm
>>> afraid I don't see the connection to your issue (or the call traces
>>> you've provided).
>>
>> I mean right before the exit
> 
> Before? Iirc the CPU doesn't itself write VM_ENTRY_* fields,
> other than to clear them (presumably during VM exit processing).

I've dumped the backtraces of all places that
__vmwrite(VM_ENTRY_INTR_INFO, ...), and it appears that the last place
that does that before a domain crash caused by invalid guest state is
vmx_idtv_reinject(), which in my Xen 4.7.5 sources is called in
vmx_vmexit_handler(), and regardless of exit_reason.

I've reproduced this most easily with Tamas' old test:
https://lists.xen.org/archives/html/xen-devel/2016-01/msg00285.html

RFLAGS.IF is 0 there, but with a valid interrupt as well. Here's my
latest log:

Xen call trace:
   [<ffff82d0802027ec>] vmx_vmexit_handler+0x68a/0x1bf7
   [<ffff82d080208a9a>] vmx_asm_vmexit_handler+0xfa/0x260

Xen call trace:
   [<ffff82d0802027ec>] vmx_vmexit_handler+0x68a/0x1bf7
   [<ffff82d080208a9a>] vmx_asm_vmexit_handler+0xfa/0x260

Xen call trace:
   [<ffff82d0802027ec>] vmx_vmexit_handler+0x68a/0x1bf7
   [<ffff82d080208a9a>] vmx_asm_vmexit_handler+0xfa/0x260

Xen call trace:
   [<ffff82d0802027ec>] vmx_vmexit_handler+0x68a/0x1bf7
   [<ffff82d080208a9a>] vmx_asm_vmexit_handler+0xfa/0x260

Xen call trace:
   [<ffff82d0802027ec>] vmx_vmexit_handler+0x68a/0x1bf7
   [<ffff82d080208a9a>] vmx_asm_vmexit_handler+0xfa/0x260

Xen call trace:
   [<ffff82d0802027ec>] vmx_vmexit_handler+0x68a/0x1bf7
   [<ffff82d080208a9a>] vmx_asm_vmexit_handler+0xfa/0x260

Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
************* VMCS Area **************
*** Guest State ***
CR0: actual=0x000000008001003b, shadow=0x000000008001003b, gh_mask=ffffffffffffffff
CR4: actual=0x00000000000426f9, shadow=0x00000000000406f9, gh_mask=ffffffffffffffff
CR3 = 0x0000000000185000
PDPTE0 = 0x0000000000186001  PDPTE1 = 0x0000000000187001
PDPTE2 = 0x0000000000188001  PDPTE3 = 0x0000000000189001
RSP = 0x000000008078ad10 (0x000000008078ad10)  RIP = 0x00000000826c1781 (0x00000000826c1781)
RFLAGS=0x00000046 (0x00000046)  DR7 = 0x0000000000000400
Sysenter RSP=000000008078b000 CS:RIP=0008:00000000826880c0
       sel  attr  limit   base
  CS: 0008 0c09b ffffffff 0000000000000000
  DS: 0023 0c0f3 ffffffff 0000000000000000
  SS: 0010 0c093 ffffffff 0000000000000000
  ES: 0023 0c0f3 ffffffff 0000000000000000
  FS: 0030 04093 00003748 0000000082775c00
  GS: 0000 1c000 ffffffff 0000000000000000
GDTR:            000003ff 0000000080b95000
LDTR: 0000 1c000 ffffffff 0000000000000000
IDTR:            000007ff 0000000080b95400
  TR: 0028 0008b 000020ab 00000000801da000
EFER = 0x0000000000000000  PAT = 0x0007010600070106
PreemptionTimer = 0x00000000  SM Base = 0x00000000
DebugCtl = 0x0000000000000000  DebugExceptions = 0x0000000000000000
PerfGlobCtl = 0x0000000000000000  BndCfgS = 0x0000000000000000
Interruptibility = 00000000  ActivityState = 00000000
*** Host State ***
RIP = 0xffff82d0802089a0 (vmx_asm_vmexit_handler)  RSP = 0xffff830c5a537f70
CS=e008 SS=0000 DS=0000 ES=0000 FS=0000 GS=0000 TR=e040
FSBase=0000000000000000 GSBase=0000000000000000 TRBase=ffff830c5a53ec80
GDTBase=ffff830c5a52f000 IDTBase=ffff830c5a53b000
CR0=0000000080050033 CR3=0000000b0a110000 CR4=00000000003526e0
Sysenter RSP=ffff830c5a537fa0 CS:RIP=e008:ffff82d0802509c0
EFER = 0x0000000000000000  PAT = 0x0000050100070406
*** Control State ***
PinBased=0000003f CPUBased=bea065fa SecondaryExec=001054eb
EntryControls=000151ff ExitControls=008fefff
ExceptionBitmap=00060002 PFECmask=00000000 PFECmatch=00000000
VMEntry: intr_info=800000d1 errcode=00000000 ilen=00000000
VMExit: intr_info=00000000 errcode=00000000 ilen=00000003
        reason=80000021 qualification=0000000000000000
IDTVectoring: info=800000d1 errcode=00000000
TSC Offset = 0xffdba7f7b150188c  TSC Multiplier = 0x0000000000000000
TPR Threshold = 0x00  PostedIntrVec = 0x00
EPT pointer = 0x0000000b0a02e01e  EPTP index = 0x0000
PLE Gap=00000080 Window=00001000
Virtual processor ID = 0x1adb VMfunc controls = 0000000000000000
**************************************
domain_crash called from vmx.c:3388
Domain 1 (vcpu#0) crashed on cpu#1:
----[ Xen-4.7.5  x86_64  debug=y  Not tainted ]----
CPU:    1
RIP:    0008:[<00000000826c1781>]
RFLAGS: 0000000000000046   CONTEXT: hvm guest (d1v0)
rax: 000000008078ad4c   rbx: 000000008078ad4c   rcx: 000000008e9b6ed0
rdx: 0000000000000000   rsi: 000000008078ad80   rdi: 0000000085ba3d48
rbp: 000000008078ad34   rsp: 000000008078ad10   r8:  0000000000000000
r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
r15: 0000000000000000   cr0: 000000008001003b   cr4: 00000000000406f9
cr3: 0000000000185000   cr2: 0000000093d5e800
fsb: 0000000082775c00   gsb: 0000000000000000   gss: 0000000000000002
ds: 0023   es: 0023   fs: 0030   gs: 0000   ss: 0010   cs: 0008
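
For readers who want to decode the intr_info values in the dump above,
here is a small standalone C sketch. The bit layout follows the Intel SDM
description of the VM-entry interruption-information field. Only
INTR_INFO_VALID_MASK is named in this thread; the other macro names, the
struct and both functions are defined locally for illustration and are
assumptions, not a copy of the Xen headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local definitions; layout per the Intel SDM (assumed names). */
#define INTR_INFO_VECTOR_MASK       0x000000ffu  /* bits 7:0   */
#define INTR_INFO_INTR_TYPE_MASK    0x00000700u  /* bits 10:8  */
#define INTR_INFO_DELIVER_CODE_MASK 0x00000800u  /* bit 11     */
#define INTR_INFO_VALID_MASK        0x80000000u  /* bit 31     */

struct intr_info_fields {
    bool valid;
    unsigned int type;
    unsigned int vector;
    bool has_error_code;
};

/* Split a raw interruption-information value into its fields. */
static struct intr_info_fields decode_intr_info(uint32_t intr_info)
{
    struct intr_info_fields f;

    f.valid = (intr_info & INTR_INFO_VALID_MASK) != 0;
    f.type = (intr_info & INTR_INFO_INTR_TYPE_MASK) >> 8;
    f.vector = intr_info & INTR_INFO_VECTOR_MASK;
    f.has_error_code = (intr_info & INTR_INFO_DELIVER_CODE_MASK) != 0;

    return f;
}

/* Human-readable name for the 3-bit interruption type. */
static const char *intr_type_name(unsigned int type)
{
    static const char *const names[8] = {
        "external interrupt", "reserved", "NMI", "hardware exception",
        "software interrupt", "privileged software exception",
        "software exception", "other event",
    };

    assert(type < 8);
    return names[type];
}
```

Applied to the value 0x800000d1 from the dump, this yields a valid
external interrupt with vector 0xd1 and no error code.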


Thanks,
Razvan


* Re: Emulation and active (valid) interrupts
  2018-08-13 11:55       ` Razvan Cojocaru
@ 2018-08-13 12:22         ` Jan Beulich
  2018-08-13 12:51           ` Razvan Cojocaru
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-08-13 12:22 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

>>> On 13.08.18 at 13:55, <rcojocaru@bitdefender.com> wrote:
> On 8/9/18 11:35 AM, Jan Beulich wrote:
>>>>> On 09.08.18 at 10:20, <rcojocaru@bitdefender.com> wrote:
>>> On 8/9/18 10:54 AM, Jan Beulich wrote:
>>>>>>> On 08.08.18 at 16:26, <rcojocaru@bitdefender.com> wrote:
>>>>> 1. Is it possible to already have a valid interrupt written in
>>>>> VM_ENTRY_INTR_INFO at EXIT_REASON_EPT_VIOLATION-time in
>>>>> vmx_vmexit_handler()?
>>>>
>>>> You mean right after the exit? Where would that come from? I'm
>>>> afraid I don't see the connection to your issue (or the call traces
>>>> you've provided).
>>>
>>> I mean right before the exit
>> 
>> Before? Iirc the CPU doesn't itself write VM_ENTRY_* fields,
>> other than to clear them (presumably during VM exit processing).
> 
> I've dumped the backtraces of all places that
> __vmwrite(VM_ENTRY_INTR_INFO, ...), and it appears that the last place
> that does that before a domain crash caused by invalid guest state is
> vmx_idtv_reinject(), which in my Xen 4.7.5 sources is called in
> vmx_vmexit_handler(), and regardless of exit_reason.

That's indeed right after the exit, but aiui no other interrupt / exception
can legitimately be raised in that situation. In fact another exception
ought to combine to #DF, unless it's a benign one that can be squashed.
But just like for instructions and their boundaries, no unrelated interrupt
can be recognized while delivering an exception/interrupt. The next
possible checking point is the first insn of the respective handler.
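
The #DF combination rule mentioned above can be sketched in C as follows.
This is a simplified model of the benign / contributory / page-fault
classification from the Intel SDM, not Xen's implementation; classify()
and combine() are hypothetical names. It deliberately leaves out #VE and
the case of a further fault during #DF delivery (which leads to shutdown).

```c
#include <assert.h>

#define VEC_DF 8u   /* double fault */

enum exc_class { EXC_BENIGN, EXC_CONTRIBUTORY, EXC_PAGE_FAULT };

/* Classify an exception vector (0..31) for double-fault purposes. */
static enum exc_class classify(unsigned int vector)
{
    assert(vector < 32);

    switch (vector) {
    case 0:   /* #DE */
    case 10:  /* #TS */
    case 11:  /* #NP */
    case 12:  /* #SS */
    case 13:  /* #GP */
        return EXC_CONTRIBUTORY;
    case 14:  /* #PF */
        return EXC_PAGE_FAULT;
    default:
        return EXC_BENIGN;
    }
}

/*
 * Return the vector that ends up being delivered when `second` is raised
 * while `first` is still being delivered.
 */
static unsigned int combine(unsigned int first, unsigned int second)
{
    enum exc_class c1 = classify(first);
    enum exc_class c2 = classify(second);

    /* contributory followed by contributory: double fault */
    if (c1 == EXC_CONTRIBUTORY && c2 == EXC_CONTRIBUTORY)
        return VEC_DF;

    /* page fault followed by contributory or another page fault */
    if (c1 == EXC_PAGE_FAULT && c2 != EXC_BENIGN)
        return VEC_DF;

    /* everything else is handled serially: the second event wins */
    return second;
}
```

For example, a #GP raised while delivering a #PF combines to #DF, whereas
a #PF raised while delivering a #GP is simply delivered as a #PF.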

Jan




* Emulation and active (valid) interrupts
  2018-08-13 12:22         ` Jan Beulich
@ 2018-08-13 12:51           ` Razvan Cojocaru
  2018-08-13 12:58             ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-13 12:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

On 8/13/18 3:22 PM, Jan Beulich wrote:
>>>> On 13.08.18 at 13:55, <rcojocaru@bitdefender.com> wrote:
>> On 8/9/18 11:35 AM, Jan Beulich wrote:
>>>>>> On 09.08.18 at 10:20, <rcojocaru@bitdefender.com> wrote:
>>>> On 8/9/18 10:54 AM, Jan Beulich wrote:
>>>>>>>> On 08.08.18 at 16:26, <rcojocaru@bitdefender.com> wrote:
>>>>>> 1. Is it possible to already have a valid interrupt written in
>>>>>> VM_ENTRY_INTR_INFO at EXIT_REASON_EPT_VIOLATION-time in
>>>>>> vmx_vmexit_handler()?
>>>>>
>>>>> You mean right after the exit? Where would that come from? I'm
>>>>> afraid I don't see the connection to your issue (or the call traces
>>>>> you've provided).
>>>>
>>>> I mean right before the exit
>>>
>>> Before? Iirc the CPU doesn't itself write VM_ENTRY_* fields,
>>> other than to clear them (presumably during VM exit processing).
>>
>> I've dumped the backtraces of all places that
>> __vmwrite(VM_ENTRY_INTR_INFO, ...), and it appears that the last place
>> that does that before a domain crash caused by invalid guest state is
>> vmx_idtv_reinject(), which in my Xen 4.7.5 sources is called in
>> vmx_vmexit_handler(), and regardless of exit_reason.
> 
> That's indeed right after the exit, but aiui no other interrupt / exception
> can legitimately be raised in that situation. In fact another exception
> ought to combine to #DF, unless it's a benign one that can be squashed.
> But just like for instructions and their boundaries, no unrelated interrupt
> can be recognized while delivering an exception/interrupt. The next
> possible checking point is the first insn of the respective handler.

It also seems to be closely related to a CLI in the area:

(XEN) [  217.984301] Xen call trace:
(XEN) [  217.984302]    [<ffff82d0802027fc>] vmx_vmexit_handler+0x68a/0x1bf7
(XEN) [  217.984304]    [<ffff82d080208aaa>] vmx_asm_vmexit_handler+0xfa/0x260
(XEN) [  217.984305]
(XEN) [  217.984754] d2v0 32bit @ 0008:826c0e1b -> fa f6 83 54 1a 00 00 3f 74 13 b1 02 ff 15 a0 a0
(XEN) [  217.984757] Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
(XEN) [  217.984758] ************* VMCS Area **************

I believe the following patch prints out the instruction that has been
emulated (I hope it's not the one immediately after it):

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 194d48e..f017120 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1966,6 +1966,7 @@ int hvm_mem_access_emulate_one(enum emul_kind kind, unsigned int trapnr,
         /* Intentional fall-through. */
     default:
         rc = hvm_emulate_one(&ctx);
+        hvm_dump_emulation_state(XENLOG_G_DEBUG, &ctx);
     }

     switch ( rc )

So first we've got that vmx_idtv_reinject() call writing to the VMCS,
then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
ran first and then an interrupt popped up, or if an interrupt had
already been __vmwrit()ten and then CLI caused the invalid guest state.


Thanks,
Razvan


* Re: Emulation and active (valid) interrupts
  2018-08-13 12:51           ` Razvan Cojocaru
@ 2018-08-13 12:58             ` Jan Beulich
  2018-08-13 13:19               ` Razvan Cojocaru
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-08-13 12:58 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

>>> On 13.08.18 at 14:51, <rcojocaru@bitdefender.com> wrote:
> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
> ran first and then an interrupt popped up, or if an interrupt had
> already been __vmwrit()ten and then CLI caused the invalid guest state.

I'd expect it to be the latter - an external interrupt presumably
can't be injected when EFLAGS.IF is clear. Why are we emulating
CLI in the first place? With a pending external interrupt, shouldn't
we just exit back to guest context without emulating anything?
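
As a rough illustration of the EFLAGS.IF constraint, the C sketch below
models one of the guest-state checks the Intel SDM lists for VM entry:
RFLAGS.IF must be 1 when the VM-entry interruption-information field is
valid and its type is "external interrupt". The function name and the
macros are local, hypothetical definitions, and other checks (for
example on the interruptibility state) are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define INTR_INFO_VALID_MASK      0x80000000u  /* bit 31 */
#define INTR_INFO_INTR_TYPE_MASK  0x00000700u  /* bits 10:8 */
#define INTR_TYPE_EXT_INTR        0u           /* external interrupt */
#define X86_EFLAGS_IF             0x00000200u  /* bit 9 */

/*
 * Return false when the combination of a pending external-interrupt
 * injection and a clear RFLAGS.IF would fail this VM-entry check.
 */
static bool extint_entry_check_ok(uint32_t intr_info, uint64_t rflags)
{
    unsigned int type = (intr_info & INTR_INFO_INTR_TYPE_MASK) >> 8;

    /* Nothing to inject: the check does not apply. */
    if (!(intr_info & INTR_INFO_VALID_MASK))
        return true;

    /* Exceptions, NMIs etc. are not gated on RFLAGS.IF. */
    if (type != INTR_TYPE_EXT_INTR)
        return true;

    return (rflags & X86_EFLAGS_IF) != 0;
}
```

With the values from the earlier VMCS dump (intr_info=800000d1,
RFLAGS=0x00000046) this check fails, which would be consistent with a
CLI having been emulated after the interrupt was already written.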

Jan




* Re: Emulation and active (valid) interrupts
  2018-08-13 12:58             ` Jan Beulich
@ 2018-08-13 13:19               ` Razvan Cojocaru
  2018-08-13 13:38                 ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-13 13:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

On 8/13/18 3:58 PM, Jan Beulich wrote:
>>>> On 13.08.18 at 14:51, <rcojocaru@bitdefender.com> wrote:
>> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
>> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
>> ran first and then an interrupt popped up, or if an interrupt had
>> already been __vmwrit()ten and then CLI caused the invalid guest state.
> 
> I'd expect it to be the latter - an external interrupt presumably
> can't be injected when EFLAGS.IF is clear. Why are we emulating
> CLI in the first place? With a pending external interrupt, shouldn't
> we just exit back to guest context without emulating anything?

In this particular case we're emulating CLI because the vm_event
response requests it.

Tamas' test marks all of the guest's pages XENMEM_access_x, and at some
point a vm_event arrives somewhere in a page where CLI is read from,
AFAICT. Doing nothing would get us into an infinite loop, and since we
don't want to mark the page rwx, we try to emulate CLI.


Thanks,
Razvan


* Re: Emulation and active (valid) interrupts
  2018-08-13 13:19               ` Razvan Cojocaru
@ 2018-08-13 13:38                 ` Jan Beulich
  2018-08-13 13:44                   ` Razvan Cojocaru
  2018-08-13 13:45                   ` Razvan Cojocaru
  0 siblings, 2 replies; 15+ messages in thread
From: Jan Beulich @ 2018-08-13 13:38 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

>>> On 13.08.18 at 15:19, <rcojocaru@bitdefender.com> wrote:
> On 8/13/18 3:58 PM, Jan Beulich wrote:
>>>>> On 13.08.18 at 14:51, <rcojocaru@bitdefender.com> wrote:
>>> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
>>> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
>>> ran first and then an interrupt popped up, or if an interrupt had
>>> already been __vmwrit()ten and then CLI caused the invalid guest state.
>> 
>> I'd expect it to be the latter - an external interrupt presumably
>> can't be injected when EFLAGS.IF is clear. Why are we emulating
>> CLI in the first place? With a pending external interrupt, shouldn't
>> we just exit back to guest context without emulating anything?
> 
> In this particular case we're emulating CLI because the vm_event
> response requests it.
> 
> Tamas' test marks all of the guest's pages XENMEM_access_x, and at some
> point a vm_event arrives somewhere in a page where CLI is read from,
> AFAICT. Doing nothing would get us into an infinite loop, and since we
> don't want to mark the page rwx, we try to emulate CLI.

Doing nothing would get you into an infinite loop only if at each
attempt there's yet again an event to be re-injected. Of course
the risk of this grows the longer it takes to process things in
your tool, but if there is an event to be re-injected then I don't
see what else you can do. Trying to ditch the event would
certainly be the wrong thing. I suggest you try to get advice
from the VMX maintainers - perhaps I'm simply overlooking an
obvious route out of the state you're apparently in.

Jan




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Emulation and active (valid) interrupts
  2018-08-13 13:38                 ` Jan Beulich
@ 2018-08-13 13:44                   ` Razvan Cojocaru
  2018-08-13 13:45                   ` Razvan Cojocaru
  1 sibling, 0 replies; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-13 13:44 UTC (permalink / raw)
  To: xen-devel

On 8/13/18 4:38 PM, Jan Beulich wrote:
>>>> On 13.08.18 at 15:19, <rcojocaru@bitdefender.com> wrote:
>> On 8/13/18 3:58 PM, Jan Beulich wrote:
>>>>>> On 13.08.18 at 14:51, <rcojocaru@bitdefender.com> wrote:
>>>> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
>>>> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
>>>> ran first and then an interrupt popped up, or if an interrupt had
>>>> already been __vmwrit()ten and then CLI caused the invalid guest state.
>>>
>>> I'd expect it to be the latter - an external interrupt presumably
>>> can't be injected when EFLAGS.IF is clear. Why are we emulating
>>> CLI in the first place? With a pending external interrupt, shouldn't
>>> we just exit back to guest context without emulating anything?
>>
>> In this particular case we're emulating CLI because the vm_event
>> response requests it.
>>
>> Tamas' test marks all of the guest's pages XENMEM_access_x, and at some
>> point a vm_event arrives somewhere in a page where CLI is read from,
>> AFAICT. Doing nothing would get us into an infinite loop, and since we
>> don't want to mark the page rwx, we try to emulate CLI.
> 
> Doing nothing would get you into an infinite loop only if at each
> attempt there's yet again an event to be re-injected. Of course
> the risk of this grows the longer it takes to process things in
> your tool, but if there is an event to be re-injected then I don't
> see what else you can do. Trying to ditch the event would
> certainly be the wrong thing. I suggest you try to get advice
> from the VMX maintainers - perhaps I'm simply overlooking an
> obvious route out of the state you're apparently in.

You're of course right. What I meant to say was that if we don't
emulate, don't mark the page rwx, and don't move RIP, we'll be in an
infinite loop: read-caused vm_event -> userspace tool gets event ->
does nothing, but responds to it -> guest resumes at the same RIP
(pointing, in this case, at CLI, but it could be anything) -> goto begin.

We need to do something to keep the guest going, and the generic way to
accomplish this is to ask Xen to emulate whatever instruction is at RIP
(because the Xen emulator, at least for the time being, ignores EPT
restrictions).
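
To make the loop concrete, here is a toy model of it. This is plain C,
not Xen code: the struct, the enum values and both function names are
made up for illustration, and the "emulator bypasses the restriction"
behaviour is simply assumed, as described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tool responses; these are not Xen's vm_event flags. */
enum tool_response {
    RESP_DO_NOTHING,   /* reply to the event, change nothing */
    RESP_EMULATE,      /* ask the hypervisor to emulate the insn at RIP */
    RESP_RELAX_ACCESS  /* lift the restriction on the page (e.g. rwx) */
};

/* Toy vCPU: just a RIP and whether the faulting access is permitted. */
struct toy_vcpu {
    uint64_t rip;
    bool access_allowed;
};

/*
 * One guest resume attempt. Returns true if the instruction at RIP
 * completed (RIP advanced), false if the guest is back at the same RIP.
 */
bool resume_once(struct toy_vcpu *v, enum tool_response r,
                 unsigned int insn_len)
{
    if ( v->access_allowed )
    {
        v->rip += insn_len;        /* no violation, insn runs natively */
        return true;
    }

    /* Violation: a vm_event is sent and the tool answers with r. */
    switch ( r )
    {
    case RESP_EMULATE:
        v->rip += insn_len;        /* assumed: emulator ignores the restriction */
        return true;
    case RESP_RELAX_ACCESS:
        v->access_allowed = true;  /* the next attempt will succeed */
        return false;
    case RESP_DO_NOTHING:
    default:
        return false;              /* same RIP, same violation: goto begin */
    }
}

/*
 * Number of resume attempts needed before RIP advances, or 0 if no
 * progress was made within 'limit' attempts.
 */
unsigned int attempts_until_progress(struct toy_vcpu *v, enum tool_response r,
                                     unsigned int insn_len, unsigned int limit)
{
    unsigned int n;

    for ( n = 1; n <= limit; n++ )
        if ( resume_once(v, r, insn_len) )
            return n;

    return 0;
}
```

With RESP_DO_NOTHING the model never leaves the starting RIP, whatever
the limit; with RESP_EMULATE it advances on the first attempt, which is
the behaviour we rely on.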


Thanks,
Razvan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Emulation and active (valid) interrupts
  2018-08-13 13:38                 ` Jan Beulich
  2018-08-13 13:44                   ` Razvan Cojocaru
@ 2018-08-13 13:45                   ` Razvan Cojocaru
  2018-08-13 19:17                     ` Razvan Cojocaru
  1 sibling, 1 reply; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-13 13:45 UTC (permalink / raw)
  To: Jan Beulich, Jun Nakajima, Kevin Tian
  Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

On 8/13/18 4:38 PM, Jan Beulich wrote:
>>>> On 13.08.18 at 15:19, <rcojocaru@bitdefender.com> wrote:
>> On 8/13/18 3:58 PM, Jan Beulich wrote:
>>>>>> On 13.08.18 at 14:51, <rcojocaru@bitdefender.com> wrote:
>>>> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
>>>> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
>>>> ran first and then an interrupt popped up, or if an interrupt had
>>>> already been __vmwrit()ten and then CLI caused the invalid guest state.
>>>
>>> I'd expect it to be the latter - an external interrupt presumably
>>> can't be injected when EFLAGS.IF is clear. Why are we emulating
>>> CLI in the first place? With a pending external interrupt, shouldn't
>>> we just exit back to guest context without emulating anything?
>>
>> In this particular case we're emulating CLI because the vm_event
>> response requests it.
>>
>> Tamas' test marks all of the guest's pages XENMEM_access_x, and at some
>> point a vm_event arrives somewhere in a page where CLI is read from,
>> AFAICT. Doing nothing would get us into an infinite loop, and since we
>> don't want to mark the page rwx, we try to emulate CLI.
> 
> Doing nothing would get you into an infinite loop only if at each
> attempt there's yet again an event to be re-injected. Of course
> the risk of this grows the longer it takes to process things in
> your tool, but if there is an event to be re-injected then I don't
> see what else you can do. Trying to ditch the event would
> certainly be the wrong thing. I suggest you try to get advice
> from the VMX maintainers - perhaps I'm simply overlooking an
> obvious route out of the state you're apparently in.

[Missed hitting "Reply all" - sorry, and re-sent. Also, added Jun and
Kevin to the conversation.]

You're of course right. What I meant to say was that if we don't
emulate, don't mark the page rwx, and don't move RIP, we'll be in an
infinite loop: read-caused vm_event -> userspace tool gets event ->
does nothing, but responds to it -> guest resumes at the same RIP
(pointing, in this case, at CLI, but it could be anything) -> goto begin.

We need to do something to keep the guest going, and the generic way to
accomplish this is to ask Xen to emulate whatever instruction is at RIP
(because the Xen emulator, at least for the time being, ignores EPT
restrictions).


Thanks,
Razvan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Emulation and active (valid) interrupts
  2018-08-13 13:45                   ` Razvan Cojocaru
@ 2018-08-13 19:17                     ` Razvan Cojocaru
  2018-08-14  7:19                       ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Razvan Cojocaru @ 2018-08-13 19:17 UTC (permalink / raw)
  To: Jan Beulich, Jun Nakajima, Kevin Tian
  Cc: Andrew Cooper, Tamas K Lengyel, Xen-devel

On 8/13/18 4:45 PM, Razvan Cojocaru wrote:
> On 8/13/18 4:38 PM, Jan Beulich wrote:
>>>>> On 13.08.18 at 15:19, <rcojocaru@bitdefender.com> wrote:
>>> On 8/13/18 3:58 PM, Jan Beulich wrote:
>>>>>>> On 13.08.18 at 14:51, <rcojocaru@bitdefender.com> wrote:
>>>>> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
>>>>> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
>>>>> ran first and then an interrupt popped up, or if an interrupt had
>>>>> already been __vmwrit()ten and then CLI caused the invalid guest state.
>>>>
>>>> I'd expect it to be the latter - an external interrupt presumably
>>>> can't be injected when EFLAGS.IF is clear. Why are we emulating
>>>> CLI in the first place? With a pending external interrupt, shouldn't
>>>> we just exit back to guest context without emulating anything?
>>>
>>> In this particular case we're emulating CLI because the vm_event
>>> response requests it.
>>>
>>> Tamas' test marks all of the guest's pages XENMEM_access_x, and at some
>>> point a vm_event arrives somewhere in a page where CLI is read from,
>>> AFAICT. Doing nothing would get us into an infinite loop, and since we
>>> don't want to mark the page rwx, we try to emulate CLI.
>>
>> Doing nothing would get you into an infinite loop only if at each
>> attempt there's yet again an event to be re-injected. Of course
>> the risk of this grows the longer it takes to process things in
>> your tool, but if there is an event to be re-injected then I don't
>> see what else you can do. Trying to ditch the event would
>> certainly be the wrong thing. I suggest you try to get advice
>> from the VMX maintainers - perhaps I'm simply overlooking an
>> obvious route out of the state you're apparently in.
> 
> [Missed hitting "Reply all" - sorry, and re-sent. Also, added Jun and
> Kevin to the conversation.]
> 
> You're of course right, what I meant to say was that if we don't
> emulate, don't mark the page rwx, and don't move RIP we'll be in an
> infinite loop of read-caused vm_event -> userspace tool gets event ->
> does nothing, but responds to it -> guest resumes at the same RIP
> (pointing, in this case at CLI, but it could be anything) -> goto begin.
> 
> We need to do something to keep the guest going, and the generic way to
> accomplish this is to ask Xen to emulate whatever instruction is at RIP
> (because the Xen emulator, at least for the time being, ignores EPT
> restrictions).

On top of everything, there's also a basic design problem with the way
the code is written now:

1. The "inject events" code seems to be advertised as living in intr.c -
but here's an exception to the rule with vmx_idtv_reinject() living in
vmx.c.

2. The single-step code implies that once vmx_intr_assist() returns,
event injection is blocked:

    /* Block event injection when single step with MTF. */
    if ( unlikely(v->arch.hvm_vcpu.single_step) )
    {
        v->arch.hvm_vmx.exec_control |= CPU_BASED_MONITOR_TRAP_FLAG;
        vmx_update_cpu_exec_control(v);
        return;
    }

an assumption that has turned out to be false.

3. Obviously the idea of injecting something just before taking an exit,
for example caused by EXIT_REASON_EPT_VIOLATION is not natural to us.
Furthermore, the way the code is written now, _first_ we call
vmx_idtv_reinject() and only then do we handle EXIT_REASON_EPT_VIOLATION
(which is the only place I can find out if a vm_event needs to be sent
out, and so block injections until it is handled, in the fashion of
single-stepping). I believe that this is the reason why my previous
simple patch was "not working" - I was only blocking injections in
vmx_intr_assist().

All these observations, assuming you agree with me, would IMHO imply
that, if possible, we should move that code to intr.c /
vmx_intr_assist(), or at the very least have a single point where we can
say "block injections".
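
As a rough sketch of what I mean by a single point (a standalone model,
not a patch: the struct, its fields and all function names below are
hypothetical and do not match Xen's actual data structures), both the
injection path and the re-injection path would ask the same predicate.
Note that the model only defers events and never drops them; whether
deferring a re-injection is even legitimate is a separate question.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU flags relevant to the decision. */
struct toy_inject_state {
    bool single_step;       /* single-stepping with MTF is active */
    bool vm_event_pending;  /* a vm_event reply is still outstanding */
};

enum toy_inject_result {
    TOY_INJ_NONE,      /* nothing to deliver */
    TOY_INJ_DONE,      /* event written for delivery on vmentry */
    TOY_INJ_DEFERRED   /* event kept pending, delivery postponed */
};

/* The one place that decides whether events may be delivered now. */
bool toy_injection_blocked(const struct toy_inject_state *s)
{
    return s->single_step || s->vm_event_pending;
}

/* Model of the "new event" path (the role vmx_intr_assist() plays). */
enum toy_inject_result toy_intr_assist(const struct toy_inject_state *s,
                                       bool have_new_event)
{
    if ( !have_new_event )
        return TOY_INJ_NONE;

    return toy_injection_blocked(s) ? TOY_INJ_DEFERRED : TOY_INJ_DONE;
}

/* Model of the "re-inject" path (the role vmx_idtv_reinject() plays). */
enum toy_inject_result toy_idtv_reinject(const struct toy_inject_state *s,
                                         bool idtv_info_valid)
{
    if ( !idtv_info_valid )
        return TOY_INJ_NONE;

    return toy_injection_blocked(s) ? TOY_INJ_DEFERRED : TOY_INJ_DONE;
}
```

The point of the model is only that the two paths cannot disagree: if
one of them considers injection blocked, so does the other.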


Thanks,
Razvan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Emulation and active (valid) interrupts
  2018-08-13 19:17                     ` Razvan Cojocaru
@ 2018-08-14  7:19                       ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-08-14  7:19 UTC (permalink / raw)
  To: Razvan Cojocaru
  Cc: Andrew Cooper, Kevin Tian, Tamas K Lengyel, Jun Nakajima, Xen-devel

>>> On 13.08.18 at 21:17, <rcojocaru@bitdefender.com> wrote:
> On 8/13/18 4:45 PM, Razvan Cojocaru wrote:
>> On 8/13/18 4:38 PM, Jan Beulich wrote:
>>>>>> On 13.08.18 at 15:19, <rcojocaru@bitdefender.com> wrote:
>>>> On 8/13/18 3:58 PM, Jan Beulich wrote:
>>>>>>>> On 13.08.18 at 14:51, <rcojocaru@bitdefender.com> wrote:
>>>>>> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
>>>>>> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
>>>>>> ran first and then an interrupt popped up, or if an interrupt had
>>>>>> already been __vmwrit()ten and then CLI caused the invalid guest state.
>>>>>
>>>>> I'd expect it to be the latter - an external interrupt presumably
>>>>> can't be injected when EFLAGS.IF is clear. Why are we emulating
>>>>> CLI in the first place? With a pending external interrupt, shouldn't
>>>>> we just exit back to guest context without emulating anything?
>>>>
>>>> In this particular case we're emulating CLI because the vm_event
>>>> response requests it.
>>>>
>>>> Tamas' test marks all of the guest's pages XENMEM_access_x, and at some
>>>> point a vm_event arrives somewhere in a page where CLI is read from,
>>>> AFAICT. Doing nothing would get us into an infinite loop, and since we
>>>> don't want to mark the page rwx, we try to emulate CLI.
>>>
>>> Doing nothing would get you into an infinite loop only if at each
>>> attempt there's yet again an event to be re-injected. Of course
>>> the risk of this grows the longer it takes to process things in
>>> your tool, but if there is an event to be re-injected then I don't
>>> see what else you can do. Trying to ditch the event would
>>> certainly be the wrong thing. I suggest you try to get advice
>>> from the VMX maintainers - perhaps I'm simply overlooking an
>>> obvious route out of the state you're apparently in.
>> 
>> [Missed hitting "Reply all" - sorry, and re-sent. Also, added Jun and
>> Kevin to the conversation.]
>> 
>> You're of course right, what I meant to say was that if we don't
>> emulate, don't mark the page rwx, and don't move RIP we'll be in an
>> infinite loop of read-caused vm_event -> userspace tool gets event ->
>> does nothing, but responds to it -> guest resumes at the same RIP
>> (pointing, in this case at CLI, but it could be anything) -> goto begin.
>> 
>> We need to do something to keep the guest going, and the generic way to
>> accomplish this is to ask Xen to emulate whatever instruction is at RIP
>> (because the Xen emulator, at least for the time being, ignores EPT
>> restrictions).
> 
> On top of everything, there's also a basic design problem: the way the
> code is written now:
> 
> 1. The "inject events" code seems to be advertised as living in intr.c -
> but here's an exception to the rule with vmx_idtv_reinject() living in
> vmx.c.

But "re-inject" != "inject".

> 2. The single-step code implies that once we have vmx_intr_assist()
> return, event injection is blocked:
> 
>     /* Block event injection when single step with MTF. */
>     if ( unlikely(v->arch.hvm_vcpu.single_step) )
>     {
>         v->arch.hvm_vmx.exec_control |= CPU_BASED_MONITOR_TRAP_FLAG;
>         vmx_update_cpu_exec_control(v);
>         return;
>     }
> 
> an assumption that has turned out to be false.

As far as I'm aware it is well known that MTF handling isn't the
greatest.

> 3. Obviously the idea of injecting something just before taking an exit,
> for example caused by EXIT_REASON_EPT_VIOLATION is not natural to us.

Indeed, yet so far I've not seen a summary of all the conditions
under which you see this happening. Remember that an EPT
violation with valid IDT vectoring information means the violation
has occurred _while_ delivering an event. It is my understanding
that this can only occur if the EPT violation happens for an IDT,
GDT, TSS, or stack access. In particular the instruction pointed
at does not matter here at all.

I therefore wonder whether either you're removing permissions
too aggressively, or whether the state information passed to the
tools side handling code of the VM event is insufficient to
recognize that instruction emulation must not be attempted, and
instead actions need to be taken to make it possible for the
pending event to be delivered without incurring another EPT
violation. Note that single stepping is as little of an option as
insn emulation in this case. If anything, the event delivery
would need emulating (for which there is no code at all in the
hypervisor, iirc).
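
For illustration, a minimal sketch of the check the handling code would
need. The bit layout (vector in bits 7:0, type in bits 10:8, valid in
bit 31) is that of the IDT-vectoring information field as I understand
the SDM; the macro and function names are made up for this example and
are not the hypervisor's own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative accessors for the IDT-vectoring information field. */
#define TOY_IDTV_VECTOR(x)  ((uint32_t)(x) & 0xffu)         /* bits 7:0  */
#define TOY_IDTV_TYPE(x)    (((uint32_t)(x) >> 8) & 0x7u)   /* bits 10:8 */
#define TOY_IDTV_VALID(x)   (((uint32_t)(x) >> 31) & 0x1u)  /* bit 31    */

/* Event types encoded in bits 10:8 (subset). */
#define TOY_TYPE_EXT_INTR      0u
#define TOY_TYPE_NMI           2u
#define TOY_TYPE_HW_EXCEPTION  3u

/*
 * True if the VM exit (e.g. an EPT violation) interrupted the delivery
 * of an event, i.e. the event still has to be re-injected.
 */
bool toy_exit_during_event_delivery(uint32_t idt_vectoring_info)
{
    return TOY_IDTV_VALID(idt_vectoring_info) != 0;
}

/*
 * In that situation the instruction at RIP is irrelevant, so neither
 * emulating it nor single-stepping over it is appropriate.
 */
bool toy_may_emulate_insn_at_rip(uint32_t idt_vectoring_info)
{
    return !toy_exit_during_event_delivery(idt_vectoring_info);
}
```

E.g. a value of 0x80000030 (valid, external interrupt, vector 0x30)
says that interrupt 0x30 was being delivered when the violation
occurred, so emulating whatever RIP points at would be wrong.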

> Furthermore, the way the code is written now, _first_ we call
> vmx_idtv_reinject() and only then do we handle EXIT_REASON_EPT_VIOLATION

I'm afraid if this was done in the opposite order, nothing would
change for you: The to-be-re-injected event would still need
re-injecting, and hence you still couldn't inject an event of your
liking (or allow e.g. CLI to be emulated).
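
To spell out why CLI in particular is a problem, here is a small model
of just one of the VM-entry checks on guest state: RFLAGS.IF has to be
set when the VM-entry interruption-information field is valid and of
type "external interrupt". All names are illustrative, and the real
checks cover a lot more than this.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_EFLAGS_IF        0x200u        /* RFLAGS bit 9 */
#define TOY_INTR_VALID       0x80000000u   /* bit 31 of the info field */
#define TOY_INTR_TYPE(x)     (((uint32_t)(x) >> 8) & 0x7u)
#define TOY_TYPE_EXT_INTR    0u

/* Effect of (emulated) CLI on the flags: clear IF, leave the rest. */
uint64_t toy_emulate_cli(uint64_t rflags)
{
    return rflags & ~(uint64_t)TOY_EFLAGS_IF;
}

/*
 * Models the single check discussed here; returns false if VM entry
 * would fail because an external interrupt is queued while IF is clear.
 */
bool toy_vmentry_check_ok(uint64_t rflags, uint32_t vm_entry_intr_info)
{
    bool valid = (vm_entry_intr_info & TOY_INTR_VALID) != 0;

    if ( valid &&
         TOY_INTR_TYPE(vm_entry_intr_info) == TOY_TYPE_EXT_INTR &&
         !(rflags & TOY_EFLAGS_IF) )
        return false;

    return true;
}
```

With an external interrupt already queued, the check passes for RFLAGS
0x202 and fails once toy_emulate_cli() has been applied, regardless of
the order in which the two pieces of state were set up.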

Jan



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-08-14  7:19 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-08 14:26 Emulation and active (valid) interrupts Razvan Cojocaru
2018-08-08 15:01 ` Razvan Cojocaru
2018-08-09  7:54 ` Jan Beulich
2018-08-09  8:20   ` Razvan Cojocaru
2018-08-09  8:35     ` Jan Beulich
2018-08-13 11:55       ` Razvan Cojocaru
2018-08-13 12:22         ` Jan Beulich
2018-08-13 12:51           ` Razvan Cojocaru
2018-08-13 12:58             ` Jan Beulich
2018-08-13 13:19               ` Razvan Cojocaru
2018-08-13 13:38                 ` Jan Beulich
2018-08-13 13:44                   ` Razvan Cojocaru
2018-08-13 13:45                   ` Razvan Cojocaru
2018-08-13 19:17                     ` Razvan Cojocaru
2018-08-14  7:19                       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.