All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Tim Deegan <tim@xen.org>
Cc: Jan Beulich <JBeulich@suse.com>, Xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2 13/19] x86/shadow: Avoid raising faults behind the emulators back
Date: Mon, 28 Nov 2016 16:04:19 +0000	[thread overview]
Message-ID: <63a04bf2-5a6b-06c8-040f-c6a3939308c1@citrix.com> (raw)
In-Reply-To: <20161128144930.GG73967@deinos.phlegethon.org>

On 28/11/16 14:49, Tim Deegan wrote:
> Hi,
>
> At 11:13 +0000 on 28 Nov (1480331610), Andrew Cooper wrote:
>> Use x86_emul_{hw_exception,pagefault}() rather than
>> {pv,hvm}_inject_page_fault() and hvm_inject_hw_exception() to cause raised
>> faults to be known to the emulator.  This requires altering the callers of
>> x86_emulate() to properly re-inject the event.
>>
>> While fixing this, fix the singlestep behaviour.  Previously, an otherwise
>> successful emulation would fail if singlestepping was active, as the emulator
>> couldn't raise #DB.  This is unreasonable from the point of view of the guest.
>>
>> We therefore tolerate #PF/#GP/SS and #DB being raised by the emulator, but
>> reject anything else as unexpected.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> +    if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
>> +    {
>> +        /*
>> +         * This emulation covers writes to shadow pagetables.  We tolerate #PF
>> +         * (from hitting adjacent pages), #GP/#SS (from segmentation errors),
>> +         * and #DB (from singlestepping).  Anything else is an emulation bug,
>> +         * or a guest playing with the instruction stream under Xen's feet.
>> +         */
>> +        if ( emul_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
>> +             (emul_ctxt.ctxt.event.vector < 32) &&
>> +             ((1u << emul_ctxt.ctxt.event.vector) &
>> +              ((1u << TRAP_debug) | (1u << TRAP_stack_error) |
>> +               (1u << TRAP_gp_fault) | (1u << TRAP_page_fault))) )
>> +        {
>> +            if ( is_hvm_vcpu(v) )
>> +                hvm_inject_event(&emul_ctxt.ctxt.event);
>> +            else
>> +                pv_inject_event(&emul_ctxt.ctxt.event);
>> +        }
>> +        else
>> +        {
>> +            if ( is_hvm_vcpu(v) )
>> +                hvm_inject_hw_exception(TRAP_gp_fault, 0);
>> +            else
>> +                pv_inject_hw_exception(TRAP_gp_fault, 0);
>> +        }
> I don't think it's OK to lob #GP into a HVM guest here -- we can hit
> this path even if the guest isn't behaving strangely.

What circumstances are you thinking of?

Unless the guest is playing with the instruction stream,  event_pending
will only be set by the set in the paths altered by this patch, which is
the four whitelisted vectors.

> I think it would be better to run this check before the X86EMUL_UNHANDLEABLE one
> and convert injections that we choose not to handle into
> X86EMUL_UNHANDLEABLE.
>
> Which I guess brings us back full circle to the behaviour we had
> before, and perhaps the right change to the earlier patch is to
> start down this road, with an explanatory comment.
>
> e.g. start with something like this, immediately after the first call
> to x86_emulate():
>
>     /*
>      * Events raised within the emulator itself used to return
>      * X86EMUL_UNHANDLEABLE because we didn't supply injection
>      * callbacks.  Now the emulator supplies those to us via
>      * ctxt.event_pending instead.  Preserve the old behaviour
>      * for now.
>      */
>      if (emul_ctxt.ctxt.event_pending)
>          r = X86EMUL_UNHANDLEABLE;
>
> And in this patch, replace it with the hunk you have above, but
> setting r = X86EMUL_UNHANDLEABLE instead of injecting #GP.
>
> Does that make sense?

It does make sense, but that goes against the comment of not unshadowing
on exception.

A lot of this revolves around how likely we are to hit the #GP[0] case. 
I assert that we shouldn't be able to hit it unless the guest is playing
games, or we have a bug in emulation.

If we don't go throwing a #GP back, we should at least leave something
obvious in the log.

>
> Also, I'm a little confused after all this as to whether the emulator
> can still return with X86EMUL_OKAY and the event_pending set.

I have now determined this not to be the case.  Apologies for my
misinformation in v1.

All exception and software interrupt cases end up returning X86_EXCEPTION.

    case 0xcd: /* int imm8 */
        swint_type = x86_swint_int;
    swint:
        rc = inject_swint(swint_type, (uint8_t)src.val,
                          _regs.eip - ctxt->regs->eip,
                          ctxt, ops) ? : X86EMUL_EXCEPTION;
        goto done;


This is why I introduced the slightly relaxed check:

if ( emul_ctxt.ctxt.event_pending )
    ASSERT(r == X86EMUL_EXCEPTION);

in the earlier patch.

> If it
> can do that then we need to inject the event come what may, because
> any other side-effects will have been committted already. 

Because of the shadow pagetables use of x86_swint_emulate_none, all
software interrupts have fault semantics, so %eip isn't moved forward;
this is left to hardware on the re-inject path.  There are no other
pieces of state change for any software interrupts.

With some re-evaluation of hindsight, I plan to make all trap injection
have fault semantics out of x86_emulate(), leaving the possible
adjustment of %eip to the lower level vendor functions, as x86_event has
an insn_len field.  In reality, its only the svm path which needs to
adjust %eip, and only in certain circumstances.

The odd-case-out is singlestep, which completes writeback then raises an
exception.

>
>> @@ -3475,6 +3503,37 @@ static int sh_page_fault(struct vcpu *v,
>>              {
>>                  perfc_incr(shadow_em_ex_fail);
>>                  TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED);
>> +
>> +                if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
>> +                {
>> +                    /*
>> +                     * This emulation covers writes to shadow pagetables.  We
>> +                     * tolerate #PF (from hitting adjacent pages), #GP/#SS
>> +                     * (from segmentation errors), and #DB (from
>> +                     * singlestepping).  Anything else is an emulation bug, or
>> +                     * a guest playing with the instruction stream under Xen's
>> +                     * feet.
>> +                     */
>> +                    if ( emul_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
>> +                         (emul_ctxt.ctxt.event.vector < 32) &&
>> +                         ((1u << emul_ctxt.ctxt.event.vector) &
>> +                          ((1u << TRAP_debug) | (1u << TRAP_stack_error) |
>> +                           (1u << TRAP_gp_fault) | (1u << TRAP_page_fault))) )
>> +                    {
>> +                        if ( is_hvm_vcpu(v) )
>> +                            hvm_inject_event(&emul_ctxt.ctxt.event);
>> +                        else
>> +                            pv_inject_event(&emul_ctxt.ctxt.event);
>> +                    }
>> +                    else
>> +                    {
>> +                        if ( is_hvm_vcpu(v) )
>> +                            hvm_inject_hw_exception(TRAP_gp_fault, 0);
>> +                        else
>> +                            pv_inject_hw_exception(TRAP_gp_fault, 0);
>> +                    }
>> +                }
>> +
> This looks like code duplication, but rather than trying to merge the
> two cases, I think we can drop this one entirely.  This emulation is
> optimistically trying to find the second half of a PAE PTE write -
> it's OK just to stop emulating if we hit anything this exciting.
> So we can lose the whole hunk.

At the very least we should retain the singlestep #DB injection, as it
still has trap semantics.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2016-11-28 16:04 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-28 11:13 [PATCH for-4.9 v2 00/19] XSA-191 followup Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 01/19] x86/shadow: Fix #PFs from emulated writes crossing a page boundary Andrew Cooper
2016-11-28 11:55   ` Tim Deegan
2016-11-29 15:24   ` Jan Beulich
2016-11-28 11:13 ` [PATCH v2 02/19] x86/emul: Drop X86EMUL_CMPXCHG_FAILED Andrew Cooper
2016-11-28 11:55   ` Tim Deegan
2016-11-29 15:29   ` Jan Beulich
2016-11-28 11:13 ` [PATCH v2 03/19] x86/emul: Simplfy emulation state setup Andrew Cooper
2016-11-28 11:58   ` Paul Durrant
2016-11-28 12:54   ` Paul Durrant
2016-11-28 11:13 ` [PATCH v2 04/19] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 05/19] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 06/19] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}() Andrew Cooper
2016-11-28 11:58   ` Tim Deegan
2016-11-28 11:59     ` Andrew Cooper
2016-11-29 16:00   ` Jan Beulich
2016-11-29 16:50     ` Andrew Cooper
2016-11-30  8:41       ` Jan Beulich
2016-11-30 13:17         ` Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 07/19] x86/emul: Remove opencoded exception generation Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 08/19] x86/emul: Rework emulator event injection Andrew Cooper
2016-11-28 12:04   ` Tim Deegan
2016-11-28 12:48     ` Andrew Cooper
2016-11-28 14:24       ` Tim Deegan
2016-11-28 14:34         ` Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 09/19] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 10/19] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
2016-11-28 14:18   ` Boris Ostrovsky
2016-11-28 11:13 ` [PATCH v2 11/19] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
2016-11-28 12:47   ` Paul Durrant
2016-11-29 16:02   ` Jan Beulich
2016-11-28 11:13 ` [PATCH v2 12/19] x86/pv: " Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 13/19] x86/shadow: " Andrew Cooper
2016-11-28 14:49   ` Tim Deegan
2016-11-28 16:04     ` Andrew Cooper [this message]
2016-11-28 17:21       ` Tim Deegan
2016-11-28 17:36         ` Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 14/19] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 15/19] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
2016-11-28 12:56   ` Paul Durrant
2016-11-28 11:13 ` [PATCH v2 16/19] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
2016-11-28 11:59   ` Paul Durrant
2016-11-28 11:13 ` [PATCH v2 17/19] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
2016-11-28 11:56   ` Paul Durrant
2016-11-28 12:58     ` Andrew Cooper
2016-11-28 13:01       ` Paul Durrant
2016-11-28 13:03         ` Andrew Cooper
2016-11-28 14:56   ` Tim Deegan
2016-11-28 16:32     ` Andrew Cooper
2016-11-28 16:42       ` Tim Deegan
2016-11-29  1:22   ` Tian, Kevin
2016-11-29 16:24   ` Jan Beulich
2016-11-29 16:30     ` Andrew Cooper
2016-11-29 16:36       ` Jan Beulich
2016-11-29 16:38         ` Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 18/19] x86/hvm: Prepare to allow use of system segments for memory references Andrew Cooper
2016-11-28 11:13 ` [PATCH v2 19/19] x86/hvm: Use system-segment relative memory accesses Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=63a04bf2-5a6b-06c8-040f-c6a3939308c1@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.