Re: [PATCH 01/17] x86emul: split instruction decoding from execution

From: "Jan Beulich" <JBeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 01/17] x86emul: split instruction decoding from execution
Date: Mon, 12 Sep 2016 01:20:37 -0600	[thread overview]
Message-ID: <57D67365020000780010DD9B@prv-mh.provo.novell.com> (raw)
In-Reply-To: <d29726dc-0b7e-3a76-1369-cbc3c29e5641@citrix.com>

>>> On 09.09.16 at 20:35, <andrew.cooper3@citrix.com> wrote:
> On 08/09/16 14:04, Jan Beulich wrote:
>> @@ -607,7 +609,7 @@ do{ asm volatile (
>>  })
>>  #define truncate_ea(ea) truncate_word((ea), ad_bytes)
>>  
>> -#define mode_64bit() (def_ad_bytes == 8)
>> +#define mode_64bit() (ctxt->addr_size == 64)
>>  
>>  #define fail_if(p)                                      \
>>  do {                                                    \
>> @@ -1558,32 +1560,63 @@ int x86emul_unhandleable_rw(
>>      return X86EMUL_UNHANDLEABLE;
>>  }
>>  
>> -int
>> -x86_emulate(
>> -    struct x86_emulate_ctxt *ctxt,
>> -    const struct x86_emulate_ops  *ops)
>> -{
>> -    /* Shadow copy of register state. Committed on successful emulation. */
>> -    struct cpu_user_regs _regs = *ctxt->regs;
>> +struct x86_emulate_state {
>> +    unsigned int op_bytes, ad_bytes;
>> +
>> +    enum { ext_none, ext_0f, ext_0f38 } ext;
>> +    uint8_t opcode;
>> +    uint8_t modrm, modrm_mod, modrm_reg, modrm_rm;
>> +    uint8_t rex_prefix;
>> +    bool lock_prefix;
>> +    opcode_desc_t desc;
>> +    union vex vex;
>> +    int override_seg;
>>  
>> -    uint8_t b, d, sib, sib_index, sib_base, rex_prefix = 0;
>> -    uint8_t modrm = 0, modrm_mod = 0, modrm_reg = 0, modrm_rm = 0;
>> -    enum { ext_none, ext_0f, ext_0f38 } ext = ext_none;
>> -    union vex vex = {};
>> -    unsigned int op_bytes, def_op_bytes, ad_bytes, def_ad_bytes;
>> -    bool_t lock_prefix = 0;
>> -    int override_seg = -1, rc = X86EMUL_OKAY;
>> -    struct operand src = { .reg = REG_POISON };
>> -    struct operand dst = { .reg = REG_POISON };
>> -    enum x86_swint_type swint_type;
>> -    struct x86_emulate_stub stub = {};
>> -    DECLARE_ALIGNED(mmval_t, mmval);
>>      /*
>>       * Data operand effective address (usually computed from ModRM).
>>       * Default is a memory operand relative to segment DS.
>>       */
>> -    struct operand ea = { .type = OP_MEM, .reg = REG_POISON };
>> -    ea.mem.seg = x86_seg_ds; /* gcc may reject anon union initializer */
>> +    struct operand ea;
>> +
>> +    /* Immediate operand values, if any. Use otherwise unused fields. */
>> +#define imm1 ea.val
>> +#define imm2 ea.orig_val
> 
> Some instructions (e.g. bextr) have both immediate and memory operands. 
> Reusing ea like this is unsafe.

I disagree: Both fields have never been used for anything, and it was
more than once that I had thought about how to eliminate them from
ea without eliminating them from src and dst. Until finally I found a use
for them here.

> Immediate data was previously stashed in src, separately from ea.  In
> the light of the XSA-123 problems, I think it would be better to just
> have "uint64_t imm1, imm2;" here.

The XSA-123 problem was completely different, and I specifically took
this into consideration. We're not aliasing scalars and pointers here,
so there's no risk of causing a new security issue. Just to repeat -
these two fields have been completely unused so far.

>> +
>> +    /* Shadow copy of register state. Committed on successful emulation. */
>> +    struct cpu_user_regs regs;
>> +};
>> +
>> +/* Helper definitions. */
>> +#define op_bytes (state->op_bytes)
>> +#define ad_bytes (state->ad_bytes)
>> +#define ext (state->ext)
>> +#define modrm (state->modrm)
>> +#define modrm_mod (state->modrm_mod)
>> +#define modrm_reg (state->modrm_reg)
>> +#define modrm_rm (state->modrm_rm)
>> +#define rex_prefix (state->rex_prefix)
>> +#define lock_prefix (state->lock_prefix)
>> +#define vex (state->vex)
>> +#define override_seg (state->override_seg)
>> +#define ea (state->ea)
>> +#define _regs (state->regs)
>> +
>> +static int
>> +x86_decode(
>> +    struct x86_emulate_state *state,
>> +    struct x86_emulate_ctxt *ctxt,
>> +    const struct x86_emulate_ops  *ops)
>> +{
>> +    uint8_t b, d, sib, sib_index, sib_base;
>> +    unsigned int def_op_bytes, def_ad_bytes;
>> +    int rc = X86EMUL_OKAY;
>> +
>> +    memset(state, 0, sizeof(*state));
>> +    override_seg = -1;
>> +    ea.type = OP_MEM;
>> +    ea.mem.seg = x86_seg_ds;
>> +    ea.reg = REG_POISON;
>> +    _regs = *ctxt->regs;
> 
> The helper definitions are fine for the transition period, but I would
> like to see them eventually removed to help reduce the quantity of
> information hiding in this area.  Please don't introduce new uses.

I simply can't write e.g. "state->ea.type" here, as that would get
macro expanded to "state->(state->ea).type". I've carefully tried to
avoid introducing new uses where possible, and I'll be happy to
correct cases where I failed to, but here we just can't (and I hope
you agree that it would be odd to not use the helpers consistently
in a single block of code, i.e. I'd like to not replace _regs (which goes
away in patch 3 anyway).

>> +int
>> +x86_emulate(
>> +    struct x86_emulate_ctxt *ctxt,
>> +    const struct x86_emulate_ops *ops)
>> +{
>> +    struct x86_emulate_state state;
>> +    int rc;
>> +    uint8_t b, d;
>> +    struct operand src = { .reg = REG_POISON };
>> +    struct operand dst = { .reg = REG_POISON };
>> +    enum x86_swint_type swint_type;
>> +    struct x86_emulate_stub stub = {};
>> +    DECLARE_ALIGNED(mmval_t, mmval);
>> +
>> +    rc = x86_decode(&state, ctxt, ops);
>> +    if ( rc != X86EMUL_OKAY)
> 
> Space before the bracket (although I guess these lines drop out before
> the end of the series anyway?)

Oops. And to the question - no, why would they? Emulation obviously
requires decoding as the first step.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel