From: "Jan Beulich" <JBeulich@suse.com>
To: aisaila@bitdefender.com
Cc: Tamas K Lengyel <tamas@tklengyel.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Razvan Cojocaru <rcojocaru@bitdefender.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Paul Durrant <paul.durrant@citrix.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH v4 2/2] x86/emulate: Send vm_event from emulate
Date: Wed, 22 May 2019 07:34:11 -0600	[thread overview]
Message-ID: <5CE54FD30200007800231603@prv1-mh.provo.novell.com> (raw)
In-Reply-To: <608cae57-7a7a-6502-9c9a-439aa0b88f25@bitdefender.com>

>>> On 22.05.19 at 14:59, <aisaila@bitdefender.com> wrote:
> On 22.05.2019 12:56, Jan Beulich wrote:
>>>>> On 20.05.19 at 14:55, <aisaila@bitdefender.com> wrote:
>>> First we try to send a vm event and if the event is sent then emulation
>>> returns X86EMUL_ACCESS_EXCEPTION. If the event is not sent then the
>>> emulation goes on as expected.
>> 
>> Perhaps it's obvious for a vm-event person why successful sending
>> of an event is to result in X86EMUL_ACCESS_EXCEPTION, but it's not
>> to me, despite having looked at prior versions. Can this (odd at
>> first glance) behavior please be briefly explained here?
> 
> If the event was successfully sent then the emulation has to stop and 
> return.

Which is where we commonly use X86EMUL_RETRY. I've explained to
you before that introducing new return values needs careful
inspection to ensure they work for all involved pieces of code (in
particular code treating some of the values specially).

>>> @@ -619,6 +621,68 @@ static int hvmemul_linear_to_phys(
>>>       return X86EMUL_OKAY;
>>>   }
>>>   
>>> +static bool hvmemul_send_vm_event(unsigned long gla,
>>> +                                  uint32_t pfec, unsigned int bytes,
>>> +                                  struct hvm_emulate_ctxt ctxt)
>>> +{
>>> +    xenmem_access_t access;
>>> +    vm_event_request_t req = {};
>>> +    gfn_t gfn;
>>> +    paddr_t gpa;
>>> +    unsigned long reps = 1;
>>> +    int rc;
>>> +
>>> +    if ( !ctxt.send_event || !pfec )
>> 
>> Why the !pfec part of the condition?
> 
> Because it is used to check the type of access violation; if it is 0 
> then we do not want to call get_mem_access or get the gpa, since it is 
> clear that there will be no violation.

So what about, as an example, the case of just PFEC_implicit set?
And do you really care about accesses with PFEC_page_present
clear?

>>> +        return false;
>>> +
>>> +    rc = hvmemul_linear_to_phys(gla, &gpa, bytes, &reps, pfec, &ctxt);
>> 
>> As said before - I don't think it's a good idea to do the page walk
>> twice: This and the pre-existing one can easily return different
>> results.
> 
> I do this just to get the gpa. If there is another way I will gladly use it.

To get the gpa you need to do a page walk. But you shouldn't do
two page walks. Which as a result tells me that the question is
not about "another way", but that things need to be structured
differently.

>>> +    switch ( access ) {

Btw, I'm noticing this style issue only now.

>>> +    case XENMEM_access_x:
>>> +    case XENMEM_access_rx:
>>> +        if ( pfec & PFEC_write_access )
>>> +            req.u.mem_access.flags = MEM_ACCESS_R | MEM_ACCESS_W;
>>> +        break;
>>> +
>>> +    case XENMEM_access_w:
>>> +    case XENMEM_access_rw:
>>> +        if ( pfec & PFEC_insn_fetch )
>>> +            req.u.mem_access.flags = MEM_ACCESS_X;
>>> +        break;
>>> +
>>> +    case XENMEM_access_r:
>>> +    case XENMEM_access_n:
>>> +        if ( pfec & PFEC_write_access )
>>> +            req.u.mem_access.flags |= MEM_ACCESS_R | MEM_ACCESS_W;
>>> +        if ( pfec & PFEC_insn_fetch )
>>> +            req.u.mem_access.flags |= MEM_ACCESS_X;
>>> +        break;
>>> +
>>> +    default:
>>> +        return false;
>>> +    }
>> 
>> Aren't you looking at the leaf page here, rather than at any of the
>> involved page tables? Or am I misunderstanding the description
>> saying "page-walks on access protected pages"?
> 
> We want to ignore access writes to the page tables and only fire a 
> vm_event for "regular" pages possibly hit by the actual instruction that 
> has also happened to trigger the A/D write(s). So we don't want to send 
> out vm_events for written-to page tables at all.

In which case may I ask for the description to be worded to make
this unambiguous?

>>> @@ -1248,7 +1318,21 @@ int hvmemul_insn_fetch(
>>>           container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
>>>       /* Careful, as offset can wrap or truncate WRT insn_buf_eip. */
>>>       uint8_t insn_off = offset - hvmemul_ctxt->insn_buf_eip;
>>> +    uint32_t pfec = PFEC_page_present | PFEC_insn_fetch;
>>> +    unsigned long addr, reps = 1;
>>> +    int rc = 0;
>>> +
>>> +    rc = hvmemul_virtual_to_linear(
>>> +        seg, offset, bytes, &reps, hvm_access_insn_fetch, hvmemul_ctxt, &addr);
>>> +
>>> +    if ( rc != X86EMUL_OKAY || !bytes )
>>> +        return rc;
>>> +
>>> +    if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 )
>>> +        pfec |= PFEC_user_mode;
>>>   
>>> +    if ( hvmemul_send_vm_event(addr, pfec, bytes, *hvmemul_ctxt) )
>>> +        return X86EMUL_ACCESS_EXCEPTION;
>>>       /*
>>>        * Fall back if requested bytes are not in the prefetch cache.
>>>        * But always perform the (fake) read when bytes == 0.
>> 
>> Despite what was said before you're still doing things a 2nd time
>> here just because of hvmemul_send_vm_event()'s needs, even
>> if that function ends up bailing right away.
> 
> I don't understand what things are done 2 times. Can you please explain?

You add code above that exists already in __hvmemul_read().
Even worse, __hvmemul_read() may not need calling at all, in
which case there's no (emulated) memory access and hence
imo there shouldn't be any event. Plus, just like in the
hvmemul_linear_to_phys() case there may be an exception
raised by the function, yet just like there you also discard the
return value saying so without also zapping the exception.

>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>> @@ -162,6 +162,8 @@ struct x86_emul_fpu_aux {
>>>   #define X86EMUL_UNRECOGNIZED   X86EMUL_UNIMPLEMENTED
>>>    /* (cmpxchg accessor): CMPXCHG failed. */
>>>   #define X86EMUL_CMPXCHG_FAILED 7
>>> +/* Emulator tried to access a protected page. */
>>> +#define X86EMUL_ACCESS_EXCEPTION 8
>> 
>> This still doesn't make clear what the difference is to
>> X86EMUL_EXCEPTION.
> 
> We need a return that has no side effects.

So besides saying so you also need to make sure there actually
are no side effects.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Thread overview: 22+ messages
2019-05-20 12:55 [PATCH v4 1/2] x86/emulate: Move hvmemul_linear_to_phys Alexandru Stefan ISAILA
2019-05-20 12:55 ` [PATCH v4 2/2] x86/emulate: Send vm_event from emulate Alexandru Stefan ISAILA
2019-05-22  9:56   ` Jan Beulich
2019-05-22 12:59     ` Alexandru Stefan ISAILA
2019-05-22 13:34       ` Jan Beulich [this message]
2019-05-22 13:50         ` Alexandru Stefan ISAILA
2019-05-22 13:57           ` Jan Beulich
2019-05-30  8:59     ` Alexandru Stefan ISAILA
2019-05-31  9:16       ` Jan Beulich
2019-05-22 13:13 ` [PATCH v4 1/2] x86/emulate: Move hvmemul_linear_to_phys Paul Durrant
2019-05-22 13:55   ` George Dunlap
