From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>,
	xen-devel@lists.xen.org, Paul Durrant <paul.durrant@citrix.com>,
	zhiyuan.lv@intel.com, Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [PATCH v7 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.
Date: Sat, 11 Mar 2017 16:42:07 +0800
Message-ID: <58C3B85F.6060205@linux.intel.com>
In-Reply-To: <58C2D462020000780014203E@prv-mh.provo.novell.com>



On 3/10/2017 11:29 PM, Jan Beulich wrote:
>>>> On 08.03.17 at 16:33, <yu.c.zhang@linux.intel.com> wrote:
>> changes in v7:
>>    - Use new ioreq server interface - XEN_DMOP_map_mem_type_to_ioreq_server.
>>    - According to comments from George: removed domain_pause/unpause() in
>>      hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
>>      and we can avoid the:
>>      a> deadlock between p2m lock and ioreq server lock by using these locks
>>         in the same order - solved in patch 4;
> That is, until patch 4 there is deadlock potential? I think you want to
> re-order the patches if so. Or was it that the type can't really be used
> until the last patch of the series? (I'm sorry, it's been quite a while
> since the previous version.)

Oh. There's no deadlock potential in this version of the patch set.
But in v6 there was, and I used domain_pause/unpause() to avoid it.
Later on, I realized that if I take the different locks in the same
order, the deadlock potential can be avoided, so we do not need
domain_pause/unpause() in this version.
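
To make this concrete, here is a minimal sketch of the ordering
discipline (field and helper names are illustrative, not the exact
code from patch 4): every path that needs both locks takes them in
one fixed order, so a circular wait cannot form.

    /* Sketch only: always ioreq server lock first, p2m lock second. */
    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);   /* 1st */
    p2m_lock(p2m_get_hostp2m(d));                       /* 2nd */

    /* ... update the p2m <-> ioreq server binding ... */

    p2m_unlock(p2m_get_hostp2m(d));
    spin_unlock(&d->arch.hvm_domain.ioreq_server.lock);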

>> @@ -365,6 +383,24 @@ static int dm_op(domid_t domid,
>>           break;
>>       }
>>   
>> +    case XEN_DMOP_map_mem_type_to_ioreq_server:
>> +    {
>> +        const struct xen_dm_op_map_mem_type_to_ioreq_server *data =
>> +            &op.u.map_mem_type_to_ioreq_server;
>> +
>> +        rc = -EINVAL;
>> +        if ( data->pad )
>> +            break;
>> +
>> +        /* Only support for HAP enabled hvm. */
>> +        if ( !hap_enabled(d) )
>> +            break;
> Perhaps better to give an error other than -EINVAL in this case?
> If so, then the same error should likely also be used in your
> set_mem_type() addition.
How about -ENOTSUP?
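
I.e. the check could become something like the below (just a sketch -
the exact errno is the open question; Xen's public errno.h defines
EOPNOTSUPP, which is what POSIX ENOTSUP maps to on most systems):

    /* Only supported for HAP-enabled HVM guests. */
    rc = -EOPNOTSUPP;
    if ( !hap_enabled(d) )
        break;

with set_mem_type() returning the same value for this case.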

>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -99,6 +99,7 @@ static int hvmemul_do_io(
>>       uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
>>   {
>>       struct vcpu *curr = current;
>> +    struct domain *currd = curr->domain;
>>       struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
>>       ioreq_t p = {
>>           .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
>> @@ -140,7 +141,7 @@ static int hvmemul_do_io(
>>                (p.dir != dir) ||
>>                (p.df != df) ||
>>                (p.data_is_ptr != data_is_addr) )
>> -            domain_crash(curr->domain);
>> +            domain_crash(currd);
> If you mean to do this transformation here, then please do so
> consistently for the entire function.

OK. Thanks.

>> @@ -177,8 +178,65 @@ static int hvmemul_do_io(
>>           break;
>>       case X86EMUL_UNHANDLEABLE:
>>       {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> +        /*
>> +         * Xen isn't emulating the instruction internally, so see if
>> +         * there's an ioreq server that can handle it. Rules:
>> +         *
>> +         * - PIO and "normal" mmio run through hvm_select_ioreq_server()
>> +         * to choose the ioreq server by range. If no server is found,
>> +         * the access is ignored.
>> +         *
>> +         * - p2m_ioreq_server accesses are handled by the current
>> +         * ioreq_server for the domain, but there are some corner
>> +         * cases:
> Who or what is "the current ioreq_server for the domain"?
It means "the current ioreq_server which maps the p2m_ioreq_server
type for this domain"... I'd like to use a succinct phrase, but this
one does not seem accurate enough. Any preference?

>> +         *   - If the domain ioreq_server is NULL, assume this is a
>> +         *   race between the unbinding of ioreq server and guest fault
>> +         *   so re-try the instruction.
>> +         *
>> +         *   - If the IOREQ_MEM_ACCESS_WRITE flag is not set, treat it
>> +         *   like a normal PIO or MMIO that doesn't have an ioreq
>> +         *   server (i.e., by ignoring it).
>> +         */
>> +        struct hvm_ioreq_server *s = NULL;
>> +        p2m_type_t p2mt = p2m_invalid;
>> +
>> +        if ( is_mmio )
>> +        {
>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> +
>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> Stray cast.

OK. Will remove it.

>
>> +            if ( p2mt == p2m_ioreq_server )
>> +            {
>> +                unsigned int flags;
>> +
>> +                s = p2m_get_ioreq_server(currd, &flags);
>> +
>> +                /*
>> +                 * If p2mt is ioreq_server but ioreq_server is NULL,
>> +                 * we probably lost a race with unbinding of ioreq
>> +                 * server, just retry the access.
>> +                 */
>> +                if ( s == NULL )
>> +                {
>> +                    rc = X86EMUL_RETRY;
>> +                    vio->io_req.state = STATE_IOREQ_NONE;
>> +                    break;
>> +                }
>> +
>> +                /*
>> +                 * If the IOREQ_MEM_ACCESS_WRITE flag is not set,
>> +                 * we should set s to NULL, and just ignore such
>> +                 * access.
>> +                 */
>> +                if ( !(flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE) )
>> +                    s = NULL;
> What is this about? You only allow WRITE registrations, so this looks
> to be dead code. Yet if it is meant to guard against future enabling
> of READ, then this clearly should not be done for reads.

It's to guard against future emulation of READ. We can remove it for now.
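
With that guard dropped, the branch would read roughly like this
(sketch of the next revision):

    if ( p2mt == p2m_ioreq_server )
    {
        unsigned int flags;

        s = p2m_get_ioreq_server(currd, &flags);

        /*
         * If p2mt is p2m_ioreq_server but no ioreq server is bound,
         * we probably lost a race with the unbinding of the ioreq
         * server; just retry the access.
         */
        if ( s == NULL )
        {
            rc = X86EMUL_RETRY;
            vio->io_req.state = STATE_IOREQ_NONE;
            break;
        }
    }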

>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>>               entry->r = entry->w = entry->x = 1;
>>               entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>>               break;
>> +        case p2m_ioreq_server:
>> +            entry->r = 1;
>> +            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
> Along the lines of the previous comment - if you mean to have the
> code cope with READ, please also set ->r accordingly, or add a
> comment why this won't have the intended effect (yielding a not
> present EPTE).

How about we keep this code and do not support READ? I'll remove the
above dead code in hvmemul_do_io().
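
Keeping the EPT hunk as it is, plus a comment making the limitation
explicit, would give something like (sketch only):

    case p2m_ioreq_server:
        /*
         * Keep ->r set unconditionally: clearing it in order to
         * intercept reads would yield a not-present EPTE rather
         * than a read violation, so READ emulation is deliberately
         * left unsupported here.
         */
        entry->r = 1;
        entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
        break;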

>> @@ -92,8 +94,13 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
>>       default:
>>           return flags | _PAGE_NX_BIT;
>>       case p2m_grant_map_ro:
>> -    case p2m_ioreq_server:
>>           return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
>> +    case p2m_ioreq_server:
>> +        flags |= P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
>> +        if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
>> +            return flags & ~_PAGE_RW;
>> +        else
>> +            return flags;
> Stray else. But even better would imo be
>
>          if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
>              flags &= ~_PAGE_RW;
>          return flags;
Oh. Thanks. :)
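
So the case would end up as:

    case p2m_ioreq_server:
        flags |= P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
        if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
            flags &= ~_PAGE_RW;
        return flags;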

>> +struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
>> +                                              unsigned int *flags)
>> +{
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +    struct hvm_ioreq_server *s;
>> +
>> +    spin_lock(&p2m->ioreq.lock);
>> +
>> +    s = p2m->ioreq.server;
>> +    *flags = p2m->ioreq.flags;
>> +
>> +    spin_unlock(&p2m->ioreq.lock);
>> +    return s;
>> +}
> I'm afraid this question was asked before, but since there's no
> comment here or anywhere, I can't recall if there was a reason why
> s potentially being stale by the time the caller looks at it is not a
> problem.

Well, it is possible that s is stale. I did not take it as a problem
because the device model will later discard such an io request. And I
believe the current hvm_select_ioreq_server() also has the same issue
- the returned s should be considered stale if the MMIO/PIO address
is removed from the ioreq server's rangeset.

Another thought is, if you think it is inappropriate for the device
model to do the check, we can use spin_lock_recursive() on the ioreq
server lock to protect the whole ioreq server selection, and release
the lock after the ioreq is sent out.
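
Roughly like this, if we went that way (a sketch; the lock field name
is from memory and may not be exact):

    spin_lock_recursive(&currd->arch.hvm_domain.ioreq_server.lock);

    s = hvm_select_ioreq_server(currd, &p);
    if ( s )
        rc = hvm_send_ioreq(s, &p, 0);

    spin_unlock_recursive(&currd->arch.hvm_domain.ioreq_server.lock);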

>> --- a/xen/include/public/hvm/dm_op.h
>> +++ b/xen/include/public/hvm/dm_op.h
>> @@ -318,6 +318,30 @@ struct xen_dm_op_inject_msi {
>>       uint64_aligned_t addr;
>>   };
>>   
>> +/*
>> + * XEN_DMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
>> + *                                      to a specific memory type <type>
>> + *                                      for specific accesses <flags>
>> + *
>> + * For now, flags only accept the value of XEN_DMOP_IOREQ_MEM_ACCESS_WRITE,
>> + * which means only write operations are to be forwarded to an ioreq server.
>> + * Support for the emulation of read operations can be added when an ioreq
>> + * server has such requirement in future.
>> + */
>> +#define XEN_DMOP_map_mem_type_to_ioreq_server 15
>> +
>> +struct xen_dm_op_map_mem_type_to_ioreq_server {
>> +    ioservid_t id;      /* IN - ioreq server id */
>> +    uint16_t type;      /* IN - memory type */
>> +    uint16_t pad;
> Perhaps there was padding needed when this was a hvmop, but
> now the padding does exactly the wrong thing.

Right, the padding is useless in this new structure. I will remove it
if the other fields do not change (I proposed a change to the flags
field in reply to Andrew's comments on patch 5). Thank you, Jan.
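
For concreteness, the trimmed structure would look like this (sketch;
the flags field is the one under discussion in the patch 5 thread, so
its exact definition may still change):

    struct xen_dm_op_map_mem_type_to_ioreq_server {
        ioservid_t id;      /* IN - ioreq server id */
        uint16_t type;      /* IN - memory type */
        uint32_t flags;     /* IN - types of accesses to be forwarded
                               to the ioreq server */
    };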

Yu
>
> Jan
>

