From: Paul Durrant <Paul.Durrant@citrix.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: Kevin Tian <kevin.tian@intel.com>,
"Keir (Xen.org)" <keir@xen.org>,
"jbeulich@suse.com" <jbeulich@suse.com>,
Andrew Cooper <Andrew.Cooper3@citrix.com>,
"Tim (Xen.org)" <tim@xen.org>,
George Dunlap <George.Dunlap@citrix.com>,
"zhiyuan.lv@intel.com" <zhiyuan.lv@intel.com>,
"jun.nakajima@intel.com" <jun.nakajima@intel.com>
Subject: Re: [PATCH 3/3] Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
Date: Tue, 22 Mar 2016 17:51:12 +0000 [thread overview]
Message-ID: <a52cfdf42b1945868ae157844abaf060@AMSPEX02CL03.citrite.net> (raw)
In-Reply-To: <56F1805C.6060604@citrix.com>
> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@citrix.com]
> Sent: 22 March 2016 17:27
> To: Yu Zhang; xen-devel@lists.xen.org
> Cc: Keir (Xen.org); jbeulich@suse.com; Andrew Cooper; George Dunlap;
> jun.nakajima@intel.com; Kevin Tian; Tim (Xen.org); Paul Durrant;
> zhiyuan.lv@intel.com
> Subject: Re: [PATCH 3/3] Add HVMOP to map guest ram with
> p2m_ioreq_server to an ioreq server
>
> On 16/03/16 12:22, Yu Zhang wrote:
> > A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
> > let one ioreq server claim its responsibility for the handling
> > of guest pages with p2m type p2m_ioreq_server. Users of this
> > HVMOP can specify whether the p2m_ioreq_server is supposed to
> > handle write accesses or read ones in a flag. For now, we only
> > support one ioreq server for this p2m type, so once an ioreq
> > server has claimed ownership, subsequent calls to
> > HVMOP_map_mem_type_to_ioreq_server will fail.
> >
> > Note: this HVMOP does not change the p2m type of any guest ram
> > page, until the HVMOP_set_mem_type is triggered. So normally
> > the steps would be the backend driver first claims its ownership
> > of guest ram pages with p2m_ioreq_server type, and then sets the
> > memory type to p2m_ioreq_server for the specified guest ram pages.
>
> Yu, thanks for this work. I think it's heading in the right direction.
>
> A couple of comments:
>
> There's not much documentation in the code about how this is expected to
> be used.
>
> For instance, having separate flags seems to imply that you can
> independently select either read intercept, write intercept, or both;
> but [ept_]p2m_type_to_flags() seems to assume that READ_ACCESS implies
> WRITE_ACCESS. Do you plan to implement them separately in the future?
> If not, would it be better to make the interface an enum instead?
>
> At very least it should be documented that READ_ACCESS implies
> WRITE_ACCESS.
That's not true. If WRITE_ACCESS has not been requested, then writes are handled directly in Xen rather than being forwarded to the ioreq server. If the hardware allowed pages to be marked write-only, we wouldn't need to do that.
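To make the semantics concrete, here is a minimal sketch of the dispatch rule Paul describes. The flag values and helper name are illustrative only; the real check is the dir/flags test in the quoted hvmemul_do_io() hunk:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; the real definitions live in Xen headers. */
#define P2M_IOREQ_HANDLE_READ_ACCESS  (1u << 0)
#define P2M_IOREQ_HANDLE_WRITE_ACCESS (1u << 1)

enum dir { IOREQ_READ, IOREQ_WRITE };

/* Returns true if the access should be forwarded to the ioreq server;
 * otherwise Xen completes the access itself (e.g. a write to a page
 * whose server only asked to intercept reads). */
static bool forward_to_server(unsigned int flags, enum dir dir)
{
    if ( dir == IOREQ_READ )
        return flags & P2M_IOREQ_HANDLE_READ_ACCESS;
    return flags & P2M_IOREQ_HANDLE_WRITE_ACCESS;
}
```

So the two flags really are independent requests, not a hierarchy where one implies the other.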
>
> Calling the function with no flags appears to mean, "Detach the ioreq
> server from this type" -- this should also be documented.
>
That was documented in the design, but it's true that it does need to be pulled into the code comments.
> Furthermore, you appear to deal with p2m_ioreq_server types when there
> is no ioreq server by making the "flags=0" case RW in
> [ept_]p2m_type_to_flags. This should be spelled out somewhere (probably
> in the p2m structure definition).
>
> Finally, you call p2m_memory_type_changed() when changing the ioreq
> server; but this appears to be only implemented for ept; meaning that if
> you're either running HAP on AMD or running in shadow mode, if the
> caller isn't careful (i.e., doesn't remove all p2m_ioreq_server types
> before removing itself as a server), they may end up with bogus
> permissions left over in the p2m table.
>
> Maybe you should check for the existence of p2m->memory_type_changed
> and
> return -ENOTSUPP or something if it's not present?
>
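George's suggested guard could look something like the following sketch. The structure layout and helper name are hypothetical; in Xen, memory_type_changed is a per-implementation hook on struct p2m_domain, populated by EPT but not by shadow or AMD NPT:

```c
#include <assert.h>
#include <stddef.h>

#define EOPNOTSUPP 95

struct p2m_domain {
    /* NULL on p2m implementations that cannot flush cached
     * type-derived permissions (shadow, AMD NPT). */
    void (*memory_type_changed)(struct p2m_domain *p2m);
};

/* Stub standing in for the EPT implementation of the hook. */
static void ept_memory_type_changed_stub(struct p2m_domain *p2m)
{
    (void)p2m;
}

/* Hypothetical guard: refuse the mapping outright rather than leave
 * stale permissions behind when the server later detaches. */
static int check_type_change_supported(const struct p2m_domain *p2m)
{
    return p2m->memory_type_changed ? 0 : -EOPNOTSUPP;
}
```

That turns a silent correctness hazard into an explicit error at the hypercall boundary.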
> More comments inline...
>
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>
> > @@ -168,13 +226,65 @@ static int hvmemul_do_io(
> > break;
> > case X86EMUL_UNHANDLEABLE:
> > {
> > - struct hvm_ioreq_server *s =
> > - hvm_select_ioreq_server(curr->domain, &p);
> > + struct hvm_ioreq_server *s;
> > + p2m_type_t p2mt;
> > +
> > + if ( is_mmio )
> > + {
> > + unsigned long gmfn = paddr_to_pfn(addr);
> > +
> > + (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> > +
> > + switch ( p2mt )
> > + {
> > + case p2m_ioreq_server:
> > + {
> > + unsigned long flags;
> > +
> > + p2m_get_ioreq_server(currd, p2mt, &flags, &s);
> > +
> > + if ( !s )
> > + break;
> > +
> > + if ( (dir == IOREQ_READ &&
> > + !(flags & P2M_IOREQ_HANDLE_READ_ACCESS)) ||
> > + (dir == IOREQ_WRITE &&
> > + !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS)) )
> > + s = NULL;
> > +
> > + break;
> > + }
> > + case p2m_ram_rw:
> > + s = NULL;
> > + break;
> > +
> > + default:
> > + s = hvm_select_ioreq_server(currd, &p);
> > + break;
> > + }
> > + }
> > + else
> > + {
> > + p2mt = p2m_invalid;
> > +
> > + s = hvm_select_ioreq_server(currd, &p);
> > + }
> >
> > /* If there is no suitable backing DM, just ignore accesses */
> > if ( !s )
> > {
> > - rc = hvm_process_io_intercept(&null_handler, &p);
> > + switch ( p2mt )
> > + {
> > + case p2m_ioreq_server:
> > + case p2m_ram_rw:
> > + rc = hvm_process_io_intercept(&mem_handler, &p);
> > + break;
> > +
> > + default:
> > + rc = hvm_process_io_intercept(&null_handler, &p);
> > + break;
> > + }
>
> Is it actually possible to get here with "is_mmio" true and p2mt ==
> p2m_ram_rw?
I think that's possible if the type change races with an access.
>
> If so, it would appear that this changes the handling in that case.
> Before this patch, on p2m_ram_rw, you'd call hvm_select_ioreq_server(),
> which would either give you a server (perhaps the default one), or you'd
> be called with null_handler.
>
> After this patch, on p2m_ram_rw, you'll always be called with mem_handler.
>
> If this is an intentional change, you need to explan why in the
> changelog (and probably also a comment here).
>
That is an intentional change.
> > @@ -1411,6 +1413,55 @@ static int
> hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
> > return rc;
> > }
> >
> > +static int hvm_map_mem_type_to_ioreq_server(struct domain *d,
> > + ioservid_t id,
> > + hvmmem_type_t type,
> > + uint32_t flags)
> > +{
> > + struct hvm_ioreq_server *s;
> > + p2m_type_t p2mt;
> > + int rc;
> > +
> > + switch ( type )
> > + {
> > + case HVMMEM_ioreq_server:
> > + p2mt = p2m_ioreq_server;
> > + break;
> > +
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + if ( flags & ~(HVMOP_IOREQ_MEM_ACCESS_READ |
> > + HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> > + return -EINVAL;
> > +
> > + spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> > +
> > + rc = -ENOENT;
> > + list_for_each_entry ( s,
> > + &d->arch.hvm_domain.ioreq_server.list,
> > + list_entry )
> > + {
> > + if ( s == d->arch.hvm_domain.default_ioreq_server )
> > + continue;
> > +
> > + if ( s->id == id )
> > + {
> > + rc = p2m_set_ioreq_server(d, p2mt, flags, s);
> > + if ( rc )
> > + break;
> > +
> > + gdprintk(XENLOG_DEBUG, "%u claimed type p2m_ioreq_server\n",
> > + s->id);
>
> Don't we want to break out of the loop if p2m_set_ioreq_server()
> succeeds as well? Or do we really want to go check all the other ioreq
> servers to see if their ID matches too? :-)
No, the break should be unconditional.
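In other words, the loop should leave as soon as the matching id is found, whatever p2m_set_ioreq_server() returned. A stand-alone sketch of that shape (the list type, error values, and helper name are simplified stand-ins for the real ioreq-server list walk):

```c
#include <assert.h>

struct ioreq_server { unsigned int id; };

#define ENOENT_STANDIN 2

/* Corrected lookup: ids are unique, so once the matching server is
 * found there is nothing further to scan; break unconditionally and
 * return whatever the claim attempt produced. */
static int find_and_claim(const struct ioreq_server *list, unsigned int n,
                          unsigned int id, int claim_rc)
{
    int rc = -ENOENT_STANDIN;

    for ( unsigned int i = 0; i < n; i++ )
    {
        if ( list[i].id != id )
            continue;

        rc = claim_rc;   /* result of p2m_set_ioreq_server() */
        break;           /* unconditional, per Paul's reply */
    }

    return rc;
}
```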
>
> > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> > index 380ec25..21e04ce 100644
> > --- a/xen/arch/x86/mm/p2m-ept.c
> > +++ b/xen/arch/x86/mm/p2m-ept.c
> > @@ -132,6 +132,14 @@ static void ept_p2m_type_to_flags(struct
> p2m_domain *p2m, ept_entry_t *entry,
> > entry->r = entry->w = entry->x = 1;
> > entry->a = entry->d = !!cpu_has_vmx_ept_ad;
> > break;
> > + case p2m_ioreq_server:
> > + entry->r = !(p2m->ioreq.flags &
> P2M_IOREQ_HANDLE_READ_ACCESS);
> > + entry->w = (entry->r &
> > + !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS));
> > + entry->x = entry->r;
> > + entry->a = !!cpu_has_vmx_ept_ad;
> > + entry->d = 0;
> > + break;
>
> In line with the other types, would it make sense for entry->d here to
> be entry->w && cpu_has_vmx_ept_ad?
>
>
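George's suggestion for the dirty bit, sketched as a runnable fragment. The entry layout is a simplified stand-in for ept_entry_t, and the helper is hypothetical; the point is just that the d bit should track writability, as the other type cases do:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for ept_entry_t. */
struct ept_entry { bool r, w, x, a, d; };

/* Hypothetical helper applying the suggested rule for the
 * p2m_ioreq_server case: accessed follows EPT A/D support, dirty
 * only makes sense when the entry is also writable. */
static void set_ad_bits(struct ept_entry *e, bool cpu_has_vmx_ept_ad)
{
    e->a = cpu_has_vmx_ept_ad;
    e->d = e->w && cpu_has_vmx_ept_ad;
}
```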
> > @@ -289,6 +291,83 @@ void p2m_memory_type_changed(struct domain
> *d)
> > }
> > }
> >
> > +int p2m_set_ioreq_server(struct domain *d, p2m_type_t t,
> > + unsigned long flags,
> > + struct hvm_ioreq_server *s)
> > +{
> > + struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > + int rc;
> > +
> > + spin_lock(&p2m->ioreq.lock);
> > +
> > + rc = -EINVAL;
> > + if ( t != p2m_ioreq_server )
> > + goto out;
>
> Since these are all internal interfaces, I'm not sure what the point is
> of passing this type in when we only accept one type anyway. If we're
> going to have a type in the interface but only accept one type, we
> should just check it at the top level and assume p2m_ioreq_server all
> the way through (until such time as we accept a second type).
>
OK. It was there for future expansion, but that could be added later.
> > +
> > + rc = -EBUSY;
> > + if ( ( p2m->ioreq.server != NULL ) && (flags != 0) )
> > + goto out;
> > +
> > + if ( flags == 0 )
> > + {
> > + p2m->ioreq.server = NULL;
> > + p2m->ioreq.flags = 0;
> > + }
> > + else
> > + {
> > + p2m->ioreq.server = s;
> > + p2m->ioreq.flags = flags;
> > + }
>
> I think it would be a bit more robust to
>
> 1) Require callers to always specify (their own) server ID when making
> this call, even if they're un-claiming the p2m type
>
> 2) When un-claiming, check to make sure that the server id of the caller
> matches the server id that's being detached.
>
Ok. Fair enough.
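A sketch of what the reworked claim/unclaim logic might look like under George's two points (caller always names itself; a detach must name the current owner). The struct and error values are simplified stand-ins for the real p2m state:

```c
#include <assert.h>
#include <stddef.h>

#define EINVAL 22
#define EBUSY  16

struct hvm_ioreq_server { unsigned int id; };

struct p2m_ioreq {
    struct hvm_ioreq_server *server;
    unsigned long flags;
};

/* Hypothetical reworking: flags == 0 means detach, and only the
 * currently attached server may detach itself; a non-zero-flags
 * claim fails while another server holds the type. */
static int set_ioreq_server(struct p2m_ioreq *ioreq,
                            struct hvm_ioreq_server *s,
                            unsigned long flags)
{
    if ( flags == 0 )
    {
        if ( ioreq->server != s )   /* detach must match the owner */
            return -EINVAL;
        ioreq->server = NULL;
        ioreq->flags = 0;
        return 0;
    }

    if ( ioreq->server != NULL )    /* only one server at a time */
        return -EBUSY;

    ioreq->server = s;
    ioreq->flags = flags;
    return 0;
}
```

This closes the hole where an unrelated server could detach someone else's claim by passing flags of zero.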
> > +
> > + p2m_memory_type_changed(d);
> > +
> > + rc = 0;
> > +
> > +out:
> > + spin_unlock(&p2m->ioreq.lock);
> > +
> > + return rc;
> > +}
> > +
> > +void p2m_get_ioreq_server(struct domain *d, p2m_type_t t,
> > + unsigned long *flags,
> > + struct hvm_ioreq_server **s)
> > +{
> > + struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > +
> > + if ( t != p2m_ioreq_server )
> > + {
> > + *s = NULL;
> > + *flags = 0;
> > + return;
> > + }
> > +
> > + spin_lock(&p2m->ioreq.lock);
> > +
> > + *s = p2m->ioreq.server;
> > + *flags = p2m->ioreq.flags;
> > +
> > + spin_unlock(&p2m->ioreq.lock);
> > +}
> > +
> > +void p2m_destroy_ioreq_server(struct domain *d,
> > + struct hvm_ioreq_server *s)
> > +{
> > + struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > +
> > + spin_lock(&p2m->ioreq.lock);
> > +
> > + if ( p2m->ioreq.server == s )
> > + {
> > + p2m->ioreq.server = NULL;
> > + p2m->ioreq.flags = 0;
> > + }
> > +
> > + p2m_memory_type_changed(d);
>
> Do we really want to call this unconditionally, even if the server being
> destroyed here isn't the server associated with p2m_ioreq_server? It
> shouldn't have a functional impact, but it will slow things down as the
> ept tables are unnecessarily rebuilt.
>
True. That should be conditional.
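The agreed fix amounts to moving the flush inside the ownership check. A stand-alone sketch, with a counter standing in for the (expensive) p2m_memory_type_changed() rebuild:

```c
#include <assert.h>
#include <stddef.h>

struct hvm_ioreq_server { int id; };

struct p2m_state {
    struct hvm_ioreq_server *server;
    unsigned long flags;
    unsigned int flush_count;   /* stand-in for p2m_memory_type_changed() */
};

/* Only trigger the EPT permission rebuild when the dying server
 * actually owned the p2m_ioreq_server type; tearing down an
 * unrelated server should be free. */
static void destroy_ioreq_server(struct p2m_state *p2m,
                                 struct hvm_ioreq_server *s)
{
    if ( p2m->server == s )
    {
        p2m->server = NULL;
        p2m->flags = 0;
        p2m->flush_count++;     /* conditional now */
    }
}
```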
> > diff --git a/xen/include/public/hvm/hvm_op.h
> b/xen/include/public/hvm/hvm_op.h
> > index a1eae52..e7fc75d 100644
> > --- a/xen/include/public/hvm/hvm_op.h
> > +++ b/xen/include/public/hvm/hvm_op.h
> > @@ -489,6 +489,30 @@ struct xen_hvm_altp2m_op {
> > typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
> > DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
> >
> > +#if defined(__XEN__) || defined(__XEN_TOOLS__)
> > +
> > +#define HVMOP_map_mem_type_to_ioreq_server 26
> > +struct xen_hvm_map_mem_type_to_ioreq_server {
> > + domid_t domid; /* IN - domain to be serviced */
> > + ioservid_t id; /* IN - server id */
> > + hvmmem_type_t type; /* IN - memory type */
>
> What is this 'type' for? Do you expect to map other types of memory
> this way in the future?
>
As stated in the design, yes.
Paul
> -George
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Thread overview: 8+ messages
2016-03-16 12:22 [PATCH 3/3] Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2016-03-22 17:26 ` George Dunlap
2016-03-22 17:51 ` Paul Durrant [this message]
2016-03-23 9:22 ` Jan Beulich
2016-03-23 9:44 ` Yu, Zhang
2016-03-23 10:10 ` Jan Beulich
2016-03-23 10:37 ` Andrew Cooper
2016-03-23 11:43 ` George Dunlap