xen-devel.lists.xenproject.org archive mirror
* [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers
@ 2019-09-03 16:14 Roger Pau Monne
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup Roger Pau Monne
                   ` (10 more replies)
  0 siblings, 11 replies; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich, Roger Pau Monne

Internal ioreq servers are implemented by a single function that
handles ioreqs inside the hypervisor.

The motivation behind this change is to switch vPCI to become an
internal ioreq server, so that accesses to the PCI config space can be
multiplexed between devices handled by vPCI and devices handled by other
ioreq servers.

The implementation is fairly simple and limited to what's needed by
vPCI, but can be expanded in the future if other more complex users
appear.
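
For reference, an internal server boils down to a single handler
matching the function pointer type introduced later in the series. A
minimal sketch (the name and body are illustrative only):

    /* Hypothetical internal ioreq handler; signature as per patch 04. */
    static int example_ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
    {
        if ( req->dir == IOREQ_READ )
            req->data = ~0ul;       /* e.g. all-ones for unhandled reads */

        return X86EMUL_OKAY;
    }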

The series can also be found at:

git://xenbits.xen.org/people/royger/xen.git ioreq_vpci_v2

Thanks, Roger.

Roger Pau Monne (11):
  ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  ioreq: terminate cf8 handling at hypervisor level
  ioreq: switch selection and forwarding to use ioservid_t
  ioreq: add fields to allow internal ioreq servers
  ioreq: add internal ioreq initialization support
  ioreq: allow dispatching ioreqs to internal servers
  ioreq: allow registering internal ioreq server handler
  ioreq: allow decoding accesses to MMCFG regions
  vpci: register as an internal ioreq server
  ioreq: split the code to detect PCI config space accesses
  ioreq: provide support for long-running operations...

 tools/tests/vpci/Makefile           |   5 +-
 tools/tests/vpci/emul.h             |   4 +
 xen/arch/x86/hvm/dm.c               |  19 +-
 xen/arch/x86/hvm/dom0_build.c       |   9 +-
 xen/arch/x86/hvm/emulate.c          |  14 +-
 xen/arch/x86/hvm/hvm.c              |   7 +-
 xen/arch/x86/hvm/io.c               | 248 ++--------------
 xen/arch/x86/hvm/ioreq.c            | 434 ++++++++++++++++++++--------
 xen/arch/x86/hvm/stdvga.c           |   8 +-
 xen/arch/x86/mm/p2m.c               |  20 +-
 xen/arch/x86/physdev.c              |   6 +-
 xen/drivers/passthrough/x86/iommu.c |   2 +-
 xen/drivers/vpci/header.c           |  61 ++--
 xen/drivers/vpci/vpci.c             |  75 ++++-
 xen/include/asm-x86/hvm/domain.h    |  35 ++-
 xen/include/asm-x86/hvm/io.h        |  29 +-
 xen/include/asm-x86/hvm/ioreq.h     |  17 +-
 xen/include/asm-x86/hvm/vcpu.h      |   3 +-
 xen/include/asm-x86/p2m.h           |   9 +-
 xen/include/public/hvm/dm_op.h      |   1 +
 xen/include/xen/vpci.h              |  28 +-
 21 files changed, 559 insertions(+), 475 deletions(-)

-- 
2.22.0



* [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 10:44   ` Paul Durrant
  2019-09-10 13:28   ` Jan Beulich
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level Roger Pau Monne
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich, Roger Pau Monne

The loop in FOR_EACH_IOREQ_SERVER runs backwards, hence the cleanup on
failure needs to be done forwards.
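
To illustrate (a simplified sketch, not part of the patch; add_vcpu()
and remove_vcpu() are placeholders for the real helpers):

    /* FOR_EACH_IOREQ_SERVER visits ids MAX_NR_IOREQ_SERVERS - 1, ..., 0. */
    for ( id = MAX_NR_IOREQ_SERVERS; id-- != 0; )
        if ( add_vcpu(id) )     /* on failure ids id+1 .. MAX-1 are done */
            goto fail;

    return 0;

 fail:
    /* Roll back forwards over the servers already handled. */
    while ( ++id != MAX_NR_IOREQ_SERVERS )
        remove_vcpu(id);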

Fixes: 97a5a3e30161 ('x86/hvm/ioreq: maintain an array of ioreq servers rather than a list')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/ioreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index a79cabb680..692b710b02 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1195,7 +1195,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
     return 0;
 
  fail:
-    while ( id-- != 0 )
+    while ( ++id != MAX_NR_IOREQ_SERVERS )
     {
         s = GET_IOREQ_SERVER(d, id);
 
-- 
2.22.0



* [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-03 17:13   ` Andrew Cooper
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t Roger Pau Monne
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich, Roger Pau Monne

Do not forward accesses to cf8 to external emulators: decoding of PCI
accesses is handled by Xen, and emulators can request handling of
config space accesses for their devices using the provided ioreq
interface.

Fully terminate cf8 accesses at the hypervisor level, by improving the
existing hvm_access_cf8 helper to also handle register reads, and
always return X86EMUL_OKAY in order to terminate the emulation.

Also return an error to ioreq servers attempting to map the PCI IO
ports (0xcf8-0xcff), as those are handled by Xen.

Note that without this change, in the absence of an external emulator
that catches accesses to cf8, read requests to the register would
misbehave, as the internal ioreq handler did not handle those.
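
For context, a config space access via the legacy mechanism is a
32-bit write of an address to port 0xcf8 followed by an access to the
data ports 0xcfc-0xcff. A sketch of the decode, mirroring what the
existing hvm_pci_decode_addr() helper computes:

    /*
     * cf8 layout: bit 31 enable, bits 23-16 bus, bits 15-11 device,
     * bits 10-8 function, bits 7-2 dword-aligned register.
     */
    bool enabled      = cf8 & (1u << 31);
    unsigned int bus  = (cf8 >> 16) & 0xff;
    unsigned int slot = (cf8 >> 11) & 0x1f;
    unsigned int func = (cf8 >>  8) & 0x7;
    unsigned int reg  = (cf8 & 0xfc) | (addr & 3);  /* addr: data port */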

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/ioreq.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 692b710b02..69652e1080 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
     switch ( type )
     {
     case XEN_DMOP_IO_RANGE_PORT:
+        rc = -EINVAL;
+        /* PCI config space accesses are handled internally. */
+        if ( start < 0xcf8 + 8 && 0xcf8 <= end )
+            goto out;
+        else
+            /* fallthrough. */
     case XEN_DMOP_IO_RANGE_MEMORY:
     case XEN_DMOP_IO_RANGE_PCI:
         r = s->range[type];
@@ -1518,11 +1524,15 @@ static int hvm_access_cf8(
 {
     struct domain *d = current->domain;
 
-    if ( dir == IOREQ_WRITE && bytes == 4 )
+    if ( bytes != 4 )
+        return X86EMUL_OKAY;
+
+    if ( dir == IOREQ_WRITE )
         d->arch.hvm.pci_cf8 = *val;
+    else
+        *val = d->arch.hvm.pci_cf8;
 
-    /* We always need to fall through to the catch all emulator */
-    return X86EMUL_UNHANDLEABLE;
+    return X86EMUL_OKAY;
 }
 
 void hvm_ioreq_init(struct domain *d)
-- 
2.22.0



* [Xen-devel] [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup Roger Pau Monne
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 12:31   ` Paul Durrant
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers Roger Pau Monne
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich, Roger Pau Monne

hvm_select_ioreq_server and hvm_send_ioreq were both using
hvm_ioreq_server pointers directly; switch them to use ioservid_t in
order to select and forward ioreqs.

This is a preparatory change, since future patches will use the ioreq
server id in order to differentiate between internal and external
ioreq servers.
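
The resulting calling convention at the call sites, with
XEN_INVALID_IOSERVID acting as the NULL replacement (sketch matching
the emulate.c hunk below):

    ioservid_t id = hvm_select_ioreq_server(currd, &p);

    if ( id == XEN_INVALID_IOSERVID )
        /* No suitable backing DM: handle with the null intercept. */
        rc = hvm_process_io_intercept(&null_handler, &p);
    else
        rc = hvm_send_ioreq(id, &p, 0);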

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/dm.c           |  2 +-
 xen/arch/x86/hvm/emulate.c      | 14 +++++++-------
 xen/arch/x86/hvm/ioreq.c        | 24 ++++++++++++------------
 xen/arch/x86/hvm/stdvga.c       |  8 ++++----
 xen/arch/x86/mm/p2m.c           | 20 ++++++++++----------
 xen/include/asm-x86/hvm/ioreq.h |  5 ++---
 xen/include/asm-x86/p2m.h       |  9 ++++-----
 xen/include/public/hvm/dm_op.h  |  1 +
 8 files changed, 41 insertions(+), 42 deletions(-)

diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index d6d0e8be89..c2fca9f729 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -263,7 +263,7 @@ static int set_mem_type(struct domain *d,
             return -EOPNOTSUPP;
 
         /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. */
-        if ( !p2m_get_ioreq_server(d, &flags) )
+        if ( p2m_get_ioreq_server(d, &flags) == XEN_INVALID_IOSERVID )
             return -EINVAL;
     }
 
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d75d3e6fd6..51d2fcba2d 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -254,7 +254,7 @@ static int hvmemul_do_io(
          * However, there's no cheap approach to avoid above situations in xen,
          * so the device model side needs to check the incoming ioreq event.
          */
-        struct hvm_ioreq_server *s = NULL;
+        ioservid_t id = XEN_INVALID_IOSERVID;
         p2m_type_t p2mt = p2m_invalid;
 
         if ( is_mmio )
@@ -267,9 +267,9 @@ static int hvmemul_do_io(
             {
                 unsigned int flags;
 
-                s = p2m_get_ioreq_server(currd, &flags);
+                id = p2m_get_ioreq_server(currd, &flags);
 
-                if ( s == NULL )
+                if ( id == XEN_INVALID_IOSERVID )
                 {
                     rc = X86EMUL_RETRY;
                     vio->io_req.state = STATE_IOREQ_NONE;
@@ -289,18 +289,18 @@ static int hvmemul_do_io(
             }
         }
 
-        if ( !s )
-            s = hvm_select_ioreq_server(currd, &p);
+        if ( id == XEN_INVALID_IOSERVID )
+            id = hvm_select_ioreq_server(currd, &p);
 
         /* If there is no suitable backing DM, just ignore accesses */
-        if ( !s )
+        if ( id == XEN_INVALID_IOSERVID )
         {
             rc = hvm_process_io_intercept(&null_handler, &p);
             vio->io_req.state = STATE_IOREQ_NONE;
         }
         else
         {
-            rc = hvm_send_ioreq(s, &p, 0);
+            rc = hvm_send_ioreq(id, &p, 0);
             if ( rc != X86EMUL_RETRY || currd->is_shutting_down )
                 vio->io_req.state = STATE_IOREQ_NONE;
             else if ( !hvm_ioreq_needs_completion(&vio->io_req) )
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 69652e1080..95492bc111 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -39,6 +39,7 @@ static void set_ioreq_server(struct domain *d, unsigned int id,
 {
     ASSERT(id < MAX_NR_IOREQ_SERVERS);
     ASSERT(!s || !d->arch.hvm.ioreq_server.server[id]);
+    BUILD_BUG_ON(MAX_NR_IOREQ_SERVERS >= XEN_INVALID_IOSERVID);
 
     d->arch.hvm.ioreq_server.server[id] = s;
 }
@@ -868,7 +869,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
     domain_pause(d);
 
-    p2m_set_ioreq_server(d, 0, s);
+    p2m_set_ioreq_server(d, 0, id);
 
     hvm_ioreq_server_disable(s);
 
@@ -1131,7 +1132,7 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     if ( s->emulator != current->domain )
         goto out;
 
-    rc = p2m_set_ioreq_server(d, flags, s);
+    rc = p2m_set_ioreq_server(d, flags, id);
 
  out:
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -1255,8 +1256,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 }
 
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p)
+ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
 {
     struct hvm_ioreq_server *s;
     uint32_t cf8;
@@ -1265,7 +1265,7 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
     unsigned int id;
 
     if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
-        return NULL;
+        return XEN_INVALID_IOSERVID;
 
     cf8 = d->arch.hvm.pci_cf8;
 
@@ -1320,7 +1320,7 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
             start = addr;
             end = start + p->size - 1;
             if ( rangeset_contains_range(r, start, end) )
-                return s;
+                return id;
 
             break;
 
@@ -1329,7 +1329,7 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
             end = hvm_mmio_last_byte(p);
 
             if ( rangeset_contains_range(r, start, end) )
-                return s;
+                return id;
 
             break;
 
@@ -1338,14 +1338,14 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
             {
                 p->type = IOREQ_TYPE_PCI_CONFIG;
                 p->addr = addr;
-                return s;
+                return id;
             }
 
             break;
         }
     }
 
-    return NULL;
+    return XEN_INVALID_IOSERVID;
 }
 
 static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
@@ -1441,12 +1441,12 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
     return X86EMUL_OKAY;
 }
 
-int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
-                   bool buffered)
+int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p, bool buffered)
 {
     struct vcpu *curr = current;
     struct domain *d = curr->domain;
     struct hvm_ioreq_vcpu *sv;
+    struct hvm_ioreq_server *s = get_ioreq_server(d, id);
 
     ASSERT(s);
 
@@ -1512,7 +1512,7 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
         if ( !s->enabled )
             continue;
 
-        if ( hvm_send_ioreq(s, p, buffered) == X86EMUL_UNHANDLEABLE )
+        if ( hvm_send_ioreq(id, p, buffered) == X86EMUL_UNHANDLEABLE )
             failed++;
     }
 
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index bd398dbb1b..a689269712 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -466,7 +466,7 @@ static int stdvga_mem_write(const struct hvm_io_handler *handler,
         .dir = IOREQ_WRITE,
         .data = data,
     };
-    struct hvm_ioreq_server *srv;
+    ioservid_t id;
 
     if ( !stdvga_cache_is_enabled(s) || !s->stdvga )
         goto done;
@@ -507,11 +507,11 @@ static int stdvga_mem_write(const struct hvm_io_handler *handler,
     }
 
  done:
-    srv = hvm_select_ioreq_server(current->domain, &p);
-    if ( !srv )
+    id = hvm_select_ioreq_server(current->domain, &p);
+    if ( id == XEN_INVALID_IOSERVID )
         return X86EMUL_UNHANDLEABLE;
 
-    return hvm_send_ioreq(srv, &p, 1);
+    return hvm_send_ioreq(id, &p, 1);
 }
 
 static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 8a5229ee21..43849cbbd9 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -102,6 +102,7 @@ static int p2m_initialise(struct domain *d, struct p2m_domain *p2m)
         p2m_pt_init(p2m);
 
     spin_lock_init(&p2m->ioreq.lock);
+    p2m->ioreq.server = XEN_INVALID_IOSERVID;
 
     return ret;
 }
@@ -361,7 +362,7 @@ void p2m_memory_type_changed(struct domain *d)
 
 int p2m_set_ioreq_server(struct domain *d,
                          unsigned int flags,
-                         struct hvm_ioreq_server *s)
+                         ioservid_t id)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int rc;
@@ -376,16 +377,16 @@ int p2m_set_ioreq_server(struct domain *d,
     if ( flags == 0 )
     {
         rc = -EINVAL;
-        if ( p2m->ioreq.server != s )
+        if ( p2m->ioreq.server != id )
             goto out;
 
-        p2m->ioreq.server = NULL;
+        p2m->ioreq.server = XEN_INVALID_IOSERVID;
         p2m->ioreq.flags = 0;
     }
     else
     {
         rc = -EBUSY;
-        if ( p2m->ioreq.server != NULL )
+        if ( p2m->ioreq.server != XEN_INVALID_IOSERVID )
             goto out;
 
         /*
@@ -397,7 +398,7 @@ int p2m_set_ioreq_server(struct domain *d,
         if ( read_atomic(&p2m->ioreq.entry_count) )
             goto out;
 
-        p2m->ioreq.server = s;
+        p2m->ioreq.server = id;
         p2m->ioreq.flags = flags;
     }
 
@@ -409,19 +410,18 @@ int p2m_set_ioreq_server(struct domain *d,
     return rc;
 }
 
-struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
-                                              unsigned int *flags)
+ioservid_t p2m_get_ioreq_server(struct domain *d, unsigned int *flags)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    struct hvm_ioreq_server *s;
+    ioservid_t id;
 
     spin_lock(&p2m->ioreq.lock);
 
-    s = p2m->ioreq.server;
+    id = p2m->ioreq.server;
     *flags = p2m->ioreq.flags;
 
     spin_unlock(&p2m->ioreq.lock);
-    return s;
+    return id;
 }
 
 void p2m_enable_hardware_log_dirty(struct domain *d)
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index e2588e912f..65491c48d2 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -47,9 +47,8 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v);
 void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v);
 void hvm_destroy_all_ioreq_servers(struct domain *d);
 
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p);
-int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
+ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p);
+int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p,
                    bool buffered);
 unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 94285db1b4..99a1dab311 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -354,7 +354,7 @@ struct p2m_domain {
           * ioreq server who's responsible for the emulation of
           * gfns with specific p2m type(for now, p2m_ioreq_server).
           */
-         struct hvm_ioreq_server *server;
+         ioservid_t server;
          /*
           * flags specifies whether read, write or both operations
           * are to be emulated by an ioreq server.
@@ -819,7 +819,7 @@ static inline p2m_type_t p2m_recalc_type_range(bool recalc, p2m_type_t t,
     if ( !recalc || !p2m_is_changeable(t) )
         return t;
 
-    if ( t == p2m_ioreq_server && p2m->ioreq.server != NULL )
+    if ( t == p2m_ioreq_server && p2m->ioreq.server != XEN_INVALID_IOSERVID )
         return t;
 
     return p2m_is_logdirty_range(p2m, gfn_start, gfn_end) ? p2m_ram_logdirty
@@ -938,9 +938,8 @@ static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt, mfn_t mfn)
 }
 
 int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
-                         struct hvm_ioreq_server *s);
-struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
-                                              unsigned int *flags);
+                         ioservid_t id);
+ioservid_t p2m_get_ioreq_server(struct domain *d, unsigned int *flags);
 
 static inline int p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
                                    p2m_type_t ot, mfn_t nfn, mfn_t ofn,
diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
index d3b554d019..8725cc20d3 100644
--- a/xen/include/public/hvm/dm_op.h
+++ b/xen/include/public/hvm/dm_op.h
@@ -54,6 +54,7 @@
  */
 
 typedef uint16_t ioservid_t;
+#define XEN_INVALID_IOSERVID 0xffff
 
 /*
  * XEN_DMOP_create_ioreq_server: Instantiate a new IOREQ Server for a
-- 
2.22.0



* [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (2 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 12:34   ` Paul Durrant
  2019-09-20 10:53   ` Jan Beulich
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support Roger Pau Monne
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

Internal ioreq servers are plain function handlers implemented inside
the hypervisor. Note that most fields used by current (external) ioreq
servers are not needed for internal ones, and hence have been placed
inside a struct and packed in a union together with the
internal-specific fields: an opaque data pointer and a function
pointer to the handler.

This is required in order to have PCI config accesses forwarded to
either external ioreq servers or internal ones (i.e. QEMU emulated
devices vs vPCI passthrough), and is a first step towards allowing
unprivileged domains to use vPCI.
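
Which union member is live is keyed off the server id, as made
explicit by a later patch. A sketch of the intended dispatch
(hvm_ioreq_is_internal() is added in patch 05; forward_to_emulator()
is a made-up name for the external path):

    if ( hvm_ioreq_is_internal(id) )
        rc = s->handler(curr, &p, s->data);     /* internal member */
    else
        rc = forward_to_emulator(s, &p);        /* external members */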

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Do not add an internal field to the ioreq server struct, whether a
   server is internal or external can already be inferred from the id.
 - Add an extra parameter to the internal handler in order to pass
   user-provided opaque data to the handler.
---
 xen/include/asm-x86/hvm/domain.h | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index bcc5621797..9fbe83f45a 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -52,21 +52,29 @@ struct hvm_ioreq_vcpu {
 #define MAX_NR_IO_RANGES  256
 
 struct hvm_ioreq_server {
-    struct domain          *target, *emulator;
-
+    struct domain          *target;
     /* Lock to serialize toolstack modifications */
     spinlock_t             lock;
-
-    struct hvm_ioreq_page  ioreq;
-    struct list_head       ioreq_vcpu_list;
-    struct hvm_ioreq_page  bufioreq;
-
-    /* Lock to serialize access to buffered ioreq ring */
-    spinlock_t             bufioreq_lock;
-    evtchn_port_t          bufioreq_evtchn;
     struct rangeset        *range[NR_IO_RANGE_TYPES];
     bool                   enabled;
-    uint8_t                bufioreq_handling;
+
+    union {
+        struct {
+            struct domain          *emulator;
+            struct hvm_ioreq_page  ioreq;
+            struct list_head       ioreq_vcpu_list;
+            struct hvm_ioreq_page  bufioreq;
+
+            /* Lock to serialize access to buffered ioreq ring */
+            spinlock_t             bufioreq_lock;
+            evtchn_port_t          bufioreq_evtchn;
+            uint8_t                bufioreq_handling;
+        };
+        struct {
+            void                   *data;
+            int (*handler)(struct vcpu *v, ioreq_t *, void *);
+        };
+    };
 };
 
 /*
-- 
2.22.0



* [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (3 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 12:59   ` Paul Durrant
  2019-09-20 11:15   ` Jan Beulich
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers Roger Pau Monne
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich, Roger Pau Monne

Add support for internal ioreq servers to the initialization and
deinitialization routines, prevent some functions from being executed
against internal ioreq servers, and add guards so that only internal
callers can modify internal ioreq servers. External callers (i.e. from
hypercalls) are only allowed to deal with external ioreq servers.
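
With the constants introduced below the server id space is partitioned
so that internal and external servers can be told apart from the id
alone (illustrative layout):

    /*
     * ids 0 .. MAX_NR_EXTERNAL_IOREQ_SERVERS - 1 (0-7): external servers,
     *     reachable via the dm_op hypercall interface.
     * ids MAX_NR_EXTERNAL_IOREQ_SERVERS .. MAX_NR_IOREQ_SERVERS - 1 (8):
     *     internal servers, usable from hypervisor code only; dm_op
     *     callers touching them get -EPERM.
     */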

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Do not pass an 'internal' parameter to most functions, and instead
   use the id to key whether an ioreq server is internal or external.
 - Prevent enabling an internal server without a handler.
---
 xen/arch/x86/hvm/dm.c            |  17 ++-
 xen/arch/x86/hvm/ioreq.c         | 173 +++++++++++++++++++------------
 xen/include/asm-x86/hvm/domain.h |   5 +-
 xen/include/asm-x86/hvm/ioreq.h  |   8 +-
 4 files changed, 135 insertions(+), 68 deletions(-)

diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index c2fca9f729..6a3682e58c 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -417,7 +417,7 @@ static int dm_op(const struct dmop_args *op_args)
             break;
 
         rc = hvm_create_ioreq_server(d, data->handle_bufioreq,
-                                     &data->id);
+                                     &data->id, false);
         break;
     }
 
@@ -450,6 +450,9 @@ static int dm_op(const struct dmop_args *op_args)
         rc = -EINVAL;
         if ( data->pad )
             break;
+        rc = -EPERM;
+        if ( hvm_ioreq_is_internal(data->id) )
+            break;
 
         rc = hvm_map_io_range_to_ioreq_server(d, data->id, data->type,
                                               data->start, data->end);
@@ -464,6 +467,9 @@ static int dm_op(const struct dmop_args *op_args)
         rc = -EINVAL;
         if ( data->pad )
             break;
+        rc = -EPERM;
+        if ( hvm_ioreq_is_internal(data->id) )
+            break;
 
         rc = hvm_unmap_io_range_from_ioreq_server(d, data->id, data->type,
                                                   data->start, data->end);
@@ -481,6 +487,9 @@ static int dm_op(const struct dmop_args *op_args)
         rc = -EOPNOTSUPP;
         if ( !hap_enabled(d) )
             break;
+        rc = -EPERM;
+        if ( hvm_ioreq_is_internal(data->id) )
+            break;
 
         if ( first_gfn == 0 )
             rc = hvm_map_mem_type_to_ioreq_server(d, data->id,
@@ -528,6 +537,9 @@ static int dm_op(const struct dmop_args *op_args)
         rc = -EINVAL;
         if ( data->pad )
             break;
+        rc = -EPERM;
+        if ( hvm_ioreq_is_internal(data->id) )
+            break;
 
         rc = hvm_set_ioreq_server_state(d, data->id, !!data->enabled);
         break;
@@ -541,6 +553,9 @@ static int dm_op(const struct dmop_args *op_args)
         rc = -EINVAL;
         if ( data->pad )
             break;
+        rc = -EPERM;
+        if ( hvm_ioreq_is_internal(data->id) )
+            break;
 
         rc = hvm_destroy_ioreq_server(d, data->id);
         break;
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 95492bc111..dbc5e6b4c5 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -59,10 +59,11 @@ static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
 /*
  * Iterate over all possible ioreq servers.
  *
- * NOTE: The iteration is backwards such that more recently created
- *       ioreq servers are favoured in hvm_select_ioreq_server().
- *       This is a semantic that previously existed when ioreq servers
- *       were held in a linked list.
+ * NOTE: The iteration is backwards such that internal and more recently
+ *       created external ioreq servers are favoured in
+ *       hvm_select_ioreq_server().
+ *       This is a semantic that previously existed for external servers when
+ *       ioreq servers were held in a linked list.
  */
 #define FOR_EACH_IOREQ_SERVER(d, id, s) \
     for ( (id) = MAX_NR_IOREQ_SERVERS; (id) != 0; ) \
@@ -70,6 +71,12 @@ static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
             continue; \
         else
 
+#define FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s) \
+    for ( (id) = MAX_NR_EXTERNAL_IOREQ_SERVERS; (id) != 0; ) \
+        if ( !(s = GET_IOREQ_SERVER(d, --(id))) ) \
+            continue; \
+        else
+
 static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
 {
     shared_iopage_t *p = s->ioreq.va;
@@ -86,7 +93,7 @@ bool hvm_io_pending(struct vcpu *v)
     struct hvm_ioreq_server *s;
     unsigned int id;
 
-    FOR_EACH_IOREQ_SERVER(d, id, s)
+    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
 
@@ -190,7 +197,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
         return false;
     }
 
-    FOR_EACH_IOREQ_SERVER(d, id, s)
+    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
 
@@ -430,7 +437,7 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
 
-    FOR_EACH_IOREQ_SERVER(d, id, s)
+    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
     {
         if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
         {
@@ -688,7 +695,7 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
     return rc;
 }
 
-static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s, bool internal)
 {
     struct hvm_ioreq_vcpu *sv;
 
@@ -697,29 +704,40 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
     if ( s->enabled )
         goto done;
 
-    hvm_remove_ioreq_gfn(s, false);
-    hvm_remove_ioreq_gfn(s, true);
+    if ( !internal )
+    {
+        hvm_remove_ioreq_gfn(s, false);
+        hvm_remove_ioreq_gfn(s, true);
 
-    s->enabled = true;
+        list_for_each_entry ( sv,
+                              &s->ioreq_vcpu_list,
+                              list_entry )
+            hvm_update_ioreq_evtchn(s, sv);
+    }
+    else if ( !s->handler )
+    {
+        ASSERT_UNREACHABLE();
+        goto done;
+    }
 
-    list_for_each_entry ( sv,
-                          &s->ioreq_vcpu_list,
-                          list_entry )
-        hvm_update_ioreq_evtchn(s, sv);
+    s->enabled = true;
 
   done:
     spin_unlock(&s->lock);
 }
 
-static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s, bool internal)
 {
     spin_lock(&s->lock);
 
     if ( !s->enabled )
         goto done;
 
-    hvm_add_ioreq_gfn(s, true);
-    hvm_add_ioreq_gfn(s, false);
+    if ( !internal )
+    {
+        hvm_add_ioreq_gfn(s, true);
+        hvm_add_ioreq_gfn(s, false);
+    }
 
     s->enabled = false;
 
@@ -736,33 +754,39 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     int rc;
 
     s->target = d;
-
-    get_knownalive_domain(currd);
-    s->emulator = currd;
-
     spin_lock_init(&s->lock);
-    INIT_LIST_HEAD(&s->ioreq_vcpu_list);
-    spin_lock_init(&s->bufioreq_lock);
-
-    s->ioreq.gfn = INVALID_GFN;
-    s->bufioreq.gfn = INVALID_GFN;
 
     rc = hvm_ioreq_server_alloc_rangesets(s, id);
     if ( rc )
         return rc;
 
-    s->bufioreq_handling = bufioreq_handling;
-
-    for_each_vcpu ( d, v )
+    if ( !hvm_ioreq_is_internal(id) )
     {
-        rc = hvm_ioreq_server_add_vcpu(s, v);
-        if ( rc )
-            goto fail_add;
+        get_knownalive_domain(currd);
+
+        s->emulator = currd;
+        INIT_LIST_HEAD(&s->ioreq_vcpu_list);
+        spin_lock_init(&s->bufioreq_lock);
+
+        s->ioreq.gfn = INVALID_GFN;
+        s->bufioreq.gfn = INVALID_GFN;
+
+        s->bufioreq_handling = bufioreq_handling;
+
+        for_each_vcpu ( d, v )
+        {
+            rc = hvm_ioreq_server_add_vcpu(s, v);
+            if ( rc )
+                goto fail_add;
+        }
     }
+    else
+        s->handler = NULL;
 
     return 0;
 
  fail_add:
+    ASSERT(!hvm_ioreq_is_internal(id));
     hvm_ioreq_server_remove_all_vcpus(s);
     hvm_ioreq_server_unmap_pages(s);
 
@@ -772,30 +796,34 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     return rc;
 }
 
-static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s, bool internal)
 {
     ASSERT(!s->enabled);
-    hvm_ioreq_server_remove_all_vcpus(s);
-
-    /*
-     * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
-     *       hvm_ioreq_server_free_pages() in that order.
-     *       This is because the former will do nothing if the pages
-     *       are not mapped, leaving the page to be freed by the latter.
-     *       However if the pages are mapped then the former will set
-     *       the page_info pointer to NULL, meaning the latter will do
-     *       nothing.
-     */
-    hvm_ioreq_server_unmap_pages(s);
-    hvm_ioreq_server_free_pages(s);
 
     hvm_ioreq_server_free_rangesets(s);
 
-    put_domain(s->emulator);
+    if ( !internal )
+    {
+        hvm_ioreq_server_remove_all_vcpus(s);
+
+        /*
+         * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
+         *       hvm_ioreq_server_free_pages() in that order.
+         *       This is because the former will do nothing if the pages
+         *       are not mapped, leaving the page to be freed by the latter.
+         *       However if the pages are mapped then the former will set
+         *       the page_info pointer to NULL, meaning the latter will do
+         *       nothing.
+         */
+        hvm_ioreq_server_unmap_pages(s);
+        hvm_ioreq_server_free_pages(s);
+
+        put_domain(s->emulator);
+    }
 }
 
 int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
-                            ioservid_t *id)
+                            ioservid_t *id, bool internal)
 {
     struct hvm_ioreq_server *s;
     unsigned int i;
@@ -811,7 +839,9 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
     domain_pause(d);
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
 
-    for ( i = 0; i < MAX_NR_IOREQ_SERVERS; i++ )
+    for ( i = (internal ? MAX_NR_EXTERNAL_IOREQ_SERVERS : 0);
+          i < (internal ? MAX_NR_IOREQ_SERVERS : MAX_NR_EXTERNAL_IOREQ_SERVERS);
+          i++ )
     {
         if ( !GET_IOREQ_SERVER(d, i) )
             break;
@@ -821,6 +851,9 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
     if ( i >= MAX_NR_IOREQ_SERVERS )
         goto fail;
 
+    ASSERT((internal &&
+            i >= MAX_NR_EXTERNAL_IOREQ_SERVERS && i < MAX_NR_IOREQ_SERVERS) ||
+           (!internal && i < MAX_NR_EXTERNAL_IOREQ_SERVERS));
     /*
      * It is safe to call set_ioreq_server() prior to
      * hvm_ioreq_server_init() since the target domain is paused.
@@ -864,20 +897,21 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
         goto out;
 
     rc = -EPERM;
-    if ( s->emulator != current->domain )
+    /* NB: internal servers cannot be destroyed. */
+    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
         goto out;
 
     domain_pause(d);
 
     p2m_set_ioreq_server(d, 0, id);
 
-    hvm_ioreq_server_disable(s);
+    hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
 
     /*
      * It is safe to call hvm_ioreq_server_deinit() prior to
      * set_ioreq_server() since the target domain is paused.
      */
-    hvm_ioreq_server_deinit(s);
+    hvm_ioreq_server_deinit(s, hvm_ioreq_is_internal(id));
     set_ioreq_server(d, id, NULL);
 
     domain_unpause(d);
@@ -909,7 +943,8 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
         goto out;
 
     rc = -EPERM;
-    if ( s->emulator != current->domain )
+    /* NB: don't allow fetching information from internal ioreq servers. */
+    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
         goto out;
 
     if ( ioreq_gfn || bufioreq_gfn )
@@ -956,7 +991,7 @@ int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
         goto out;
 
     rc = -EPERM;
-    if ( s->emulator != current->domain )
+    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
         goto out;
 
     rc = hvm_ioreq_server_alloc_pages(s);
@@ -1010,7 +1045,7 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
         goto out;
 
     rc = -EPERM;
-    if ( s->emulator != current->domain )
+    if ( !hvm_ioreq_is_internal(id) && s->emulator != current->domain )
         goto out;
 
     switch ( type )
@@ -1068,7 +1103,7 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
         goto out;
 
     rc = -EPERM;
-    if ( s->emulator != current->domain )
+    if ( !hvm_ioreq_is_internal(id) && s->emulator != current->domain )
         goto out;
 
     switch ( type )
@@ -1128,6 +1163,14 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     if ( !s )
         goto out;
 
+    /*
+     * NB: do not support mapping internal ioreq servers to memory types, as
+     * the current internal ioreq servers don't need this feature and it's not
+     * been tested.
+     */
+    rc = -EINVAL;
+    if ( hvm_ioreq_is_internal(id) )
+        goto out;
     rc = -EPERM;
     if ( s->emulator != current->domain )
         goto out;
@@ -1163,15 +1206,15 @@ int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
         goto out;
 
     rc = -EPERM;
-    if ( s->emulator != current->domain )
+    if ( !hvm_ioreq_is_internal(id) && s->emulator != current->domain )
         goto out;
 
     domain_pause(d);
 
     if ( enabled )
-        hvm_ioreq_server_enable(s);
+        hvm_ioreq_server_enable(s, hvm_ioreq_is_internal(id));
     else
-        hvm_ioreq_server_disable(s);
+        hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
 
     domain_unpause(d);
 
@@ -1190,7 +1233,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
 
-    FOR_EACH_IOREQ_SERVER(d, id, s)
+    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
     {
         rc = hvm_ioreq_server_add_vcpu(s, v);
         if ( rc )
@@ -1202,7 +1245,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
     return 0;
 
  fail:
-    while ( ++id != MAX_NR_IOREQ_SERVERS )
+    while ( ++id != MAX_NR_EXTERNAL_IOREQ_SERVERS )
     {
         s = GET_IOREQ_SERVER(d, id);
 
@@ -1224,7 +1267,7 @@ void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
 
-    FOR_EACH_IOREQ_SERVER(d, id, s)
+    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
         hvm_ioreq_server_remove_vcpu(s, v);
 
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -1241,13 +1284,13 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
-        hvm_ioreq_server_disable(s);
+        hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
 
         /*
          * It is safe to call hvm_ioreq_server_deinit() prior to
          * set_ioreq_server() since the target domain is being destroyed.
          */
-        hvm_ioreq_server_deinit(s);
+        hvm_ioreq_server_deinit(s, hvm_ioreq_is_internal(id));
         set_ioreq_server(d, id, NULL);
 
         xfree(s);
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 9fbe83f45a..9f92838b6e 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -97,7 +97,10 @@ struct hvm_pi_ops {
     void (*vcpu_block)(struct vcpu *);
 };
 
-#define MAX_NR_IOREQ_SERVERS 8
+#define MAX_NR_EXTERNAL_IOREQ_SERVERS 8
+#define MAX_NR_INTERNAL_IOREQ_SERVERS 1
+#define MAX_NR_IOREQ_SERVERS \
+    (MAX_NR_EXTERNAL_IOREQ_SERVERS + MAX_NR_INTERNAL_IOREQ_SERVERS)
 
 struct hvm_domain {
     /* Guest page range used for non-default ioreq servers */
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index 65491c48d2..c3917aa74d 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -24,7 +24,7 @@ bool handle_hvm_io_completion(struct vcpu *v);
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
 
 int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
-                            ioservid_t *id);
+                            ioservid_t *id, bool internal);
 int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id);
 int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
                               unsigned long *ioreq_gfn,
@@ -54,6 +54,12 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
 void hvm_ioreq_init(struct domain *d);
 
+static inline bool hvm_ioreq_is_internal(unsigned int id)
+{
+    ASSERT(id < MAX_NR_IOREQ_SERVERS);
+    return id >= MAX_NR_EXTERNAL_IOREQ_SERVERS;
+}
+
 #endif /* __ASM_X86_HVM_IOREQ_H__ */
 
 /*
-- 
2.22.0



* [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (4 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 13:06   ` Paul Durrant
  2019-09-20 11:35   ` Jan Beulich
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 07/11] ioreq: allow registering internal ioreq server handler Roger Pau Monne
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich, Roger Pau Monne

Internal ioreq servers are always processed first, and ioreqs are
dispatched by calling the handler function. Note this is already the
case due to the implementation of FOR_EACH_IOREQ_SERVER.
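
For reference, with the constants from patch 05 the selection loop
visits the internal range first (a sketch of the effective order):

    /*
     * FOR_EACH_IOREQ_SERVER iterates ids from MAX_NR_IOREQ_SERVERS - 1
     * down to 0. With MAX_NR_EXTERNAL_IOREQ_SERVERS == 8 and a single
     * internal slot the visiting order is:
     *
     *   8 (internal), 7, 6, ..., 0 (external)
     *
     * so an internal server that claims an access wins over external
     * ones.
     */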

Note that hvm_send_ioreq is passed the ioreq server id (following the
earlier switch to ioservid_t), which is used to detect internal
servers and invoke their handler directly.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Avoid having to iterate twice over the list of ioreq servers since
   now internal servers are always processed first by
   FOR_EACH_IOREQ_SERVER.
 - Obtain ioreq server id using pointer arithmetic.
---
 xen/arch/x86/hvm/ioreq.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index dbc5e6b4c5..8331a89eae 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1493,9 +1493,18 @@ int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p, bool buffered)
 
     ASSERT(s);
 
+    if ( hvm_ioreq_is_internal(id) && buffered )
+    {
+        ASSERT_UNREACHABLE();
+        return X86EMUL_UNHANDLEABLE;
+    }
+
     if ( buffered )
         return hvm_send_buffered_ioreq(s, proto_p);
 
+    if ( hvm_ioreq_is_internal(id) )
+        return s->handler(curr, proto_p, s->data);
+
     if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
         return X86EMUL_RETRY;
 
-- 
2.22.0



* [Xen-devel] [PATCH v2 07/11] ioreq: allow registering internal ioreq server handler
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (5 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 13:12   ` Paul Durrant
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 08/11] ioreq: allow decoding accesses to MMCFG regions Roger Pau Monne
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich, Roger Pau Monne

Provide a routine to register the handler for an internal ioreq
server.
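
Expected usage from an internal caller (a sketch; the real hookup for
vPCI is done in patch 09, and example_ioreq_handler() is a
hypothetical name):

    ioservid_t id;
    int rc = hvm_create_ioreq_server(d, HVM_IOREQSRV_BUFIOREQ_OFF, &id, true);

    if ( rc )
        return rc;

    rc = hvm_add_ioreq_handler(d, id, example_ioreq_handler, NULL);
    if ( rc )
        return rc;

    return hvm_set_ioreq_server_state(d, id, true);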

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Allow to provide an opaque data parameter to pass to the handler.
 - Allow changing the handler as long as the server is not enabled.
---
 xen/arch/x86/hvm/ioreq.c        | 35 +++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/ioreq.h |  4 ++++
 2 files changed, 39 insertions(+)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 8331a89eae..6339e5f884 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -485,6 +485,41 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     return rc;
 }
 
+int hvm_add_ioreq_handler(struct domain *d, ioservid_t id,
+                          int (*handler)(struct vcpu *v, ioreq_t *, void *),
+                          void *data)
+{
+    struct hvm_ioreq_server *s;
+    int rc = 0;
+
+    if ( !hvm_ioreq_is_internal(id) )
+    {
+        rc = -EINVAL;
+        goto out;
+    }
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    s = get_ioreq_server(d, id);
+    if ( !s )
+    {
+        rc = -ENOENT;
+        goto out;
+    }
+    if ( s->enabled )
+    {
+        rc = -EBUSY;
+        goto out;
+    }
+
+    s->handler = handler;
+    s->data = data;
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return rc;
+}
+
 static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
                                     struct hvm_ioreq_vcpu *sv)
 {
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index c3917aa74d..90cc2aa938 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -54,6 +54,10 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
 void hvm_ioreq_init(struct domain *d);
 
+int hvm_add_ioreq_handler(struct domain *d, ioservid_t id,
+                          int (*handler)(struct vcpu *v, ioreq_t *, void *),
+                          void *data);
+
 static inline bool hvm_ioreq_is_internal(unsigned int id)
 {
     ASSERT(id < MAX_NR_IOREQ_SERVERS);
-- 
2.22.0



* [Xen-devel] [PATCH v2 08/11] ioreq: allow decoding accesses to MMCFG regions
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (6 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 07/11] ioreq: allow registering internal ioreq server handler Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 13:37   ` Paul Durrant
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server Roger Pau Monne
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich, Roger Pau Monne

Pick up on the infrastructure already added for vPCI and allow ioreq
to decode accesses to MMCFG regions registered for a domain. This
infrastructure is still only accessible from internal callers, so
MMCFG regions can only be registered from the internal domain builder
used by PVH dom0.

Note that the vPCI infrastructure to decode and handle accesses to
MMCFG regions will be removed in following patches when vPCI is
switched to become an internal ioreq server.
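
For reference, an access to an MMCFG (ECAM) region decodes as below,
with each device function owning a 4KiB window (sketch matching the
hvm_mmcfg_decode_addr() helper introduced by this patch):

    paddr_t off       = addr - mmcfg->addr;
    unsigned int bus  = mmcfg->start_bus + ((off >> 20) & 0xff);
    unsigned int slot = (off >> 15) & 0x1f;
    unsigned int func = (off >> 12) & 0x7;
    unsigned int reg  = off & (PCI_CFG_SPACE_EXP_SIZE - 1);    /* 0xfff */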

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Remove prototype for destroy_vpci_mmcfg.
 - Keep the code in io.c so PCI accesses to MMCFG regions can be
   decoded before ioreq processing.
---
 xen/arch/x86/hvm/dom0_build.c       |  8 +--
 xen/arch/x86/hvm/hvm.c              |  2 +-
 xen/arch/x86/hvm/io.c               | 79 ++++++++++++-----------------
 xen/arch/x86/hvm/ioreq.c            | 47 ++++++++++++-----
 xen/arch/x86/physdev.c              |  5 +-
 xen/drivers/passthrough/x86/iommu.c |  2 +-
 xen/include/asm-x86/hvm/io.h        | 29 ++++++++---
 7 files changed, 96 insertions(+), 76 deletions(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 8845399ae9..1ddbd46b39 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -1117,10 +1117,10 @@ static void __hwdom_init pvh_setup_mmcfg(struct domain *d)
 
     for ( i = 0; i < pci_mmcfg_config_num; i++ )
     {
-        rc = register_vpci_mmcfg_handler(d, pci_mmcfg_config[i].address,
-                                         pci_mmcfg_config[i].start_bus_number,
-                                         pci_mmcfg_config[i].end_bus_number,
-                                         pci_mmcfg_config[i].pci_segment);
+        rc = hvm_register_mmcfg(d, pci_mmcfg_config[i].address,
+                                pci_mmcfg_config[i].start_bus_number,
+                                pci_mmcfg_config[i].end_bus_number,
+                                pci_mmcfg_config[i].pci_segment);
         if ( rc )
             printk("Unable to setup MMCFG handler at %#lx for segment %u\n",
                    pci_mmcfg_config[i].address,
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2b8189946b..fec0073618 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -741,7 +741,7 @@ void hvm_domain_destroy(struct domain *d)
         xfree(ioport);
     }
 
-    destroy_vpci_mmcfg(d);
+    hvm_free_mmcfg(d);
 }
 
 static int hvm_save_tsc_adjust(struct vcpu *v, hvm_domain_context_t *h)
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index a5b0a23f06..3334888136 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -279,6 +279,18 @@ unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
     return CF8_ADDR_LO(cf8) | (addr & 3);
 }
 
+unsigned int hvm_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
+                                   paddr_t addr, pci_sbdf_t *sbdf)
+{
+    addr -= mmcfg->addr;
+    sbdf->bdf = MMCFG_BDF(addr);
+    sbdf->bus += mmcfg->start_bus;
+    sbdf->seg = mmcfg->segment;
+
+    return addr & (PCI_CFG_SPACE_EXP_SIZE - 1);
+}
+
+
 /* Do some sanity checks. */
 static bool vpci_access_allowed(unsigned int reg, unsigned int len)
 {
@@ -383,50 +395,14 @@ void register_vpci_portio_handler(struct domain *d)
     handler->ops = &vpci_portio_ops;
 }
 
-struct hvm_mmcfg {
-    struct list_head next;
-    paddr_t addr;
-    unsigned int size;
-    uint16_t segment;
-    uint8_t start_bus;
-};
-
 /* Handlers to trap PCI MMCFG config accesses. */
-static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d,
-                                               paddr_t addr)
-{
-    const struct hvm_mmcfg *mmcfg;
-
-    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
-        if ( addr >= mmcfg->addr && addr < mmcfg->addr + mmcfg->size )
-            return mmcfg;
-
-    return NULL;
-}
-
-bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr)
-{
-    return vpci_mmcfg_find(d, addr);
-}
-
-static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
-                                           paddr_t addr, pci_sbdf_t *sbdf)
-{
-    addr -= mmcfg->addr;
-    sbdf->bdf = MMCFG_BDF(addr);
-    sbdf->bus += mmcfg->start_bus;
-    sbdf->seg = mmcfg->segment;
-
-    return addr & (PCI_CFG_SPACE_EXP_SIZE - 1);
-}
-
 static int vpci_mmcfg_accept(struct vcpu *v, unsigned long addr)
 {
     struct domain *d = v->domain;
     bool found;
 
     read_lock(&d->arch.hvm.mmcfg_lock);
-    found = vpci_mmcfg_find(d, addr);
+    found = hvm_is_mmcfg_address(d, addr);
     read_unlock(&d->arch.hvm.mmcfg_lock);
 
     return found;
@@ -443,14 +419,14 @@ static int vpci_mmcfg_read(struct vcpu *v, unsigned long addr,
     *data = ~0ul;
 
     read_lock(&d->arch.hvm.mmcfg_lock);
-    mmcfg = vpci_mmcfg_find(d, addr);
+    mmcfg = hvm_mmcfg_find(d, addr);
     if ( !mmcfg )
     {
         read_unlock(&d->arch.hvm.mmcfg_lock);
         return X86EMUL_RETRY;
     }
 
-    reg = vpci_mmcfg_decode_addr(mmcfg, addr, &sbdf);
+    reg = hvm_mmcfg_decode_addr(mmcfg, addr, &sbdf);
     read_unlock(&d->arch.hvm.mmcfg_lock);
 
     if ( !vpci_access_allowed(reg, len) ||
@@ -485,14 +461,14 @@ static int vpci_mmcfg_write(struct vcpu *v, unsigned long addr,
     pci_sbdf_t sbdf;
 
     read_lock(&d->arch.hvm.mmcfg_lock);
-    mmcfg = vpci_mmcfg_find(d, addr);
+    mmcfg = hvm_mmcfg_find(d, addr);
     if ( !mmcfg )
     {
         read_unlock(&d->arch.hvm.mmcfg_lock);
         return X86EMUL_RETRY;
     }
 
-    reg = vpci_mmcfg_decode_addr(mmcfg, addr, &sbdf);
+    reg = hvm_mmcfg_decode_addr(mmcfg, addr, &sbdf);
     read_unlock(&d->arch.hvm.mmcfg_lock);
 
     if ( !vpci_access_allowed(reg, len) ||
@@ -512,9 +488,9 @@ static const struct hvm_mmio_ops vpci_mmcfg_ops = {
     .write = vpci_mmcfg_write,
 };
 
-int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
-                                unsigned int start_bus, unsigned int end_bus,
-                                unsigned int seg)
+int hvm_register_mmcfg(struct domain *d, paddr_t addr,
+                       unsigned int start_bus, unsigned int end_bus,
+                       unsigned int seg)
 {
     struct hvm_mmcfg *mmcfg, *new;
 
@@ -549,7 +525,7 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
             return ret;
         }
 
-    if ( list_empty(&d->arch.hvm.mmcfg_regions) )
+    if ( list_empty(&d->arch.hvm.mmcfg_regions) && has_vpci(d) )
         register_mmio_handler(d, &vpci_mmcfg_ops);
 
     list_add(&new->next, &d->arch.hvm.mmcfg_regions);
@@ -558,7 +534,7 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
     return 0;
 }
 
-void destroy_vpci_mmcfg(struct domain *d)
+void hvm_free_mmcfg(struct domain *d)
 {
     struct list_head *mmcfg_regions = &d->arch.hvm.mmcfg_regions;
 
@@ -574,6 +550,17 @@ void destroy_vpci_mmcfg(struct domain *d)
     write_unlock(&d->arch.hvm.mmcfg_lock);
 }
 
+const struct hvm_mmcfg *hvm_mmcfg_find(const struct domain *d, paddr_t addr)
+{
+    const struct hvm_mmcfg *mmcfg;
+
+    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
+        if ( addr >= mmcfg->addr && addr < mmcfg->addr + mmcfg->size )
+            return mmcfg;
+
+    return NULL;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 6339e5f884..fecdc2786f 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1090,21 +1090,34 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
         /* PCI config space accesses are handled internally. */
         if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
             goto out;
-        else
-            /* fallthrough. */
+        break;
+
     case XEN_DMOP_IO_RANGE_MEMORY:
+    {
+        const struct hvm_mmcfg *mmcfg;
+
+        rc = -EINVAL;
+        /* PCI config space accesses are handled internally. */
+        read_lock(&d->arch.hvm.mmcfg_lock);
+        list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
+            if ( start <= mmcfg->addr + mmcfg->size && mmcfg->addr <= end )
+            {
+                read_unlock(&d->arch.hvm.mmcfg_lock);
+                goto out;
+            }
+        read_unlock(&d->arch.hvm.mmcfg_lock);
+        break;
+    }
+
     case XEN_DMOP_IO_RANGE_PCI:
-        r = s->range[type];
         break;
 
     default:
-        r = NULL;
-        break;
+        rc = -EINVAL;
+        goto out;
     }
 
-    rc = -EINVAL;
-    if ( !r )
-        goto out;
+    r = s->range[type];
 
     rc = -EEXIST;
     if ( rangeset_overlaps_range(r, start, end) )
@@ -1341,27 +1354,34 @@ ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
     uint8_t type;
     uint64_t addr;
     unsigned int id;
+    const struct hvm_mmcfg *mmcfg;
 
     if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
         return XEN_INVALID_IOSERVID;
 
     cf8 = d->arch.hvm.pci_cf8;
 
-    if ( p->type == IOREQ_TYPE_PIO &&
-         (p->addr & ~3) == 0xcfc &&
-         CF8_ENABLED(cf8) )
+    read_lock(&d->arch.hvm.mmcfg_lock);
+    if ( (p->type == IOREQ_TYPE_PIO &&
+          (p->addr & ~3) == 0xcfc &&
+          CF8_ENABLED(cf8)) ||
+         (p->type == IOREQ_TYPE_COPY &&
+          (mmcfg = hvm_mmcfg_find(d, p->addr)) != NULL) )
     {
         uint32_t x86_fam;
         pci_sbdf_t sbdf;
         unsigned int reg;
 
-        reg = hvm_pci_decode_addr(cf8, p->addr, &sbdf);
+        reg = p->type == IOREQ_TYPE_PIO ? hvm_pci_decode_addr(cf8, p->addr,
+                                                              &sbdf)
+                                        : hvm_mmcfg_decode_addr(mmcfg, p->addr,
+                                                                &sbdf);
 
         /* PCI config data cycle */
         type = XEN_DMOP_IO_RANGE_PCI;
         addr = ((uint64_t)sbdf.sbdf << 32) | reg;
         /* AMD extended configuration space access? */
-        if ( CF8_ADDR_HI(cf8) &&
+        if ( p->type == IOREQ_TYPE_PIO && CF8_ADDR_HI(cf8) &&
              d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
              (x86_fam = get_cpu_family(
                  d->arch.cpuid->basic.raw_fms, NULL, NULL)) > 0x10 &&
@@ -1380,6 +1400,7 @@ ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
                 XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
         addr = p->addr;
     }
+    read_unlock(&d->arch.hvm.mmcfg_lock);
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 3a3c15890b..f61f66df5f 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -562,9 +562,8 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
              * For HVM (PVH) domains try to add the newly found MMCFG to the
              * domain.
              */
-            ret = register_vpci_mmcfg_handler(currd, info.address,
-                                              info.start_bus, info.end_bus,
-                                              info.segment);
+            ret = hvm_register_mmcfg(currd, info.address, info.start_bus,
+                                     info.end_bus, info.segment);
         }
 
         break;
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 92c1d01edf..a33e31e361 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -246,7 +246,7 @@ static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
      * TODO: runtime added MMCFG regions are not checked to make sure they
      * don't overlap with already mapped regions, thus preventing trapping.
      */
-    if ( has_vpci(d) && vpci_is_mmcfg_address(d, pfn_to_paddr(pfn)) )
+    if ( has_vpci(d) && hvm_is_mmcfg_address(d, pfn_to_paddr(pfn)) )
         return false;
 
     return true;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 7ceb119b64..86ebbd1e7e 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -165,9 +165,19 @@ void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
 
-/* Decode a PCI port IO access into a bus/slot/func/reg. */
+struct hvm_mmcfg {
+    struct list_head next;
+    paddr_t addr;
+    unsigned int size;
+    uint16_t segment;
+    uint8_t start_bus;
+};
+
+/* Decode a PCI port IO or MMCFG access into a bus/slot/func/reg. */
 unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
                                  pci_sbdf_t *sbdf);
+unsigned int hvm_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
+                                   paddr_t addr, pci_sbdf_t *sbdf);
 
 /*
  * HVM port IO handler that performs forwarding of guest IO ports into machine
@@ -178,15 +188,18 @@ void register_g2m_portio_handler(struct domain *d);
 /* HVM port IO handler for vPCI accesses. */
 void register_vpci_portio_handler(struct domain *d);
 
-/* HVM MMIO handler for PCI MMCFG accesses. */
-int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
-                                unsigned int start_bus, unsigned int end_bus,
-                                unsigned int seg);
-/* Destroy tracked MMCFG areas. */
-void destroy_vpci_mmcfg(struct domain *d);
+/* HVM PCI MMCFG region registration. */
+int hvm_register_mmcfg(struct domain *d, paddr_t addr,
+                       unsigned int start_bus, unsigned int end_bus,
+                       unsigned int seg);
+void hvm_free_mmcfg(struct domain *d);
+const struct hvm_mmcfg *hvm_mmcfg_find(const struct domain *d, paddr_t addr);
 
 /* Check if an address is within an MMCFG region for a domain. */
-bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr);
+static inline bool hvm_is_mmcfg_address(const struct domain *d, paddr_t addr)
+{
+    return hvm_mmcfg_find(d, addr);
+}
 
 #endif /* __ASM_X86_HVM_IO_H__ */
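
For reference, a worked example of what hvm_mmcfg_decode_addr() computes
(a sketch; it assumes the usual ECAM layout, i.e. MMCFG_BDF() taking the
bus from bits 20-27 of the region offset and the devfn from bits 12-19):

    /* MMCFG region at 0xe0000000, segment 0, start_bus 0. */
    paddr_t off = 0xe0008074 - 0xe0000000;           /* 0x8074       */
    /* bus   = (off >> 20) & 0xff                 = 0                */
    /* devfn = (off >> 12) & 0xff                 = 0x08 (i.e. 01.0) */
    /* reg   = off & (PCI_CFG_SPACE_EXP_SIZE - 1) = 0x074            */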
 
-- 
2.22.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (7 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 08/11] ioreq: allow decoding accesses to MMCFG regions Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 13:49   ` Paul Durrant
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses Roger Pau Monne
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations Roger Pau Monne
  10 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich, Roger Pau Monne

Switch vPCI to become an internal ioreq server, and hence drop all the
vPCI-specific decoding and trapping of PCI IO ports and MMCFG regions.

This allows unifying the vPCI code with the ioreq infrastructure,
opening the door to domains having PCI accesses handled by vPCI and by
other ioreq servers at the same time.
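
For illustration, a minimal sketch of the resulting multiplexing (the
ioservid_t variables and the single-device claim are hypothetical; the
all-devices claim matches the hardware domain registration done below):

    /*
     * vPCI (internal server): handle every PCI SBDF for the hardware
     * domain.
     */
    hvm_map_io_range_to_ioreq_server(d, vpci_id, XEN_DMOP_IO_RANGE_PCI,
                                     0, ~(uint64_t)0);

    /*
     * An external emulator could then claim a single device by mapping
     * just its 32-bit SBDF value.
     */
    hvm_map_io_range_to_ioreq_server(d, emu_id, XEN_DMOP_IO_RANGE_PCI,
                                     sbdf.sbdf, sbdf.sbdf);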

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Remove prototypes for register_vpci_portio_handler and
   register_vpci_mmcfg_handler.
 - Re-add vpci check in hwdom_iommu_map.
 - Fix test harness.
 - Remove vpci_{read/write} prototypes and make the functions static.
---
 tools/tests/vpci/Makefile     |   5 +-
 tools/tests/vpci/emul.h       |   4 +
 xen/arch/x86/hvm/dom0_build.c |   1 +
 xen/arch/x86/hvm/hvm.c        |   5 +-
 xen/arch/x86/hvm/io.c         | 201 ----------------------------------
 xen/arch/x86/physdev.c        |   1 +
 xen/drivers/vpci/vpci.c       |  69 +++++++++++-
 xen/include/xen/vpci.h        |  22 +---
 8 files changed, 86 insertions(+), 222 deletions(-)

diff --git a/tools/tests/vpci/Makefile b/tools/tests/vpci/Makefile
index 5075bc2be2..c365c4522a 100644
--- a/tools/tests/vpci/Makefile
+++ b/tools/tests/vpci/Makefile
@@ -25,7 +25,10 @@ install:
 
 vpci.c: $(XEN_ROOT)/xen/drivers/vpci/vpci.c
 	# Remove includes and add the test harness header
-	sed -e '/#include/d' -e '1s/^/#include "emul.h"/' <$< >$@
+	sed -e '/#include/d' -e '1s/^/#include "emul.h"/' \
+	    -e 's/^static uint32_t read/uint32_t vpci_read/' \
+	    -e 's/^static void write/void vpci_write/' <$< >$@
+
 
 list.h: $(XEN_ROOT)/xen/include/xen/list.h
 vpci.h: $(XEN_ROOT)/xen/include/xen/vpci.h
diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h
index 796797fdc2..790c4de601 100644
--- a/tools/tests/vpci/emul.h
+++ b/tools/tests/vpci/emul.h
@@ -125,6 +125,10 @@ typedef union {
         tx > ty ? tx : ty;              \
 })
 
+uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size);
+void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
+                uint32_t data);
+
 #endif
 
 /*
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 1ddbd46b39..c022502bb8 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -29,6 +29,7 @@
 
 #include <asm/bzimage.h>
 #include <asm/dom0_build.h>
+#include <asm/hvm/ioreq.h>
 #include <asm/hvm/support.h>
 #include <asm/io_apic.h>
 #include <asm/p2m.h>
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index fec0073618..228c79643d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -644,10 +644,13 @@ int hvm_domain_initialise(struct domain *d)
         d->arch.hvm.io_bitmap = hvm_io_bitmap;
 
     register_g2m_portio_handler(d);
-    register_vpci_portio_handler(d);
 
     hvm_ioreq_init(d);
 
+    rc = vpci_register_ioreq(d);
+    if ( rc )
+        goto fail1;
+
     hvm_init_guest_time(d);
 
     d->arch.hvm.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 3334888136..4c72e68a5b 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -290,204 +290,6 @@ unsigned int hvm_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
     return addr & (PCI_CFG_SPACE_EXP_SIZE - 1);
 }
 
-
-/* Do some sanity checks. */
-static bool vpci_access_allowed(unsigned int reg, unsigned int len)
-{
-    /* Check access size. */
-    if ( len != 1 && len != 2 && len != 4 && len != 8 )
-        return false;
-
-    /* Check that access is size aligned. */
-    if ( (reg & (len - 1)) )
-        return false;
-
-    return true;
-}
-
-/* vPCI config space IO ports handlers (0xcf8/0xcfc). */
-static bool vpci_portio_accept(const struct hvm_io_handler *handler,
-                               const ioreq_t *p)
-{
-    return (p->addr == 0xcf8 && p->size == 4) || (p->addr & ~3) == 0xcfc;
-}
-
-static int vpci_portio_read(const struct hvm_io_handler *handler,
-                            uint64_t addr, uint32_t size, uint64_t *data)
-{
-    const struct domain *d = current->domain;
-    unsigned int reg;
-    pci_sbdf_t sbdf;
-    uint32_t cf8;
-
-    *data = ~(uint64_t)0;
-
-    if ( addr == 0xcf8 )
-    {
-        ASSERT(size == 4);
-        *data = d->arch.hvm.pci_cf8;
-        return X86EMUL_OKAY;
-    }
-
-    ASSERT((addr & ~3) == 0xcfc);
-    cf8 = ACCESS_ONCE(d->arch.hvm.pci_cf8);
-    if ( !CF8_ENABLED(cf8) )
-        return X86EMUL_UNHANDLEABLE;
-
-    reg = hvm_pci_decode_addr(cf8, addr, &sbdf);
-
-    if ( !vpci_access_allowed(reg, size) )
-        return X86EMUL_OKAY;
-
-    *data = vpci_read(sbdf, reg, size);
-
-    return X86EMUL_OKAY;
-}
-
-static int vpci_portio_write(const struct hvm_io_handler *handler,
-                             uint64_t addr, uint32_t size, uint64_t data)
-{
-    struct domain *d = current->domain;
-    unsigned int reg;
-    pci_sbdf_t sbdf;
-    uint32_t cf8;
-
-    if ( addr == 0xcf8 )
-    {
-        ASSERT(size == 4);
-        d->arch.hvm.pci_cf8 = data;
-        return X86EMUL_OKAY;
-    }
-
-    ASSERT((addr & ~3) == 0xcfc);
-    cf8 = ACCESS_ONCE(d->arch.hvm.pci_cf8);
-    if ( !CF8_ENABLED(cf8) )
-        return X86EMUL_UNHANDLEABLE;
-
-    reg = hvm_pci_decode_addr(cf8, addr, &sbdf);
-
-    if ( !vpci_access_allowed(reg, size) )
-        return X86EMUL_OKAY;
-
-    vpci_write(sbdf, reg, size, data);
-
-    return X86EMUL_OKAY;
-}
-
-static const struct hvm_io_ops vpci_portio_ops = {
-    .accept = vpci_portio_accept,
-    .read = vpci_portio_read,
-    .write = vpci_portio_write,
-};
-
-void register_vpci_portio_handler(struct domain *d)
-{
-    struct hvm_io_handler *handler;
-
-    if ( !has_vpci(d) )
-        return;
-
-    handler = hvm_next_io_handler(d);
-    if ( !handler )
-        return;
-
-    handler->type = IOREQ_TYPE_PIO;
-    handler->ops = &vpci_portio_ops;
-}
-
-/* Handlers to trap PCI MMCFG config accesses. */
-static int vpci_mmcfg_accept(struct vcpu *v, unsigned long addr)
-{
-    struct domain *d = v->domain;
-    bool found;
-
-    read_lock(&d->arch.hvm.mmcfg_lock);
-    found = hvm_is_mmcfg_address(d, addr);
-    read_unlock(&d->arch.hvm.mmcfg_lock);
-
-    return found;
-}
-
-static int vpci_mmcfg_read(struct vcpu *v, unsigned long addr,
-                           unsigned int len, unsigned long *data)
-{
-    struct domain *d = v->domain;
-    const struct hvm_mmcfg *mmcfg;
-    unsigned int reg;
-    pci_sbdf_t sbdf;
-
-    *data = ~0ul;
-
-    read_lock(&d->arch.hvm.mmcfg_lock);
-    mmcfg = hvm_mmcfg_find(d, addr);
-    if ( !mmcfg )
-    {
-        read_unlock(&d->arch.hvm.mmcfg_lock);
-        return X86EMUL_RETRY;
-    }
-
-    reg = hvm_mmcfg_decode_addr(mmcfg, addr, &sbdf);
-    read_unlock(&d->arch.hvm.mmcfg_lock);
-
-    if ( !vpci_access_allowed(reg, len) ||
-         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
-        return X86EMUL_OKAY;
-
-    /*
-     * According to the PCIe 3.1A specification:
-     *  - Configuration Reads and Writes must usually be DWORD or smaller
-     *    in size.
-     *  - Because Root Complex implementations are not required to support
-     *    accesses to a RCRB that cross DW boundaries [...] software
-     *    should take care not to cause the generation of such accesses
-     *    when accessing a RCRB unless the Root Complex will support the
-     *    access.
-     *  Xen however supports 8byte accesses by splitting them into two
-     *  4byte accesses.
-     */
-    *data = vpci_read(sbdf, reg, min(4u, len));
-    if ( len == 8 )
-        *data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
-
-    return X86EMUL_OKAY;
-}
-
-static int vpci_mmcfg_write(struct vcpu *v, unsigned long addr,
-                            unsigned int len, unsigned long data)
-{
-    struct domain *d = v->domain;
-    const struct hvm_mmcfg *mmcfg;
-    unsigned int reg;
-    pci_sbdf_t sbdf;
-
-    read_lock(&d->arch.hvm.mmcfg_lock);
-    mmcfg = hvm_mmcfg_find(d, addr);
-    if ( !mmcfg )
-    {
-        read_unlock(&d->arch.hvm.mmcfg_lock);
-        return X86EMUL_RETRY;
-    }
-
-    reg = hvm_mmcfg_decode_addr(mmcfg, addr, &sbdf);
-    read_unlock(&d->arch.hvm.mmcfg_lock);
-
-    if ( !vpci_access_allowed(reg, len) ||
-         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
-        return X86EMUL_OKAY;
-
-    vpci_write(sbdf, reg, min(4u, len), data);
-    if ( len == 8 )
-        vpci_write(sbdf, reg + 4, 4, data >> 32);
-
-    return X86EMUL_OKAY;
-}
-
-static const struct hvm_mmio_ops vpci_mmcfg_ops = {
-    .check = vpci_mmcfg_accept,
-    .read = vpci_mmcfg_read,
-    .write = vpci_mmcfg_write,
-};
-
 int hvm_register_mmcfg(struct domain *d, paddr_t addr,
                        unsigned int start_bus, unsigned int end_bus,
                        unsigned int seg)
@@ -525,9 +327,6 @@ int hvm_register_mmcfg(struct domain *d, paddr_t addr,
             return ret;
         }
 
-    if ( list_empty(&d->arch.hvm.mmcfg_regions) && has_vpci(d) )
-        register_mmio_handler(d, &vpci_mmcfg_ops);
-
     list_add(&new->next, &d->arch.hvm.mmcfg_regions);
     write_unlock(&d->arch.hvm.mmcfg_lock);
 
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index f61f66df5f..bf2c64a0a9 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -11,6 +11,7 @@
 #include <asm/current.h>
 #include <asm/io_apic.h>
 #include <asm/msi.h>
+#include <asm/hvm/ioreq.h>
 #include <asm/hvm/irq.h>
 #include <asm/hypercall.h>
 #include <public/xen.h>
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index cbd1bac7fc..5664020c2d 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -20,6 +20,8 @@
 #include <xen/sched.h>
 #include <xen/vpci.h>
 
+#include <asm/hvm/ioreq.h>
+
 /* Internal struct to store the emulated PCI registers. */
 struct vpci_register {
     vpci_read_t *read;
@@ -302,7 +304,7 @@ static uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size,
     return (data & ~(mask << (offset * 8))) | ((new & mask) << (offset * 8));
 }
 
-uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
+static uint32_t read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
 {
     const struct domain *d = current->domain;
     const struct pci_dev *pdev;
@@ -404,8 +406,8 @@ static void vpci_write_helper(const struct pci_dev *pdev,
              r->private);
 }
 
-void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
-                uint32_t data)
+static void write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
+                  uint32_t data)
 {
     const struct domain *d = current->domain;
     const struct pci_dev *pdev;
@@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
     spin_unlock(&pdev->vpci->lock);
 }
 
+#ifdef __XEN__
+static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
+{
+    pci_sbdf_t sbdf;
+
+    if ( req->type == IOREQ_TYPE_INVALIDATE )
+        /*
+         * Ignore invalidate requests, those can be received even without
+         * having any memory ranges registered, see send_invalidate_req.
+         */
+        return X86EMUL_OKAY;
+
+    if ( req->type != IOREQ_TYPE_PCI_CONFIG || req->data_is_ptr )
+    {
+        ASSERT_UNREACHABLE();
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    sbdf.sbdf = req->addr >> 32;
+
+    if ( req->dir )
+        req->data = read(sbdf, req->addr, req->size);
+    else
+        write(sbdf, req->addr, req->size, req->data);
+
+    return X86EMUL_OKAY;
+}
+
+int vpci_register_ioreq(struct domain *d)
+{
+    ioservid_t id;
+    int rc;
+
+    if ( !has_vpci(d) )
+        return 0;
+
+    rc = hvm_create_ioreq_server(d, HVM_IOREQSRV_BUFIOREQ_OFF, &id, true);
+    if ( rc )
+        return rc;
+
+    rc = hvm_add_ioreq_handler(d, id, ioreq_handler, NULL);
+    if ( rc )
+        return rc;
+
+    if ( is_hardware_domain(d) )
+    {
+        /* Handle all devices in vpci. */
+        rc = hvm_map_io_range_to_ioreq_server(d, id, XEN_DMOP_IO_RANGE_PCI,
+                                              0, ~(uint64_t)0);
+        if ( rc )
+            return rc;
+    }
+
+    rc = hvm_set_ioreq_server_state(d, id, true);
+    if ( rc )
+        return rc;
+
+    return rc;
+}
+#endif
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 4cf233c779..36f435ed5b 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -23,6 +23,9 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
   static vpci_register_init_t *const x##_entry  \
                __used_section(".data.vpci." p) = x
 
+/* Register vPCI handler with ioreq. */
+int vpci_register_ioreq(struct domain *d);
+
 /* Add vPCI handlers to device. */
 int __must_check vpci_add_handlers(struct pci_dev *dev);
 
@@ -38,11 +41,6 @@ int __must_check vpci_add_register(struct vpci *vpci,
 int __must_check vpci_remove_register(struct vpci *vpci, unsigned int offset,
                                       unsigned int size);
 
-/* Generic read/write handlers for the PCI config space. */
-uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size);
-void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
-                uint32_t data);
-
 /* Passthrough handlers. */
 uint32_t vpci_hw_read16(const struct pci_dev *pdev, unsigned int reg,
                         void *data);
@@ -221,20 +219,12 @@ static inline int vpci_add_handlers(struct pci_dev *pdev)
     return 0;
 }
 
-static inline void vpci_dump_msi(void) { }
-
-static inline uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg,
-                                 unsigned int size)
+static inline int vpci_register_ioreq(struct domain *d)
 {
-    ASSERT_UNREACHABLE();
-    return ~(uint32_t)0;
+    return 0;
 }
 
-static inline void vpci_write(pci_sbdf_t sbdf, unsigned int reg,
-                              unsigned int size, uint32_t data)
-{
-    ASSERT_UNREACHABLE();
-}
+static inline void vpci_dump_msi(void) { }
 
 static inline bool vpci_process_pending(struct vcpu *v)
 {
-- 
2.22.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (8 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 14:06   ` Paul Durrant
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations Roger Pau Monne
  10 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich, Roger Pau Monne

Split the code that converts a PIO/COPY ioreq into a PCI_CONFIG one
into a separate function, and adjust the callers to make use of it.

No functional change intended.
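
As an illustration of the conversion (a sketch with made-up values; the
encoding matches the addr computation in the hunk below):

    /* IN:  type = IOREQ_TYPE_PIO, addr = 0xcfc,
     *      cf8  = 0x80001810 (enabled, 00:03.0, register 0x10). */
    /* OUT: type = IOREQ_TYPE_PCI_CONFIG,
     *      addr = ((uint64_t)sbdf.sbdf << 32) | 0x10. */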

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/ioreq.c | 111 +++++++++++++++++++++++----------------
 1 file changed, 67 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index fecdc2786f..33c56b880c 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -183,6 +183,54 @@ static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
     return true;
 }
 
+static void convert_pci_ioreq(struct domain *d, ioreq_t *p)
+{
+    const struct hvm_mmcfg *mmcfg;
+    uint32_t cf8 = d->arch.hvm.pci_cf8;
+
+    if ( p->type != IOREQ_TYPE_PIO && p->type != IOREQ_TYPE_COPY )
+    {
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    read_lock(&d->arch.hvm.mmcfg_lock);
+    if ( (p->type == IOREQ_TYPE_PIO &&
+          (p->addr & ~3) == 0xcfc &&
+          CF8_ENABLED(cf8)) ||
+         (p->type == IOREQ_TYPE_COPY &&
+          (mmcfg = hvm_mmcfg_find(d, p->addr)) != NULL) )
+    {
+        uint32_t x86_fam;
+        pci_sbdf_t sbdf;
+        unsigned int reg;
+
+        reg = p->type == IOREQ_TYPE_PIO ? hvm_pci_decode_addr(cf8, p->addr,
+                                                              &sbdf)
+                                        : hvm_mmcfg_decode_addr(mmcfg, p->addr,
+                                                                &sbdf);
+
+        /* PCI config data cycle */
+        p->addr = ((uint64_t)sbdf.sbdf << 32) | reg;
+        /* AMD extended configuration space access? */
+        if ( p->type == IOREQ_TYPE_PIO && CF8_ADDR_HI(cf8) &&
+             d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
+             (x86_fam = get_cpu_family(
+                 d->arch.cpuid->basic.raw_fms, NULL, NULL)) > 0x10 &&
+             x86_fam < 0x17 )
+        {
+            uint64_t msr_val;
+
+            if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
+                 (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
+                p->addr |= CF8_ADDR_HI(cf8);
+        }
+        p->type = IOREQ_TYPE_PCI_CONFIG;
+
+    }
+    read_unlock(&d->arch.hvm.mmcfg_lock);
+}
+
 bool handle_hvm_io_completion(struct vcpu *v)
 {
     struct domain *d = v->domain;
@@ -1350,57 +1398,36 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
 ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
 {
     struct hvm_ioreq_server *s;
-    uint32_t cf8;
     uint8_t type;
-    uint64_t addr;
     unsigned int id;
-    const struct hvm_mmcfg *mmcfg;
 
     if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
         return XEN_INVALID_IOSERVID;
 
-    cf8 = d->arch.hvm.pci_cf8;
+    /*
+     * Check and convert the PIO/MMIO ioreq to a PCI config space
+     * access.
+     */
+    convert_pci_ioreq(d, p);
 
-    read_lock(&d->arch.hvm.mmcfg_lock);
-    if ( (p->type == IOREQ_TYPE_PIO &&
-          (p->addr & ~3) == 0xcfc &&
-          CF8_ENABLED(cf8)) ||
-         (p->type == IOREQ_TYPE_COPY &&
-          (mmcfg = hvm_mmcfg_find(d, p->addr)) != NULL) )
+    switch ( p->type )
     {
-        uint32_t x86_fam;
-        pci_sbdf_t sbdf;
-        unsigned int reg;
+    case IOREQ_TYPE_PIO:
+        type = XEN_DMOP_IO_RANGE_PORT;
+        break;
 
-        reg = p->type == IOREQ_TYPE_PIO ? hvm_pci_decode_addr(cf8, p->addr,
-                                                              &sbdf)
-                                        : hvm_mmcfg_decode_addr(mmcfg, p->addr,
-                                                                &sbdf);
+    case IOREQ_TYPE_COPY:
+        type = XEN_DMOP_IO_RANGE_MEMORY;
+        break;
 
-        /* PCI config data cycle */
+    case IOREQ_TYPE_PCI_CONFIG:
         type = XEN_DMOP_IO_RANGE_PCI;
-        addr = ((uint64_t)sbdf.sbdf << 32) | reg;
-        /* AMD extended configuration space access? */
-        if ( p->type == IOREQ_TYPE_PIO && CF8_ADDR_HI(cf8) &&
-             d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
-             (x86_fam = get_cpu_family(
-                 d->arch.cpuid->basic.raw_fms, NULL, NULL)) > 0x10 &&
-             x86_fam < 0x17 )
-        {
-            uint64_t msr_val;
+        break;
 
-            if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
-                 (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
-                addr |= CF8_ADDR_HI(cf8);
-        }
-    }
-    else
-    {
-        type = (p->type == IOREQ_TYPE_PIO) ?
-                XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
-        addr = p->addr;
+    default:
+        ASSERT_UNREACHABLE();
+        return XEN_INVALID_IOSERVID;
     }
-    read_unlock(&d->arch.hvm.mmcfg_lock);
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
@@ -1416,7 +1443,7 @@ ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
             unsigned long start, end;
 
         case XEN_DMOP_IO_RANGE_PORT:
-            start = addr;
+            start = p->addr;
             end = start + p->size - 1;
             if ( rangeset_contains_range(r, start, end) )
                 return id;
@@ -1433,12 +1460,8 @@ ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
             break;
 
         case XEN_DMOP_IO_RANGE_PCI:
-            if ( rangeset_contains_singleton(r, addr >> 32) )
-            {
-                p->type = IOREQ_TYPE_PCI_CONFIG;
-                p->addr = addr;
+            if ( rangeset_contains_singleton(r, p->addr >> 32) )
                 return id;
-            }
 
             break;
         }
-- 
2.22.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations...
  2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
                   ` (9 preceding siblings ...)
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses Roger Pau Monne
@ 2019-09-03 16:14 ` Roger Pau Monne
  2019-09-10 14:14   ` Paul Durrant
  10 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monne @ 2019-09-03 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich, Roger Pau Monne

...and switch vPCI to use this infrastructure for long-running
physmap modification operations.

This allows getting rid of the vPCI-specific modifications done to
handle_hvm_io_completion and generalizes the support for long-running
operations to other internal ioreq servers. Such support is
implemented as a specific handler that can be registered by internal
ioreq servers and that will be called to check for pending work.
Returning true from this handler will prevent the vcpu from running
until the handler returns false.
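
In short, the contract for an internal handler then becomes (a sketch;
process_pending_work() is a hypothetical stand-in for the server's
pending-work check, e.g. vpci_process_pending in the vPCI case):

    static int handler(struct vcpu *v, ioreq_t *req, void *data)
    {
        if ( process_pending_work(v) )
            /* Re-invoked from handle_hvm_io_completion until done. */
            return X86EMUL_RETRY;

        return X86EMUL_OKAY;
    }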

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/ioreq.c       | 55 +++++++++++++++++++++++++-----
 xen/drivers/vpci/header.c      | 61 ++++++++++++++++++----------------
 xen/drivers/vpci/vpci.c        |  8 ++++-
 xen/include/asm-x86/hvm/vcpu.h |  3 +-
 xen/include/xen/vpci.h         |  6 ----
 5 files changed, 89 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 33c56b880c..caa53dfa84 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -239,16 +239,48 @@ bool handle_hvm_io_completion(struct vcpu *v)
     enum hvm_io_completion io_completion;
     unsigned int id;
 
-    if ( has_vpci(d) && vpci_process_pending(v) )
-    {
-        raise_softirq(SCHEDULE_SOFTIRQ);
-        return false;
-    }
-
-    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
 
+        if ( hvm_ioreq_is_internal(id) )
+        {
+            if ( vio->io_req.state == STATE_IOREQ_INPROCESS )
+            {
+                ioreq_t req = vio->io_req;
+
+                /*
+                 * Check and convert the PIO/MMIO ioreq to a PCI config space
+                 * access.
+                 */
+                convert_pci_ioreq(d, &req);
+
+                if ( s->handler(v, &req, s->data) == X86EMUL_RETRY )
+                {
+                    /*
+                     * Need to raise a scheduler irq in order to prevent the
+                     * guest vcpu from resuming execution.
+                     *
+                     * Note this is not required for external ioreq operations
+                     * because in that case the vcpu is marked as blocked, but
+                     * this cannot be done for long-running internal
+                     * operations, since it would prevent the vcpu from being
+                     * scheduled and thus the long running operation from
+                     * finishing.
+                     */
+                    raise_softirq(SCHEDULE_SOFTIRQ);
+                    return false;
+                }
+
+                /* Finished processing the ioreq. */
+                if ( hvm_ioreq_needs_completion(&vio->io_req) )
+                    vio->io_req.state = STATE_IORESP_READY;
+                else
+                    vio->io_req.state = STATE_IOREQ_NONE;
+            }
+            continue;
+        }
+
         list_for_each_entry ( sv,
                               &s->ioreq_vcpu_list,
                               list_entry )
@@ -1582,7 +1614,14 @@ int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p, bool buffered)
         return hvm_send_buffered_ioreq(s, proto_p);
 
     if ( hvm_ioreq_is_internal(id) )
-        return s->handler(curr, proto_p, s->data);
+    {
+        int rc = s->handler(curr, proto_p, s->data);
+
+        if ( rc == X86EMUL_RETRY )
+            curr->arch.hvm.hvm_io.io_req.state = STATE_IOREQ_INPROCESS;
+
+        return rc;
+    }
 
     if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
         return X86EMUL_RETRY;
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 3c794f486d..f1c1a69492 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -129,37 +129,42 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
 
 bool vpci_process_pending(struct vcpu *v)
 {
-    if ( v->vpci.mem )
+    struct map_data data = {
+        .d = v->domain,
+        .map = v->vpci.cmd & PCI_COMMAND_MEMORY,
+    };
+    int rc;
+
+    if ( !v->vpci.mem )
     {
-        struct map_data data = {
-            .d = v->domain,
-            .map = v->vpci.cmd & PCI_COMMAND_MEMORY,
-        };
-        int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
-
-        if ( rc == -ERESTART )
-            return true;
-
-        spin_lock(&v->vpci.pdev->vpci->lock);
-        /* Disable memory decoding unconditionally on failure. */
-        modify_decoding(v->vpci.pdev,
-                        rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY : v->vpci.cmd,
-                        !rc && v->vpci.rom_only);
-        spin_unlock(&v->vpci.pdev->vpci->lock);
-
-        rangeset_destroy(v->vpci.mem);
-        v->vpci.mem = NULL;
-        if ( rc )
-            /*
-             * FIXME: in case of failure remove the device from the domain.
-             * Note that there might still be leftover mappings. While this is
-             * safe for Dom0, for DomUs the domain will likely need to be
-             * killed in order to avoid leaking stale p2m mappings on
-             * failure.
-             */
-            vpci_remove_device(v->vpci.pdev);
+        ASSERT_UNREACHABLE();
+        return false;
     }
 
+    rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
+
+    if ( rc == -ERESTART )
+        return true;
+
+    spin_lock(&v->vpci.pdev->vpci->lock);
+    /* Disable memory decoding unconditionally on failure. */
+    modify_decoding(v->vpci.pdev,
+                    rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY : v->vpci.cmd,
+                    !rc && v->vpci.rom_only);
+    spin_unlock(&v->vpci.pdev->vpci->lock);
+
+    rangeset_destroy(v->vpci.mem);
+    v->vpci.mem = NULL;
+    if ( rc )
+        /*
+         * FIXME: in case of failure remove the device from the domain.
+         * Note that there might still be leftover mappings. While this is
+         * safe for Dom0, for DomUs the domain will likely need to be
+         * killed in order to avoid leaking stale p2m mappings on
+         * failure.
+         */
+        vpci_remove_device(v->vpci.pdev);
+
     return false;
 }
 
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 5664020c2d..6069dff612 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -498,6 +498,12 @@ static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
         return X86EMUL_UNHANDLEABLE;
     }
 
+    if ( v->vpci.mem )
+    {
+        ASSERT(req->state == STATE_IOREQ_INPROCESS);
+        return vpci_process_pending(v) ? X86EMUL_RETRY : X86EMUL_OKAY;
+    }
+
     sbdf.sbdf = req->addr >> 32;
 
     if ( req->dir )
@@ -505,7 +511,7 @@ static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
     else
         write(sbdf, req->addr, req->size, req->data);
 
-    return X86EMUL_OKAY;
+    return v->vpci.mem ? X86EMUL_RETRY : X86EMUL_OKAY;
 }
 
 int vpci_register_ioreq(struct domain *d)
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 38f5c2bb9b..4563746466 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -92,7 +92,8 @@ struct hvm_vcpu_io {
 
 static inline bool hvm_ioreq_needs_completion(const ioreq_t *ioreq)
 {
-    return ioreq->state == STATE_IOREQ_READY &&
+    return (ioreq->state == STATE_IOREQ_READY ||
+            ioreq->state == STATE_IOREQ_INPROCESS) &&
            !ioreq->data_is_ptr &&
            (ioreq->type != IOREQ_TYPE_PIO || ioreq->dir != IOREQ_WRITE);
 }
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 36f435ed5b..a65491e0c9 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -225,12 +225,6 @@ static inline int vpci_register_ioreq(struct domain *d)
 }
 
 static inline void vpci_dump_msi(void) { }
-
-static inline bool vpci_process_pending(struct vcpu *v)
-{
-    ASSERT_UNREACHABLE();
-    return false;
-}
 #endif
 
 #endif
-- 
2.22.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level Roger Pau Monne
@ 2019-09-03 17:13   ` Andrew Cooper
  2019-09-04  7:49     ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Andrew Cooper @ 2019-09-03 17:13 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Paul Durrant, Wei Liu, Jan Beulich

On 03/09/2019 17:14, Roger Pau Monne wrote:
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 692b710b02..69652e1080 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>      switch ( type )
>      {
>      case XEN_DMOP_IO_RANGE_PORT:
> +        rc = -EINVAL;
> +        /* PCI config space accesses are handled internally. */
> +        if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
> +            goto out;
> +        else
> +            /* fallthrough. */

You need to drop the else, or it may still trigger warnings.

Furthermore, qemu registers cf8-cff so I think you need some fix-ups
there first before throwing errors back here.

Finally, this prohibits registering cf9 which may legitimately not be
terminated in Xen.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-03 17:13   ` Andrew Cooper
@ 2019-09-04  7:49     ` Roger Pau Monné
  2019-09-04  8:00       ` Roger Pau Monné
                         ` (2 more replies)
  0 siblings, 3 replies; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-04  7:49 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu, Jan Beulich

On Tue, Sep 03, 2019 at 06:13:59PM +0100, Andrew Cooper wrote:
> On 03/09/2019 17:14, Roger Pau Monne wrote:
> > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > index 692b710b02..69652e1080 100644
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> >      switch ( type )
> >      {
> >      case XEN_DMOP_IO_RANGE_PORT:
> > +        rc = -EINVAL;
> > +        /* PCI config space accesses are handled internally. */
> > +        if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
> > +            goto out;
> > +        else
> > +            /* fallthrough. */
> 
> You need to drop the else, or it may still trigger warnings.

Yes, my mistake. The else branch is not needed.

> Furthermore, qemu registers cf8-cff so I think you need some fix-ups
> there first before throwing errors back here.

The version of QEMU I have doesn't seem to register 0xcf8 or 0xcfc,
there are no errors on the log and QEMU seems to work just fine.

I always assumed QEMU was getting accesses to cf8/cfc forwarded
because it was the default device model, and everything not trapped by
Xen would be forwarded to it. This default device model behaviour was
removed by Paul some time ago, and now QEMU registers explicitly which
IO accesses it wants to trap.

> Finally, this prohibits registering cf9 which may legitimately not be
> terminated in Xen.

Yes, that should be cf8 - 7 not 8, thanks for catching it! Will update
on the next version.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-04  7:49     ` Roger Pau Monné
@ 2019-09-04  8:00       ` Roger Pau Monné
  2019-09-04  8:04       ` Jan Beulich
  2019-09-04  9:46       ` Paul Durrant
  2 siblings, 0 replies; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-04  8:00 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Jan Beulich, Wei Liu

On Wed, Sep 04, 2019 at 09:49:23AM +0200, Roger Pau Monné wrote:
> On Tue, Sep 03, 2019 at 06:13:59PM +0100, Andrew Cooper wrote:
> > On 03/09/2019 17:14, Roger Pau Monne wrote:
> > > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > > index 692b710b02..69652e1080 100644
> > > --- a/xen/arch/x86/hvm/ioreq.c
> > > +++ b/xen/arch/x86/hvm/ioreq.c
> > > @@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > >      switch ( type )
> > >      {
> > >      case XEN_DMOP_IO_RANGE_PORT:
> > > +        rc = -EINVAL;
> > > +        /* PCI config space accesses are handled internally. */
> > > +        if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
> > > +            goto out;
> > > +        else
> > > +            /* fallthrough. */
> > Finally, this prohibits registering cf9 which may legitimately not be
> > terminated in Xen.
> 
> Yes, that should be cf8 - 7 not 8, thanks for catching it! Will update
> on the next version.

... or just use < instead of <=.
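
I.e. a corrected check could read (a sketch, treating the decode window
as the 8 ports 0xcf8-0xcff):

    /* Does [start, end] overlap [0xcf8, 0xcff]? */
    if ( start < 0xcf8 + 8 && 0xcf8 <= end )
        goto out;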

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-04  7:49     ` Roger Pau Monné
  2019-09-04  8:00       ` Roger Pau Monné
@ 2019-09-04  8:04       ` Jan Beulich
  2019-09-04  9:46       ` Paul Durrant
  2 siblings, 0 replies; 56+ messages in thread
From: Jan Beulich @ 2019-09-04  8:04 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On 04.09.2019 09:49, Roger Pau Monné wrote:
> On Tue, Sep 03, 2019 at 06:13:59PM +0100, Andrew Cooper wrote:
>> On 03/09/2019 17:14, Roger Pau Monne wrote:
>>> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
>>> index 692b710b02..69652e1080 100644
>>> --- a/xen/arch/x86/hvm/ioreq.c
>>> +++ b/xen/arch/x86/hvm/ioreq.c
>>> @@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>>>      switch ( type )
>>>      {
>>>      case XEN_DMOP_IO_RANGE_PORT:
>>> +        rc = -EINVAL;
>>> +        /* PCI config space accesses are handled internally. */
>>> +        if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
>>> +            goto out;
>>> +        else
>>> +            /* fallthrough. */
>>
>> Finally, this prohibits registering cf9 which may legitimately not be
>> terminated in Xen.
> 
> Yes, that should be cf8 - 7 not 8, thanks for catching it! Will update
> on the next version.

Well, assuming you mean + instead of -, then yes, this needs fixing.
But doing so won't take care of Andrew's comment.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-04  7:49     ` Roger Pau Monné
  2019-09-04  8:00       ` Roger Pau Monné
  2019-09-04  8:04       ` Jan Beulich
@ 2019-09-04  9:46       ` Paul Durrant
  2019-09-04 13:39         ` Roger Pau Monné
  2 siblings, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-04  9:46 UTC (permalink / raw)
  To: Roger Pau Monne, Andrew Cooper; +Cc: xen-devel, Wei Liu, Jan Beulich

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 04 September 2019 08:49
> To: Andrew Cooper <Andrew.Cooper3@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
> 
> On Tue, Sep 03, 2019 at 06:13:59PM +0100, Andrew Cooper wrote:
> > On 03/09/2019 17:14, Roger Pau Monne wrote:
> > > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > > index 692b710b02..69652e1080 100644
> > > --- a/xen/arch/x86/hvm/ioreq.c
> > > +++ b/xen/arch/x86/hvm/ioreq.c
> > > @@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > >      switch ( type )
> > >      {
> > >      case XEN_DMOP_IO_RANGE_PORT:
> > > +        rc = -EINVAL;
> > > +        /* PCI config space accesses are handled internally. */
> > > +        if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
> > > +            goto out;
> > > +        else
> > > +            /* fallthrough. */
> >
> > You need to drop the else, or it may still trigger warnings.
> 
> Yes, my mistake. The else branch is not needed.
> 
> > Furthermore, qemu registers cf8-cff so I think you need some fix-ups
> > there first before throwing errors back here.
> 
> The version of QEMU I have doesn't seem to register 0xcf8 or 0xcfc,
> there are no errors on the log and QEMU seems to work just fine.
> 
> I always assumed QEMU was getting accesses to cf8/cfc forwarded
> because it was the default device model, and everything not trapped by
> Xen would be forwarded to it. This default device model behaviour was
> removed by Paul some time ago, and now QEMU registers explicitly which
> IO accesses it wants to trap.

Yes, it used to need them to work correctly as a default emulator. However, we don't generally stop an external emulator from registering ranges that are handled by emulation directly in Xen (e.g. pmtimer), so I don't think you need to special-case these ports.

  Paul

> 
> > Finally, this prohibits registering cf9 which may legitimately not be
> > terminated in Xen.
> 
> Yes, that should be cf8 - 7 not 8, thanks for catching it! Will update
> on the next version.
> 
> Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-04  9:46       ` Paul Durrant
@ 2019-09-04 13:39         ` Roger Pau Monné
  2019-09-04 13:56           ` Paul Durrant
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-04 13:39 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, xen-devel

On Wed, Sep 04, 2019 at 11:46:24AM +0200, Paul Durrant wrote:
> > -----Original Message-----
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: 04 September 2019 08:49
> > To: Andrew Cooper <Andrew.Cooper3@citrix.com>
> > Cc: xen-devel@lists.xenproject.org; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> > <jbeulich@suse.com>; Wei Liu <wl@xen.org>
> > Subject: Re: [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
> > 
> > On Tue, Sep 03, 2019 at 06:13:59PM +0100, Andrew Cooper wrote:
> > > On 03/09/2019 17:14, Roger Pau Monne wrote:
> > > > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > > > index 692b710b02..69652e1080 100644
> > > > --- a/xen/arch/x86/hvm/ioreq.c
> > > > +++ b/xen/arch/x86/hvm/ioreq.c
> > > > @@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > > >      switch ( type )
> > > >      {
> > > >      case XEN_DMOP_IO_RANGE_PORT:
> > > > +        rc = -EINVAL;
> > > > +        /* PCI config space accesses are handled internally. */
> > > > +        if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
> > > > +            goto out;
> > > > +        else
> > > > +            /* fallthrough. */
> > >
> > > You need to drop the else, or it may still trigger warnings.
> > 
> > Yes, my mistake. The else branch is not needed.
> > 
> > > Furthermore, qemu registers cf8-cff so I think you need some fix-ups
> > > there first before throwing errors back here.
> > 
> > The version of QEMU I have doesn't seem to register 0xcf8 or 0xcfc,
> > there are no errors on the log and QEMU seems to work just fine.
> > 
> > I always assumed QEMU was getting accesses to cf8/cfc forwarded
> > because it was the default device model, and everything not trapped by
> > Xen would be forwarded to it. This default device model behaviour was
> > removed by Paul some time ago, and now QEMU registers explicitly which
> > IO accesses it wants to trap.
> 
> Yes, it used to need them to work correctly as a default emulator. However, we don't generally stop an external emulator from registering ranges that are handled by emulation directly in Xen (e.g. pmtimer), so I don't think you need to special-case these ports.

That's right, so I guess I would just remove that check (and the one
introduced for MCFG regions). We also don't check whether any other
ioreq server has already registered a range.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
  2019-09-04 13:39         ` Roger Pau Monné
@ 2019-09-04 13:56           ` Paul Durrant
  0 siblings, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-04 13:56 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, xen-devel

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 04 September 2019 14:40
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel@lists.xenproject.org; Jan Beulich
> <jbeulich@suse.com>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
> 
> On Wed, Sep 04, 2019 at 11:46:24AM +0200, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > Sent: 04 September 2019 08:49
> > > To: Andrew Cooper <Andrew.Cooper3@citrix.com>
> > > Cc: xen-devel@lists.xenproject.org; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> > > <jbeulich@suse.com>; Wei Liu <wl@xen.org>
> > > Subject: Re: [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level
> > >
> > > On Tue, Sep 03, 2019 at 06:13:59PM +0100, Andrew Cooper wrote:
> > > > On 03/09/2019 17:14, Roger Pau Monne wrote:
> > > > > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > > > > index 692b710b02..69652e1080 100644
> > > > > --- a/xen/arch/x86/hvm/ioreq.c
> > > > > +++ b/xen/arch/x86/hvm/ioreq.c
> > > > > @@ -1015,6 +1015,12 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > > > >      switch ( type )
> > > > >      {
> > > > >      case XEN_DMOP_IO_RANGE_PORT:
> > > > > +        rc = -EINVAL;
> > > > > +        /* PCI config space accesses are handled internally. */
> > > > > +        if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
> > > > > +            goto out;
> > > > > +        else
> > > > > +            /* fallthrough. */
> > > >
> > > > You need to drop the else, or it may still trigger warnings.
> > >
> > > Yes, my mistake. The else branch is not needed.
> > >
> > > > Furthermore, qemu registers cf8-cff so I think you need some fix-ups
> > > > there first before throwing errors back here.
> > >
> > > The version of QEMU I have doesn't seem to register 0xcf8 or 0xcfc,
> > > there are no errors on the log and QEMU seems to work just fine.
> > >
> > > I always assumed QEMU was getting accesses to cf8/cfc forwarded
> > > because it was the default device model, and everything not trapped by
> > > Xen would be forwarded to it. This default device model behaviour was
> > > removed by Paul some time ago, and now QEMU registers explicitly which
> > > IO accesses it wants to trap.
> >
> > Yes, it used to need them to work correctly as a default emulator. However, we don't generally stop
> > an external emulator from registering ranges that are handled by emulation directly in Xen (e.g.
> > pmtimer), so I don't think you need to special-case these ports.
> 
> That's right, so I guess I would just remove that check (and the one
> introduced for MCFG regions). We also don't check whether any other
> ioreq server has already registered a range.

That's right... it's a last-one-wins game. We could decide to change this in future, but it is quite convenient for testing purposes.

  Paul

> 
> Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup Roger Pau Monne
@ 2019-09-10 10:44   ` Paul Durrant
  2019-09-10 13:28   ` Jan Beulich
  1 sibling, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 10:44 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
> 
> The loop in FOR_EACH_IOREQ_SERVER is backwards, hence the cleanup on
> failure needs to be done forwards.
> 
> Fixes: 97a5a3e30161 ('x86/hvm/ioreq: maintain an array of ioreq servers rather than a list')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Good spot!

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> Changes since v1:
>  - New in this version.
> ---
>  xen/arch/x86/hvm/ioreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index a79cabb680..692b710b02 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1195,7 +1195,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
>      return 0;
> 
>   fail:
> -    while ( id-- != 0 )
> +    while ( id++ != MAX_NR_IOREQ_SERVERS )
>      {
>          s = GET_IOREQ_SERVER(d, id);
> 
> --
> 2.22.0
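
For reference, a sketch of the intended forward cleanup (assuming
FOR_EACH_IOREQ_SERVER walks ids from MAX_NR_IOREQ_SERVERS - 1 down to
0; note a pre-increment would also skip the failed id itself and avoid
indexing GET_IOREQ_SERVER() at MAX_NR_IOREQ_SERVERS):

     fail:
        while ( ++id != MAX_NR_IOREQ_SERVERS )
        {
            s = GET_IOREQ_SERVER(d, id);

            if ( !s )
                continue;

            hvm_ioreq_server_remove_vcpu(s, v);
        }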

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Xen-devel] [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t Roger Pau Monne
@ 2019-09-10 12:31   ` Paul Durrant
  2019-09-20 10:47     ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 12:31 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson,
	Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>;
> Paul Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t
> 
> hvm_select_ioreq_server and hvm_send_ioreq were both using
> hvm_ioreq_server directly; switch to using ioservid_t in order to select
> and forward ioreqs.
> 
> This is a preparatory change, since future patches will use the ioreq
> server id in order to differentiate between internal and external
> ioreq servers.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

... with one suggestion.

[snip]
> diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
> index d3b554d019..8725cc20d3 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -54,6 +54,7 @@
>   */
> 
>  typedef uint16_t ioservid_t;
> +#define XEN_INVALID_IOSERVID 0xffff
> 

Perhaps use (ioservid_t)~0 rather than hardcoding?
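
i.e. something along these lines:

    typedef uint16_t ioservid_t;
    /* Tracks the width of ioservid_t automatically. */
    #define XEN_INVALID_IOSERVID ((ioservid_t)~0)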

  Paul

>  /*
>   * XEN_DMOP_create_ioreq_server: Instantiate a new IOREQ Server for a
> --
> 2.22.0


* Re: [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers Roger Pau Monne
@ 2019-09-10 12:34   ` Paul Durrant
  2019-09-20 10:53   ` Jan Beulich
  1 sibling, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 12:34 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Jan Beulich, Wei Liu, Roger Pau Monne

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Roger Pau Monne
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; Jan Beulich <jbeulich@suse.com>;
> Roger Pau Monne <roger.pau@citrix.com>
> Subject: [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers
> 
> Internal ioreq servers are plain function handlers implemented inside
> of the hypervisor. Note that most fields used by current (external)
> ioreq servers are not needed for internal ones, and hence have been
> placed inside of a struct and packed in a union together with the
> only internal-specific field, a function pointer to a handler.
> 
> This is required in order to have PCI config accesses forwarded to
> external ioreq servers or to internal ones (ie: QEMU emulated devices
> vs vPCI passthrough), and is the first step in order to allow
> unprivileged domains to use vPCI.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> Changes since v1:
>  - Do not add an internal field to the ioreq server struct, whether a
>    server is internal or external can already be inferred from the id.
>  - Add an extra parameter to the internal handler in order to pass
>    user-provided opaque data to the handler.
> ---
>  xen/include/asm-x86/hvm/domain.h | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index bcc5621797..9fbe83f45a 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -52,21 +52,29 @@ struct hvm_ioreq_vcpu {
>  #define MAX_NR_IO_RANGES  256
> 
>  struct hvm_ioreq_server {
> -    struct domain          *target, *emulator;
> -
> +    struct domain          *target;
>      /* Lock to serialize toolstack modifications */
>      spinlock_t             lock;
> -
> -    struct hvm_ioreq_page  ioreq;
> -    struct list_head       ioreq_vcpu_list;
> -    struct hvm_ioreq_page  bufioreq;
> -
> -    /* Lock to serialize access to buffered ioreq ring */
> -    spinlock_t             bufioreq_lock;
> -    evtchn_port_t          bufioreq_evtchn;
>      struct rangeset        *range[NR_IO_RANGE_TYPES];
>      bool                   enabled;
> -    uint8_t                bufioreq_handling;
> +
> +    union {
> +        struct {
> +            struct domain          *emulator;
> +            struct hvm_ioreq_page  ioreq;
> +            struct list_head       ioreq_vcpu_list;
> +            struct hvm_ioreq_page  bufioreq;
> +
> +            /* Lock to serialize access to buffered ioreq ring */
> +            spinlock_t             bufioreq_lock;
> +            evtchn_port_t          bufioreq_evtchn;
> +            uint8_t                bufioreq_handling;
> +        };
> +        struct {
> +            void                   *data;
> +            int (*handler)(struct vcpu *v, ioreq_t *, void *);
> +        };
> +    };
>  };
> 
>  /*
> --
> 2.22.0
> 
> 

* Re: [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support Roger Pau Monne
@ 2019-09-10 12:59   ` Paul Durrant
  2019-09-26 10:49     ` Roger Pau Monné
  2019-09-20 11:15   ` Jan Beulich
  1 sibling, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 12:59 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH v2 05/11] ioreq: add internal ioreq initialization support
> 
> Add support for internal ioreq servers to initialization and
> deinitialization routines, prevent some functions from being executed
> against internal ioreq servers and add guards to only allow internal
> callers to modify internal ioreq servers. External callers (ie: from
> hypercalls) are only allowed to deal with external ioreq servers.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - Do not pass an 'internal' parameter to most functions, and instead
>    use the id to key whether an ioreq server is internal or external.
>  - Prevent enabling an internal server without a handler.
> ---
>  xen/arch/x86/hvm/dm.c            |  17 ++-
>  xen/arch/x86/hvm/ioreq.c         | 173 +++++++++++++++++++------------
>  xen/include/asm-x86/hvm/domain.h |   5 +-
>  xen/include/asm-x86/hvm/ioreq.h  |   8 +-
>  4 files changed, 135 insertions(+), 68 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> index c2fca9f729..6a3682e58c 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -417,7 +417,7 @@ static int dm_op(const struct dmop_args *op_args)
>              break;
> 
>          rc = hvm_create_ioreq_server(d, data->handle_bufioreq,
> -                                     &data->id);
> +                                     &data->id, false);
>          break;
>      }
> 
> @@ -450,6 +450,9 @@ static int dm_op(const struct dmop_args *op_args)
>          rc = -EINVAL;
>          if ( data->pad )
>              break;
> +        rc = -EPERM;
> +        if ( hvm_ioreq_is_internal(data->id) )
> +            break;
> 
>          rc = hvm_map_io_range_to_ioreq_server(d, data->id, data->type,
>                                                data->start, data->end);
> @@ -464,6 +467,9 @@ static int dm_op(const struct dmop_args *op_args)
>          rc = -EINVAL;
>          if ( data->pad )
>              break;
> +        rc = -EPERM;
> +        if ( hvm_ioreq_is_internal(data->id) )
> +            break;
> 
>          rc = hvm_unmap_io_range_from_ioreq_server(d, data->id, data->type,
>                                                    data->start, data->end);
> @@ -481,6 +487,9 @@ static int dm_op(const struct dmop_args *op_args)
>          rc = -EOPNOTSUPP;
>          if ( !hap_enabled(d) )
>              break;
> +        rc = -EPERM;
> +        if ( hvm_ioreq_is_internal(data->id) )
> +            break;
> 
>          if ( first_gfn == 0 )
>              rc = hvm_map_mem_type_to_ioreq_server(d, data->id,
> @@ -528,6 +537,9 @@ static int dm_op(const struct dmop_args *op_args)
>          rc = -EINVAL;
>          if ( data->pad )
>              break;
> +        rc = -EPERM;
> +        if ( hvm_ioreq_is_internal(data->id) )
> +            break;
> 
>          rc = hvm_set_ioreq_server_state(d, data->id, !!data->enabled);
>          break;
> @@ -541,6 +553,9 @@ static int dm_op(const struct dmop_args *op_args)
>          rc = -EINVAL;
>          if ( data->pad )
>              break;
> +        rc = -EPERM;
> +        if ( hvm_ioreq_is_internal(data->id) )
> +            break;
> 
>          rc = hvm_destroy_ioreq_server(d, data->id);
>          break;
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 95492bc111..dbc5e6b4c5 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -59,10 +59,11 @@ static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
>  /*
>   * Iterate over all possible ioreq servers.
>   *
> - * NOTE: The iteration is backwards such that more recently created
> - *       ioreq servers are favoured in hvm_select_ioreq_server().
> - *       This is a semantic that previously existed when ioreq servers
> - *       were held in a linked list.
> + * NOTE: The iteration is backwards such that internal and more recently
> + *       created external ioreq servers are favoured in
> + *       hvm_select_ioreq_server().
> + *       This is a semantic that previously existed for external servers when
> + *       ioreq servers were held in a linked list.
>   */
>  #define FOR_EACH_IOREQ_SERVER(d, id, s) \
>      for ( (id) = MAX_NR_IOREQ_SERVERS; (id) != 0; ) \
> @@ -70,6 +71,12 @@ static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
>              continue; \
>          else
> 
> +#define FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s) \
> +    for ( (id) = MAX_NR_EXTERNAL_IOREQ_SERVERS; (id) != 0; ) \
> +        if ( !(s = GET_IOREQ_SERVER(d, --(id))) ) \
> +            continue; \
> +        else
> +
>  static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
>  {
>      shared_iopage_t *p = s->ioreq.va;
> @@ -86,7 +93,7 @@ bool hvm_io_pending(struct vcpu *v)
>      struct hvm_ioreq_server *s;
>      unsigned int id;
> 
> -    FOR_EACH_IOREQ_SERVER(d, id, s)
> +    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
>      {
>          struct hvm_ioreq_vcpu *sv;
> 
> @@ -190,7 +197,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
>          return false;
>      }
> 
> -    FOR_EACH_IOREQ_SERVER(d, id, s)
> +    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
>      {
>          struct hvm_ioreq_vcpu *sv;
> 
> @@ -430,7 +437,7 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
> 
>      spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
> 
> -    FOR_EACH_IOREQ_SERVER(d, id, s)
> +    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
>      {
>          if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
>          {
> @@ -688,7 +695,7 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>      return rc;
>  }
> 
> -static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
> +static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s, bool internal)
>  {
>      struct hvm_ioreq_vcpu *sv;
> 
> @@ -697,29 +704,40 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
>      if ( s->enabled )
>          goto done;
> 
> -    hvm_remove_ioreq_gfn(s, false);
> -    hvm_remove_ioreq_gfn(s, true);
> +    if ( !internal )
> +    {
> +        hvm_remove_ioreq_gfn(s, false);
> +        hvm_remove_ioreq_gfn(s, true);
> 
> -    s->enabled = true;
> +        list_for_each_entry ( sv,
> +                              &s->ioreq_vcpu_list,
> +                              list_entry )
> +            hvm_update_ioreq_evtchn(s, sv);
> +    }
> +    else if ( !s->handler )
> +    {
> +        ASSERT_UNREACHABLE();
> +        goto done;
> +    }
> 
> -    list_for_each_entry ( sv,
> -                          &s->ioreq_vcpu_list,
> -                          list_entry )
> -        hvm_update_ioreq_evtchn(s, sv);
> +    s->enabled = true;
> 
>    done:
>      spin_unlock(&s->lock);
>  }
> 
> -static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
> +static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s, bool internal)
>  {
>      spin_lock(&s->lock);
> 
>      if ( !s->enabled )
>          goto done;
> 
> -    hvm_add_ioreq_gfn(s, true);
> -    hvm_add_ioreq_gfn(s, false);
> +    if ( !internal )
> +    {
> +        hvm_add_ioreq_gfn(s, true);
> +        hvm_add_ioreq_gfn(s, false);
> +    }
> 
>      s->enabled = false;
> 
> @@ -736,33 +754,39 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
>      int rc;
> 
>      s->target = d;
> -
> -    get_knownalive_domain(currd);
> -    s->emulator = currd;
> -
>      spin_lock_init(&s->lock);
> -    INIT_LIST_HEAD(&s->ioreq_vcpu_list);
> -    spin_lock_init(&s->bufioreq_lock);
> -
> -    s->ioreq.gfn = INVALID_GFN;
> -    s->bufioreq.gfn = INVALID_GFN;
> 
>      rc = hvm_ioreq_server_alloc_rangesets(s, id);
>      if ( rc )
>          return rc;
> 
> -    s->bufioreq_handling = bufioreq_handling;
> -
> -    for_each_vcpu ( d, v )
> +    if ( !hvm_ioreq_is_internal(id) )
>      {
> -        rc = hvm_ioreq_server_add_vcpu(s, v);
> -        if ( rc )
> -            goto fail_add;
> +        get_knownalive_domain(currd);
> +
> +        s->emulator = currd;
> +        INIT_LIST_HEAD(&s->ioreq_vcpu_list);
> +        spin_lock_init(&s->bufioreq_lock);
> +
> +        s->ioreq.gfn = INVALID_GFN;
> +        s->bufioreq.gfn = INVALID_GFN;
> +
> +        s->bufioreq_handling = bufioreq_handling;
> +
> +        for_each_vcpu ( d, v )
> +        {
> +            rc = hvm_ioreq_server_add_vcpu(s, v);
> +            if ( rc )
> +                goto fail_add;
> +        }
>      }
> +    else
> +        s->handler = NULL;

The struct is zeroed out so initializing the handler is not necessary.
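
For reference, assuming the allocation in hvm_create_ioreq_server() is still
done with xzalloc(), the field starts out NULL anyway:

    s = xzalloc(struct hvm_ioreq_server);   /* zero-filled allocation, */
    if ( !s )                               /* so s->handler and s->data */
        goto fail;                          /* are already NULL */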

> 
>      return 0;
> 
>   fail_add:
> +    ASSERT(!hvm_ioreq_is_internal(id));
>      hvm_ioreq_server_remove_all_vcpus(s);
>      hvm_ioreq_server_unmap_pages(s);
> 

I think it would be worthwhile having that ASSERT repeated in the called functions, and in other functions that only operate on external ioreq servers, e.g. hvm_ioreq_server_add_vcpu(), hvm_ioreq_server_map_pages(), etc.
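
Hypothetical shape (the helpers would need the id, or an internal flag, passed in):

    static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
                                         ioservid_t id, struct vcpu *v)
    {
        ASSERT(!hvm_ioreq_is_internal(id));     /* external servers only */
        /* ... existing body unchanged ... */
    }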

> @@ -772,30 +796,34 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
>      return rc;
>  }
> 
> -static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
> +static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s, bool internal)
>  {
>      ASSERT(!s->enabled);
> -    hvm_ioreq_server_remove_all_vcpus(s);
> -
> -    /*
> -     * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
> -     *       hvm_ioreq_server_free_pages() in that order.
> -     *       This is because the former will do nothing if the pages
> -     *       are not mapped, leaving the page to be freed by the latter.
> -     *       However if the pages are mapped then the former will set
> -     *       the page_info pointer to NULL, meaning the latter will do
> -     *       nothing.
> -     */
> -    hvm_ioreq_server_unmap_pages(s);
> -    hvm_ioreq_server_free_pages(s);
> 
>      hvm_ioreq_server_free_rangesets(s);
> 
> -    put_domain(s->emulator);
> +    if ( !internal )

Perhaps 'if ( internal ) return;' so as to avoid indenting the code below and thus shrink the diff.
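
i.e. roughly:

    static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s, bool internal)
    {
        ASSERT(!s->enabled);

        hvm_ioreq_server_free_rangesets(s);

        if ( internal )
            return;

        hvm_ioreq_server_remove_all_vcpus(s);
        /* ... unmap/free pages and put_domain() as in the hunk below ... */
    }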

> +    {
> +        hvm_ioreq_server_remove_all_vcpus(s);
> +
> +        /*
> +         * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
> +         *       hvm_ioreq_server_free_pages() in that order.
> +         *       This is because the former will do nothing if the pages
> +         *       are not mapped, leaving the page to be freed by the latter.
> +         *       However if the pages are mapped then the former will set
> +         *       the page_info pointer to NULL, meaning the latter will do
> +         *       nothing.
> +         */
> +        hvm_ioreq_server_unmap_pages(s);
> +        hvm_ioreq_server_free_pages(s);
> +
> +        put_domain(s->emulator);
> +    }
>  }
> 
>  int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
> -                            ioservid_t *id)
> +                            ioservid_t *id, bool internal)
>  {
>      struct hvm_ioreq_server *s;
>      unsigned int i;
> @@ -811,7 +839,9 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
>      domain_pause(d);
>      spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
> 
> -    for ( i = 0; i < MAX_NR_IOREQ_SERVERS; i++ )
> +    for ( i = (internal ? MAX_NR_EXTERNAL_IOREQ_SERVERS : 0);
> +          i < (internal ? MAX_NR_IOREQ_SERVERS : MAX_NR_EXTERNAL_IOREQ_SERVERS);
> +          i++ )
>      {
>          if ( !GET_IOREQ_SERVER(d, i) )
>              break;
> @@ -821,6 +851,9 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
>      if ( i >= MAX_NR_IOREQ_SERVERS )
>          goto fail;
> 
> +    ASSERT((internal &&
> +            i >= MAX_NR_EXTERNAL_IOREQ_SERVERS && i < MAX_NR_IOREQ_SERVERS) ||
> +           (!internal && i < MAX_NR_EXTERNAL_IOREQ_SERVERS));
>      /*
>       * It is safe to call set_ioreq_server() prior to
>       * hvm_ioreq_server_init() since the target domain is paused.
> @@ -864,20 +897,21 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
>          goto out;
> 
>      rc = -EPERM;
> -    if ( s->emulator != current->domain )
> +    /* NB: internal servers cannot be destroyed. */
> +    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )

Shouldn't the test of hvm_ioreq_is_internal(id) simply be an ASSERT? This function should only be called from a dm_op(), right?
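
i.e. (assuming dm_op() really is the only route into this function):

    rc = -EPERM;
    /* Internal servers have no dm_op() path, so this should be unreachable. */
    ASSERT(!hvm_ioreq_is_internal(id));
    if ( s->emulator != current->domain )
        goto out;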

>          goto out;
> 
>      domain_pause(d);
> 
>      p2m_set_ioreq_server(d, 0, id);
> 
> -    hvm_ioreq_server_disable(s);
> +    hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
> 
>      /*
>       * It is safe to call hvm_ioreq_server_deinit() prior to
>       * set_ioreq_server() since the target domain is paused.
>       */
> -    hvm_ioreq_server_deinit(s);
> +    hvm_ioreq_server_deinit(s, hvm_ioreq_is_internal(id));
>      set_ioreq_server(d, id, NULL);
> 
>      domain_unpause(d);
> @@ -909,7 +943,8 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
>          goto out;
> 
>      rc = -EPERM;
> -    if ( s->emulator != current->domain )
> +    /* NB: don't allow fetching information from internal ioreq servers. */
> +    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )

Again here, and several places below.

  Paul

>          goto out;
> 
>      if ( ioreq_gfn || bufioreq_gfn )
> @@ -956,7 +991,7 @@ int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
>          goto out;
> 
>      rc = -EPERM;
> -    if ( s->emulator != current->domain )
> +    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
>          goto out;
> 
>      rc = hvm_ioreq_server_alloc_pages(s);
> @@ -1010,7 +1045,7 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>          goto out;
> 
>      rc = -EPERM;
> -    if ( s->emulator != current->domain )
> +    if ( !hvm_ioreq_is_internal(id) && s->emulator != current->domain )
>          goto out;
> 
>      switch ( type )
> @@ -1068,7 +1103,7 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>          goto out;
> 
>      rc = -EPERM;
> -    if ( s->emulator != current->domain )
> +    if ( !hvm_ioreq_is_internal(id) && s->emulator != current->domain )
>          goto out;
> 
>      switch ( type )
> @@ -1128,6 +1163,14 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>      if ( !s )
>          goto out;
> 
> +    /*
> +     * NB: do not support mapping internal ioreq servers to memory types, as
> +     * the current internal ioreq servers don't need this feature and it's not
> +     * been tested.
> +     */
> +    rc = -EINVAL;
> +    if ( hvm_ioreq_is_internal(id) )
> +        goto out;
>      rc = -EPERM;
>      if ( s->emulator != current->domain )
>          goto out;
> @@ -1163,15 +1206,15 @@ int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
>          goto out;
> 
>      rc = -EPERM;
> -    if ( s->emulator != current->domain )
> +    if ( !hvm_ioreq_is_internal(id) && s->emulator != current->domain )
>          goto out;
> 
>      domain_pause(d);
> 
>      if ( enabled )
> -        hvm_ioreq_server_enable(s);
> +        hvm_ioreq_server_enable(s, hvm_ioreq_is_internal(id));
>      else
> -        hvm_ioreq_server_disable(s);
> +        hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
> 
>      domain_unpause(d);
> 
> @@ -1190,7 +1233,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
> 
>      spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
> 
> -    FOR_EACH_IOREQ_SERVER(d, id, s)
> +    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
>      {
>          rc = hvm_ioreq_server_add_vcpu(s, v);
>          if ( rc )
> @@ -1202,7 +1245,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
>      return 0;
> 
>   fail:
> -    while ( id++ != MAX_NR_IOREQ_SERVERS )
> +    while ( id++ != MAX_NR_EXTERNAL_IOREQ_SERVERS )
>      {
>          s = GET_IOREQ_SERVER(d, id);
> 
> @@ -1224,7 +1267,7 @@ void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
> 
>      spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
> 
> -    FOR_EACH_IOREQ_SERVER(d, id, s)
> +    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
>          hvm_ioreq_server_remove_vcpu(s, v);
> 
>      spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
> @@ -1241,13 +1284,13 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
> 
>      FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
> -        hvm_ioreq_server_disable(s);
> +        hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
> 
>          /*
>           * It is safe to call hvm_ioreq_server_deinit() prior to
>           * set_ioreq_server() since the target domain is being destroyed.
>           */
> -        hvm_ioreq_server_deinit(s);
> +        hvm_ioreq_server_deinit(s, hvm_ioreq_is_internal(id));
>          set_ioreq_server(d, id, NULL);
> 
>          xfree(s);
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index 9fbe83f45a..9f92838b6e 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -97,7 +97,10 @@ struct hvm_pi_ops {
>      void (*vcpu_block)(struct vcpu *);
>  };
> 
> -#define MAX_NR_IOREQ_SERVERS 8
> +#define MAX_NR_EXTERNAL_IOREQ_SERVERS 8
> +#define MAX_NR_INTERNAL_IOREQ_SERVERS 1
> +#define MAX_NR_IOREQ_SERVERS \
> +    (MAX_NR_EXTERNAL_IOREQ_SERVERS + MAX_NR_INTERNAL_IOREQ_SERVERS)
> 
>  struct hvm_domain {
>      /* Guest page range used for non-default ioreq servers */
> diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
> index 65491c48d2..c3917aa74d 100644
> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -24,7 +24,7 @@ bool handle_hvm_io_completion(struct vcpu *v);
>  bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
> 
>  int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
> -                            ioservid_t *id);
> +                            ioservid_t *id, bool internal);
>  int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id);
>  int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
>                                unsigned long *ioreq_gfn,
> @@ -54,6 +54,12 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
> 
>  void hvm_ioreq_init(struct domain *d);
> 
> +static inline bool hvm_ioreq_is_internal(unsigned int id)
> +{
> +    ASSERT(id < MAX_NR_IOREQ_SERVERS);
> +    return id >= MAX_NR_EXTERNAL_IOREQ_SERVERS;
> +}
> +
>  #endif /* __ASM_X86_HVM_IOREQ_H__ */
> 
>  /*
> --
> 2.22.0


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers Roger Pau Monne
@ 2019-09-10 13:06   ` Paul Durrant
  2019-09-20 11:35   ` Jan Beulich
  1 sibling, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 13:06 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
> 
> Internal ioreq servers are always processed first, and ioreqs are
> dispatched by calling the handler function. Note this is already the
> case due to the implementation of FOR_EACH_IOREQ_SERVER.
> 

I'd re-jig this a bit. Something like...

"Internal ioreq servers will be processed first due to the implementation
 of FOR_EACH_IOREQ_SERVER, and ioreqs are dispatched simply by calling
 the handler function."

> Note that hvm_send_ioreq doesn't get passed the ioreq server id, so
> obtain it from the ioreq server data by doing pointer arithmetic.
> 

I think this 2nd paragraph is stale now?

> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Otherwise LGTM, so with those things fixed up...

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> Changes since v1:
>  - Avoid having to iterate twice over the list of ioreq servers since
>    now internal servers are always processed first by
>    FOR_EACH_IOREQ_SERVER.
>  - Obtain ioreq server id using pointer arithmetic.
> ---
>  xen/arch/x86/hvm/ioreq.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index dbc5e6b4c5..8331a89eae 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1493,9 +1493,18 @@ int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p, bool buffered)
> 
>      ASSERT(s);
> 
> +    if ( hvm_ioreq_is_internal(id) && buffered )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +
>      if ( buffered )
>          return hvm_send_buffered_ioreq(s, proto_p);
> 
> +    if ( hvm_ioreq_is_internal(id) )
> +        return s->handler(curr, proto_p, s->data);
> +
>      if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
>          return X86EMUL_RETRY;
> 
> --
> 2.22.0


* Re: [Xen-devel] [PATCH v2 07/11] ioreq: allow registering internal ioreq server handler
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 07/11] ioreq: allow registering internal ioreq server handler Roger Pau Monne
@ 2019-09-10 13:12   ` Paul Durrant
  0 siblings, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 13:12 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne



> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH v2 07/11] ioreq: allow registering internal ioreq server handler
> 
> Provide a routine to register the handler for an internal ioreq
> server.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - Allow to provide an opaque data parameter to pass to the handler.
>  - Allow changing the handler as long as the server is not enabled.
> ---
>  xen/arch/x86/hvm/ioreq.c        | 35 +++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/hvm/ioreq.h |  4 ++++
>  2 files changed, 39 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 8331a89eae..6339e5f884 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -485,6 +485,41 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
>      return rc;
>  }
> 
> +int hvm_add_ioreq_handler(struct domain *d, ioservid_t id,

I did ask for 'hvm_set_ioreq_handler()'. I think it makes more sense as there's no corresponding 'hvm_remove_ioreq_handler()'.
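
i.e.:

    int hvm_set_ioreq_handler(struct domain *d, ioservid_t id,
                              int (*handler)(struct vcpu *v, ioreq_t *, void *),
                              void *data);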

> +                          int (*handler)(struct vcpu *v, ioreq_t *, void *),
> +                          void *data)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc = 0;
> +
> +    if ( !hvm_ioreq_is_internal(id) )
> +    {
> +        rc = -EINVAL;
> +        goto out;

You just want to return here because you're not holding the lock.
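
i.e.:

    if ( !hvm_ioreq_is_internal(id) )
        return -EINVAL;     /* lock not yet taken, so nothing to unlock */

    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);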

  Paul

> +    }
> +
> +    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
> +    s = get_ioreq_server(d, id);
> +    if ( !s )
> +    {
> +        rc = -ENOENT;
> +        goto out;
> +    }
> +    if ( s->enabled )
> +    {
> +        rc = -EBUSY;
> +        goto out;
> +    }
> +
> +    s->handler = handler;
> +    s->data = data;
> +
> + out:
> +    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
> +
> +    return rc;
> +}
> +
>  static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
>                                      struct hvm_ioreq_vcpu *sv)
>  {
> diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
> index c3917aa74d..90cc2aa938 100644
> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -54,6 +54,10 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
> 
>  void hvm_ioreq_init(struct domain *d);
> 
> +int hvm_add_ioreq_handler(struct domain *d, ioservid_t id,
> +                          int (*handler)(struct vcpu *v, ioreq_t *, void *),
> +                          void *data);
> +
>  static inline bool hvm_ioreq_is_internal(unsigned int id)
>  {
>      ASSERT(id < MAX_NR_IOREQ_SERVERS);
> --
> 2.22.0


* Re: [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup Roger Pau Monne
  2019-09-10 10:44   ` Paul Durrant
@ 2019-09-10 13:28   ` Jan Beulich
  2019-09-10 13:33     ` Roger Pau Monné
  1 sibling, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2019-09-10 13:28 UTC (permalink / raw)
  To: Roger Pau Monne, Paul Durrant; +Cc: xen-devel, Wei Liu, Andrew Cooper

On 03.09.2019 18:14, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1195,7 +1195,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
>      return 0;
>  
>   fail:
> -    while ( id-- != 0 )
> +    while ( id++ != MAX_NR_IOREQ_SERVERS )
>      {
>          s = GET_IOREQ_SERVER(d, id);

With Paul's R-b I was about to commit this, but doesn't this
need to be ++id? (If so, I'll be happy to fix while committing.)

Jan


* Re: [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  2019-09-10 13:28   ` Jan Beulich
@ 2019-09-10 13:33     ` Roger Pau Monné
  2019-09-10 13:35       ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-10 13:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Tue, Sep 10, 2019 at 03:28:57PM +0200, Jan Beulich wrote:
> On 03.09.2019 18:14, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -1195,7 +1195,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
> >      return 0;
> >  
> >   fail:
> > -    while ( id-- != 0 )
> > +    while ( id++ != MAX_NR_IOREQ_SERVERS )
> >      {
> >          s = GET_IOREQ_SERVER(d, id);
> 
> With Paul's R-b I was about to commit this, but doesn't this
> need to be ++id? (If so, I'll be happy to fix while committing.)

The increment is already done in the loop condition.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  2019-09-10 13:33     ` Roger Pau Monné
@ 2019-09-10 13:35       ` Jan Beulich
  2019-09-10 13:42         ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2019-09-10 13:35 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On 10.09.2019 15:33, Roger Pau Monné  wrote:
> On Tue, Sep 10, 2019 at 03:28:57PM +0200, Jan Beulich wrote:
>> On 03.09.2019 18:14, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/ioreq.c
>>> +++ b/xen/arch/x86/hvm/ioreq.c
>>> @@ -1195,7 +1195,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
>>>      return 0;
>>>  
>>>   fail:
>>> -    while ( id-- != 0 )
>>> +    while ( id++ != MAX_NR_IOREQ_SERVERS )
>>>      {
>>>          s = GET_IOREQ_SERVER(d, id);
>>
>> With Paul's R-b I was about to commit this, but doesn't this
>> need to be ++id? (If so, I'll be happy to fix while committing.)
> 
> The increment is already done in the loop condition.

That's the increment I mean. I'm sorry for the ambiguity; I
didn't want to cut too much of the context.

Jan


* Re: [Xen-devel] [PATCH v2 08/11] ioreq: allow decoding accesses to MMCFG regions
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 08/11] ioreq: allow decoding accesses to MMCFG regions Roger Pau Monne
@ 2019-09-10 13:37   ` Paul Durrant
  0 siblings, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 13:37 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH v2 08/11] ioreq: allow decoding accesses to MMCFG regions
> 
> Pick up on the infrastructure already added for vPCI and allow ioreq
> to decode accesses to MMCFG regions registered for a domain. This
> infrastructure is still only accessible from internal callers, so
> MMCFG regions can only be registered from the internal domain builder
> used by PVH dom0.
> 
> Note that the vPCI infrastructure to decode and handle accesses to
> MMCFG regions will be removed in following patches when vPCI is
> switched to become an internal ioreq server.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

[snip]
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 6339e5f884..fecdc2786f 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1090,21 +1090,34 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>          /* PCI config space accesses are handled internally. */
>          if ( start <= 0xcf8 + 8 && 0xcf8 <= end )
>              goto out;
> -        else
> -            /* fallthrough. */
> +        break;
> +
>      case XEN_DMOP_IO_RANGE_MEMORY:
> +    {
> +        const struct hvm_mmcfg *mmcfg;
> +
> +        rc = -EINVAL;
> +        /* PCI config space accesses are handled internally. */
> +        read_lock(&d->arch.hvm.mmcfg_lock);
> +        list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
> +            if ( start <= mmcfg->addr + mmcfg->size && mmcfg->addr <= end )
> +            {
> +                read_unlock(&d->arch.hvm.mmcfg_lock);
> +                goto out;
> +            }
> +        read_unlock(&d->arch.hvm.mmcfg_lock);
> +        break;
> +    }
> +

Like with cf8 registration, I don't think you want to error here. It's never been a hard error for an external emulator to attempt to register for accesses that are actually handled within Xen. Doing so would mean that we may need to teach QEMU what Xen does and doesn't deal with internally, and that seems like an unnecessary headache.

  Paul


* Re: [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  2019-09-10 13:35       ` Jan Beulich
@ 2019-09-10 13:42         ` Roger Pau Monné
  2019-09-10 13:53           ` Paul Durrant
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-10 13:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On Tue, Sep 10, 2019 at 03:35:06PM +0200, Jan Beulich wrote:
> On 10.09.2019 15:33, Roger Pau Monné  wrote:
> > On Tue, Sep 10, 2019 at 03:28:57PM +0200, Jan Beulich wrote:
> >> On 03.09.2019 18:14, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/hvm/ioreq.c
> >>> +++ b/xen/arch/x86/hvm/ioreq.c
> >>> @@ -1195,7 +1195,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
> >>>      return 0;
> >>>  
> >>>   fail:
> >>> -    while ( id-- != 0 )
> >>> +    while ( id++ != MAX_NR_IOREQ_SERVERS )
> >>>      {
> >>>          s = GET_IOREQ_SERVER(d, id);
> >>
> >> With Paul's R-b I was about to commit this, but doesn't this
> >> need to be ++id? (If so, I'll be happy to fix while committing.)
> > 
> > The increment is already done in the loop condition.
> 
> That's the increment I mean. I'm sorry for the ambiguity; I
> didn't want to cut too much of the context.

Oh sorry, yes I think you are correct, or else we would overrun the
array by one.
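
For the record, the difference between the two:

    /* Post-increment: the condition compares the old value, but the body
     * indexes with the incremented one, so the last iteration runs with
     * id == MAX_NR_IOREQ_SERVERS, i.e. one past the end of the array. */
    while ( id++ != MAX_NR_IOREQ_SERVERS )
        s = GET_IOREQ_SERVER(d, id);

    /* Pre-increment: the body only ever sees id < MAX_NR_IOREQ_SERVERS. */
    while ( ++id != MAX_NR_IOREQ_SERVERS )
        s = GET_IOREQ_SERVER(d, id);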

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server Roger Pau Monne
@ 2019-09-10 13:49   ` Paul Durrant
  2019-09-26 15:07     ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 13:49 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson,
	Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu
> <wl@xen.org>; Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Jan
> Beulich <jbeulich@suse.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>;
> Paul Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH v2 09/11] vpci: register as an internal ioreq server
> 
> Switch vPCI to become an internal ioreq server, and hence drop all the
> vPCI-specific decoding and trapping of PCI IO ports and MMCFG regions.
> 
> This allows unifying the vPCI code with the ioreq infrastructure,
> opening the door to domains having PCI accesses handled by vPCI and
> other ioreq servers at the same time.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

[snip]
> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
> index f61f66df5f..bf2c64a0a9 100644
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -11,6 +11,7 @@
>  #include <asm/current.h>
>  #include <asm/io_apic.h>
>  #include <asm/msi.h>
> +#include <asm/hvm/ioreq.h>

Why is this change necessary on its own?

>  #include <asm/hvm/irq.h>
>  #include <asm/hypercall.h>
>  #include <public/xen.h>
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index cbd1bac7fc..5664020c2d 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -20,6 +20,8 @@
>  #include <xen/sched.h>
>  #include <xen/vpci.h>
> 
> +#include <asm/hvm/ioreq.h>
> +
>  /* Internal struct to store the emulated PCI registers. */
>  struct vpci_register {
>      vpci_read_t *read;
> @@ -302,7 +304,7 @@ static uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size,
>      return (data & ~(mask << (offset * 8))) | ((new & mask) << (offset * 8));
>  }
> 
> -uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> +static uint32_t read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>  {
>      const struct domain *d = current->domain;
>      const struct pci_dev *pdev;
> @@ -404,8 +406,8 @@ static void vpci_write_helper(const struct pci_dev *pdev,
>               r->private);
>  }
> 
> -void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> -                uint32_t data)
> +static void write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> +                  uint32_t data)
>  {
>      const struct domain *d = current->domain;
>      const struct pci_dev *pdev;
> @@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>      spin_unlock(&pdev->vpci->lock);
>  }
> 
> +#ifdef __XEN__
> +static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
> +{
> +    pci_sbdf_t sbdf;
> +
> +    if ( req->type == IOREQ_TYPE_INVALIDATE )
> +        /*
> +         * Ignore invalidate requests, those can be received even without
> +         * having any memory ranges registered, see send_invalidate_req.
> +         */
> +        return X86EMUL_OKAY;

In general, I wonder whether internal servers will ever need to deal with invalidate? The code only exists to get QEMU to drop its map cache after a decrease_reservation so that the page refs get dropped.

  Paul

> +
> +    if ( req->type != IOREQ_TYPE_PCI_CONFIG || req->data_is_ptr )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +
> +    sbdf.sbdf = req->addr >> 32;
> +
> +    if ( req->dir )
> +        req->data = read(sbdf, req->addr, req->size);
> +    else
> +        write(sbdf, req->addr, req->size, req->data);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +int vpci_register_ioreq(struct domain *d)
> +{
> +    ioservid_t id;
> +    int rc;
> +
> +    if ( !has_vpci(d) )
> +        return 0;
> +
> +    rc = hvm_create_ioreq_server(d, HVM_IOREQSRV_BUFIOREQ_OFF, &id, true);
> +    if ( rc )
> +        return rc;
> +
> +    rc = hvm_add_ioreq_handler(d, id, ioreq_handler, NULL);
> +    if ( rc )
> +        return rc;
> +
> +    if ( is_hardware_domain(d) )
> +    {
> +        /* Handle all devices in vpci. */
> +        rc = hvm_map_io_range_to_ioreq_server(d, id, XEN_DMOP_IO_RANGE_PCI,
> +                                              0, ~(uint64_t)0);
> +        if ( rc )
> +            return rc;
> +    }
> +
> +    rc = hvm_set_ioreq_server_state(d, id, true);
> +    if ( rc )
> +        return rc;
> +
> +    return rc;
> +}
> +#endif
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 4cf233c779..36f435ed5b 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -23,6 +23,9 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
>    static vpci_register_init_t *const x##_entry  \
>                 __used_section(".data.vpci." p) = x
> 
> +/* Register vPCI handler with ioreq. */
> +int vpci_register_ioreq(struct domain *d);
> +
>  /* Add vPCI handlers to device. */
>  int __must_check vpci_add_handlers(struct pci_dev *dev);
> 
> @@ -38,11 +41,6 @@ int __must_check vpci_add_register(struct vpci *vpci,
>  int __must_check vpci_remove_register(struct vpci *vpci, unsigned int offset,
>                                        unsigned int size);
> 
> -/* Generic read/write handlers for the PCI config space. */
> -uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size);
> -void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> -                uint32_t data);
> -
>  /* Passthrough handlers. */
>  uint32_t vpci_hw_read16(const struct pci_dev *pdev, unsigned int reg,
>                          void *data);
> @@ -221,20 +219,12 @@ static inline int vpci_add_handlers(struct pci_dev *pdev)
>      return 0;
>  }
> 
> -static inline void vpci_dump_msi(void) { }
> -
> -static inline uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg,
> -                                 unsigned int size)
> +static inline int vpci_register_ioreq(struct domain *d)
>  {
> -    ASSERT_UNREACHABLE();
> -    return ~(uint32_t)0;
> +    return 0;
>  }
> 
> -static inline void vpci_write(pci_sbdf_t sbdf, unsigned int reg,
> -                              unsigned int size, uint32_t data)
> -{
> -    ASSERT_UNREACHABLE();
> -}
> +static inline void vpci_dump_msi(void) { }
> 
>  static inline bool vpci_process_pending(struct vcpu *v)
>  {
> --
> 2.22.0


* Re: [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
  2019-09-10 13:42         ` Roger Pau Monné
@ 2019-09-10 13:53           ` Paul Durrant
  0 siblings, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 13:53 UTC (permalink / raw)
  To: Roger Pau Monne, Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 10 September 2019 14:42
> To: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; xen-
> devel@lists.xenproject.org; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
> 
> On Tue, Sep 10, 2019 at 03:35:06PM +0200, Jan Beulich wrote:
> > On 10.09.2019 15:33, Roger Pau Monné  wrote:
> > > On Tue, Sep 10, 2019 at 03:28:57PM +0200, Jan Beulich wrote:
> > >> On 03.09.2019 18:14, Roger Pau Monne wrote:
> > >>> --- a/xen/arch/x86/hvm/ioreq.c
> > >>> +++ b/xen/arch/x86/hvm/ioreq.c
> > >>> @@ -1195,7 +1195,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
> > >>>      return 0;
> > >>>
> > >>>   fail:
> > >>> -    while ( id-- != 0 )
> > >>> +    while ( id++ != MAX_NR_IOREQ_SERVERS )
> > >>>      {
> > >>>          s = GET_IOREQ_SERVER(d, id);
> > >>
> > >> With Paul's R-b I was about to commit this, but doesn't this
> > >> need to be ++id? (If so, I'll be happy to fix while committing.)
> > >
> > > The increment is already done in the loop condition.
> >
> > That's the increment I mean. I'm sorry for the ambiguity; I
> > didn't want to cut too much of the context.
> 
> Oh sorry, yes I think you are correct, or else we would overrun the
> array by one.

Indeed. I should have spotted that.

  Paul

> 
> Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses Roger Pau Monne
@ 2019-09-10 14:06   ` Paul Durrant
  2019-09-26 16:05     ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 14:06 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses
> 
> Place the code that converts a PIO/COPY ioreq into a PCI_CONFIG one
> into a separate function, and adjust the code to make use of this
> newly introduced function.
> 
> No functional change intended.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - New in this version.
> ---
>  xen/arch/x86/hvm/ioreq.c | 111 +++++++++++++++++++++++----------------
>  1 file changed, 67 insertions(+), 44 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index fecdc2786f..33c56b880c 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -183,6 +183,54 @@ static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
>      return true;
>  }
> 
> +static void convert_pci_ioreq(struct domain *d, ioreq_t *p)
> +{
> +    const struct hvm_mmcfg *mmcfg;
> +    uint32_t cf8 = d->arch.hvm.pci_cf8;
> +
> +    if ( p->type != IOREQ_TYPE_PIO && p->type != IOREQ_TYPE_COPY )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return;
> +    }
> +
> +    read_lock(&d->arch.hvm.mmcfg_lock);

Actually, looking at this... can you not restrict holding the mmcfg_lock...

> +    if ( (p->type == IOREQ_TYPE_PIO &&
> +          (p->addr & ~3) == 0xcfc &&
> +          CF8_ENABLED(cf8)) ||
> +         (p->type == IOREQ_TYPE_COPY &&
> +          (mmcfg = hvm_mmcfg_find(d, p->addr)) != NULL) )
> +    {
> +        uint32_t x86_fam;
> +        pci_sbdf_t sbdf;
> +        unsigned int reg;
> +
> +        reg = p->type == IOREQ_TYPE_PIO ? hvm_pci_decode_addr(cf8, p->addr,
> +                                                              &sbdf)
> +                                        : hvm_mmcfg_decode_addr(mmcfg, p->addr,
> +                                                                &sbdf);

... to within hvm_mmcfg_decode_addr()?
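
Something like the below could work (untested sketch; mmcfg_decode_locked() is
a made-up name wrapping the existing helpers):

    static bool mmcfg_decode_locked(struct domain *d, paddr_t addr,
                                    unsigned int *reg, pci_sbdf_t *sbdf)
    {
        const struct hvm_mmcfg *mmcfg;
        bool found = false;

        read_lock(&d->arch.hvm.mmcfg_lock);
        mmcfg = hvm_mmcfg_find(d, addr);
        if ( mmcfg )
        {
            *reg = hvm_mmcfg_decode_addr(mmcfg, addr, sbdf);
            found = true;
        }
        read_unlock(&d->arch.hvm.mmcfg_lock);

        return found;
    }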

  Paul

> +
> +        /* PCI config data cycle */
> +        p->addr = ((uint64_t)sbdf.sbdf << 32) | reg;
> +        /* AMD extended configuration space access? */
> +        if ( p->type == IOREQ_TYPE_PIO && CF8_ADDR_HI(cf8) &&
> +             d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
> +             (x86_fam = get_cpu_family(
> +                 d->arch.cpuid->basic.raw_fms, NULL, NULL)) > 0x10 &&
> +             x86_fam < 0x17 )
> +        {
> +            uint64_t msr_val;
> +
> +            if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
> +                 (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
> +                p->addr |= CF8_ADDR_HI(cf8);
> +        }
> +        p->type = IOREQ_TYPE_PCI_CONFIG;
> +
> +    }
> +    read_unlock(&d->arch.hvm.mmcfg_lock);
> +}
> +
>  bool handle_hvm_io_completion(struct vcpu *v)
>  {
>      struct domain *d = v->domain;
> @@ -1350,57 +1398,36 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
>  ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
>  {
>      struct hvm_ioreq_server *s;
> -    uint32_t cf8;
>      uint8_t type;
> -    uint64_t addr;
>      unsigned int id;
> -    const struct hvm_mmcfg *mmcfg;
> 
>      if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
>          return XEN_INVALID_IOSERVID;
> 
> -    cf8 = d->arch.hvm.pci_cf8;
> +    /*
> +     * Check and convert the PIO/MMIO ioreq to a PCI config space
> +     * access.
> +     */
> +    convert_pci_ioreq(d, p);
> 
> -    read_lock(&d->arch.hvm.mmcfg_lock);
> -    if ( (p->type == IOREQ_TYPE_PIO &&
> -          (p->addr & ~3) == 0xcfc &&
> -          CF8_ENABLED(cf8)) ||
> -         (p->type == IOREQ_TYPE_COPY &&
> -          (mmcfg = hvm_mmcfg_find(d, p->addr)) != NULL) )
> +    switch ( p->type )
>      {
> -        uint32_t x86_fam;
> -        pci_sbdf_t sbdf;
> -        unsigned int reg;
> +    case IOREQ_TYPE_PIO:
> +        type = XEN_DMOP_IO_RANGE_PORT;
> +        break;
> 
> -        reg = p->type == IOREQ_TYPE_PIO ? hvm_pci_decode_addr(cf8, p->addr,
> -                                                              &sbdf)
> -                                        : hvm_mmcfg_decode_addr(mmcfg, p->addr,
> -                                                                &sbdf);
> +    case IOREQ_TYPE_COPY:
> +        type = XEN_DMOP_IO_RANGE_MEMORY;
> +        break;
> 
> -        /* PCI config data cycle */
> +    case IOREQ_TYPE_PCI_CONFIG:
>          type = XEN_DMOP_IO_RANGE_PCI;
> -        addr = ((uint64_t)sbdf.sbdf << 32) | reg;
> -        /* AMD extended configuration space access? */
> -        if ( p->type == IOREQ_TYPE_PIO && CF8_ADDR_HI(cf8) &&
> -             d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
> -             (x86_fam = get_cpu_family(
> -                 d->arch.cpuid->basic.raw_fms, NULL, NULL)) > 0x10 &&
> -             x86_fam < 0x17 )
> -        {
> -            uint64_t msr_val;
> +        break;
> 
> -            if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
> -                 (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
> -                addr |= CF8_ADDR_HI(cf8);
> -        }
> -    }
> -    else
> -    {
> -        type = (p->type == IOREQ_TYPE_PIO) ?
> -                XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
> -        addr = p->addr;
> +    default:
> +        ASSERT_UNREACHABLE();
> +        return XEN_INVALID_IOSERVID;
>      }
> -    read_unlock(&d->arch.hvm.mmcfg_lock);
> 
>      FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
> @@ -1416,7 +1443,7 @@ ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
>              unsigned long start, end;
> 
>          case XEN_DMOP_IO_RANGE_PORT:
> -            start = addr;
> +            start = p->addr;
>              end = start + p->size - 1;
>              if ( rangeset_contains_range(r, start, end) )
>                  return id;
> @@ -1433,12 +1460,8 @@ ioservid_t hvm_select_ioreq_server(struct domain *d, ioreq_t *p)
>              break;
> 
>          case XEN_DMOP_IO_RANGE_PCI:
> -            if ( rangeset_contains_singleton(r, addr >> 32) )
> -            {
> -                p->type = IOREQ_TYPE_PCI_CONFIG;
> -                p->addr = addr;
> +            if ( rangeset_contains_singleton(r, p->addr >> 32) )
>                  return id;
> -            }
> 
>              break;
>          }
> --
> 2.22.0


* Re: [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations...
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations Roger Pau Monne
@ 2019-09-10 14:14   ` Paul Durrant
  2019-09-10 14:28     ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 14:14 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson,
	Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 03 September 2019 17:14
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Julien Grall <julien.grall@arm.com>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim
> (Xen.org) <tim@xen.org>
> Subject: [PATCH v2 11/11] ioreq: provide support for long-running operations...
> 
> ...and switch vPCI to use this infrastructure for long-running
> physmap modification operations.
> 
> This allows us to get rid of the vPCI-specific modifications done to
> handle_hvm_io_completion and allows generalizing the support for
> long-running operations to other internal ioreq servers. Such support
> is implemented as a specific handler that can be registered by internal
> ioreq servers and that will be called to check for pending work.
> Returning true from this handler will prevent the vcpu from running
> until the handler returns false.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  xen/arch/x86/hvm/ioreq.c       | 55 +++++++++++++++++++++++++-----
>  xen/drivers/vpci/header.c      | 61 ++++++++++++++++++----------------
>  xen/drivers/vpci/vpci.c        |  8 ++++-
>  xen/include/asm-x86/hvm/vcpu.h |  3 +-
>  xen/include/xen/vpci.h         |  6 ----
>  5 files changed, 89 insertions(+), 44 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 33c56b880c..caa53dfa84 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -239,16 +239,48 @@ bool handle_hvm_io_completion(struct vcpu *v)
>      enum hvm_io_completion io_completion;
>      unsigned int id;
> 
> -    if ( has_vpci(d) && vpci_process_pending(v) )
> -    {
> -        raise_softirq(SCHEDULE_SOFTIRQ);
> -        return false;
> -    }
> -
> -    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
> +    FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
>          struct hvm_ioreq_vcpu *sv;
> 
> +        if ( hvm_ioreq_is_internal(id) )
> +        {

I wonder whether it would be neater to say:

           if ( !hvm_ioreq_is_internal(id) )
               continue;

here, to avoid the indentation below.

> +            if ( vio->io_req.state == STATE_IOREQ_INPROCESS )
> +            {
> +                ioreq_t req = vio->io_req;
> +
> +                /*
> +                 * Check and convert the PIO/MMIO ioreq to a PCI config space
> +                 * access.
> +                 */
> +                convert_pci_ioreq(d, &req);
> +
> +                if ( s->handler(v, &req, s->data) == X86EMUL_RETRY )
> +                {
> +                    /*
> +                     * Need to raise a scheduler irq in order to prevent the
> +                     * guest vcpu from resuming execution.
> +                     *
> +                     * Note this is not required for external ioreq operations
> +                     * because in that case the vcpu is marked as blocked, but
> +                     * this cannot be done for long-running internal
> +                     * operations, since it would prevent the vcpu from being
> +                     * scheduled and thus the long running operation from
> +                     * finishing.
> +                     */
> +                    raise_softirq(SCHEDULE_SOFTIRQ);
> +                    return false;
> +                }
> +
> +                /* Finished processing the ioreq. */
> +                if ( hvm_ioreq_needs_completion(&vio->io_req) )
> +                    vio->io_req.state = STATE_IORESP_READY;
> +                else
> +                    vio->io_req.state = STATE_IOREQ_NONE;

IMO the above is also neater as:

    vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
                        STATE_IORESP_READY : STATE_IOREQ_NONE;

> +            }
> +            continue;
> +        }
> +
>          list_for_each_entry ( sv,
>                                &s->ioreq_vcpu_list,
>                                list_entry )
> @@ -1582,7 +1614,14 @@ int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p, bool buffered)
>          return hvm_send_buffered_ioreq(s, proto_p);
> 
>      if ( hvm_ioreq_is_internal(id) )
> -        return s->handler(curr, proto_p, s->data);
> +    {
> +        int rc = s->handler(curr, proto_p, s->data);
> +
> +        if ( rc == X86EMUL_RETRY )
> +            curr->arch.hvm.hvm_io.io_req.state = STATE_IOREQ_INPROCESS;
> +
> +        return rc;
> +    }
> 
>      if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
>          return X86EMUL_RETRY;
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 3c794f486d..f1c1a69492 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -129,37 +129,42 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
> 
>  bool vpci_process_pending(struct vcpu *v)
>  {
> -    if ( v->vpci.mem )
> +    struct map_data data = {
> +        .d = v->domain,
> +        .map = v->vpci.cmd & PCI_COMMAND_MEMORY,
> +    };
> +    int rc;
> +
> +    if ( !v->vpci.mem )
>      {
> -        struct map_data data = {
> -            .d = v->domain,
> -            .map = v->vpci.cmd & PCI_COMMAND_MEMORY,
> -        };
> -        int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
> -
> -        if ( rc == -ERESTART )
> -            return true;
> -
> -        spin_lock(&v->vpci.pdev->vpci->lock);
> -        /* Disable memory decoding unconditionally on failure. */
> -        modify_decoding(v->vpci.pdev,
> -                        rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY : v->vpci.cmd,
> -                        !rc && v->vpci.rom_only);
> -        spin_unlock(&v->vpci.pdev->vpci->lock);
> -
> -        rangeset_destroy(v->vpci.mem);
> -        v->vpci.mem = NULL;
> -        if ( rc )
> -            /*
> -             * FIXME: in case of failure remove the device from the domain.
> -             * Note that there might still be leftover mappings. While this is
> -             * safe for Dom0, for DomUs the domain will likely need to be
> -             * killed in order to avoid leaking stale p2m mappings on
> -             * failure.
> -             */
> -            vpci_remove_device(v->vpci.pdev);
> +        ASSERT_UNREACHABLE();
> +        return false;
>      }
> 
> +    rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
> +

Extraneous blank line?

  Paul

> +    if ( rc == -ERESTART )
> +        return true;
> +
> +    spin_lock(&v->vpci.pdev->vpci->lock);
> +    /* Disable memory decoding unconditionally on failure. */
> +    modify_decoding(v->vpci.pdev,
> +                    rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY : v->vpci.cmd,
> +                    !rc && v->vpci.rom_only);
> +    spin_unlock(&v->vpci.pdev->vpci->lock);
> +
> +    rangeset_destroy(v->vpci.mem);
> +    v->vpci.mem = NULL;
> +    if ( rc )
> +        /*
> +         * FIXME: in case of failure remove the device from the domain.
> +         * Note that there might still be leftover mappings. While this is
> +         * safe for Dom0, for DomUs the domain will likely need to be
> +         * killed in order to avoid leaking stale p2m mappings on
> +         * failure.
> +         */
> +        vpci_remove_device(v->vpci.pdev);
> +
>      return false;
>  }
> 
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 5664020c2d..6069dff612 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -498,6 +498,12 @@ static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
>          return X86EMUL_UNHANDLEABLE;
>      }
> 
> +    if ( v->vpci.mem )
> +    {
> +        ASSERT(req->state == STATE_IOREQ_INPROCESS);
> +        return vpci_process_pending(v) ? X86EMUL_RETRY : X86EMUL_OKAY;
> +    }
> +
>      sbdf.sbdf = req->addr >> 32;
> 
>      if ( req->dir )
> @@ -505,7 +511,7 @@ static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
>      else
>          write(sbdf, req->addr, req->size, req->data);
> 
> -    return X86EMUL_OKAY;
> +    return v->vpci.mem ? X86EMUL_RETRY : X86EMUL_OKAY;
>  }
> 
>  int vpci_register_ioreq(struct domain *d)
> diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
> index 38f5c2bb9b..4563746466 100644
> --- a/xen/include/asm-x86/hvm/vcpu.h
> +++ b/xen/include/asm-x86/hvm/vcpu.h
> @@ -92,7 +92,8 @@ struct hvm_vcpu_io {
> 
>  static inline bool hvm_ioreq_needs_completion(const ioreq_t *ioreq)
>  {
> -    return ioreq->state == STATE_IOREQ_READY &&
> +    return (ioreq->state == STATE_IOREQ_READY ||
> +            ioreq->state == STATE_IOREQ_INPROCESS) &&
>             !ioreq->data_is_ptr &&
>             (ioreq->type != IOREQ_TYPE_PIO || ioreq->dir != IOREQ_WRITE);
>  }
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 36f435ed5b..a65491e0c9 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -225,12 +225,6 @@ static inline int vpci_register_ioreq(struct domain *d)
>  }
> 
>  static inline void vpci_dump_msi(void) { }
> -
> -static inline bool vpci_process_pending(struct vcpu *v)
> -{
> -    ASSERT_UNREACHABLE();
> -    return false;
> -}
>  #endif
> 
>  #endif
> --
> 2.22.0


* Re: [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations...
  2019-09-10 14:14   ` Paul Durrant
@ 2019-09-10 14:28     ` Roger Pau Monné
  2019-09-10 14:40       ` Paul Durrant
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-10 14:28 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

On Tue, Sep 10, 2019 at 04:14:18PM +0200, Paul Durrant wrote:
> > -----Original Message-----
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: 03 September 2019 17:14
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> > <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap
> > <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Julien Grall <julien.grall@arm.com>;
> > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim
> > (Xen.org) <tim@xen.org>
> > Subject: [PATCH v2 11/11] ioreq: provide support for long-running operations...
> > 
> > ...and switch vPCI to use this infrastructure for long-running
> > physmap modification operations.
> > 
> > This allows us to get rid of the vPCI-specific modifications done to
> > handle_hvm_io_completion and allows generalizing the support for
> > long-running operations to other internal ioreq servers. Such support
> > is implemented as a specific handler that can be registered by internal
> > ioreq servers and that will be called to check for pending work.
> > Returning true from this handler will prevent the vcpu from running
> > until the handler returns false.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> >  xen/arch/x86/hvm/ioreq.c       | 55 +++++++++++++++++++++++++-----
> >  xen/drivers/vpci/header.c      | 61 ++++++++++++++++++----------------
> >  xen/drivers/vpci/vpci.c        |  8 ++++-
> >  xen/include/asm-x86/hvm/vcpu.h |  3 +-
> >  xen/include/xen/vpci.h         |  6 ----
> >  5 files changed, 89 insertions(+), 44 deletions(-)
> > 
> > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > index 33c56b880c..caa53dfa84 100644
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -239,16 +239,48 @@ bool handle_hvm_io_completion(struct vcpu *v)
> >      enum hvm_io_completion io_completion;
> >      unsigned int id;
> > 
> > -    if ( has_vpci(d) && vpci_process_pending(v) )
> > -    {
> > -        raise_softirq(SCHEDULE_SOFTIRQ);
> > -        return false;
> > -    }
> > -
> > -    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
> >      {
> >          struct hvm_ioreq_vcpu *sv;
> > 
> > +        if ( hvm_ioreq_is_internal(id) )
> > +        {
> 
> I wonder whether it would be neater to say:
> 
>            if ( !hvm_ioreq_is_internal(id) )
>                continue;
> 
> here, to avoid the indentation below.

I'm not sure I follow. This loop does work for both the internal and
the external ioreq servers, and hence skipping external servers would
prevent the iteration over ioreq_vcpu_list done below for external
servers.

> 
> > +            if ( vio->io_req.state == STATE_IOREQ_INPROCESS )
> > +            {
> > +                ioreq_t req = vio->io_req;
> > +
> > +                /*
> > +                 * Check and convert the PIO/MMIO ioreq to a PCI config space
> > +                 * access.
> > +                 */
> > +                convert_pci_ioreq(d, &req);
> > +
> > +                if ( s->handler(v, &req, s->data) == X86EMUL_RETRY )
> > +                {
> > +                    /*
> > +                     * Need to raise a scheduler irq in order to prevent the
> > +                     * guest vcpu from resuming execution.
> > +                     *
> > +                     * Note this is not required for external ioreq operations
> > +                     * because in that case the vcpu is marked as blocked, but
> > +                     * this cannot be done for long-running internal
> > +                     * operations, since it would prevent the vcpu from being
> > +                     * scheduled and thus the long running operation from
> > +                     * finishing.
> > +                     */
> > +                    raise_softirq(SCHEDULE_SOFTIRQ);
> > +                    return false;
> > +                }
> > +
> > +                /* Finished processing the ioreq. */
> > +                if ( hvm_ioreq_needs_completion(&vio->io_req) )
> > +                    vio->io_req.state = STATE_IORESP_READY;
> > +                else
> > +                    vio->io_req.state = STATE_IOREQ_NONE;
> 
> IMO the above is also neater as:
> 
>     vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
>                         STATE_IORESP_READY : STATE_IOREQ_NONE;
> 
> > +            }
> > +            continue;
> > +        }
> > +
> >          list_for_each_entry ( sv,
> >                                &s->ioreq_vcpu_list,
> >                                list_entry )

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations...
  2019-09-10 14:28     ` Roger Pau Monné
@ 2019-09-10 14:40       ` Paul Durrant
  0 siblings, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-10 14:40 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 10 September 2019 15:28
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v2 11/11] ioreq: provide support for long-running operations...
> 
> On Tue, Sep 10, 2019 at 04:14:18PM +0200, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > Sent: 03 September 2019 17:14
> > > To: xen-devel@lists.xenproject.org
> > > Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> > > <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; George
> Dunlap
> > > <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Julien Grall
> <julien.grall@arm.com>;
> > > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim
> > > (Xen.org) <tim@xen.org>
> > > Subject: [PATCH v2 11/11] ioreq: provide support for long-running operations...
> > >
> > > ...and switch vPCI to use this infrastructure for long-running
> > > physmap modification operations.
> > >
> > > This allows us to get rid of the vPCI-specific modifications done to
> > > handle_hvm_io_completion and allows generalizing the support for
> > > long-running operations to other internal ioreq servers. Such support
> > > is implemented as a specific handler that can be registered by internal
> > > ioreq servers and that will be called to check for pending work.
> > > Returning true from this handler will prevent the vcpu from running
> > > until the handler returns false.
> > >
> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > ---
> > >  xen/arch/x86/hvm/ioreq.c       | 55 +++++++++++++++++++++++++-----
> > >  xen/drivers/vpci/header.c      | 61 ++++++++++++++++++----------------
> > >  xen/drivers/vpci/vpci.c        |  8 ++++-
> > >  xen/include/asm-x86/hvm/vcpu.h |  3 +-
> > >  xen/include/xen/vpci.h         |  6 ----
> > >  5 files changed, 89 insertions(+), 44 deletions(-)
> > >
> > > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > > index 33c56b880c..caa53dfa84 100644
> > > --- a/xen/arch/x86/hvm/ioreq.c
> > > +++ b/xen/arch/x86/hvm/ioreq.c
> > > @@ -239,16 +239,48 @@ bool handle_hvm_io_completion(struct vcpu *v)
> > >      enum hvm_io_completion io_completion;
> > >      unsigned int id;
> > >
> > > -    if ( has_vpci(d) && vpci_process_pending(v) )
> > > -    {
> > > -        raise_softirq(SCHEDULE_SOFTIRQ);
> > > -        return false;
> > > -    }
> > > -
> > > -    FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
> > > +    FOR_EACH_IOREQ_SERVER(d, id, s)
> > >      {
> > >          struct hvm_ioreq_vcpu *sv;
> > >
> > > +        if ( hvm_ioreq_is_internal(id) )
> > > +        {
> >
> > I wonder whether it would be neater to say:
> >
> >            if ( !hvm_ioreq_is_internal(id) )
> >                continue;
> >
> > here, to avoid the indentation below.
> 
> I'm not sure I follow. This loop does work for both the internal and
> the external ioreq servers, and hence skipping external servers would
> prevent the iteration over ioreq_vcpu_list done below for external
> servers.

Sorry, I got my nesting mixed up. What I actually mean is:

+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;

+        if ( hvm_ioreq_is_internal(id) )
+        {
+            ioreq_t req = vio->io_req;
+
+            if ( vio->io_req.state != STATE_IOREQ_INPROCESS )
+                continue;
+
+            /*
+             * Check and convert the PIO/MMIO ioreq to a PCI config space
+             * access.
+             */
+            convert_pci_ioreq(d, &req);
+
+            if ( s->handler(v, &req, s->data) == X86EMUL_RETRY )
+            {
+                /*
+                 * Need to raise a scheduler irq in order to prevent the
+                 * guest vcpu from resuming execution.
+                 *
+                 * Note this is not required for external ioreq operations
+                 * because in that case the vcpu is marked as blocked, but
+                 * this cannot be done for long-running internal
+                 * operations, since it would prevent the vcpu from being
+                 * scheduled and thus the long running operation from
+                 * finishing.
+                 */
+                raise_softirq(SCHEDULE_SOFTIRQ);
+                return false;
+            }
+
+            /* Finished processing the ioreq. */
+            vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
+                                STATE_IORESP_READY : STATE_IOREQ_NONE;
+            continue;
+        }
+

  Paul

> 
> >
> > > +            if ( vio->io_req.state == STATE_IOREQ_INPROCESS )
> > > +            {
> > > +                ioreq_t req = vio->io_req;
> > > +
> > > +                /*
> > > +                 * Check and convert the PIO/MMIO ioreq to a PCI config space
> > > +                 * access.
> > > +                 */
> > > +                convert_pci_ioreq(d, &req);
> > > +
> > > +                if ( s->handler(v, &req, s->data) == X86EMUL_RETRY )
> > > +                {
> > > +                    /*
> > > +                     * Need to raise a scheduler irq in order to prevent the
> > > +                     * guest vcpu from resuming execution.
> > > +                     *
> > > +                     * Note this is not required for external ioreq operations
> > > +                     * because in that case the vcpu is marked as blocked, but
> > > +                     * this cannot be done for long-running internal
> > > +                     * operations, since it would prevent the vcpu from being
> > > +                     * scheduled and thus the long running operation from
> > > +                     * finishing.
> > > +                     */
> > > +                    raise_softirq(SCHEDULE_SOFTIRQ);
> > > +                    return false;
> > > +                }
> > > +
> > > +                /* Finished processing the ioreq. */
> > > +                if ( hvm_ioreq_needs_completion(&vio->io_req) )
> > > +                    vio->io_req.state = STATE_IORESP_READY;
> > > +                else
> > > +                    vio->io_req.state = STATE_IOREQ_NONE;
> >
> > IMO the above is also neater as:
> >
> >     vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
> >                         STATE_IORESP_READY : STATE_IOREQ_NONE;
> >
> > > +            }
> > > +            continue;
> > > +        }
> > > +
> > >          list_for_each_entry ( sv,
> > >                                &s->ioreq_vcpu_list,
> > >                                list_entry )
> 
> Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t
  2019-09-10 12:31   ` Paul Durrant
@ 2019-09-20 10:47     ` Jan Beulich
  0 siblings, 0 replies; 56+ messages in thread
From: Jan Beulich @ 2019-09-20 10:47 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Paul Durrant, Ian Jackson, xen-devel

On 10.09.2019 14:31, Paul Durrant wrote:
>> -----Original Message-----
>> From: Roger Pau Monne <roger.pau@citrix.com>
>> Sent: 03 September 2019 17:14
>> To: xen-devel@lists.xenproject.org
>> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap <George.Dunlap@citrix.com>; Ian
>> Jackson <Ian.Jackson@citrix.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk
>> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>;
>> Paul Durrant <Paul.Durrant@citrix.com>
>> Subject: [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t
>>
>> hvm_select_ioreq_server and hvm_send_ioreq were both using
>> hvm_ioreq_server directly; switch to using ioservid_t in order to select
>> and forward ioreqs.
>>
>> This is a preparatory change, since future patches will use the ioreq
>> server id in order to differentiate between internal and external
>> ioreq servers.
>>
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
> 
> ... with one suggestion.
> 
> [snip]
>> diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
>> index d3b554d019..8725cc20d3 100644
>> --- a/xen/include/public/hvm/dm_op.h
>> +++ b/xen/include/public/hvm/dm_op.h
>> @@ -54,6 +54,7 @@
>>   */
>>
>>  typedef uint16_t ioservid_t;
>> +#define XEN_INVALID_IOSERVID 0xffff
>>
> 
> Perhaps use (ioservid_t)~0 rather than hardcoding?

And then (suitably parenthesized) applicable parts
Acked-by: Jan Beulich <jbeulich@suse.com>
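
I.e. (just spelling out the parenthesization):

    #define XEN_INVALID_IOSERVID ((ioservid_t)~0)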

Jan


* Re: [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers Roger Pau Monne
  2019-09-10 12:34   ` Paul Durrant
@ 2019-09-20 10:53   ` Jan Beulich
  1 sibling, 0 replies; 56+ messages in thread
From: Jan Beulich @ 2019-09-20 10:53 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Andrew Cooper

On 03.09.2019 18:14, Roger Pau Monne wrote:
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -52,21 +52,29 @@ struct hvm_ioreq_vcpu {
>  #define MAX_NR_IO_RANGES  256
>  
>  struct hvm_ioreq_server {
> -    struct domain          *target, *emulator;
> -
> +    struct domain          *target;
>      /* Lock to serialize toolstack modifications */
>      spinlock_t             lock;
> -
> -    struct hvm_ioreq_page  ioreq;
> -    struct list_head       ioreq_vcpu_list;
> -    struct hvm_ioreq_page  bufioreq;
> -
> -    /* Lock to serialize access to buffered ioreq ring */
> -    spinlock_t             bufioreq_lock;
> -    evtchn_port_t          bufioreq_evtchn;
>      struct rangeset        *range[NR_IO_RANGE_TYPES];
>      bool                   enabled;
> -    uint8_t                bufioreq_handling;
> +
> +    union {
> +        struct {
> +            struct domain          *emulator;
> +            struct hvm_ioreq_page  ioreq;
> +            struct list_head       ioreq_vcpu_list;
> +            struct hvm_ioreq_page  bufioreq;
> +
> +            /* Lock to serialize access to buffered ioreq ring */
> +            spinlock_t             bufioreq_lock;
> +            evtchn_port_t          bufioreq_evtchn;
> +            uint8_t                bufioreq_handling;
> +        };
> +        struct {
> +            void                   *data;
> +            int (*handler)(struct vcpu *v, ioreq_t *, void *);

If you omit the latter two parameter names, the first one should
be omitted, too. And if there were to be any inconsistency in this
regard, then the one parameter where the type doesn't immediately
clarify the purpose would be the one to have a name.

As to the struct vcpu * parameter - is there an expectation that
the handler would be called with this being other than "current"?
If not, the parameter would want to either be dropped, or be
named "curr".

Jan


* Re: [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support Roger Pau Monne
  2019-09-10 12:59   ` Paul Durrant
@ 2019-09-20 11:15   ` Jan Beulich
  2019-09-26 10:51     ` Roger Pau Monné
  1 sibling, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2019-09-20 11:15 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 03.09.2019 18:14, Roger Pau Monne wrote:
> @@ -821,6 +851,9 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
>      if ( i >= MAX_NR_IOREQ_SERVERS )
>          goto fail;
>  
> +    ASSERT((internal &&
> +            i >= MAX_NR_EXTERNAL_IOREQ_SERVERS && i < MAX_NR_IOREQ_SERVERS) ||
> +           (!internal && i < MAX_NR_EXTERNAL_IOREQ_SERVERS));

Perhaps easier to read, both here and in the event the assertion
would actually trigger, as either

    ASSERT(internal
           ? i >= MAX_NR_EXTERNAL_IOREQ_SERVERS && i < MAX_NR_IOREQ_SERVERS
           : i < MAX_NR_EXTERNAL_IOREQ_SERVERS);

or even

    ASSERT(i < MAX_NR_EXTERNAL_IOREQ_SERVERS
           ? !internal
           : internal && i < MAX_NR_IOREQ_SERVERS);

?

Jan


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-03 16:14 ` [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers Roger Pau Monne
  2019-09-10 13:06   ` Paul Durrant
@ 2019-09-20 11:35   ` Jan Beulich
  2019-09-26 11:14     ` Roger Pau Monné
  1 sibling, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2019-09-20 11:35 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 03.09.2019 18:14, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1493,9 +1493,18 @@ int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p, bool buffered)
>  
>      ASSERT(s);
>  
> +    if ( hvm_ioreq_is_internal(id) && buffered )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +
>      if ( buffered )
>          return hvm_send_buffered_ioreq(s, proto_p);

Perhaps better (to avoid yet another conditional on the non-
buffered path)

    if ( buffered )
    {
        if ( likely(!hvm_ioreq_is_internal(id)) )
            return hvm_send_buffered_ioreq(s, proto_p);

        ASSERT_UNREACHABLE();
        return X86EMUL_UNHANDLEABLE;
    }

?

> +    if ( hvm_ioreq_is_internal(id) )
> +        return s->handler(curr, proto_p, s->data);

At this point I'm becoming curious what the significance of
ioreq_t's state field is for internal servers, as nothing was
said so far in this regard: Is it entirely unused? Is every
handler supposed to drive it? If so, what about the return value
here and proto_p->state not really matching up? And if not,
shouldn't you update the field here, at the very least to
avoid any chance of confusing callers?

A possible consequence of the answers to this might be for
the hook's middle parameter to be constified (in patch 4).

Having said this, as a result of having looked at some of the
involved code, and with the cover letter not clarifying this,
what's the reason for going this seemingly more complicated
route, rather than putting vPCI behind the hvm_io_intercept()
machinery, just as is the case for other internal handling?

Jan


* Re: [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support
  2019-09-10 12:59   ` Paul Durrant
@ 2019-09-26 10:49     ` Roger Pau Monné
  2019-09-26 10:58       ` Paul Durrant
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 10:49 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Sep 10, 2019 at 02:59:57PM +0200, Paul Durrant wrote:
> > -----Original Message-----
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: 03 September 2019 17:14
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> > <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <Paul.Durrant@citrix.com>
> > Subject: [PATCH v2 05/11] ioreq: add internal ioreq initialization support
> > @@ -736,33 +754,39 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
> >      int rc;
> > 
> >      s->target = d;
> > -
> > -    get_knownalive_domain(currd);
> > -    s->emulator = currd;
> > -
> >      spin_lock_init(&s->lock);
> > -    INIT_LIST_HEAD(&s->ioreq_vcpu_list);
> > -    spin_lock_init(&s->bufioreq_lock);
> > -
> > -    s->ioreq.gfn = INVALID_GFN;
> > -    s->bufioreq.gfn = INVALID_GFN;
> > 
> >      rc = hvm_ioreq_server_alloc_rangesets(s, id);
> >      if ( rc )
> >          return rc;
> > 
> > -    s->bufioreq_handling = bufioreq_handling;
> > -
> > -    for_each_vcpu ( d, v )
> > +    if ( !hvm_ioreq_is_internal(id) )
> >      {
> > -        rc = hvm_ioreq_server_add_vcpu(s, v);
> > -        if ( rc )
> > -            goto fail_add;
> > +        get_knownalive_domain(currd);
> > +
> > +        s->emulator = currd;
> > +        INIT_LIST_HEAD(&s->ioreq_vcpu_list);
> > +        spin_lock_init(&s->bufioreq_lock);
> > +
> > +        s->ioreq.gfn = INVALID_GFN;
> > +        s->bufioreq.gfn = INVALID_GFN;
> > +
> > +        s->bufioreq_handling = bufioreq_handling;
> > +
> > +        for_each_vcpu ( d, v )
> > +        {
> > +            rc = hvm_ioreq_server_add_vcpu(s, v);
> > +            if ( rc )
> > +                goto fail_add;
> > +        }
> >      }
> > +    else
> > +        s->handler = NULL;
> 
> The struct is zeroed out so initializing the handler is not necessary.
> 
> > 
> >      return 0;
> > 
> >   fail_add:
> > +    ASSERT(!hvm_ioreq_is_internal(id));
> >      hvm_ioreq_server_remove_all_vcpus(s);
> >      hvm_ioreq_server_unmap_pages(s);
> > 
> 
> I think it would be worthwhile having that ASSERT repeated in the called functions, and other functions that only operate on external ioreq servers, e.g. hvm_ioreq_server_add_vcpu(), hvm_ioreq_server_map_pages(), etc. 

That's fine, but then I would also need to pass the ioreq server id to
those functions just to perform the ASSERT. I will leave those as-is
because I think passing the id just for that ASSERT is kind of
pointless.

> > @@ -864,20 +897,21 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
> >          goto out;
> > 
> >      rc = -EPERM;
> > -    if ( s->emulator != current->domain )
> > +    /* NB: internal servers cannot be destroyed. */
> > +    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
> 
> Shouldn't the test of hvm_ioreq_is_internal(id) simply be an ASSERT? This function should only be called from a dm_op(), right?

Right, I think I had wrongly assumed this was also called when
destroying a domain, but domain destruction uses
hvm_destroy_all_ioreq_servers instead.

> >          goto out;
> > 
> >      domain_pause(d);
> > 
> >      p2m_set_ioreq_server(d, 0, id);
> > 
> > -    hvm_ioreq_server_disable(s);
> > +    hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
> > 
> >      /*
> >       * It is safe to call hvm_ioreq_server_deinit() prior to
> >       * set_ioreq_server() since the target domain is paused.
> >       */
> > -    hvm_ioreq_server_deinit(s);
> > +    hvm_ioreq_server_deinit(s, hvm_ioreq_is_internal(id));
> >      set_ioreq_server(d, id, NULL);
> > 
> >      domain_unpause(d);
> > @@ -909,7 +943,8 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
> >          goto out;
> > 
> >      rc = -EPERM;
> > -    if ( s->emulator != current->domain )
> > +    /* NB: don't allow fetching information from internal ioreq servers. */
> > +    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
> 
> Again here, and several places below.

I've fixed the calls to hvm_get_ioreq_server_info,
hvm_get_ioreq_server_frame and hvm_map_mem_type_to_ioreq_server to
include an ASSERT that the passed ioreq server is not internal.
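
I.e. each of them now gains something along the lines of (sketch; the
exact placement might differ in v3):

    ASSERT(!hvm_ioreq_is_internal(id));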

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support
  2019-09-20 11:15   ` Jan Beulich
@ 2019-09-26 10:51     ` Roger Pau Monné
  0 siblings, 0 replies; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 10:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Fri, Sep 20, 2019 at 01:15:06PM +0200, Jan Beulich wrote:
> On 03.09.2019 18:14, Roger Pau Monne wrote:
> > @@ -821,6 +851,9 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
> >      if ( i >= MAX_NR_IOREQ_SERVERS )
> >          goto fail;
> >  
> > +    ASSERT((internal &&
> > +            i >= MAX_NR_EXTERNAL_IOREQ_SERVERS && i < MAX_NR_IOREQ_SERVERS) ||
> > +           (!internal && i < MAX_NR_EXTERNAL_IOREQ_SERVERS));
> 
> Perhaps easier to read both here and in the event the assertion
> would actually trigger as either
> 
>     ASSERT(internal
>            ? i >= MAX_NR_EXTERNAL_IOREQ_SERVERS && i < MAX_NR_IOREQ_SERVERS
>            : i < MAX_NR_EXTERNAL_IOREQ_SERVERS);
> 
> or even
> 
>     ASSERT(i < MAX_NR_EXTERNAL_IOREQ_SERVERS
>            ? !internal
>            : internal && i < MAX_NR_IOREQ_SERVERS);
> 
> ?

I went with the last variation of your proposed ASSERT.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support
  2019-09-26 10:49     ` Roger Pau Monné
@ 2019-09-26 10:58       ` Paul Durrant
  0 siblings, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-26 10:58 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

> -----Original Message-----
[snip]
> > >
> > >      return 0;
> > >
> > >   fail_add:
> > > +    ASSERT(!hvm_ioreq_is_internal(id));
> > >      hvm_ioreq_server_remove_all_vcpus(s);
> > >      hvm_ioreq_server_unmap_pages(s);
> > >
> >
> > I think it would be worthwhile having that ASSERT repeated in the called functions, and other
> > functions that only operate on external ioreq servers, e.g. hvm_ioreq_server_add_vcpu(),
> > hvm_ioreq_server_map_pages(), etc.
> 
> That's fine, but then I would also need to pass the ioreq server id to
> those functions just to perform the ASSERT. I will leave those as-is
> because I think passing the id just for that ASSERT is kind of
> pointless.

Oh, I was misremembering the id being recorded in the struct. I guess that was when ioreq servers were in a list rather than an array. Indeed there's no point in passing an id just to ASSERT on it.

> 
> > > @@ -864,20 +897,21 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
> > >          goto out;
> > >
> > >      rc = -EPERM;
> > > -    if ( s->emulator != current->domain )
> > > +    /* NB: internal servers cannot be destroyed. */
> > > +    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
> >
> > Shouldn't the test of hvm_ioreq_is_internal(id) simply be an ASSERT? This function should only be
> called from a dm_op(), right?
> 
> Right, I think I had wrongly assumed this was also called when
> destroying a domain, but domain destruction uses
> hvm_destroy_all_ioreq_servers instead.
> 

That's right.

> > >          goto out;
> > >
> > >      domain_pause(d);
> > >
> > >      p2m_set_ioreq_server(d, 0, id);
> > >
> > > -    hvm_ioreq_server_disable(s);
> > > +    hvm_ioreq_server_disable(s, hvm_ioreq_is_internal(id));
> > >
> > >      /*
> > >       * It is safe to call hvm_ioreq_server_deinit() prior to
> > >       * set_ioreq_server() since the target domain is paused.
> > >       */
> > > -    hvm_ioreq_server_deinit(s);
> > > +    hvm_ioreq_server_deinit(s, hvm_ioreq_is_internal(id));
> > >      set_ioreq_server(d, id, NULL);
> > >
> > >      domain_unpause(d);
> > > @@ -909,7 +943,8 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
> > >          goto out;
> > >
> > >      rc = -EPERM;
> > > -    if ( s->emulator != current->domain )
> > > +    /* NB: don't allow fetching information from internal ioreq servers. */
> > > +    if ( hvm_ioreq_is_internal(id) || s->emulator != current->domain )
> >
> > Again here, and several places below.
> 
> I've fixed the calls to hvm_get_ioreq_server_info,
> hvm_get_ioreq_server_frame and hvm_map_mem_type_to_ioreq_server to
> include an ASSERT that the passed ioreq server is not internal.
> 

Cool. Thanks,

  Paul

> Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-20 11:35   ` Jan Beulich
@ 2019-09-26 11:14     ` Roger Pau Monné
  2019-09-26 13:17       ` Jan Beulich
  2019-09-26 16:36       ` Roger Pau Monné
  0 siblings, 2 replies; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 11:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
> On 03.09.2019 18:14, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -1493,9 +1493,18 @@ int hvm_send_ioreq(ioservid_t id, ioreq_t *proto_p, bool buffered)
> >  
> >      ASSERT(s);
> >  
> > +    if ( hvm_ioreq_is_internal(id) && buffered )
> > +    {
> > +        ASSERT_UNREACHABLE();
> > +        return X86EMUL_UNHANDLEABLE;
> > +    }
> > +
> >      if ( buffered )
> >          return hvm_send_buffered_ioreq(s, proto_p);
> 
> Perhaps better (to avoid yet another conditional on the non-
> buffered path)
> 
>     if ( buffered )
>     {
>         if ( likely(!hvm_ioreq_is_internal(id)) )
>             return hvm_send_buffered_ioreq(s, proto_p);
> 
>         ASSERT_UNREACHABLE();
>         return X86EMUL_UNHANDLEABLE;
>     }
> 
> ?

Sure.

> > +    if ( hvm_ioreq_is_internal(id) )
> > +        return s->handler(curr, proto_p, s->data);
> 
> At this point I'm becoming curious what the significance of
> ioreq_t's state field is for internal servers, as nothing was
> said so far in this regard: Is it entirely unused? Is every
> handler supposed to drive it? If so, what about return value
> here and proto_p->state not really matching up? And if not,
> shouldn't you update the field here, at the very least to
> avoid any chance of confusing callers?

When used by internal servers, the ioreq state field is modified here
so that it can serve as an indication of a long-running operation, but
that's introduced in patch 11/11 ('ioreq: provide support for
long-running operations...').
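
(I.e. the hunk in hvm_send_ioreq there that does:

    if ( rc == X86EMUL_RETRY )
        curr->arch.hvm.hvm_io.io_req.state = STATE_IOREQ_INPROCESS;

after calling the internal handler.)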

> 
> A possible consequence of the answers to this might be for
> the hook's middle parameter to be constified (in patch 4).

Yes, I think it can be constified.
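
I.e. (sketch of the resulting prototype):

    int (*handler)(struct vcpu *v, const ioreq_t *, void *);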

> Having said this, as a result of having looked at some of the
> involved code, and with the cover letter not clarifying this,
> what's the reason for going this seemingly more complicated
> route, rather than putting vPCI behind the hvm_io_intercept()
> machinery, just as is the case for other internal handling?

If vPCI is handled at the hvm_io_intercept level (as it's done ATM)
then it's not possible to have both (external) ioreq servers and vPCI
handling accesses to different devices in the PCI config space, since
vPCI would trap all accesses to the PCI IO ports and the MCFG regions
and those would never reach the ioreq processing.
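
With the internal server model the selection is instead done per-SBDF,
reusing the XEN_DMOP_IO_RANGE_PCI rangeset check that external servers
already rely on (see the hvm_select_ioreq_server chunk earlier in the
series):

    case XEN_DMOP_IO_RANGE_PCI:
        if ( rangeset_contains_singleton(r, p->addr >> 32) )
            return id;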

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-26 11:14     ` Roger Pau Monné
@ 2019-09-26 13:17       ` Jan Beulich
  2019-09-26 13:46         ` Roger Pau Monné
  2019-09-26 16:36       ` Roger Pau Monné
  1 sibling, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2019-09-26 13:17 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On 26.09.2019 13:14, Roger Pau Monné  wrote:
> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
>> Having said this, as a result of having looked at some of the
>> involved code, and with the cover letter not clarifying this,
>> what's the reason for going this seemingly more complicated
>> route, rather than putting vPCI behind the hvm_io_intercept()
>> machinery, just as is the case for other internal handling?
> 
> If vPCI is handled at the hvm_io_intercept level (as it's done ATM)
> then it's not possible to have both (external) ioreq servers and vPCI
> handling accesses to different devices in the PCI config space, since
> vPCI would trap all accesses to the PCI IO ports and the MCFG regions
> and those would never reach the ioreq processing.

Why would vPCI (want to) do that? The accept() handler should
sub-class the CF8-CFF port range; there would likely want to
be another struct hvm_io_ops instance dealing with config
space accesses (and perhaps with ones through port I/O and
through MCFG at the same time). In the end this would likely be
more similar to how chipsets handle this on real hardware
than your "internal server" solution (albeit I agree to a
degree it's an implementation detail anyway).
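
I.e. roughly (a rough sketch only, with invented names, and the hook
signature quoted from memory):

    static bool pciconf_accept(const struct hvm_io_handler *handler,
                               const ioreq_t *p)
    {
        /* Claim port I/O to the 0xcf8-0xcff config space window. */
        return p->type == IOREQ_TYPE_PIO &&
               p->addr >= 0xcf8 && p->addr + p->size <= 0xd00;
    }

plus matching read/write hooks in a new hvm_io_ops instance (which
could at the same time also cover the MCFG ranges).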

Jan


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-26 13:17       ` Jan Beulich
@ 2019-09-26 13:46         ` Roger Pau Monné
  2019-09-26 15:13           ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 13:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On Thu, Sep 26, 2019 at 03:17:15PM +0200, Jan Beulich wrote:
> On 26.09.2019 13:14, Roger Pau Monné  wrote:
> > On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
> >> Having said this, as a result of having looked at some of the
> >> involved code, and with the cover letter not clarifying this,
> >> what's the reason for going this seemingly more complicated
> >> route, rather than putting vPCI behind the hvm_io_intercept()
> >> machinery, just as is the case for other internal handling?
> > 
> > If vPCI is handled at the hvm_io_intercept level (as it's done ATM)
> > then it's not possible to have both (external) ioreq servers and vPCI
> > handling accesses to different devices in the PCI config space, since
> > vPCI would trap all accesses to the PCI IO ports and the MCFG regions
> > and those would never reach the ioreq processing.
> 
> Why would vPCI (want to) do that? The accept() handler should
> sub-class the CF8-CFF port range; there would likely want to
> be another struct hvm_io_ops instance dealing with config
> space accesses (and perhaps with ones through port I/O and
> through MCFG at the same time).

Do you mean to expand hvm_io_handler to add something like a pciconf
sub-structure to the existing union of portio and mmio?
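
Something like the following (rough sketch, with the vPCI hook
signatures invented and the rest of the struct quoted from memory)?

    struct hvm_io_handler {
        union {
            struct { /* ... */ } mmio;
            struct { /* ... */ } portio;
            struct {
                /* Hypothetical vPCI handlers. */
                uint32_t (*read)(pci_sbdf_t sbdf, unsigned int reg,
                                 unsigned int size);
                void (*write)(pci_sbdf_t sbdf, unsigned int reg,
                              unsigned int size, uint32_t data);
            } pciconf;
        };
        const struct hvm_io_ops *ops;
        uint8_t type;
    };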

That's indeed feasible, but I'm not sure why it's better than the
approach proposed in this series. Long term I think we would like all
intercept handlers to use the ioreq infrastructure and remove the
usage of hvm_io_intercept.

> In the end this would likely be
> more similar to how chipsets handle this on real hardware
> than your "internal server" solution (albeit I agree to a
> degree it's an implementation detail anyway).

I think the end goal should be to unify the internal and external
intercepts into a single point, and the only feasible way to do this
is to switch the internal intercepts to use the ioreq infrastructure.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server
  2019-09-10 13:49   ` Paul Durrant
@ 2019-09-26 15:07     ` Roger Pau Monné
  2019-09-27  8:29       ` Paul Durrant
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 15:07 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

On Tue, Sep 10, 2019 at 03:49:41PM +0200, Paul Durrant wrote:
> > -----Original Message-----
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: 03 September 2019 17:14
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne <roger.pau@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu
> > <wl@xen.org>; Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Jan
> > Beulich <jbeulich@suse.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk
> > <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>;
> > Paul Durrant <Paul.Durrant@citrix.com>
> > Subject: [PATCH v2 09/11] vpci: register as an internal ioreq server
> > @@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> >      spin_unlock(&pdev->vpci->lock);
> >  }
> > 
> > +#ifdef __XEN__
> > +static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
> > +{
> > +    pci_sbdf_t sbdf;
> > +
> > +    if ( req->type == IOREQ_TYPE_INVALIDATE )
> > +        /*
> > +         * Ignore invalidate requests, those can be received even without
> > +         * having any memory ranges registered, see send_invalidate_req.
> > +         */
> > +        return X86EMUL_OKAY;
> 
> In general, I wonder whether internal servers will ever need to deal with invalidate? The code only exists to get QEMU to drop its map cache after a decrease_reservation so that the page refs get dropped.

I think the best solution here is to rename hvm_broadcast_ioreq to
hvm_broadcast_ioreq_external and switch its callers. Both
send_timeoffset_req and send_invalidate_req seem only relevant to
external ioreq servers.
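
I.e. (untested sketch; essentially just switching the iteration
macro):

    unsigned int hvm_broadcast_ioreq_external(ioreq_t *p, bool buffered)
    {
        struct domain *d = current->domain;
        struct hvm_ioreq_server *s;
        unsigned int id, failed = 0;

        /* Same body as the current helper, minus internal servers. */
        FOR_EACH_EXTERNAL_IOREQ_SERVER(d, id, s)
            if ( hvm_send_ioreq(id, p, buffered) == X86EMUL_UNHANDLEABLE )
                failed++;

        return failed;
    }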

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-26 13:46         ` Roger Pau Monné
@ 2019-09-26 15:13           ` Jan Beulich
  2019-09-26 15:59             ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2019-09-26 15:13 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On 26.09.2019 15:46, Roger Pau Monné  wrote:
> On Thu, Sep 26, 2019 at 03:17:15PM +0200, Jan Beulich wrote:
>> On 26.09.2019 13:14, Roger Pau Monné  wrote:
>>> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
>>>> Having said this, as a result of having looked at some of the
>>>> involved code, and with the cover letter not clarifying this,
>>>> what's the reason for going this seemingly more complicated
>>>> route, rather than putting vPCI behind the hvm_io_intercept()
>>>> machinery, just as is the case for other internal handling?
>>>
>>> If vPCI is handled at the hvm_io_intercept level (as it's done ATM)
>>> then it's not possible to have both (external) ioreq servers and vPCI
>>> handling accesses to different devices in the PCI config space, since
>>> vPCI would trap all accesses to the PCI IO ports and the MCFG regions
>>> and those would never reach the ioreq processing.
>>
>> Why would vPCI (want to) do that? The accept() handler should
>> sub-class the CF8-CFF port range; there would likely want to
>> be another struct hvm_io_ops instance dealing with config
>> space accesses (and perhaps with ones through port I/O and
>> through MCFG at the same time).
> 
> Do you mean to expand hvm_io_handler to add something like a pciconf
> sub-structure to the existing union of portio and mmio?

Yes, something along these lines.

> > That's indeed feasible, but I'm not sure why it's better than the
> > approach proposed in this series. Long term I think we would like all
> intercept handlers to use the ioreq infrastructure and remove the
> usage of hvm_io_intercept.
> 
>> In the end this would likely be
>> more similar to how chipsets handle this on real hardware
>> than your "internal server" solution (albeit I agree to a
>> degree it's an implementation detail anyway).
> 
> I think the end goal should be to unify the internal and external
> intercepts into a single point, and the only feasible way to do this
> is to switch the internal intercepts to use the ioreq infrastructure.

Well, I recall this having been mentioned as an option; I don't
recall this being a firm plan. There are certainly benefits to
such a model, but there's also potentially more overhead (at the
very least the ioreq_t will then need setting up / maintaining
everywhere, when right now the interfaces are using more
immediate parameters).

But yes, if this _is_ the plan, then going that route right away
for vPCI is desirable.

Jan


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-26 15:13           ` Jan Beulich
@ 2019-09-26 15:59             ` Roger Pau Monné
  2019-09-27  8:17               ` Paul Durrant
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 15:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On Thu, Sep 26, 2019 at 05:13:23PM +0200, Jan Beulich wrote:
> On 26.09.2019 15:46, Roger Pau Monné  wrote:
> > On Thu, Sep 26, 2019 at 03:17:15PM +0200, Jan Beulich wrote:
> >> On 26.09.2019 13:14, Roger Pau Monné  wrote:
> >>> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
> >>>> Having said this, as a result of having looked at some of the
> >>>> involved code, and with the cover letter not clarifying this,
> >>>> what's the reason for going this seemingly more complicated
> >>>> route, rather than putting vPCI behind the hvm_io_intercept()
> >>>> machinery, just as is the case for other internal handling?
> >>>
> >>> If vPCI is handled at the hvm_io_intercept level (as it's done ATM)
> >>> then it's not possible to have both (external) ioreq servers and vPCI
> >>> handling accesses to different devices in the PCI config space, since
> >>> vPCI would trap all accesses to the PCI IO ports and the MCFG regions
> >>> and those would never reach the ioreq processing.
> >>
> >> Why would vPCI (want to) do that? The accept() handler should
> >> sub-class the CF8-CFF port range; there would likely need to
> >> be another struct hvm_io_ops instance dealing with config
> >> space accesses (and perhaps with ones through port I/O and
> >> through MCFG at the same time).
> > 
> > Do you mean to expand hvm_io_handler to add something like a pciconf
> > sub-structure to the existing union of portio and mmio?
> 
> Yes, something along these lines.
> 
> > That's indeed feasible, but I'm not sure why it's better than the
> > approach proposed in this series. Long term I think we would like all
> > intercept handlers to use the ioreq infrastructure and remove the
> > usage of hvm_io_intercept.
> > 
> >> In the end this would likely be
> >> more similar to how chipsets handle this on real hardware
> >> than your "internal server" solution (albeit I agree to a
> >> degree it's an implementation detail anyway).
> > 
> > I think the end goal should be to unify the internal and external
> > intercepts into a single point, and the only feasible way to do this
> > is to switch the internal intercepts to use the ioreq infrastructure.
> 
> Well, I recall this having been mentioned as an option; I don't
> recall this being a firm plan. There are certainly benefits to
> such a model, but there's also potentially more overhead (at the
> very least the ioreq_t will then need setting up / maintaining
> everywhere, when right now the interfaces are using more
> immediate parameters).

AFAICT from the code in hvmemul_do_io, which dispatches to both
hvm_io_intercept and ioreq servers, the ioreq is already there, so I'm
not sure why more setup would be required in order to handle internal
intercepts as ioreq servers. For vPCI at least I've been able to get
away without having to modify hvmemul_do_io IIRC.
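
To illustrate (a much simplified sketch of that flow, not the literal
code; the ioservid_t-based selection is what this series switches to):

/* hvmemul_do_io(), heavily abridged: the ioreq_t exists up front. */
ioreq_t p = {
    .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
    .addr = addr,
    .size = size,
    .dir = dir,
    .data = data,
    .state = STATE_IOREQ_READY,
};

rc = hvm_io_intercept(&p);          /* internal handlers today */
if ( rc == X86EMUL_UNHANDLEABLE )
{
    ioservid_t id = hvm_select_ioreq_server(currd, &p);

    rc = hvm_send_ioreq(id, &p, 0); /* dispatch to the selected server */
}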

> But yes, if this _is_ the plan, then going that route right away
> for vPCI is desirable.

I think it would be desirable to have a single point where intercepts
are handled instead of having such different implementations for
internal vs external, and the only way I can see to achieve this is
by moving intercepts to the ioreq model.

I'm certainly not planning to move all intercepts right now, but I
think it's a good first step to have the code in place to allow this,
and at least vPCI using it.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses
  2019-09-10 14:06   ` Paul Durrant
@ 2019-09-26 16:05     ` Roger Pau Monné
  0 siblings, 0 replies; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 16:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Sep 10, 2019 at 04:06:20PM +0200, Paul Durrant wrote:
> > -----Original Message-----
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: 03 September 2019 17:14
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne <roger.pau@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> > <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu <wl@xen.org>
> > Subject: [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses
> > 
> > Place the code that converts a PIO/COPY ioreq into a PCI_CONFIG one
> > into a separate function, and adjust the code to make use of this
> > newly introduced function.
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Changes since v1:
> >  - New in this version.
> > ---
> >  xen/arch/x86/hvm/ioreq.c | 111 +++++++++++++++++++++++----------------
> >  1 file changed, 67 insertions(+), 44 deletions(-)
> > 
> > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > index fecdc2786f..33c56b880c 100644
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -183,6 +183,54 @@ static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
> >      return true;
> >  }
> > 
> > +static void convert_pci_ioreq(struct domain *d, ioreq_t *p)
> > +{
> > +    const struct hvm_mmcfg *mmcfg;
> > +    uint32_t cf8 = d->arch.hvm.pci_cf8;
> > +
> > +    if ( p->type != IOREQ_TYPE_PIO && p->type != IOREQ_TYPE_COPY )
> > +    {
> > +        ASSERT_UNREACHABLE();
> > +        return;
> > +    }
> > +
> > +    read_lock(&d->arch.hvm.mmcfg_lock);
> 
> Actually, looking at this... can you not restrict holding the mmcfg_lock...
> 
> > +    if ( (p->type == IOREQ_TYPE_PIO &&
> > +          (p->addr & ~3) == 0xcfc &&
> > +          CF8_ENABLED(cf8)) ||
> > +         (p->type == IOREQ_TYPE_COPY &&
> > +          (mmcfg = hvm_mmcfg_find(d, p->addr)) != NULL) )
> > +    {
> > +        uint32_t x86_fam;
> > +        pci_sbdf_t sbdf;
> > +        unsigned int reg;
> > +
> > +        reg = p->type == IOREQ_TYPE_PIO ? hvm_pci_decode_addr(cf8, p->addr,
> > +                                                              &sbdf)
> > +                                        : hvm_mmcfg_decode_addr(mmcfg, p->addr,
> > +                                                                &sbdf);
> 
> ... to within hvm_mmcfg_decode_addr()?

Hm, not really. There's a call to hvm_mmcfg_find in the if condition
which needs the lock to be held, and then breaking this into two
different lock sections (take lock, get mmcfg, unlock, take lock,
decode, unlock) could lead to the mmcfg region being freed under our
feet.
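
That is, the problematic split would be (sketch):

read_lock(&d->arch.hvm.mmcfg_lock);
mmcfg = hvm_mmcfg_find(d, p->addr);
read_unlock(&d->arch.hvm.mmcfg_lock);

/* Window: a concurrent unregister can free *mmcfg here. */

read_lock(&d->arch.hvm.mmcfg_lock);
reg = hvm_mmcfg_decode_addr(mmcfg, p->addr, &sbdf); /* use-after-free */
read_unlock(&d->arch.hvm.mmcfg_lock);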

I think the locking needs to stay as-is unless we switch to a
different locking model for the mmcfg regions. Note it's a read lock,
so it shouldn't have any contention since modifying the mmcfg region
list is very rare.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-26 11:14     ` Roger Pau Monné
  2019-09-26 13:17       ` Jan Beulich
@ 2019-09-26 16:36       ` Roger Pau Monné
  1 sibling, 0 replies; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-26 16:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Thu, Sep 26, 2019 at 01:14:04PM +0200, Roger Pau Monné wrote:
> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
> > On 03.09.2019 18:14, Roger Pau Monne wrote:
> > A possible consequence of the answers to this might be for
> > the hook's middle parameter to be constified (in patch 4).
> 
> Yes, I think it can be constified.

No, it can't be constified because the handler needs to fill
ioreq->data for read requests.
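
That is (minimal sketch using the handler prototype from this series;
a read has to return its result through the request itself):

static int example_handler(struct vcpu *v, ioreq_t *req, void *data)
{
    if ( req->dir == IOREQ_READ )
        req->data = ~0ul;  /* the handler writes the result here */

    return X86EMUL_OKAY;
}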

Roger.


* Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
  2019-09-26 15:59             ` Roger Pau Monné
@ 2019-09-27  8:17               ` Paul Durrant
  0 siblings, 0 replies; 56+ messages in thread
From: Paul Durrant @ 2019-09-27  8:17 UTC (permalink / raw)
  To: Roger Pau Monne, Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 26 September 2019 16:59
> To: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; xen-
> devel@lists.xenproject.org; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers
> 
> On Thu, Sep 26, 2019 at 05:13:23PM +0200, Jan Beulich wrote:
> > On 26.09.2019 15:46, Roger Pau Monné  wrote:
> > > On Thu, Sep 26, 2019 at 03:17:15PM +0200, Jan Beulich wrote:
> > >> On 26.09.2019 13:14, Roger Pau Monné  wrote:
> > >>> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
> > >>>> Having said this, as a result of having looked at some of the
> > >>>> involved code, and with the cover letter not clarifying this,
> > >>>> what's the reason for going this seemingly more complicated
> > >>>> route, rather than putting vPCI behind the hvm_io_intercept()
> > >>>> machinery, just like is the case for other internal handling?
> > >>>
> > >>> If vPCI is handled at the hvm_io_intercept level (like it's done ATM)
> > >>> then it's not possible to have both (external) ioreq servers and vPCI
> > >>> handling accesses to different devices in the PCI config space, since
> > >>> vPCI would trap all accesses to the PCI IO ports and the MCFG regions
> > >>> and those would never reach the ioreq processing.
> > >>
> > >> Why would vPCI (want to) do that? The accept() handler should
> > >> sub-class the CF8-CFF port range; there would likely need to
> > >> be another struct hvm_io_ops instance dealing with config
> > >> space accesses (and perhaps with ones through port I/O and
> > >> through MCFG at the same time).
> > >
> > > Do you mean to expand hvm_io_handler to add something like a pciconf
> > > sub-structure to the existing union of portio and mmio?
> >
> > Yes, something along these lines.
> >
> > > That's indeed feasible, but I'm not sure why it's better than the
> > > approach proposed in this series. Long term I think we would like all
> > > intercept handlers to use the ioreq infrastructure and remove the
> > > usage of hvm_io_intercept.
> > >
> > >> In the end this would likely be
> > >> more similar to how chipsets handle this on real hardware
> > >> than your "internal server" solution (albeit I agree to a
> > >> degree it's an implementation detail anyway).
> > >
> > > I think the end goal should be to unify the internal and external
> > > intercepts into a single point, and the only feasible way to do this
> > > is to switch the internal intercepts to use the ioreq infrastructure.
> >
> > Well, I recall this having been mentioned as an option; I don't
> > recall this being a firm plan. There are certainly benefits to
> > such a model, but there's also potentially more overhead (at the
> > very least the ioreq_t will then need setting up / maintaining
> > everywhere, when right now the interfaces are using more
> > immediate parameters).
> 
> AFAICT from the code in hvmemul_do_io, which dispatches to both
> hvm_io_intercept and ioreq servers, the ioreq is already there, so I'm
> not sure why more setup would be required in order to handle internal
> intercepts as ioreq servers. For vPCI at least I've been able to get
> away without having to modify hvmemul_do_io IIRC.
> 
> > But yes, if this _is_ the plan, then going that route right away
> > for vPCI is desirable.
> 
> I think it would be desirable to have a single point where intercepts
> are handled instead of having such different implementations for
> internal vs external, and the only way I can see to achieve this is
> by moving intercepts to the ioreq model.
> 

+1 for the plan from me... doing this has been on my own to-do list for a while.

The lack of range-based registration for internal emulators is at least one thing that will be addressed by going this route, and I'd also expect some degree of simplification to the code by unifying things, once all the emulation is ported over.

> I'm certainly not planning to move all intercepts right now, but I
> think it's a good first step to have the code in place to allow this,
> and at least vPCI using it.
> 

I think it's fine to do things piecemeal, but all the internal emulators do need to be ported over a.s.a.p. I can certainly try to help with this once the groundwork is done.

  Paul



* Re: [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server
  2019-09-26 15:07     ` Roger Pau Monné
@ 2019-09-27  8:29       ` Paul Durrant
  2019-09-27  8:45         ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-27  8:29 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 26 September 2019 16:07
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu <wl@xen.org>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v2 09/11] vpci: register as an internal ioreq server
> 
> On Tue, Sep 10, 2019 at 03:49:41PM +0200, Paul Durrant wrote:
> > > Subject: [PATCH v2 09/11] vpci: register as an internal ioreq server
> > > @@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> > >      spin_unlock(&pdev->vpci->lock);
> > >  }
> > >
> > > +#ifdef __XEN__
> > > +static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
> > > +{
> > > +    pci_sbdf_t sbdf;
> > > +
> > > +    if ( req->type == IOREQ_TYPE_INVALIDATE )
> > > +        /*
> > > +         * Ignore invalidate requests, those can be received even without
> > > +         * having any memory ranges registered, see send_invalidate_req.
> > > +         */
> > > +        return X86EMUL_OKAY;
> >
> > In general, I wonder whether internal servers will ever need to deal with invalidate? The code only
> > exists to get QEMU to drop its map cache after a decrease_reservation so that the page refs get
> > dropped.
> 
> I think the best solution here is to rename hvm_broadcast_ioreq to
> hvm_broadcast_ioreq_external and switch its callers. Both
> send_timeoffset_req and send_invalidate_req seem only relevant to
> external ioreq servers.

send_timeoffset_req() is a relic which ought to be replaced with another mechanism IMO...

When an HVM guest writes its RTC, a new 'timeoffset' value (offset of RTC from host time) is calculated (also applied to the PV wallclock) and advertised via this ioreq. In XenServer, this is picked up by QEMU, forwarded via QMP to XAPI and then written into the VM meta-data (which then causes it to be written into xenstore too). All this is so that the guest's RTC can be set correctly when it is rebooted... There has to be a better way (e.g. extracting RTC via hvm context and saving it before cleaning up the domain).

send_invalidate_req() is relevant for any emulator maintaining a cache of guest->host memory mappings which, I guess, could include internal emulators even if this is not the case at the moment.

  Paul

> 
> Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server
  2019-09-27  8:29       ` Paul Durrant
@ 2019-09-27  8:45         ` Roger Pau Monné
  2019-09-27  9:01           ` Paul Durrant
  0 siblings, 1 reply; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-27  8:45 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

On Fri, Sep 27, 2019 at 10:29:21AM +0200, Paul Durrant wrote:
> > On Tue, Sep 10, 2019 at 03:49:41PM +0200, Paul Durrant wrote:
> > > > Subject: [PATCH v2 09/11] vpci: register as an internal ioreq server
> > > > @@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> > > >      spin_unlock(&pdev->vpci->lock);
> > > >  }
> > > >
> > > > +#ifdef __XEN__
> > > > +static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
> > > > +{
> > > > +    pci_sbdf_t sbdf;
> > > > +
> > > > +    if ( req->type == IOREQ_TYPE_INVALIDATE )
> > > > +        /*
> > > > +         * Ignore invalidate requests, those can be received even without
> > > > +         * having any memory ranges registered, see send_invalidate_req.
> > > > +         */
> > > > +        return X86EMUL_OKAY;
> > >
> > > In general, I wonder whether internal servers will ever need to deal with invalidate? The code only
> > > exists to get QEMU to drop its map cache after a decrease_reservation so that the page refs get
> > > dropped.
> > 
> > I think the best solution here is to rename hvm_broadcast_ioreq to
> > hvm_broadcast_ioreq_external and switch its callers. Both
> > send_timeoffset_req and send_invalidate_req seem only relevant to
> > external ioreq servers.
> 
> send_timeoffset_req() is a relic which ought to be replaced with another mechanism IMO...
> 
> When an HVM guest writes its RTC, a new 'timeoffset' value (offset of RTC from host time) is calculated (also applied to the PV wallclock) and advertised via this ioreq. In XenServer, this is picked up by QEMU, forwarded via QMP to XAPI and then written into the VM meta-data (which then causes it to be written into xenstore too). All this is so that the guest's RTC can be set correctly when it is rebooted... There has to be a better way (e.g. extracting RTC via hvm context and saving it before cleaning up the domain).
> 
> send_invalidate_req() is relevant for any emulator maintaining a cache of guest->host memory mappings which, I guess, could include internal emulators even if this is not the case at the moment.

Maybe, but I would expect an internal emulator to get a reference on
the gfn if it does need to keep it in some kind of cache, or else I
don't think code in the hypervisor should be keeping such references.
IMO I would start by not forwarding invalidate requests to internal
emulators. We can always change this in the future if we come up with
a use case that needs it.
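
Something along these lines is what I have in mind (sketch only; the
is-internal check is the one introduced earlier in this series, though
the exact helper name may differ):

static unsigned int hvm_broadcast_ioreq_external(ioreq_t *p, bool buffered)
{
    struct domain *d = current->domain;
    struct hvm_ioreq_server *s;
    unsigned int id, failed = 0;

    FOR_EACH_IOREQ_SERVER(d, id, s)
    {
        if ( hvm_ioreq_is_internal(id) )
            continue;               /* skip internal servers */

        if ( hvm_send_ioreq(id, p, buffered) == X86EMUL_UNHANDLEABLE )
            failed++;
    }

    return failed;
}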

Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server
  2019-09-27  8:45         ` Roger Pau Monné
@ 2019-09-27  9:01           ` Paul Durrant
  2019-09-27 10:46             ` Roger Pau Monné
  0 siblings, 1 reply; 56+ messages in thread
From: Paul Durrant @ 2019-09-27  9:01 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 27 September 2019 09:46
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu <wl@xen.org>; Andrew
> Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v2 09/11] vpci: register as an internal ioreq server
> 
> On Fri, Sep 27, 2019 at 10:29:21AM +0200, Paul Durrant wrote:
> > > On Tue, Sep 10, 2019 at 03:49:41PM +0200, Paul Durrant wrote:
> > > > > Subject: [PATCH v2 09/11] vpci: register as an internal ioreq server
> > > > > @@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> > > > >      spin_unlock(&pdev->vpci->lock);
> > > > >  }
> > > > >
> > > > > +#ifdef __XEN__
> > > > > +static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
> > > > > +{
> > > > > +    pci_sbdf_t sbdf;
> > > > > +
> > > > > +    if ( req->type == IOREQ_TYPE_INVALIDATE )
> > > > > +        /*
> > > > > +         * Ignore invalidate requests, those can be received even without
> > > > > +         * having any memory ranges registered, see send_invalidate_req.
> > > > > +         */
> > > > > +        return X86EMUL_OKAY;
> > > >
> > > > In general, I wonder whether internal servers will ever need to deal with invalidate? The code only
> > > > exists to get QEMU to drop its map cache after a decrease_reservation so that the page refs get
> > > > dropped.
> > >
> > > I think the best solution here is to rename hvm_broadcast_ioreq to
> > > hvm_broadcast_ioreq_external and switch its callers. Both
> > > send_timeoffset_req and send_invalidate_req seem only relevant to
> > > external ioreq servers.
> >
> > send_timeoffset_req() is a relic which ought to be replaced with another mechanism IMO...
> >
> > When an HVM guest writes its RTC, a new 'timeoffset' value (offset of RTC from host time) is
> > calculated (also applied to the PV wallclock) and advertised via this ioreq. In XenServer, this is
> > picked up by QEMU, forwarded via QMP to XAPI and then written into the VM meta-data (which then causes
> > it to be written into xenstore too). All this is so that the guest's RTC can be set correctly when it
> > is rebooted... There has to be a better way (e.g. extracting RTC via hvm context and saving it before
> > cleaning up the domain).
> >
> > send_invalidate_req() is relevant for any emulator maintaining a cache of guest->host memory
> > mappings which, I guess, could include internal emulators even if this is not the case at the moment.
> 
> Maybe, but I would expect an internal emulator to get a reference on
> the gfn if it does need to keep it in some kind of cache, or else I
> don't think code in the hypervisor should be keeping such references.

Oh indeed, but that's not the issue. The issue is when to drop those refs... If the guest does a decrease_reservation on a gfn cached by the emulator, then the emulator needs to drop its ref to allow the page to be freed.
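
i.e. the lifecycle under discussion is roughly (sketch; the cache
around it is hypothetical, the two primitives are the existing ones):

/* While the emulator caches a mapping it holds a page reference: */
struct page_info *pg = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);

/* ... the guest later does a decrease_reservation on gfn ... */

/* On IOREQ_TYPE_INVALIDATE the emulator drops its cached refs so
 * the page can actually be freed: */
put_page(pg);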

  Paul

> IMO I would start by not forwarding invalidate requests to internal
> emulators. We can always change this in the future if we come up with
> a use case that needs it.
> 
> Thanks, Roger.


* Re: [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server
  2019-09-27  9:01           ` Paul Durrant
@ 2019-09-27 10:46             ` Roger Pau Monné
  0 siblings, 0 replies; 56+ messages in thread
From: Roger Pau Monné @ 2019-09-27 10:46 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

On Fri, Sep 27, 2019 at 11:01:39AM +0200, Paul Durrant wrote:
> > On Fri, Sep 27, 2019 at 10:29:21AM +0200, Paul Durrant wrote:
> > > > On Tue, Sep 10, 2019 at 03:49:41PM +0200, Paul Durrant wrote:
> > > > > > Subject: [PATCH v2 09/11] vpci: register as an internal ioreq server
> > > > > > @@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
> > > > > >      spin_unlock(&pdev->vpci->lock);
> > > > > >  }
> > > > > >
> > > > > > +#ifdef __XEN__
> > > > > > +static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
> > > > > > +{
> > > > > > +    pci_sbdf_t sbdf;
> > > > > > +
> > > > > > +    if ( req->type == IOREQ_TYPE_INVALIDATE )
> > > > > > +        /*
> > > > > > +         * Ignore invalidate requests, those can be received even without
> > > > > > +         * having any memory ranges registered, see send_invalidate_req.
> > > > > > +         */
> > > > > > +        return X86EMUL_OKAY;
> > > > >
> > > > > In general, I wonder whether internal servers will ever need to deal with invalidate? The code only
> > > > > exists to get QEMU to drop its map cache after a decrease_reservation so that the page refs get
> > > > > dropped.
> > > >
> > > > I think the best solution here is to rename hvm_broadcast_ioreq to
> > > > hvm_broadcast_ioreq_external and switch its callers. Both
> > > > send_timeoffset_req and send_invalidate_req seem only relevant to
> > > > external ioreq servers.
> > >
> > > send_timeoffset_req() is a relic which ought to be replaced with another mechanism IMO...
> > >
> > > When an HVM guest writes its RTC, a new 'timeoffset' value (offset of RTC from host time) is
> > > calculated (also applied to the PV wallclock) and advertised via this ioreq. In XenServer, this is
> > > picked up by QEMU, forwarded via QMP to XAPI and then written into the VM meta-data (which then causes
> > > it to be written into xenstore too). All this is so that the guest's RTC can be set correctly when it
> > > is rebooted... There has to be a better way (e.g. extracting RTC via hvm context and saving it before
> > > cleaning up the domain).
> > >
> > > send_invalidate_req() is relevant for any emulator maintaining a cache of guest->host memory
> > > mappings which, I guess, could include internal emulators even if this is not the case at the moment.
> > 
> > Maybe, but I would expect an internal emulator to get a reference on
> > the gfn if it does need to keep it in some kind of cache, or else I
> > don't think code in the hypervisor should be keeping such references.
> 
> Oh indeed, but that's not the issue. The issue is when to drop those refs... If the guest does a decrease_reservation on a gfn cached by the emulator, then the emulator needs to drop its ref to allow the page to be freed.

Then I think this could also be used by internal servers.

Thanks, Roger.


end of thread [~2019-09-27 10:47 UTC | newest]

Thread overview: 56+ messages
2019-09-03 16:14 [Xen-devel] [PATCH v2 00/11] ioreq: add support for internal servers Roger Pau Monne
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 01/11] ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup Roger Pau Monne
2019-09-10 10:44   ` Paul Durrant
2019-09-10 13:28   ` Jan Beulich
2019-09-10 13:33     ` Roger Pau Monné
2019-09-10 13:35       ` Jan Beulich
2019-09-10 13:42         ` Roger Pau Monné
2019-09-10 13:53           ` Paul Durrant
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 02/11] ioreq: terminate cf8 handling at hypervisor level Roger Pau Monne
2019-09-03 17:13   ` Andrew Cooper
2019-09-04  7:49     ` Roger Pau Monné
2019-09-04  8:00       ` Roger Pau Monné
2019-09-04  8:04       ` Jan Beulich
2019-09-04  9:46       ` Paul Durrant
2019-09-04 13:39         ` Roger Pau Monné
2019-09-04 13:56           ` Paul Durrant
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 03/11] ioreq: switch selection and forwarding to use ioservid_t Roger Pau Monne
2019-09-10 12:31   ` Paul Durrant
2019-09-20 10:47     ` Jan Beulich
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 04/11] ioreq: add fields to allow internal ioreq servers Roger Pau Monne
2019-09-10 12:34   ` Paul Durrant
2019-09-20 10:53   ` Jan Beulich
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 05/11] ioreq: add internal ioreq initialization support Roger Pau Monne
2019-09-10 12:59   ` Paul Durrant
2019-09-26 10:49     ` Roger Pau Monné
2019-09-26 10:58       ` Paul Durrant
2019-09-20 11:15   ` Jan Beulich
2019-09-26 10:51     ` Roger Pau Monné
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers Roger Pau Monne
2019-09-10 13:06   ` Paul Durrant
2019-09-20 11:35   ` Jan Beulich
2019-09-26 11:14     ` Roger Pau Monné
2019-09-26 13:17       ` Jan Beulich
2019-09-26 13:46         ` Roger Pau Monné
2019-09-26 15:13           ` Jan Beulich
2019-09-26 15:59             ` Roger Pau Monné
2019-09-27  8:17               ` Paul Durrant
2019-09-26 16:36       ` Roger Pau Monné
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 07/11] ioreq: allow registering internal ioreq server handler Roger Pau Monne
2019-09-10 13:12   ` Paul Durrant
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 08/11] ioreq: allow decoding accesses to MMCFG regions Roger Pau Monne
2019-09-10 13:37   ` Paul Durrant
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server Roger Pau Monne
2019-09-10 13:49   ` Paul Durrant
2019-09-26 15:07     ` Roger Pau Monné
2019-09-27  8:29       ` Paul Durrant
2019-09-27  8:45         ` Roger Pau Monné
2019-09-27  9:01           ` Paul Durrant
2019-09-27 10:46             ` Roger Pau Monné
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses Roger Pau Monne
2019-09-10 14:06   ` Paul Durrant
2019-09-26 16:05     ` Roger Pau Monné
2019-09-03 16:14 ` [Xen-devel] [PATCH v2 11/11] ioreq: provide support for long-running operations Roger Pau Monne
2019-09-10 14:14   ` Paul Durrant
2019-09-10 14:28     ` Roger Pau Monné
2019-09-10 14:40       ` Paul Durrant
