xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] iommu page-table memory pool
@ 2020-10-05  9:49 Paul Durrant
  2020-10-05  9:49 ` [PATCH 1/5] libxl: remove separate calculation of IOMMU memory overhead Paul Durrant
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Paul Durrant @ 2020-10-05  9:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Andrew Cooper, Anthony PERARD, Daniel De Graaf,
	George Dunlap, Ian Jackson, Jan Beulich, Julien Grall,
	Roger Pau Monné,
	Stefano Stabellini, Volodymyr Babchuk, Wei Liu

From: Paul Durrant <pdurrant@amazon.com>

This series introduces a pool of memory analogous to the shadow/HAP pool,
accounted to the guest domain, from which IOMMU page-tables are allocated.

Paul Durrant (5):
  libxl: remove separate calculation of IOMMU memory overhead
  iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
  libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION...
  iommu: set 'hap_pt_share' and 'need_sync' flags earlier in
    iommu_domain_init()
  x86 / iommu: create a dedicated pool of page-table pages

 tools/flask/policy/modules/dom0.te    |   2 +
 tools/libs/ctrl/include/xenctrl.h     |   5 +
 tools/libs/ctrl/xc_domain.c           |  16 ++++
 tools/libs/light/libxl_create.c       |  22 +----
 tools/libs/light/libxl_x86.c          |  10 ++
 xen/arch/x86/domain.c                 |   4 +-
 xen/drivers/passthrough/iommu.c       |  63 +++++++++---
 xen/drivers/passthrough/x86/iommu.c   | 132 ++++++++++++++++++++++----
 xen/include/asm-arm/iommu.h           |   6 ++
 xen/include/asm-x86/iommu.h           |   7 +-
 xen/include/public/domctl.h           |  22 +++++
 xen/include/xsm/dummy.h               |  17 +++-
 xen/include/xsm/xsm.h                 |  26 +++--
 xen/xsm/dummy.c                       |   6 +-
 xen/xsm/flask/hooks.c                 |  26 +++--
 xen/xsm/flask/policy/access_vectors   |   7 ++
 xen/xsm/flask/policy/security_classes |   1 +
 17 files changed, 300 insertions(+), 72 deletions(-)
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Anthony PERARD <anthony.perard@citrix.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: Wei Liu <wl@xen.org>
-- 
2.20.1



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/5] libxl: remove separate calculation of IOMMU memory overhead
  2020-10-05  9:49 [PATCH 0/5] iommu page-table memory pool Paul Durrant
@ 2020-10-05  9:49 ` Paul Durrant
  2020-10-08 13:07   ` Wei Liu
  2020-10-05  9:49 ` [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl Paul Durrant
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2020-10-05  9:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Ian Jackson, Wei Liu, Anthony PERARD

From: Paul Durrant <pdurrant@amazon.com>

When using 'shared_pt' mode the IOMMU is using the EPT PTEs. In 'sync_pt'
mode these PTEs are instead replicated for the IOMMU to use. Hence, it is
fairly clear that the memory overhead in this mode is essentially another
copy of the P2M.

This patch removes the independent calculation done in
libxl__get_required_iommu_memory() and instead simply uses 'shadow_memkb'
as the value of the IOMMU overhead since this is the estimated size of
the P2M.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Wei Liu <wl@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>
---
 tools/libs/light/libxl_create.c | 22 +++++-----------------
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 9a6e92b3a5..f07ba84850 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -1001,21 +1001,6 @@ static bool ok_to_default_memkb_in_create(libxl__gc *gc)
      */
 }
 
-static unsigned long libxl__get_required_iommu_memory(unsigned long maxmem_kb)
-{
-    unsigned long iommu_pages = 0, mem_pages = maxmem_kb / 4;
-    unsigned int level;
-
-    /* Assume a 4 level page table with 512 entries per level */
-    for (level = 0; level < 4; level++)
-    {
-        mem_pages = DIV_ROUNDUP(mem_pages, 512);
-        iommu_pages += mem_pages;
-    }
-
-    return iommu_pages * 4;
-}
-
 int libxl__domain_config_setdefault(libxl__gc *gc,
                                     libxl_domain_config *d_config,
                                     uint32_t domid /* for logging, only */)
@@ -1168,12 +1153,15 @@ int libxl__domain_config_setdefault(libxl__gc *gc,
             libxl_get_required_shadow_memory(d_config->b_info.max_memkb,
                                              d_config->b_info.max_vcpus);
 
-    /* No IOMMU reservation is needed if passthrough mode is not 'sync_pt' */
+    /* No IOMMU reservation is needed if passthrough mode is not 'sync_pt'
+     * otherwise we need a reservation sufficient to accommodate a copy of
+     * the P2M.
+     */
     if (d_config->b_info.iommu_memkb == LIBXL_MEMKB_DEFAULT
         && ok_to_default_memkb_in_create(gc))
         d_config->b_info.iommu_memkb =
             (d_config->c_info.passthrough == LIBXL_PASSTHROUGH_SYNC_PT)
-            ? libxl__get_required_iommu_memory(d_config->b_info.max_memkb)
+            ? d_config->b_info.shadow_memkb
             : 0;
 
     ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
  2020-10-05  9:49 [PATCH 0/5] iommu page-table memory pool Paul Durrant
  2020-10-05  9:49 ` [PATCH 1/5] libxl: remove separate calculation of IOMMU memory overhead Paul Durrant
@ 2020-10-05  9:49 ` Paul Durrant
  2020-10-16 15:47   ` Julien Grall
  2020-10-05  9:49 ` [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION Paul Durrant
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2020-10-05  9:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Daniel De Graaf, Ian Jackson, Wei Liu,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini

From: Paul Durrant <pdurrant@amazon.com>

A subsequent patch will introduce code to apply a cap to memory used for
IOMMU page-tables for a domain. This patch simply introduces the boilerplate
for the new domctl.

The implementation of new sub-operations of the new domctl will be added to
iommu_ctl() and hence, whilst this initially only returns -EOPNOTSUPP, the
code in iommu_do_domctl() is modified to set up a hypercall continuation
should iommu_ctl() return -ERESTART.

NOTE: The op code is passed into the newly introduced xsm_iommu_ctl()
      function, but for the moment only a single default 'ctl' perm is
      defined in the new 'iommu' class.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Wei Liu <wl@xen.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
---
 tools/flask/policy/modules/dom0.te    |  2 ++
 xen/drivers/passthrough/iommu.c       | 39 ++++++++++++++++++++++++---
 xen/include/public/domctl.h           | 14 ++++++++++
 xen/include/xsm/dummy.h               | 17 +++++++++---
 xen/include/xsm/xsm.h                 | 26 ++++++++++++------
 xen/xsm/dummy.c                       |  6 +++--
 xen/xsm/flask/hooks.c                 | 26 +++++++++++++-----
 xen/xsm/flask/policy/access_vectors   |  7 +++++
 xen/xsm/flask/policy/security_classes |  1 +
 9 files changed, 114 insertions(+), 24 deletions(-)

diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te
index 0a63ce15b6..ab5eb682c7 100644
--- a/tools/flask/policy/modules/dom0.te
+++ b/tools/flask/policy/modules/dom0.te
@@ -69,6 +69,8 @@ auditallow dom0_t security_t:security { load_policy setenforce setbool };
 # Allow dom0 to report platform configuration changes back to the hypervisor
 allow dom0_t xen_t:resource setup;
 
+allow dom0_t xen_t:iommu ctl;
+
 admin_device(dom0_t, device_t)
 admin_device(dom0_t, irq_t)
 admin_device(dom0_t, ioport_t)
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 87f9a857bb..bef0405984 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -502,6 +502,27 @@ void iommu_resume()
         iommu_get_ops()->resume();
 }
 
+static int iommu_ctl(
+    struct xen_domctl *domctl, struct domain *d,
+    XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
+{
+    struct xen_domctl_iommu_ctl *ctl = &domctl->u.iommu_ctl;
+    int rc;
+
+    rc = xsm_iommu_ctl(XSM_HOOK, d, ctl->op);
+    if ( rc )
+        return rc;
+
+    switch ( ctl->op )
+    {
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    return rc;
+}
+
 int iommu_do_domctl(
     struct xen_domctl *domctl, struct domain *d,
     XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
@@ -511,14 +532,26 @@ int iommu_do_domctl(
     if ( !is_iommu_enabled(d) )
         return -EOPNOTSUPP;
 
+    switch ( domctl->cmd )
+    {
+    case XEN_DOMCTL_iommu_ctl:
+        ret = iommu_ctl(domctl, d, u_domctl);
+        if ( ret == -ERESTART )
+            ret = hypercall_create_continuation(__HYPERVISOR_domctl,
+                                                "h", u_domctl);
+        break;
+
+    default:
 #ifdef CONFIG_HAS_PCI
-    ret = iommu_do_pci_domctl(domctl, d, u_domctl);
+        ret = iommu_do_pci_domctl(domctl, d, u_domctl);
 #endif
 
 #ifdef CONFIG_HAS_DEVICE_TREE
-    if ( ret == -ENODEV )
-        ret = iommu_do_dt_domctl(domctl, d, u_domctl);
+        if ( ret == -ENODEV )
+            ret = iommu_do_dt_domctl(domctl, d, u_domctl);
 #endif
+        break;
+    }
 
     return ret;
 }
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 791f0a2592..75e855625a 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1130,6 +1130,18 @@ struct xen_domctl_vuart_op {
                                  */
 };
 
+/*
+ * XEN_DOMCTL_iommu_ctl
+ *
+ * Control of VM IOMMU settings
+ */
+
+#define XEN_DOMCTL_IOMMU_INVALID 0
+
+struct xen_domctl_iommu_ctl {
+    uint32_t op; /* XEN_DOMCTL_IOMMU_* */
+};
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1214,6 +1226,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_vuart_op                      81
 #define XEN_DOMCTL_get_cpu_policy                82
 #define XEN_DOMCTL_set_cpu_policy                83
+#define XEN_DOMCTL_iommu_ctl                     84
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1274,6 +1287,7 @@ struct xen_domctl {
         struct xen_domctl_monitor_op        monitor_op;
         struct xen_domctl_psr_alloc         psr_alloc;
         struct xen_domctl_vuart_op          vuart_op;
+        struct xen_domctl_iommu_ctl         iommu_ctl;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 2368acebed..9825533c75 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -348,7 +348,15 @@ static XSM_INLINE int xsm_get_vnumainfo(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
+#if defined(CONFIG_HAS_PASSTHROUGH)
+static XSM_INLINE int xsm_iommu_ctl(XSM_DEFAULT_ARG struct domain *d,
+                                    unsigned int op)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
+#if defined(CONFIG_HAS_PCI)
 static XSM_INLINE int xsm_get_device_group(XSM_DEFAULT_ARG uint32_t machine_bdf)
 {
     XSM_ASSERT_ACTION(XSM_HOOK);
@@ -367,9 +375,9 @@ static XSM_INLINE int xsm_deassign_device(XSM_DEFAULT_ARG struct domain *d, uint
     return xsm_default_action(action, current->domain, d);
 }
 
-#endif /* HAS_PASSTHROUGH && HAS_PCI */
+#endif /* HAS_PCI */
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
+#if defined(CONFIG_HAS_DEVICE_TREE)
 static XSM_INLINE int xsm_assign_dtdevice(XSM_DEFAULT_ARG struct domain *d,
                                           const char *dtpath)
 {
@@ -384,7 +392,8 @@ static XSM_INLINE int xsm_deassign_dtdevice(XSM_DEFAULT_ARG struct domain *d,
     return xsm_default_action(action, current->domain, d);
 }
 
-#endif /* HAS_PASSTHROUGH && HAS_DEVICE_TREE */
+#endif /* HAS_DEVICE_TREE */
+#endif /* HAS_PASSTHROUGH */
 
 static XSM_INLINE int xsm_resource_plug_core(XSM_DEFAULT_VOID)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index b21c3783d3..1a96d3502c 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -106,17 +106,19 @@ struct xsm_operations {
     int (*irq_permission) (struct domain *d, int pirq, uint8_t allow);
     int (*iomem_permission) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
     int (*iomem_mapping) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
-    int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t end, uint8_t access);
+    int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t end, uint8_t access); 
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
+#if defined(CONFIG_HAS_PASSTHROUGH)
+    int (*iommu_ctl) (struct domain *d, unsigned int op);
+#if defined(CONFIG_HAS_PCI)
     int (*get_device_group) (uint32_t machine_bdf);
     int (*assign_device) (struct domain *d, uint32_t machine_bdf);
     int (*deassign_device) (struct domain *d, uint32_t machine_bdf);
 #endif
-
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
+#if defined(CONFIG_HAS_DEVICE_TREE)
     int (*assign_dtdevice) (struct domain *d, const char *dtpath);
     int (*deassign_dtdevice) (struct domain *d, const char *dtpath);
+#endif
 #endif
 
     int (*resource_plug_core) (void);
@@ -466,7 +468,14 @@ static inline int xsm_pci_config_permission (xsm_default_t def, struct domain *d
     return xsm_ops->pci_config_permission(d, machine_bdf, start, end, access);
 }
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
+#if defined(CONFIG_HAS_PASSTHROUGH)
+static inline int xsm_iommu_ctl(xsm_default_t def, struct domain *d,
+                                unsigned int op)
+{
+    return xsm_ops->iommu_ctl(d, op);
+}
+
+#if defined(CONFIG_HAS_PCI)
 static inline int xsm_get_device_group(xsm_default_t def, uint32_t machine_bdf)
 {
     return xsm_ops->get_device_group(machine_bdf);
@@ -481,9 +490,9 @@ static inline int xsm_deassign_device(xsm_default_t def, struct domain *d, uint3
 {
     return xsm_ops->deassign_device(d, machine_bdf);
 }
-#endif /* HAS_PASSTHROUGH && HAS_PCI) */
+#endif /* HAS_PCI */
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
+#if defined(CONFIG_HAS_DEVICE_TREE)
 static inline int xsm_assign_dtdevice(xsm_default_t def, struct domain *d,
                                       const char *dtpath)
 {
@@ -496,7 +505,8 @@ static inline int xsm_deassign_dtdevice(xsm_default_t def, struct domain *d,
     return xsm_ops->deassign_dtdevice(d, dtpath);
 }
 
-#endif /* HAS_PASSTHROUGH && HAS_DEVICE_TREE */
+#endif /* HAS_DEVICE_TREE */
+#endif /* HAS_PASSTHROUGH */
 
 static inline int xsm_resource_plug_pci (xsm_default_t def, uint32_t machine_bdf)
 {
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index d4cce68089..a924f1dfd1 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -84,14 +84,16 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, get_vnumainfo);
 
 #if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
+    set_to_dummy_if_null(ops, iommu_ctl);
+#if defined(CONFIG_HAS_PCI)
     set_to_dummy_if_null(ops, get_device_group);
     set_to_dummy_if_null(ops, assign_device);
     set_to_dummy_if_null(ops, deassign_device);
 #endif
-
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
+#if defined(CONFIG_HAS_DEVICE_TREE)
     set_to_dummy_if_null(ops, assign_dtdevice);
     set_to_dummy_if_null(ops, deassign_dtdevice);
+#endif
 #endif
 
     set_to_dummy_if_null(ops, resource_plug_core);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index b3addbf701..a2858fb0c0 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -624,6 +624,7 @@ static int flask_domctl(struct domain *d, int cmd)
      * These have individual XSM hooks
      * (drivers/passthrough/{pci,device_tree.c)
      */
+    case XEN_DOMCTL_iommu_ctl:
     case XEN_DOMCTL_get_device_group:
     case XEN_DOMCTL_test_assign_device:
     case XEN_DOMCTL_assign_device:
@@ -1278,7 +1279,15 @@ static int flask_mem_sharing(struct domain *d)
 }
 #endif
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
+#if defined(CONFIG_HAS_PASSTHROUGH)
+
+static int flask_iommu_ctl(struct domain *d, unsigned int op)
+{
+    /* All current ops are subject to default 'ctl' perm */
+    return current_has_perm(d, SECCLASS_IOMMU, IOMMU__CTL);
+}
+
+#if defined(CONFIG_HAS_PCI)
 static int flask_get_device_group(uint32_t machine_bdf)
 {
     u32 rsid;
@@ -1348,9 +1357,9 @@ static int flask_deassign_device(struct domain *d, uint32_t machine_bdf)
 
     return avc_current_has_perm(rsid, SECCLASS_RESOURCE, RESOURCE__REMOVE_DEVICE, NULL);
 }
-#endif /* HAS_PASSTHROUGH && HAS_PCI */
+#endif /* HAS_PCI */
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
+#if defined(CONFIG_HAS_DEVICE_TREE)
 static int flask_test_assign_dtdevice(const char *dtpath)
 {
     u32 rsid;
@@ -1410,7 +1419,8 @@ static int flask_deassign_dtdevice(struct domain *d, const char *dtpath)
     return avc_current_has_perm(rsid, SECCLASS_RESOURCE, RESOURCE__REMOVE_DEVICE,
                                 NULL);
 }
-#endif /* HAS_PASSTHROUGH && HAS_DEVICE_TREE */
+#endif /* HAS_DEVICE_TREE */
+#endif /* HAS_PASSTHROUGH */
 
 static int flask_platform_op(uint32_t op)
 {
@@ -1855,15 +1865,17 @@ static struct xsm_operations flask_ops = {
     .remove_from_physmap = flask_remove_from_physmap,
     .map_gmfn_foreign = flask_map_gmfn_foreign,
 
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
+#if defined(CONFIG_HAS_PASSTHROUGH)
+    .iommu_ctl = flask_iommu_ctl,
+#if defined(CONFIG_HAS_PCI)
     .get_device_group = flask_get_device_group,
     .assign_device = flask_assign_device,
     .deassign_device = flask_deassign_device,
 #endif
-
-#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
+#if defined(CONFIG_HAS_DEVICE_TREE)
     .assign_dtdevice = flask_assign_dtdevice,
     .deassign_dtdevice = flask_deassign_dtdevice,
+#endif
 #endif
 
     .platform_op = flask_platform_op,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index fde5162c7e..c017a38666 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -542,3 +542,10 @@ class argo
     # Domain sending a message to another domain.
     send
 }
+
+# Class iommu describes operations on the IOMMU resources of a domain
+class iommu
+{
+    # Miscellaneous control
+    ctl
+}
diff --git a/xen/xsm/flask/policy/security_classes b/xen/xsm/flask/policy/security_classes
index 50ecbabc5c..882968e79c 100644
--- a/xen/xsm/flask/policy/security_classes
+++ b/xen/xsm/flask/policy/security_classes
@@ -20,5 +20,6 @@ class grant
 class security
 class version
 class argo
+class iommu
 
 # FLASK
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION...
  2020-10-05  9:49 [PATCH 0/5] iommu page-table memory pool Paul Durrant
  2020-10-05  9:49 ` [PATCH 1/5] libxl: remove separate calculation of IOMMU memory overhead Paul Durrant
  2020-10-05  9:49 ` [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl Paul Durrant
@ 2020-10-05  9:49 ` Paul Durrant
  2020-10-08 13:08   ` Wei Liu
                     ` (2 more replies)
  2020-10-05  9:49 ` [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init() Paul Durrant
  2020-10-05  9:49 ` [PATCH 5/5] x86 / iommu: create a dedicated pool of page-table pages Paul Durrant
  4 siblings, 3 replies; 19+ messages in thread
From: Paul Durrant @ 2020-10-05  9:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Ian Jackson, Wei Liu, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Anthony PERARD,
	Volodymyr Babchuk, Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

... sub-operation of XEN_DOMCTL_iommu_ctl.

This patch adds a new sub-operation into the domctl. The code in iommu_ctl()
is extended to call a new arch-specific iommu_set_allocation() function which
will be called with the IOMMU page-table overhead (in 4k pages) in response
to libxl issuing a new domctl via the xc_iommu_set_allocation() helper
function.

The helper function is only called in the x86 implementation of
libxl__arch_domain_create() when the calculated 'iommu_memkb' value is
non-zero. Hence the ARM implementation of iommu_set_allocation() simply
returns -EOPNOTSUPP.

NOTE: The implementation of the IOMMU page-table memory pool will be added in
      a subsequent patch and so the x86 implementation of
      iommu_set_allocation() currently does nothing other than return 0 (to
      indicate success) thereby ensuring that the new call in
      libxl__arch_domain_create() always succeeds.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Wei Liu <wl@xen.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
 tools/libs/ctrl/include/xenctrl.h   |  5 +++++
 tools/libs/ctrl/xc_domain.c         | 16 ++++++++++++++++
 tools/libs/light/libxl_x86.c        | 10 ++++++++++
 xen/drivers/passthrough/iommu.c     |  8 ++++++++
 xen/drivers/passthrough/x86/iommu.c |  5 +++++
 xen/include/asm-arm/iommu.h         |  6 ++++++
 xen/include/asm-x86/iommu.h         |  2 ++
 xen/include/public/domctl.h         |  8 ++++++++
 8 files changed, 60 insertions(+)

diff --git a/tools/libs/ctrl/include/xenctrl.h b/tools/libs/ctrl/include/xenctrl.h
index 3796425e1e..4d6c9d44bc 100644
--- a/tools/libs/ctrl/include/xenctrl.h
+++ b/tools/libs/ctrl/include/xenctrl.h
@@ -2650,6 +2650,11 @@ int xc_livepatch_replace(xc_interface *xch, char *name, uint32_t timeout, uint32
 int xc_domain_cacheflush(xc_interface *xch, uint32_t domid,
                          xen_pfn_t start_pfn, xen_pfn_t nr_pfns);
 
+/* IOMMU control operations */
+
+int xc_iommu_set_allocation(xc_interface *xch, uint32_t domid,
+                            unsigned int nr_pages);
+
 /* Compat shims */
 #include "xenctrl_compat.h"
 
diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index e7cea4a17d..0b20a8f2ee 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -2185,6 +2185,22 @@ int xc_domain_soft_reset(xc_interface *xch,
     domctl.domain = domid;
     return do_domctl(xch, &domctl);
 }
+
+int xc_iommu_set_allocation(xc_interface *xch, uint32_t domid,
+                            unsigned int nr_pages)
+{
+    DECLARE_DOMCTL;
+
+    memset(&domctl, 0, sizeof(domctl));
+
+    domctl.cmd = XEN_DOMCTL_iommu_ctl;
+    domctl.domain = domid;
+    domctl.u.iommu_ctl.op = XEN_DOMCTL_IOMMU_SET_ALLOCATION;
+    domctl.u.iommu_ctl.u.set_allocation.nr_pages = nr_pages;
+
+    return do_domctl(xch, &domctl);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 6ec6c27c83..9631974dd6 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -520,6 +520,16 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
                           NULL, 0, &shadow, 0, NULL);
     }
 
+    if (d_config->b_info.iommu_memkb) {
+        unsigned int nr_pages = DIV_ROUNDUP(d_config->b_info.iommu_memkb, 4);
+
+        ret = xc_iommu_set_allocation(ctx->xch, domid, nr_pages);
+        if (ret) {
+            LOGED(ERROR, domid, "Failed to set IOMMU allocation");
+            goto out;
+        }
+    }
+
     if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV &&
             libxl_defbool_val(d_config->b_info.u.pv.e820_host)) {
         ret = libxl__e820_alloc(gc, domid, d_config);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index bef0405984..642d5c8331 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -515,6 +515,14 @@ static int iommu_ctl(
 
     switch ( ctl->op )
     {
+    case XEN_DOMCTL_IOMMU_SET_ALLOCATION:
+    {
+        struct xen_domctl_iommu_set_allocation *set_allocation =
+            &ctl->u.set_allocation;
+
+        rc = iommu_set_allocation(d, set_allocation->nr_pages);
+        break;
+    }
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index f17b1820f4..b168073f10 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -134,6 +134,11 @@ void __hwdom_init arch_iommu_check_autotranslated_hwdom(struct domain *d)
         panic("PVH hardware domain iommu must be set in 'strict' mode\n");
 }
 
+int iommu_set_allocation(struct domain *d, unsigned nr_pages)
+{
+    return 0;
+}
+
 int arch_iommu_domain_init(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 937edc8373..2e4735bace 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -33,6 +33,12 @@ int __must_check arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
 int __must_check arm_iommu_unmap_page(struct domain *d, dfn_t dfn,
                                       unsigned int *flush_flags);
 
+static inline int iommu_set_allocation(struct domain *d,
+                                       unsigned int nr_pages)
+{
+    return -EOPNOTSUPP;
+}
+
 #endif /* __ARCH_ARM_IOMMU_H__ */
 
 /*
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 970eb06ffa..d086f564af 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -138,6 +138,8 @@ int pi_update_irte(const struct pi_desc *pi_desc, const struct pirq *pirq,
 int __must_check iommu_free_pgtables(struct domain *d);
 struct page_info *__must_check iommu_alloc_pgtable(struct domain *d);
 
+int __must_check iommu_set_allocation(struct domain *d, unsigned int nr_pages);
+
 #endif /* !__ARCH_X86_IOMMU_H__ */
 /*
  * Local variables:
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 75e855625a..6402678838 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1138,8 +1138,16 @@ struct xen_domctl_vuart_op {
 
 #define XEN_DOMCTL_IOMMU_INVALID 0
 
+#define XEN_DOMCTL_IOMMU_SET_ALLOCATION 1
+struct xen_domctl_iommu_set_allocation {
+    uint32_t nr_pages;
+};
+
 struct xen_domctl_iommu_ctl {
     uint32_t op; /* XEN_DOMCTL_IOMMU_* */
+    union {
+        struct xen_domctl_iommu_set_allocation set_allocation;
+    } u;
 };
 
 struct xen_domctl {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init()
  2020-10-05  9:49 [PATCH 0/5] iommu page-table memory pool Paul Durrant
                   ` (2 preceding siblings ...)
  2020-10-05  9:49 ` [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION Paul Durrant
@ 2020-10-05  9:49 ` Paul Durrant
  2020-10-16 16:07   ` Julien Grall
  2020-10-30 16:11   ` Jan Beulich
  2020-10-05  9:49 ` [PATCH 5/5] x86 / iommu: create a dedicated pool of page-table pages Paul Durrant
  4 siblings, 2 replies; 19+ messages in thread
From: Paul Durrant @ 2020-10-05  9:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Jan Beulich

From: Paul Durrant <pdurrant@amazon.com>

Set these flags prior to the calls to arch_iommu_domain_init() and the
implementation init() method. There is no reason to hide this information from
those functions and the value of 'hap_pt_share' will be needed by a
modification to arch_iommu_domain_init() made in a subsequent patch.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/iommu.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 642d5c8331..fd9705b3a9 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -174,15 +174,6 @@ int iommu_domain_init(struct domain *d, unsigned int opts)
     hd->node = NUMA_NO_NODE;
 #endif
 
-    ret = arch_iommu_domain_init(d);
-    if ( ret )
-        return ret;
-
-    hd->platform_ops = iommu_get_ops();
-    ret = hd->platform_ops->init(d);
-    if ( ret || is_system_domain(d) )
-        return ret;
-
     /*
      * Use shared page tables for HAP and IOMMU if the global option
      * is enabled (from which we can infer the h/w is capable) and
@@ -202,7 +193,12 @@ int iommu_domain_init(struct domain *d, unsigned int opts)
 
     ASSERT(!(hd->need_sync && hd->hap_pt_share));
 
-    return 0;
+    ret = arch_iommu_domain_init(d);
+    if ( ret )
+        return ret;
+
+    hd->platform_ops = iommu_get_ops();
+    return hd->platform_ops->init(d);
 }
 
 void __hwdom_init iommu_hwdom_init(struct domain *d)
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/5] x86 / iommu: create a dedicated pool of page-table pages
  2020-10-05  9:49 [PATCH 0/5] iommu page-table memory pool Paul Durrant
                   ` (3 preceding siblings ...)
  2020-10-05  9:49 ` [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init() Paul Durrant
@ 2020-10-05  9:49 ` Paul Durrant
  2020-10-30 16:43   ` Jan Beulich
  4 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2020-10-05  9:49 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Paul Durrant <pdurrant@amazon.com>

This patch, in a way analogous to HAP page allocation, creates a pool of
pages for use as IOMMU page-tables when the size of that pool is configured
by a call to iommu_set_allocation(). That call occurs when the tool-stack
has calculates the size of the pool and issues the
XEN_DOMCTL_IOMMU_SET_ALLOCATION sub-operation of the XEN_DOMCTL_iommu_ctl
domctl. However, some IOMMU mappings must be set up during domain_create(),
before the tool-stack has had a chance to make its calculation and issue the
domctl. Hence an initial hard-coded pool size of 256 is set for domains
not sharing EPT during the call to arch_iommu_domain_init().

NOTE: No pool is configured for the hardware or quarantine domains. They
      continue to allocate page-table pages on demand.
      The prototype of iommu_free_pgtables() is change to void return as
      it no longer needs to be re-startable.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Wei Liu <wl@xen.org>
---
 xen/arch/x86/domain.c               |   4 +-
 xen/drivers/passthrough/x86/iommu.c | 129 ++++++++++++++++++++++++----
 xen/include/asm-x86/iommu.h         |   5 +-
 3 files changed, 116 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7e16d49bfd..101ef4aba5 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2304,7 +2304,9 @@ int domain_relinquish_resources(struct domain *d)
 
     PROGRESS(iommu_pagetables):
 
-        ret = iommu_free_pgtables(d);
+        iommu_free_pgtables(d);
+
+        ret = iommu_set_allocation(d, 0);
         if ( ret )
             return ret;
 
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index b168073f10..de0cc52489 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -134,21 +134,106 @@ void __hwdom_init arch_iommu_check_autotranslated_hwdom(struct domain *d)
         panic("PVH hardware domain iommu must be set in 'strict' mode\n");
 }
 
-int iommu_set_allocation(struct domain *d, unsigned nr_pages)
+static int destroy_pgtable(struct domain *d)
 {
+    struct domain_iommu *hd = dom_iommu(d);
+    struct page_info *pg;
+
+    if ( !hd->arch.pgtables.nr )
+    {
+        ASSERT_UNREACHABLE();
+        return -ENOENT;
+    }
+
+    pg = page_list_remove_head(&hd->arch.pgtables.free_list);
+    if ( !pg )
+        return -EBUSY;
+
+    hd->arch.pgtables.nr--;
+    free_domheap_page(pg);
+
     return 0;
 }
 
+static int create_pgtable(struct domain *d)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+    unsigned int memflags = 0;
+    struct page_info *pg;
+
+#ifdef CONFIG_NUMA
+    if ( hd->node != NUMA_NO_NODE )
+        memflags = MEMF_node(hd->node);
+#endif
+
+    pg = alloc_domheap_page(NULL, memflags);
+    if ( !pg )
+        return -ENOMEM;
+
+    page_list_add(pg, &hd->arch.pgtables.free_list);
+    hd->arch.pgtables.nr++;
+
+    return 0;
+}
+
+static int set_allocation(struct domain *d, unsigned int nr_pages,
+                          bool allow_preempt)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+    unsigned int done = 0;
+    int rc = 0;
+
+    spin_lock(&hd->arch.pgtables.lock);
+
+    while ( !rc )
+    {
+        if ( hd->arch.pgtables.nr < nr_pages )
+            rc = create_pgtable(d);
+        else if ( hd->arch.pgtables.nr > nr_pages )
+            rc = destroy_pgtable(d);
+        else
+            break;
+
+        if ( allow_preempt && !rc && !(++done & 0xff) &&
+             general_preempt_check() )
+            rc = -ERESTART;
+    }
+
+    spin_unlock(&hd->arch.pgtables.lock);
+
+    return rc;
+}
+
+int iommu_set_allocation(struct domain *d, unsigned int nr_pages)
+{
+    return set_allocation(d, nr_pages, true);
+}
+
+/*
+ * Some IOMMU mappings are set up during domain_create() before the tool-
+ * stack has a chance to calculate and set the appropriate page-table
+ * allocation. A hard-coded initial allocation covers this gap.
+ */
+#define INITIAL_ALLOCATION 256
+
 int arch_iommu_domain_init(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
 
     spin_lock_init(&hd->arch.mapping_lock);
 
+    INIT_PAGE_LIST_HEAD(&hd->arch.pgtables.free_list);
     INIT_PAGE_LIST_HEAD(&hd->arch.pgtables.list);
     spin_lock_init(&hd->arch.pgtables.lock);
 
-    return 0;
+    /*
+     * The hardware and quarantine domains are not subject to a quota
+     * and domains sharing EPT do not require any allocation.
+     */
+    if ( is_hardware_domain(d) || d == dom_io || iommu_use_hap_pt(d) )
+        return 0;
+
+    return set_allocation(d, INITIAL_ALLOCATION, false);
 }
 
 void arch_iommu_domain_destroy(struct domain *d)
@@ -265,38 +350,45 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         return;
 }
 
-int iommu_free_pgtables(struct domain *d)
+void iommu_free_pgtables(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
-    struct page_info *pg;
-    unsigned int done = 0;
 
-    while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
-    {
-        free_domheap_page(pg);
+    spin_lock(&hd->arch.pgtables.lock);
 
-        if ( !(++done & 0xff) && general_preempt_check() )
-            return -ERESTART;
-    }
+    page_list_splice(&hd->arch.pgtables.list, &hd->arch.pgtables.free_list);
+    INIT_PAGE_LIST_HEAD(&hd->arch.pgtables.list);
 
-    return 0;
+    spin_unlock(&hd->arch.pgtables.lock);
 }
 
 struct page_info *iommu_alloc_pgtable(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
-    unsigned int memflags = 0;
     struct page_info *pg;
     void *p;
 
-#ifdef CONFIG_NUMA
-    if ( hd->node != NUMA_NO_NODE )
-        memflags = MEMF_node(hd->node);
-#endif
+    spin_lock(&hd->arch.pgtables.lock);
 
-    pg = alloc_domheap_page(NULL, memflags);
+ again:
+    pg = page_list_remove_head(&hd->arch.pgtables.free_list);
     if ( !pg )
+    {
+        /*
+         * The hardware and quarantine domains are not subject to a quota
+         * so create page-table pages on demand.
+         */
+        if ( is_hardware_domain(d) || d == dom_io )
+        {
+            int rc = create_pgtable(d);
+
+            if ( !rc )
+                goto again;
+        }
+
+        spin_unlock(&hd->arch.pgtables.lock);
         return NULL;
+    }
 
     p = __map_domain_page(pg);
     clear_page(p);
@@ -306,7 +398,6 @@ struct page_info *iommu_alloc_pgtable(struct domain *d)
 
     unmap_domain_page(p);
 
-    spin_lock(&hd->arch.pgtables.lock);
     page_list_add(pg, &hd->arch.pgtables.list);
     spin_unlock(&hd->arch.pgtables.lock);
 
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index d086f564af..3991b5601b 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -47,7 +47,8 @@ struct arch_iommu
 {
     spinlock_t mapping_lock; /* io page table lock */
     struct {
-        struct page_list_head list;
+        struct page_list_head list, free_list;
+        unsigned int nr;
         spinlock_t lock;
     } pgtables;
 
@@ -135,7 +136,7 @@ int pi_update_irte(const struct pi_desc *pi_desc, const struct pirq *pirq,
         iommu_vcall(ops, sync_cache, addr, size);       \
 })
 
-int __must_check iommu_free_pgtables(struct domain *d);
+void iommu_free_pgtables(struct domain *d);
 struct page_info *__must_check iommu_alloc_pgtable(struct domain *d);
 
 int __must_check iommu_set_allocation(struct domain *d, unsigned int nr_pages);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/5] libxl: remove separate calculation of IOMMU memory overhead
  2020-10-05  9:49 ` [PATCH 1/5] libxl: remove separate calculation of IOMMU memory overhead Paul Durrant
@ 2020-10-08 13:07   ` Wei Liu
  0 siblings, 0 replies; 19+ messages in thread
From: Wei Liu @ 2020-10-08 13:07 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Paul Durrant, Ian Jackson, Wei Liu, Anthony PERARD

On Mon, Oct 05, 2020 at 10:49:01AM +0100, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> When using 'shared_pt' mode the IOMMU is using the EPT PTEs. In 'sync_pt'
> mode these PTEs are instead replicated for the IOMMU to use. Hence, it is
> fairly clear that the memory overhead in this mode is essentially another
> copy of the P2M.
> 
> This patch removes the independent calculation done in
> libxl__get_required_iommu_memory() and instead simply uses 'shadow_memkb'
> as the value of the IOMMU overhead since this is the estimated size of
> the P2M.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

Acked-by: Wei Liu <wl@xen.org>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION...
  2020-10-05  9:49 ` [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION Paul Durrant
@ 2020-10-08 13:08   ` Wei Liu
  2020-10-16 15:54   ` Julien Grall
  2020-10-30 16:45   ` Jan Beulich
  2 siblings, 0 replies; 19+ messages in thread
From: Wei Liu @ 2020-10-08 13:08 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Paul Durrant, Ian Jackson, Wei Liu, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Anthony PERARD, Volodymyr Babchuk, Roger Pau Monné

On Mon, Oct 05, 2020 at 10:49:03AM +0100, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
[...]
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index 6ec6c27c83..9631974dd6 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> @@ -520,6 +520,16 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
>                            NULL, 0, &shadow, 0, NULL);
>      }
>  
> +    if (d_config->b_info.iommu_memkb) {
> +        unsigned int nr_pages = DIV_ROUNDUP(d_config->b_info.iommu_memkb, 4);
> +

Please use XC_PAGE_SHIFT / XC_PAGE_SIZE for the calculation.

Wei.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
  2020-10-05  9:49 ` [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl Paul Durrant
@ 2020-10-16 15:47   ` Julien Grall
  2020-10-19  7:23     ` Paul Durrant
  0 siblings, 1 reply; 19+ messages in thread
From: Julien Grall @ 2020-10-16 15:47 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Daniel De Graaf, Ian Jackson, Wei Liu,
	Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini

Hi Paul,

On 05/10/2020 10:49, Paul Durrant wrote:
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 791f0a2592..75e855625a 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1130,6 +1130,18 @@ struct xen_domctl_vuart_op {
>                                    */
>   };
>   
> +/*
> + * XEN_DOMCTL_iommu_ctl
> + *
> + * Control of VM IOMMU settings
> + */
> +
> +#define XEN_DOMCTL_IOMMU_INVALID 0

I can't find any user of XEN_DOMCTL_IOMMU_INVALID. What's the purpose 
for it?

[...]

> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> index b21c3783d3..1a96d3502c 100644
> --- a/xen/include/xsm/xsm.h
> +++ b/xen/include/xsm/xsm.h
> @@ -106,17 +106,19 @@ struct xsm_operations {
>       int (*irq_permission) (struct domain *d, int pirq, uint8_t allow);
>       int (*iomem_permission) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
>       int (*iomem_mapping) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
> -    int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t end, uint8_t access);
> +    int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t end, uint8_t access);

I can't figure out what changed here. Is it an intended change?

>   
> -#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
> +#if defined(CONFIG_HAS_PASSTHROUGH)
> +    int (*iommu_ctl) (struct domain *d, unsigned int op);
> +#if defined(CONFIG_HAS_PCI)
>       int (*get_device_group) (uint32_t machine_bdf);
>       int (*assign_device) (struct domain *d, uint32_t machine_bdf);
>       int (*deassign_device) (struct domain *d, uint32_t machine_bdf);
>   #endif
> -
> -#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
> +#if defined(CONFIG_HAS_DEVICE_TREE)
>       int (*assign_dtdevice) (struct domain *d, const char *dtpath);
>       int (*deassign_dtdevice) (struct domain *d, const char *dtpath);
> +#endif
>   #endif

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION...
  2020-10-05  9:49 ` [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION Paul Durrant
  2020-10-08 13:08   ` Wei Liu
@ 2020-10-16 15:54   ` Julien Grall
  2020-10-19  7:30     ` Paul Durrant
  2020-10-30 16:45   ` Jan Beulich
  2 siblings, 1 reply; 19+ messages in thread
From: Julien Grall @ 2020-10-16 15:54 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Ian Jackson, Wei Liu, Andrew Cooper, George Dunlap,
	Jan Beulich, Stefano Stabellini, Anthony PERARD,
	Volodymyr Babchuk, Roger Pau Monné

Hi Paul,

On 05/10/2020 10:49, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> ... sub-operation of XEN_DOMCTL_iommu_ctl.
> 
> This patch adds a new sub-operation into the domctl. The code in iommu_ctl()
> is extended to call a new arch-specific iommu_set_allocation() function which
> will be called with the IOMMU page-table overhead (in 4k pages) in response

Why 4KB? Wouldn't it be better to use the hypervisor page size instead?

> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 75e855625a..6402678838 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1138,8 +1138,16 @@ struct xen_domctl_vuart_op {
>   
>   #define XEN_DOMCTL_IOMMU_INVALID 0
>   
> +#define XEN_DOMCTL_IOMMU_SET_ALLOCATION 1
> +struct xen_domctl_iommu_set_allocation {
> +    uint32_t nr_pages;

Shouldn't this be a 64-bit value?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init()
  2020-10-05  9:49 ` [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init() Paul Durrant
@ 2020-10-16 16:07   ` Julien Grall
  2020-10-30 16:11   ` Jan Beulich
  1 sibling, 0 replies; 19+ messages in thread
From: Julien Grall @ 2020-10-16 16:07 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Paul Durrant, Jan Beulich

Hi Paul,

On 05/10/2020 10:49, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> Set these flags prior to the calls to arch_iommu_domain_init() and the
> implementation init() method. There is no reason to hide this information from
> those functions and the value of 'hap_pt_share' will be needed by a
> modification to arch_iommu_domain_init() made in a subsequent patch.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
>   xen/drivers/passthrough/iommu.c | 16 ++++++----------
>   1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 642d5c8331..fd9705b3a9 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -174,15 +174,6 @@ int iommu_domain_init(struct domain *d, unsigned int opts)
>       hd->node = NUMA_NO_NODE;
>   #endif
>   
> -    ret = arch_iommu_domain_init(d);
> -    if ( ret )
> -        return ret;
> -
> -    hd->platform_ops = iommu_get_ops();
> -    ret = hd->platform_ops->init(d);
> -    if ( ret || is_system_domain(d) )
> -        return ret;
> -
>       /*
>        * Use shared page tables for HAP and IOMMU if the global option
>        * is enabled (from which we can infer the h/w is capable) and
> @@ -202,7 +193,12 @@ int iommu_domain_init(struct domain *d, unsigned int opts)
>   
>       ASSERT(!(hd->need_sync && hd->hap_pt_share));
>   
> -    return 0;
> +    ret = arch_iommu_domain_init(d);
> +    if ( ret )
> +        return ret;
> +
> +    hd->platform_ops = iommu_get_ops();
> +    return hd->platform_ops->init(d);
>   }
>   
>   void __hwdom_init iommu_hwdom_init(struct domain *d)
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
  2020-10-16 15:47   ` Julien Grall
@ 2020-10-19  7:23     ` Paul Durrant
  2020-10-19  7:29       ` Jan Beulich
  2020-10-20 17:17       ` Julien Grall
  0 siblings, 2 replies; 19+ messages in thread
From: Paul Durrant @ 2020-10-19  7:23 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: 'Paul Durrant', 'Daniel De Graaf',
	'Ian Jackson', 'Wei Liu', 'Andrew Cooper',
	'George Dunlap', 'Jan Beulich',
	'Stefano Stabellini'

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 16 October 2020 16:47
> To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org
> Cc: Paul Durrant <pdurrant@amazon.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Ian Jackson
> <iwj@xenproject.org>; Wei Liu <wl@xen.org>; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap
> <george.dunlap@citrix.com>; Jan Beulich <jbeulich@suse.com>; Stefano Stabellini
> <sstabellini@kernel.org>
> Subject: Re: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
> 
> Hi Paul,
> 
> On 05/10/2020 10:49, Paul Durrant wrote:
> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 791f0a2592..75e855625a 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -1130,6 +1130,18 @@ struct xen_domctl_vuart_op {
> >                                    */
> >   };
> >
> > +/*
> > + * XEN_DOMCTL_iommu_ctl
> > + *
> > + * Control of VM IOMMU settings
> > + */
> > +
> > +#define XEN_DOMCTL_IOMMU_INVALID 0
> 
> I can't find any user of XEN_DOMCTL_IOMMU_INVALID. What's the purpose
> for it?
> 

It's just a placeholder. I think it's generally safer that a zero opcode value is invalid.

> [...]
> 
> > diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> > index b21c3783d3..1a96d3502c 100644
> > --- a/xen/include/xsm/xsm.h
> > +++ b/xen/include/xsm/xsm.h
> > @@ -106,17 +106,19 @@ struct xsm_operations {
> >       int (*irq_permission) (struct domain *d, int pirq, uint8_t allow);
> >       int (*iomem_permission) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
> >       int (*iomem_mapping) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
> > -    int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t
> end, uint8_t access);
> > +    int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t
> end, uint8_t access);
> 
> I can't figure out what changed here. Is it an intended change?
> 

No, I'll check and get rid of it.

  Paul

> >
> > -#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
> > +#if defined(CONFIG_HAS_PASSTHROUGH)
> > +    int (*iommu_ctl) (struct domain *d, unsigned int op);
> > +#if defined(CONFIG_HAS_PCI)
> >       int (*get_device_group) (uint32_t machine_bdf);
> >       int (*assign_device) (struct domain *d, uint32_t machine_bdf);
> >       int (*deassign_device) (struct domain *d, uint32_t machine_bdf);
> >   #endif
> > -
> > -#if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_DEVICE_TREE)
> > +#if defined(CONFIG_HAS_DEVICE_TREE)
> >       int (*assign_dtdevice) (struct domain *d, const char *dtpath);
> >       int (*deassign_dtdevice) (struct domain *d, const char *dtpath);
> > +#endif
> >   #endif
> 
> Cheers,
> 
> --
> Julien Grall



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
  2020-10-19  7:23     ` Paul Durrant
@ 2020-10-19  7:29       ` Jan Beulich
  2020-10-19  7:32         ` Paul Durrant
  2020-10-20 17:17       ` Julien Grall
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2020-10-19  7:29 UTC (permalink / raw)
  To: paul
  Cc: 'Julien Grall', xen-devel, 'Paul Durrant',
	'Daniel De Graaf', 'Ian Jackson',
	'Wei Liu', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini'

On 19.10.2020 09:23, Paul Durrant wrote:
>> From: Julien Grall <julien@xen.org>
>> Sent: 16 October 2020 16:47
>>
>> On 05/10/2020 10:49, Paul Durrant wrote:
>>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>>> index 791f0a2592..75e855625a 100644
>>> --- a/xen/include/public/domctl.h
>>> +++ b/xen/include/public/domctl.h
>>> @@ -1130,6 +1130,18 @@ struct xen_domctl_vuart_op {
>>>                                    */
>>>   };
>>>
>>> +/*
>>> + * XEN_DOMCTL_iommu_ctl
>>> + *
>>> + * Control of VM IOMMU settings
>>> + */
>>> +
>>> +#define XEN_DOMCTL_IOMMU_INVALID 0
>>
>> I can't find any user of XEN_DOMCTL_IOMMU_INVALID. What's the purpose
>> for it?
>>
> 
> It's just a placeholder. I think it's generally safer that a zero opcode value is invalid.

But does this then need a #define? Starting valid command from 1
ought to be sufficient?

Jan


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION...
  2020-10-16 15:54   ` Julien Grall
@ 2020-10-19  7:30     ` Paul Durrant
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Durrant @ 2020-10-19  7:30 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: 'Paul Durrant', 'Ian Jackson', 'Wei Liu',
	'Andrew Cooper', 'George Dunlap',
	'Jan Beulich', 'Stefano Stabellini',
	'Anthony PERARD', 'Volodymyr Babchuk',
	'Roger Pau Monné'

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 16 October 2020 16:55
> To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org
> Cc: Paul Durrant <pdurrant@amazon.com>; Ian Jackson <iwj@xenproject.org>; Wei Liu <wl@xen.org>; Andrew
> Cooper <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony PERARD
> <anthony.perard@citrix.com>; Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Roger Pau Monné
> <roger.pau@citrix.com>
> Subject: Re: [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION...
> 
> Hi Paul,
> 
> On 05/10/2020 10:49, Paul Durrant wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > ... sub-operation of XEN_DOMCTL_iommu_ctl.
> >
> > This patch adds a new sub-operation into the domctl. The code in iommu_ctl()
> > is extended to call a new arch-specific iommu_set_allocation() function which
> > will be called with the IOMMU page-table overhead (in 4k pages) in response
> 
> Why 4KB? Wouldn't it be better to use the hypervisor page size instead?
> 

I think I'll follow the shadow/hap code more closely and just pass a value in MB, then any issue with page size is left inside Xen.

> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 75e855625a..6402678838 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -1138,8 +1138,16 @@ struct xen_domctl_vuart_op {
> >
> >   #define XEN_DOMCTL_IOMMU_INVALID 0
> >
> > +#define XEN_DOMCTL_IOMMU_SET_ALLOCATION 1
> > +struct xen_domctl_iommu_set_allocation {
> > +    uint32_t nr_pages;
> 
> Shouldn't this be a 64-bit value?

If I pass the value in MB then 32-bits will cover it, I think. I do need to add padding though.

  Paul

> 
> Cheers,
> 
> --
> Julien Grall



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
  2020-10-19  7:29       ` Jan Beulich
@ 2020-10-19  7:32         ` Paul Durrant
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Durrant @ 2020-10-19  7:32 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Julien Grall', xen-devel, 'Paul Durrant',
	'Daniel De Graaf', 'Ian Jackson',
	'Wei Liu', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 19 October 2020 08:30
> To: paul@xen.org
> Cc: 'Julien Grall' <julien@xen.org>; xen-devel@lists.xenproject.org; 'Paul Durrant'
> <pdurrant@amazon.com>; 'Daniel De Graaf' <dgdegra@tycho.nsa.gov>; 'Ian Jackson' <iwj@xenproject.org>;
> 'Wei Liu' <wl@xen.org>; 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap'
> <george.dunlap@citrix.com>; 'Stefano Stabellini' <sstabellini@kernel.org>
> Subject: Re: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
> 
> On 19.10.2020 09:23, Paul Durrant wrote:
> >> From: Julien Grall <julien@xen.org>
> >> Sent: 16 October 2020 16:47
> >>
> >> On 05/10/2020 10:49, Paul Durrant wrote:
> >>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> >>> index 791f0a2592..75e855625a 100644
> >>> --- a/xen/include/public/domctl.h
> >>> +++ b/xen/include/public/domctl.h
> >>> @@ -1130,6 +1130,18 @@ struct xen_domctl_vuart_op {
> >>>                                    */
> >>>   };
> >>>
> >>> +/*
> >>> + * XEN_DOMCTL_iommu_ctl
> >>> + *
> >>> + * Control of VM IOMMU settings
> >>> + */
> >>> +
> >>> +#define XEN_DOMCTL_IOMMU_INVALID 0
> >>
> >> I can't find any user of XEN_DOMCTL_IOMMU_INVALID. What's the purpose
> >> for it?
> >>
> >
> > It's just a placeholder. I think it's generally safer that a zero opcode value is invalid.
> 
> But does this then need a #define? Starting valid command from 1
> ought to be sufficient?
> 

Seems harmless enough, and it also seemed the best way since to reserve 0 since this patch doesn't actually introduce any ops. As it has caused so much controversy though, I'll remove it.

  Paul

> Jan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
  2020-10-19  7:23     ` Paul Durrant
  2020-10-19  7:29       ` Jan Beulich
@ 2020-10-20 17:17       ` Julien Grall
  1 sibling, 0 replies; 19+ messages in thread
From: Julien Grall @ 2020-10-20 17:17 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Paul Durrant', 'Daniel De Graaf',
	'Ian Jackson', 'Wei Liu', 'Andrew Cooper',
	'George Dunlap', 'Jan Beulich',
	'Stefano Stabellini'

Hi Paul,

On 19/10/2020 08:23, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 16 October 2020 16:47
>> To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org
>> Cc: Paul Durrant <pdurrant@amazon.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Ian Jackson
>> <iwj@xenproject.org>; Wei Liu <wl@xen.org>; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap
>> <george.dunlap@citrix.com>; Jan Beulich <jbeulich@suse.com>; Stefano Stabellini
>> <sstabellini@kernel.org>
>> Subject: Re: [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl
>>
>> Hi Paul,
>>
>> On 05/10/2020 10:49, Paul Durrant wrote:
>>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>>> index 791f0a2592..75e855625a 100644
>>> --- a/xen/include/public/domctl.h
>>> +++ b/xen/include/public/domctl.h
>>> @@ -1130,6 +1130,18 @@ struct xen_domctl_vuart_op {
>>>                                     */
>>>    };
>>>
>>> +/*
>>> + * XEN_DOMCTL_iommu_ctl
>>> + *
>>> + * Control of VM IOMMU settings
>>> + */
>>> +
>>> +#define XEN_DOMCTL_IOMMU_INVALID 0
>>
>> I can't find any user of XEN_DOMCTL_IOMMU_INVALID. What's the purpose
>> for it?
>>
> 
> It's just a placeholder. I think it's generally safer that a zero opcode value is invalid.

Thanks for the explanation. I first thought the goal would to somehow 
invalidate the IOMMU :).

Anyway, it might be worth adding /* Reserved - should never be used */ 
on top.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init()
  2020-10-05  9:49 ` [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init() Paul Durrant
  2020-10-16 16:07   ` Julien Grall
@ 2020-10-30 16:11   ` Jan Beulich
  1 sibling, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2020-10-30 16:11 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Paul Durrant, xen-devel

On 05.10.2020 11:49, Paul Durrant wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -174,15 +174,6 @@ int iommu_domain_init(struct domain *d, unsigned int opts)
>      hd->node = NUMA_NO_NODE;
>  #endif
>  
> -    ret = arch_iommu_domain_init(d);
> -    if ( ret )
> -        return ret;
> -
> -    hd->platform_ops = iommu_get_ops();
> -    ret = hd->platform_ops->init(d);
> -    if ( ret || is_system_domain(d) )
> -        return ret;

Are you suggesting the is_system_domain() here has become
unnecessary? If so, it would be nice if you could say when
or why. Otherwise I would assume it's needed to avoid
setting one or both of the fields.

Jan


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] x86 / iommu: create a dedicated pool of page-table pages
  2020-10-05  9:49 ` [PATCH 5/5] x86 / iommu: create a dedicated pool of page-table pages Paul Durrant
@ 2020-10-30 16:43   ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2020-10-30 16:43 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Paul Durrant, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 05.10.2020 11:49, Paul Durrant wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -2304,7 +2304,9 @@ int domain_relinquish_resources(struct domain *d)
>  
>      PROGRESS(iommu_pagetables):
>  
> -        ret = iommu_free_pgtables(d);
> +        iommu_free_pgtables(d);
> +
> +        ret = iommu_set_allocation(d, 0);
>          if ( ret )
>              return ret;

There doesn't look to be a need to call iommu_free_pgtables()
more than once - how about you move it immediately ahead of
the (extended) case label?

> +static int set_allocation(struct domain *d, unsigned int nr_pages,
> +                          bool allow_preempt)

Why the allow_preempt parameter when the sole caller passes
"true"?

> +/*
> + * Some IOMMU mappings are set up during domain_create() before the tool-
> + * stack has a chance to calculate and set the appropriate page-table
> + * allocation. A hard-coded initial allocation covers this gap.
> + */
> +#define INITIAL_ALLOCATION 256

How did you arrive at this number? IOW how many pages do we
need in reality, and how much leeway have you added in?

As to the tool stack - why would it "have a chance" to do the
necessary calculations only pretty late? I wonder whether the
intended allocation wouldn't better be part of struct
xen_domctl_createdomain, without the need for a new sub-op.

> @@ -265,38 +350,45 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>          return;
>  }
>  
> -int iommu_free_pgtables(struct domain *d)
> +void iommu_free_pgtables(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> -    struct page_info *pg;
> -    unsigned int done = 0;
>  
> -    while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
> -    {
> -        free_domheap_page(pg);
> +    spin_lock(&hd->arch.pgtables.lock);
>  
> -        if ( !(++done & 0xff) && general_preempt_check() )
> -            return -ERESTART;
> -    }
> +    page_list_splice(&hd->arch.pgtables.list, &hd->arch.pgtables.free_list);
> +    INIT_PAGE_LIST_HEAD(&hd->arch.pgtables.list);
>  
> -    return 0;
> +    spin_unlock(&hd->arch.pgtables.lock);
>  }
>  
>  struct page_info *iommu_alloc_pgtable(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> -    unsigned int memflags = 0;
>      struct page_info *pg;
>      void *p;
>  
> -#ifdef CONFIG_NUMA
> -    if ( hd->node != NUMA_NO_NODE )
> -        memflags = MEMF_node(hd->node);
> -#endif
> +    spin_lock(&hd->arch.pgtables.lock);
>  
> -    pg = alloc_domheap_page(NULL, memflags);
> + again:
> +    pg = page_list_remove_head(&hd->arch.pgtables.free_list);
>      if ( !pg )
> +    {
> +        /*
> +         * The hardware and quarantine domains are not subject to a quota
> +         * so create page-table pages on demand.
> +         */
> +        if ( is_hardware_domain(d) || d == dom_io )
> +        {
> +            int rc = create_pgtable(d);
> +
> +            if ( !rc )
> +                goto again;

This gives the appearance of a potentially infinite loop; it's
not because the lock is being held, but I still wonder whether
the impression this gives couldn't be avoided by a slightly
different code structure.

Also the downside of this is that the amount of pages used by
hwdom will now never shrink anymore.

> @@ -306,7 +398,6 @@ struct page_info *iommu_alloc_pgtable(struct domain *d)
>  
>      unmap_domain_page(p);
>  
> -    spin_lock(&hd->arch.pgtables.lock);
>      page_list_add(pg, &hd->arch.pgtables.list);
>      spin_unlock(&hd->arch.pgtables.lock);

You want to drop the lock before the map/clear/unmap, and then
re-acquire it. Or, on the assumption that putting it on the
list earlier is fine (which I think it is), move the other two
lines here up as well.

Jan


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION...
  2020-10-05  9:49 ` [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION Paul Durrant
  2020-10-08 13:08   ` Wei Liu
  2020-10-16 15:54   ` Julien Grall
@ 2020-10-30 16:45   ` Jan Beulich
  2 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2020-10-30 16:45 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Paul Durrant, Ian Jackson, Wei Liu, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Anthony PERARD,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel

On 05.10.2020 11:49, Paul Durrant wrote:

Just two nits, in case the op is really needed:

> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -515,6 +515,14 @@ static int iommu_ctl(
>  
>      switch ( ctl->op )
>      {
> +    case XEN_DOMCTL_IOMMU_SET_ALLOCATION:
> +    {
> +        struct xen_domctl_iommu_set_allocation *set_allocation =
> +            &ctl->u.set_allocation;

const please, or drop the local variable.

> +        rc = iommu_set_allocation(d, set_allocation->nr_pages);
> +        break;
> +    }
>      default:

Blank line above here please.

Jan


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-10-30 16:45 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-05  9:49 [PATCH 0/5] iommu page-table memory pool Paul Durrant
2020-10-05  9:49 ` [PATCH 1/5] libxl: remove separate calculation of IOMMU memory overhead Paul Durrant
2020-10-08 13:07   ` Wei Liu
2020-10-05  9:49 ` [PATCH 2/5] iommu / domctl: introduce XEN_DOMCTL_iommu_ctl Paul Durrant
2020-10-16 15:47   ` Julien Grall
2020-10-19  7:23     ` Paul Durrant
2020-10-19  7:29       ` Jan Beulich
2020-10-19  7:32         ` Paul Durrant
2020-10-20 17:17       ` Julien Grall
2020-10-05  9:49 ` [PATCH 3/5] libxl / iommu / domctl: introduce XEN_DOMCTL_IOMMU_SET_ALLOCATION Paul Durrant
2020-10-08 13:08   ` Wei Liu
2020-10-16 15:54   ` Julien Grall
2020-10-19  7:30     ` Paul Durrant
2020-10-30 16:45   ` Jan Beulich
2020-10-05  9:49 ` [PATCH 4/5] iommu: set 'hap_pt_share' and 'need_sync' flags earlier in iommu_domain_init() Paul Durrant
2020-10-16 16:07   ` Julien Grall
2020-10-30 16:11   ` Jan Beulich
2020-10-05  9:49 ` [PATCH 5/5] x86 / iommu: create a dedicated pool of page-table pages Paul Durrant
2020-10-30 16:43   ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).