* [PATCH v7 0/7] Paravirtual IOMMU Interface
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant

This contains the remaining patches from my original PV-IOMMU series (those
that were not included in the recent prerequisites series), and some new
patches to support the revised implementation.

The new patches are:
 - To add a 'no crash' variant of the internal IOMMU map method.
 - To add a new 'refcount' bit into the VT-d IOMMU PTE.

I have started this series at v7 as it derives from v6 of the old series.

Paul Durrant (7):
  public / x86: introduce __HYPERCALL_iommu_op
  iommu: track reserved ranges using a rangeset
  x86: add xen_iommu_op to query reserved ranges
  iommu: introduce iommu_map_page_nocrash
  iommu / vtd: introduce a new 'refcount' flag...
  x86: add xen_iommu_ops to modify IOMMU mappings
  x86: extend the map and unmap xen_iommu_ops to support grant
    references

 tools/flask/policy/modules/xen.if   |   1 +
 xen/arch/x86/hvm/hypercall.c        |   1 +
 xen/arch/x86/hypercall.c            |   1 +
 xen/arch/x86/pv/hypercall.c         |   1 +
 xen/common/Makefile                 |   1 +
 xen/common/grant_table.c            | 151 +++++++++
 xen/common/iommu_op.c               | 646 ++++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/iommu.c     |  28 +-
 xen/drivers/passthrough/vtd/iommu.c |  25 +-
 xen/drivers/passthrough/vtd/iommu.h |  17 +-
 xen/drivers/passthrough/x86/iommu.c |   4 +
 xen/include/Makefile                |   1 +
 xen/include/public/iommu_op.h       | 222 +++++++++++++
 xen/include/public/xen.h            |   1 +
 xen/include/xen/grant_table.h       |   5 +
 xen/include/xen/hypercall.h         |  12 +
 xen/include/xen/iommu.h             |  16 +
 xen/include/xlat.lst                |   7 +
 xen/include/xsm/dummy.h             |   6 +
 xen/include/xsm/xsm.h               |   6 +
 xen/xsm/dummy.c                     |   1 +
 xen/xsm/flask/hooks.c               |   6 +
 xen/xsm/flask/policy/access_vectors |   2 +
 23 files changed, 1143 insertions(+), 18 deletions(-)
 create mode 100644 xen/common/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h

-- 
2.11.0



* [PATCH v7 1/7] public / x86: introduce __HYPERCALL_iommu_op
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Paul Durrant, Jan Beulich

This patch introduces the boilerplate for a new hypercall to allow a
domain to control IOMMU mappings for its own pages.

Whilst there is duplication of code between the native and compat entry
points, which appears ripe for some form of combination, I think it is
better to maintain the separation as-is because the compat entry point
will necessarily gain complexity in subsequent patches.

NOTE: This hypercall is only implemented for x86 and is currently
      restricted by XSM to dom0. Its scope can be expanded in future
      if need be.
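
For illustration, a caller would invoke the hypercall along these lines
(a minimal sketch only: hypercall2() stands in for whatever hypercall
mechanism the calling environment provides, and the buffers must be in
memory that Xen can access):

    xen_iommu_op_t op = { .op = 0 };    /* no ops are defined yet */
    xen_iommu_op_buf_t buf = { .size = sizeof(op) };
    int rc;

    set_xen_guest_handle(buf.h, &op);

    rc = hypercall2(__HYPERVISOR_iommu_op, 1, &buf);
    if ( !rc )
        rc = op.status; /* per-op status; -EOPNOTSUPP at this point */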

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v7:
 - Avoid using the full XLAT code to copy back in compat handler.
 - Use local variable of the requisite handle type to store buf->h and
   hence avoid excessive casting.

v6:
 - Move iommu_op.c to xen/common as suggested by Kevin, but only build for
   x86 (as suggested by Jan).

v3:
 - Push op code into XSM check.

v2:
 - Get rid of the can_control_iommu() function, leaving this patch as pure
   boilerplate.
 - Re-structure the hypercall to use a buffer array, similar to that used
   by __HYPERCALL_dm_op, to allow for future expansion of op structure
   without affecting binary compatibility.
 - Drop use of __ in public header guard.
---
 tools/flask/policy/modules/xen.if   |   1 +
 xen/arch/x86/hvm/hypercall.c        |   1 +
 xen/arch/x86/hypercall.c            |   1 +
 xen/arch/x86/pv/hypercall.c         |   1 +
 xen/common/Makefile                 |   1 +
 xen/common/iommu_op.c               | 193 ++++++++++++++++++++++++++++++++++++
 xen/include/Makefile                |   1 +
 xen/include/public/iommu_op.h       |  64 ++++++++++++
 xen/include/public/xen.h            |   1 +
 xen/include/xen/hypercall.h         |  12 +++
 xen/include/xlat.lst                |   2 +
 xen/include/xsm/dummy.h             |   6 ++
 xen/include/xsm/xsm.h               |   6 ++
 xen/xsm/dummy.c                     |   1 +
 xen/xsm/flask/hooks.c               |   6 ++
 xen/xsm/flask/policy/access_vectors |   2 +
 16 files changed, 299 insertions(+)
 create mode 100644 xen/common/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h

diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index 4e06cfc09b..c2f9fc20d7 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -60,6 +60,7 @@ define(`create_domain_common', `
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { getparam hvmctl sethvmc
 			setparam nested altp2mhvm altp2mhvm_op dm };
+	allow $1 $2:resource control_iommu;
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 56713d1e08..fbc9f10187 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -138,6 +138,7 @@ static const hypercall_table_t hvm_hypercall_table[] = {
     COMPAT_CALL(mmuext_op),
     HYPERCALL(xenpmu_op),
     COMPAT_CALL(dm_op),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1)
 };
 
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 74bde5e958..308db24a5f 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -70,6 +70,7 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(dm_op, 3),
 #endif
     ARGS(mca, 1),
+    ARGS(iommu_op, 2),
     ARGS(arch_1, 1),
 };
 
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 5d11911735..50bdf34875 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -83,6 +83,7 @@ const hypercall_table_t pv_hypercall_table[] = {
     COMPAT_CALL(dm_op),
 #endif
     HYPERCALL(mca),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1),
 };
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 6a05fffc7a..6341bdccee 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
 obj-y += grant_table.o
 obj-y += guestcopy.o
 obj-bin-y += gunzip.init.o
+obj-$(CONFIG_X86) += iommu_op.o
 obj-y += irq.o
 obj-y += kernel.o
 obj-y += keyhandler.o
diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
new file mode 100644
index 0000000000..6acc7b1888
--- /dev/null
+++ b/xen/common/iommu_op.c
@@ -0,0 +1,193 @@
+/******************************************************************************
+ * iommu_op.c
+ *
+ * Paravirtualised IOMMU functionality
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+
+static void iommu_op(xen_iommu_op_t *op)
+{
+    switch ( op->op )
+    {
+    default:
+        op->status = -EOPNOTSUPP;
+        break;
+    }
+}
+
+int do_one_iommu_op(xen_iommu_op_buf_t *buf)
+{
+    const XEN_GUEST_HANDLE(xen_iommu_op_t) h =
+        guest_handle_cast(buf->h, xen_iommu_op_t);
+    xen_iommu_op_t op;
+    int rc;
+
+    if ( buf->size < sizeof(op) )
+        return -ENODATA;
+
+    if ( copy_from_guest(&op, h, 1) )
+        return -EFAULT;
+
+    if ( op.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, op.op);
+    if ( rc )
+        return rc;
+
+    iommu_op(&op);
+
+    if ( __copy_field_to_guest(h, &op, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+long do_iommu_op(unsigned int nr_bufs,
+                 XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        xen_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = do_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+#ifdef CONFIG_COMPAT
+
+int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
+{
+    const COMPAT_HANDLE(compat_iommu_op_t) h =
+        compat_handle_cast(buf->h, compat_iommu_op_t);
+    compat_iommu_op_t cmp;
+    xen_iommu_op_t nat;
+    int rc;
+
+    if ( buf->size < sizeof(cmp) )
+        return -ENODATA;
+
+    if ( copy_from_compat(&cmp, h, 1) )
+        return -EFAULT;
+
+    if ( cmp.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
+    if ( rc )
+        return rc;
+
+    XLAT_iommu_op(&nat, &cmp);
+
+    iommu_op(&nat);
+
+    /*
+     * Avoid the full (and lengthy) XLAT code as the only thing that
+     * needs copying back is the status field.
+     */
+    cmp.status = nat.status;
+
+    if ( __copy_field_to_compat(h, &cmp, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+int compat_iommu_op(unsigned int nr_bufs,
+                    XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        compat_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = compat_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+#endif /* CONFIG_COMPAT */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/Makefile b/xen/include/Makefile
index f7895e4d4e..3dd856af57 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -29,6 +29,7 @@ headers-$(CONFIG_X86)     += compat/arch-x86/xen-$(compat-arch-y).h
 headers-$(CONFIG_X86)     += compat/hvm/dm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_vcpu.h
+headers-$(CONFIG_X86)     += compat/iommu_op.h
 headers-y                 += compat/arch-$(compat-arch-y).h compat/pmu.h compat/xlat.h
 headers-$(CONFIG_XSM_FLASK) += compat/xsm/flask_op.h
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
new file mode 100644
index 0000000000..c3b68f665a
--- /dev/null
+++ b/xen/include/public/iommu_op.h
@@ -0,0 +1,64 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#ifndef XEN_PUBLIC_IOMMU_OP_H
+#define XEN_PUBLIC_IOMMU_OP_H
+
+#include "xen.h"
+
+struct xen_iommu_op {
+    uint16_t op;    /* op type */
+    uint16_t pad;
+    int32_t status; /* op completion status: */
+                    /* 0 for success, otherwise negative errno */
+};
+typedef struct xen_iommu_op xen_iommu_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
+
+struct xen_iommu_op_buf {
+    XEN_GUEST_HANDLE(void) h;
+    xen_ulong_t size;
+};
+typedef struct xen_iommu_op_buf xen_iommu_op_buf_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_buf_t);
+
+/* ` enum neg_errnoval
+ * ` HYPERVISOR_iommu_op(unsigned int nr_bufs,
+ * `                     xen_iommu_op_buf_t bufs[])
+ * `
+ *
+ * @nr_bufs is the number of buffers in the @bufs array.
+ * @bufs points to an array of buffers where each contains a struct
+ * xen_iommu_op.
+ */
+
+#endif /* XEN_PUBLIC_IOMMU_OP_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 68ee09810f..35e539d5ab 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -121,6 +121,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
+#define __HYPERVISOR_iommu_op             42
 
 /* Architecture-specific hypercall definitions. */
 #define __HYPERVISOR_arch_0               48
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea57d..2ebc999f4b 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -16,6 +16,7 @@
 #include <public/version.h>
 #include <public/pmu.h>
 #include <public/hvm/dm_op.h>
+#include <public/iommu_op.h>
 #include <asm/hypercall.h>
 #include <xsm/xsm.h>
 
@@ -148,6 +149,10 @@ do_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs);
 
+extern long
+do_iommu_op(unsigned int nr_bufs,
+            XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs);
+
 #ifdef CONFIG_COMPAT
 
 extern int
@@ -205,6 +210,13 @@ compat_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(void) bufs);
 
+#include <compat/iommu_op.h>
+
+DEFINE_XEN_GUEST_HANDLE(compat_iommu_op_buf_t);
+extern int
+compat_iommu_op(unsigned int nr_bufs,
+                XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs);
+
 #endif
 
 void arch_get_xen_caps(xen_capabilities_info_t *info);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 527332054a..3b15c18c4e 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -77,6 +77,8 @@
 ?	vcpu_hvm_context		hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_32			hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
+!	iommu_op			iommu_op.h
+!	iommu_op_buf			iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index b0ac1f66b3..34b786993d 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -715,6 +715,12 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_iommu_op(XSM_DEFAULT_ARG struct domain *d, unsigned int op)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
 #endif /* CONFIG_X86 */
 
 #include <public/version.h>
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3d67962493..847a1314ed 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -178,6 +178,7 @@ struct xsm_operations {
     int (*ioport_mapping) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow);
     int (*pmu_op) (struct domain *d, unsigned int op);
     int (*dm_op) (struct domain *d);
+    int (*iommu_op) (struct domain *d, unsigned int op);
 #endif
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
@@ -686,6 +687,11 @@ static inline int xsm_dm_op(xsm_default_t def, struct domain *d)
     return xsm_ops->dm_op(d);
 }
 
+static inline int xsm_iommu_op(xsm_default_t def, struct domain *d, unsigned int op)
+{
+    return xsm_ops->iommu_op(d, op);
+}
+
 #endif /* CONFIG_X86 */
 
 static inline int xsm_xen_version (xsm_default_t def, uint32_t op)
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 3290d04527..8532b74b9a 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -155,6 +155,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, ioport_mapping);
     set_to_dummy_if_null(ops, pmu_op);
     set_to_dummy_if_null(ops, dm_op);
+    set_to_dummy_if_null(ops, iommu_op);
 #endif
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 6da2773aa9..211e69fb38 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1666,6 +1666,11 @@ static int flask_dm_op(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__DM);
 }
 
+static int flask_iommu_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_RESOURCE, RESOURCE__CONTROL_IOMMU);
+}
+
 #endif /* CONFIG_X86 */
 
 static int flask_xen_version (uint32_t op)
@@ -1844,6 +1849,7 @@ static struct xsm_operations flask_ops = {
     .ioport_mapping = flask_ioport_mapping,
     .pmu_op = flask_pmu_op,
     .dm_op = flask_dm_op,
+    .iommu_op = flask_iommu_op,
 #endif
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index d01a7a0d03..802cbb8ea9 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -469,6 +469,8 @@ class resource
 # checked for PHYSDEVOP_setup_gsi (target IRQ)
 # checked for PHYSDEVOP_pci_mmcfg_reserved (target xen_t)
     setup
+# checked for IOMMU_OP
+    control_iommu
 }
 
 # Class security describes the FLASK security server itself; these operations
-- 
2.11.0



* [PATCH v7 2/7] iommu: track reserved ranges using a rangeset
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, Wei Liu, George Dunlap, Jan Beulich

Ranges that should be considered reserved in the IOMMU are not necessarily
limited to RMRRs. Any frame number that is not RAM but is present in the
initial domain map should be considered reserved in the IOMMU.

This patch adds a rangeset to the domain_iommu structure that is then used
to track all reserved ranges. This will be needed by a subsequent patch
to test whether it is safe to modify a particular IOMMU entry.
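
As an example of the intended use (lifted from the map operation added
later in this series), a hypercall handler can then cheaply refuse to
touch a DFN that falls in a reserved region:

    /* Check whether the specified DFN falls in a reserved region */
    if ( rangeset_contains_singleton(iommu->reserved_ranges, dfn_x(dfn)) )
        return -EINVAL;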

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v7:
 - Re-base.
 - NULL the rangeset pointer after freeing.
 - As well as RMRRs, include all non-RAM mapped PFNs in the rangeset
   rather than just E820 reserved.
 - Dropping George's and Wei's R-b due to non-trivial changes.

v6:
 - Re-base.

v2:
 - New in v2.
---
 xen/drivers/passthrough/iommu.c     | 11 ++++++++++-
 xen/drivers/passthrough/vtd/iommu.c | 20 +++++++++++++-------
 xen/drivers/passthrough/x86/iommu.c |  4 ++++
 xen/include/xen/iommu.h             |  6 ++++++
 4 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 8b438ae4bc..9f07112367 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -180,6 +180,10 @@ int iommu_domain_init(struct domain *d)
     if ( !iommu_enabled )
         return 0;
 
+    hd->reserved_ranges = rangeset_new(d, NULL, 0);
+    if ( !hd->reserved_ranges )
+        return -ENOMEM;
+
     hd->platform_ops = iommu_get_ops();
     return hd->platform_ops->init(d);
 }
@@ -296,12 +300,17 @@ int iommu_construct(struct domain *d)
 
 void iommu_domain_destroy(struct domain *d)
 {
-    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
+    struct domain_iommu *hd = dom_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops )
         return;
 
     iommu_teardown(d);
 
     arch_iommu_domain_destroy(d);
+
+    rangeset_destroy(hd->reserved_ranges);
+    hd->reserved_ranges = NULL;
 }
 
 int iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 4d1ff10817..d8873167e1 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1963,6 +1963,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
     unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
     struct mapped_rmrr *mrmrr;
     struct domain_iommu *hd = dom_iommu(d);
+    int err = 0;
 
     ASSERT(pcidevs_locked());
     ASSERT(rmrr->base_address < rmrr->end_address);
@@ -1976,8 +1977,6 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
         if ( mrmrr->base == rmrr->base_address &&
              mrmrr->end == rmrr->end_address )
         {
-            int ret = 0;
-
             if ( map )
             {
                 ++mrmrr->count;
@@ -1987,28 +1986,35 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
             if ( --mrmrr->count )
                 return 0;
 
-            while ( base_pfn < end_pfn )
+            err = rangeset_remove_range(hd->reserved_ranges,
+                                        base_pfn, end_pfn - 1);
+            while ( !err && base_pfn < end_pfn )
             {
                 if ( clear_identity_p2m_entry(d, base_pfn) )
-                    ret = -ENXIO;
+                    err = -ENXIO;
+
                 base_pfn++;
             }
 
             list_del(&mrmrr->list);
             xfree(mrmrr);
-            return ret;
+            return err;
         }
     }
 
     if ( !map )
         return -ENOENT;
 
+    err = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn - 1);
+    if ( err )
+        return err;
+
     while ( base_pfn < end_pfn )
     {
-        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
-
+        err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
         if ( err )
             return err;
+
         base_pfn++;
     }
 
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index b20bad17de..e661dc0b45 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -241,6 +241,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         else
             rc = iommu_map_page(d, _dfn(pfn), _mfn(pfn),
                                 IOMMUF_readable | IOMMUF_writable);
+
+        if ( !rc && !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
+            rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges, pfn);
+
         if ( rc )
             printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index c75333c077..9113f37b85 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -133,6 +133,12 @@ struct domain_iommu {
      * include/xen/mm.h).
      */
     bool need_sync;
+
+    /*
+     * DFN ranges that are reserved in the domain IOMMU mappings and
+     * must not be modified after initialization.
+     */
+    struct rangeset *reserved_ranges;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
-- 
2.11.0



* [PATCH v7 3/7] x86: add xen_iommu_op to query reserved ranges
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Paul Durrant, Jan Beulich

This patch adds a xen_iommu_op to allow the virtual machine's reserved
IOMMU ranges to be queried by the guest.

NOTE: The number of reserved ranges is determined by system firmware, in
      conjunction with Xen command line options, and is expected to be
      small. Thus, to avoid over-complicating the code, there is no
      pre-emption check within the operation.
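
The expected calling pattern (a sketch: do_query() stands in for issuing
the op via HYPERVISOR_iommu_op, and 'ranges' is a caller-allocated array
of xen_iommu_reserved_range_t) is to size the buffer first and then
retrieve the ranges:

    struct xen_iommu_op op = {
        .op = XEN_IOMMUOP_query_reserved, /* NULL handle, zero entries */
    };
    uint32_t nr;

    do_query(&op);                        /* status is 0... */
    nr = op.u.query_reserved.nr_entries;  /* ...and this is the count */

    set_xen_guest_handle(op.u.query_reserved.ranges, ranges);
    op.u.query_reserved.nr_entries = nr;
    do_query(&op);    /* status is -ENOBUFS only if the count has grown */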

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v7:
 - Use fixed-width types in hypercall.
 - Keep avoiding use of XLAT_iommu_op() for copying back data in the
   compat case. A handle translation macro is still defined (in case there
   is ever a need to start using XLAT_iommu_op() in future) but it is
   only evaluated directly if cmp.op == XEN_IOMMUOP_query_reserved.

v4:
 - Make xen_bfn_t strictly 64 bits wide and drop associated compat
   translation.

v3:
 - Avoid speculation beyond array bounds check.

v2:
 - Re-implemented for v2 based on new rangeset.
---
 xen/common/iommu_op.c         | 177 ++++++++++++++++++++++++++++++++++++++++--
 xen/include/public/iommu_op.h |  39 ++++++++++
 xen/include/xlat.lst          |   2 +
 3 files changed, 212 insertions(+), 6 deletions(-)

diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
index 6acc7b1888..9d914a67db 100644
--- a/xen/common/iommu_op.c
+++ b/xen/common/iommu_op.c
@@ -22,11 +22,70 @@
 #include <xen/event.h>
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
+#include <xen/nospec.h>
+
+struct get_reserved_ctxt {
+    unsigned int max_entries;
+    unsigned int nr_entries;
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
+static int get_reserved(unsigned long s, unsigned long e, void *arg)
+{
+    struct get_reserved_ctxt *ctxt = arg;
+
+    if ( ctxt->nr_entries < ctxt->max_entries )
+    {
+        xen_iommu_reserved_range_t range = {
+            .start_dfn = s,
+            .nr_frames = e - s + 1, /* rangeset ranges are inclusive */
+        };
+
+        if ( copy_to_guest_offset(ctxt->ranges, ctxt->nr_entries, &range,
+                                  1) )
+            return -EFAULT;
+    }
+
+    ctxt->nr_entries++;
+    return 0;
+}
+
+static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    struct get_reserved_ctxt ctxt = {
+        .max_entries = op->nr_entries,
+        .ranges = op->ranges,
+    };
+    int rc;
+
+    if ( op->pad )
+        return -EINVAL;
+
+    rc = rangeset_report_ranges(iommu->reserved_ranges, 0, ~0ul,
+                                get_reserved, &ctxt);
+    if ( rc )
+        return rc;
+
+    /* Pass back the actual number of reserved ranges */
+    op->nr_entries = ctxt.nr_entries;
+
+    if ( !guest_handle_is_null(ctxt.ranges) &&
+         ctxt.nr_entries > ctxt.max_entries )
+        return -ENOBUFS;
+
+    return 0;
+}
 
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
     {
+    case XEN_IOMMUOP_query_reserved:
+        op->status = iommu_op_query_reserved(&op->u.query_reserved);
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -38,12 +97,20 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     const XEN_GUEST_HANDLE(xen_iommu_op_t) h =
         guest_handle_cast(buf->h, xen_iommu_op_t);
     xen_iommu_op_t op;
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] =
+            sizeof(struct xen_iommu_op_query_reserved),
+    };
+    size_t size;
     int rc;
 
-    if ( buf->size < sizeof(op) )
+    offset = offsetof(struct xen_iommu_op, u);
+
+    if ( buf->size < offset )
         return -ENODATA;
 
-    if ( copy_from_guest(&op, h, 1) )
+    if ( copy_from_guest((void *)&op, buf->h, offset) )
         return -EFAULT;
 
     if ( op.pad )
@@ -53,8 +120,22 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( op.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(op.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_guest_offset((void *)&op.u, buf->h, offset, size) )
+        return -EFAULT;
+
     iommu_op(&op);
 
+    if ( op.op == XEN_IOMMUOP_query_reserved &&
+         __copy_field_to_guest(h, &op, u.query_reserved.nr_entries) )
+        return -EFAULT;
+
     if ( __copy_field_to_guest(h, &op, status) )
         return -EFAULT;
 
@@ -103,18 +184,29 @@ long do_iommu_op(unsigned int nr_bufs,
 
 #ifdef CONFIG_COMPAT
 
+CHECK_iommu_reserved_range;
+
 int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
 {
     const COMPAT_HANDLE(compat_iommu_op_t) h =
         compat_handle_cast(buf->h, compat_iommu_op_t);
     compat_iommu_op_t cmp;
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] =
+            sizeof(struct compat_iommu_op_query_reserved),
+    };
+    size_t size;
     xen_iommu_op_t nat;
+    unsigned int u;
     int rc;
 
-    if ( buf->size < sizeof(cmp) )
+    offset = offsetof(struct compat_iommu_op, u);
+
+    if ( buf->size < offset )
         return -ENODATA;
 
-    if ( copy_from_compat(&cmp, h, 1) )
+    if ( copy_from_compat((void *)&cmp, buf->h, offset) )
         return -EFAULT;
 
     if ( cmp.pad )
@@ -124,16 +216,89 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( cmp.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(cmp.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_compat_offset((void *)&cmp.u, buf->h, offset, size) )
+        return -EFAULT;
+
+    /*
+     * The xlat magic doesn't quite know how to handle the union so
+     * we need to fix things up here.
+     */
+#define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+    u = cmp.op;
+
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
+    do                                                                \
+    {                                                                 \
+        if ( !compat_handle_is_null((_s_)->ranges) )                  \
+        {                                                             \
+            typeof(cmp.u.query_reserved.nr_entries) *nr_entries =     \
+                COMPAT_ARG_XLAT_VIRT_BASE;                            \
+            xen_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                             \
+                                                                      \
+            if ( sizeof(*nr_entries) +                                \
+                 (sizeof(*ranges) * (_s_)->nr_entries) >              \
+                 COMPAT_ARG_XLAT_SIZE )                               \
+                return -E2BIG;                                        \
+                                                                      \
+            *nr_entries = (_s_)->nr_entries;                          \
+            set_xen_guest_handle((_d_)->ranges, ranges);              \
+        }                                                             \
+        else                                                          \
+            set_xen_guest_handle((_d_)->ranges, NULL);                \
+    } while (false)
+
     XLAT_iommu_op(&nat, &cmp);
 
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+#undef XLAT_iommu_op_u_query_reserved
+
     iommu_op(&nat);
 
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)               \
+    do                                                                   \
+    {                                                                    \
+        if ( !compat_handle_is_null((_d_)->ranges) )                     \
+        {                                                                \
+            typeof(cmp.u.query_reserved.nr_entries) *nr_entries =        \
+                COMPAT_ARG_XLAT_VIRT_BASE;                               \
+            compat_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                                \
+            unsigned int nr =                                            \
+                min_t(unsigned int, (_d_)->nr_entries, *nr_entries);     \
+                                                                         \
+            if ( __copy_to_compat_offset((_d_)->ranges, 0, ranges, nr) ) \
+                cmp.status = -EFAULT;                                    \
+        }                                                                \
+    } while (false)
+
     /*
-     * Avoid the full (and lengthy) XLAT code as the only thing that
-     * needs copying back is the status field.
+     * Avoid the full (and lengthy) XLAT code as the only things that
+     * need copying back are the reserved ranges (in the case of the
+     * query op) and the status field (for all ops).
      */
     cmp.status = nat.status;
 
+    if ( cmp.op == XEN_IOMMUOP_query_reserved )
+    {
+        XLAT_iommu_op_query_reserved_HNDL_ranges(&cmp.u.query_reserved,
+                                                 &nat.u.query_reserved);
+
+        cmp.u.query_reserved.nr_entries = nat.u.query_reserved.nr_entries;
+
+        if ( __copy_field_to_compat(h, &cmp, u.query_reserved.nr_entries) )
+            return -EFAULT;
+    }
+
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+
     if ( __copy_field_to_compat(h, &cmp, status) )
         return -EFAULT;
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index c3b68f665a..001f515bb3 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -25,11 +25,50 @@
 
 #include "xen.h"
 
+typedef uint64_t xen_dfn_t;
+
+/* Structure describing a single range reserved in the IOMMU */
+struct xen_iommu_reserved_range {
+    xen_dfn_t start_dfn;
+    uint32_t nr_frames;
+    uint32_t pad;
+};
+typedef struct xen_iommu_reserved_range xen_iommu_reserved_range_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_reserved_range_t);
+
+/*
+ * XEN_IOMMUOP_query_reserved: Query ranges reserved in the IOMMU.
+ */
+#define XEN_IOMMUOP_query_reserved 1
+
+struct xen_iommu_op_query_reserved {
+    /*
+     * IN/OUT - On entry this is the number of entries available
+     *          in the ranges array below.
+     *          On exit this is the actual number of reserved ranges.
+     */
+    uint32_t nr_entries;
+    uint32_t pad;
+    /*
+     * OUT - This array is populated with reserved ranges. If it is
+     *       not sufficiently large then available entries are populated,
+     *       but the op status code will be set to -ENOBUFS.
+     *       It is permissible to set this to NULL if nr_entries is also
+     *       set to zero. In this case, on exit, nr_entries will still be
+     *       set to the actual number of reserved ranges but the status
+     *       code will be set to zero.
+     */
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
     int32_t status; /* op completion status: */
                     /* 0 for success, otherwise negative errno */
+    union {
+        struct xen_iommu_op_query_reserved query_reserved;
+    } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3b15c18c4e..d2f9b1034b 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,6 +79,8 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_query_reserved		iommu_op.h
+?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
-- 
2.11.0



* [PATCH v7 4/7] iommu: introduce iommu_map_page_nocrash
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Jan Beulich

To allow a subsequent patch to map DFNs specified by hypercall, there
needs to be an iommu_op wrapper function that does not contain an implicit
domain_crash(). This patch introduces that function.
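
This allows the hypercall handler added later in the series to surface a
mapping failure in the op status rather than crashing the calling domain,
along these lines (sketch):

    /* failure is reported in the op status, not via domain_crash() */
    rc = iommu_map_page_nocrash(currd, dfn, page_to_mfn(page), flags);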

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>

v7:
 - New in v7.
---
 xen/drivers/passthrough/iommu.c | 15 +++++++++++----
 xen/include/xen/iommu.h         |  2 ++
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 9f07112367..bc67cfe843 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -313,16 +313,23 @@ void iommu_domain_destroy(struct domain *d)
     hd->reserved_ranges = NULL;
 }
 
-int iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
-                   unsigned int flags)
+int iommu_map_page_nocrash(struct domain *d, dfn_t dfn, mfn_t mfn,
+                           unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
-    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, dfn, mfn, flags);
+    return hd->platform_ops->map_page(d, dfn, mfn, flags);
+}
+
+int iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
+                   unsigned int flags)
+{
+    int rc;
+
+    rc = iommu_map_page_nocrash(d, dfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 9113f37b85..1bf311624c 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -89,6 +89,8 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
+int __must_check iommu_map_page_nocrash(struct domain *d, dfn_t dfn,
+                                        mfn_t mfn, unsigned int flags);
 int __must_check iommu_map_page(struct domain *d, dfn_t dfn,
                                 mfn_t mfn, unsigned int flags);
 int __must_check iommu_unmap_page(struct domain *d, dfn_t dfn);
-- 
2.11.0



* [PATCH v7 5/7] iommu / vtd: introduce a new 'refcount' flag...
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, Jan Beulich

...to mean 'the page (being) mapped is reference counted'.

An important prerequisite for PV-IOMMU mapping is being able to tell the
difference between IOMMU entries added at start-of-day by Xen and those
that have been added by a PV-IOMMU map operation. The reason for this is
that the pages for the former do not have an extra reference taken prior to
mapping whereas the latter do (for safety/security reasons).

This patch therefore introduces a new IOMMUF_refcount flag that the
subsequent patch adding the PV-IOMMU map operation will use to mark
entries that it adds. When the VT-d mapping code encounters this flag
it will set a bit in the IOMMU PTE that is ignored by the IOMMU itself,
such that a subsequent lookup operation can determine whether the mapped
page was reference counted or not (and hence forbid a PV-IOMMU unmap
operation in the latter case).

A subsequent patch will implement a similar PTE bit for AMD IOMMUs.
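
The flag is consumed by the map/unmap patch later in the series along
these lines: an entry found by a lookup is only eligible for unmap if it
was created by a PV-IOMMU map operation:

    rc = iommu_lookup_page(currd, dfn, &mfn, &flags);

    /* Treat a non-reference-counted entry as non-existent */
    if ( !rc )
        rc = !(flags & IOMMUF_refcount) ? -ENOENT : 0;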

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jan Beulich <jbeulich@suse.com>

v7:
 - New in v7.
---
 xen/drivers/passthrough/vtd/iommu.c |  5 +++++
 xen/drivers/passthrough/vtd/iommu.h | 17 +++++++++++------
 xen/include/xen/iommu.h             |  2 ++
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index d8873167e1..cb264d8af4 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1810,6 +1810,10 @@ static int __must_check intel_iommu_map_page(struct domain *d,
     if ( iommu_snoop )
         dma_set_pte_snp(new);
 
+    /* If the page has been referenced for mapping then mark it as such */
+    if ( flags & IOMMUF_refcount )
+        dma_set_pte_refcnt(new);
+
     if ( old.val == new.val )
     {
         spin_unlock(&hd->arch.mapping_lock);
@@ -1879,6 +1883,7 @@ static int intel_iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
     *mfn = maddr_to_mfn(dma_pte_addr(val));
     *flags = dma_pte_read(val) ? IOMMUF_readable : 0;
     *flags |= dma_pte_write(val) ? IOMMUF_writable : 0;
+    *flags |= dma_pte_refcnt(val) ? IOMMUF_refcount : 0;
 
     return 0;
 }
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 1a992f72d6..880eebaed3 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -253,21 +253,25 @@ struct context_entry {
  * 1: writable
  * 2-6: reserved
  * 7: super page
- * 8-11: available
+ * 8-9: reserved
+ * 10: ignored by h/w (used for refcount flag)
+ * 11: snoop control
  * 12-63: Host physcial address
  */
 struct dma_pte {
     u64 val;
 };
-#define DMA_PTE_READ (1)
-#define DMA_PTE_WRITE (2)
-#define DMA_PTE_PROT (DMA_PTE_READ | DMA_PTE_WRITE)
-#define DMA_PTE_SP   (1 << 7)
-#define DMA_PTE_SNP  (1 << 11)
+#define DMA_PTE_READ   (1)
+#define DMA_PTE_WRITE  (2)
+#define DMA_PTE_PROT   (DMA_PTE_READ | DMA_PTE_WRITE)
+#define DMA_PTE_SP     (1 << 7)
+#define DMA_PTE_REFCNT (1 << 10)
+#define DMA_PTE_SNP    (1 << 11)
 #define dma_clear_pte(p)    do {(p).val = 0;} while(0)
 #define dma_set_pte_readable(p) do {(p).val |= DMA_PTE_READ;} while(0)
 #define dma_set_pte_writable(p) do {(p).val |= DMA_PTE_WRITE;} while(0)
 #define dma_set_pte_superpage(p) do {(p).val |= DMA_PTE_SP;} while(0)
+#define dma_set_pte_refcnt(p) do {(p).val |= DMA_PTE_REFCNT;} while(0)
 #define dma_set_pte_snp(p)  do {(p).val |= DMA_PTE_SNP;} while(0)
 #define dma_set_pte_prot(p, prot) do { \
         (p).val = ((p).val & ~DMA_PTE_PROT) | ((prot) & DMA_PTE_PROT); \
@@ -280,6 +284,7 @@ struct dma_pte {
             (p).val |= ((addr) & PAGE_MASK_4K); } while (0)
 #define dma_pte_present(p) (((p).val & DMA_PTE_PROT) != 0)
 #define dma_pte_superpage(p) (((p).val & DMA_PTE_SP) != 0)
+#define dma_pte_refcnt(p) (((p).val & DMA_PTE_REFCNT) != 0)
 
 /* interrupt remap entry */
 struct iremap_entry {
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 1bf311624c..a56d03b719 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -89,6 +89,8 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
+#define _IOMMUF_refcount 2
+#define IOMMUF_refcount  (1u<<_IOMMUF_refcount)
 int __must_check iommu_map_page_nocrash(struct domain *d, dfn_t dfn,
                                         mfn_t mfn, unsigned int flags);
 int __must_check iommu_map_page(struct domain *d, dfn_t dfn,
-- 
2.11.0



* [PATCH v7 6/7] x86: add xen_iommu_ops to modify IOMMU mappings
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich

This patch adds iommu_ops to add (map) or remove (unmap) frames in the
domain's IOMMU mappings.

Currently the flags value for each op must include the
XEN_IOMMUOP_map/unmap_all flag as the implementation does not yet support
per-device mappings. The sbdf field of each hypercall is accordingly
ignored.

Mappings added by the map operation are tracked and only those mappings
may be removed by a subsequent unmap operation. Frames are specified by the
owning domain and GFN. It is, of course, permissible for a domain to map
and unmap its own frames using DOMID_SELF.

NOTE: The owning domain and GFN must also be specified in the unmap
      operation, as well as the DFN, so that they can be cross-checked
      with the existing mapping.
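
By way of illustration, the field names referenced by the handler code
below mean that a guest would populate a map op roughly as follows (a
sketch; d and g are the chosen DFN and GFN respectively):

    struct xen_iommu_op op = {
        .op = XEN_IOMMUOP_map,
        .u.map = {
            .domid = DOMID_SELF,
            .flags = XEN_IOMMUOP_map_all, /* sbdf is ignored for now */
            .dfn = d,
            .gfn = g,
        },
    };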

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v7:
 - Get rid of explicit flush xen_iommu_op. Flush at the end of a batch
   instead.

v6:
 - Add placeholder sbdf field and flag to control scope of map, unmap and
   flush.

v4:
 - Fixed logic inversion when checking return of iommu_unmap_page().

v3:
 - Add type pinning.

v2:
 - Heavily re-worked in v2, including explicit tracking of mappings.
   This avoids the need to clear non-reserved mappings from IOMMU
   at start of day, which would be prohibitively slow on a large host.
---
 xen/common/iommu_op.c           | 280 ++++++++++++++++++++++++++++++++++++++--
 xen/drivers/passthrough/iommu.c |   2 +
 xen/include/public/iommu_op.h   |  98 ++++++++++++++
 xen/include/xen/iommu.h         |   6 +
 xen/include/xlat.lst            |   3 +
 5 files changed, 376 insertions(+), 13 deletions(-)

diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
index 9d914a67db..0876414df5 100644
--- a/xen/common/iommu_op.c
+++ b/xen/common/iommu_op.c
@@ -78,7 +78,205 @@ static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
     return 0;
 }
 
-static void iommu_op(xen_iommu_op_t *op)
+static int iommu_op_enable_modification(
+    struct xen_iommu_op_enable_modification *op)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    const struct iommu_ops *ops = iommu->platform_ops;
+    int rc;
+
+    if ( op->cap || op->pad )
+        return -EINVAL;
+
+    spin_lock(&iommu->lock);
+
+    /* Has modification already been enabled? */
+    rc = 0;
+    if ( iommu->domain_control )
+        goto unlock;
+
+    /*
+     * Modification of IOMMU mappings cannot be put under domain control if:
+     * - this domain does not have IOMMU page tables, or
+     * - HAP is enabled for this domain and the IOMMU shares the tables.
+     */
+    rc = -EACCES;
+    if ( !has_iommu_pt(currd) || iommu_use_hap_pt(currd) )
+        goto unlock;
+
+    /*
+     * The IOMMU implementation must provide the lookup method if
+     * modification of the mappings is to be supported.
+     */
+    rc = -EOPNOTSUPP;
+    if ( !ops->lookup_page )
+        goto unlock;
+
+    rc = 0;
+    iommu->need_sync = false; /* Disable synchronization, if enabled */
+    iommu->domain_control = true; /* Enable control */
+
+ unlock:
+    /*
+     * XEN_IOMMU_CAP_per_device_mappings is not supported yet so we can
+     * leave op->cap alone.
+     */
+
+    spin_unlock(&iommu->lock);
+
+    return rc;
+}
+
+static int iommuop_map(struct xen_iommu_op_map *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
+    dfn_t dfn = _dfn(op->dfn);
+    p2m_type_t p2mt;
+    struct page_info *page;
+    mfn_t ignore;
+    unsigned int flags;
+    int rc;
+
+    if ( op->pad || (op->flags & ~(XEN_IOMMUOP_map_all |
+                                   XEN_IOMMUOP_map_readonly)) )
+        return -EINVAL;
+
+    if ( !iommu->domain_control )
+        return -EOPNOTSUPP;
+
+    /* Per-device mapping not yet supported */
+    if ( !(op->flags & XEN_IOMMUOP_map_all) )
+        return -EINVAL;
+
+    /* Check whether the specified DFN falls in a reserved region */
+    if ( rangeset_contains_singleton(iommu->reserved_ranges, dfn_x(dfn)) )
+        return -EINVAL;
+
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    rc = check_get_page_from_gfn(d, _gfn(op->gfn), readonly, &p2mt, &page);
+    if ( rc )
+        goto unlock_domain;
+
+    rc = -EINVAL;
+    if ( p2mt != p2m_ram_rw ||
+         (!readonly && !get_page_type(page, PGT_writable_page)) )
+    {
+        put_page(page);
+        goto unlock_domain;
+    }
+
+    spin_lock(&iommu->lock);
+
+    rc = iommu_lookup_page(currd, dfn, &ignore, &flags);
+
+    /* Treat a non-reference-counted entry as non-existent */
+    if ( !rc )
+        rc = !(flags & IOMMUF_refcount) ? -ENOENT : -EEXIST;
+
+    if ( rc != -ENOENT )
+        goto unlock_iommu;
+
+    flags = IOMMUF_readable | IOMMUF_refcount;
+    if ( !readonly )
+        flags |= IOMMUF_writable;
+
+    rc = iommu_map_page_nocrash(currd, dfn, page_to_mfn(page), flags);
+
+ unlock_iommu:
+    spin_unlock(&iommu->lock);
+
+    if ( rc ) /* drop the references taken above on failure */
+    {
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
+
+ unlock_domain:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
+static int iommuop_unmap(struct xen_iommu_op_unmap *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    dfn_t dfn = _dfn(op->dfn);
+    mfn_t mfn;
+    unsigned int flags;
+    bool readonly;
+    p2m_type_t p2mt;
+    struct page_info *page;
+    int rc;
+
+    if ( op->pad ||
+         (op->flags & ~XEN_IOMMUOP_unmap_all) )
+        return -EINVAL;
+
+    if ( !iommu->domain_control )
+        return -EOPNOTSUPP;
+
+    /* Per-device unmapping not yet supported */
+    if ( !(op->flags & XEN_IOMMUOP_unmap_all) )
+        return -EINVAL;
+
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    spin_lock(&iommu->lock);
+
+    rc = iommu_lookup_page(currd, dfn, &mfn, &flags);
+
+    /* Treat a non-reference-counted entry as non-existent */
+    if ( !rc )
+        rc = !(flags & IOMMUF_refcount) ? -ENOENT : 0;
+
+    if ( rc )
+        goto unlock;
+
+    readonly = !(flags & IOMMUF_writable);
+
+    /* Make sure the mapped frame matches */
+    rc = check_get_page_from_gfn(d, _gfn(op->gfn), readonly, &p2mt, &page);
+    if ( rc )
+        goto unlock;
+
+    rc = !mfn_eq(mfn, page_to_mfn(page)) ? -EINVAL : 0;
+
+    /* Release reference taken above */
+    put_page(page);
+
+    if ( rc )
+        goto unlock;
+
+    /* Release references taken in map */
+    if ( !readonly )
+        put_page_type(page);
+    put_page(page);
+
+    /*
+     * This really should not fail. If it does, there is an implicit
+     * domain_crash() (except in the case of the hardware domain) since
+     * there is not a lot else that can be done to ensure the released
+     * page can be safely re-used.
+     */
+    rc = iommu_unmap_page(currd, dfn);
+
+ unlock:
+    spin_unlock(&iommu->lock);
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+static void iommu_op(xen_iommu_op_t *op, bool *need_flush)
 {
     switch ( op->op )
     {
@@ -86,13 +284,30 @@ static void iommu_op(xen_iommu_op_t *op)
         op->status = iommu_op_query_reserved(&op->u.query_reserved);
         break;
 
+    case XEN_IOMMUOP_enable_modification:
+        op->status =
+            iommu_op_enable_modification(&op->u.enable_modification);
+        break;
+
+    case XEN_IOMMUOP_map:
+        op->status = iommuop_map(&op->u.map);
+        if ( !op->status )
+            *need_flush = true;
+        break;
+
+    case XEN_IOMMUOP_unmap:
+        op->status = iommuop_unmap(&op->u.unmap);
+        if ( !op->status )
+            *need_flush = true;
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
     }
 }
 
-int do_one_iommu_op(xen_iommu_op_buf_t *buf)
+int do_one_iommu_op(xen_iommu_op_buf_t *buf, bool *need_flush)
 {
     const XEN_GUEST_HANDLE(xen_iommu_op_t) h =
         guest_handle_cast(buf->h, xen_iommu_op_t);
@@ -101,6 +316,10 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] =
             sizeof(struct xen_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] =
+            sizeof(struct xen_iommu_op_enable_modification),
+        [XEN_IOMMUOP_map] = sizeof(struct xen_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct xen_iommu_op_unmap),
     };
     size_t size;
     int rc;
@@ -130,10 +349,12 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     if ( copy_from_guest_offset((void *)&op.u, buf->h, offset, size) )
         return -EFAULT;
 
-    iommu_op(&op);
+    iommu_op(&op, need_flush);
 
-    if ( op.op == XEN_IOMMUOP_query_reserved &&
-         __copy_field_to_guest(h, &op, u.query_reserved.nr_entries) )
+    if ( (op.op == XEN_IOMMUOP_query_reserved &&
+          __copy_field_to_guest(h, &op, u.query_reserved.nr_entries)) ||
+         (op.op == XEN_IOMMUOP_enable_modification &&
+          __copy_field_to_guest(h, &op, u.enable_modification.cap)) )
         return -EFAULT;
 
     if ( __copy_field_to_guest(h, &op, status) )
@@ -146,8 +367,11 @@ long do_iommu_op(unsigned int nr_bufs,
                  XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs)
 {
     unsigned int i;
+    bool need_flush = false;
     long rc = 0;
 
+    this_cpu(iommu_dont_flush_iotlb) = 1;
+
     for ( i = 0; i < nr_bufs; i++ )
     {
         xen_iommu_op_buf_t buf;
@@ -164,11 +388,13 @@ long do_iommu_op(unsigned int nr_bufs,
             break;
         }
 
-        rc = do_one_iommu_op(&buf);
+        rc = do_one_iommu_op(&buf, &need_flush);
         if ( rc )
             break;
     }
 
+    this_cpu(iommu_dont_flush_iotlb) = 0;
+
     if ( rc > 0 )
     {
         ASSERT(rc < nr_bufs);
@@ -177,7 +403,8 @@ long do_iommu_op(unsigned int nr_bufs,
 
         rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
                                            "ih", nr_bufs, bufs);
-    }
+    } else if ( !rc && need_flush )
+        rc = iommu_iotlb_flush_all(current->domain);
 
     return rc;
 }
@@ -186,7 +413,7 @@ long do_iommu_op(unsigned int nr_bufs,
 
 CHECK_iommu_reserved_range;
 
-int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
+int compat_one_iommu_op(compat_iommu_op_buf_t *buf, bool *need_flush)
 {
     const COMPAT_HANDLE(compat_iommu_op_t) h =
         compat_handle_cast(buf->h, compat_iommu_op_t);
@@ -195,6 +422,10 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] =
             sizeof(struct compat_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] =
+            sizeof(struct compat_iommu_op_enable_modification),
+        [XEN_IOMMUOP_map] = sizeof(struct compat_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct compat_iommu_op_unmap),
     };
     size_t size;
     xen_iommu_op_t nat;
@@ -228,9 +459,15 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
 
     /*
      * The xlat magic doesn't quite know how to handle the union so
-     * we need to fix things up here.
+     * we need to fix things up here. Also, none of the sub-ops, apart
+     * from query_reserved, actually need any translation, but the xlat
+     * magic can't deal with that either, so all sub-ops must be marked
+     * for translation in xlat.lst.
      */
 #define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+#define XLAT_iommu_op_u_enable_modification XEN_IOMMUOP_enable_modification
+#define XLAT_iommu_op_u_map XEN_IOMMUOP_map
+#define XLAT_iommu_op_u_unmap XEN_IOMMUOP_unmap
     u = cmp.op;
 
 #define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
@@ -258,9 +495,12 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     XLAT_iommu_op(&nat, &cmp);
 
 #undef XLAT_iommu_op_query_reserved_HNDL_ranges
+#undef XLAT_iommu_op_u_unmap
+#undef XLAT_iommu_op_u_map
+#undef XLAT_iommu_op_u_enable_modification
 #undef XLAT_iommu_op_u_query_reserved
 
-    iommu_op(&nat);
+    iommu_op(&nat, need_flush);
 
 #define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)               \
     do                                                                   \
@@ -282,7 +522,8 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     /*
      * Avoid the full (and lengthy) XLAT code as the only things that
      * need copying back are the reserved ranges (in the case of the
-     * query op) and the status field (for all ops).
+     * query op), capabilities (in the case of the enable op) and the
+     * status field (for all ops).
      */
     cmp.status = nat.status;
 
@@ -296,6 +537,13 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
         if ( __copy_field_to_compat(h, &cmp, u.query_reserved.nr_entries) )
             return -EFAULT;
     }
+    else if ( cmp.op == XEN_IOMMUOP_enable_modification )
+    {
+        cmp.u.enable_modification.cap = nat.u.enable_modification.cap;
+
+        if ( __copy_field_to_compat(h, &cmp, u.enable_modification.cap) )
+            return -EFAULT;
+    }
 
 #undef XLAT_iommu_op_query_reserved_HNDL_ranges
 
@@ -309,8 +557,11 @@ int compat_iommu_op(unsigned int nr_bufs,
                     XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs)
 {
     unsigned int i;
+    bool need_flush = false;
     long rc = 0;
 
+    this_cpu(iommu_dont_flush_iotlb) = 1;
+
     for ( i = 0; i < nr_bufs; i++ )
     {
         compat_iommu_op_buf_t buf;
@@ -327,11 +578,13 @@ int compat_iommu_op(unsigned int nr_bufs,
             break;
         }
 
-        rc = compat_one_iommu_op(&buf);
+        rc = compat_one_iommu_op(&buf, &need_flush);
         if ( rc )
             break;
     }
 
+    this_cpu(iommu_dont_flush_iotlb) = 0;
+
     if ( rc > 0 )
     {
         ASSERT(rc < nr_bufs);
@@ -340,7 +593,8 @@ int compat_iommu_op(unsigned int nr_bufs,
 
         rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
                                            "ih", nr_bufs, bufs);
-    }
+    } else if ( !rc && need_flush )
+        rc = iommu_iotlb_flush_all(current->domain);
 
     return rc;
 }
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index bc67cfe843..47c608cc89 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -184,6 +184,8 @@ int iommu_domain_init(struct domain *d)
     if ( !hd->reserved_ranges )
         return -ENOMEM;
 
+    spin_lock_init(&hd->lock);
+
     hd->platform_ops = iommu_get_ops();
     return hd->platform_ops->init(d);
 }
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index 001f515bb3..fcca47e8d2 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -61,6 +61,101 @@ struct xen_iommu_op_query_reserved {
     XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
 };
 
+/*
+ * XEN_IOMMUOP_enable_modification: Enable operations that modify IOMMU
+ *                                  mappings.
+ */
+#define XEN_IOMMUOP_enable_modification 2
+
+struct xen_iommu_op_enable_modification {
+    /*
+     * OUT - On successful return this is set to the bitwise OR of capabilities
+     *       defined below. On entry this must be set to zero.
+     */
+    uint32_t cap;
+    uint32_t pad;
+
+    /* Does the implementation support per-device mappings? */
+#define _XEN_IOMMU_CAP_per_device_mappings 0
+#define XEN_IOMMU_CAP_per_device_mappings (1u << _XEN_IOMMU_CAP_per_device_mappings)
+};
+
+/*
+ * XEN_IOMMUOP_map: Map a guest page in the IOMMU.
+ */
+#define XEN_IOMMUOP_map 3
+
+struct xen_iommu_op_map {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    /*
+     * IN - flags controlling the mapping. This should be a bitwise OR of the
+     *      flags defined below.
+     */
+    uint16_t flags;
+
+    /*
+     * Should the mapping be created for all initiators?
+     *
+     * NOTE: This flag is currently required as the implementation does not yet
+     *       support per-device mappings.
+     */
+#define _XEN_IOMMUOP_map_all 0
+#define XEN_IOMMUOP_map_all (1 << (_XEN_IOMMUOP_map_all))
+
+    /* Should the mapping be read-only to the initiator(s)? */
+#define _XEN_IOMMUOP_map_readonly 1
+#define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
+
+    uint32_t pad;
+    /*
+     * IN - Segment/Bus/Device/Function of the initiator.
+     *
+     * NOTE: This is ignored if XEN_IOMMUOP_map_all is set.
+     */
+    uint64_t sbdf;
+    /* IN - The IOMMU frame number which will hold the new mapping */
+    xen_dfn_t dfn;
+    /* IN - The guest frame number of the page to be mapped */
+    xen_pfn_t gfn;
+};
+
+/*
+ * XEN_IOMMUOP_unmap: Remove a mapping in the IOMMU.
+ */
+#define XEN_IOMMUOP_unmap 4
+
+struct xen_iommu_op_unmap {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    /*
+     * IN - flags controlling the unmapping. This should be a bitwise OR of the
+     *      flags defined below.
+     */
+    uint16_t flags;
+
+    /*
+     * Should the mapping be destroyed for all initiators?
+     *
+     * NOTE: This flag is currently required as the implementation does not yet
+     *       support per-device mappings.
+     */
+#define _XEN_IOMMUOP_unmap_all 0
+#define XEN_IOMMUOP_unmap_all (1 << (_XEN_IOMMUOP_unmap_all))
+
+    uint32_t pad;
+    /*
+     * IN - Segment/Bus/Device/Function of the initiator.
+     *
+     * NOTE: This is ignored if XEN_IOMMUOP_unmap_all is set.
+     */
+    uint64_t sbdf;
+    /* IN - The IOMMU frame number which holds the mapping to be removed */
+    xen_dfn_t dfn;
+    /* IN - The guest frame number of the page that is mapped */
+    xen_pfn_t gfn;
+};
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
@@ -68,6 +163,9 @@ struct xen_iommu_op {
                     /* 0 for success otherwise, negative errno */
     union {
         struct xen_iommu_op_query_reserved query_reserved;
+        struct xen_iommu_op_enable_modification enable_modification;
+        struct xen_iommu_op_map map;
+        struct xen_iommu_op_unmap unmap;
     } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index a56d03b719..a04c312aeb 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -143,6 +143,12 @@ struct domain_iommu {
      * must not be modified after initialization.
      */
     struct rangeset *reserved_ranges;
+
+    /*
+     * PV-IOMMU fields
+     */
+    bool domain_control;
+    spinlock_t lock;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index d2f9b1034b..3f5b0ac004 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,7 +79,10 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_enable_modification	iommu_op.h
+!	iommu_op_map			iommu_op.h
 !	iommu_op_query_reserved		iommu_op.h
+!	iommu_op_unmap			iommu_op.h
 ?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
-- 
2.11.0



* [PATCH v7 7/7] x86: extend the map and unmap xen_iommu_ops to support grant references
  2018-10-15 10:35 [PATCH v7 0/7] Paravirtual IOMMU Interface Paul Durrant
                   ` (5 preceding siblings ...)
  2018-10-15 10:35 ` [PATCH v7 6/7] x86: add xen_iommu_ops to modify IOMMU mappings Paul Durrant
@ 2018-10-15 10:35 ` Paul Durrant
  2018-11-12 15:06   ` Jan Beulich
  6 siblings, 1 reply; 15+ messages in thread
From: Paul Durrant @ 2018-10-15 10:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich

This patch allows a domain to add or remove foreign frames from its
IOMMU mappings by grant reference as well as GFN. This is necessary,
for example, to support a PV network backend that needs to construct a
packet buffer that can be directly accessed by a NIC.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v7:
 - Re-base.

v6:
 - Re-base.

v2:
 - New in v2.
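
For illustration, a backend wanting to map a granted frame might issue
something like the following (a sketch only - the guest-side hypercall
wrapper and the frontend-supplied domid/gref values are assumed rather
than defined here):

  xen_iommu_op_t op = {
      .op = XEN_IOMMUOP_map,
      .u.map.domid = frontend_domid,  /* communicated by other means */
      .u.map.flags = XEN_IOMMUOP_map_all | XEN_IOMMUOP_map_gref,
      .u.map.dfn = dfn,               /* bus address chosen by the backend */
      .u.map.u.gref = gref,           /* as granted by the frontend */
  };

  /* wrap 'op' in a xen_iommu_op_buf_t and issue HYPERVISOR_iommu_op(1, ...) */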
---
 xen/common/grant_table.c      | 151 ++++++++++++++++++++++++++++++++++++++++++
 xen/common/iommu_op.c         | 100 +++++++++++++++++++---------
 xen/include/public/iommu_op.h |  29 ++++++--
 xen/include/xen/grant_table.h |   5 ++
 4 files changed, 248 insertions(+), 37 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 878e668bf5..f7c9333f60 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3927,6 +3927,157 @@ int gnttab_get_status_frame(struct domain *d, unsigned long idx,
     return rc;
 }
 
+int
+acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t *mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref > nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EACCES;
+    if ( (shah->flags & GTF_type_mask) != GTF_permit_access ||
+         (shah->flags & GTF_sub_page) )
+        goto release;
+
+    rc = -ERANGE;
+    if ( act->pin && ((act->domid != currd->domain_id) ||
+                      (act->pin & 0x80808080U) != 0) )
+        goto release;
+
+    rc = -EINVAL;
+    if ( !act->pin ||
+         (!readonly && !(act->pin & GNTPIN_devw_mask)) ) {
+        if ( _set_status(gt->gt_version, currd->domain_id, readonly,
+                         0, shah, act, status) != GNTST_okay )
+            goto release;
+    }
+
+    if ( !act->pin )
+    {
+        gfn_t gfn = gt->gt_version == 1 ?
+            _gfn(shared_entry_v1(gt, gref).frame) :
+            _gfn(shared_entry_v2(gt, gref).full_page.frame);
+        p2m_type_t p2mt;
+        struct page_info *page;
+
+        rc = check_get_page_from_gfn(d, gfn, readonly, &p2mt, &page);
+        if ( rc )
+            goto clear;
+
+        rc = -EINVAL;
+        if ( p2m_is_foreign(p2mt) )
+        {
+            put_page(page);
+            goto clear;
+        }
+
+        act_set_gfn(act, gfn);
+        act->mfn = page_to_mfn(page);
+        act->domid = currd->domain_id;
+        act->start = 0;
+        act->length = PAGE_SIZE;
+        act->is_sub_page = false;
+        act->trans_domain = d;
+        act->trans_gref = gref;
+    }
+    else
+    {
+        ASSERT(mfn_valid(act->mfn));
+        if ( !get_page(mfn_to_page(act->mfn), d) )
+            goto clear;
+    }
+
+    rc = 0;
+    act->pin += readonly ? GNTPIN_devr_inc : GNTPIN_devw_inc;
+    *mfn = act->mfn;
+    goto release;
+
+ clear:
+    if ( !readonly && !(act->pin & GNTPIN_devw_mask) )
+        gnttab_clear_flag(_GTF_writing, status);
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
+int
+release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref > nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EINVAL;
+    if ( !act->pin || (act->domid != currd->domain_id) ||
+         !mfn_eq(act->mfn, mfn) )
+        goto release;
+
+    rc = 0;
+    if ( readonly )
+        act->pin -= GNTPIN_devr_inc;
+    else
+    {
+        gnttab_mark_dirty(d, mfn);
+
+        act->pin -= GNTPIN_devw_inc;
+        if ( !(act->pin & GNTPIN_devw_mask) )
+            gnttab_clear_flag(_GTF_writing, status);
+    }
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+    put_page(mfn_to_page(mfn));
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
     int first = 1;
diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
index 0876414df5..5e8d8c2e7a 100644
--- a/xen/common/iommu_op.c
+++ b/xen/common/iommu_op.c
@@ -21,6 +21,7 @@
 
 #include <xen/event.h>
 #include <xen/guest_access.h>
+#include <xen/grant_table.h>
 #include <xen/hypercall.h>
 #include <xen/nospec.h>
 
@@ -134,14 +135,13 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     struct domain_iommu *iommu = dom_iommu(currd);
     bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
     dfn_t dfn = _dfn(op->dfn);
-    p2m_type_t p2mt;
-    struct page_info *page;
-    mfn_t ignore;
+    mfn_t mfn, ignore;
     unsigned int flags;
     int rc;
 
     if ( op->pad || (op->flags & ~(XEN_IOMMUOP_map_all |
-                                   XEN_IOMMUOP_map_readonly)) )
+                                   XEN_IOMMUOP_map_readonly |
+                                   XEN_IOMMUOP_map_gref)) )
         return -EINVAL;
 
     if ( !iommu->domain_control )
@@ -159,16 +159,31 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     if ( !d )
         return -ESRCH;
 
-    rc = check_get_page_from_gfn(d, _gfn(op->gfn), readonly, &p2mt, &page);
-    if ( rc )
-        goto unlock_domain;
-
-    rc = -EINVAL;
-    if ( p2mt != p2m_ram_rw ||
-         (!readonly && !get_page_type(page, PGT_writable_page)) )
+    if ( op->flags & XEN_IOMMUOP_map_gref )
     {
-        put_page(page);
-        goto unlock_domain;
+        rc = acquire_gref_for_iommu(d, op->u.gref, readonly, &mfn);
+        if ( rc )
+            goto unlock_domain;
+    }
+    else
+    {
+        p2m_type_t p2mt;
+        struct page_info *page;
+
+        rc = check_get_page_from_gfn(d, _gfn(op->u.gfn), readonly, &p2mt,
+                                     &page);
+        if ( rc )
+            goto unlock_domain;
+
+        rc = -EINVAL;
+        if ( p2mt != p2m_ram_rw ||
+             (!readonly && !get_page_type(page, PGT_writable_page)) )
+        {
+            put_page(page);
+            goto unlock_domain;
+        }
+
+        mfn = page_to_mfn(page);
     }
 
     spin_lock(&iommu->lock);
@@ -186,16 +201,23 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     if ( !readonly )
         flags |= IOMMUF_writable;
 
-    rc = iommu_map_page_nocrash(currd, dfn, page_to_mfn(page), flags);
+    rc = iommu_map_page_nocrash(currd, dfn, mfn, flags);
 
  unlock_iommu:
     spin_unlock(&iommu->lock);
 
     if ( rc ) /* references are retained only if the mapping succeeded */
     {
-        if ( !readonly )
-            put_page_type(page);
-        put_page(page);
+        if ( op->flags & XEN_IOMMUOP_map_gref )
+            release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+        else
+        {
+            struct page_info *page = mfn_to_page(mfn);
+
+            if ( !readonly )
+                put_page_type(page);
+            put_page(page);
+        }
     }
 
  unlock_domain:
@@ -211,12 +233,11 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
     mfn_t mfn;
     unsigned int flags;
     bool readonly;
-    p2m_type_t p2mt;
-    struct page_info *page;
     int rc;
 
     if ( op->pad ||
-         (op->flags & ~XEN_IOMMUOP_unmap_all) )
+         (op->flags & ~(XEN_IOMMUOP_unmap_all |
+                        XEN_IOMMUOP_unmap_gref)) )
         return -EINVAL;
 
     if ( !iommu->domain_control )
@@ -243,23 +264,36 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
 
     readonly = !(flags & IOMMUF_writable);
 
-    /* Make sure the mapped frame matches */
-    rc = check_get_page_from_gfn(d, _gfn(op->gfn), readonly, &p2mt, &page);
-    if ( rc )
-        goto unlock;
+    if ( op->flags & XEN_IOMMUOP_unmap_gref )
+    {
+        rc = release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+        if ( rc )
+            goto unlock;
+    }
+    else
+    {
+        p2m_type_t p2mt;
+        struct page_info *page;
 
-    rc = !mfn_eq(mfn, page_to_mfn(page)) ? -EINVAL : 0;
+        /* Make sure the mapped frame matches */
+        rc = check_get_page_from_gfn(d, _gfn(op->u.gfn), readonly, &p2mt,
+                                     &page);
+        if ( rc )
+            goto unlock;
 
-    /* Release reference taken above */
-    put_page(page);
+        rc = !mfn_eq(mfn, page_to_mfn(page)) ? -EINVAL : 0;
 
-    if ( rc )
-        goto unlock;
+        /* Release reference taken above */
+        put_page(page);
 
-    /* Release references taken in map */
-    if ( !readonly )
-        put_page_type(page);
-    put_page(page);
+        if ( rc )
+            goto unlock;
+
+        /* Release references taken in map */
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
 
     /*
      * This really should not fail. If it does, there is an implicit
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index fcca47e8d2..b49d3ee2b6 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -24,6 +24,7 @@
 #define XEN_PUBLIC_IOMMU_OP_H
 
 #include "xen.h"
+#include "grant_table.h"
 
 typedef uint64_t xen_dfn_t;
 
@@ -107,6 +108,10 @@ struct xen_iommu_op_map {
 #define _XEN_IOMMUOP_map_readonly 1
 #define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
 
+    /* Is the memory specified by gfn or grant reference? */
+#define _XEN_IOMMUOP_map_gref 2
+#define XEN_IOMMUOP_map_gref (1 << (_XEN_IOMMUOP_map_gref))
+
     uint32_t pad;
     /*
      * IN - Segment/Bus/Device/Function of the initiator.
@@ -116,8 +121,14 @@ struct xen_iommu_op_map {
     uint64_t sbdf;
     /* IN - The IOMMU frame number which will hold the new mapping */
     xen_dfn_t dfn;
-    /* IN - The guest frame number of the page to be mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page to
+     * be mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 /*
@@ -143,6 +154,10 @@ struct xen_iommu_op_unmap {
 #define _XEN_IOMMUOP_unmap_all 0
 #define XEN_IOMMUOP_unmap_all (1 << (_XEN_IOMMUOP_unmap_all))
 
+    /* Is the memory specified by gfn or grant reference? */
+#define _XEN_IOMMUOP_unmap_gref 1
+#define XEN_IOMMUOP_unmap_gref (1 << (_XEN_IOMMUOP_unmap_gref))
+
     uint32_t pad;
     /*
      * IN - Segment/Bus/Device/Function of the initiator.
@@ -152,8 +167,14 @@ struct xen_iommu_op_unmap {
     uint64_t sbdf;
     /* IN - The IOMMU frame number which holds the mapping to be removed */
     xen_dfn_t dfn;
-    /* IN - The guest frame number of the page that is mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page that
+     * is mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 struct xen_iommu_op {
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 12e8a4b80b..4a26f5dbe4 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -61,4 +61,9 @@ int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
 int gnttab_get_status_frame(struct domain *d, unsigned long idx,
                             mfn_t *mfn);
 
+int acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                           bool readonly, mfn_t *mfn);
+int release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                           bool readonly, mfn_t mfn);
+
 #endif /* __XEN_GRANT_TABLE_H__ */
-- 
2.11.0



* Re: [PATCH v7 1/7] public / x86: introduce __HYPERCALL_iommu_op
  2018-10-15 10:35 ` [PATCH v7 1/7] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
@ 2018-11-09 11:23   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-11-09 11:23 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 15.10.18 at 12:35, <paul.durrant@citrix.com> wrote:
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
>  obj-y += grant_table.o
>  obj-y += guestcopy.o
>  obj-bin-y += gunzip.init.o
> +obj-$(CONFIG_X86) += iommu_op.o

I'm afraid I didn't notice this placement in earlier versions. Doesn't
the new hypercalls as a prereq take host IOMMU functionality? If
so, why would this not live in e.g. drivers/passthrough/ (under
this or perhaps a different name)?

Also now that we have full PV and HVM separation - is the
hypercall meant to be usable by both types of guests from the
very beginning? Otherwise conditionals will need to be
introduced above (expressed by the common list model if at all
possible) or in the source file.

> --- /dev/null
> +++ b/xen/common/iommu_op.c
> @@ -0,0 +1,193 @@
> +/******************************************************************************
> + * iommu_op.c
> + *
> + * Paravirtualised IOMMU functionality
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (C) 2018 Citrix Systems Inc
> + */
> +
> +#include <xen/event.h>
> +#include <xen/guest_access.h>
> +#include <xen/hypercall.h>
> +
> +static void iommu_op(xen_iommu_op_t *op)
> +{
> +    switch ( op->op )
> +    {
> +    default:
> +        op->status = -EOPNOTSUPP;
> +        break;
> +    }
> +}
> +
> +int do_one_iommu_op(xen_iommu_op_buf_t *buf)

static? const?

> +#ifdef CONFIG_COMPAT
> +
> +int compat_one_iommu_op(compat_iommu_op_buf_t *buf)

Same here.

Jan




* Re: [PATCH v7 2/7] iommu: track reserved ranges using a rangeset
  2018-10-15 10:35 ` [PATCH v7 2/7] iommu: track reserved ranges using a rangeset Paul Durrant
@ 2018-11-09 11:46   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-11-09 11:46 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, Wei Liu, george.dunlap

>>> On 15.10.18 at 12:35, <paul.durrant@citrix.com> wrote:
> @@ -1987,28 +1986,35 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
>              if ( --mrmrr->count )
>                  return 0;
>  
> -            while ( base_pfn < end_pfn )
> +            err = rangeset_remove_range(hd->reserved_ranges,
> +                                        base_pfn, end_pfn);
> +            while ( !err && base_pfn < end_pfn )
>              {
>                  if ( clear_identity_p2m_entry(d, base_pfn) )
> -                    ret = -ENXIO;
> +                    err = -ENXIO;
> +
>                  base_pfn++;
>              }

Please note how previously this loop was not exited early in case of error.
This was intentional and should be retained.

Also shouldn't you remove the frames from the rangeset only when the
P2M entries were cleared successfully? You also add them ahead of
setting P2M entries.

Finally - why the rename from ret to err?

> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -241,6 +241,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>          else
>              rc = iommu_map_page(d, _dfn(pfn), _mfn(pfn),
>                                  IOMMUF_readable | IOMMUF_writable);
> +
> +        if ( !rc && !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
> +            rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges, pfn);

Along the lines of the earlier remark, perhaps better to set this before
doing the mapping? It's not _that_ important for Dom0, but just to not
set bad precedents consistency would seem desirable.
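
I.e. roughly (sketch only):

    if ( !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
        rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges, pfn);

    if ( !rc )
        rc = iommu_map_page(d, _dfn(pfn), _mfn(pfn),
                            IOMMUF_readable | IOMMUF_writable);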

Jan




* Re: [PATCH v7 3/7] x86: add xen_iommu_op to query reserved ranges
  2018-10-15 10:35 ` [PATCH v7 3/7] x86: add xen_iommu_op to query reserved ranges Paul Durrant
@ 2018-11-09 12:01   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-11-09 12:01 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 15.10.18 at 12:35, <paul.durrant@citrix.com> wrote:
> +static int get_reserved(unsigned long s, unsigned long e, void *arg)
> +{
> +    struct get_reserved_ctxt *ctxt = arg;
> +
> +    if ( ctxt->nr_entries < ctxt->max_entries )
> +    {
> +        xen_iommu_reserved_range_t range = {
> +            .start_dfn = s,
> +            .nr_frames = e - s,

Iirc rangeset ends are inclusive, so I think you need to add 1 here.
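
I.e. presumably:

        xen_iommu_reserved_range_t range = {
            .start_dfn = s,
            .nr_frames = e - s + 1,
        };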

> @@ -38,12 +97,20 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
>      const XEN_GUEST_HANDLE(xen_iommu_op_t) h =
>          guest_handle_cast(buf->h, xen_iommu_op_t);
>      xen_iommu_op_t op;
> +    size_t offset;
> +    static const size_t op_size[] = {
> +        [XEN_IOMMUOP_query_reserved] =
> +            sizeof(struct xen_iommu_op_query_reserved),

Perhaps better (and shorter) sizeof(op.u.query_reserved)?

> --- a/xen/include/public/iommu_op.h
> +++ b/xen/include/public/iommu_op.h
> @@ -25,11 +25,50 @@
>  
>  #include "xen.h"
>  
> +typedef uint64_t xen_dfn_t;
> +
> +/* Structure describing a single range reserved in the IOMMU */
> +struct xen_iommu_reserved_range {
> +    xen_dfn_t start_dfn;
> +    uint32_t nr_frames;
> +    uint32_t pad;
> +};
> +typedef struct xen_iommu_reserved_range xen_iommu_reserved_range_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_iommu_reserved_range_t);
> +
> +/*
> + * XEN_IOMMUOP_query_reserved: Query ranges reserved in the IOMMU.
> + */

Single line comment?
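
I.e. just

    /* XEN_IOMMUOP_query_reserved: Query ranges reserved in the IOMMU. */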

Jan




* Re: [PATCH v7 4/7] iommu: introduce iommu_map_page_nocrash
  2018-10-15 10:35 ` [PATCH v7 4/7] iommu: introduce iommu_map_page_nocrash Paul Durrant
@ 2018-11-12 13:26   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-11-12 13:26 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel

>>> On 15.10.18 at 12:35, <paul.durrant@citrix.com> wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -313,16 +313,23 @@ void iommu_domain_destroy(struct domain *d)
>      hd->reserved_ranges = NULL;
>  }
>  
> -int iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
> -                   unsigned int flags)
> +int iommu_map_page_nocrash(struct domain *d, dfn_t dfn, mfn_t mfn,
> +                           unsigned int flags)

Did you consider using a flag bit instead of introducing a funnily
named function (I sincerely hope that in the common case none
of our code crashes)?
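
E.g. a new IOMMUF_* flag (name and bit position invented here, just to
sketch the idea) which a caller passes to suppress the implicit
domain_crash() on failure:

    #define _IOMMUF_no_crash 3
    #define IOMMUF_no_crash (1u << _IOMMUF_no_crash)

    rc = iommu_map_page(currd, dfn, page_to_mfn(page),
                        flags | IOMMUF_no_crash);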

>  {
>      const struct domain_iommu *hd = dom_iommu(d);
> -    int rc;
>  
>      if ( !iommu_enabled || !hd->platform_ops )
>          return 0;
>  
> -    rc = hd->platform_ops->map_page(d, dfn, mfn, flags);
> +    return hd->platform_ops->map_page(d, dfn, mfn, flags);
> +}
> +
> +int iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
> +                   unsigned int flags)
> +{
> +    int rc;
> +
> +    rc = iommu_map_page_nocrash(d, dfn, mfn, flags);

Otherwise, if the function split is indeed the better route, please
make this the initializer of the variable.
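
I.e.

    int rc = iommu_map_page_nocrash(d, dfn, mfn, flags);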

Jan




* Re: [PATCH v7 5/7] iommu / vtd: introduce a new 'refcount' flag...
  2018-10-15 10:35 ` [PATCH v7 5/7] iommu / vtd: introduce a new 'refcount' flag Paul Durrant
@ 2018-11-12 13:37   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-11-12 13:37 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian

>>> On 15.10.18 at 12:35, <paul.durrant@citrix.com> wrote:
> ...to mean 'the page (being) mapped is reference counted'.
> 
> An important pre-requisite for PV-IOMMU mapping is being able to tell the
> difference between IOMMU entries added at start-of-day by Xen and those
> that have been added by a PV-IOMMU map operation. The reason for this is
> that the pages for the former do not have an extra reference taken prior to
> mapping but the latter will (for safety/security reasons).

Makes me wonder whether proper ref-counting, at least in the
non-shared-page-tables case, shouldn't be a prereq to PV IOMMU.
Otherwise this introduces clumsiness just to work around other
clumsiness.

> This patch therefore introduces a new IOMMF_refcount flag that the
> subsequent patch adding the PV-IOMMU map operation will use to mark
> entries that it adds. When the VT-d mapping code encounters this flag
> it will set a bit in the IOMMU PTE that is ignored by the IOMMU itself,
> such that a subsequent lookup operation can determine whether the mapped
> page was reference counted or not (and hence forbid a PV-IOMMU unmap
> operation in the latter case).

I don't understand this: If you can't unmap start-of-day mappings,
then how can you ever establish your own mappings using the new
hypercalls?

Furthermore, looking at the patch, you don't appear to ref-count
anything. You really only set a PTE bit. IOMMUF_refcount looks like
a misnomer then. IOMMUF_map_dyn perhaps? (In this case my
remark further up regarding ref-counting as a prereq of course is
irrelevant, but I've left it there just to also document the
confusion arising from the naming. Then again I have yet to see
what the next patch does.)

Jan




* Re: [PATCH v7 6/7] x86: add xen_iommu_ops to modify IOMMU mappings
  2018-10-15 10:35 ` [PATCH v7 6/7] x86: add xen_iommu_ops to modify IOMMU mappings Paul Durrant
@ 2018-11-12 14:29   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-11-12 14:29 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 15.10.18 at 12:35, <paul.durrant@citrix.com> wrote:
> --- a/xen/common/iommu_op.c
> +++ b/xen/common/iommu_op.c
> @@ -78,7 +78,205 @@ static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
>      return 0;
>  }
>  
> -static void iommu_op(xen_iommu_op_t *op)
> +static int iommu_op_enable_modification(
> +    struct xen_iommu_op_enable_modification *op)
> +{
> +    struct domain *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);
> +    const struct iommu_ops *ops = iommu->platform_ops;
> +    int rc;
> +
> +    if ( op->cap || op->pad )
> +        return -EINVAL;
> +
> +    spin_lock(&iommu->lock);
> +
> +    /* Has modification already been enabled? */
> +    rc = 0;
> +    if ( iommu->domain_control )
> +        goto unlock;
> +
> +    /*
> +     * Modification of IOMMU mappings cannot be put under domain control if:
> +     * - this domain does not have IOMMU page tables, or
> +     * - HAP is enabled for this domain and the IOMMU shares the tables.
> +     */
> +    rc = -EACCES;
> +    if ( !has_iommu_pt(currd) || iommu_use_hap_pt(currd) )
> +        goto unlock;

I understand the latter restriction, but why the former? Wouldn't
it be reasonable for a domain to request self management at boot
time even if a first pass-through device was assigned only at a
later point in time?

> +static int iommuop_map(struct xen_iommu_op_map *op)

const?

> +{
> +    struct domain *d, *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);
> +    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
> +    dfn_t dfn = _dfn(op->dfn);
> +    p2m_type_t p2mt;
> +    struct page_info *page;
> +    mfn_t ignore;
> +    unsigned int flags;
> +    int rc;
> +
> +    if ( op->pad || (op->flags & ~(XEN_IOMMUOP_map_all |
> +                                   XEN_IOMMUOP_map_readonly)) )
> +        return -EINVAL;
> +
> +    if ( !iommu->domain_control )
> +        return -EOPNOTSUPP;

-EACCES or -EPERM?

> +    /* Per-device mapping not yet supported */
> +    if ( !(op->flags & XEN_IOMMUOP_map_all) )
> +        return -EINVAL;
> +
> +    /* Check whether the specified DFN falls in a reserved region */
> +    if ( rangeset_contains_singleton(iommu->reserved_ranges, dfn_x(dfn)) )
> +        return -EINVAL;
> +
> +    d = rcu_lock_domain_by_any_id(op->domid);
> +    if ( !d )
> +        return -ESRCH;
> +
> +    rc = check_get_page_from_gfn(d, _gfn(op->gfn), readonly, &p2mt, &page);
> +    if ( rc )
> +        goto unlock_domain;
> +
> +    rc = -EINVAL;
> +    if ( p2mt != p2m_ram_rw ||
> +         (!readonly && !get_page_type(page, PGT_writable_page)) )

Why would r/o mappings of p2m_ram_ro not be permitted?

> +    {
> +        put_page(page);
> +        goto unlock_domain;
> +    }
> +
> +    spin_lock(&iommu->lock);
> +
> +    rc = iommu_lookup_page(currd, dfn, &ignore, &flags);
> +
> +    /* Treat a non-reference-counted entry as non-existent */
> +    if ( !rc )
> +        rc = !(flags & IOMMUF_refcount) ? -ENOENT : -EEXIST;
> +
> +    if ( rc != -ENOENT )
> +        goto unlock_iommu;

Isn't the combination of the two if()s the wrong way round, allowing
non-ref-counted entries to get replaced? I.e. don't you need to
swap the two latter operands of the conditional expression above?
The comment also looks to be somewhat off: Aiui you mean to
permit mapping into holes (-ENOENT), i.e. what you call "non-
existent" above. But maybe I'm confused ...

> +static int iommuop_unmap(struct xen_iommu_op_unmap *op)
> +{
> +    struct domain *d, *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);
> +    dfn_t dfn = _dfn(op->dfn);
> +    mfn_t mfn;
> +    unsigned int flags;
> +    bool readonly;
> +    p2m_type_t p2mt;
> +    struct page_info *page;
> +    int rc;
> +
> +    if ( op->pad ||
> +         (op->flags & ~XEN_IOMMUOP_unmap_all) )
> +        return -EINVAL;
> +
> +    if ( !iommu->domain_control )
> +        return -EOPNOTSUPP;
> +
> +    /* Per-device unmapping not yet supported */
> +    if ( !(op->flags & XEN_IOMMUOP_unmap_all) )
> +        return -EINVAL;
> +
> +    d = rcu_lock_domain_by_any_id(op->domid);
> +    if ( !d )
> +        return -ESRCH;
> +
> +    spin_lock(&iommu->lock);
> +
> +    rc = iommu_lookup_page(currd, dfn, &mfn, &flags);
> +
> +    /* Treat a non-reference-counted entry as non-existent */
> +    if ( !rc )
> +        rc = !(flags & IOMMUF_refcount) ? -ENOENT : 0;
> +
> +    if ( rc )
> +        goto unlock;
> +
> +    readonly = !(flags & IOMMUF_writable);
> +
> +    /* Make sure the mapped frame matches */
> +    rc = check_get_page_from_gfn(d, _gfn(op->gfn), readonly, &p2mt, &page);

I'm not convinced that P2M_UNSHARE is really intended or necessary
for unmaps of writable mappings. With this there's then little point
having the local variable.

> +    if ( rc )
> +        goto unlock;
> +
> +    rc = !mfn_eq(mfn, page_to_mfn(page)) ? -EINVAL : 0;
> +
> +    /* Release reference taken above */
> +    put_page(page);
> +
> +    if ( rc )
> +        goto unlock;
> +
> +    /* Release references taken in map */
> +    if ( !readonly )
> +        put_page_type(page);
> +    put_page(page);

You may not drop these references before ...

> +    /*
> +     * This really should not fail. If it does, there is an implicit
> +     * domain_crash() (except in the case of the hardware domain) since
> +     * there is not a lot else that can be done to ensure the released
> +     * page can be safely re-used.
> +     */
> +    rc = iommu_unmap_page(currd, dfn);

... having completed the unmap _and_ the associated flush.

> @@ -86,13 +284,30 @@ static void iommu_op(xen_iommu_op_t *op)
>          op->status = iommu_op_query_reserved(&op->u.query_reserved);
>          break;
>  
> +    case XEN_IOMMUOP_enable_modification:
> +        op->status =
> +            iommu_op_enable_modification(&op->u.enable_modification);
> +        break;
> +
> +    case XEN_IOMMUOP_map:
> +        op->status = iommuop_map(&op->u.map);
> +        if ( !op->status )
> +            *need_flush = true;

Aren't you going quite a bit too far here? Replacing a not-present
entry with another one should, in the common case, not require
a flush. There may be a hardware dependency here (different
IOMMU kinds may behave differently).

> @@ -177,7 +403,8 @@ long do_iommu_op(unsigned int nr_bufs,
>  
>          rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
>                                             "ih", nr_bufs, bufs);
> -    }
> +    } else if ( !rc && need_flush )
> +        rc = iommu_iotlb_flush_all(current->domain);

Seems rather heavy a flush for perhaps just a single page map/unmap.
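
Assuming the single-page flush primitive is usable here, something like
a per-op

    rc = iommu_iotlb_flush(current->domain, dfn, 1);

issued from the map/unmap paths would seem more proportionate when only
a few entries were touched.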

> @@ -296,6 +537,13 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
>          if ( __copy_field_to_compat(h, &cmp, u.query_reserved.nr_entries) )
>              return -EFAULT;
>      }
> +    else if ( cmp.op == XEN_IOMMUOP_enable_modification )

Convert to switch()?
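
I.e. something like:

    switch ( cmp.op )
    {
    case XEN_IOMMUOP_query_reserved:
        /* existing ranges / nr_entries copy-back */
        break;

    case XEN_IOMMUOP_enable_modification:
        cmp.u.enable_modification.cap = nat.u.enable_modification.cap;

        if ( __copy_field_to_compat(h, &cmp, u.enable_modification.cap) )
            return -EFAULT;
        break;
    }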

> --- a/xen/include/public/iommu_op.h
> +++ b/xen/include/public/iommu_op.h
> @@ -61,6 +61,101 @@ struct xen_iommu_op_query_reserved {
>      XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
>  };
>  
> +/*
> + * XEN_IOMMUOP_enable_modification: Enable operations that modify IOMMU
> + *                                  mappings.
> + */
> +#define XEN_IOMMUOP_enable_modification 2
> +
> +struct xen_iommu_op_enable_modification {
> +    /*
> +     * OUT - On successful return this is set to the bitwise OR of capabilities
> +     *       defined below. On entry this must be set to zero.
> +     */
> +    uint32_t cap;
> +    uint32_t pad;

What's the point of this padding field?

> +struct xen_iommu_op_map {
> +    /* IN - The domid of the guest */
> +    domid_t domid;
> +    /*
> +     * IN - flags controlling the mapping. This should be a bitwise OR of the
> +     *      flags defined below.
> +     */
> +    uint16_t flags;
> +
> +    /*
> +     * Should the mapping be created for all initiators?
> +     *
> +     * NOTE: This flag is currently required as the implementation does not yet
> +     *       support per-device mappings.
> +     */
> +#define _XEN_IOMMUOP_map_all 0
> +#define XEN_IOMMUOP_map_all (1 << (_XEN_IOMMUOP_map_all))
> +
> +    /* Should the mapping be read-only to the initiator(s)? */
> +#define _XEN_IOMMUOP_map_readonly 1
> +#define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
> +
> +    uint32_t pad;
> +    /*
> +     * IN - Segment/Bus/Device/Function of the initiator.
> +     *
> +     * NOTE: This is ignored if XEN_IOMMUOP_map_all is set.
> +     */
> +    uint64_t sbdf;

Does this need to be 64 bits wide? Iirc segment numbers are 16 bit,
bus ones 8 bit, devices ones 5, and function ones 3. Sums up to 32.
Another question is whether, for portability's sake, this shouldn't
be a union with a PCI sub-structure. In this case for PCI 32 bits
would suffice, but reserving 64 bits for the general case would
perhaps indeed be a good idea.
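
Perhaps along the lines of (names purely illustrative):

    union {
        uint64_t raw;
        struct {
            uint16_t seg;
            uint8_t bus;
            uint8_t devfn;
        } pci;
    } initiator;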

> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -79,7 +79,10 @@
>  ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
>  !	iommu_op			iommu_op.h
>  !	iommu_op_buf			iommu_op.h
> +!	iommu_op_enable_modification	iommu_op.h

Shouldn't this be ? instead?
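
I.e.

?	iommu_op_enable_modification	iommu_op.h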

Jan



* Re: [PATCH v7 7/7] x86: extend the map and unmap xen_iommu_ops to support grant references
  2018-10-15 10:35 ` [PATCH v7 7/7] x86: extend the map and unmap xen_iommu_ops to support grant references Paul Durrant
@ 2018-11-12 15:06   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-11-12 15:06 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 15.10.18 at 12:35, <paul.durrant@citrix.com> wrote:
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3927,6 +3927,157 @@ int gnttab_get_status_frame(struct domain *d, unsigned long idx,
>      return rc;
>  }
>  
> +int
> +acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
> +                       bool readonly, mfn_t *mfn)

As a non-static function this may deserve a gnttab_ prefix.

> +{
> +    struct domain *currd = current->domain;
> +    struct grant_table *gt = d->grant_table;
> +    grant_entry_header_t *shah;
> +    struct active_grant_entry *act;
> +    uint16_t *status;
> +    int rc;
> +
> +    grant_read_lock(gt);
> +
> +    rc = -ENOENT;
> +    if ( gref > nr_grant_entries(gt) )

>=

> +        goto unlock;
> +
> +    act = active_entry_acquire(gt, gref);
> +    shah = shared_entry_header(gt, gref);
> +    status = ( gt->gt_version == 2 ) ?

Stray blanks. (All three also apply to the release counterpart.)

> +        &status_entry(gt, gref) :
> +        &shah->flags;
> +
> +    rc = -EACCES;
> +    if ( (shah->flags & GTF_type_mask) != GTF_permit_access ||
> +         (shah->flags & GTF_sub_page) )
> +        goto release;
> +
> +    rc = -ERANGE;
> +    if ( act->pin && ((act->domid != currd->domain_id) ||
> +                      (act->pin & 0x80808080U) != 0) )

I realize this follows the pattern used elsewhere, but it's going too far,
and I'd prefer if we didn't spread the problem: You don't mean to
increment all of the four fields, but just one or two of them. For the
others there's no point doing the respective overflow check.

> +        goto release;
> +
> +    rc = -EINVAL;
> +    if ( !act->pin ||
> +         (!readonly && !(act->pin & GNTPIN_devw_mask)) ) {
> +        if ( _set_status(gt->gt_version, currd->domain_id, readonly,
> +                         0, shah, act, status) != GNTST_okay )
> +            goto release;
> +    }

Please combine such directly nested if()s. There's also a misplaced
figure brace, but that'll then go away anyway.
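
I.e. something like:

    rc = -EINVAL;
    if ( (!act->pin ||
          (!readonly && !(act->pin & GNTPIN_devw_mask))) &&
         _set_status(gt->gt_version, currd->domain_id, readonly,
                     0, shah, act, status) != GNTST_okay )
        goto release;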

> +    if ( !act->pin )
> +    {
> +        gfn_t gfn = gt->gt_version == 1 ?
> +            _gfn(shared_entry_v1(gt, gref).frame) :
> +            _gfn(shared_entry_v2(gt, gref).full_page.frame);
> +        p2m_type_t p2mt;
> +        struct page_info *page;
> +
> +        rc = check_get_page_from_gfn(d, gfn, readonly, &p2mt, &page);
> +        if ( rc )
> +            goto clear;
> +
> +        rc = -EINVAL;
> +        if ( p2m_is_foreign(p2mt) )

Compared to you allowing p2m_ram_rw only in the previous patch,
this allows basically everything. Is this really intended?

> +        {
> +            put_page(page);
> +            goto clear;
> +        }
> +
> +        act_set_gfn(act, gfn);
> +        act->mfn = page_to_mfn(page);
> +        act->domid = currd->domain_id;
> +        act->start = 0;
> +        act->length = PAGE_SIZE;
> +        act->is_sub_page = false;
> +        act->trans_domain = d;
> +        act->trans_gref = gref;
> +    }
> +    else
> +    {
> +        ASSERT(mfn_valid(act->mfn));
> +        if ( !get_page(mfn_to_page(act->mfn), d) )
> +            goto clear;
> +    }
> +
> +    rc = 0;
> +    act->pin += readonly ? GNTPIN_devr_inc : GNTPIN_devw_inc;

You don't acquire a write ref in the !readonly case.

> +int
> +release_gref_for_iommu(struct domain *d, grant_ref_t gref,
> +                       bool readonly, mfn_t mfn)
> +{
> +    struct domain *currd = current->domain;
> +    struct grant_table *gt = d->grant_table;
> +    grant_entry_header_t *shah;
> +    struct active_grant_entry *act;
> +    uint16_t *status;
> +    int rc;
> +
> +    grant_read_lock(gt);
> +
> +    rc = -ENOENT;
> +    if ( gref > nr_grant_entries(gt) )
> +        goto unlock;
> +
> +    act = active_entry_acquire(gt, gref);
> +    shah = shared_entry_header(gt, gref);
> +    status = ( gt->gt_version == 2 ) ?
> +        &status_entry(gt, gref) :
> +        &shah->flags;
> +
> +    rc = -EINVAL;
> +    if ( !act->pin || (act->domid != currd->domain_id) ||
> +         !mfn_eq(act->mfn, mfn) )
> +        goto release;

Code elsewhere also considers "readonly" in similar checks. Along
the lines of the comment in the acquire function I'd actually prefer
if you did the checks more correctly by checking against
GNTPIN_dev{r,w}_mask depending on the flag.
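
I.e. perhaps:

    rc = -EINVAL;
    if ( !(act->pin & (readonly ? GNTPIN_devr_mask : GNTPIN_devw_mask)) ||
         act->domid != currd->domain_id ||
         !mfn_eq(act->mfn, mfn) )
        goto release;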

> @@ -159,16 +159,31 @@ static int iommuop_map(struct xen_iommu_op_map *op)
>      if ( !d )
>          return -ESRCH;
>  
> -    rc = check_get_page_from_gfn(d, _gfn(op->gfn), readonly, &p2mt, &page);
> -    if ( rc )
> -        goto unlock_domain;
> -
> -    rc = -EINVAL;
> -    if ( p2mt != p2m_ram_rw ||
> -         (!readonly && !get_page_type(page, PGT_writable_page)) )
> +    if ( op->flags & XEN_IOMMUOP_map_gref )
>      {
> -        put_page(page);
> -        goto unlock_domain;
> +        rc = acquire_gref_for_iommu(d, op->u.gref, readonly, &mfn);
> +        if ( rc )
> +            goto unlock_domain;
> +    }
> +    else
> +    {
> +        p2m_type_t p2mt;
> +        struct page_info *page;
> +
> +        rc = check_get_page_from_gfn(d, _gfn(op->u.gfn), readonly, &p2mt,
> +                                     &page);
> +        if ( rc )
> +            goto unlock_domain;
> +
> +        rc = -EINVAL;
> +        if ( p2mt != p2m_ram_rw ||
> +             (!readonly && !get_page_type(page, PGT_writable_page)) )

In the description you talk about doing grant maps also by GFN,
> but you don't extend the set of permitted types here. Is the
description misleading, or is the code incomplete?

Jan


