* [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd
@ 2017-08-09 20:34 Lan Tianyu
  2017-08-09 20:34 ` [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
                   ` (24 more replies)
  0 siblings, 25 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, chao.gao

Changes since v1:
       1) Fix coding style issues
       2) Add definitions for vIOMMU type and capabilities
       3) Change vIOMMU kconfig and select vIOMMU by default on x86
       4) Put vIOMMU creation in libxl__arch_domain_create()
       5) Make the vIOMMU structure of the tool stack more general for both PV and HVM.

Changes since RFC v2:
       1) Move vvtd.c to the drivers/passthrough/vtd directory.
       2) Make vIOMMU always built in on x86
       3) Add new boot cmd "viommu" to enable the viommu function
       4) Fix some code style issues.

Changes since RFC v1:
       1) Add Xen virtual IOMMU doc docs/misc/viommu.txt
       2) Move the vIOMMU hypercalls for creating/destroying a vIOMMU and
querying capabilities from dmop to domctl, as suggested by Paul Durrant,
because these hypercalls can be made from the tool stack and more VM modes
(e.g. PVH or other modes that don't use Qemu) can benefit.
       3) Add checks of the input MMIO address and length.
       4) Add iommu_type to the vIOMMU hypercall parameters to specify the
vendor vIOMMU device model (e.g. Intel VTD, AMD or ARM IOMMU; so far
only Intel VTD is supported).
       5) Add save and restore support for vvtd
       5) Add save and restore support for vvtd


This patchset introduces the vIOMMU framework and adds interrupt remapping
support for the virtual VTD according to "Xen virtual IOMMU high level
design doc V3"
(https://lists.xenproject.org/archives/html/xen-devel/2016-11/msg01391.html).

- vIOMMU framework
The new framework provides viommu_ops and helper functions to abstract
vIOMMU operations (e.g. create, destroy, handle irq remapping requests
and so on). Vendors (Intel, ARM, AMD and so on) can implement their own
vIOMMU callbacks.
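
As a rough illustration of the ops-table idea, here is a standalone sketch
(not the code from this series). All demo_* names are hypothetical stand-ins
for the types in xen/include/xen/viommu.h, and the capability check assumes
the rule from the doc patch that requested capabilities must lie within what
query_caps reports:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit, modelled on VIOMMU_CAP_IRQ_REMAPPING. */
#define DEMO_CAP_IRQ_REMAPPING (1u << 0)

struct demo_viommu;

/* Hypothetical vendor callback table, modelled on struct viommu_ops. */
struct demo_viommu_ops {
    uint64_t (*query_caps)(void);
    int (*create)(struct demo_viommu *v);
    int (*destroy)(struct demo_viommu *v);
};

struct demo_viommu {
    const struct demo_viommu_ops *ops;
    uint64_t caps;
    int created;
};

/* A made-up "vendor" backend that only offers interrupt remapping. */
static uint64_t demo_vtd_query_caps(void) { return DEMO_CAP_IRQ_REMAPPING; }
static int demo_vtd_create(struct demo_viommu *v) { v->created = 1; return 0; }
static int demo_vtd_destroy(struct demo_viommu *v) { v->created = 0; return 0; }

static const struct demo_viommu_ops demo_vtd_ops = {
    .query_caps = demo_vtd_query_caps,
    .create     = demo_vtd_create,
    .destroy    = demo_vtd_destroy,
};

/* Generic layer: validate the request, then defer to the vendor callback. */
static int demo_viommu_create(struct demo_viommu *v,
                              const struct demo_viommu_ops *ops,
                              uint64_t caps)
{
    if ( !v || !ops || !ops->query_caps || !ops->create )
        return -EINVAL;

    /* Reject capabilities the backend does not offer. */
    if ( caps & ~ops->query_caps() )
        return -EINVAL;

    v->ops = ops;
    v->caps = caps;
    return ops->create(v);
}
```

The generic layer never needs to know which vendor it is talking to; a new
backend would only supply another ops table.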

- Virtual VTD
We enable the irq remapping function, which covers both MSI and IOAPIC
interrupts. Posted interrupt mode emulation is not supported, nor is
posted interrupt mode enabled on the host together with virtual VTD; both
will be added later.

Repo:
https://github.com/lantianyu/Xen/tree/xen_viommu_v2


Chao Gao (21):
  tools/libxc: Add viommu operations in libxc
  tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table
    structures
  tools/libacpi: Add new fields in acpi_config for DMAR table
  tools/libxl: Add a user configurable parameter to control vIOMMU
    attributes
  tools/libxl: build DMAR table for a guest with one virtual VTD
  tools/libxl: create vIOMMU during domain construction
  x86/hvm: Introduce a emulated VTD for HVM
  x86/vvtd: Add MMIO handler for VVTD
  x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  x86/vvtd: Process interrupt remapping request
  x86/vvtd: decode interrupt attribute from IRTE
  x86/vioapic: Hook interrupt delivery of vIOAPIC
  x86/vvtd: Enable Queued Invalidation through GCMD
  x86/vvtd: Enable Interrupt Remapping through GCMD
  x86/vioapic: extend vioapic_get_vector() to support remapping format
    RTE
  passthrough: move some fields of hvm_gmsi_info to a sub-structure
  tools/libxc: Add a new interface to bind remapping format msi with
    pirq
  x86/vmsi: Hook delivering remapping format msi to guest
  x86/vvtd: Handle interrupt translation faults
  x86/vvtd: Add queued invalidation (QI) support
  x86/vvtd: save and restore emulated VT-d

Lan Tianyu (4):
  DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  VIOMMU: Add irq request callback to deal with irq remapping
  VIOMMU: Add get irq info callback to convert irq remapping request
  Xen/doc: Add Xen virtual IOMMU doc

 docs/man/xl.cfg.pod.5.in               |   34 +-
 docs/misc/viommu.txt                   |  139 ++++
 tools/libacpi/acpi2_0.h                |   61 ++
 tools/libacpi/build.c                  |   57 ++
 tools/libacpi/libacpi.h                |    9 +
 tools/libxc/Makefile                   |    1 +
 tools/libxc/include/xenctrl.h          |   25 +
 tools/libxc/xc_domain.c                |   53 ++
 tools/libxc/xc_viommu.c                |   81 +++
 tools/libxl/libxl_arch.h               |    5 +
 tools/libxl/libxl_dom.c                |   36 +
 tools/libxl/libxl_types.idl            |   16 +
 tools/libxl/libxl_x86.c                |   28 +
 tools/libxl/libxl_x86_acpi.c           |   48 ++
 tools/xl/xl_parse.c                    |   66 ++
 xen/arch/x86/hvm/irq.c                 |   11 +
 xen/arch/x86/hvm/vioapic.c             |   32 +-
 xen/arch/x86/hvm/vmsi.c                |   18 +-
 xen/common/domctl.c                    |    3 +
 xen/common/viommu.c                    |   74 ++
 xen/drivers/passthrough/io.c           |  190 ++++-
 xen/drivers/passthrough/vtd/Makefile   |    7 +-
 xen/drivers/passthrough/vtd/iommu.h    |  222 +++++-
 xen/drivers/passthrough/vtd/vtd.h      |    6 +
 xen/drivers/passthrough/vtd/vvtd.c     | 1198 ++++++++++++++++++++++++++++++++
 xen/include/asm-x86/msi.h              |    3 +
 xen/include/asm-x86/viommu.h           |   84 +++
 xen/include/public/arch-x86/hvm/save.h |   24 +-
 xen/include/public/domctl.h            |   59 ++
 xen/include/xen/hvm/irq.h              |   15 +-
 xen/include/xen/viommu.h               |   24 +
 31 files changed, 2546 insertions(+), 83 deletions(-)
 create mode 100644 docs/misc/viommu.txt
 create mode 100644 tools/libxc/xc_viommu.c
 create mode 100644 xen/drivers/passthrough/vtd/vvtd.c
 create mode 100644 xen/include/asm-x86/viommu.h

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-17 11:18   ` Wei Liu
  2017-08-22 14:32   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, chao.gao

This patch introduces the create, destroy and query-capabilities
commands for vIOMMU. The vIOMMU layer handles these requests and calls
the arch vIOMMU ops.
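
The command dispatch can be pictured with the following standalone sketch.
It is only a model of the idea, not the hypervisor code: every demo_* name
is hypothetical, the backend is a single made-up slot that accepts type 1,
and unlike the patch the sketch requests a copy-back only on success:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sub-commands, modelled on XEN_DOMCTL_*_viommu. */
#define DEMO_CREATE_VIOMMU      0
#define DEMO_DESTROY_VIOMMU     1
#define DEMO_QUERY_VIOMMU_CAPS  2

/* Hypothetical, trimmed-down version of struct xen_domctl_viommu_op. */
struct demo_viommu_op {
    uint32_t cmd;
    union {
        struct { uint64_t viommu_type; uint32_t viommu_id; } create;
        struct { uint32_t viommu_id; } destroy;
        struct { uint64_t viommu_type; uint64_t capabilities; } query;
    } u;
};

/* Made-up backend: one vIOMMU slot, only type 1 is known. */
static bool demo_slot_used;

static int demo_backend_create(uint64_t type)
{
    if ( type != 1 || demo_slot_used )
        return -EINVAL;
    demo_slot_used = true;
    return 0; /* id of the new vIOMMU */
}

static int demo_backend_destroy(uint32_t id)
{
    if ( id != 0 || !demo_slot_used )
        return -EINVAL;
    demo_slot_used = false;
    return 0;
}

/* Dispatch on op->cmd; *need_copy tells the caller to copy OUT fields back. */
static int demo_viommu_domctl(struct demo_viommu_op *op, bool *need_copy)
{
    int rc = -EINVAL, ret;

    switch ( op->cmd )
    {
    case DEMO_CREATE_VIOMMU:
        ret = demo_backend_create(op->u.create.viommu_type);
        if ( ret >= 0 )
        {
            op->u.create.viommu_id = ret;
            *need_copy = true;
            rc = 0;
        }
        break;

    case DEMO_DESTROY_VIOMMU:
        rc = demo_backend_destroy(op->u.destroy.viommu_id);
        break;

    case DEMO_QUERY_VIOMMU_CAPS:
        if ( op->u.query.viommu_type == 1 )
        {
            op->u.query.capabilities = 1; /* made-up capability bit */
            *need_copy = true;
            rc = 0;
        }
        break;

    default:
        break;
    }

    return rc;
}
```

Only the sub-commands with OUT fields (create and query) ask for the
structure to be copied back to the caller.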

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/domctl.c         |  3 +++
 xen/common/viommu.c         | 43 +++++++++++++++++++++++++++++++++++++
 xen/include/public/domctl.h | 52 +++++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/viommu.h    |  6 ++++++
 4 files changed, 104 insertions(+)

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index d80488b..01c3024 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1144,6 +1144,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         if ( !ret )
             copyback = 1;
         break;
+    case XEN_DOMCTL_viommu_op:
+        ret = viommu_domctl(d, &op->u.viommu_op, &copyback);
+        break;
 
     default:
         ret = arch_do_domctl(op, d, u_domctl);
diff --git a/xen/common/viommu.c b/xen/common/viommu.c
index 6874d9f..a4d004d 100644
--- a/xen/common/viommu.c
+++ b/xen/common/viommu.c
@@ -148,6 +148,49 @@ static u64 viommu_query_caps(struct domain *d, u64 type)
     return viommu_type->ops->query_caps(d);
 }
 
+int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
+                  bool *need_copy)
+{
+    int rc = -EINVAL, ret;
+
+    if ( !viommu_enabled() )
+        return rc;
+
+    switch ( op->cmd )
+    {
+    case XEN_DOMCTL_create_viommu:
+        ret = viommu_create(d, op->u.create_viommu.viommu_type,
+            op->u.create_viommu.base_address,
+            op->u.create_viommu.length,
+            op->u.create_viommu.capabilities);
+        if ( ret >= 0 ) {
+            op->u.create_viommu.viommu_id = ret;
+            *need_copy = true;
+            rc = 0; /* return 0 if success */
+        }
+        break;
+
+    case XEN_DOMCTL_destroy_viommu:
+        rc = viommu_destroy(d, op->u.destroy_viommu.viommu_id);
+        break;
+
+    case XEN_DOMCTL_query_viommu_caps:
+        ret = viommu_query_caps(d, op->u.query_caps.viommu_type);
+        if ( ret >= 0 )
+        {
+            op->u.query_caps.capabilities = ret;
+            rc = 0;
+        }
+        *need_copy = true;
+        break;
+
+    default:
+        break;
+    }
+
+    return rc;
+}
+
 int __init viommu_setup(void)
 {
     INIT_LIST_HEAD(&type_list);
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index ff39762..4b10f26 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1149,6 +1149,56 @@ struct xen_domctl_psr_cat_op {
 typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
 
+/*  vIOMMU helper
+ *
+ *  vIOMMU interface can be used to create/destroy vIOMMU and
+ *  query vIOMMU capabilities.
+ */
+
+/* vIOMMU type - specify vendor vIOMMU device model */
+#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
+
+/* vIOMMU capabilities */
+#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
+
+struct xen_domctl_viommu_op {
+    uint32_t cmd;
+#define XEN_DOMCTL_create_viommu          0
+#define XEN_DOMCTL_destroy_viommu         1
+#define XEN_DOMCTL_query_viommu_caps      2
+    union {
+        struct {
+            /* IN - vIOMMU type */
+            uint64_t viommu_type;
+            /* 
+             * IN - MMIO base address of vIOMMU. vIOMMU device models
+             * are in charge of checking base_address and length.
+             */
+            uint64_t base_address;
+            /* IN - Length of MMIO region */
+            uint64_t length;
+            /* IN - Capabilities with which we want to create */
+            uint64_t capabilities;
+            /* OUT - vIOMMU identity */
+            uint32_t viommu_id;
+        } create_viommu;
+
+        struct {
+            /* IN - vIOMMU identity */
+            uint32_t viommu_id;
+        } destroy_viommu;
+
+        struct {
+            /* IN - vIOMMU type */
+            uint64_t viommu_type;
+            /* OUT - vIOMMU Capabilities */
+            uint64_t capabilities;
+        } query_caps;
+    } u;
+};
+typedef struct xen_domctl_viommu_op xen_domctl_viommu_op;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_viommu_op);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1226,6 +1276,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_monitor_op                    77
 #define XEN_DOMCTL_psr_cat_op                    78
 #define XEN_DOMCTL_soft_reset                    79
+#define XEN_DOMCTL_viommu_op                     80
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1288,6 +1339,7 @@ struct xen_domctl {
         struct xen_domctl_psr_cmt_op        psr_cmt_op;
         struct xen_domctl_monitor_op        monitor_op;
         struct xen_domctl_psr_cat_op        psr_cat_op;
+        struct xen_domctl_viommu_op         viommu_op;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
index 506ea54..527afb1 100644
--- a/xen/include/xen/viommu.h
+++ b/xen/include/xen/viommu.h
@@ -49,6 +49,8 @@ extern bool_t opt_viommu;
 static inline bool viommu_enabled(void) { return opt_viommu; }
 int viommu_init_domain(struct domain *d);
 int viommu_register_type(u64 type, struct viommu_ops * ops);
+int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
+                  bool_t *need_copy);
 int viommu_setup(void);
 #else
 static inline int viommu_init_domain(struct domain *d) { return 0; }
@@ -56,6 +58,10 @@ static inline int viommu_register_type(u64 type, struct viommu_ops * ops)
 { return 0; }
 static inline int __init viommu_setup(void) { return 0; }
 static inline bool viommu_enabled(void) { return false; }
+static inline int viommu_domctl(struct domain *d,
+                                struct xen_domctl_viommu_op *op,
+                                bool *need_copy)
+{ return -ENODEV; }
 #endif
 
 #endif /* __XEN_VIOMMU_H__ */
-- 
1.8.3.1



* [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
  2017-08-09 20:34 ` [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-17 11:18   ` Wei Liu
  2017-08-22 15:32   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
                   ` (22 subsequent siblings)
  24 siblings, 2 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, chao.gao

This patch adds an irq request callback so that the platform implementation
can handle irq remapping requests.
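
A standalone sketch of the request path, for illustration only: the demo_*
names are hypothetical stand-ins for struct irq_remapping_request, the fill
helpers and the handle_irq_request callback, and the backend here is a
made-up one that merely records the request type:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Request types, mirroring VIOMMU_REQUEST_IRQ_MSI / _APIC in the patch. */
#define DEMO_REQUEST_IRQ_MSI   0
#define DEMO_REQUEST_IRQ_APIC  1

/* Hypothetical stand-in for struct irq_remapping_request. */
struct demo_irq_request {
    union {
        struct {
            uint64_t addr;
            uint32_t data;
        } msi;          /* MSI address/data pair */
        uint64_t rte;   /* IOAPIC redirection table entry */
    } msg;
    uint16_t source_id;
    uint8_t type;
};

/* Hypothetical callback type standing in for ops->handle_irq_request. */
typedef int (*demo_irq_handler_t)(const struct demo_irq_request *req);

static void demo_fill_msi(struct demo_irq_request *req, uint16_t source_id,
                          uint64_t addr, uint32_t data)
{
    req->type = DEMO_REQUEST_IRQ_MSI;
    req->source_id = source_id;
    req->msg.msi.addr = addr;
    req->msg.msi.data = data;
}

static void demo_fill_ioapic(struct demo_irq_request *req, uint16_t ioapic_id,
                             uint64_t rte)
{
    req->type = DEMO_REQUEST_IRQ_APIC;
    req->source_id = ioapic_id;
    req->msg.rte = rte;
}

/* Generic layer: refuse to dispatch when no callback is registered. */
static int demo_handle_irq_request(demo_irq_handler_t handler,
                                   const struct demo_irq_request *req)
{
    if ( !handler || !req )
        return -EINVAL;
    return handler(req);
}

/* Made-up backend: remembers the type of the last request it saw. */
static int demo_last_type = -1;

static int demo_backend(const struct demo_irq_request *req)
{
    demo_last_type = req->type;
    return 0;
}
```

Callers build one request structure for either interrupt source, so the
backend sees a single entry point and switches on the type field.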

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/viommu.c          | 15 +++++++++
 xen/include/asm-x86/viommu.h | 73 ++++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/viommu.h     |  9 ++++++
 3 files changed, 97 insertions(+)
 create mode 100644 xen/include/asm-x86/viommu.h

diff --git a/xen/common/viommu.c b/xen/common/viommu.c
index a4d004d..f4d34e6 100644
--- a/xen/common/viommu.c
+++ b/xen/common/viommu.c
@@ -198,6 +198,21 @@ int __init viommu_setup(void)
     return 0;
 }
 
+int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
+                              struct irq_remapping_request *request)
+{
+    struct viommu_info *info = &d->viommu;
+
+    if ( viommu_id >= info->nr_viommu
+         || !info->viommu[viommu_id] )
+        return -EINVAL;
+
+    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
+        return -EINVAL;
+
+    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
new file mode 100644
index 0000000..51bda72
--- /dev/null
+++ b/xen/include/asm-x86/viommu.h
@@ -0,0 +1,73 @@
+/*
+ * include/asm-x86/viommu.h
+ *
+ * Copyright (c) 2017 Intel Corporation.
+ * Author: Lan Tianyu <tianyu.lan@intel.com> 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#ifndef __ARCH_X86_VIOMMU_H__
+#define __ARCH_X86_VIOMMU_H__
+
+#include <xen/viommu.h>
+#include <asm/types.h>
+
+/* IRQ request type */
+#define VIOMMU_REQUEST_IRQ_MSI          0
+#define VIOMMU_REQUEST_IRQ_APIC         1
+
+struct irq_remapping_request
+{
+    union {
+        /* MSI */
+        struct {
+            u64 addr;
+            u32 data;
+        } msi;
+        /* Redirection Entry in IOAPIC */
+        u64 rte;
+    } msg;
+    u16 source_id;
+    u8 type;
+};
+
+static inline void irq_request_ioapic_fill(struct irq_remapping_request *req,
+                             uint32_t ioapic_id, uint64_t rte)
+{
+    ASSERT(req);
+    req->type = VIOMMU_REQUEST_IRQ_APIC;
+    req->source_id = ioapic_id;
+    req->msg.rte = rte;
+}
+
+static inline void irq_request_msi_fill(struct irq_remapping_request *req,
+                          uint32_t source_id, uint64_t addr, uint32_t data)
+{
+    ASSERT(req);
+    req->type = VIOMMU_REQUEST_IRQ_MSI;
+    req->source_id = source_id;
+    req->msg.msi.addr = addr;
+    req->msg.msi.data = data;
+}
+
+#endif /* __ARCH_X86_VIOMMU_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * End:
+ */
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
index 527afb1..0be1b3a 100644
--- a/xen/include/xen/viommu.h
+++ b/xen/include/xen/viommu.h
@@ -20,6 +20,8 @@
 #ifndef __XEN_VIOMMU_H__
 #define __XEN_VIOMMU_H__
 
+#include <asm/viommu.h>
+
 #define NR_VIOMMU_PER_DOMAIN 1
 
 struct viommu;
@@ -28,6 +30,8 @@ struct viommu_ops {
     u64 (*query_caps)(struct domain *d);
     int (*create)(struct domain *d, struct viommu *viommu);
     int (*destroy)(struct viommu *viommu);
+    int (*handle_irq_request)(struct domain *d,
+                              struct irq_remapping_request *request);
 };
 
 struct viommu {
@@ -52,6 +56,8 @@ int viommu_register_type(u64 type, struct viommu_ops * ops);
 int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
                   bool_t *need_copy);
 int viommu_setup(void);
+int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
+                              struct irq_remapping_request *request);
 #else
 static inline int viommu_init_domain(struct domain *d) { return 0; }
 static inline int viommu_register_type(u64 type, struct viommu_ops * ops)
@@ -62,6 +68,9 @@ static inline int viommu_domctl(struct domain *d,
                                 struct xen_domctl_viommu_op *op,
                                 bool *need_copy)
 { return -ENODEV; }
+static inline int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
+                              struct irq_remapping_request *request)
+{ return 0; }
 #endif
 
 #endif /* __XEN_VIOMMU_H__ */
-- 
1.8.3.1



* [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
  2017-08-09 20:34 ` [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
  2017-08-09 20:34 ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-17 11:19   ` Wei Liu
  2017-08-22 15:38   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
                   ` (21 subsequent siblings)
  24 siblings, 2 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, chao.gao

This patch adds a get_irq_info callback so that the platform implementation
can convert an irq remapping request to irq info (e.g. vector, dest,
dest_mode and so on).
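
To show what these irq info fields mean, the standalone sketch below decodes
a compatibility-format (non-remapped) x86 MSI address/data pair. This is an
assumption chosen for illustration: a real vIOMMU backend would take these
values from its interrupt remapping table instead, and the demo_* names are
hypothetical:

```c
#include <stdint.h>

/* Hypothetical stand-in for struct irq_remapping_info. */
struct demo_irq_info {
    uint8_t  vector;
    uint32_t dest;
    unsigned int dest_mode:1;
    unsigned int delivery_mode:3;
};

/*
 * Decode a compatibility-format MSI:
 *   data[7:0]   vector
 *   data[10:8]  delivery mode
 *   addr[19:12] destination APIC ID (8 bits only)
 *   addr[2]     destination mode (0 = physical, 1 = logical)
 */
static struct demo_irq_info demo_decode_compat_msi(uint64_t addr,
                                                   uint32_t data)
{
    struct demo_irq_info info;

    info.vector = data & 0xff;
    info.delivery_mode = (data >> 8) & 0x7;
    info.dest = (addr >> 12) & 0xff;
    info.dest_mode = (addr >> 2) & 0x1;

    return info;
}
```

The 8-bit destination field in this format is the limit that interrupt
remapping lifts, since the remapped path can carry a 32-bit destination in
the dest field.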

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/viommu.c          | 16 ++++++++++++++++
 xen/include/asm-x86/viommu.h |  8 ++++++++
 xen/include/xen/viommu.h     |  9 +++++++++
 3 files changed, 33 insertions(+)

diff --git a/xen/common/viommu.c b/xen/common/viommu.c
index f4d34e6..03c879d 100644
--- a/xen/common/viommu.c
+++ b/xen/common/viommu.c
@@ -213,6 +213,22 @@ int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
     return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
 }
 
+int viommu_get_irq_info(struct domain *d, u32 viommu_id,
+                        struct irq_remapping_request *request,
+                        struct irq_remapping_info *irq_info)
+{
+    struct viommu_info *info = &d->viommu;
+
+    if ( viommu_id >= info->nr_viommu
+         || !info->viommu[viommu_id] )
+        return -EINVAL;
+
+    if ( !info->viommu[viommu_id]->ops->get_irq_info )
+        return -EINVAL;
+
+    return info->viommu[viommu_id]->ops->get_irq_info(d, request, irq_info);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
index 51bda72..1e8d4be 100644
--- a/xen/include/asm-x86/viommu.h
+++ b/xen/include/asm-x86/viommu.h
@@ -27,6 +27,14 @@
 #define VIOMMU_REQUEST_IRQ_MSI          0
 #define VIOMMU_REQUEST_IRQ_APIC         1
 
+struct irq_remapping_info
+{
+    u8  vector;
+    u32 dest;
+    u32 dest_mode:1;
+    u32 delivery_mode:3;
+};
+
 struct irq_remapping_request
 {
     union {
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
index 0be1b3a..0badeae 100644
--- a/xen/include/xen/viommu.h
+++ b/xen/include/xen/viommu.h
@@ -32,6 +32,8 @@ struct viommu_ops {
     int (*destroy)(struct viommu *viommu);
     int (*handle_irq_request)(struct domain *d,
                               struct irq_remapping_request *request);
+    int (*get_irq_info)(struct domain *d, struct irq_remapping_request *request,
+                        struct irq_remapping_info *info);
 };
 
 struct viommu {
@@ -58,6 +60,9 @@ int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
 int viommu_setup(void);
 int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
                               struct irq_remapping_request *request);
+int viommu_get_irq_info(struct domain *d, u32 viommu_id, 
+                        struct irq_remapping_request *request,
+                        struct irq_remapping_info *irq_info);
 #else
 static inline int viommu_init_domain(struct domain *d) { return 0; }
 static inline int viommu_register_type(u64 type, struct viommu_ops * ops)
@@ -71,6 +76,10 @@ static inline int viommu_domctl(struct domain *d,
 static inline int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
                               struct irq_remapping_request *request)
 { return 0; }
+static inline int viommu_get_irq_info(struct domain *d, u32 viommu_id,
+                                      struct irq_remapping_request *request,
+                                      struct irq_remapping_info *irq_info)
+{ return 0; }
 #endif
 
 #endif /* __XEN_VIOMMU_H__ */
-- 
1.8.3.1



* [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (2 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-17 11:19   ` Wei Liu
  2017-08-22 15:55   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc Lan Tianyu
                   ` (20 subsequent siblings)
  24 siblings, 2 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, chao.gao

This patch adds the Xen virtual IOMMU doc, which introduces the motivation,
framework, vIOMMU hypercall and xl configuration.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 docs/misc/viommu.txt | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)
 create mode 100644 docs/misc/viommu.txt

diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
new file mode 100644
index 0000000..39455bb
--- /dev/null
+++ b/docs/misc/viommu.txt
@@ -0,0 +1,139 @@
+Xen virtual IOMMU
+
+Motivation
+==========
+*) Enable support for more than 255 vcpus
+HPC cloud services require VMs that provide high performance parallel
+computing, and we hope to create a huge VM with >255 vcpus on one machine
+to meet this requirement, with each vcpu pinned to a separate pcpu.
+
+To support >255 vcpus, X2APIC mode in the guest is necessary because the
+legacy APIC (XAPIC) only supports 8-bit APIC IDs and so can support at
+most 255 vcpus. X2APIC mode supports 32-bit APIC IDs and requires the
+interrupt remapping function of the vIOMMU.
+
+The reason for this is that existing PCI MSI and IOAPIC were not modified
+with the introduction of X2APIC. PCI MSI/IOAPIC can only send interrupt
+messages containing an 8-bit APIC ID, which cannot address >255 cpus.
+Interrupt remapping supports 32-bit APIC IDs, so it is necessary for
+enabling >255 cpus in x2apic mode.
+
+
+vIOMMU Architecture
+===================
+The vIOMMU device model is inside the Xen hypervisor for the following reasons:
+    1) Avoid round trips between Qemu and Xen hypervisor
+    2) Ease of integration with the rest of hypervisor
+    3) HVMlite/PVH doesn't use Qemu
+
+* Interrupt remapping overview.
+Interrupts from virtual devices and physical devices are delivered
+to the vLAPIC from the vIOAPIC and vMSI. The vIOMMU needs to remap
+interrupts during this procedure.
+
++---------------------------------------------------+
+|Qemu                       |VM                     |
+|                           | +----------------+    |
+|                           | |  Device driver |    |
+|                           | +--------+-------+    |
+|                           |          ^            |
+|       +----------------+  | +--------+-------+    |
+|       | Virtual device |  | |  IRQ subsystem |    |
+|       +-------+--------+  | +--------+-------+    |
+|               |           |          ^            |
+|               |           |          |            |
++---------------------------+-----------------------+
+|hypervisor     |                      | VIRQ       |
+|               |            +---------+--------+   |
+|               |            |      vLAPIC      |   |
+|               |VIRQ        +---------+--------+   |
+|               |                      ^            |
+|               |                      |            |
+|               |            +---------+--------+   |
+|               |            |      vIOMMU      |   |
+|               |            +---------+--------+   |
+|               |                      ^            |
+|               |                      |            |
+|               |            +---------+--------+   |
+|               |            |   vIOAPIC/vMSI   |   |
+|               |            +----+----+--------+   |
+|               |                 ^    ^            |
+|               +-----------------+    |            |
+|                                      |            |
++---------------------------------------------------+
+HW                                     |IRQ
+                                +-------------------+
+                                |   PCI Device      |
+                                +-------------------+
+
+
+vIOMMU hypercall
+================
+Introduce a new domctl hypercall "xen_domctl_viommu_op" to create/destroy
+a vIOMMU and query the vIOMMU capabilities that the device model can support.
+
+* vIOMMU hypercall parameter structure
+
+/* vIOMMU type - specify vendor vIOMMU device model */
+#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
+
+/* vIOMMU capabilities */
+#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
+
+struct xen_domctl_viommu_op {
+    uint32_t cmd;
+#define XEN_DOMCTL_create_viommu          0
+#define XEN_DOMCTL_destroy_viommu         1
+#define XEN_DOMCTL_query_viommu_caps      2
+    union {
+        struct {
+            /* IN - vIOMMU type  */
+            uint64_t viommu_type;
+            /* IN - MMIO base address of vIOMMU. */
+            uint64_t base_address;
+            /* IN - Length of MMIO region */
+            uint64_t length;
+            /* IN - Capabilities with which we want to create */
+            uint64_t capabilities;
+            /* OUT - vIOMMU identity */
+            uint32_t viommu_id;
+        } create_viommu;
+
+        struct {
+            /* IN - vIOMMU identity */
+            uint32_t viommu_id;
+        } destroy_viommu;
+
+        struct {
+            /* IN - vIOMMU type */
+            uint64_t viommu_type;
+            /* OUT - vIOMMU Capabilities */
+            uint64_t capabilities;
+        } query_caps;
+    } u;
+};
+
+- XEN_DOMCTL_query_viommu_caps
+    Query capabilities of the vIOMMU device model. viommu_type specifies
+which vendor vIOMMU device model (e.g. Intel VTD) is targeted and the
+hypervisor returns capability bits (e.g. the interrupt remapping bit).
+
+- XEN_DOMCTL_create_viommu
+    Create a vIOMMU device with viommu_type, capabilities, MMIO base
+address and length. The hypervisor returns viommu_id. Capabilities should
+be a subset of the value returned by the query_viommu_caps hypercall.
+
+- XEN_DOMCTL_destroy_viommu
+    Destroy the vIOMMU in the Xen hypervisor identified by viommu_id.
+
+Currently only a single vIOMMU per VM is supported; the introduced domctls
+are compatible with multi-vIOMMU support.
+
+xl vIOMMU configuration
+=======================
+viommu="type=intel_vtd,intremap=1,x2apic=1"
+
+"type" - Specify the vIOMMU device model type. Currently only the Intel vtd
+device model is supported.
+"intremap" - Enable the vIOMMU interrupt remapping function.
+"x2apic" - Support x2apic mode with the interrupt remapping function.
-- 
1.8.3.1



* [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (3 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-22 11:09   ` Wei Liu
  2017-08-22 16:25   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 6/25] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
                   ` (19 subsequent siblings)
  24 siblings, 2 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

This patch adds libxc wrappers for the XEN_DOMCTL_viommu_op hypercall. This
hypercall comprises three sub-commands:
- query the capabilities of one specific type of vIOMMU emulated by Xen
- create a vIOMMU in the Xen hypervisor, given the viommu type, register-set
location and capabilities
- destroy the vIOMMU specified by viommu_id

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 tools/libxc/Makefile          |  1 +
 tools/libxc/include/xenctrl.h |  8 +++++
 tools/libxc/xc_viommu.c       | 81 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 90 insertions(+)
 create mode 100644 tools/libxc/xc_viommu.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 28b1857..8724df5 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -51,6 +51,7 @@ CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
 CTRL_SRCS-y       += xc_evtchn_compat.c
 CTRL_SRCS-y       += xc_gnttab_compat.c
 CTRL_SRCS-y       += xc_devicemodel_compat.c
+CTRL_SRCS-y       += xc_viommu.c
 
 GUEST_SRCS-y :=
 GUEST_SRCS-y += xg_private.c xc_suspend.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index bde8313..dfaa9d5 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -27,6 +27,7 @@
 #define __XEN_TOOLS__ 1
 #endif
 
+#include <errno.h>
 #include <unistd.h>
 #include <stddef.h>
 #include <stdint.h>
@@ -2499,6 +2500,13 @@ enum xc_static_cpu_featuremask {
 const uint32_t *xc_get_static_cpu_featuremask(enum xc_static_cpu_featuremask);
 const uint32_t *xc_get_feature_deep_deps(uint32_t feature);
 
+int xc_viommu_query_cap(xc_interface *xch, domid_t dom,
+                        uint64_t type, uint64_t *cap);
+int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
+                     uint64_t base_addr, uint64_t length, uint64_t cap,
+                     uint32_t *viommu_id);
+int xc_viommu_destroy(xc_interface *xch, domid_t dom, uint32_t viommu_id);
+
 #endif
 
 int xc_livepatch_upload(xc_interface *xch,
diff --git a/tools/libxc/xc_viommu.c b/tools/libxc/xc_viommu.c
new file mode 100644
index 0000000..04aae96
--- /dev/null
+++ b/tools/libxc/xc_viommu.c
@@ -0,0 +1,81 @@
+/*
+ * xc_viommu.c
+ *
+ * viommu related API functions.
+ *
+ * Copyright (C) 2017 Intel Corporation
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License, version 2.1, as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xc_private.h"
+
+int xc_viommu_query_cap(xc_interface *xch, domid_t dom,
+                        uint64_t type, uint64_t *cap)
+{
+    int rc;
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_viommu_op;
+    domctl.domain = (domid_t)dom;
+    domctl.u.viommu_op.cmd = XEN_DOMCTL_query_viommu_caps;
+    domctl.u.viommu_op.u.query_caps.viommu_type = type;
+
+    rc = do_domctl(xch, &domctl);
+    if ( !rc )
+        *cap = domctl.u.viommu_op.u.query_caps.capabilities;
+    return rc;
+}
+
+int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
+                     uint64_t base_addr, uint64_t length, uint64_t cap,
+                     uint32_t *viommu_id)
+{
+    int rc;
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_viommu_op;
+    domctl.domain = (domid_t)dom;
+    domctl.u.viommu_op.cmd = XEN_DOMCTL_create_viommu;
+    domctl.u.viommu_op.u.create_viommu.viommu_type = type;
+    domctl.u.viommu_op.u.create_viommu.base_address = base_addr;
+    domctl.u.viommu_op.u.create_viommu.length = length;
+    domctl.u.viommu_op.u.create_viommu.capabilities = cap;
+
+    rc = do_domctl(xch, &domctl);
+    if ( !rc )
+        *viommu_id = domctl.u.viommu_op.u.create_viommu.viommu_id;
+    return rc;
+}
+
+int xc_viommu_destroy(xc_interface *xch, domid_t dom, uint32_t viommu_id)
+{
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_viommu_op;
+    domctl.domain = (domid_t)dom;
+    domctl.u.viommu_op.cmd = XEN_DOMCTL_destroy_viommu;
+    domctl.u.viommu_op.u.destroy_viommu.viommu_id = viommu_id;
+
+    return do_domctl(xch, &domctl);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.3.1



* [PATCH V2 6/25] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (4 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-22 12:56   ` Wei Liu
  2017-08-09 20:34 ` [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Add DMAR table structures according to Chapter 8 "BIOS Considerations"
of the VT-d spec Rev. 2.4.

VTd spec:http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
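
For reference, a minimal sketch of the byte layout these structures are
meant to have when packed. The sketch_* names are hypothetical copies
made so the example compiles on its own. The 36-byte header layout is
an assumption about struct acpi_header, and the flexible array members
of the patch are left out so that only the fixed parts are measured.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
/* Assumed 36-byte ACPI system description table header. */
struct sketch_acpi_header {
    uint32_t signature;
    uint32_t length;
    uint8_t  revision;
    uint8_t  checksum;
    char     oem_id[6];
    char     oem_table_id[8];
    uint32_t oem_revision;
    uint32_t creator_id;
    uint32_t creator_revision;
};

/* Fixed part of the DMAR table: header plus 12 bytes. */
struct sketch_dmar {
    struct sketch_acpi_header header;
    uint8_t host_address_width;
    uint8_t flags;
    uint8_t reserved[10];
};

/* Fixed part of a hardware unit (DRHD) entry. */
struct sketch_drhd {
    uint16_t type;
    uint16_t length;
    uint8_t  flags;
    uint8_t  reserved;
    uint16_t pci_segment;
    uint64_t base_address;
};

/* Fixed part of a device scope entry; 2-byte path entries follow it. */
struct sketch_device_scope {
    uint8_t type;
    uint8_t length;
    uint8_t reserved[2];
    uint8_t enumeration_id;
    uint8_t bus;
};
#pragma pack(pop)

/* Bytes needed for one DMAR with one DRHD and one device scope. */
size_t dmar_total_size(size_t path_entries)
{
    return sizeof(struct sketch_dmar) + sizeof(struct sketch_drhd) +
           sizeof(struct sketch_device_scope) + 2 * path_entries;
}
```

With one path entry this gives 48 + 16 + 6 + 2 = 72 bytes.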

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 tools/libacpi/acpi2_0.h | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/tools/libacpi/acpi2_0.h b/tools/libacpi/acpi2_0.h
index 2619ba3..758a823 100644
--- a/tools/libacpi/acpi2_0.h
+++ b/tools/libacpi/acpi2_0.h
@@ -422,6 +422,65 @@ struct acpi_20_slit {
 };
 
 /*
+ * DMA Remapping Table header definition (DMAR)
+ */
+
+/*
+ * DMAR Flags.
+ */
+#define ACPI_DMAR_INTR_REMAP        (1 << 0)
+#define ACPI_DMAR_X2APIC_OPT_OUT    (1 << 1)
+
+struct acpi_dmar {
+    struct acpi_header header;
+    uint8_t host_address_width;
+    uint8_t flags;
+    uint8_t reserved[10];
+};
+
+/*
+ * Device Scope Types
+ */
+#define ACPI_DMAR_DEVICE_SCOPE_PCI_ENDPOINT             0x01
+#define ACPI_DMAR_DEVICE_SCOPE_PCI_SUB_HIERARACHY       0x02
+#define ACPI_DMAR_DEVICE_SCOPE_IOAPIC                   0x03
+#define ACPI_DMAR_DEVICE_SCOPE_HPET                     0x04
+#define ACPI_DMAR_DEVICE_SCOPE_ACPI_NAMESPACE_DEVICE    0x05
+
+struct dmar_device_scope {
+    uint8_t type;
+    uint8_t length;
+    uint8_t reserved[2];
+    uint8_t enumeration_id;
+    uint8_t bus;
+    uint16_t path[0];
+};
+
+/*
+ * DMA Remapping Hardware Unit Types
+ */
+#define ACPI_DMAR_TYPE_HARDWARE_UNIT        0x00
+#define ACPI_DMAR_TYPE_RESERVED_MEMORY      0x01
+#define ACPI_DMAR_TYPE_ATSR                 0x02
+#define ACPI_DMAR_TYPE_HARDWARE_AFFINITY    0x03
+#define ACPI_DMAR_TYPE_ANDD                 0x04
+
+/*
+ * DMA Remapping Hardware Unit Flags. All other bits are reserved and must be 0.
+ */
+#define ACPI_DMAR_INCLUDE_PCI_ALL   (1 << 0)
+
+struct acpi_dmar_hardware_unit {
+    uint16_t type;
+    uint16_t length;
+    uint8_t flags;
+    uint8_t reserved;
+    uint16_t pci_segment;
+    uint64_t base_address;
+    struct dmar_device_scope scope[0];
+};
+
+/*
  * Table Signatures.
  */
 #define ACPI_2_0_RSDP_SIGNATURE ASCII64('R','S','D',' ','P','T','R',' ')
@@ -435,6 +494,7 @@ struct acpi_20_slit {
 #define ACPI_2_0_WAET_SIGNATURE ASCII32('W','A','E','T')
 #define ACPI_2_0_SRAT_SIGNATURE ASCII32('S','R','A','T')
 #define ACPI_2_0_SLIT_SIGNATURE ASCII32('S','L','I','T')
+#define ACPI_2_0_DMAR_SIGNATURE ASCII32('D','M','A','R')
 
 /*
  * Table revision numbers.
@@ -449,6 +509,7 @@ struct acpi_20_slit {
 #define ACPI_1_0_FADT_REVISION 0x01
 #define ACPI_2_0_SRAT_REVISION 0x01
 #define ACPI_2_0_SLIT_REVISION 0x01
+#define ACPI_2_0_DMAR_REVISION 0x01
 
 #pragma pack ()
 
-- 
1.8.3.1



* [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (5 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 6/25] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-22 13:12   ` Wei Liu
  2017-08-22 16:41   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
                   ` (17 subsequent siblings)
  24 siblings, 2 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

The BIOS reports the remapping hardware units in a platform to system software
through the DMA Remapping Reporting (DMAR) ACPI table.
New fields are introduced in struct acpi_config for the DMAR table. The
toolstack sets these fields by parsing the guest's config file.
construct_dmar() is added to build the DMAR table from the new fields.
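
construct_dmar() finishes by calling set_checksum(). As a minimal
sketch of the usual ACPI rule (all bytes of a table sum to zero modulo
256), the helpers below are hypothetical and are not the libacpi
implementation. The 8-byte dummy table is likewise made up for the
example.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes in the table, modulo 256. */
uint8_t acpi_byte_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + table[i]);
    return sum;
}

/* Store the value at checksum_offset that makes the byte sum zero. */
void acpi_fix_checksum(uint8_t *table, size_t checksum_offset, size_t len)
{
    table[checksum_offset] = 0;
    table[checksum_offset] = (uint8_t)(0u - acpi_byte_sum(table, len));
}

/* Fix the checksum of a small dummy table and return its byte sum. */
uint8_t checksum_demo(void)
{
    uint8_t table[8] = { 'D', 'M', 'A', 'R', 0x48, 0, 0, 0 };

    acpi_fix_checksum(table, 7, sizeof(table));
    return acpi_byte_sum(table, sizeof(table));
}
```

The checksum byte is cleared first so that a stale value does not feed
into the sum.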

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 tools/libacpi/build.c   | 57 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/libacpi/libacpi.h |  9 ++++++++
 2 files changed, 66 insertions(+)

diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
index f9881c9..c7cc784 100644
--- a/tools/libacpi/build.c
+++ b/tools/libacpi/build.c
@@ -28,6 +28,10 @@
 
 #define ACPI_MAX_SECONDARY_TABLES 16
 
+#define VTD_HOST_ADDRESS_WIDTH 39
+#define I440_PSEUDO_BUS_PLATFORM 0xff
+#define I440_PSEUDO_DEVFN_IOAPIC 0x0
+
 #define align16(sz)        (((sz) + 15) & ~15)
 #define fixed_strcpy(d, s) strncpy((d), (s), sizeof(d))
 
@@ -303,6 +307,59 @@ static struct acpi_20_slit *construct_slit(struct acpi_ctxt *ctxt,
     return slit;
 }
 
+/*
+ * Only one DMA remapping hardware unit is exposed and all devices
+ * are under the remapping hardware unit. I/O APIC should be explicitly
+ * enumerated.
+ */
+struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
+                                 const struct acpi_config *config)
+{
+    struct acpi_dmar *dmar;
+    struct acpi_dmar_hardware_unit *drhd;
+    struct dmar_device_scope *scope;
+    unsigned int size;
+    unsigned int ioapic_scope_size = sizeof(*scope) + sizeof(scope->path[0]);
+
+    size = sizeof(*dmar) + sizeof(*drhd) + ioapic_scope_size;
+
+    dmar = ctxt->mem_ops.alloc(ctxt, size, 16);
+    if ( !dmar )
+        return NULL;
+
+    memset(dmar, 0, size);
+    dmar->header.signature = ACPI_2_0_DMAR_SIGNATURE;
+    dmar->header.revision = ACPI_2_0_DMAR_REVISION;
+    dmar->header.length = size;
+    fixed_strcpy(dmar->header.oem_id, ACPI_OEM_ID);
+    fixed_strcpy(dmar->header.oem_table_id, ACPI_OEM_TABLE_ID);
+    dmar->header.oem_revision = ACPI_OEM_REVISION;
+    dmar->header.creator_id   = ACPI_CREATOR_ID;
+    dmar->header.creator_revision = ACPI_CREATOR_REVISION;
+    dmar->host_address_width = VTD_HOST_ADDRESS_WIDTH - 1;
+    if ( config->iommu_intremap_supported )
+        dmar->flags = ACPI_DMAR_INTR_REMAP;
+    if ( !config->iommu_x2apic_supported )
+        dmar->flags |= ACPI_DMAR_X2APIC_OPT_OUT;
+
+    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
+    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
+    drhd->length = sizeof(*drhd) + ioapic_scope_size;
+    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
+    drhd->pci_segment = 0;
+    drhd->base_address = config->iommu_base_addr;
+
+    scope = &drhd->scope[0];
+    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
+    scope->length = ioapic_scope_size;
+    scope->enumeration_id = config->ioapic_id;
+    scope->bus = I440_PSEUDO_BUS_PLATFORM;
+    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;
+
+    set_checksum(dmar, offsetof(struct acpi_header, checksum), size);
+    return dmar;
+}
+
 static int construct_passthrough_tables(struct acpi_ctxt *ctxt,
                                         unsigned long *table_ptrs,
                                         int nr_tables,
diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
index 2ed1ecf..74778a5 100644
--- a/tools/libacpi/libacpi.h
+++ b/tools/libacpi/libacpi.h
@@ -20,6 +20,8 @@
 #ifndef __LIBACPI_H__
 #define __LIBACPI_H__
 
+#include <stdbool.h>
+
 #define ACPI_HAS_COM1              (1<<0)
 #define ACPI_HAS_COM2              (1<<1)
 #define ACPI_HAS_LPT1              (1<<2)
@@ -96,8 +98,15 @@ struct acpi_config {
     uint32_t ioapic_base_address;
     uint16_t pci_isa_irq_mask;
     uint8_t ioapic_id;
+
+    /* Emulated IOMMU features and location */
+    bool iommu_intremap_supported;
+    bool iommu_x2apic_supported;
+    uint64_t iommu_base_addr;
 };
 
+struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
+                                 const struct acpi_config *config);
 int acpi_build_tables(struct acpi_ctxt *ctxt, struct acpi_config *config);
 
 #endif /* __LIBACPI_H__ */
-- 
1.8.3.1



* [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (6 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-22 13:19   ` Wei Liu
  2017-08-22 16:48   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
                   ` (16 subsequent siblings)
  24 siblings, 2 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

A field, viommu, of type libxl_viommu_info is added to struct
libxl_domain_build_info. Several vIOMMU attributes can be specified in
the guest config file. These attributes are used for DMAR construction
and vIOMMU creation.
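
To make the KEY=VALUE,KEY=VALUE format concrete, here is a minimal
standalone sketch of a parser for it. It is not the xl implementation:
the sketch_* names are hypothetical, keys are accepted in any order
(the patch expects type first), and the defaults are the ones the man
page text states.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct sketch_viommu_opts {     /* hypothetical result container */
    char type[16];
    bool intremap;
    bool x2apic;
};

/* Parse "KEY=VALUE,KEY=VALUE"; returns 0 on success, 1 on bad input. */
int sketch_parse_viommu(const char *spec, struct sketch_viommu_opts *out)
{
    size_t n = strlen(spec) + 1;
    char *buf = malloc(n), *tok;
    int rc = 0;

    if (!buf)
        return 1;
    memcpy(buf, spec, n);       /* work on a private, writable copy */

    memset(out, 0, sizeof(*out));
    out->intremap = true;       /* defaults stated in the man page text */
    out->x2apic = true;

    for (tok = buf; tok && !rc; ) {
        char *next = strchr(tok, ',');
        char *val;

        if (next)
            *next++ = '\0';
        val = strchr(tok, '=');
        if (!val) {
            rc = 1;             /* token without '=' */
            break;
        }
        *val++ = '\0';

        if (!strcmp(tok, "type") && strlen(val) < sizeof(out->type))
            strcpy(out->type, val);
        else if (!strcmp(tok, "intremap"))
            out->intremap = strtoul(val, NULL, 0) != 0;
        else if (!strcmp(tok, "x2apic"))
            out->x2apic = strtoul(val, NULL, 0) != 0;
        else
            rc = 1;             /* unknown key or oversized type */
        tok = next;
    }

    if (!rc && strcmp(out->type, "intel_vtd"))
        rc = 1;                 /* only type the patch accepts */

    free(buf);                  /* released on every path */
    return rc;
}

/* Returns 1 when a typical string parses to the expected values. */
int sketch_parse_demo(void)
{
    struct sketch_viommu_opts o;

    return sketch_parse_viommu("type=intel_vtd,intremap=1,x2apic=0", &o) == 0 &&
           o.intremap && !o.x2apic;
}
```

A matching guest config line would look like
viommu = "type=intel_vtd,intremap=1,x2apic=0".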

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 docs/man/xl.cfg.pod.5.in    | 34 ++++++++++++++++++++++-
 tools/libxl/libxl_types.idl | 16 +++++++++++
 tools/xl/xl_parse.c         | 66 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
index 79cb2ea..f259e22 100644
--- a/docs/man/xl.cfg.pod.5.in
+++ b/docs/man/xl.cfg.pod.5.in
@@ -1545,7 +1545,39 @@ Do not provide a VM generation ID.
 See also "Virtual Machine Generation ID" by Microsoft:
 L<http://www.microsoft.com/en-us/download/details.aspx?id=30707>
 
-=back 
+=back
+
+=item B<viommu="VIOMMU_STRING">
+
+Specifies the vIOMMU to be provided to the guest.
+
+B<VIOMMU_STRING> has the form C<KEY=VALUE,KEY=VALUE,...> where:
+
+=over 4
+
+=item B<KEY=VALUE>
+
+Possible B<KEY>s are:
+
+=over 4
+
+=item B<type="STRING">
+
+Currently there is only one valid type:
+
+(x86 only) "intel_vtd" means providing an emulated Intel VT-d to the guest.
+
+=item B<intremap=BOOLEAN>
+
+Specifies whether the vIOMMU should support interrupt remapping.
+Defaults to 'true'.
+
+=item B<x2apic=BOOLEAN>
+
+Specifies whether the vIOMMU should support x2apic mode. Defaults to 'true'.
+Only valid for "intel_vtd".
+
+=back
 
 =head3 Guest Virtual Time Controls
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 8a9849c..7abd70c 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -450,6 +450,21 @@ libxl_altp2m_mode = Enumeration("altp2m_mode", [
     (3, "limited"),
     ], init_val = "LIBXL_ALTP2M_MODE_DISABLED")
 
+libxl_viommu_type = Enumeration("viommu_type", [
+    (1, "intel_vtd"),
+    ])
+
+libxl_viommu_info = Struct("viommu_info", [
+    ("u", KeyedUnion(None, libxl_viommu_type, "type",
+           [("intel_vtd", Struct(None, [
+                 ("x2apic",     libxl_defbool)]))
+           ])),
+    ("intremap",        libxl_defbool),
+    ("cap",             uint64),
+    ("base_addr",       uint64),
+    ("len",             uint64),
+    ])
+
 libxl_domain_build_info = Struct("domain_build_info",[
     ("max_vcpus",       integer),
     ("avail_vcpus",     libxl_bitmap),
@@ -506,6 +521,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     # 65000 which is reserved by the toolstack.
     ("device_tree",      string),
     ("acpi",             libxl_defbool),
+    ("viommu",           libxl_viommu_info),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
                 [("hvm", Struct(None, [("firmware",         string),
                                        ("bios",             libxl_bios_type),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 5c2bf17..11c4eb2 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -17,6 +17,7 @@
 #include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
+#include <xen/domctl.h>
 #include <xen/hvm/e820.h>
 #include <xen/hvm/params.h>
 
@@ -30,6 +31,9 @@
 
 extern void set_default_nic_values(libxl_device_nic *nic);
 
+#define VIOMMU_VTD_BASE_ADDR        0xfed90000ULL
+#define VIOMMU_VTD_REGISTER_LEN     0x1000ULL
+
 #define ARRAY_EXTEND_INIT__CORE(array,count,initfn,more)                \
     ({                                                                  \
         typeof((count)) array_extend_old_count = (count);               \
@@ -804,6 +808,61 @@ int parse_usbdev_config(libxl_device_usbdev *usbdev, char *token)
     return 0;
 }
 
+/* Parses viommu data and adds info into viommu
+ * Returns 1 if the input doesn't form a valid viommu
+ * or parsed values are not correct. Successful parse returns 0 */
+static int parse_viommu_config(libxl_viommu_info *viommu, const char *info)
+{
+    char *ptr, *oparg, *saveptr = NULL, *buf = xstrdup(info);
+
+    ptr = strtok_r(buf, ",", &saveptr);
+    if (MATCH_OPTION("type", ptr, oparg)) {
+        if (!strcmp(oparg, "intel_vtd")) {
+            viommu->type = LIBXL_VIOMMU_TYPE_INTEL_VTD;
+        } else {
+            fprintf(stderr, "Invalid viommu type: %s\n", oparg);
+            return 1;
+        }
+    } else {
+        fprintf(stderr, "viommu type should be set first: %s\n", info);
+        return 1;
+    }
+
+    ptr = strtok_r(NULL, ",", &saveptr);
+    if (MATCH_OPTION("intremap", ptr, oparg)) {
+        libxl_defbool_set(&viommu->intremap, !!strtoul(oparg, NULL, 0));
+    }
+
+    if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
+        for (ptr = strtok_r(NULL, ",", &saveptr); ptr;
+             ptr = strtok_r(NULL, ",", &saveptr)) {
+            if (MATCH_OPTION("x2apic", ptr, oparg)) {
+                libxl_defbool_set(&viommu->u.intel_vtd.x2apic,
+                                  !!strtoul(oparg, NULL, 0));
+            } else {
+                fprintf(stderr, "Unknown string `%s' in viommu spec\n", ptr);
+                return 1;
+            }
+        }
+
+        if (libxl_defbool_is_default(viommu->intremap))
+            libxl_defbool_set(&viommu->intremap, true);
+
+        if (!libxl_defbool_val(viommu->intremap)) {
+            fprintf(stderr, "Cannot create one virtual VTD without intremap\n");
+            return 1;
+        }
+
+        /* Set default values to unexposed fields */
+        viommu->base_addr = VIOMMU_VTD_BASE_ADDR;
+        viommu->len = VIOMMU_VTD_REGISTER_LEN;
+
+        /* Set desired capabilities */
+        viommu->cap = VIOMMU_CAP_IRQ_REMAPPING;
+    }
+    return 0;
+}
+
 void parse_config_data(const char *config_source,
                        const char *config_data,
                        int config_len,
@@ -1037,6 +1096,13 @@ void parse_config_data(const char *config_source,
     xlu_cfg_get_defbool(config, "driver_domain", &c_info->driver_domain, 0);
     xlu_cfg_get_defbool(config, "acpi", &b_info->acpi, 0);
 
+    if (!xlu_cfg_get_string(config, "viommu", &buf, 0)) {
+        if (parse_viommu_config(&b_info->viommu, buf)) {
+            fprintf(stderr, "ERROR: invalid viommu setting\n");
+            exit (1);
+        }
+    }
+
     switch(b_info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
         kernel_basename = libxl_basename(b_info->kernel);
-- 
1.8.3.1



* [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (7 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-17 11:32   ` Wei Liu
  2017-08-09 20:34 ` [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction Lan Tianyu
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

New logic is added to the tool stack to build an ACPI DMAR table for a
guest with one virtual VTD and to pass it through to the guest via the
existing mechanism. If there are already ACPI tables to pass through,
the tables are joined.
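
Joining here means appending the DMAR blob to whatever is already
queued for pass-through. A minimal sketch of that step, assuming a
hypothetical helper join_tables() that is not part of libxl:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated buffer holding a followed by b, or NULL on
 * allocation failure or if the combined length overflows 32 bits.
 * The caller frees the result.
 */
void *join_tables(const void *a, uint32_t a_len,
                  const void *b, uint32_t b_len, uint32_t *out_len)
{
    uint8_t *joined;

    if (a_len > UINT32_MAX - b_len)
        return NULL;
    joined = malloc((size_t)a_len + b_len);
    if (!joined)
        return NULL;
    memcpy(joined, a, a_len);
    memcpy(joined + a_len, b, b_len);
    *out_len = a_len + b_len;
    return joined;
}

/* Returns 1 when two dummy 4-byte blobs are joined in order. */
int join_demo(void)
{
    uint32_t len = 0;
    char *p = join_tables("SSDT", 4, "DMAR", 4, &len);
    int ok = p && len == 8 && !memcmp(p, "SSDTDMAR", 8);

    free(p);
    return ok;
}
```

The overflow check matters because the module length field is 32 bits
wide.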

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 tools/libxl/libxl_arch.h     |  5 +++++
 tools/libxl/libxl_dom.c      | 36 +++++++++++++++++++++++++++++++++
 tools/libxl/libxl_x86_acpi.c | 48 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 89 insertions(+)

diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h
index 5e1fc60..d8ddd60 100644
--- a/tools/libxl/libxl_arch.h
+++ b/tools/libxl/libxl_arch.h
@@ -78,6 +78,11 @@ int libxl__arch_extra_memory(libxl__gc *gc,
 int libxl__dom_load_acpi(libxl__gc *gc,
                          const libxl_domain_build_info *b_info,
                          struct xc_dom_image *dom);
+
+int libxl__dom_build_dmar(libxl__gc *gc,
+                          const libxl_domain_build_info *b_info,
+                          struct xc_dom_image *dom,
+                          void **data, int *len);
 #endif
 
 #endif
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index f54fd49..94c9196 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1060,6 +1060,42 @@ static int libxl__domain_firmware(libxl__gc *gc,
         }
     }
 
+    /*
+     * If a guest has one virtual VTD, build a DMAR table for it and join it
+     * with the existing content in acpi_modules so that the HVM firmware
+     * pass-through mechanism can deliver the DMAR table to the guest.
+     */
+    if (info->viommu.type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
+        datalen = 0;
+        e = libxl__dom_build_dmar(gc, info, dom, &data, &datalen);
+        if (e) {
+            LOGEV(ERROR, e, "failed to build DMAR table");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+        if (datalen) {
+            libxl__ptr_add(gc, data);
+            if (!dom->acpi_modules[0].data) {
+                dom->acpi_modules[0].data = data;
+                dom->acpi_modules[0].length = (uint32_t)datalen;
+            } else {
+                /* join tables */
+                void *newdata;
+                newdata = malloc(datalen + dom->acpi_modules[0].length);
+                if (!newdata) {
+                    LOGE(ERROR, "failed to join DMAR table to acpi modules");
+                    rc = ERROR_FAIL;
+                    goto out;
+                }
+                memcpy(newdata, dom->acpi_modules[0].data,
+                       dom->acpi_modules[0].length);
+                memcpy(newdata + dom->acpi_modules[0].length, data, datalen);
+                dom->acpi_modules[0].data = newdata;
+                dom->acpi_modules[0].length += (uint32_t)datalen;
+            }
+        }
+    }
+
     return 0;
 out:
     assert(rc != 0);
diff --git a/tools/libxl/libxl_x86_acpi.c b/tools/libxl/libxl_x86_acpi.c
index c0a6e32..1fa97ff 100644
--- a/tools/libxl/libxl_x86_acpi.c
+++ b/tools/libxl/libxl_x86_acpi.c
@@ -16,6 +16,7 @@
 #include "libxl_arch.h"
 #include <xen/hvm/hvm_info_table.h>
 #include <xen/hvm/e820.h>
+#include "libacpi/acpi2_0.h"
 #include "libacpi/libacpi.h"
 
 #include <xc_dom.h>
@@ -236,6 +237,53 @@ out:
     return rc;
 }
 
+static void *acpi_memalign(struct acpi_ctxt *ctxt, uint32_t size,
+                           uint32_t align)
+{
+    int ret;
+    void *ptr;
+
+    ret = posix_memalign(&ptr, align, size);
+    if (ret != 0 || !ptr)
+        return NULL;
+
+    return ptr;
+}
+
+int libxl__dom_build_dmar(libxl__gc *gc,
+                          const libxl_domain_build_info *b_info,
+                          struct xc_dom_image *dom,
+                          void **data, int *len)
+{
+    struct acpi_config config = { 0 };
+    struct acpi_ctxt ctxt;
+    void *table;
+
+    if ((b_info->type != LIBXL_DOMAIN_TYPE_HVM) ||
+        (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) ||
+        (b_info->viommu.type != LIBXL_VIOMMU_TYPE_INTEL_VTD))
+        return 0;
+
+    ctxt.mem_ops.alloc = acpi_memalign;
+    ctxt.mem_ops.v2p = virt_to_phys;
+    ctxt.mem_ops.free = acpi_mem_free;
+
+    if (libxl_defbool_val(b_info->viommu.intremap))
+        config.iommu_intremap_supported = true;
+    if (libxl_defbool_val(b_info->viommu.u.intel_vtd.x2apic))
+        config.iommu_x2apic_supported = true;
+    config.iommu_base_addr = b_info->viommu.base_addr;
+
+    config.ioapic_id = 1; /* the IOAPIC_ID used by HVM */
+
+    table = construct_dmar(&ctxt, &config);
+    if ( !table )
+        return ERROR_NOMEM;
+    *data = table;
+    *len = ((struct acpi_header *)table)->length;
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.8.3.1



* [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (8 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23  7:45   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

If the guest is configured to have a vIOMMU, create it during domain
construction.
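
Before creating the vIOMMU, the toolstack checks that every requested
capability is among those the hypervisor reports. A minimal sketch of
that subset test follows. The SKETCH_* bit positions are assumptions
for illustration, not the values from the public headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CAP_IRQ_REMAPPING (1ull << 0)  /* assumed bit position */
#define SKETCH_CAP_OTHER         (1ull << 1)  /* hypothetical second bit */

/* True when every requested capability bit is also set in supported. */
bool viommu_caps_ok(uint64_t supported, uint64_t requested)
{
    return (supported & requested) == requested;
}
```

Requesting nothing always passes, and a single unsupported bit makes
the whole request fail.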

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 tools/libxl/libxl_x86.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 455f6f0..ace20e5 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -341,8 +341,36 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
     if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM) {
         unsigned long shadow = DIV_ROUNDUP(d_config->b_info.shadow_memkb,
                                            1024);
+        libxl_viommu_info *viommu = &d_config->b_info.viommu;
+
         xc_shadow_control(ctx->xch, domid, XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
                           NULL, 0, &shadow, 0, NULL);
+
+        /* Check supported capabilities and create viommu */
+        if (viommu->type) {
+            uint32_t id;
+            uint64_t cap;
+
+            if (xc_viommu_query_cap(ctx->xch, domid, viommu->type, &cap)) {
+                LOGED(ERROR, domid, "failed to query vIOMMU's capabilities");
+                ret = ERROR_FAIL;
+                goto out;
+            }
+
+            if ((cap & viommu->cap) != viommu->cap) {
+                LOGED(ERROR, domid, "vIOMMU: Unsupported cap %"PRIu64, cap);
+                ret = ERROR_FAIL;
+                goto out;
+            }
+
+            ret = xc_viommu_create(ctx->xch, domid, viommu->type,
+                      viommu->base_addr, viommu->len, viommu->cap, &id);
+            if (ret) {
+                LOGED(ERROR, domid, "failed to create vIOMMU");
+                ret = ERROR_FAIL;
+                goto out;
+            }
+        }
     }
 
     if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV &&
-- 
1.8.3.1



* [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (9 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23  7:58   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 12/25] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

This patch adds create/destroy/query functions for the emulated VTD
and adapts them to the common vIOMMU abstraction.
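
To illustrate what adapting to a common abstraction can look like, here
is a minimal sketch of a callback table with one backend plugged in.
All sketch_* and vvtd_* names, the struct fields and the error codes
are hypothetical. The real viommu_ops layout is defined elsewhere in
this series and may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_viommu;                     /* per-domain instance */

struct sketch_viommu_ops {                /* hypothetical callback table */
    uint64_t (*query_caps)(void);
    int (*create)(struct sketch_viommu *v);
    int (*destroy)(struct sketch_viommu *v);
};

struct sketch_viommu {
    const struct sketch_viommu_ops *ops;
    uint64_t base_address;
    int created;
};

/* Backend callbacks standing in for the emulated VTD. */
static uint64_t vvtd_query_caps(void) { return 1ull << 0; }
static int vvtd_create(struct sketch_viommu *v) { v->created = 1; return 0; }
static int vvtd_destroy(struct sketch_viommu *v) { v->created = 0; return 0; }

const struct sketch_viommu_ops sketch_vvtd_ops = {
    .query_caps = vvtd_query_caps,
    .create     = vvtd_create,
    .destroy    = vvtd_destroy,
};

/* Generic layer: dispatch through whichever backend is attached. */
int sketch_viommu_create(struct sketch_viommu *v)
{
    if (!v || !v->ops || !v->ops->create)
        return -EINVAL;
    if (v->created)
        return -EEXIST;
    return v->ops->create(v);
}

/* Returns 1 when the first create succeeds and the second is refused. */
int ops_demo(void)
{
    struct sketch_viommu v = {
        .ops = &sketch_vvtd_ops,
        .base_address = 0xfed90000ull,
    };
    int first = sketch_viommu_create(&v);
    int second = sketch_viommu_create(&v);

    return first == 0 && second == -EEXIST;
}
```

The generic code never names the backend, so another vendor model would
only need to supply its own table.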

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/Makefile |   7 +-
 xen/drivers/passthrough/vtd/iommu.h  |  99 +++++++++++++++++-----
 xen/drivers/passthrough/vtd/vvtd.c   | 158 +++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/viommu.h         |   3 +
 4 files changed, 241 insertions(+), 26 deletions(-)
 create mode 100644 xen/drivers/passthrough/vtd/vvtd.c

diff --git a/xen/drivers/passthrough/vtd/Makefile b/xen/drivers/passthrough/vtd/Makefile
index f302653..163c7fe 100644
--- a/xen/drivers/passthrough/vtd/Makefile
+++ b/xen/drivers/passthrough/vtd/Makefile
@@ -1,8 +1,9 @@
 subdir-$(CONFIG_X86) += x86
 
-obj-y += iommu.o
 obj-y += dmar.o
-obj-y += utils.o
-obj-y += qinval.o
 obj-y += intremap.o
+obj-y += iommu.o
+obj-y += qinval.o
 obj-y += quirks.o
+obj-y += utils.o
+obj-$(CONFIG_VIOMMU) += vvtd.o
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 72c1a2e..55f3b6e 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -23,31 +23,54 @@
 #include <asm/msi.h>
 
 /*
- * Intel IOMMU register specification per version 1.0 public spec.
+ * Intel IOMMU register specification per version 2.4 public spec.
  */
 
-#define    DMAR_VER_REG    0x0    /* Arch version supported by this IOMMU */
-#define    DMAR_CAP_REG    0x8    /* Hardware supported capabilities */
-#define    DMAR_ECAP_REG    0x10    /* Extended capabilities supported */
-#define    DMAR_GCMD_REG    0x18    /* Global command register */
-#define    DMAR_GSTS_REG    0x1c    /* Global status register */
-#define    DMAR_RTADDR_REG    0x20    /* Root entry table */
-#define    DMAR_CCMD_REG    0x28    /* Context command reg */
-#define    DMAR_FSTS_REG    0x34    /* Fault Status register */
-#define    DMAR_FECTL_REG    0x38    /* Fault control register */
-#define    DMAR_FEDATA_REG    0x3c    /* Fault event interrupt data register */
-#define    DMAR_FEADDR_REG    0x40    /* Fault event interrupt addr register */
-#define    DMAR_FEUADDR_REG 0x44    /* Upper address register */
-#define    DMAR_AFLOG_REG    0x58    /* Advanced Fault control */
-#define    DMAR_PMEN_REG    0x64    /* Enable Protected Memory Region */
-#define    DMAR_PLMBASE_REG 0x68    /* PMRR Low addr */
-#define    DMAR_PLMLIMIT_REG 0x6c    /* PMRR low limit */
-#define    DMAR_PHMBASE_REG 0x70    /* pmrr high base addr */
-#define    DMAR_PHMLIMIT_REG 0x78    /* pmrr high limit */
-#define    DMAR_IQH_REG    0x80    /* invalidation queue head */
-#define    DMAR_IQT_REG    0x88    /* invalidation queue tail */
-#define    DMAR_IQA_REG    0x90    /* invalidation queue addr */
-#define    DMAR_IRTA_REG   0xB8    /* intr remap */
+#define DMAR_VER_REG            0x0  /* Arch version supported by this IOMMU */
+#define DMAR_CAP_REG            0x8  /* Hardware supported capabilities */
+#define DMAR_ECAP_REG           0x10 /* Extended capabilities supported */
+#define DMAR_GCMD_REG           0x18 /* Global command register */
+#define DMAR_GSTS_REG           0x1c /* Global status register */
+#define DMAR_RTADDR_REG         0x20 /* Root entry table */
+#define DMAR_CCMD_REG           0x28 /* Context command reg */
+#define DMAR_FSTS_REG           0x34 /* Fault Status register */
+#define DMAR_FECTL_REG          0x38 /* Fault control register */
+#define DMAR_FEDATA_REG         0x3c /* Fault event interrupt data register */
+#define DMAR_FEADDR_REG         0x40 /* Fault event interrupt addr register */
+#define DMAR_FEUADDR_REG        0x44 /* Upper address register */
+#define DMAR_AFLOG_REG          0x58 /* Advanced Fault control */
+#define DMAR_PMEN_REG           0x64 /* Enable Protected Memory Region */
+#define DMAR_PLMBASE_REG        0x68 /* PMRR Low addr */
+#define DMAR_PLMLIMIT_REG       0x6c /* PMRR low limit */
+#define DMAR_PHMBASE_REG        0x70 /* pmrr high base addr */
+#define DMAR_PHMLIMIT_REG       0x78 /* pmrr high limit */
+#define DMAR_IQH_REG            0x80 /* invalidation queue head */
+#define DMAR_IQT_REG            0x88 /* invalidation queue tail */
+#define DMAR_IQT_REG_HI         0x8c
+#define DMAR_IQA_REG            0x90 /* invalidation queue addr */
+#define DMAR_IQA_REG_HI         0x94
+#define DMAR_ICS_REG            0x9c /* Invalidation complete status */
+#define DMAR_IECTL_REG          0xa0 /* Invalidation event control */
+#define DMAR_IEDATA_REG         0xa4 /* Invalidation event data */
+#define DMAR_IEADDR_REG         0xa8 /* Invalidation event address */
+#define DMAR_IEUADDR_REG        0xac /* Invalidation event upper address */
+#define DMAR_IRTA_REG           0xb8 /* Interrupt remapping table addr */
+#define DMAR_IRTA_REG_HI        0xbc
+#define DMAR_PQH_REG            0xc0 /* Page request queue head */
+#define DMAR_PQH_REG_HI         0xc4
+#define DMAR_PQT_REG            0xc8 /* Page request queue tail */
+#define DMAR_PQT_REG_HI         0xcc
+#define DMAR_PQA_REG            0xd0 /* Page request queue address */
+#define DMAR_PQA_REG_HI         0xd4
+#define DMAR_PRS_REG            0xdc /* Page request status */
+#define DMAR_PECTL_REG          0xe0 /* Page request event control */
+#define DMAR_PEDATA_REG         0xe4 /* Page request event data */
+#define DMAR_PEADDR_REG         0xe8 /* Page request event address */
+#define DMAR_PEUADDR_REG        0xec /* Page event upper address */
+#define DMAR_MTRRCAP_REG        0x100 /* MTRR capability */
+#define DMAR_MTRRCAP_REG_HI     0x104
+#define DMAR_MTRRDEF_REG        0x108 /* MTRR default type */
+#define DMAR_MTRRDEF_REG_HI     0x10c
 
 #define OFFSET_STRIDE        (9)
 #define dmar_readl(dmar, reg) readl((dmar) + (reg))
@@ -58,6 +81,30 @@
 #define VER_MAJOR(v)        (((v) & 0xf0) >> 4)
 #define VER_MINOR(v)        ((v) & 0x0f)
 
+/* CAP_REG */
+#define DMA_DOMAIN_ID_SHIFT         16  /* 16-bit domain id for 64K domains */
+#define DMA_DOMAIN_ID_MASK          ((1UL << DMA_DOMAIN_ID_SHIFT) - 1)
+#define DMA_CAP_ND                  (((DMA_DOMAIN_ID_SHIFT - 4) / 2) & 7ULL)
+#define DMA_MGAW                    39  /* Maximum Guest Address Width */
+#define DMA_CAP_MGAW                (((DMA_MGAW - 1) & 0x3fULL) << 16)
+#define DMA_MAMV                    18ULL
+#define DMA_CAP_MAMV                (DMA_MAMV << 48)
+#define DMA_CAP_PSI                 (1ULL << 39)
+#define DMA_CAP_SLLPS               ((1ULL << 34) | (1ULL << 35))
+#define DMA_FRCD_REG_NR             1ULL
+#define DMA_CAP_NFR                 ((DMA_FRCD_REG_NR - 1) << 40)
+#define DMA_CAP_FRO_OFFSET          0x220ULL
+#define DMA_CAP_FRO                 (DMA_CAP_FRO_OFFSET << 20)
+
+/* Supported Adjusted Guest Address Widths */
+#define DMA_CAP_SAGAW_SHIFT         8
+#define DMA_CAP_SAGAW_MASK          (0x1fULL << DMA_CAP_SAGAW_SHIFT)
+ /* 39-bit AGAW, 3-level page-table */
+#define DMA_CAP_SAGAW_39bit         (0x2ULL << DMA_CAP_SAGAW_SHIFT)
+ /* 48-bit AGAW, 4-level page-table */
+#define DMA_CAP_SAGAW_48bit         (0x4ULL << DMA_CAP_SAGAW_SHIFT)
+#define DMA_CAP_SAGAW               DMA_CAP_SAGAW_39bit
+
 /*
  * Decoding Capability Register
  */
@@ -89,6 +136,12 @@
 #define cap_afl(c)        (((c) >> 3) & 1)
 #define cap_ndoms(c)        (1 << (4 + 2 * ((c) & 0x7)))
 
+/* ECAP_REG */
+#define DMA_ECAP_QI                 (1ULL << 1)
+#define DMA_ECAP_IR                 (1ULL << 3)
+#define DMA_ECAP_EIM                (1ULL << 4)
+#define DMA_ECAP_MHMV               (15ULL << 20)
+
 /*
  * Extended Capability Register
  */
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
new file mode 100644
index 0000000..353fafe
--- /dev/null
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -0,0 +1,158 @@
+/*
+ * vvtd.c
+ *
+ * virtualize VTD for HVM.
+ *
+ * Copyright (C) 2017 Chao Gao, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/domain_page.h>
+#include <xen/sched.h>
+#include <xen/types.h>
+#include <xen/viommu.h>
+#include <xen/xmalloc.h>
+#include <asm/current.h>
+#include <asm/hvm/domain.h>
+#include <asm/page.h>
+
+#include "iommu.h"
+
+struct hvm_hw_vvtd_regs {
+    uint8_t data[1024];
+};
+
+/* Status field of struct vvtd */
+#define VIOMMU_STATUS_DEFAULT                   (0)
+#define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED     (1 << 0)
+#define VIOMMU_STATUS_DMA_REMAPPING_ENABLED     (1 << 1)
+
+struct vvtd {
+    /* VIOMMU_STATUS_XXX */
+    int status;
+    /* Address range of remapping hardware register-set */
+    uint64_t base_addr;
+    uint64_t length;
+    /* Point back to the owner domain */
+    struct domain *domain;
+    struct hvm_hw_vvtd_regs *regs;
+    struct page_info *regs_page;
+};
+
+static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
+                                uint32_t value)
+{
+    *((uint32_t *)(&vtd->regs->data[reg])) = value;
+}
+
+static inline uint32_t vvtd_get_reg(struct vvtd *vtd, uint32_t reg)
+{
+    return *((uint32_t *)(&vtd->regs->data[reg]));
+}
+
+static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
+{
+    return *((uint8_t *)(&vtd->regs->data[reg]));
+}
+
+#define vvtd_get_reg_quad(vvtd, reg, val) do {  \
+    (val) = vvtd_get_reg(vvtd, (reg) + 4 );     \
+    (val) = (val) << 32;                        \
+    (val) += vvtd_get_reg(vvtd, reg);           \
+} while(0)
+#define vvtd_set_reg_quad(vvtd, reg, val) do {  \
+    vvtd_set_reg(vvtd, reg, (val));             \
+    vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
+} while(0)
+
+static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
+{
+    uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
+                   DMA_CAP_MGAW | DMA_CAP_SAGAW | DMA_CAP_ND;
+    uint64_t ecap = DMA_ECAP_IR | DMA_ECAP_EIM | DMA_ECAP_QI;
+
+    vvtd_set_reg(vvtd, DMAR_VER_REG, 0x10UL);
+    vvtd_set_reg_quad(vvtd, DMAR_CAP_REG, cap);
+    vvtd_set_reg_quad(vvtd, DMAR_ECAP_REG, ecap);
+    vvtd_set_reg(vvtd, DMAR_FECTL_REG, 0x80000000UL);
+    vvtd_set_reg(vvtd, DMAR_IECTL_REG, 0x80000000UL);
+}
+
+static u64 vvtd_query_caps(struct domain *d)
+{
+    return VIOMMU_CAP_IRQ_REMAPPING;
+}
+
+static int vvtd_create(struct domain *d, struct viommu *viommu)
+{
+    struct vvtd *vvtd;
+    int ret;
+
+    if ( !is_hvm_domain(d) || (viommu->length != PAGE_SIZE) ||
+        (~vvtd_query_caps(d) & viommu->caps) )
+        return -EINVAL;
+
+    ret = -ENOMEM;
+    vvtd = xmalloc_bytes(sizeof(struct vvtd));
+    if ( !vvtd )
+        return ret;
+
+    vvtd->regs_page = alloc_domheap_page(d, MEMF_no_owner);
+    if ( !vvtd->regs_page )
+        goto out1;
+
+    vvtd->regs = __map_domain_page_global(vvtd->regs_page);
+    if ( !vvtd->regs )
+        goto out2;
+    clear_page(vvtd->regs);
+
+    vvtd_reset(vvtd, viommu->caps);
+    vvtd->base_addr = viommu->base_address;
+    vvtd->length = viommu->length;
+    vvtd->domain = d;
+    vvtd->status = VIOMMU_STATUS_DEFAULT;
+    return 0;
+
+ out2:
+    free_domheap_page(vvtd->regs_page);
+ out1:
+    xfree(vvtd);
+    return ret;
+}
+
+static int vvtd_destroy(struct viommu *viommu)
+{
+    struct vvtd *vvtd = viommu->priv;
+
+    if ( vvtd )
+    {
+        unmap_domain_page_global(vvtd->regs);
+        free_domheap_page(vvtd->regs_page);
+        xfree(vvtd);
+    }
+    return 0;
+}
+
+struct viommu_ops vvtd_hvm_vmx_ops = {
+    .query_caps = vvtd_query_caps,
+    .create = vvtd_create,
+    .destroy = vvtd_destroy
+};
+
+static int vvtd_register(void)
+{
+    viommu_register_type(VIOMMU_TYPE_INTEL_VTD, &vvtd_hvm_vmx_ops);
+    return 0;
+}
+__initcall(vvtd_register);
diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
index 1e8d4be..b730e65 100644
--- a/xen/include/asm-x86/viommu.h
+++ b/xen/include/asm-x86/viommu.h
@@ -22,6 +22,9 @@
 
 #include <xen/viommu.h>
 #include <asm/types.h>
+#include <asm/processor.h>
+
+extern struct viommu_ops vvtd_hvm_vmx_ops;
 
 /* IRQ request type */
 #define VIOMMU_REQUEST_IRQ_MSI          0
-- 
1.8.3.1



* [PATCH V2 12/25] x86/vvtd: Add MMIO handler for VVTD
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (10 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23  8:27   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 13/25] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

This patch adds a VVTD MMIO handler to deal with guest MMIO accesses.
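
For readers who want to experiment outside Xen, the access rules the handler
applies (4-byte-aligned offsets, 4- or 8-byte lengths, 64-bit registers
stored as two 32-bit halves) can be sketched as below. This is not the patch
code: the ex_* names are hypothetical, memcpy() replaces the pointer casts,
and the bounds check against the 1024-byte register file is an addition
assumed for the example.

```c
#include <stdint.h>
#include <string.h>

#define EX_REGS_SIZE 1024u

struct ex_regs {
    uint8_t data[EX_REGS_SIZE];
};

/* Accept only 4-byte-aligned offsets with a length of 4 or 8 that fit. */
static int ex_access_ok(uint32_t offset, unsigned int len)
{
    if ( (offset & 3) || ((len != 4) && (len != 8)) )
        return 0;
    return offset <= EX_REGS_SIZE - len;
}

/* Read one 32-bit register at byte offset 'reg'. */
static uint32_t ex_get_reg(const struct ex_regs *r, uint32_t reg)
{
    uint32_t v;

    memcpy(&v, &r->data[reg], sizeof(v));
    return v;
}

/* Write one 32-bit register at byte offset 'reg'. */
static void ex_set_reg(struct ex_regs *r, uint32_t reg, uint32_t v)
{
    memcpy(&r->data[reg], &v, sizeof(v));
}

/* A 64-bit register is the low half at 'reg' and the high half at 'reg + 4'. */
static uint64_t ex_get_reg_quad(const struct ex_regs *r, uint32_t reg)
{
    return ((uint64_t)ex_get_reg(r, reg + 4) << 32) | ex_get_reg(r, reg);
}

static void ex_set_reg_quad(struct ex_regs *r, uint32_t reg, uint64_t v)
{
    ex_set_reg(r, reg, (uint32_t)v);
    ex_set_reg(r, reg + 4, (uint32_t)(v >> 32));
}
```

Returning the quad value from a function, rather than assigning through a
macro argument, is only a stylistic choice of the sketch.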

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/vvtd.c | 114 +++++++++++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)

diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 353fafe..94680e6 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -50,6 +50,38 @@ struct vvtd {
     struct page_info *regs_page;
 };
 
+#define __DEBUG_VVTD__
+#ifdef __DEBUG_VVTD__
+extern unsigned int vvtd_debug_level;
+#define VVTD_DBG_INFO     1
+#define VVTD_DBG_TRANS    (1<<1)
+#define VVTD_DBG_RW       (1<<2)
+#define VVTD_DBG_FAULT    (1<<3)
+#define VVTD_DBG_EOI      (1<<4)
+#define VVTD_DEBUG(lvl, _f, _a...) do { \
+    if ( vvtd_debug_level & lvl ) \
+        printk("VVTD %s:" _f "\n", __func__, ## _a);    \
+} while(0)
+#else
+#define VVTD_DEBUG(fmt...) do {} while(0)
+#endif
+
+unsigned int vvtd_debug_level __read_mostly;
+integer_param("vvtd_debug", vvtd_debug_level);
+
+struct vvtd *domain_vvtd(struct domain *d)
+{
+    struct viommu_info *info = &d->viommu;
+
+    BUILD_BUG_ON(NR_VIOMMU_PER_DOMAIN != 1);
+    return (info && info->viommu[0]) ? info->viommu[0]->priv : NULL;
+}
+
+static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
+{
+    return domain_vvtd(v->domain);
+}
+
 static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
                                 uint32_t value)
 {
@@ -76,6 +108,87 @@ static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
     vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
 } while(0)
 
+static int vvtd_range(struct vcpu *v, unsigned long addr)
+{
+    struct vvtd *vvtd = vcpu_vvtd(v);
+
+    if ( vvtd )
+        return (addr >= vvtd->base_addr) &&
+               (addr < vvtd->base_addr + PAGE_SIZE);
+    return 0;
+}
+
+static int vvtd_read(struct vcpu *v, unsigned long addr,
+                     unsigned int len, unsigned long *pval)
+{
+    struct vvtd *vvtd = vcpu_vvtd(v);
+    unsigned int offset = addr - vvtd->base_addr;
+    unsigned int offset_aligned = offset & ~3;
+
+    VVTD_DEBUG(VVTD_DBG_RW, "READ INFO: offset %x len %d.", offset, len);
+
+    if ( !pval )
+        return X86EMUL_UNHANDLEABLE;
+
+    if ( (offset & 3) || ((len != 4) && (len != 8)) )
+    {
+        VVTD_DEBUG(VVTD_DBG_RW, "Alignment or length is not canonical");
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    if ( len == 4 )
+        *pval = vvtd_get_reg(vvtd, offset_aligned);
+    else
+        vvtd_get_reg_quad(vvtd, offset_aligned, *pval);
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write(struct vcpu *v, unsigned long addr,
+                      unsigned int len, unsigned long val)
+{
+    struct vvtd *vvtd = vcpu_vvtd(v);
+    unsigned int offset = addr - vvtd->base_addr;
+    unsigned int offset_aligned = offset & ~0x3;
+    int ret;
+
+    VVTD_DEBUG(VVTD_DBG_RW, "WRITE INFO: offset %x len %d val %lx.",
+               offset, len, val);
+
+    if ( (offset & 3) || ((len != 4) && (len != 8)) )
+    {
+        VVTD_DEBUG(VVTD_DBG_RW, "Alignment or length is not canonical");
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    ret = X86EMUL_UNHANDLEABLE;
+    if ( len == 4 )
+    {
+        switch ( offset_aligned )
+        {
+        case DMAR_IEDATA_REG:
+        case DMAR_IEADDR_REG:
+        case DMAR_IEUADDR_REG:
+        case DMAR_FEDATA_REG:
+        case DMAR_FEADDR_REG:
+        case DMAR_FEUADDR_REG:
+            vvtd_set_reg(vvtd, offset_aligned, val);
+            ret = X86EMUL_OKAY;
+            break;
+
+        default:
+            break;
+        }
+    }
+
+    return ret;
+}
+
+static const struct hvm_mmio_ops vvtd_mmio_ops = {
+    .check = vvtd_range,
+    .read = vvtd_read,
+    .write = vvtd_write
+};
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
@@ -122,6 +235,7 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
     vvtd->length = viommu->length;
     vvtd->domain = d;
     vvtd->status = VIOMMU_STATUS_DEFAULT;
+    register_mmio_handler(d, &vvtd_mmio_ops);
     return 0;
 
  out2:
-- 
1.8.3.1



* [PATCH V2 13/25] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (11 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 12/25] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23  8:47   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 14/25] x86/vvtd: Process interrupt remapping request Lan Tianyu
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Software sets the SIRTP field of the Global Command register (GCMD_REG) to
set/update the interrupt remapping table pointer used by hardware. The
interrupt remapping table pointer is specified through the Interrupt
Remapping Table Address (IRTA_REG) register.

This patch emulates this operation and adds some new fields to struct vvtd
to track information about the interrupt remapping table (e.g. the table's
gfn and the maximum number of supported entries).
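
To make the IRTA_REG decoding concrete, here is a small standalone sketch
that mirrors the DMA_IRTA_ADDR/DMA_IRTA_EIME/DMA_IRTA_SIZE macros added
below. The struct and function names are hypothetical, and a 4K page size
(shift of 12) is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12

struct ex_irt_info {
    uint64_t gfn;        /* guest frame number of the table base */
    uint32_t nr_entries; /* 2^(S + 1), with S taken from bits [3:0] */
    bool eim;            /* Extended Interrupt Mode, bit 11 */
};

/* Split a raw IRTA_REG value into the fields the emulation tracks. */
static struct ex_irt_info ex_decode_irta(uint64_t irta)
{
    struct ex_irt_info info;

    info.gfn = (irta & ~0xfffULL) >> EX_PAGE_SHIFT;
    info.nr_entries = 1u << ((irta & 0xf) + 1);
    info.eim = (irta >> 11) & 1;
    return info;
}
```

With S = 7, for example, the table holds 256 entries, so valid indexes run
from 0 to 255.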

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  9 ++++-
 xen/drivers/passthrough/vtd/vvtd.c  | 73 +++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 55f3b6e..102b4f3 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -192,9 +192,16 @@
 #define DMA_GSTS_WBFS   (((u64)1) << 27)
 #define DMA_GSTS_QIES   (((u64)1) <<26)
 #define DMA_GSTS_IRES   (((u64)1) <<25)
-#define DMA_GSTS_SIRTPS (((u64)1) << 24)
+#define DMA_GSTS_SIRTPS_BIT     24
+#define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_BIT)
 #define DMA_GSTS_CFIS   (((u64)1) <<23)
 
+/* IRTA_REG */
+#define DMA_IRTA_ADDR(val)      ((val) & ~0xfffULL)
+#define DMA_IRTA_EIME(val)      (!!((val) & (1 << 11)))
+#define DMA_IRTA_S(val)         ((val) & 0xf)
+#define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))
+
 /* PMEN_REG */
 #define DMA_PMEN_EPM    (((u32)1) << 31)
 #define DMA_PMEN_PRS    (((u32)1) << 0)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 94680e6..8e8dbe6 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -46,6 +46,13 @@ struct vvtd {
     uint64_t length;
     /* Point back to the owner domain */
     struct domain *domain;
+    /* Is in Extended Interrupt Mode? */
+    bool eim;
+    /* Max remapping entries in IRT */
+    int irt_max_entry;
+    /* Interrupt remapping table base gfn */
+    uint64_t irt;
+
     struct hvm_hw_vvtd_regs *regs;
     struct page_info *regs_page;
 };
@@ -82,6 +89,11 @@ static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
     return domain_vvtd(v->domain);
 }
 
+static inline void __vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
+{
+    __set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
+}
+
 static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
                                 uint32_t value)
 {
@@ -108,6 +120,44 @@ static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
     vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
 } while(0)
 
+static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
+{
+    uint64_t irta;
+
+    if ( !(val & DMA_GCMD_SIRTP) )
+        return X86EMUL_OKAY;
+
+    vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
+    vvtd->irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
+    vvtd->irt_max_entry = DMA_IRTA_SIZE(irta);
+    vvtd->eim = DMA_IRTA_EIME(irta);
+    VVTD_DEBUG(VVTD_DBG_RW, "Update IR info (addr=%lx eim=%d size=%d).",
+               vvtd->irt, vvtd->eim, vvtd->irt_max_entry);
+    __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_SIRTPS_BIT);
+
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
+{
+    uint32_t orig = vvtd_get_reg(vvtd, DMAR_GSTS_REG);
+    uint32_t changed;
+
+    orig = orig & 0x96ffffff;    /* reset the one-shot bits */
+    changed = orig ^ val;
+
+    if ( !changed )
+        return X86EMUL_OKAY;
+    if ( (changed & (changed - 1)) )
+        VVTD_DEBUG(VVTD_DBG_RW, "Guest attempts to update multiple fields "
+                     "of GCMD_REG in one write transaction.");
+
+    if ( changed & DMA_GCMD_SIRTP )
+        vvtd_handle_gcmd_sirtp(vvtd, val);
+
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_range(struct vcpu *v, unsigned long addr)
 {
     struct vvtd *vvtd = vcpu_vvtd(v);
@@ -165,12 +215,18 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
     {
         switch ( offset_aligned )
         {
+        case DMAR_GCMD_REG:
+            ret = vvtd_write_gcmd(vvtd, val);
+            break;
+
         case DMAR_IEDATA_REG:
         case DMAR_IEADDR_REG:
         case DMAR_IEUADDR_REG:
         case DMAR_FEDATA_REG:
         case DMAR_FEADDR_REG:
         case DMAR_FEUADDR_REG:
+        case DMAR_IRTA_REG:
+        case DMAR_IRTA_REG_HI:
             vvtd_set_reg(vvtd, offset_aligned, val);
             ret = X86EMUL_OKAY;
             break;
@@ -179,6 +235,20 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             break;
         }
     }
+    else /* len == 8 */
+    {
+        switch ( offset_aligned )
+        {
+        case DMAR_IRTA_REG:
+            vvtd_set_reg_quad(vvtd, DMAR_IRTA_REG, val);
+            ret = X86EMUL_OKAY;
+            break;
+
+        default:
+            ret = X86EMUL_UNHANDLEABLE;
+            break;
+        }
+    }
 
     return ret;
 }
@@ -235,6 +305,9 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
     vvtd->length = viommu->length;
     vvtd->domain = d;
     vvtd->status = VIOMMU_STATUS_DEFAULT;
+    vvtd->eim = 0;
+    vvtd->irt = 0;
+    vvtd->irt_max_entry = 0;
     register_mmio_handler(d, &vvtd_mmio_ops);
     return 0;
 
-- 
1.8.3.1



* [PATCH V2 14/25] x86/vvtd: Process interrupt remapping request
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (12 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 13/25] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23  9:49   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 15/25] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

When an interrupt request in remappable format arrives, remapping hardware
computes the interrupt_index per the algorithm described in the
"Interrupt Remapping Table" section of the VTD spec, interprets the IRTE
and generates a remapped interrupt request.

This patch introduces viommu_handle_irq_request() to emulate how remapping
hardware handles such an interrupt request.
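
The index computation for the MSI case (see MSI_REMAP_ENTRY_INDEX below) can
be sketched in standalone form as follows. The bit positions used here
(index[14:0] in address bits 19:5, SHV in bit 3, index[15] in bit 2) are my
reading of the remappable MSI address layout and should be checked against
the spec; the ex_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compute the IRTE index from a remappable-format MSI address and data. */
static uint32_t ex_msi_irte_index(uint32_t addr_lo, uint32_t data)
{
    uint32_t index_0_14 = (addr_lo >> 5) & 0x7fff;
    uint32_t index_15 = (addr_lo >> 2) & 1;
    uint32_t shv = (addr_lo >> 3) & 1;

    /* The subhandle (low 16 bits of data) is added only when SHV is set. */
    return (index_15 << 15) + index_0_14 + (shv ? (data & 0xffff) : 0);
}

/* An index is usable only if it is below the number of table entries. */
static bool ex_index_in_range(uint32_t index, uint32_t nr_entries)
{
    return index < nr_entries;
}
```

Note that the range check treats the second argument as a count, so an
index equal to it is rejected.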

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  21 +++
 xen/drivers/passthrough/vtd/vtd.h   |   6 +
 xen/drivers/passthrough/vtd/vvtd.c  | 276 +++++++++++++++++++++++++++++++++++-
 3 files changed, 302 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 102b4f3..70e64cf 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -244,6 +244,21 @@
 #define dma_frcd_source_id(c) (c & 0xffff)
 #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
 
+enum VTD_FAULT_TYPE
+{
+    /* Interrupt remapping transition faults */
+    VTD_FR_IR_REQ_RSVD = 0x20,   /* One or more IR request reserved
+                                  * fields set */
+    VTD_FR_IR_INDEX_OVER = 0x21, /* Index value greater than max */
+    VTD_FR_IR_ENTRY_P = 0x22,    /* Present (P) not set in IRTE */
+    VTD_FR_IR_ROOT_INVAL = 0x23, /* IR Root table invalid */
+    VTD_FR_IR_IRTE_RSVD = 0x24,  /* IRTE Rsvd field non-zero with
+                                  * Present flag set */
+    VTD_FR_IR_REQ_COMPAT = 0x25, /* Encountered compatible IR
+                                  * request while disabled */
+    VTD_FR_IR_SID_ERR = 0x26,    /* Invalid Source-ID */
+};
+
 /*
  * 0: Present
  * 1-11: Reserved
@@ -384,6 +399,12 @@ struct iremap_entry {
 };
 
 /*
+ * When VT-d doesn't enable Extended Interrupt Mode, hardware interprets
+ * only 8 bits ([15:8]) of the Destination-ID field in the IRTEs.
+ */
+#define IRTE_xAPIC_DEST_MASK 0xff00
+
+/*
  * Posted-interrupt descriptor address is 64 bits with 64-byte aligned, only
  * the upper 26 bits of lest significiant 32 bits is available.
  */
diff --git a/xen/drivers/passthrough/vtd/vtd.h b/xen/drivers/passthrough/vtd/vtd.h
index bb8889f..1032b46 100644
--- a/xen/drivers/passthrough/vtd/vtd.h
+++ b/xen/drivers/passthrough/vtd/vtd.h
@@ -47,6 +47,8 @@ struct IO_APIC_route_remap_entry {
     };
 };
 
+#define IOAPIC_REMAP_ENTRY_INDEX(x) ((x.index_15 << 15) + x.index_0_14)
+
 struct msi_msg_remap_entry {
     union {
         u32 val;
@@ -65,4 +67,8 @@ struct msi_msg_remap_entry {
     u32	data;		/* msi message data */
 };
 
+#define MSI_REMAP_ENTRY_INDEX(x) ((x.address_lo.index_15 << 15) + \
+                                  x.address_lo.index_0_14 + \
+                                  (x.address_lo.SHV ? (uint16_t)x.data : 0))
+
 #endif // _VTD_H_
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 8e8dbe6..2bee352 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -23,11 +23,16 @@
 #include <xen/types.h>
 #include <xen/viommu.h>
 #include <xen/xmalloc.h>
+#include <asm/apic.h>
 #include <asm/current.h>
+#include <asm/event.h>
 #include <asm/hvm/domain.h>
+#include <asm/io_apic.h>
 #include <asm/page.h>
+#include <asm/p2m.h>
 
 #include "iommu.h"
+#include "vtd.h"
 
 struct hvm_hw_vvtd_regs {
     uint8_t data[1024];
@@ -38,6 +43,9 @@ struct hvm_hw_vvtd_regs {
 #define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED     (1 << 0)
 #define VIOMMU_STATUS_DMA_REMAPPING_ENABLED     (1 << 1)
 
+#define vvtd_irq_remapping_enabled(vvtd) \
+    (vvtd->status & VIOMMU_STATUS_IRQ_REMAPPING_ENABLED)
+
 struct vvtd {
     /* VIOMMU_STATUS_XXX */
     int status;
@@ -120,6 +128,140 @@ static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
     vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
 } while(0)
 
+static int map_guest_page(struct domain *d, uint64_t gfn, void **virt)
+{
+    struct page_info *p;
+
+    p = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
+    if ( !p )
+        return -EINVAL;
+
+    if ( !get_page_type(p, PGT_writable_page) )
+    {
+        put_page(p);
+        return -EINVAL;
+    }
+
+    *virt = __map_domain_page_global(p);
+    if ( !*virt )
+    {
+        put_page_and_type(p);
+        return -ENOMEM;
+    }
+    return 0;
+}
+
+static void unmap_guest_page(void *virt)
+{
+    struct page_info *page;
+
+    if ( !virt )
+        return;
+
+    virt = (void *)((unsigned long)virt & PAGE_MASK);
+    page = mfn_to_page(domain_page_map_to_mfn(virt));
+
+    unmap_domain_page_global(virt);
+    put_page_and_type(page);
+}
+
+static void vvtd_inj_irq(
+    struct vlapic *target,
+    uint8_t vector,
+    uint8_t trig_mode,
+    uint8_t delivery_mode)
+{
+    VVTD_DEBUG(VVTD_DBG_INFO, "dest=v%d, delivery_mode=%x vector=%d "
+               "trig_mode=%d.",
+               vlapic_vcpu(target)->vcpu_id, delivery_mode,
+               vector, trig_mode);
+
+    ASSERT((delivery_mode == dest_Fixed) ||
+           (delivery_mode == dest_LowestPrio));
+
+    vlapic_set_irq(target, vector, trig_mode);
+}
+
+static int vvtd_delivery(
+    struct domain *d, int vector,
+    uint32_t dest, uint8_t dest_mode,
+    uint8_t delivery_mode, uint8_t trig_mode)
+{
+    struct vlapic *target;
+    struct vcpu *v;
+
+    switch ( delivery_mode )
+    {
+    case dest_LowestPrio:
+        target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
+        if ( target != NULL )
+        {
+            vvtd_inj_irq(target, vector, trig_mode, delivery_mode);
+            break;
+        }
+        VVTD_DEBUG(VVTD_DBG_INFO, "null round robin: vector=%02x", vector);
+        break;
+
+    case dest_Fixed:
+        for_each_vcpu ( d, v )
+            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest,
+                                   dest_mode) )
+                vvtd_inj_irq(vcpu_vlapic(v), vector,
+                             trig_mode, delivery_mode);
+        break;
+
+    case dest_NMI:
+        for_each_vcpu ( d, v )
+            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode)
+                 && !test_and_set_bool(v->nmi_pending) )
+                vcpu_kick(v);
+        break;
+
+    default:
+        printk(XENLOG_G_WARNING
+               "%pv: Unsupported VTD delivery mode %d for Dom%d\n",
+               current, delivery_mode, d->domain_id);
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static uint32_t irq_remapping_request_index(struct irq_remapping_request *irq)
+{
+    if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
+    {
+        struct msi_msg_remap_entry msi_msg = { { irq->msg.msi.addr }, 0,
+                                               irq->msg.msi.data };
+
+        return MSI_REMAP_ENTRY_INDEX(msi_msg);
+    }
+    else if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
+    {
+        struct IO_APIC_route_remap_entry remap_rte = { { irq->msg.rte } };
+
+        return IOAPIC_REMAP_ENTRY_INDEX(remap_rte);
+    }
+    BUG();
+    return 0;
+}
+
+static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
+{
+    uint64_t irta;
+
+    vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
+    /* In xAPIC mode, only 8-bits([15:8]) are valid */
+    return DMA_IRTA_EIME(irta) ? dest : MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
+}
+
+static int vvtd_record_fault(struct vvtd *vvtd,
+                             struct irq_remapping_request *irq,
+                             int reason)
+{
+    return 0;
+}
+
 static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
 {
     uint64_t irta;
@@ -259,6 +401,137 @@ static const struct hvm_mmio_ops vvtd_mmio_ops = {
     .write = vvtd_write
 };
 
+static bool ir_sid_valid(struct iremap_entry *irte, uint32_t source_id)
+{
+    return true;
+}
+
+/*
+ * 'record_fault' is a flag indicating whether a fault should be recorded
+ * and the guest notified when a fault happens while fetching the vIRTE.
+ */
+static int vvtd_get_entry(struct vvtd *vvtd,
+                          struct irq_remapping_request *irq,
+                          struct iremap_entry *dest,
+                          bool record_fault)
+{
+    int ret;
+    uint32_t entry = irq_remapping_request_index(irq);
+    struct iremap_entry  *irte, *irt_page;
+
+    VVTD_DEBUG(VVTD_DBG_TRANS, "interpret a request with index %x", entry);
+
+    if ( entry >= vvtd->irt_max_entry )
+    {
+        ret = VTD_FR_IR_INDEX_OVER;
+        goto handle_fault;
+    }
+
+    ret = map_guest_page(vvtd->domain, vvtd->irt + (entry >> IREMAP_ENTRY_ORDER),
+                         (void**)&irt_page);
+    if ( ret )
+    {
+        ret = VTD_FR_IR_ROOT_INVAL;
+        goto handle_fault;
+    }
+
+    irte = irt_page + (entry % (1 << IREMAP_ENTRY_ORDER));
+    dest->val = irte->val;
+    if ( !qinval_present(*irte) )
+    {
+        ret = VTD_FR_IR_ENTRY_P;
+        goto unmap_handle_fault;
+    }
+
+    /* Check reserved bits */
+    if ( (irte->remap.res_1 || irte->remap.res_2 || irte->remap.res_3 ||
+          irte->remap.res_4) )
+    {
+        ret = VTD_FR_IR_IRTE_RSVD;
+        goto unmap_handle_fault;
+    }
+
+    if ( !ir_sid_valid(irte, irq->source_id) )
+    {
+        ret = VTD_FR_IR_SID_ERR;
+        goto unmap_handle_fault;
+    }
+    unmap_guest_page(irt_page);
+    return 0;
+
+ unmap_handle_fault:
+    unmap_guest_page(irt_page);
+ handle_fault:
+    if ( !record_fault )
+        return ret;
+
+    switch ( ret )
+    {
+    case VTD_FR_IR_SID_ERR:
+    case VTD_FR_IR_IRTE_RSVD:
+    case VTD_FR_IR_ENTRY_P:
+        if ( qinval_fault_disable(*dest) )
+            break;
+    /* fall through */
+    case VTD_FR_IR_INDEX_OVER:
+    case VTD_FR_IR_ROOT_INVAL:
+        vvtd_record_fault(vvtd, irq, ret);
+        break;
+
+    default:
+        gdprintk(XENLOG_G_INFO, "Can't handle VT-d fault %x\n", ret);
+    }
+    return ret;
+}
+
+static int vvtd_irq_request_sanity_check(struct vvtd *vvtd,
+                                         struct irq_remapping_request *irq)
+{
+    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
+    {
+        struct IO_APIC_route_remap_entry rte = { { irq->msg.rte } };
+
+        ASSERT(rte.format);
+        return (!rte.reserved) ? 0 : VTD_FR_IR_REQ_RSVD;
+    }
+    else if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
+    {
+        struct msi_msg_remap_entry msi_msg = { { irq->msg.msi.addr } };
+
+        ASSERT(msi_msg.address_lo.format);
+        return 0;
+    }
+    BUG();
+    return 0;
+}
+
+static int vvtd_handle_irq_request(struct domain *d,
+                                   struct irq_remapping_request *irq)
+{
+    struct iremap_entry irte;
+    int ret;
+    struct vvtd *vvtd = domain_vvtd(d);
+
+    if ( !vvtd || !vvtd_irq_remapping_enabled(vvtd) )
+        return -EINVAL;
+
+    ret = vvtd_irq_request_sanity_check(vvtd, irq);
+    if ( ret )
+    {
+        vvtd_record_fault(vvtd, irq, ret);
+        return ret;
+    }
+
+    if ( !vvtd_get_entry(vvtd, irq, &irte, true) )
+    {
+        vvtd_delivery(vvtd->domain, irte.remap.vector,
+                      irte_dest(vvtd, irte.remap.dst), irte.remap.dm,
+                      irte.remap.dlm, irte.remap.tm);
+        return 0;
+    }
+    return -EFAULT;
+}
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
@@ -334,7 +607,8 @@ static int vvtd_destroy(struct viommu *viommu)
 struct viommu_ops vvtd_hvm_vmx_ops = {
     .query_caps = vvtd_query_caps,
     .create = vvtd_create,
-    .destroy = vvtd_destroy
+    .destroy = vvtd_destroy,
+    .handle_irq_request = vvtd_handle_irq_request
 };
 
 static int vvtd_register(void)
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH V2 15/25] x86/vvtd: decode interrupt attribute from IRTE
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (13 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 14/25] x86/vvtd: Process interrupt remapping request Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23  9:57   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 16/25] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Without interrupt remapping, interrupt attributes can be extracted from
the MSI message or the IOAPIC RTE. With interrupt remapping enabled, the
attributes are instead held in the associated IRTE. Add a get_irq_info
callback for callers that need to obtain the interrupt attributes.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
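Note (illustration only, not part of the patch): the decoding done by
vvtd_get_irq_info() can be sketched in isolation as below. The struct and
function names are hypothetical, and the bit positions (DM = bit 2,
DLM = bits 7:5, vector = bits 23:16, destination = bits 63:32 of the low
IRTE quadword) are assumptions taken from the VT-d specification rather
than from Xen's struct iremap_entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the fields vvtd_get_irq_info() fills in. */
struct irq_info_sketch {
    uint8_t vector;
    uint32_t dest;
    uint8_t dest_mode;
    uint8_t delivery_mode;
};

/*
 * Decode the low 64 bits of an IRTE (assumed layout, see the note above).
 * 'eim' says whether extended interrupt mode (x2APIC) is enabled.
 */
static void irte_decode_sketch(uint64_t irte_lo, bool eim,
                               struct irq_info_sketch *out)
{
    uint32_t dst = (uint32_t)(irte_lo >> 32);

    assert(out);

    out->vector = (irte_lo >> 16) & 0xff;
    out->dest_mode = (irte_lo >> 2) & 0x1;
    out->delivery_mode = (irte_lo >> 5) & 0x7;
    /* In xAPIC mode only bits 15:8 of the destination field are valid. */
    out->dest = eim ? dst : ((dst >> 8) & 0xff);
}
```

The xAPIC branch corresponds to what irte_dest() does with
IRTE_xAPIC_DEST_MASK in the patch.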
 xen/drivers/passthrough/vtd/vvtd.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 2bee352..374fd88 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -532,6 +532,25 @@ static int vvtd_handle_irq_request(struct domain *d,
     return -EFAULT;
 }
 
+static int vvtd_get_irq_info(struct domain *d,
+                             struct irq_remapping_request *irq,
+                             struct irq_remapping_info *info)
+{
+    int ret;
+    struct iremap_entry irte;
+    struct vvtd *vvtd = domain_vvtd(d);
+
+    ret = vvtd_get_entry(vvtd, irq, &irte, false);
+    if ( ret )
+        return ret;
+
+    info->vector = irte.remap.vector;
+    info->dest = irte_dest(vvtd, irte.remap.dst);
+    info->dest_mode = irte.remap.dm;
+    info->delivery_mode = irte.remap.dlm;
+    return 0;
+}
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
@@ -608,7 +627,8 @@ struct viommu_ops vvtd_hvm_vmx_ops = {
     .query_caps = vvtd_query_caps,
     .create = vvtd_create,
     .destroy = vvtd_destroy,
-    .handle_irq_request = vvtd_handle_irq_request
+    .handle_irq_request = vvtd_handle_irq_request,
+    .get_irq_info = vvtd_get_irq_info
 };
 
 static int vvtd_register(void)
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH V2 16/25] x86/vioapic: Hook interrupt delivery of vIOAPIC
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (14 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 15/25] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23  9:59   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 17/25] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

When irq remapping is enabled, an IOAPIC Redirection Entry may be in
remapping format. If it is, generate an irq_remapping_request and call the
common vIOMMU abstraction's callback to handle this interrupt request. The
device model is responsible for checking the request's validity.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
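Note (illustration only, not part of the patch): a minimal sketch of how a
remapping format RTE can be recognised and its interrupt index extracted.
The helper names are hypothetical, and the bit positions (format = bit 48,
index[14:0] = bits 63:49, index[15] = bit 11) are assumptions taken from
the VT-d specification, not from struct IO_APIC_route_remap_entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of a remappable-format IOAPIC RTE. */
#define RTE_FORMAT_SHIFT      48
#define RTE_INDEX_0_14_SHIFT  49
#define RTE_INDEX_15_SHIFT    11

/* True if the RTE is in remapping format rather than legacy format. */
static bool rte_is_remap_format(uint64_t rte)
{
    return (rte >> RTE_FORMAT_SHIFT) & 1;
}

/* Reassemble the 16-bit interrupt remapping table index. */
static uint16_t rte_remap_index(uint64_t rte)
{
    uint16_t lo = (rte >> RTE_INDEX_0_14_SHIFT) & 0x7fff;
    uint16_t hi = (rte >> RTE_INDEX_15_SHIFT) & 0x1;

    /* Only meaningful for remapping format entries. */
    assert(rte_is_remap_format(rte));

    return (uint16_t)((hi << 15) | lo);
}
```

A legacy entry such as 0x31 (vector only) has the format bit clear and
would take the existing delivery path.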
 xen/arch/x86/hvm/vioapic.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 72cae93..322f33c 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -30,6 +30,7 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/viommu.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/vpic.h>
@@ -39,6 +40,8 @@
 #include <asm/event.h>
 #include <asm/io_apic.h>
 
+#include "../../../drivers/passthrough/vtd/vtd.h"
+
 /* HACK: Route IRQ0 only to VCPU0 to prevent time jumps. */
 #define IRQ0_SPECIAL_ROUTING 1
 
@@ -387,9 +390,20 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
     struct vlapic *target;
     struct vcpu *v;
     unsigned int irq = vioapic->base_gsi + pin;
+    struct IO_APIC_route_remap_entry rte = { { vioapic->redirtbl[pin].bits } };
 
     ASSERT(spin_is_locked(&d->arch.hvm_domain.irq_lock));
 
+    if ( rte.format )
+    {
+        struct irq_remapping_request request;
+
+        irq_request_ioapic_fill(&request, vioapic->id, rte.val);
+        /* Currently, only viommu 0 is supported */
+        viommu_handle_irq_request(d, 0, &request);
+        return;
+    }
+
     HVM_DBG_LOG(DBG_LEVEL_IOAPIC,
                 "dest=%x dest_mode=%x delivery_mode=%x "
                 "vector=%x trig_mode=%x",
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH V2 17/25] x86/vvtd: Enable Queued Invalidation through GCMD
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (15 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 16/25] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 10:03   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 18/25] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Software writes to the QIE field of GCMD to enable or disable queued
invalidation. This patch emulates the QIE field of GCMD.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
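Note (illustration only, not part of the patch): the QIE handling can be
sketched on a plain register struct as below. All names are hypothetical.
Bit 26 for QIES matches DMA_GSTS_QIES_BIT in this patch; bit 26 for the
GCMD QIE field is an assumption based on the VT-d specification.

```c
#include <assert.h>
#include <stdint.h>

#define GCMD_QIE_SKETCH   (1u << 26)  /* assumed position of GCMD.QIE */
#define GSTS_QIES_SKETCH  (1u << 26)  /* matches DMA_GSTS_QIES_BIT */

/* Hypothetical stand-in for the emulated register file. */
struct vtd_regs_sketch {
    uint32_t gsts;  /* global status register */
    uint64_t iqh;   /* invalidation queue head */
};

/* Reflect a guest write of GCMD.QIE in the status register. */
static void handle_gcmd_qie_sketch(struct vtd_regs_sketch *r, uint32_t val)
{
    assert(r);

    if ( val & GCMD_QIE_SKETCH )
        r->gsts |= GSTS_QIES_SKETCH;
    else
    {
        /* Disabling queued invalidation resets the queue head. */
        r->iqh = 0;
        r->gsts &= ~GSTS_QIES_SKETCH;
    }
}
```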
 xen/drivers/passthrough/vtd/iommu.h |  3 ++-
 xen/drivers/passthrough/vtd/vvtd.c  | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 70e64cf..82bf6bc 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -190,7 +190,8 @@
 #define DMA_GSTS_FLS    (((u64)1) << 29)
 #define DMA_GSTS_AFLS   (((u64)1) << 28)
 #define DMA_GSTS_WBFS   (((u64)1) << 27)
-#define DMA_GSTS_QIES   (((u64)1) <<26)
+#define DMA_GSTS_QIES_BIT       26
+#define DMA_GSTS_QIES           (((u64)1) << DMA_GSTS_QIES_BIT)
 #define DMA_GSTS_IRES   (((u64)1) <<25)
 #define DMA_GSTS_SIRTPS_BIT     24
 #define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_BIT)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 374fd88..470bc56 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -102,6 +102,11 @@ static inline void __vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
     return __set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
 }
 
+static inline void __vvtd_clear_bit(struct vvtd *vvtd, uint32_t reg, int nr)
+{
+    return __clear_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
+}
+
 static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
                                 uint32_t value)
 {
@@ -262,6 +267,21 @@ static int vvtd_record_fault(struct vvtd *vvtd,
     return 0;
 }
 
+static int vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)
+{
+    VVTD_DEBUG(VVTD_DBG_RW, "%sable Queue Invalidation.",
+               (val & DMA_GCMD_QIE) ? "En" : "Dis");
+
+    if ( val & DMA_GCMD_QIE )
+        __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_BIT);
+    else
+    {
+        vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, 0ULL);
+        __vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_BIT);
+    }
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
 {
     uint64_t irta;
@@ -296,6 +316,8 @@ static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
 
     if ( changed & DMA_GCMD_SIRTP )
         vvtd_handle_gcmd_sirtp(vvtd, val);
+    if ( changed & DMA_GCMD_QIE )
+        vvtd_handle_gcmd_qie(vvtd, val);
 
     return X86EMUL_OKAY;
 }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH V2 18/25] x86/vvtd: Enable Interrupt Remapping through GCMD
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (16 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 17/25] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 10:07   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Software writes the IRE field of GCMD to enable or disable interrupt
remapping. This patch emulates the IRE field of GCMD and reflects it in the
IRES field of GSTS.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
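Note (illustration only, not part of the patch): when a status flag is
toggled, clearing it needs an AND with the complement of the flag; an OR
with the complement would instead set every other bit. A minimal sketch,
with hypothetical names and flag values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STATUS_IR_ENABLED_SKETCH  (1u << 0)  /* hypothetical flag value */

/* Set or clear the "interrupt remapping enabled" flag in *status. */
static void status_set_ir_sketch(uint32_t *status, bool enable)
{
    assert(status);

    if ( enable )
        *status |= STATUS_IR_ENABLED_SKETCH;
    else
        *status &= ~STATUS_IR_ENABLED_SKETCH;  /* leaves other bits intact */
}
```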
 xen/drivers/passthrough/vtd/iommu.h |  3 ++-
 xen/drivers/passthrough/vtd/vvtd.c  | 27 +++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 82bf6bc..e323352 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -192,7 +192,8 @@
 #define DMA_GSTS_WBFS   (((u64)1) << 27)
 #define DMA_GSTS_QIES_BIT       26
 #define DMA_GSTS_QIES           (((u64)1) << DMA_GSTS_QIES_BIT)
-#define DMA_GSTS_IRES   (((u64)1) <<25)
+#define DMA_GSTS_IRES_BIT       25
+#define DMA_GSTS_IRES   (((u64)1) << DMA_GSTS_IRES_BIT)
 #define DMA_GSTS_SIRTPS_BIT     24
 #define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_BIT)
 #define DMA_GSTS_CFIS   (((u64)1) <<23)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 470bc56..eae8f11 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -282,6 +282,25 @@ static int vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)
     return X86EMUL_OKAY;
 }
 
+static int vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
+{
+    VVTD_DEBUG(VVTD_DBG_RW, "%sable Interrupt Remapping.",
+               (val & DMA_GCMD_IRE) ? "En" : "Dis");
+
+    if ( val & DMA_GCMD_IRE )
+    {
+        vvtd->status |= VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
+        __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
+    }
+    else
+    {
+        vvtd->status &= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
+        __vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
+    }
+
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
 {
     uint64_t irta;
@@ -289,6 +308,10 @@ static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
     if ( !(val & DMA_GCMD_SIRTP) )
         return X86EMUL_OKAY;
 
+    if ( vvtd_irq_remapping_enabled(vvtd) )
+        VVTD_DEBUG(VVTD_DBG_RW, "Update Interrupt Remapping Table when "
+                   "active.");
+
     vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
     vvtd->irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
     vvtd->irt_max_entry = DMA_IRTA_SIZE(irta);
@@ -318,6 +341,10 @@ static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
         vvtd_handle_gcmd_sirtp(vvtd, val);
     if ( changed & DMA_GCMD_QIE )
         vvtd_handle_gcmd_qie(vvtd, val);
+    if ( changed & DMA_GCMD_IRE )
+        vvtd_handle_gcmd_ire(vvtd, val);
+    if ( changed & ~(DMA_GCMD_QIE | DMA_GCMD_SIRTP | DMA_GCMD_IRE) )
+        gdprintk(XENLOG_INFO, "Only QIE,SIRTP,IRE in GCMD_REG are handled.\n");
 
     return X86EMUL_OKAY;
 }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (17 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 18/25] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 10:14   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 20/25] passthrough: move some fields of hvm_gmsi_info to a sub-structure Lan Tianyu
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

When an IOAPIC RTE is in remapping format, it doesn't contain the interrupt
vector. Instead, the RTE contains an index into the interrupt remapping
table, where the vector is stored. This patch gets the vector through a
vIOMMU interface.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/arch/x86/hvm/vioapic.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 322f33c..ff0742d 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -565,11 +565,27 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
 {
     unsigned int pin;
     const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
+    struct IO_APIC_route_remap_entry rte;
 
     if ( !vioapic )
         return -EINVAL;
 
-    return vioapic->redirtbl[pin].fields.vector;
+    rte.val = vioapic->redirtbl[pin].bits;
+    if ( rte.format )
+    {
+        int err;
+        struct irq_remapping_request request;
+        struct irq_remapping_info info;
+
+        irq_request_ioapic_fill(&request, vioapic->id, rte.val);
+        /* Currently, only viommu 0 is supported */
+        err = viommu_get_irq_info(vioapic->domain, 0, &request, &info);
+        return !err ? info.vector : -1;
+    }
+    else
+    {
+        return vioapic->redirtbl[pin].fields.vector;
+    }
 }
 
 int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH V2 20/25] passthrough: move some fields of hvm_gmsi_info to a sub-structure
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (18 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-09 20:34 ` [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

No functional change. This prepares for introducing new fields in
hvm_gmsi_info to manage remapping format MSIs bound to a physical MSI.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
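Note (illustration only, not part of the patch): a minimal sketch of the
layout this refactoring prepares for. The 'legacy' member matches this
patch; the 'intremap' member and its field types are assumptions based on
the field names used later in the series, and all *_sketch names are
hypothetical. Anonymous unions need C11 or the GNU extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct gmsi_info_sketch {
    union {
        /* Legacy format: vector and flags supplied directly. */
        struct {
            uint32_t gvec;
            uint32_t gflags;
        } legacy;
        /* Assumed remapping format view; field types are a guess. */
        struct {
            uint32_t source_id;
            uint32_t data;
            uint64_t addr;
        } intremap;
    };
    int dest_vcpu_id; /* -1: multi-dest, non-negative: dest_vcpu_id */
    bool posted;
};

/* Fill the legacy view, leaving the non-union members untouched. */
static void gmsi_set_legacy_sketch(struct gmsi_info_sketch *g,
                                   uint32_t gvec, uint32_t gflags)
{
    assert(g);

    g->legacy.gvec = gvec;
    g->legacy.gflags = gflags;
}
```

Because the two views share storage, a caller has to track which one is
active; later in the series this is done with the HVM_IRQ_DPCI_GUEST_MSI
and HVM_IRQ_DPCI_GUEST_MSI_IR flags.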
 xen/arch/x86/hvm/vmsi.c      |  4 ++--
 xen/drivers/passthrough/io.c | 32 ++++++++++++++++----------------
 xen/include/xen/hvm/irq.h    |  8 ++++++--
 3 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index a36692c..c4ec0ad 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -101,8 +101,8 @@ int vmsi_deliver(
 
 void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 {
-    uint32_t flags = pirq_dpci->gmsi.gflags;
-    int vector = pirq_dpci->gmsi.gvec;
+    uint32_t flags = pirq_dpci->gmsi.legacy.gflags;
+    int vector = pirq_dpci->gmsi.legacy.gvec;
     uint8_t dest = (uint8_t)flags;
     uint8_t dest_mode = !!(flags & VMSI_DM_MASK);
     uint8_t delivery_mode = (flags & VMSI_DELIV_MASK)
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index ffeaf70..4d457f6 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -348,8 +348,8 @@ int pt_irq_create_bind(
         {
             pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED | HVM_IRQ_DPCI_MACH_MSI |
                                HVM_IRQ_DPCI_GUEST_MSI;
-            pirq_dpci->gmsi.gvec = pt_irq_bind->u.msi.gvec;
-            pirq_dpci->gmsi.gflags = pt_irq_bind->u.msi.gflags;
+            pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
+            pirq_dpci->gmsi.legacy.gflags = pt_irq_bind->u.msi.gflags;
             /*
              * 'pt_irq_create_bind' can be called after 'pt_irq_destroy_bind'.
              * The 'pirq_cleanup_check' which would free the structure is only
@@ -381,8 +381,8 @@ int pt_irq_create_bind(
             }
             if ( unlikely(rc) )
             {
-                pirq_dpci->gmsi.gflags = 0;
-                pirq_dpci->gmsi.gvec = 0;
+                pirq_dpci->gmsi.legacy.gflags = 0;
+                pirq_dpci->gmsi.legacy.gvec = 0;
                 pirq_dpci->dom = NULL;
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
@@ -401,20 +401,20 @@ int pt_irq_create_bind(
             }
 
             /* If pirq is already mapped as vmsi, update guest data/addr. */
-            if ( pirq_dpci->gmsi.gvec != pt_irq_bind->u.msi.gvec ||
-                 pirq_dpci->gmsi.gflags != pt_irq_bind->u.msi.gflags )
+            if ( pirq_dpci->gmsi.legacy.gvec != pt_irq_bind->u.msi.gvec ||
+                 pirq_dpci->gmsi.legacy.gflags != pt_irq_bind->u.msi.gflags )
             {
                 /* Directly clear pending EOIs before enabling new MSI info. */
                 pirq_guest_eoi(info);
 
-                pirq_dpci->gmsi.gvec = pt_irq_bind->u.msi.gvec;
-                pirq_dpci->gmsi.gflags = pt_irq_bind->u.msi.gflags;
+                pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
+                pirq_dpci->gmsi.legacy.gflags = pt_irq_bind->u.msi.gflags;
             }
         }
         /* Calculate dest_vcpu_id for MSI-type pirq migration. */
-        dest = pirq_dpci->gmsi.gflags & VMSI_DEST_ID_MASK;
-        dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
-        delivery_mode = (pirq_dpci->gmsi.gflags & VMSI_DELIV_MASK) >>
+        dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
+        dest_mode = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
+        delivery_mode = (pirq_dpci->gmsi.legacy.gflags & VMSI_DELIV_MASK) >>
                          GFLAGS_SHIFT_DELIV_MODE;
 
         dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
@@ -427,7 +427,7 @@ int pt_irq_create_bind(
         {
             if ( delivery_mode == dest_LowestPrio )
                 vcpu = vector_hashing_dest(d, dest, dest_mode,
-                                           pirq_dpci->gmsi.gvec);
+                                           pirq_dpci->gmsi.legacy.gvec);
             if ( vcpu )
                 pirq_dpci->gmsi.posted = true;
         }
@@ -437,7 +437,7 @@ int pt_irq_create_bind(
         /* Use interrupt posting if it is supported. */
         if ( iommu_intpost )
             pi_update_irte(vcpu ? &vcpu->arch.hvm_vmx.pi_desc : NULL,
-                           info, pirq_dpci->gmsi.gvec);
+                           info, pirq_dpci->gmsi.legacy.gvec);
 
         break;
     }
@@ -817,10 +817,10 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
     int vector = (long)arg;
 
     if ( (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
-         (pirq_dpci->gmsi.gvec == vector) )
+         (pirq_dpci->gmsi.legacy.gvec == vector) )
     {
-        int dest = pirq_dpci->gmsi.gflags & VMSI_DEST_ID_MASK;
-        int dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
+        int dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
+        int dest_mode = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
 
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
                                dest_mode) )
diff --git a/xen/include/xen/hvm/irq.h b/xen/include/xen/hvm/irq.h
index 0d2c72c..5e736f8 100644
--- a/xen/include/xen/hvm/irq.h
+++ b/xen/include/xen/hvm/irq.h
@@ -62,8 +62,12 @@ struct dev_intx_gsi_link {
 #define GFLAGS_SHIFT_TRG_MODE       15
 
 struct hvm_gmsi_info {
-    uint32_t gvec;
-    uint32_t gflags;
+    union {
+        struct {
+            uint32_t gvec;
+            uint32_t gflags;
+        } legacy;
+    };
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
     bool posted; /* directly deliver to guest via VT-d PI? */
 };
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (19 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 20/25] passthrough: move some fields of hvm_gmsi_info to a sub-structure Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 10:41   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 22/25] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

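Introduce a new binding type, PT_IRQ_TYPE_MSI_IR, for binding a remapping
format MSI to a pirq, and provide new libxc interfaces,
xc_domain_update_msi_irq_remapping() and
xc_domain_unbind_msi_irq_remapping(), to manage such bindings.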

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 tools/libxc/include/xenctrl.h |  17 ++++++
 tools/libxc/xc_domain.c       |  53 +++++++++++++++++
 xen/drivers/passthrough/io.c  | 135 +++++++++++++++++++++++++++++++++++-------
 xen/include/public/domctl.h   |   7 +++
 xen/include/xen/hvm/irq.h     |   7 +++
 5 files changed, 198 insertions(+), 21 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index dfaa9d5..b0a9437 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1720,6 +1720,15 @@ int xc_domain_ioport_mapping(xc_interface *xch,
                              uint32_t nr_ports,
                              uint32_t add_mapping);
 
+int xc_domain_update_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr,
+    uint64_t gtable);
+
 int xc_domain_update_msi_irq(
     xc_interface *xch,
     uint32_t domid,
@@ -1734,6 +1743,14 @@ int xc_domain_unbind_msi_irq(xc_interface *xch,
                              uint32_t pirq,
                              uint32_t gflags);
 
+int xc_domain_unbind_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr);
+
 int xc_domain_bind_pt_irq(xc_interface *xch,
                           uint32_t domid,
                           uint8_t machine_irq,
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 3bab4e8..4b6a510 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1702,8 +1702,34 @@ int xc_deassign_dt_device(
     return rc;
 }
 
+int xc_domain_update_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr,
+    uint64_t gtable)
+{
+    int rc;
+    xen_domctl_bind_pt_irq_t *bind;
+
+    DECLARE_DOMCTL;
 
+    domctl.cmd = XEN_DOMCTL_bind_pt_irq;
+    domctl.domain = (domid_t)domid;
 
+    bind = &(domctl.u.bind_pt_irq);
+    bind->irq_type = PT_IRQ_TYPE_MSI_IR;
+    bind->machine_irq = pirq;
+    bind->u.msi_ir.source_id = source_id;
+    bind->u.msi_ir.data = data;
+    bind->u.msi_ir.addr = addr;
+    bind->u.msi_ir.gtable = gtable;
+
+    rc = do_domctl(xch, &domctl);
+    return rc;
+}
 
 int xc_domain_update_msi_irq(
     xc_interface *xch,
@@ -1732,6 +1758,33 @@ int xc_domain_update_msi_irq(
     return rc;
 }
 
+int xc_domain_unbind_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr)
+{
+    int rc;
+    xen_domctl_bind_pt_irq_t *bind;
+
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_unbind_pt_irq;
+    domctl.domain = (domid_t)domid;
+
+    bind = &(domctl.u.bind_pt_irq);
+    bind->irq_type = PT_IRQ_TYPE_MSI_IR;
+    bind->machine_irq = pirq;
+    bind->u.msi_ir.source_id = source_id;
+    bind->u.msi_ir.data = data;
+    bind->u.msi_ir.addr = addr;
+
+    rc = do_domctl(xch, &domctl);
+    return rc;
+}
+
 int xc_domain_unbind_msi_irq(
     xc_interface *xch,
     uint32_t domid,
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 4d457f6..0510887 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -276,6 +276,92 @@ static struct vcpu *vector_hashing_dest(const struct domain *d,
     return dest;
 }
 
+static inline void set_hvm_gmsi_info(struct hvm_gmsi_info *msi,
+                                     xen_domctl_bind_pt_irq_t *pt_irq_bind)
+{
+    if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI )
+    {
+        msi->legacy.gvec = pt_irq_bind->u.msi.gvec;
+        msi->legacy.gflags = pt_irq_bind->u.msi.gflags;
+    }
+    else if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_IR )
+    {
+        msi->intremap.source_id = pt_irq_bind->u.msi_ir.source_id;
+        msi->intremap.data = pt_irq_bind->u.msi_ir.data;
+        msi->intremap.addr = pt_irq_bind->u.msi_ir.addr;
+    }
+    else
+        BUG();
+}
+
+static inline void clear_hvm_gmsi_info(struct hvm_gmsi_info *msi, int irq_type)
+{
+    if ( irq_type == PT_IRQ_TYPE_MSI )
+    {
+        msi->legacy.gvec = 0;
+        msi->legacy.gflags = 0;
+    }
+    else if ( irq_type == PT_IRQ_TYPE_MSI_IR )
+    {
+        msi->intremap.source_id = 0;
+        msi->intremap.data = 0;
+        msi->intremap.addr = 0;
+    }
+    else
+        BUG();
+}
+
+static inline bool hvm_gmsi_info_need_update(struct hvm_gmsi_info *msi,
+                                         xen_domctl_bind_pt_irq_t *pt_irq_bind)
+{
+    if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI )
+        return ((msi->legacy.gvec != pt_irq_bind->u.msi.gvec) ||
+                (msi->legacy.gflags != pt_irq_bind->u.msi.gflags));
+    else if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_IR )
+        return ((msi->intremap.source_id != pt_irq_bind->u.msi_ir.source_id) ||
+                (msi->intremap.data != pt_irq_bind->u.msi_ir.data) ||
+                (msi->intremap.addr != pt_irq_bind->u.msi_ir.addr));
+    BUG();
+    return 0;
+}
+
+static int pirq_dpci_2_msi_attr(struct domain *d,
+                                struct hvm_pirq_dpci *pirq_dpci, uint8_t *gvec,
+                                uint8_t *dest, uint8_t *dm, uint8_t *dlm)
+{
+    int rc = 0;
+
+    if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI )
+    {
+        *gvec = pirq_dpci->gmsi.legacy.gvec;
+        *dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
+        *dm = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
+        *dlm = (pirq_dpci->gmsi.legacy.gflags & VMSI_DELIV_MASK) >>
+                GFLAGS_SHIFT_DELIV_MODE;
+    }
+    else if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI_IR )
+    {
+        struct irq_remapping_request request;
+        struct irq_remapping_info irq_info;
+
+        irq_request_msi_fill(&request, pirq_dpci->gmsi.intremap.source_id,
+                             pirq_dpci->gmsi.intremap.addr,
+                             pirq_dpci->gmsi.intremap.data);
+        /* Currently, only viommu 0 is supported */
+        rc = viommu_get_irq_info(d, 0, &request, &irq_info);
+        if ( !rc )
+        {
+            *gvec = irq_info.vector;
+            *dest = irq_info.dest;
+            *dm = irq_info.dest_mode;
+            *dlm = irq_info.delivery_mode;
+        }
+    }
+    else
+        BUG();
+    return rc;
+}
+
 int pt_irq_create_bind(
     struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
 {
@@ -339,17 +425,21 @@ int pt_irq_create_bind(
     switch ( pt_irq_bind->irq_type )
     {
     case PT_IRQ_TYPE_MSI:
+    case PT_IRQ_TYPE_MSI_IR:
     {
-        uint8_t dest, dest_mode, delivery_mode;
+        uint8_t dest = 0, dest_mode = 0, delivery_mode = 0, gvec;
         int dest_vcpu_id;
         const struct vcpu *vcpu;
+        bool ir = (pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_IR);
+        uint64_t gtable = ir ? pt_irq_bind->u.msi_ir.gtable :
+                          pt_irq_bind->u.msi.gtable;
 
         if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
         {
             pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED | HVM_IRQ_DPCI_MACH_MSI |
-                               HVM_IRQ_DPCI_GUEST_MSI;
-            pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
-            pirq_dpci->gmsi.legacy.gflags = pt_irq_bind->u.msi.gflags;
+                               (ir ? HVM_IRQ_DPCI_GUEST_MSI_IR :
+                                HVM_IRQ_DPCI_GUEST_MSI);
+            set_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind);
             /*
              * 'pt_irq_create_bind' can be called after 'pt_irq_destroy_bind'.
              * The 'pirq_cleanup_check' which would free the structure is only
@@ -364,9 +454,9 @@ int pt_irq_create_bind(
             pirq_dpci->dom = d;
             /* bind after hvm_irq_dpci is setup to avoid race with irq handler*/
             rc = pirq_guest_bind(d->vcpu[0], info, 0);
-            if ( rc == 0 && pt_irq_bind->u.msi.gtable )
+            if ( rc == 0 && gtable )
             {
-                rc = msixtbl_pt_register(d, info, pt_irq_bind->u.msi.gtable);
+                rc = msixtbl_pt_register(d, info, gtable);
                 if ( unlikely(rc) )
                 {
                     pirq_guest_unbind(d, info);
@@ -381,8 +471,7 @@ int pt_irq_create_bind(
             }
             if ( unlikely(rc) )
             {
-                pirq_dpci->gmsi.legacy.gflags = 0;
-                pirq_dpci->gmsi.legacy.gvec = 0;
+                clear_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind->irq_type);
                 pirq_dpci->dom = NULL;
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
@@ -392,7 +481,8 @@ int pt_irq_create_bind(
         }
         else
         {
-            uint32_t mask = HVM_IRQ_DPCI_MACH_MSI | HVM_IRQ_DPCI_GUEST_MSI;
+            uint32_t mask = HVM_IRQ_DPCI_MACH_MSI |
+                     (ir ? HVM_IRQ_DPCI_GUEST_MSI_IR : HVM_IRQ_DPCI_GUEST_MSI);
 
             if ( (pirq_dpci->flags & mask) != mask )
             {
@@ -401,29 +491,31 @@ int pt_irq_create_bind(
             }
 
             /* If pirq is already mapped as vmsi, update guest data/addr. */
-            if ( pirq_dpci->gmsi.legacy.gvec != pt_irq_bind->u.msi.gvec ||
-                 pirq_dpci->gmsi.legacy.gflags != pt_irq_bind->u.msi.gflags )
+            if ( hvm_gmsi_info_need_update(&pirq_dpci->gmsi, pt_irq_bind) )
             {
                 /* Directly clear pending EOIs before enabling new MSI info. */
                 pirq_guest_eoi(info);
 
-                pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
-                pirq_dpci->gmsi.legacy.gflags = pt_irq_bind->u.msi.gflags;
+                set_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind);
             }
         }
         /* Calculate dest_vcpu_id for MSI-type pirq migration. */
-        dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
-        dest_mode = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
-        delivery_mode = (pirq_dpci->gmsi.legacy.gflags & VMSI_DELIV_MASK) >>
-                         GFLAGS_SHIFT_DELIV_MODE;
-
-        dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
+        rc = pirq_dpci_2_msi_attr(d, pirq_dpci, &gvec, &dest, &dest_mode,
+                                  &delivery_mode);
+        if ( unlikely(rc) )
+        {
+            spin_unlock(&d->event_lock);
+            return -EFAULT;
+        }
+        else
+            dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
         pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
         spin_unlock(&d->event_lock);
 
         pirq_dpci->gmsi.posted = false;
         vcpu = (dest_vcpu_id >= 0) ? d->vcpu[dest_vcpu_id] : NULL;
-        if ( iommu_intpost )
+        /* Currently, don't use interrupt posting for guest's remapping MSIs */
+        if ( iommu_intpost && !ir )
         {
             if ( delivery_mode == dest_LowestPrio )
                 vcpu = vector_hashing_dest(d, dest, dest_mode,
@@ -435,7 +527,7 @@ int pt_irq_create_bind(
             hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
 
         /* Use interrupt posting if it is supported. */
-        if ( iommu_intpost )
+        if ( iommu_intpost && !ir )
             pi_update_irte(vcpu ? &vcpu->arch.hvm_vmx.pi_desc : NULL,
                            info, pirq_dpci->gmsi.legacy.gvec);
 
@@ -627,6 +719,7 @@ int pt_irq_destroy_bind(
         }
         break;
     case PT_IRQ_TYPE_MSI:
+    case PT_IRQ_TYPE_MSI_IR:
         break;
     default:
         return -EOPNOTSUPP;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 4b10f26..1adf032 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -555,6 +555,7 @@ typedef enum pt_irq_type_e {
     PT_IRQ_TYPE_MSI,
     PT_IRQ_TYPE_MSI_TRANSLATE,
     PT_IRQ_TYPE_SPI,    /* ARM: valid range 32-1019 */
+    PT_IRQ_TYPE_MSI_IR,
 } pt_irq_type_t;
 struct xen_domctl_bind_pt_irq {
     uint32_t machine_irq;
@@ -575,6 +576,12 @@ struct xen_domctl_bind_pt_irq {
             uint64_aligned_t gtable;
         } msi;
         struct {
+            uint32_t source_id;
+            uint32_t data;
+            uint64_t addr;
+            uint64_aligned_t gtable;
+        } msi_ir;
+        struct {
             uint16_t spi;
         } spi;
     } u;
diff --git a/xen/include/xen/hvm/irq.h b/xen/include/xen/hvm/irq.h
index 5e736f8..884e092 100644
--- a/xen/include/xen/hvm/irq.h
+++ b/xen/include/xen/hvm/irq.h
@@ -41,6 +41,7 @@ struct dev_intx_gsi_link {
 #define _HVM_IRQ_DPCI_GUEST_PCI_SHIFT           4
 #define _HVM_IRQ_DPCI_GUEST_MSI_SHIFT           5
 #define _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT        6
+#define _HVM_IRQ_DPCI_GUEST_MSI_IR_SHIFT        7 
 #define _HVM_IRQ_DPCI_TRANSLATE_SHIFT          15
 #define HVM_IRQ_DPCI_MACH_PCI        (1 << _HVM_IRQ_DPCI_MACH_PCI_SHIFT)
 #define HVM_IRQ_DPCI_MACH_MSI        (1 << _HVM_IRQ_DPCI_MACH_MSI_SHIFT)
@@ -49,6 +50,7 @@ struct dev_intx_gsi_link {
 #define HVM_IRQ_DPCI_GUEST_PCI       (1 << _HVM_IRQ_DPCI_GUEST_PCI_SHIFT)
 #define HVM_IRQ_DPCI_GUEST_MSI       (1 << _HVM_IRQ_DPCI_GUEST_MSI_SHIFT)
 #define HVM_IRQ_DPCI_IDENTITY_GSI    (1 << _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT)
+#define HVM_IRQ_DPCI_GUEST_MSI_IR    (1 << _HVM_IRQ_DPCI_GUEST_MSI_IR_SHIFT)
 #define HVM_IRQ_DPCI_TRANSLATE       (1 << _HVM_IRQ_DPCI_TRANSLATE_SHIFT)
 
 #define VMSI_DEST_ID_MASK 0xff
@@ -67,6 +69,11 @@ struct hvm_gmsi_info {
             uint32_t gvec;
             uint32_t gflags;
         } legacy;
+        struct {
+            uint32_t source_id;
+            uint32_t data;
+            uint64_t addr;
+        } intremap;
     };
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
     bool posted; /* directly deliver to guest via VT-d PI? */
-- 
1.8.3.1



* [PATCH V2 22/25] x86/vmsi: Hook delivering remapping format msi to guest
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (20 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 10:55   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults Lan Tianyu
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, julien.grall, jbeulich, Chao Gao

From: Chao Gao <chao.gao@intel.com>

The hypervisor delivers an MSI to an HVM guest in two situations. One is
when qemu sends a request to the hypervisor through XEN_DMOP_inject_msi.
The other is when a physical interrupt arrives that has been bound
to a guest MSI.

In the former case, the MSI is routed to the common vIOMMU layer if it is
in remapping format. In the latter case, if the pt irq is bound to a guest
remapping MSI, a new remapping MSI is constructed from the binding
information and routed to the common vIOMMU layer.
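
The two routing decisions described above can be sketched roughly as
follows. This is a minimal standalone illustration, not the code in the
patch: the macro values are copied from the diff, while enum msi_route,
route_injected_msi() and route_bound_msi() are hypothetical names
introduced only for this example.

```c
#include <stdint.h>

/* Values copied from the patch; everything else below is hypothetical. */
#define MSI_ADDR_INTEFORMAT_SHIFT   4
#define MSI_ADDR_INTEFORMAT_MASK    (1u << MSI_ADDR_INTEFORMAT_SHIFT)
#define HVM_IRQ_DPCI_GUEST_MSI      (1u << 5)
#define HVM_IRQ_DPCI_GUEST_MSI_IR   (1u << 7)

/* Hypothetical result type: where the MSI is handed next. */
enum msi_route {
    MSI_ROUTE_LEGACY,   /* decode vector/destination directly */
    MSI_ROUTE_VIOMMU,   /* build a remapping request for the vIOMMU layer */
};

/* Path 1: MSI injected by the device model (XEN_DMOP_inject_msi). */
static enum msi_route route_injected_msi(uint64_t addr)
{
    /* The interrupt-format bit in the address selects remapping format. */
    return (addr & MSI_ADDR_INTEFORMAT_MASK) ? MSI_ROUTE_VIOMMU
                                             : MSI_ROUTE_LEGACY;
}

/* Path 2: physical interrupt bound to a guest MSI; the bind flags decide. */
static enum msi_route route_bound_msi(uint32_t dpci_flags)
{
    return (dpci_flags & HVM_IRQ_DPCI_GUEST_MSI_IR) ? MSI_ROUTE_VIOMMU
                                                    : MSI_ROUTE_LEGACY;
}
```

For example, route_injected_msi(0xfee00010) selects the vIOMMU path because
bit 4 of the address is set, whereas 0xfee00000 stays on the legacy path.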

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/arch/x86/hvm/irq.c       | 11 ++++++++++
 xen/arch/x86/hvm/vmsi.c      | 14 ++++++++++--
 xen/drivers/passthrough/io.c | 51 +++++++++++++++++++++++++++++++++-----------
 xen/include/asm-x86/msi.h    |  3 +++
 4 files changed, 65 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index e425df9..12d83b3 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -26,6 +26,7 @@
 #include <asm/hvm/domain.h>
 #include <asm/hvm/support.h>
 #include <asm/msi.h>
+#include <asm/viommu.h>
 
 /* Must be called with hvm_domain->irq_lock hold */
 static void assert_gsi(struct domain *d, unsigned ioapic_gsi)
@@ -340,6 +341,16 @@ int hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
         >> MSI_DATA_TRIGGER_SHIFT;
     uint8_t vector = data & MSI_DATA_VECTOR_MASK;
 
+    if ( addr & MSI_ADDR_INTEFORMAT_MASK )
+    {
+        struct irq_remapping_request request;
+
+        irq_request_msi_fill(&request, 0, addr, data);
+        /* Currently, only viommu 0 is supported */
+        viommu_handle_irq_request(d, 0, &request);
+        return 0;
+    }
+
     if ( !vector )
     {
         int pirq = ((addr >> 32) & 0xffffff00) | dest;
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index c4ec0ad..75ceb19 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -114,9 +114,19 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
                 "vector=%x trig_mode=%x\n",
                 dest, dest_mode, delivery_mode, vector, trig_mode);
 
-    ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
+    ASSERT(pirq_dpci->flags & (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_GUEST_MSI_IR));
+    if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI_IR )
+    {
+        struct irq_remapping_request request;
 
-    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
+        irq_request_msi_fill(&request, pirq_dpci->gmsi.intremap.source_id,
+                             pirq_dpci->gmsi.intremap.addr,
+                             pirq_dpci->gmsi.intremap.data);
+        /* Currently, only viommu 0 is supported */
+        viommu_handle_irq_request(d, 0, &request);
+    }
+    else
+        vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
 }
 
 /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 0510887..3a086d8 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -139,7 +139,9 @@ static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci)
 
 bool_t pt_irq_need_timer(uint32_t flags)
 {
-    return !(flags & (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_TRANSLATE));
+    return !(flags & (HVM_IRQ_DPCI_GUEST_MSI_IR |
+                      HVM_IRQ_DPCI_GUEST_MSI |
+                      HVM_IRQ_DPCI_TRANSLATE));
 }
 
 static int pt_irq_guest_eoi(struct domain *d, struct hvm_pirq_dpci *pirq_dpci,
@@ -738,7 +740,8 @@ int pt_irq_destroy_bind(
     pirq = pirq_info(d, machine_gsi);
     pirq_dpci = pirq_dpci(pirq);
 
-    if ( hvm_irq_dpci && pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI )
+    if ( hvm_irq_dpci && (pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI) &&
+         (pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI_IR) )
     {
         unsigned int bus = pt_irq_bind->u.pci.bus;
         unsigned int device = pt_irq_bind->u.pci.device;
@@ -909,17 +912,39 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
 {
     int vector = (long)arg;
 
-    if ( (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
-         (pirq_dpci->gmsi.legacy.gvec == vector) )
+    if ( pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI )
     {
-        int dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
-        int dest_mode = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
+        if ( (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) &&
+             (pirq_dpci->gmsi.legacy.gvec == vector) )
+        {
+            int dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
+            int dest_mode = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
 
-        if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
-                               dest_mode) )
+            if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
+                                   dest_mode) )
+            {
+                __msi_pirq_eoi(pirq_dpci);
+                return 1;
+            }
+        }
+        else if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI_IR )
         {
-            __msi_pirq_eoi(pirq_dpci);
-            return 1;
+            int ret;
+            struct irq_remapping_request request;
+            struct irq_remapping_info irq_info;
+
+            irq_request_msi_fill(&request, pirq_dpci->gmsi.intremap.source_id,
+                                 pirq_dpci->gmsi.intremap.addr,
+                                 pirq_dpci->gmsi.intremap.data);
+            /* Currently, only viommu 0 is supported */
+            ret = viommu_get_irq_info(d, 0, &request, &irq_info);
+            if ( (!ret) && (irq_info.vector == vector) &&
+                 vlapic_match_dest(vcpu_vlapic(current), NULL, 0,
+                                   irq_info.dest, irq_info.dest_mode) )
+            {
+                __msi_pirq_eoi(pirq_dpci);
+                return 1;
+            }
         }
     }
 
@@ -954,14 +979,16 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
         {
             send_guest_pirq(d, pirq);
 
-            if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI )
+            if ( pirq_dpci->flags &
+                 (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_GUEST_MSI_IR) )
             {
                 spin_unlock(&d->event_lock);
                 return;
             }
         }
 
-        if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI )
+        if ( pirq_dpci->flags &
+             (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_GUEST_MSI_IR) )
         {
             vmsi_deliver_pirq(d, pirq_dpci);
             spin_unlock(&d->event_lock);
diff --git a/xen/include/asm-x86/msi.h b/xen/include/asm-x86/msi.h
index 37d37b8..5e94d07 100644
--- a/xen/include/asm-x86/msi.h
+++ b/xen/include/asm-x86/msi.h
@@ -49,6 +49,9 @@
 #define MSI_ADDR_REDIRECTION_CPU    (0 << MSI_ADDR_REDIRECTION_SHIFT)
 #define MSI_ADDR_REDIRECTION_LOWPRI (1 << MSI_ADDR_REDIRECTION_SHIFT)
 
+#define MSI_ADDR_INTEFORMAT_SHIFT   4
+#define MSI_ADDR_INTEFORMAT_MASK    (1 << MSI_ADDR_INTEFORMAT_SHIFT)
+
 #define MSI_ADDR_DEST_ID_SHIFT		12
 #define	 MSI_ADDR_DEST_ID_MASK		0x00ff000
 #define  MSI_ADDR_DEST_ID(dest)		(((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
-- 
1.8.3.1



* [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (21 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 22/25] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 11:51   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 24/25] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
  2017-08-09 20:34 ` [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d Lan Tianyu
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel; +Cc: Lan Tianyu, kevin.tian, julien.grall, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Interrupt translation faults are non-recoverable faults. When such a fault
is triggered, the fault information must be recorded in the Fault Recording
Registers and a vIOMMU MSI must be injected to notify the guest IOMMU
driver so that it can handle the fault.

This patch emulates how hardware handles interrupt translation
faults (more information about the process can be found in the VT-d spec,
chapter "Translation Faults", section "Non-Recoverable Fault
Reporting" and section "Non-Recoverable Logging").
Specifically, vvtd_record_fault() records the fault information and
vvtd_report_non_recoverable_fault() reports faults to software.
Currently, only Primary Fault Logging is supported and the Number of
Fault-recording Registers is 1.
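
The record-then-report flow described above can be modelled in isolation
as follows. This is only a sketch under stated assumptions: it keeps a
single fault record, mirrors the PFO/PPF/IM/IP bits as plain booleans, and
counts interrupts instead of injecting an MSI. struct fault_model,
report_fault() and record_fault() are hypothetical names and are not the
functions added by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one emulated VT-d unit with a single fault record. */
struct fault_model {
    bool f;            /* F bit: the fault recording register is in use */
    bool pfo;          /* FSTS: Primary Fault Overflow */
    bool ppf;          /* FSTS: Primary Pending Fault */
    bool im;           /* FECTL: Interrupt Mask */
    bool ip;           /* FECTL: Interrupt Pending */
    uint8_t reason;    /* recorded fault reason */
    uint16_t sid;      /* recorded source id */
    unsigned int irqs; /* fault event interrupts delivered to the guest */
};

enum fsts_bit { FSTS_PFO, FSTS_PPF };

static void report_fault(struct fault_model *m, enum fsts_bit bit)
{
    bool already_pending = m->pfo || m->ppf;

    if ( bit == FSTS_PFO )
        m->pfo = true;
    else
        m->ppf = true;

    /* An earlier, still unserviced condition suppresses a new event. */
    if ( already_pending )
        return;

    m->ip = true;
    if ( !m->im )
    {
        m->irqs++;      /* stands in for injecting the fault event MSI */
        m->ip = false;
    }
}

static void record_fault(struct fault_model *m, uint8_t reason, uint16_t sid)
{
    if ( m->pfo )       /* overflow already flagged: drop the fault */
        return;

    if ( m->f )         /* the only record is busy: flag overflow */
    {
        report_fault(m, FSTS_PFO);
        return;
    }

    m->reason = reason;
    m->sid = sid;
    m->f = true;
    report_fault(m, FSTS_PPF);
}
```

In this model the first fault fills the record and raises one interrupt, a
second fault before software clears F only sets the overflow bit, and a
masked unit (im set) leaves ip set until software unmasks.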

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  60 +++++++--
 xen/drivers/passthrough/vtd/vvtd.c  | 238 +++++++++++++++++++++++++++++++++++-
 2 files changed, 286 insertions(+), 12 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index e323352..a9e905b 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -226,26 +226,66 @@
 #define DMA_CCMD_CAIG_MASK(x) (((u64)x) & ((u64) 0x3 << 59))
 
 /* FECTL_REG */
-#define DMA_FECTL_IM (((u64)1) << 31)
+#define DMA_FECTL_IM_BIT 31
+#define DMA_FECTL_IM (1U << DMA_FECTL_IM_BIT)
+#define DMA_FECTL_IP_BIT 30
+#define DMA_FECTL_IP (1U << DMA_FECTL_IP_BIT)
 
 /* FSTS_REG */
-#define DMA_FSTS_PFO ((u64)1 << 0)
-#define DMA_FSTS_PPF ((u64)1 << 1)
-#define DMA_FSTS_AFO ((u64)1 << 2)
-#define DMA_FSTS_APF ((u64)1 << 3)
-#define DMA_FSTS_IQE ((u64)1 << 4)
-#define DMA_FSTS_ICE ((u64)1 << 5)
-#define DMA_FSTS_ITE ((u64)1 << 6)
-#define DMA_FSTS_FAULTS    DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE
+#define DMA_FSTS_PFO_BIT 0
+#define DMA_FSTS_PFO (1U << DMA_FSTS_PFO_BIT)
+#define DMA_FSTS_PPF_BIT 1
+#define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_BIT)
+#define DMA_FSTS_AFO (1U << 2)
+#define DMA_FSTS_APF (1U << 3)
+#define DMA_FSTS_IQE (1U << 4)
+#define DMA_FSTS_ICE (1U << 5)
+#define DMA_FSTS_ITE (1U << 6)
+#define DMA_FSTS_PRO_BIT 7
+#define DMA_FSTS_PRO (1U << DMA_FSTS_PRO_BIT)
+#define DMA_FSTS_FAULTS    (DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE | DMA_FSTS_PRO)
+#define DMA_FSTS_RW1CS     (DMA_FSTS_PFO | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE | DMA_FSTS_PRO)
 #define dma_fsts_fault_record_index(s) (((s) >> 8) & 0xff)
 
 /* FRCD_REG, 32 bits access */
-#define DMA_FRCD_F (((u64)1) << 31)
+#define DMA_FRCD_LEN            0x10
+#define DMA_FRCD0_OFFSET        0x0
+#define DMA_FRCD1_OFFSET        0x4
+#define DMA_FRCD2_OFFSET        0x8
+#define DMA_FRCD3_OFFSET        0xc
+#define DMA_FRCD3_FR_MASK       0xffUL
+#define DMA_FRCD_F_BIT 31
+#define DMA_FRCD_F ((u64)1 << DMA_FRCD_F_BIT)
+#define DMA_FRCD(idx, offset) (DMA_CAP_FRO_OFFSET + DMA_FRCD_LEN * idx + offset)
 #define dma_frcd_type(d) ((d >> 30) & 1)
 #define dma_frcd_fault_reason(c) (c & 0xff)
 #define dma_frcd_source_id(c) (c & 0xffff)
 #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
 
+struct vtd_fault_record_register
+{
+    union {
+        struct {
+            u64 lo;
+            u64 hi;
+        } bits;
+        struct {
+            u64 rsvd0   :12,
+                FI      :52; /* Fault Info */
+            u64 SID     :16, /* Source Identifier */
+                rsvd1   :9,
+                PRIV    :1,  /* Privilege Mode Requested */
+                EXE     :1,  /* Execute Permission Requested */
+                PP      :1,  /* PASID Present */
+                FR      :8,  /* Fault Reason */
+                PV      :20, /* PASID Value */
+                AT      :2,  /* Address Type */
+                T       :1,  /* Type. (0) Write (1) Read/AtomicOp */
+                F       :1;  /* Fault */
+        } fields;
+    };
+};
+
 enum VTD_FAULT_TYPE
 {
     /* Interrupt remapping transition faults */
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index eae8f11..f1e6d01 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -19,6 +19,7 @@
  */
 
 #include <xen/domain_page.h>
+#include <xen/lib.h>
 #include <xen/sched.h>
 #include <xen/types.h>
 #include <xen/viommu.h>
@@ -30,6 +31,7 @@
 #include <asm/io_apic.h>
 #include <asm/page.h>
 #include <asm/p2m.h>
+#include <asm/system.h>
 
 #include "iommu.h"
 #include "vtd.h"
@@ -49,6 +51,8 @@ struct hvm_hw_vvtd_regs {
 struct vvtd {
     /* VIOMMU_STATUS_XXX */
     int status;
+    /* Fault Recording index */
+    int frcd_idx;
     /* Address range of remapping hardware register-set */
     uint64_t base_addr;
     uint64_t length;
@@ -97,6 +101,23 @@ static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
     return domain_vvtd(v->domain);
 }
 
+static inline int vvtd_test_and_set_bit(struct vvtd *vvtd, uint32_t reg,
+                                        int nr)
+{
+    return test_and_set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
+}
+
+static inline int vvtd_test_and_clear_bit(struct vvtd *vvtd, uint32_t reg,
+                                          int nr)
+{
+    return test_and_clear_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
+}
+
+static inline int vvtd_test_bit(struct vvtd *vvtd, uint32_t reg, int nr)
+{
+    return test_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
+}
+
 static inline void __vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
 {
     return __set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
@@ -232,6 +253,24 @@ static int vvtd_delivery(
     return 0;
 }
 
+void vvtd_generate_interrupt(struct vvtd *vvtd,
+                             uint32_t addr,
+                             uint32_t data)
+{
+    uint8_t dest, dm, dlm, tm, vector;
+
+    VVTD_DEBUG(VVTD_DBG_FAULT, "Sending interrupt %x %x to d%d",
+               addr, data, vvtd->domain->domain_id);
+
+    dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
+    dm = !!(addr & MSI_ADDR_DESTMODE_MASK);
+    dlm = (data & MSI_DATA_DELIVERY_MODE_MASK) >> MSI_DATA_DELIVERY_MODE_SHIFT;
+    tm = (data & MSI_DATA_TRIGGER_MASK) >> MSI_DATA_TRIGGER_SHIFT;
+    vector = data & MSI_DATA_VECTOR_MASK;
+
+    vvtd_delivery(vvtd->domain, vector, dest, dm, dlm, tm);
+}
+
 static uint32_t irq_remapping_request_index(struct irq_remapping_request *irq)
 {
     if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
@@ -260,11 +299,189 @@ static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
     return DMA_IRTA_EIME(irta) ? dest : MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
 }
 
+static void vvtd_report_non_recoverable_fault(struct vvtd *vvtd, int reason)
+{
+    uint32_t fsts;
+
+    ASSERT(reason & DMA_FSTS_FAULTS);
+    fsts = vvtd_get_reg(vvtd, DMAR_FSTS_REG);
+    __vvtd_set_bit(vvtd, DMAR_FSTS_REG, reason);
+
+    /*
+     * According to the VT-d spec "Non-Recoverable Fault Event" chapter, if
+     * there are any previously reported interrupt conditions that are yet to
+     * be serviced by software, the Fault Event interrupt is not generated.
+     */
+    if ( fsts & DMA_FSTS_FAULTS )
+        return;
+
+    __vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
+    if ( !vvtd_test_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_BIT) )
+    {
+        uint32_t fe_data, fe_addr;
+        fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
+        fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);
+        vvtd_generate_interrupt(vvtd, fe_addr, fe_data);
+        __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
+    }
+}
+
+static void vvtd_recomputing_ppf(struct vvtd *vvtd)
+{
+    int i;
+
+    for ( i = 0; i < DMA_FRCD_REG_NR; i++ )
+    {
+        if ( vvtd_test_bit(vvtd, DMA_FRCD(i, DMA_FRCD3_OFFSET),
+                           DMA_FRCD_F_BIT) )
+        {
+            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PPF_BIT);
+            return;
+        }
+    }
+    /*
+     * No Primary Fault is in Fault Record Registers, thus clear PPF bit in
+     * FSTS.
+     */
+    __vvtd_clear_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PPF_BIT);
+
+    /* If no fault is in FSTS, clear pending bit in FECTL. */
+    if ( !(vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS) )
+        __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
+}
+
+/*
+ * Commit a frcd to emulated Fault Record Registers.
+ */
+static void vvtd_commit_frcd(struct vvtd *vvtd, int idx,
+                             struct vtd_fault_record_register *frcd)
+{
+    vvtd_set_reg_quad(vvtd, DMA_FRCD(idx, DMA_FRCD0_OFFSET), frcd->bits.lo);
+    vvtd_set_reg_quad(vvtd, DMA_FRCD(idx, DMA_FRCD2_OFFSET), frcd->bits.hi);
+    vvtd_recomputing_ppf(vvtd);
+}
+
+/*
+ * Allocate a FRCD for the caller. On success, return the FRI; on failure,
+ * return -1.
+ */
+static int vvtd_alloc_frcd(struct vvtd *vvtd)
+{
+    int prev;
+
+    /* Set the F bit to indicate the FRCD is in use. */
+    if ( vvtd_test_and_set_bit(vvtd, DMA_FRCD(vvtd->frcd_idx, DMA_FRCD3_OFFSET),
+                               DMA_FRCD_F_BIT) )
+    {
+        prev = vvtd->frcd_idx;
+        vvtd->frcd_idx = (prev + 1) % DMA_FRCD_REG_NR;
+        return vvtd->frcd_idx;
+    }
+    return -1;
+}
+
+static void vvtd_free_frcd(struct vvtd *vvtd, int i)
+{
+    __vvtd_clear_bit(vvtd, DMA_FRCD(i, DMA_FRCD3_OFFSET), DMA_FRCD_F_BIT);
+}
+
 static int vvtd_record_fault(struct vvtd *vvtd,
-                             struct irq_remapping_request *irq,
+                             struct irq_remapping_request *request,
                              int reason)
 {
-    return 0;
+    struct vtd_fault_record_register frcd;
+    int frcd_idx;
+
+    switch(reason)
+    {
+    case VTD_FR_IR_REQ_RSVD:
+    case VTD_FR_IR_INDEX_OVER:
+    case VTD_FR_IR_ENTRY_P:
+    case VTD_FR_IR_ROOT_INVAL:
+    case VTD_FR_IR_IRTE_RSVD:
+    case VTD_FR_IR_REQ_COMPAT:
+    case VTD_FR_IR_SID_ERR:
+        if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PFO_BIT) )
+            return X86EMUL_OKAY;
+
+        /* No available Fault Record means Fault overflowed */
+        frcd_idx = vvtd_alloc_frcd(vvtd);
+        if ( frcd_idx == -1 )
+        {
+            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PFO_BIT);
+            return X86EMUL_OKAY;
+        }
+        memset(&frcd, 0, sizeof(frcd));
+        frcd.fields.FR = (u8)reason;
+        frcd.fields.FI = ((u64)irq_remapping_request_index(request)) << 36;
+        frcd.fields.SID = (u16)request->source_id;
+        frcd.fields.F = 1;
+        vvtd_commit_frcd(vvtd, frcd_idx, &frcd);
+        return X86EMUL_OKAY;
+
+    default:
+        break;
+    }
+
+    gdprintk(XENLOG_ERR, "Can't handle vVTD Fault (reason 0x%x).", reason);
+    domain_crash(vvtd->domain);
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
+{
+    /* Writing a 1 means clear fault */
+    if ( val & DMA_FRCD_F )
+    {
+        vvtd_free_frcd(vvtd, 0);
+        vvtd_recomputing_ppf(vvtd);
+    }
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
+{
+    /*
+     * Only DMA_FECTL_IM bit is writable. Generate pending event when unmask.
+     */
+    if ( !(val & DMA_FECTL_IM) )
+    {
+        /* Clear IM */
+        __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_BIT);
+        if ( vvtd_test_and_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT) )
+        {
+            uint32_t fe_data, fe_addr;
+            fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
+            fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);
+            vvtd_generate_interrupt(vvtd, fe_addr, fe_data);
+        }
+    }
+    else
+        __vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_BIT);
+
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
+{
+    int i, max_fault_index = DMA_FSTS_PRO_BIT;
+    uint64_t bits_to_clear = val & DMA_FSTS_RW1CS;
+
+    i = find_first_bit(&bits_to_clear, max_fault_index / 8 + 1);
+    while ( i <= max_fault_index )
+    {
+        __vvtd_clear_bit(vvtd, DMAR_FSTS_REG, i);
+        i = find_next_bit(&bits_to_clear, max_fault_index / 8 + 1, i + 1);
+    }
+
+    /*
+     * Clear the IP field when all status fields in the Fault Status Register
+     * are clear.
+     */
+    if ( !((vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS)) )
+        __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
+
+    return X86EMUL_OKAY;
 }
 
 static int vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)
@@ -410,6 +627,18 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             ret = vvtd_write_gcmd(vvtd, val);
             break;
 
+        case DMAR_FSTS_REG:
+            ret = vvtd_write_fsts(vvtd, val);
+            break;
+
+        case DMAR_FECTL_REG:
+            ret = vvtd_write_fectl(vvtd, val);
+            break;
+
+        case DMA_CAP_FRO_OFFSET + DMA_FRCD3_OFFSET:
+            ret = vvtd_write_frcd3(vvtd, val);
+            break;
+
         case DMAR_IEDATA_REG:
         case DMAR_IEADDR_REG:
         case DMAR_IEUADDR_REG:
@@ -435,6 +664,10 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             ret = X86EMUL_OKAY;
             break;
 
+        case DMA_CAP_FRO_OFFSET + DMA_FRCD2_OFFSET:
+            ret = vvtd_write_frcd3(vvtd, val >> 32);
+            break;
+
         default:
             ret = X86EMUL_UNHANDLEABLE;
             break;
@@ -649,6 +882,7 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
     vvtd->eim = 0;
     vvtd->irt = 0;
     vvtd->irt_max_entry = 0;
+    vvtd->frcd_idx = 0;
     register_mmio_handler(d, &vvtd_mmio_ops);
     return 0;
 
-- 
1.8.3.1



* [PATCH V2 24/25] x86/vvtd: Add queued invalidation (QI) support
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (22 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 12:16   ` Roger Pau Monné
  2017-08-09 20:34 ` [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d Lan Tianyu
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel; +Cc: Lan Tianyu, kevin.tian, julien.grall, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Queued Invalidation Interface is an expanded invalidation interface with
extended capabilities. Hardware implementations report support for queued
invalidation interface through the Extended Capability Register. The queued
invalidation interface uses an Invalidation Queue (IQ), which is a circular
buffer in system memory. Software submits commands by writing Invalidation
Descriptors to the IQ.

In this patch, a new function vvtd_process_iq() is used to emulate how
hardware handles invalidation requests through QI.
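
For readers who want the queue walk in isolation, here is a minimal sketch
of draining a circular Invalidation Queue from head to tail. It is not the
code from this patch: iq_max_entry(), iq_drain() and the counting handler
are hypothetical names, and the sketch assumes 16-byte descriptors, so
that one 4 KiB page holds 256 entries and the queue holds 256 << QS.

```c
#include <assert.h>
#include <stdint.h>

#define IQ_ENTRIES_PER_PAGE 256u  /* assumed: 4 KiB page / 16-byte descriptor */

/* Number of descriptors in a queue whose size field is qs (2^qs pages). */
static uint64_t iq_max_entry(unsigned int qs)
{
    return (uint64_t)IQ_ENTRIES_PER_PAGE << qs;
}

/* Hypothetical per-descriptor callback: returns 0 on success. */
typedef int (*iq_handler_t)(uint64_t index, void *ctx);

/*
 * Walk the ring from head to tail, wrapping at max_entry. Returns the new
 * head: equal to tail when every descriptor was handled, otherwise the
 * index of the descriptor whose handler failed.
 */
static uint64_t iq_drain(uint64_t head, uint64_t tail, unsigned int qs,
                         iq_handler_t handle, void *ctx)
{
    uint64_t max_entry = iq_max_entry(qs);

    assert(head < max_entry && tail < max_entry);

    while ( head != tail )
    {
        if ( handle(head, ctx) )
            break;
        head = (head + 1) % max_entry;
    }

    return head;
}

/* Example handler: counts descriptors and fails on one chosen index. */
struct iq_count {
    unsigned int seen;
    uint64_t fail_at;
};

static int iq_count_handler(uint64_t index, void *ctx)
{
    struct iq_count *c = ctx;

    if ( index == c->fail_at )
        return -1;

    c->seen++;
    return 0;
}
```

On failure the returned head stays on the offending descriptor, which
matches the rule that IQH references the descriptor associated with the
error while IQE is set.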

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  29 ++++-
 xen/drivers/passthrough/vtd/vvtd.c  | 244 ++++++++++++++++++++++++++++++++++++
 2 files changed, 272 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index a9e905b..eac0fbe 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -204,6 +204,32 @@
 #define DMA_IRTA_S(val)         (val & 0xf)
 #define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))
 
+/* IQH_REG */
+#define DMA_IQH_QH_SHIFT        4
+#define DMA_IQH_QH(val)         ((val >> 4) & 0x7fffULL)
+
+/* IQT_REG */
+#define DMA_IQT_QT_SHIFT        4
+#define DMA_IQT_QT(val)         ((val >> 4) & 0x7fffULL)
+#define DMA_IQT_RSVD            0xfffffffffff80007ULL
+
+/* IQA_REG */
+#define DMA_MGAW                39  /* Maximum Guest Address Width */
+#define DMA_IQA_ADDR(val)       (val & ~0xfffULL)
+#define DMA_IQA_QS(val)         (val & 0x7)
+#define DMA_IQA_ENTRY_PER_PAGE  (1 << 8)
+#define DMA_IQA_RSVD            (~((1ULL << DMA_MGAW) -1 ) | 0xff8ULL)
+
+/* IECTL_REG */
+#define DMA_IECTL_IM_BIT 31
+#define DMA_IECTL_IM            (1 << DMA_IECTL_IM_BIT)
+#define DMA_IECTL_IP_BIT 30
+#define DMA_IECTL_IP (((u64)1) << DMA_IECTL_IP_BIT)
+
+/* ICS_REG */
+#define DMA_ICS_IWC_BIT         0
+#define DMA_ICS_IWC             (1 << DMA_ICS_IWC_BIT)
+
 /* PMEN_REG */
 #define DMA_PMEN_EPM    (((u32)1) << 31)
 #define DMA_PMEN_PRS    (((u32)1) << 0)
@@ -238,7 +264,8 @@
 #define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_BIT)
 #define DMA_FSTS_AFO (1U << 2)
 #define DMA_FSTS_APF (1U << 3)
-#define DMA_FSTS_IQE (1U << 4)
+#define DMA_FSTS_IQE_BIT 4
+#define DMA_FSTS_IQE (1U << DMA_FSTS_IQE_BIT)
 #define DMA_FSTS_ICE (1U << 5)
 #define DMA_FSTS_ITE (1U << 6)
 #define DMA_FSTS_PRO_BIT 7
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index f1e6d01..4f5e28e 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -428,6 +428,185 @@ static int vvtd_record_fault(struct vvtd *vvtd,
     return X86EMUL_OKAY;
 }
 
+/*
+ * Process an invalidation descriptor. Currently, only two types of
+ * descriptor are handled: the Interrupt Entry Cache Invalidation Descriptor
+ * and the Invalidation Wait Descriptor.
+ * @vvtd: the virtual vtd instance
+ * @i: the index of the invalidation descriptor to be processed
+ *
+ * Returns 0 on success, or -1 on failure.
+ */
+static int process_iqe(struct vvtd *vvtd, int i)
+{
+    uint64_t iqa, addr;
+    struct qinval_entry *qinval_page;
+    void *pg;
+    int ret;
+
+    vvtd_get_reg_quad(vvtd, DMAR_IQA_REG, iqa);
+    ret = map_guest_page(vvtd->domain, DMA_IQA_ADDR(iqa)>>PAGE_SHIFT,
+                         (void**)&qinval_page);
+    if ( ret )
+    {
+        gdprintk(XENLOG_ERR, "Can't map guest IQ (rc %d)\n", ret);
+        return -1;
+    }
+
+    switch ( qinval_page[i].q.inv_wait_dsc.lo.type )
+    {
+    case TYPE_INVAL_WAIT:
+        if ( qinval_page[i].q.inv_wait_dsc.lo.sw )
+        {
+            addr = (qinval_page[i].q.inv_wait_dsc.hi.saddr << 2);
+            ret = map_guest_page(vvtd->domain, addr >> PAGE_SHIFT, &pg);
+            if ( ret )
+            {
+                gdprintk(XENLOG_ERR, "Can't map guest memory to inform guest "
+                         "IWC completion (rc %d)\n", ret);
+                goto error;
+            }
+            *(uint32_t *)((uint64_t)pg + (addr & ~PAGE_MASK)) =
+                qinval_page[i].q.inv_wait_dsc.lo.sdata;
+            unmap_guest_page(pg);
+        }
+
+        /*
+         * The following code generates an invalidation completion event
+         * indicating the invalidation wait descriptor completion. Note that
+         * the following code fragment is not tested properly.
+         */
+        if ( qinval_page[i].q.inv_wait_dsc.lo.iflag )
+        {
+            uint32_t ie_data, ie_addr;
+            if ( !vvtd_test_and_set_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_BIT) )
+            {
+                __vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT);
+                if ( !vvtd_test_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_BIT) )
+                {
+                    ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
+                    ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
+                    vvtd_generate_interrupt(vvtd, ie_addr, ie_data);
+                    __vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT);
+                }
+            }
+        }
+        break;
+
+    case TYPE_INVAL_IEC:
+        /*
+         * Currently, no cache is kept in the hypervisor. Only the pIRTEs
+         * that are modified in the binding process need to be updated.
+         */
+        break;
+
+    default:
+        goto error;
+    }
+
+    unmap_guest_page((void*)qinval_page);
+    return 0;
+
+ error:
+    unmap_guest_page((void*)qinval_page);
+    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
+    domain_crash(vvtd->domain);
+    return -1;
+}
+
+/*
+ * Process all pending descriptors in the Invalidation Queue.
+ */
+static void vvtd_process_iq(struct vvtd *vvtd)
+{
+    uint64_t iqh, iqt, iqa, max_entry, i;
+    int ret = 0;
+
+    /*
+     * No new descriptor is fetched from the Invalidation Queue until
+     * software clears the IQE field in the Fault Status Register
+     */
+    if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_BIT) )
+        return;
+
+    vvtd_get_reg_quad(vvtd, DMAR_IQH_REG, iqh);
+    vvtd_get_reg_quad(vvtd, DMAR_IQT_REG, iqt);
+    vvtd_get_reg_quad(vvtd, DMAR_IQA_REG, iqa);
+
+    max_entry = DMA_IQA_ENTRY_PER_PAGE << DMA_IQA_QS(iqa);
+    iqh = DMA_IQH_QH(iqh);
+    iqt = DMA_IQT_QT(iqt);
+
+    ASSERT(iqt < max_entry);
+    if ( iqh == iqt )
+        return;
+
+    i = iqh;
+    while ( i != iqt )
+    {
+        ret = process_iqe(vvtd, i);
+        if ( ret )
+            break;
+        else
+            i = (i + 1) % max_entry;
+        vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, i << DMA_IQH_QH_SHIFT);
+    }
+
+    /*
+     * When IQE is set, IQH references the descriptor associated with the error.
+     */
+    if ( ret )
+        vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_IQE_BIT);
+}
+
+static int vvtd_write_iqt(struct vvtd *vvtd, unsigned long val)
+{
+    uint64_t iqa;
+
+    if ( val & DMA_IQT_RSVD )
+    {
+        VVTD_DEBUG(VVTD_DBG_RW, "Attempt to set reserved bits in "
+                   "Invalidation Queue Tail.");
+        return X86EMUL_OKAY;
+    }
+
+    vvtd_get_reg_quad(vvtd, DMAR_IQA_REG, iqa);
+    if ( DMA_IQT_QT(val) >= DMA_IQA_ENTRY_PER_PAGE << DMA_IQA_QS(iqa) )
+    {
+        VVTD_DEBUG(VVTD_DBG_RW, "IQT: Value %lx exceeded supported max "
+                   "index.", val);
+        return X86EMUL_OKAY;
+    }
+
+    vvtd_set_reg_quad(vvtd, DMAR_IQT_REG, val);
+    vvtd_process_iq(vvtd);
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_iqa(struct vvtd *vvtd, unsigned long val)
+{
+    if ( val & DMA_IQA_RSVD )
+    {
+        VVTD_DEBUG(VVTD_DBG_RW, "Attempt to set reserved bits in "
+                   "Invalidation Queue Address.");
+        return X86EMUL_OKAY;
+    }
+
+    vvtd_set_reg_quad(vvtd, DMAR_IQA_REG, val);
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_ics(struct vvtd *vvtd, uint32_t val)
+{
+    if ( val & DMA_ICS_IWC )
+    {
+        __vvtd_clear_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_BIT);
+        /* When the IWC field is cleared, the IP field must be cleared too. */
+        __vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT);
+    }
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
 {
     /* Writing a 1 means clear fault */
@@ -439,6 +618,29 @@ static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
     return X86EMUL_OKAY;
 }
 
+static int vvtd_write_iectl(struct vvtd *vvtd, uint32_t val)
+{
+    /*
+     * Only the DMA_IECTL_IM bit is writable. Deliver any pending event on unmask.
+     */
+    if ( !(val & DMA_IECTL_IM) )
+    {
+        /* Clear IM and clear IP */
+        __vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_BIT);
+        if ( vvtd_test_and_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT) )
+        {
+            uint32_t ie_data, ie_addr;
+            ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
+            ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
+            vvtd_generate_interrupt(vvtd, ie_addr, ie_data);
+        }
+    }
+    else
+        __vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_BIT);
+
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
 {
     /*
@@ -481,6 +683,10 @@ static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
     if ( !((vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS)) )
         __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
 
+    /* Resume processing the invalidation queue once IQE is clear */
+    if ( !vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_BIT) )
+        vvtd_process_iq(vvtd);
+
     return X86EMUL_OKAY;
 }
 
@@ -639,6 +845,36 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             ret = vvtd_write_frcd3(vvtd, val);
             break;
 
+        case DMAR_IECTL_REG:
+            ret = vvtd_write_iectl(vvtd, val);
+            break;
+
+        case DMAR_ICS_REG:
+            ret = vvtd_write_ics(vvtd, val);
+            break;
+
+        case DMAR_IQT_REG:
+            ret = vvtd_write_iqt(vvtd, (uint32_t)val);
+            break;
+
+        case DMAR_IQA_REG:
+        {
+            uint32_t iqa_hi;
+
+            iqa_hi = vvtd_get_reg(vvtd, DMAR_IQA_REG_HI);
+            ret = vvtd_write_iqa(vvtd, (uint32_t)val | ((uint64_t)iqa_hi << 32));
+            break;
+        }
+
+        case DMAR_IQA_REG_HI:
+        {
+            uint32_t iqa_lo;
+
+            iqa_lo = vvtd_get_reg(vvtd, DMAR_IQA_REG);
+            ret = vvtd_write_iqa(vvtd, (val << 32) | iqa_lo);
+            break;
+        }
+
         case DMAR_IEDATA_REG:
         case DMAR_IEADDR_REG:
         case DMAR_IEUADDR_REG:
@@ -668,6 +904,14 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             ret = vvtd_write_frcd3(vvtd, val >> 32);
             break;
 
+        case DMAR_IQT_REG:
+            ret = vvtd_write_iqt(vvtd, val);
+            break;
+
+        case DMAR_IQA_REG:
+            ret = vvtd_write_iqa(vvtd, val);
+            break;
+
         default:
             ret = X86EMUL_UNHANDLEABLE;
             break;
-- 
1.8.3.1



* [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d
  2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
                   ` (23 preceding siblings ...)
  2017-08-09 20:34 ` [PATCH V2 24/25] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
@ 2017-08-09 20:34 ` Lan Tianyu
  2017-08-23 12:19   ` Roger Pau Monné
  24 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-09 20:34 UTC (permalink / raw)
  To: xen-devel; +Cc: Lan Tianyu, kevin.tian, julien.grall, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Wrap the non-register state in a new structure, hvm_hw_vvtd, following the
convention of vlapic, vioapic, etc. Provide two save/restore pairs: one for
the registers and one for the non-register state.
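
As a rough illustration of why the hidden state is gathered into one
structure of fixed-width fields, the sketch below round-trips such a
structure through a raw byte record. The field list mirrors struct
hvm_hw_vvtd from this patch; struct save_blob, hidden_save() and
hidden_load() are hypothetical stand-ins and are not the Xen HVM
save/restore API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field list as the patch's struct hvm_hw_vvtd, fixed-width only. */
struct vvtd_hidden_state {
    uint32_t status;         /* VIOMMU_STATUS_XXX */
    uint32_t frcd_idx;       /* Fault Recording index */
    uint32_t eim;            /* Extended Interrupt Mode? */
    uint32_t irt_max_entry;  /* Max remapping entries in IRT */
    uint64_t irt;            /* Interrupt remapping table base gfn */
};

/* Hypothetical stand-in for a save record: raw bytes only. */
struct save_blob {
    uint8_t bytes[sizeof(struct vvtd_hidden_state)];
};

/* Copy the live state into the record. */
static void hidden_save(const struct vvtd_hidden_state *s,
                        struct save_blob *b)
{
    memcpy(b->bytes, s, sizeof(*s));
}

/* Restore the state, rejecting records of an unexpected length. */
static int hidden_load(struct vvtd_hidden_state *s,
                       const struct save_blob *b, size_t len)
{
    if ( len != sizeof(*s) )
        return -1;

    memcpy(s, b->bytes, sizeof(*s));
    return 0;
}
```

With int and bool members the record size could differ between builds;
with the fixed-width fields above it is 24 bytes on the usual x86 ABIs.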

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/vvtd.c     | 98 ++++++++++++++++++++++------------
 xen/include/public/arch-x86/hvm/save.h | 24 ++++++++-
 2 files changed, 88 insertions(+), 34 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 4f5e28e..dd6be83 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -20,6 +20,7 @@
 
 #include <xen/domain_page.h>
 #include <xen/lib.h>
+#include <xen/hvm/save.h>
 #include <xen/sched.h>
 #include <xen/types.h>
 #include <xen/viommu.h>
@@ -32,39 +33,26 @@
 #include <asm/page.h>
 #include <asm/p2m.h>
 #include <asm/system.h>
+#include <public/hvm/save.h>
 
 #include "iommu.h"
 #include "vtd.h"
 
-struct hvm_hw_vvtd_regs {
-    uint8_t data[1024];
-};
-
 /* Status field of struct vvtd */
 #define VIOMMU_STATUS_DEFAULT                   (0)
 #define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED     (1 << 0)
 #define VIOMMU_STATUS_DMA_REMAPPING_ENABLED     (1 << 1)
 
 #define vvtd_irq_remapping_enabled(vvtd) \
-    (vvtd->status & VIOMMU_STATUS_IRQ_REMAPPING_ENABLED)
+    (vvtd->hw.status & VIOMMU_STATUS_IRQ_REMAPPING_ENABLED)
 
 struct vvtd {
-    /* VIOMMU_STATUS_XXX */
-    int status;
-    /* Fault Recording index */
-    int frcd_idx;
     /* Address range of remapping hardware register-set */
     uint64_t base_addr;
     uint64_t length;
     /* Point back to the owner domain */
     struct domain *domain;
-    /* Is in Extended Interrupt Mode? */
-    bool eim;
-    /* Max remapping entries in IRT */
-    int irt_max_entry;
-    /* Interrupt remapping table base gfn */
-    uint64_t irt;
-
+    struct hvm_hw_vvtd hw;
     struct hvm_hw_vvtd_regs *regs;
     struct page_info *regs_page;
 };
@@ -370,12 +358,12 @@ static int vvtd_alloc_frcd(struct vvtd *vvtd)
     int prev;
 
     /* Set the F bit to indicate the FRCD is in use. */
-    if ( vvtd_test_and_set_bit(vvtd, DMA_FRCD(vvtd->frcd_idx, DMA_FRCD3_OFFSET),
+    if ( vvtd_test_and_set_bit(vvtd, DMA_FRCD(vvtd->hw.frcd_idx, DMA_FRCD3_OFFSET),
                                DMA_FRCD_F_BIT) )
     {
-        prev = vvtd->frcd_idx;
-        vvtd->frcd_idx = (prev + 1) % DMA_FRCD_REG_NR;
-        return vvtd->frcd_idx;
+        prev = vvtd->hw.frcd_idx;
+        vvtd->hw.frcd_idx = (prev + 1) % DMA_FRCD_REG_NR;
+        return vvtd->hw.frcd_idx;
     }
     return -1;
 }
@@ -712,12 +700,12 @@ static int vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
 
     if ( val & DMA_GCMD_IRE )
     {
-        vvtd->status |= VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
+        vvtd->hw.status |= VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
         __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
     }
     else
     {
-        vvtd->status |= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
+        vvtd->hw.status &= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
         __vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
     }
 
@@ -736,11 +724,11 @@ static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
                    "active." );
 
     vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
-    vvtd->irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
-    vvtd->irt_max_entry = DMA_IRTA_SIZE(irta);
-    vvtd->eim = DMA_IRTA_EIME(irta);
+    vvtd->hw.irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
+    vvtd->hw.irt_max_entry = DMA_IRTA_SIZE(irta);
+    vvtd->hw.eim = DMA_IRTA_EIME(irta);
     VVTD_DEBUG(VVTD_DBG_RW, "Update IR info (addr=%lx eim=%d size=%d).",
-               vvtd->irt, vvtd->eim, vvtd->irt_max_entry);
+               vvtd->hw.irt, vvtd->hw.eim, vvtd->hw.irt_max_entry);
     __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_SIRTPS_BIT);
 
     return X86EMUL_OKAY;
@@ -947,13 +935,13 @@ static int vvtd_get_entry(struct vvtd *vvtd,
 
     VVTD_DEBUG(VVTD_DBG_TRANS, "interpret a request with index %x", entry);
 
-    if ( entry > vvtd->irt_max_entry )
+    if ( entry > vvtd->hw.irt_max_entry )
     {
         ret = VTD_FR_IR_INDEX_OVER;
         goto handle_fault;
     }
 
-    ret = map_guest_page(vvtd->domain, vvtd->irt + (entry >> IREMAP_ENTRY_ORDER),
+    ret = map_guest_page(vvtd->domain, vvtd->hw.irt + (entry >> IREMAP_ENTRY_ORDER),
                          (void**)&irt_page);
     if ( ret )
     {
@@ -1077,6 +1065,49 @@ static int vvtd_get_irq_info(struct domain *d,
     return 0;
 }
 
+static int vvtd_load_regs(struct domain *d, hvm_domain_context_t *h)
+{
+    if ( !domain_vvtd(d) )
+        return -ENODEV;
+
+    if ( hvm_load_entry(IOMMU_REGS, h, domain_vvtd(d)->regs) )
+        return -EINVAL;
+
+    return 0;
+}
+
+static int vvtd_save_regs(struct domain *d, hvm_domain_context_t *h)
+{
+    if ( !domain_vvtd(d) )
+        return 0;
+
+    return hvm_save_entry(IOMMU_REGS, 0, h, domain_vvtd(d)->regs);
+}
+
+static int vvtd_load_hidden(struct domain *d, hvm_domain_context_t *h)
+{
+    if ( !domain_vvtd(d) )
+        return -ENODEV;
+
+    if ( hvm_load_entry(IOMMU, h, &domain_vvtd(d)->hw) )
+        return -EINVAL;
+
+    return 0;
+}
+
+static int vvtd_save_hidden(struct domain *d, hvm_domain_context_t *h)
+{
+    if ( !domain_vvtd(d) )
+        return 0;
+
+    return hvm_save_entry(IOMMU, 0, h, &domain_vvtd(d)->hw);
+}
+
+HVM_REGISTER_SAVE_RESTORE(IOMMU, vvtd_save_hidden, vvtd_load_hidden,
+                          1, HVMSR_PER_DOM);
+HVM_REGISTER_SAVE_RESTORE(IOMMU_REGS, vvtd_save_regs, vvtd_load_regs,
+                          1, HVMSR_PER_DOM);
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
@@ -1122,12 +1153,13 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
     vvtd->base_addr = viommu->base_address;
     vvtd->length = viommu->length;
     vvtd->domain = d;
-    vvtd->status = VIOMMU_STATUS_DEFAULT;
-    vvtd->eim = 0;
-    vvtd->irt = 0;
-    vvtd->irt_max_entry = 0;
-    vvtd->frcd_idx = 0;
+    vvtd->hw.status = VIOMMU_STATUS_DEFAULT;
+    vvtd->hw.eim = 0;
+    vvtd->hw.irt = 0;
+    vvtd->hw.irt_max_entry = 0;
+    vvtd->hw.frcd_idx = 0;
     register_mmio_handler(d, &vvtd_mmio_ops);
+    viommu->priv = (void *)vvtd;
     return 0;
 
  out2:
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index fd7bf3f..10536cb 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -639,10 +639,32 @@ struct hvm_msr {
 
 #define CPU_MSR_CODE  20
 
+struct hvm_hw_vvtd_regs {
+    uint8_t data[1024];
+};
+
+DECLARE_HVM_SAVE_TYPE(IOMMU_REGS, 21, struct hvm_hw_vvtd_regs);
+
+struct hvm_hw_vvtd
+{
+    /* VIOMMU_STATUS_XXX */
+    uint32_t status;
+    /* Fault Recording index */
+    uint32_t frcd_idx;
+    /* Is in Extended Interrupt Mode? */
+    uint32_t eim;
+    /* Max remapping entries in IRT */
+    uint32_t irt_max_entry;
+    /* Interrupt remapping table base gfn */
+    uint64_t irt;
+};
+
+DECLARE_HVM_SAVE_TYPE(IOMMU, 22, struct hvm_hw_vvtd);
+
 /* 
  * Largest type-code in use
  */
-#define HVM_SAVE_CODE_MAX 20
+#define HVM_SAVE_CODE_MAX 22
 
 #endif /* __XEN_PUBLIC_HVM_SAVE_X86_H__ */
 
-- 
1.8.3.1



* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-09 20:34 ` [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
@ 2017-08-17 11:18   ` Wei Liu
  2017-08-18  2:57     ` Lan Tianyu
  2017-08-22 14:32   ` Roger Pau Monné
  1 sibling, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-17 11:18 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
[...]
>  
> +int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
> +                  bool *need_copy)
> +{
> +    int rc = -EINVAL, ret;
> +
> +    if ( !viommu_enabled() )
> +        return rc;
> +
> +    switch ( op->cmd )
> +    {
> +    case XEN_DOMCTL_create_viommu:
> +        ret = viommu_create(d, op->u.create_viommu.viommu_type,
> +            op->u.create_viommu.base_address,
> +            op->u.create_viommu.length,
> +            op->u.create_viommu.capabilities);

Please align these with "d" in previous line.

> +        if ( ret >= 0 ) {

Coding style is wrong.

> +            op->u.create_viommu.viommu_id = ret;
> +            *need_copy = true;
> +            rc = 0; /* return 0 if success */

No need to have that comment.

> +        }
> +        break;
> +
> +    case XEN_DOMCTL_destroy_viommu:
> +        rc = viommu_destroy(d, op->u.destroy_viommu.viommu_id);
> +        break;
> +
> +    case XEN_DOMCTL_query_viommu_caps:
> +        ret = viommu_query_caps(d, op->u.query_caps.viommu_type);
> +        if ( ret >= 0 )
> +        {
> +            op->u.query_caps.capabilities = ret;
> +            rc = 0;
> +        }
> +        *need_copy = true;
> +        break;
> +
> +    default:
> +        break;
> +    }
> +
> +    return rc;
> +}
> +
>  int __init viommu_setup(void)
>  {
>      INIT_LIST_HEAD(&type_list);
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index ff39762..4b10f26 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1149,6 +1149,56 @@ struct xen_domctl_psr_cat_op {
>  typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
>  
> +/*  vIOMMU helper
> + *
> + *  vIOMMU interface can be used to create/destroy vIOMMU and
> + *  query vIOMMU capabilities.
> + */
> +
> +/* vIOMMU type - specify vendor vIOMMU device model */
> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)

Why use a bit when the types are mutually exclusive? Using a number
should be fine?
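
A minimal sketch of the distinction, assuming hypothetical names and
values (none of these identifiers are from the patch): mutually exclusive
device models get plain numbers, while capabilities, which can be
combined, stay a bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mutually exclusive device models: plain numbers (hypothetical values). */
enum viommu_type_sketch {
    VIOMMU_TYPE_SKETCH_INVALID = 0,
    VIOMMU_TYPE_SKETCH_INTEL_VTD = 1,
};

/* Capabilities can be combined, so they remain individual bits. */
#define VIOMMU_CAP_SKETCH_IRQ_REMAPPING (UINT64_C(1) << 0)

/* A type is valid only if it names exactly one known device model. */
static bool viommu_type_valid(uint64_t type)
{
    return type == VIOMMU_TYPE_SKETCH_INTEL_VTD;
}

/* A capability request is valid if it asks for nothing unsupported. */
static bool viommu_caps_valid(uint64_t requested, uint64_t supported)
{
    return (requested & ~supported) == 0;
}
```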

> +
> +/* vIOMMU capabilities */
> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> +
> +struct xen_domctl_viommu_op {
> +    uint32_t cmd;
> +#define XEN_DOMCTL_create_viommu          0
> +#define XEN_DOMCTL_destroy_viommu         1
> +#define XEN_DOMCTL_query_viommu_caps      2
> +    union {
> +        struct {
> +            /* IN - vIOMMU type */
> +            uint64_t viommu_type;
> +            /* 
> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
> +             * are in charge of to check base_address and length.
> +             */
> +            uint64_t base_address;
> +            /* IN - Length of MMIO region */
> +            uint64_t length;
> +            /* IN - Capabilities with which we want to create */
> +            uint64_t capabilities;
> +            /* OUT - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } create_viommu;

create should be fine.

> +
> +        struct {
> +            /* IN - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } destroy_viommu;

destroy should be fine.


* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-09 20:34 ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
@ 2017-08-17 11:18   ` Wei Liu
  2017-08-18  0:22     ` [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities Lan Tianyu
  2017-08-18  7:09     ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
  2017-08-22 15:32   ` Roger Pau Monné
  1 sibling, 2 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-17 11:18 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:03PM -0400, Lan Tianyu wrote:
> This patch is to add irq request callback for platform implementation
> to deal with irq remapping request.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/viommu.c          | 15 +++++++++
>  xen/include/asm-x86/viommu.h | 73 ++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/viommu.h     |  9 ++++++
>  3 files changed, 97 insertions(+)
>  create mode 100644 xen/include/asm-x86/viommu.h
> 
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index a4d004d..f4d34e6 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -198,6 +198,21 @@ int __init viommu_setup(void)
>      return 0;
>  }
>  
> +int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
> +                              struct irq_remapping_request *request)
> +{
> +    struct viommu_info *info = &d->viommu;

Does this compile? This patch and the previous one don't have viommu
added to struct domain.

> +
> +    if ( viommu_id >= info->nr_viommu
> +         || !info->viommu[viommu_id] )

Join this to previous line?

> +        return -EINVAL;

ASSERT(info->viommu[viommu_id]->ops);

For extra safety.
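
Roughly what the checks would look like in order, as a self-contained
sketch. The *_sketch types and dispatch_irq_request() are hypothetical
simplifications, not the structures from this series, and the standard
assert() stands in for Xen's ASSERT().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VIOMMU_SKETCH 1

/* Hypothetical, simplified versions of the ops/viommu/info structures. */
struct viommu_ops_sketch {
    int (*handle_irq_request)(void *req);
};

struct viommu_sketch {
    const struct viommu_ops_sketch *ops;
};

struct viommu_info_sketch {
    uint32_t nr_viommu;
    struct viommu_sketch *viommu[NR_VIOMMU_SKETCH];
};

static int dispatch_irq_request(struct viommu_info_sketch *info,
                                uint32_t id, void *req)
{
    /* Reject unknown or unpopulated ids before dereferencing anything. */
    if ( id >= info->nr_viommu || !info->viommu[id] )
        return -EINVAL;

    /* A registered vIOMMU without ops would be a programming error. */
    assert(info->viommu[id]->ops);

    /* The callback itself is optional. */
    if ( !info->viommu[id]->ops->handle_irq_request )
        return -EINVAL;

    return info->viommu[id]->ops->handle_irq_request(req);
}

/* Trivial handler used to exercise the dispatch path. */
static int sketch_handler(void *req)
{
    (void)req;
    return 0;
}
```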

> +
> +    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
> +        return -EINVAL;
> +
> +    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
> new file mode 100644
> index 0000000..51bda72
> --- /dev/null
> +++ b/xen/include/asm-x86/viommu.h
> @@ -0,0 +1,73 @@
> +/*
> + * include/asm-x86/viommu.h
> + *
> + * Copyright (c) 2017 Intel Corporation.
> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#ifndef __ARCH_X86_VIOMMU_H__
> +#define __ARCH_X86_VIOMMU_H__
> +

Is a corresponding ARM header needed? Given viommu is common code.

> +#include <xen/viommu.h>

I think you're probably doing it wrong.

It should be that the common header includes the arch header, then
the code only uses the common header (I haven't read the rest of your
series at this point).

> +#include <asm/types.h>
> +
> +/* IRQ request type */
> +#define VIOMMU_REQUEST_IRQ_MSI          0
> +#define VIOMMU_REQUEST_IRQ_APIC         1
> +
> +struct irq_remapping_request
> +{
> +    union {
> +        /* MSI */
> +        struct {
> +            u64 addr;
> +            u32 data;
> +        } msi;
> +        /* Redirection Entry in IOAPIC */
> +        u64 rte;
> +    } msg;
> +    u16 source_id;
> +    u8 type;

uintXX_t please.

> +};
> +
> +static inline void irq_request_ioapic_fill(struct irq_remapping_request *req,
> +                             uint32_t ioapic_id, uint64_t rte)

Indentation.

> +{
> +    ASSERT(req);
> +    req->type = VIOMMU_REQUEST_IRQ_APIC;
> +    req->source_id = ioapic_id;
> +    req->msg.rte = rte;
> +}
> +
> +static inline void irq_request_msi_fill(struct irq_remapping_request *req,
> +                          uint32_t source_id, uint64_t addr, uint32_t data)

Indentation.

> +{
> +    ASSERT(req);
> +    req->type = VIOMMU_REQUEST_IRQ_MSI;
> +    req->source_id = source_id;
> +    req->msg.msi.addr = addr;
> +    req->msg.msi.data = data;
> +}
> +
> +#endif /* __ARCH_X86_VIOMMU_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * End:
> + */
> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
> index 527afb1..0be1b3a 100644
> --- a/xen/include/xen/viommu.h
> +++ b/xen/include/xen/viommu.h
> @@ -20,6 +20,8 @@
>  #ifndef __XEN_VIOMMU_H__
>  #define __XEN_VIOMMU_H__
>  
> +#include <asm/viommu.h>
> +

Circular inclusion? Note the #include <xen/viommu.h> some lines above.

>  #define NR_VIOMMU_PER_DOMAIN 1
>  
>  struct viommu;
> @@ -28,6 +30,8 @@ struct viommu_ops {
>      u64 (*query_caps)(struct domain *d);
>      int (*create)(struct domain *d, struct viommu *viommu);
>      int (*destroy)(struct viommu *viommu);
> +    int (*handle_irq_request)(struct domain *d,
> +                              struct irq_remapping_request *request);
>  };
>  
>  struct viommu {
> @@ -52,6 +56,8 @@ int viommu_register_type(u64 type, struct viommu_ops * ops);
>  int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
>                    bool_t *need_copy);
>  int viommu_setup(void);
> +int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
> +                              struct irq_remapping_request *request);
>  #else
>  static inline int viommu_init_domain(struct domain *d) { return 0; }
>  static inline int viommu_register_type(u64 type, struct viommu_ops * ops)
> @@ -62,6 +68,9 @@ static inline int viommu_domctl(struct domain *d,
>                                  struct xen_domctl_viommu_op *op,
>                                  bool *need_copy)
>  { return -ENODEV };
> +static inline int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
> +                              struct irq_remapping_request *request)
> +{ return 0 };

This should fail.
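
Something along these lines, as a sketch only. The _stub suffix is
hypothetical, and -ENODEV is an assumption chosen to match the
viommu_domctl() stub just above, not a value requested in this review.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque forward declarations are enough for a stub. */
struct domain;
struct irq_remapping_request;

/*
 * Stub for builds without vIOMMU support: there is nothing to handle the
 * request, so report an error instead of claiming success.
 */
static inline int viommu_handle_irq_request_stub(
    struct domain *d, uint32_t viommu_id,
    struct irq_remapping_request *request)
{
    (void)d;
    (void)viommu_id;
    (void)request;

    return -ENODEV;
}
```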

>  #endif
>  
>  #endif /* __XEN_VIOMMU_H__ */
> -- 
> 1.8.3.1
> 


* Re: [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-08-09 20:34 ` [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
@ 2017-08-17 11:19   ` Wei Liu
  2017-08-22 15:38   ` Roger Pau Monné
  1 sibling, 0 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-17 11:19 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:04PM -0400, Lan Tianyu wrote:
> This patch is to add get_irq_info callback for platform implementation
> to convert irq remapping request to irq info (E,G vector, dest, dest_mode
> and so on).
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/viommu.c          | 16 ++++++++++++++++
>  xen/include/asm-x86/viommu.h |  8 ++++++++
>  xen/include/xen/viommu.h     |  9 +++++++++
>  3 files changed, 33 insertions(+)
> 
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index f4d34e6..03c879d 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -213,6 +213,22 @@ int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>      return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
>  }
>  
> +int viommu_get_irq_info(struct domain *d, u32 viommu_id,

Again, uint32_t please.

Please fix all these in this series.

> +                        struct irq_remapping_request *request,
> +                        struct irq_remapping_info *irq_info)
> +{
> +    struct viommu_info *info = &d->viommu;

Having skimmed the rest of this series, there is no addition of viommu
to struct domain (no change to sched.h). Did I miss something obvious?

> +
> +    if ( viommu_id >= info->nr_viommu
> +         || !info->viommu[viommu_id] )
> +        return -EINVAL;
> +
> +    if ( !info->viommu[viommu_id]->ops->get_irq_info )
> +        return -EINVAL;
> +
> +    return info->viommu[viommu_id]->ops->get_irq_info(d, request, irq_info);

Same comments in previous patch apply here, too.

> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
[...]
> +static inline int viommu_get_irq_info(struct domain *d, u32 viommu_id,
> +                                      struct irq_remapping_request *request,
> +                                      struct irq_remapping_info *irq_info)
> +{ return 0 };

This should fail, too.

>  #endif
>  
>  #endif /* __XEN_VIOMMU_H__ */
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-09 20:34 ` [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
@ 2017-08-17 11:19   ` Wei Liu
  2017-08-18  7:17     ` Lan Tianyu
  2017-08-22 15:55   ` Roger Pau Monné
  1 sibling, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-17 11:19 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
> +Currently only a single vIOMMU per VM is supported; the introduced domctls are
> +compatible with multi-vIOMMU support.

Is this still true? There is an ID field in the struct which can
distinguish multiple viommus, right?

> +
> +xl vIOMMU configuration
> +=======================
> +viommu="type=intel_vtd,intremap=1,x2apic=1"

If there is provision to support multiple viommu please make this an
array.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-09 20:34 ` [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
@ 2017-08-17 11:32   ` Wei Liu
  2017-08-17 12:28     ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-17 11:32 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:10PM -0400, Lan Tianyu wrote:
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index f54fd49..94c9196 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1060,6 +1060,42 @@ static int libxl__domain_firmware(libxl__gc *gc,
>          }
>      }
>  
> +    /*
> +     * If a guest has one virtual VTD, build DMAR table for it and joint this
> +     * table with existing content in acpi_modules in order to employ HVM
> +     * firmware pass-through mechanism to pass-through DMAR table.
> +     */
> +    if (info->viommu.type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> +        datalen = 0;
> +        e = libxl__dom_build_dmar(gc, info, dom, &data, &datalen);
> +        if (e) {
> +            LOGEV(ERROR, e, "failed to build DMAR table");
> +            rc = ERROR_FAIL;
> +            goto out;
> +        }
> +        if (datalen) {
> +            libxl__ptr_add(gc, data);
> +            if (!dom->acpi_modules[0].data) {
> +                dom->acpi_modules[0].data = data;
> +                dom->acpi_modules[0].length = (uint32_t)datalen;
> +            } else {
> +                /* joint tables */
> +                void *newdata;
> +                newdata = malloc(datalen + dom->acpi_modules[0].length);

All memory allocations in libxl should use libxl__*lloc wrappers.

> +                if (!newdata) {
> +                    LOGE(ERROR, "failed to joint DMAR table to acpi modules");
> +                    rc = ERROR_FAIL;
> +                    goto out;
> +                }
> +                memcpy(newdata, dom->acpi_modules[0].data,
> +                       dom->acpi_modules[0].length);
> +                memcpy(newdata + dom->acpi_modules[0].length, data, datalen);
> +                dom->acpi_modules[0].data = newdata;
> +                dom->acpi_modules[0].length += (uint32_t)datalen;
> +            }
> +        }
> +    }

This still looks wrong to me. How do you know acpi_modules[0] is DMAR
table?

You should have a look at libxl_x86_acpi.c and work out a proper
solution.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-17 11:32   ` Wei Liu
@ 2017-08-17 12:28     ` Wei Liu
  2017-08-18  5:45       ` Chao Gao
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-17 12:28 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Thu, Aug 17, 2017 at 12:32:17PM +0100, Wei Liu wrote:
> On Wed, Aug 09, 2017 at 04:34:10PM -0400, Lan Tianyu wrote:
> > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> > index f54fd49..94c9196 100644
> > --- a/tools/libxl/libxl_dom.c
> > +++ b/tools/libxl/libxl_dom.c
> > @@ -1060,6 +1060,42 @@ static int libxl__domain_firmware(libxl__gc *gc,
> >          }
> >      }
> >  
> > +    /*
> > +     * If a guest has one virtual VTD, build DMAR table for it and joint this
> > +     * table with existing content in acpi_modules in order to employ HVM
> > +     * firmware pass-through mechanism to pass-through DMAR table.
> > +     */
> > +    if (info->viommu.type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> > +        datalen = 0;
> > +        e = libxl__dom_build_dmar(gc, info, dom, &data, &datalen);
> > +        if (e) {
> > +            LOGEV(ERROR, e, "failed to build DMAR table");
> > +            rc = ERROR_FAIL;
> > +            goto out;
> > +        }
> > +        if (datalen) {
> > +            libxl__ptr_add(gc, data);
> > +            if (!dom->acpi_modules[0].data) {
> > +                dom->acpi_modules[0].data = data;
> > +                dom->acpi_modules[0].length = (uint32_t)datalen;
> > +            } else {
> > +                /* joint tables */
> > +                void *newdata;
> > +                newdata = malloc(datalen + dom->acpi_modules[0].length);
> 
> All memory allocations in libxl should use libxl__*lloc wrappers.
> 
> > +                if (!newdata) {
> > +                    LOGE(ERROR, "failed to joint DMAR table to acpi modules");
> > +                    rc = ERROR_FAIL;
> > +                    goto out;
> > +                }
> > +                memcpy(newdata, dom->acpi_modules[0].data,
> > +                       dom->acpi_modules[0].length);
> > +                memcpy(newdata + dom->acpi_modules[0].length, data, datalen);
> > +                dom->acpi_modules[0].data = newdata;
> > +                dom->acpi_modules[0].length += (uint32_t)datalen;

Also, this leaks the old pointer, right?
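
As a rough sketch of a join that does not leak: the fragment below appends
one buffer to another with realloc(), which releases or reuses the old
allocation. The struct and function names are hypothetical stand-ins, not
the real libxc/libxl types, and it assumes the existing buffer is either
NULL or plain heap memory owned by the blob. Inside libxl the gc-aware
wrappers mentioned above would take the place of realloc().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one ACPI module slot (data pointer + length). */
struct acpi_blob {
    void *data;
    uint32_t length;
};

/*
 * Append extra_len bytes from extra to blob->data.
 * Returns 0 on success, -1 on allocation failure or length overflow.
 * On failure the blob is left unchanged.
 */
static int acpi_blob_append(struct acpi_blob *blob, const void *extra,
                            uint32_t extra_len)
{
    void *joined;

    if (extra_len > UINT32_MAX - blob->length)
        return -1;
    if (extra_len == 0)
        return 0;

    /* realloc() frees or reuses the old buffer, so nothing is leaked. */
    joined = realloc(blob->data, (size_t)blob->length + extra_len);
    if (!joined)
        return -1;

    memcpy((char *)joined + blob->length, extra, extra_len);
    blob->data = joined;
    blob->length += extra_len;
    return 0;
}
```

This only addresses the leak; it does not answer whether acpi_modules[0] is
the right place for the DMAR table.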

> > +            }
> > +        }
> > +    }
> 
> This still looks wrong to me. How do you know acpi_modules[0] is DMAR
> table?
> 

Oh, I sorta see why you do this, but I still think this is wrong. The
DMAR should either be a new module or be joined to the existing one (and
with all conflicts resolved).

^ permalink raw reply	[flat|nested] 136+ messages in thread

* [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-17 11:18   ` Wei Liu
@ 2017-08-18  0:22     ` Lan Tianyu
  2017-08-18  8:41       ` Jan Beulich
                         ` (3 more replies)
  2017-08-18  7:09     ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
  1 sibling, 4 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-18  0:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	julien.grall, jbeulich, chao.gao

This patch introduces an abstraction layer for arch vIOMMU implementations
to deal with requests from dom0. Arch vIOMMU code needs to provide callbacks
to perform the create, destroy and query-capabilities operations.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/arch/x86/Kconfig     |   1 +
 xen/arch/x86/setup.c     |   1 +
 xen/common/Kconfig       |   3 +
 xen/common/Makefile      |   1 +
 xen/common/domain.c      |   3 +
 xen/common/viommu.c      | 165 +++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/sched.h  |   2 +
 xen/include/xen/viommu.h |  71 ++++++++++++++++++++
 8 files changed, 247 insertions(+)
 create mode 100644 xen/common/viommu.c
 create mode 100644 xen/include/xen/viommu.h

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 30c2769..1f1de96 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -23,6 +23,7 @@ config X86
 	select HAS_PDX
 	select NUMA
 	select VGA
+	select VIOMMU
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index db5df69..68f1631 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1513,6 +1513,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     early_msi_init();
 
     iommu_setup();    /* setup iommu if available */
+    viommu_setup();
 
     smp_prepare_cpus(max_cpus);
 
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index dc8e876..2ad2c8d 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -49,6 +49,9 @@ config HAS_CHECKPOLICY
 	string
 	option env="XEN_HAS_CHECKPOLICY"
 
+config VIOMMU
+	bool
+
 config KEXEC
 	bool "kexec support"
 	default y
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 26c5a64..852553d 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -56,6 +56,7 @@ obj-y += time.o
 obj-y += timer.o
 obj-y += trace.o
 obj-y += version.o
+obj-$(CONFIG_VIOMMU) += viommu.o
 obj-y += virtual_region.o
 obj-y += vm_event.o
 obj-y += vmap.o
diff --git a/xen/common/domain.c b/xen/common/domain.c
index b22aacc..d1f9b10 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -396,6 +396,9 @@ struct domain *domain_create(domid_t domid, unsigned int domcr_flags,
         spin_unlock(&domlist_update_lock);
     }
 
+    if ( (err = viommu_init_domain(d)) != 0 )
+        goto fail;
+
     return d;
 
  fail:
diff --git a/xen/common/viommu.c b/xen/common/viommu.c
new file mode 100644
index 0000000..6874d9f
--- /dev/null
+++ b/xen/common/viommu.c
@@ -0,0 +1,165 @@
+/*
+ * common/viommu.c
+ * 
+ * Copyright (c) 2017 Intel Corporation
+ * Author: Lan Tianyu <tianyu.lan@intel.com> 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/sched.h>
+#include <xen/spinlock.h>
+#include <xen/types.h>
+#include <xen/viommu.h>
+
+bool __read_mostly opt_viommu;
+boolean_param("viommu", opt_viommu);
+
+static spinlock_t type_list_lock;
+static struct list_head type_list;
+
+struct viommu_type {
+    u64 type;
+    struct viommu_ops *ops;
+    struct list_head node;
+};
+
+int viommu_init_domain(struct domain *d)
+{
+    d->viommu.nr_viommu = 0;
+    return 0;
+}
+
+static struct viommu_type *viommu_get_type(u64 type)
+{
+    struct viommu_type *viommu_type = NULL;
+
+    spin_lock(&type_list_lock);
+    list_for_each_entry( viommu_type, &type_list, node )
+    {
+        if ( viommu_type->type == type )
+        {
+            spin_unlock(&type_list_lock);
+            return viommu_type;
+        }
+    }
+    spin_unlock(&type_list_lock);
+
+    return NULL;
+}
+
+int viommu_register_type(u64 type, struct viommu_ops * ops)
+{
+    struct viommu_type *viommu_type = NULL;
+
+    if ( !viommu_enabled() )
+        return -EINVAL;
+
+    if ( viommu_get_type(type) )
+        return -EEXIST;
+
+    viommu_type = xzalloc(struct viommu_type);
+    if ( !viommu_type )
+        return -ENOMEM;
+
+    viommu_type->type = type;
+    viommu_type->ops = ops;
+
+    spin_lock(&type_list_lock);
+    list_add_tail(&viommu_type->node, &type_list);
+    spin_unlock(&type_list_lock);
+
+    return 0;
+}
+
+static int viommu_create(struct domain *d, u64 type, u64 base_address,
+    u64 length, u64 caps)
+{
+    struct viommu_info *info = &d->viommu;
+    struct viommu *viommu;
+    struct viommu_type *viommu_type = NULL;
+    int rc;
+
+    viommu_type = viommu_get_type(type);
+    if ( !viommu_type )
+        return -EINVAL;
+
+    if ( info->nr_viommu >= NR_VIOMMU_PER_DOMAIN
+        || !viommu_type->ops || !viommu_type->ops->create )
+        return -EINVAL;
+
+    viommu = xzalloc(struct viommu);
+    if ( !viommu )
+        return -ENOMEM;
+
+    viommu->base_address = base_address;
+    viommu->length = length;
+    viommu->caps = caps;
+    viommu->ops = viommu_type->ops;
+    viommu->viommu_id = info->nr_viommu;
+
+    info->viommu[info->nr_viommu] = viommu;
+    info->nr_viommu++;
+
+    rc = viommu->ops->create(d, viommu);
+    if ( rc < 0 )
+    {
+        xfree(viommu);
+        info->nr_viommu--;
+        info->viommu[info->nr_viommu] = NULL;
+        return rc;
+    }
+
+    return viommu->viommu_id;
+}
+
+static int viommu_destroy(struct domain *d, u32 viommu_id)
+{
+    struct viommu_info *info = &d->viommu;
+
+    if ( viommu_id >= info->nr_viommu || !info->viommu[viommu_id] )
+        return -EINVAL;
+
+    if ( info->viommu[viommu_id]->ops->destroy(info->viommu[viommu_id]) )
+        return -EFAULT;
+
+    xfree(info->viommu[viommu_id]);
+    info->viommu[viommu_id] = NULL;
+    return 0;
+}
+
+static u64 viommu_query_caps(struct domain *d, u64 type)
+{
+    struct viommu_type *viommu_type = viommu_get_type(type);
+
+    if ( !viommu_type )
+        return -EINVAL;
+
+    return viommu_type->ops->query_caps(d);
+}
+
+int __init viommu_setup(void)
+{
+    INIT_LIST_HEAD(&type_list);
+    spin_lock_init(&type_list_lock);
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * End:
+ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 6673b27..98a965a 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -21,6 +21,7 @@
 #include <xen/perfc.h>
 #include <asm/atomic.h>
 #include <xen/wait.h>
+#include <xen/viommu.h>
 #include <public/xen.h>
 #include <public/domctl.h>
 #include <public/sysctl.h>
@@ -477,6 +478,7 @@ struct domain
     /* vNUMA topology accesses are protected by rwlock. */
     rwlock_t vnuma_rwlock;
     struct vnuma_info *vnuma;
+    struct viommu_info viommu;
 
     /* Common monitor options */
     struct {
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
new file mode 100644
index 0000000..506ea54
--- /dev/null
+++ b/xen/include/xen/viommu.h
@@ -0,0 +1,71 @@
+/*
+ * include/xen/viommu.h
+ *
+ * Copyright (c) 2017, Intel Corporation
+ * Author: Lan Tianyu <tianyu.lan@intel.com> 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#ifndef __XEN_VIOMMU_H__
+#define __XEN_VIOMMU_H__
+
+#define NR_VIOMMU_PER_DOMAIN 1
+
+struct viommu;
+
+struct viommu_ops {
+    u64 (*query_caps)(struct domain *d);
+    int (*create)(struct domain *d, struct viommu *viommu);
+    int (*destroy)(struct viommu *viommu);
+};
+
+struct viommu {
+    u64 base_address;
+    u64 length;
+    u64 caps;
+    u32 viommu_id;
+    const struct viommu_ops *ops;
+    void *priv;
+};
+
+struct viommu_info {
+    u32 nr_viommu;
+    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN]; /* viommu array*/
+};
+
+#ifdef CONFIG_VIOMMU
+extern bool_t opt_viommu;
+static inline bool viommu_enabled(void) { return opt_viommu; }
+int viommu_init_domain(struct domain *d);
+int viommu_register_type(u64 type, struct viommu_ops * ops);
+int viommu_setup(void);
+#else
+static inline int viommu_init_domain(struct domain *d) { return 0; }
+static inline int viommu_register_type(u64 type, struct viommu_ops * ops)
+{ return 0; }
+static inline int __init viommu_setup(void) { return 0; }
+static inline bool viommu_enabled(void) { return false; }
+#endif
+
+#endif /* __XEN_VIOMMU_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-17 11:18   ` Wei Liu
@ 2017-08-18  2:57     ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-18  2:57 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

Hi Wei:
	Thanks for your review.

On 2017-08-17 19:18, Wei Liu wrote:
> On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>> index ff39762..4b10f26 100644
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -1149,6 +1149,56 @@ struct xen_domctl_psr_cat_op {
>>  typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
>>  
>> +/*  vIOMMU helper
>> + *
>> + *  vIOMMU interface can be used to create/destroy vIOMMU and
>> + *  query vIOMMU capabilities.
>> + */
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
> 
> Why use a bit when the types are mutually exclusive? Using a number
> should be fine?

Yes, will update.
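
A small sketch of the distinction, assuming the type simply becomes an
enumerated number while capabilities, which can be combined, stay bit flags.
The numeric value and the helper name are illustrative only, not taken from
a later version of the series.

```c
#include <stdint.h>

/* vIOMMU type: values are mutually exclusive, so a plain number suffices. */
#define VIOMMU_TYPE_INTEL_VTD     1

/* vIOMMU capabilities: several may be requested at once, so keep bit flags. */
#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)

/* Hypothetical helper: true if every requested capability is offered. */
static inline int viommu_caps_supported(uint64_t offered, uint64_t requested)
{
    return (requested & ~offered) == 0;
}
```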

> 
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> +    uint32_t cmd;
>> +#define XEN_DOMCTL_create_viommu          0
>> +#define XEN_DOMCTL_destroy_viommu         1
>> +#define XEN_DOMCTL_query_viommu_caps      2
>> +    union {
>> +        struct {
>> +            /* IN - vIOMMU type */
>> +            uint64_t viommu_type;
>> +            /* 
>> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
>> +             * are in charge of to check base_address and length.
>> +             */
>> +            uint64_t base_address;
>> +            /* IN - Length of MMIO region */
>> +            uint64_t length;
>> +            /* IN - Capabilities with which we want to create */
>> +            uint64_t capabilities;
>> +            /* OUT - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } create_viommu;
> 
> create should be fine.
> 

OK.

>> +
>> +        struct {
>> +            /* IN - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } destroy_viommu;
> 
> destroy should be fine.
> 

OK.

-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-17 12:28     ` Wei Liu
@ 2017-08-18  5:45       ` Chao Gao
  2017-08-18 13:45         ` Wei Liu
  2017-08-22 13:48         ` Wei Liu
  0 siblings, 2 replies; 136+ messages in thread
From: Chao Gao @ 2017-08-18  5:45 UTC (permalink / raw)
  To: Wei Liu
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich

On Thu, Aug 17, 2017 at 01:28:21PM +0100, Wei Liu wrote:
>On Thu, Aug 17, 2017 at 12:32:17PM +0100, Wei Liu wrote:
>> On Wed, Aug 09, 2017 at 04:34:10PM -0400, Lan Tianyu wrote:
>> > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
>> > index f54fd49..94c9196 100644
>> > --- a/tools/libxl/libxl_dom.c
>> > +++ b/tools/libxl/libxl_dom.c
>> > @@ -1060,6 +1060,42 @@ static int libxl__domain_firmware(libxl__gc *gc,
>> >          }
>> >      }
>> >  
>> > +    /*
>> > +     * If a guest has one virtual VTD, build DMAR table for it and joint this
>> > +     * table with existing content in acpi_modules in order to employ HVM
>> > +     * firmware pass-through mechanism to pass-through DMAR table.
>> > +     */
>> > +    if (info->viommu.type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
>> > +        datalen = 0;
>> > +        e = libxl__dom_build_dmar(gc, info, dom, &data, &datalen);
>> > +        if (e) {
>> > +            LOGEV(ERROR, e, "failed to build DMAR table");
>> > +            rc = ERROR_FAIL;
>> > +            goto out;
>> > +        }
>> > +        if (datalen) {
>> > +            libxl__ptr_add(gc, data);
>> > +            if (!dom->acpi_modules[0].data) {
>> > +                dom->acpi_modules[0].data = data;
>> > +                dom->acpi_modules[0].length = (uint32_t)datalen;
>> > +            } else {
>> > +                /* joint tables */
>> > +                void *newdata;
>> > +                newdata = malloc(datalen + dom->acpi_modules[0].length);
>> 
>> All memory allocations in libxl should use libxl__*lloc wrappers.
>> 
>> > +                if (!newdata) {
>> > +                    LOGE(ERROR, "failed to joint DMAR table to acpi modules");
>> > +                    rc = ERROR_FAIL;
>> > +                    goto out;
>> > +                }
>> > +                memcpy(newdata, dom->acpi_modules[0].data,
>> > +                       dom->acpi_modules[0].length);
>> > +                memcpy(newdata + dom->acpi_modules[0].length, data, datalen);
>> > +                dom->acpi_modules[0].data = newdata;
>> > +                dom->acpi_modules[0].length += (uint32_t)datalen;
>
>Also, this leaks the old pointer, right?

Yes. Will fix this.

>
>> > +            }
>> > +        }
>> > +    }
>> 
>> This still looks wrong to me. How do you know acpi_modules[0] is DMAR
>> table?
>> 
>
>Oh, I sorta see why you do this, but I still think this is wrong. The
>DMAR should either be a new module or be joined to the existing one (and
>with all conflicts resolved).

Hi, Wei
Thanks for your comments.

iirc, HVM only supports one module, so the DMAR cannot be a new module.
Joining it to the existing one is the approach we are taking.

Which kind of conflicts do you think should be resolved? If you mean that I
forgot to free the old buffer, I will fix this. If you mean the potential
overlap between the binary passed by the admin and the DMAR table built here,
I don't have a good idea of how to handle that. Even without the DMAR table,
the binary may contain MADT or other tables, and the tool stack doesn't
interpret the binary or check whether there are conflicts, right?

Thanks
Chao

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-17 11:18   ` Wei Liu
  2017-08-18  0:22     ` [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities Lan Tianyu
@ 2017-08-18  7:09     ` Lan Tianyu
  2017-08-18 10:13       ` Wei Liu
  1 sibling, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-18  7:09 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

On 2017-08-17 19:18, Wei Liu wrote:
> On Wed, Aug 09, 2017 at 04:34:03PM -0400, Lan Tianyu wrote:
>> This patch adds an irq request callback for platform implementations
>> to deal with irq remapping requests.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/viommu.c          | 15 +++++++++
>>  xen/include/asm-x86/viommu.h | 73 ++++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/xen/viommu.h     |  9 ++++++
>>  3 files changed, 97 insertions(+)
>>  create mode 100644 xen/include/asm-x86/viommu.h
>>
>> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
>> index a4d004d..f4d34e6 100644
>> --- a/xen/common/viommu.c
>> +++ b/xen/common/viommu.c
>> @@ -198,6 +198,21 @@ int __init viommu_setup(void)
>>      return 0;
>>  }
>>  
>> +int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>> +                              struct irq_remapping_request *request)
>> +{
>> +    struct viommu_info *info = &d->viommu;
> 
> Does this compile? This patch and the previous one don't have viommu
> added to struct domain.

Oh, sorry. The patch "VIOMMU: Add vIOMMU helper functions to create,
destroy and query capabilities" was missing from this series. I have just
sent it out as a follow-up to this mail. Please have a look.

> 
>> +
>> +    if ( viommu_id >= info->nr_viommu
>> +         || !info->viommu[viommu_id] )
> 
> Join this to previous line?
>

OK.

>> +        return -EINVAL;
> 
> ASSERT(info->viommu[viommu_id]->ops);
> 
> For extra safety.

Or check ops in the previous if?
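
A sketch of what folding the ops check into the existing condition could
look like. The structures are trimmed-down stand-ins for the ones in this
series, and the function takes the viommu_info directly (rather than a
struct domain) only so that the fragment stands alone; neither is meant as
the final form.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VIOMMU_PER_DOMAIN 1

struct domain;                  /* opaque stand-in */
struct irq_remapping_request;   /* opaque stand-in */

struct viommu_ops {
    int (*handle_irq_request)(struct domain *d,
                              struct irq_remapping_request *request);
};

struct viommu {
    const struct viommu_ops *ops;
};

struct viommu_info {
    uint32_t nr_viommu;
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

/* Validate id, instance, ops table and callback in one condition. */
static int viommu_handle_irq_request_sketch(struct viommu_info *info,
                                            struct domain *d,
                                            uint32_t viommu_id,
                                            struct irq_remapping_request *request)
{
    if ( viommu_id >= info->nr_viommu || !info->viommu[viommu_id] ||
         !info->viommu[viommu_id]->ops ||
         !info->viommu[viommu_id]->ops->handle_irq_request )
        return -EINVAL;

    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
}
```

Compared with an ASSERT(), this keeps release builds from dereferencing a
NULL ops pointer, at the cost of reporting a programming error as -EINVAL.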

> 
>> +
>> +    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
>> +        return -EINVAL;
>> +
>> +    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
>> +}
>> +
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
>> new file mode 100644
>> index 0000000..51bda72
>> --- /dev/null
>> +++ b/xen/include/asm-x86/viommu.h
>> @@ -0,0 +1,73 @@
>> +/*
>> + * include/asm-x86/viommu.h
>> + *
>> + * Copyright (c) 2017 Intel Corporation.
>> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + */
>> +#ifndef __ARCH_X86_VIOMMU_H__
>> +#define __ARCH_X86_VIOMMU_H__
>> +
> 
> Is a corresponding ARM header needed? Given viommu is common code.

I added such an ARM header file in a previous version, but Julien would
prefer vIOMMU to be disabled on ARM because ARM doesn't support vIOMMU so far.

> 
>> +#include <xen/viommu.h>
> 
> I think you're probably doing it wrong.
> 
> It should be that the common header includes the arch header, then
> the code only uses the common header (I haven't read the rest of your
> series at this point).

OK. Will fix it.

> 
>> +#include <asm/types.h>
>> +
>> +/* IRQ request type */
>> +#define VIOMMU_REQUEST_IRQ_MSI          0
>> +#define VIOMMU_REQUEST_IRQ_APIC         1
>> +
>> +struct irq_remapping_request
>> +{
>> +    union {
>> +        /* MSI */
>> +        struct {
>> +            u64 addr;
>> +            u32 data;
>> +        } msi;
>> +        /* Redirection Entry in IOAPIC */
>> +        u64 rte;
>> +    } msg;
>> +    u16 source_id;
>> +    u8 type;
> 
> uintXX_t please.

Sure.

> 
>> +};
>> +
>> +static inline void irq_request_ioapic_fill(struct irq_remapping_request *req,
>> +                             uint32_t ioapic_id, uint64_t rte)
> 
> Indentation.
> 
>> +{
>> +    ASSERT(req);
>> +    req->type = VIOMMU_REQUEST_IRQ_APIC;
>> +    req->source_id = ioapic_id;
>> +    req->msg.rte = rte;
>> +}
>> +
>> +static inline void irq_request_msi_fill(struct irq_remapping_request *req,
>> +                          uint32_t source_id, uint64_t addr, uint32_t data)
> 
> Indentation.
> 
>> +{
>> +    ASSERT(req);
>> +    req->type = VIOMMU_REQUEST_IRQ_MSI;
>> +    req->source_id = source_id;
>> +    req->msg.msi.addr = addr;
>> +    req->msg.msi.data = data;
>> +}
>> +
>> +#endif /* __ARCH_X86_VIOMMU_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * End:
>> + */
>> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
>> index 527afb1..0be1b3a 100644
>> --- a/xen/include/xen/viommu.h
>> +++ b/xen/include/xen/viommu.h
>> @@ -20,6 +20,8 @@
>>  #ifndef __XEN_VIOMMU_H__
>>  #define __XEN_VIOMMU_H__
>>  
>> +#include <asm/viommu.h>
>> +
> 
> Circular inclusion? Note the #include <xen/viommu.h> some lines above.
> 
>>  #define NR_VIOMMU_PER_DOMAIN 1
>>  
>>  struct viommu;
>> @@ -28,6 +30,8 @@ struct viommu_ops {
>>      u64 (*query_caps)(struct domain *d);
>>      int (*create)(struct domain *d, struct viommu *viommu);
>>      int (*destroy)(struct viommu *viommu);
>> +    int (*handle_irq_request)(struct domain *d,
>> +                              struct irq_remapping_request *request);
>>  };
>>  
>>  struct viommu {
>> @@ -52,6 +56,8 @@ int viommu_register_type(u64 type, struct viommu_ops * ops);
>>  int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
>>                    bool_t *need_copy);
>>  int viommu_setup(void);
>> +int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>> +                              struct irq_remapping_request *request);
>>  #else
>>  static inline int viommu_init_domain(struct domain *d) { return 0; }
>>  static inline int viommu_register_type(u64 type, struct viommu_ops * ops)
>> @@ -62,6 +68,9 @@ static inline int viommu_domctl(struct domain *d,
>>                                  struct xen_domctl_viommu_op *op,
>>                                  bool *need_copy)
>>  { return -ENODEV };
>> +static inline int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>> +                              struct irq_remapping_request *request)
>> +{ return 0 };
> 
> This should fail.
> 
>>  #endif
>>  
>>  #endif /* __XEN_VIOMMU_H__ */
>> -- 
>> 1.8.3.1
>>


-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-17 11:19   ` Wei Liu
@ 2017-08-18  7:17     ` Lan Tianyu
  2017-08-18 10:15       ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-18  7:17 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

On 2017-08-17 19:19, Wei Liu wrote:
> On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
>> +Now just suppport single vIOMMU for one VM and introduced domctls are compatible
>> +with multi-vIOMMU support.
> 
> Is this still true? 

Yes, the patchset only supports a single vIOMMU per VM.

> There is an ID field in the struct which can
> distinguish multiple viommus, right?

Yes, this is reserved for multi-vIOMMU support.

> 
>> +
>> +xl vIOMMU configuration
>> +=======================
>> +viommu="type=intel_vtd,intremap=1,x2apic=1"
> 
> If there is provision to support multiple viommu please make this an
> array.

Ok. Will update.
-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-18  0:22     ` [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities Lan Tianyu
@ 2017-08-18  8:41       ` Jan Beulich
  2017-08-18  8:50         ` Lan Tianyu
  2017-08-18 13:32       ` Wei Liu
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 136+ messages in thread
From: Jan Beulich @ 2017-08-18  8:41 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, chao.gao

>>> On 18.08.17 at 02:22, <tianyu.lan@intel.com> wrote:
> This patch is to introduct an abstract layer for arch vIOMMU implementation
> to deal with requests from dom0. Arch vIOMMU code needs to provide callback
> to perform create, destroy and query capabilities operation.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

What is this? It looks to be a standalone patch when there was
a V2 series a couple of days ago with 1/25 titled "DOMCTL:
Introduce new DOMCTL commands for vIOMMU support". I'm
confused.

Jan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-18  8:41       ` Jan Beulich
@ 2017-08-18  8:50         ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-18  8:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, chao.gao

On 2017-08-18 16:41, Jan Beulich wrote:
>>>> On 18.08.17 at 02:22, <tianyu.lan@intel.com> wrote:
>> This patch is to introduct an abstract layer for arch vIOMMU implementation
>> to deal with requests from dom0. Arch vIOMMU code needs to provide callback
>> to perform create, destroy and query capabilities operation.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> What is this? It looks to be a standalone patch when there was
> a V2 series a couple of days ago with 1/25 titled "DOMCTL:
> Introduce new DOMCTL commands for vIOMMU support". I'm
> confused.
> 
> Jan
> 
Hi Jan:
	Sorry, I missed this patch when sending the patchset; it should be the
first one. I will send v3 with some fixes.
-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-18  7:09     ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
@ 2017-08-18 10:13       ` Wei Liu
  2017-08-22  8:04         ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-18 10:13 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Fri, Aug 18, 2017 at 03:09:55PM +0800, Lan Tianyu wrote:
> On 2017-08-17 19:18, Wei Liu wrote:
> > On Wed, Aug 09, 2017 at 04:34:03PM -0400, Lan Tianyu wrote:
> >> This patch is to add irq request callback for platform implementation
> >> to deal with irq remapping request.
> >>
> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> >> ---
> >>  xen/common/viommu.c          | 15 +++++++++
> >>  xen/include/asm-x86/viommu.h | 73 ++++++++++++++++++++++++++++++++++++++++++++
> >>  xen/include/xen/viommu.h     |  9 ++++++
> >>  3 files changed, 97 insertions(+)
> >>  create mode 100644 xen/include/asm-x86/viommu.h
> >>
> >> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> >> index a4d004d..f4d34e6 100644
> >> --- a/xen/common/viommu.c
> >> +++ b/xen/common/viommu.c
> >> @@ -198,6 +198,21 @@ int __init viommu_setup(void)
> >>      return 0;
> >>  }
> >>  
> >> +int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
> >> +                              struct irq_remapping_request *request)
> >> +{
> >> +    struct viommu_info *info = &d->viommu;
> > 
> > Does this compile? This patch and the previous one don't have viommu
> > added to struct domain.
> 
> Oh. Sorry. Missed patch "VIOMMU: Add vIOMMU helper functions to create,
>  destroy and query capabilities" in this series. I just sent out and
> followed this mail. Please have a look.
> 
> > 
> >> +
> >> +    if ( viommu_id >= info->nr_viommu
> >> +         || !info->viommu[viommu_id] )
> > 
> > Join this to previous line?
> >
> 
> OK.
> 
> >> +        return -EINVAL;
> > 
> > ASSERT(info->viommu[viommu_id]->ops);
> > 
> > For extra safety.
> 
> Or check ops in the previous if?
> 

That depends on if ops can be null or not.

> > 
> >> +
> >> +    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
> >> +        return -EINVAL;
> >> +
> >> +    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
> >> +}
> >> +
> >>  /*
> >>   * Local variables:
> >>   * mode: C
> >> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
> >> new file mode 100644
> >> index 0000000..51bda72
> >> --- /dev/null
> >> +++ b/xen/include/asm-x86/viommu.h
> >> @@ -0,0 +1,73 @@
> >> +/*
> >> + * include/asm-x86/viommu.h
> >> + *
> >> + * Copyright (c) 2017 Intel Corporation.
> >> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify it
> >> + * under the terms and conditions of the GNU General Public License,
> >> + * version 2, as published by the Free Software Foundation.
> >> + *
> >> + * This program is distributed in the hope it will be useful, but WITHOUT
> >> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> >> + * more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License along with
> >> + * this program; If not, see <http://www.gnu.org/licenses/>.
> >> + *
> >> + */
> >> +#ifndef __ARCH_X86_VIOMMU_H__
> >> +#define __ARCH_X86_VIOMMU_H__
> >> +
> > 
> > Is a corresponding ARM header needed? Given viommu is common code.
> 
> I added such ARM header file in previous version but Julien hope vIOMMU
> should be disabled for ARM because ARM doesn't support vIOMMU so far.
> 

If that's what he wanted, sure.

Please build test ARM as well. You can do so using travis-ci.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-18  7:17     ` Lan Tianyu
@ 2017-08-18 10:15       ` Wei Liu
  2017-08-22  8:07         ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-18 10:15 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Fri, Aug 18, 2017 at 03:17:37PM +0800, Lan Tianyu wrote:
> On 2017-08-17 19:19, Wei Liu wrote:
> > On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
> >> +Now just suppport single vIOMMU for one VM and introduced domctls are compatible
> >> +with multi-vIOMMU support.
> > 
> > Is this still true? 
> 
> Yes, the patchset just supports single vIOMMU for one VM.
> 

The first part of the sentence is true, but the latter is probably not.
It seems to me domctl is able to cope with multiple viommu. Please
correct me if I'm wrong.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-18  0:22     ` [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities Lan Tianyu
  2017-08-18  8:41       ` Jan Beulich
@ 2017-08-18 13:32       ` Wei Liu
  2017-08-22 15:27       ` Roger Pau Monné
  2017-08-24  8:14       ` Tian, Kevin
  3 siblings, 0 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-18 13:32 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Thu, Aug 17, 2017 at 08:22:16PM -0400, Lan Tianyu wrote:
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index b22aacc..d1f9b10 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -396,6 +396,9 @@ struct domain *domain_create(domid_t domid, unsigned int domcr_flags,
>          spin_unlock(&domlist_update_lock);
>      }
>  
> +    if ( (err = viommu_init_domain(d)) != 0 )
> +        goto fail;
> +

Where is the code to destroy viommu during domain destruction?

I suppose you will need a viommu_destroy_domain and call it somewhere in
complete_domain_destroy.

> +
> +#include <xen/sched.h>
> +#include <xen/spinlock.h>
> +#include <xen/types.h>
> +#include <xen/viommu.h>
> +
> +bool __read_mostly opt_viommu;
> +boolean_param("viommu", opt_viommu);

Missing patch to xen command line option doc.

> +
> +static spinlock_t type_list_lock;
> +static struct list_head type_list;
> +
> +struct viommu_type {
> +    u64 type;

uintXX_t here and in all other places please.

[...]
> +
> +static int viommu_create(struct domain *d, u64 type, u64 base_address,
> +    u64 length, u64 caps)
> +{
> +    struct viommu_info *info = &d->viommu;
> +    struct viommu *viommu;
> +    struct viommu_type *viommu_type = NULL;
> +    int rc;
> +
> +    viommu_type = viommu_get_type(type);
> +    if ( !viommu_type )
> +        return -EINVAL;
> +
> +    if ( info->nr_viommu >= NR_VIOMMU_PER_DOMAIN

E2BIG for this?

> +        || !viommu_type->ops || !viommu_type->ops->create )
> +        return -EINVAL;
> +
> +    viommu = xzalloc(struct viommu);
> +    if ( !viommu )
> +        return -ENOMEM;
> +
[...]
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 6673b27..98a965a 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -21,6 +21,7 @@
>  #include <xen/perfc.h>
>  #include <asm/atomic.h>
>  #include <xen/wait.h>
> +#include <xen/viommu.h>
>  #include <public/xen.h>
>  #include <public/domctl.h>
>  #include <public/sysctl.h>
> @@ -477,6 +478,7 @@ struct domain
>      /* vNUMA topology accesses are protected by rwlock. */
>      rwlock_t vnuma_rwlock;
>      struct vnuma_info *vnuma;

Please add a new line here.

> +    struct viommu_info viommu;
>  
>      /* Common monitor options */
>      struct {
> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
> new file mode 100644
> index 0000000..506ea54
> --- /dev/null
> +++ b/xen/include/xen/viommu.h
> @@ -0,0 +1,71 @@
> +/*
> + * include/xen/viommu.h
> + *
> + * Copyright (c) 2017, Intel Corporation
> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#ifndef __XEN_VIOMMU_H__
> +#define __XEN_VIOMMU_H__
> +
> +#define NR_VIOMMU_PER_DOMAIN 1
> +
> +struct viommu;
> +
> +struct viommu_ops {
> +    u64 (*query_caps)(struct domain *d);
> +    int (*create)(struct domain *d, struct viommu *viommu);
> +    int (*destroy)(struct viommu *viommu);
> +};
> +
> +struct viommu {
> +    u64 base_address;
> +    u64 length;
> +    u64 caps;
> +    u32 viommu_id;
> +    const struct viommu_ops *ops;
> +    void *priv;
> +};
> +
> +struct viommu_info {
> +    u32 nr_viommu;

unsigned int

> +    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN]; /* viommu array*/
> +};
> +
> +#ifdef CONFIG_VIOMMU
> +extern bool_t opt_viommu;

bool
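
To make the review points above concrete (a distinct -E2BIG when the
per-domain limit is reached, uintXX_t types, and a teardown hook to call on
domain destruction), here is a rough, self-contained sketch. It is not the
patch itself: calloc()/free() stand in for xzalloc()/xfree(), the ops pointer
is passed in directly instead of being looked up by type, and
viommu_destroy_domain() and demo_create() are hypothetical names used only
for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_VIOMMU_PER_DOMAIN 1

struct domain;
struct viommu;

struct viommu_ops {
    int (*create)(struct domain *d, struct viommu *viommu);
    int (*destroy)(struct viommu *viommu);
};

struct viommu {
    uint64_t base_address;
    uint64_t length;
    uint64_t caps;
    uint32_t viommu_id;
    const struct viommu_ops *ops;
    void *priv;
};

struct viommu_info {
    unsigned int nr_viommu;
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

/* Stand-in for Xen's struct domain; only the vIOMMU state is modelled. */
struct domain {
    struct viommu_info viommu;
};

int viommu_create(struct domain *d, const struct viommu_ops *ops,
                  uint64_t base_address, uint64_t length, uint64_t caps)
{
    struct viommu_info *info = &d->viommu;
    struct viommu *viommu;
    int rc;

    /* Malformed request: no usable backend. */
    if ( !ops || !ops->create )
        return -EINVAL;

    /* Limit reached: reported separately so the caller can tell why. */
    if ( info->nr_viommu >= NR_VIOMMU_PER_DOMAIN )
        return -E2BIG;

    viommu = calloc(1, sizeof(*viommu)); /* stand-in for xzalloc() */
    if ( !viommu )
        return -ENOMEM;

    viommu->base_address = base_address;
    viommu->length = length;
    viommu->caps = caps;
    viommu->viommu_id = info->nr_viommu;
    viommu->ops = ops;

    rc = ops->create(d, viommu);
    if ( rc )
    {
        free(viommu);
        return rc;
    }

    info->viommu[info->nr_viommu++] = viommu;

    return 0;
}

/* Hypothetical teardown hook, to be called on domain destruction. */
void viommu_destroy_domain(struct domain *d)
{
    struct viommu_info *info = &d->viommu;
    unsigned int i;

    for ( i = 0; i < info->nr_viommu; i++ )
    {
        struct viommu *viommu = info->viommu[i];

        if ( viommu->ops->destroy )
            viommu->ops->destroy(viommu);
        free(viommu); /* stand-in for xfree() */
        info->viommu[i] = NULL;
    }

    info->nr_viommu = 0;
}

/* Hypothetical backend callback that always succeeds. */
int demo_create(struct domain *d, struct viommu *viommu)
{
    (void)d;
    (void)viommu;

    return 0;
}
```

With NR_VIOMMU_PER_DOMAIN set to 1, a second create on the same domain would
return -E2BIG here, while a missing backend still returns -EINVAL.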

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-18  5:45       ` Chao Gao
@ 2017-08-18 13:45         ` Wei Liu
  2017-08-18 13:56           ` Jan Beulich
  2017-08-22 13:48         ` Wei Liu
  1 sibling, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-18 13:45 UTC (permalink / raw)
  To: Wei Liu, Lan Tianyu, xen-devel, ian.jackson, jbeulich,
	andrew.cooper3, kevin.tian, julien.grall

On Fri, Aug 18, 2017 at 01:45:50PM +0800, Chao Gao wrote:
> >
> >> > +            }
> >> > +        }
> >> > +    }
> >> 
> >> This still looks wrong to me. How do you know acpi_modules[0] is DMAR
> >> table?
> >> 
> >
> >Oh, I sorta see why you do this, but I still think this is wrong. The
> >DMAR should either be a new module or be joined to the existing one (and
> >with all conflicts resolved).
> 
> Hi, Wei
> Thanks for your comments.
> 
> iirc, HVM only supports one module;

This is indeed how it is stated in various comments. I'm not sure why
there is such a restriction. Maybe the x86 maintainers can shed more light
on this?

> DMAR cannot be a new module. Joining to
> the existing one is the approach we are taking. 
> 
> Which kind of conflicts you think should be resolved? If you mean I
> forget to free the old buf, I will fix this. If you mean the potential
> overlap between the binary passed by admin and DMAR table built here, I
> don't have much idea on this. Even without the DMAR table, the binary
> may contains MADT or other tables and tool stacks don't intrepret the
> binary and check whether there are conflicts, right?

That's true. Ignore the comment about fixing up conflicts then.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-18 13:45         ` Wei Liu
@ 2017-08-18 13:56           ` Jan Beulich
  2017-08-22 13:44             ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Jan Beulich @ 2017-08-18 13:56 UTC (permalink / raw)
  To: Wei Liu
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall

>>> On 18.08.17 at 15:45, <wei.liu2@citrix.com> wrote:
> On Fri, Aug 18, 2017 at 01:45:50PM +0800, Chao Gao wrote:
>> >
>> >> > +            }
>> >> > +        }
>> >> > +    }
>> >> 
>> >> This still looks wrong to me. How do you know acpi_modules[0] is DMAR
>> >> table?
>> >> 
>> >
>> >Oh, I sorta see why you do this, but I still think this is wrong. The
>> >DMAR should either be a new module or be joined to the existing one (and
>> >with all conflicts resolved).
>> 
>> Hi, Wei
>> Thanks for your comments.
>> 
>> iirc, HVM only supports one module;
> 
> This is indeed how it is stated in various comments. I'm not sure why
> there is such restriction. Maybe x86 maintainers can shed more light on
> this?

Not me, sorry. Maybe ask whoever has written that code?

Jan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-18 10:13       ` Wei Liu
@ 2017-08-22  8:04         ` Lan Tianyu
  2017-08-22  8:42           ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-22  8:04 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

On 2017-08-18 18:13, Wei Liu wrote:
>> > > ASSERT(info->viommu[viommu_id]->ops);
>> > > 
>> > > For extra safety.
>> > 
>> > Or check ops in the previous if?
>> > 
> That depends on if ops can be null or not.

If ops isn't set, it will be NULL, because struct viommu is allocated
via xzalloc().

> 
>> > >> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
>> > >> new file mode 100644
>> > >> index 0000000..51bda72
>> > >> --- /dev/null
>> > >> +++ b/xen/include/asm-x86/viommu.h
>> > >> @@ -0,0 +1,73 @@
>> > >> +/*
>> > >> + * include/asm-x86/viommu.h
>> > >> + *
>> > >> + * Copyright (c) 2017 Intel Corporation.
>> > >> + * Author: Lan Tianyu <tianyu.lan@intel.com>
>> > >> + *
>> > >> + * This program is free software; you can redistribute it and/or modify it
>> > >> + * under the terms and conditions of the GNU General Public License,
>> > >> + * version 2, as published by the Free Software Foundation.
>> > >> + *
>> > >> + * This program is distributed in the hope it will be useful, but WITHOUT
>> > >> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> > >> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> > >> + * more details.
>> > >> + *
>> > >> + * You should have received a copy of the GNU General Public License along with
>> > >> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> > >> + *
>> > >> + */
>> > >> +#ifndef __ARCH_X86_VIOMMU_H__
>> > >> +#define __ARCH_X86_VIOMMU_H__
>> > >> +
>> > > 
>> > > Is a corresponding ARM header needed? Given viommu is common code.
>> > 
>> > I added such ARM header file in previous version but Julien hope vIOMMU
>> > should be disabled for ARM because ARM doesn't support vIOMMU so far.
>> > 
> If that's what he wanted, sure.
> 
> Please build test ARM as well. You can do so using travis-ci.

OK. Will test.
-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-18 10:15       ` Wei Liu
@ 2017-08-22  8:07         ` Lan Tianyu
  2017-08-22 11:03           ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-22  8:07 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

On 2017-08-18 18:15, Wei Liu wrote:
> On Fri, Aug 18, 2017 at 03:17:37PM +0800, Lan Tianyu wrote:
>> On 2017-08-17 19:19, Wei Liu wrote:
>>> On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
>>>> +Now just suppport single vIOMMU for one VM and introduced domctls are compatible
>>>> +with multi-vIOMMU support.
>>>
>>> Is this still true? 
>>
>> Yes, the patchset just supports single vIOMMU for one VM.
>>
> 
> The first part of the sentence is true, but the latter is probably not.
> It seems to me domctl is able to cope with multiple viommu. Please
> correct me if I'm wrong.

These domctls are able to support multiple vIOMMUs, but the vIOMMU device
model in the Xen hypervisor only supports a single vIOMMU per VM.

-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-22  8:04         ` Lan Tianyu
@ 2017-08-22  8:42           ` Wei Liu
  2017-08-22 10:39             ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-22  8:42 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Tue, Aug 22, 2017 at 04:04:09PM +0800, Lan Tianyu wrote:
> On 2017-08-18 18:13, Wei Liu wrote:
> >> > > ASSERT(info->viommu[viommu_id]->ops);
> >> > > 
> >> > > For extra safety.
> >> > 
> >> > Or check ops in the previous if?
> >> > 
> > That depends on if ops can be null or not.
> 
> If ops isn't be set, it will be null. Because struct viommu is allocated
> via xzalloc().

But is it functionally correct / possible for it to be NULL when you
reach this path?

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-22  8:42           ` Wei Liu
@ 2017-08-22 10:39             ` Lan Tianyu
  2017-08-22 10:53               ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-22 10:39 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

On 2017-08-22 16:42, Wei Liu wrote:
> On Tue, Aug 22, 2017 at 04:04:09PM +0800, Lan Tianyu wrote:
>> On 2017-08-18 18:13, Wei Liu wrote:
>>>>> ASSERT(info->viommu[viommu_id]->ops);
>>>>>
>>>>> For extra safety.
>>>>
>>>> Or check ops in the previous if?
>>>>
>>> That depends on if ops can be null or not.
>>
>> If ops isn't be set, it will be null. Because struct viommu is allocated
>> via xzalloc().
> 
> But is it functionally correct / possible to have it to be NULL when you
> come to this path?
> 

No, it shouldn't be NULL if struct viommu is set up correctly.

-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-22 10:39             ` Lan Tianyu
@ 2017-08-22 10:53               ` Wei Liu
  2017-08-22 10:54                 ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-22 10:53 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Tue, Aug 22, 2017 at 06:39:32PM +0800, Lan Tianyu wrote:
> On 2017-08-22 16:42, Wei Liu wrote:
> > On Tue, Aug 22, 2017 at 04:04:09PM +0800, Lan Tianyu wrote:
> >> On 2017-08-18 18:13, Wei Liu wrote:
> >>>>> ASSERT(info->viommu[viommu_id]->ops);
> >>>>>
> >>>>> For extra safety.
> >>>>
> >>>> Or check ops in the previous if?
> >>>>
> >>> That depends on if ops can be null or not.
> >>
> >> If ops isn't be set, it will be null. Because struct viommu is allocated
> >> via xzalloc().
> > 
> > But is it functionally correct / possible to have it to be NULL when you
> > come to this path?
> > 
> 
> No, it shouldn't be NULL if struct viommu is set up correctly.
> 

Then an ASSERT is warranted -- hope you see the line of thinking.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-22 10:53               ` Wei Liu
@ 2017-08-22 10:54                 ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-22 10:54 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

On 2017-08-22 18:53, Wei Liu wrote:
> On Tue, Aug 22, 2017 at 06:39:32PM +0800, Lan Tianyu wrote:
>> On 2017-08-22 16:42, Wei Liu wrote:
>>> On Tue, Aug 22, 2017 at 04:04:09PM +0800, Lan Tianyu wrote:
>>>> On 2017-08-18 18:13, Wei Liu wrote:
>>>>>>> ASSERT(info->viommu[viommu_id]->ops);
>>>>>>>
>>>>>>> For extra safety.
>>>>>>
>>>>>> Or check ops in the previous if?
>>>>>>
>>>>> That depends on if ops can be null or not.
>>>>
>>>> If ops isn't be set, it will be null. Because struct viommu is allocated
>>>> via xzalloc().
>>>
>>> But is it functionally correct / possible to have it to be NULL when you
>>> come to this path?
>>>
>>
>> No, it shouldn't be NULL if struct viommu is set up correctly.
>>
> 
> Then an ASSERT is warranted -- hope you see the line of thinking.
> 

OK. I got it. Will add in the next version.
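
The outcome of this sub-thread can be sketched as follows: conditions a
caller can legitimately trigger (a bad viommu_id, a backend without the
callback) stay as runtime checks returning -EINVAL, while a NULL ops pointer,
which cannot occur if struct viommu was set up correctly, is caught by an
assertion. The sketch is self-contained rather than the real Xen code: the
ASSERT() macro is mapped onto the C library's assert(), the structures are
cut down to the fields used, and demo_handle_irq_request() is a hypothetical
backend callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VIOMMU_PER_DOMAIN 1
#define ASSERT(x) assert(x) /* stand-in for Xen's ASSERT() */

struct domain;

/* Placeholder; the real structure lives in asm-x86/viommu.h. */
struct irq_remapping_request {
    int dummy;
};

struct viommu_ops {
    int (*handle_irq_request)(struct domain *d,
                              struct irq_remapping_request *request);
};

struct viommu {
    const struct viommu_ops *ops;
};

struct viommu_info {
    unsigned int nr_viommu;
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

/* Stand-in for Xen's struct domain; only the vIOMMU state is modelled. */
struct domain {
    struct viommu_info viommu;
};

int viommu_handle_irq_request(struct domain *d, uint32_t viommu_id,
                              struct irq_remapping_request *request)
{
    struct viommu_info *info = &d->viommu;

    /* Caller-controlled input: validate at run time. */
    if ( viommu_id >= info->nr_viommu || !info->viommu[viommu_id] )
        return -EINVAL;

    /* Internal invariant: a created vIOMMU always has ops set. */
    ASSERT(info->viommu[viommu_id]->ops);

    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
        return -EINVAL;

    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
}

/* Hypothetical backend callback that accepts every request. */
int demo_handle_irq_request(struct domain *d,
                            struct irq_remapping_request *request)
{
    (void)d;
    (void)request;

    return 0;
}
```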

-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-22  8:07         ` Lan Tianyu
@ 2017-08-22 11:03           ` Wei Liu
  2017-08-23  2:06             ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-22 11:03 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Tue, Aug 22, 2017 at 04:07:32PM +0800, Lan Tianyu wrote:
> On 2017-08-18 18:15, Wei Liu wrote:
> > On Fri, Aug 18, 2017 at 03:17:37PM +0800, Lan Tianyu wrote:
> >> On 2017-08-17 19:19, Wei Liu wrote:
> >>> On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
> >>>> +Now just suppport single vIOMMU for one VM and introduced domctls are compatible
> >>>> +with multi-vIOMMU support.
> >>>
> >>> Is this still true? 
> >>
> >> Yes, the patchset just supports single vIOMMU for one VM.
> >>
> > 
> > The first part of the sentence is true, but the latter is probably not.
> > It seems to me domctl is able to cope with multiple viommu. Please
> > correct me if I'm wrong.
> 
> These domctl is able to support multiple vIOMMU but vIOMMU device model
> in Xen hypervisor only support single vIOMMU for one VM.
> 

In that case please update the document.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc
  2017-08-09 20:34 ` [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc Lan Tianyu
@ 2017-08-22 11:09   ` Wei Liu
  2017-08-22 16:25   ` Roger Pau Monné
  1 sibling, 0 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-22 11:09 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:06PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This patch adds XEN_DOMCTL_viommu_op hypercall. This hypercall
> comprises three sub-command:
> - query capabilities of one specific type vIOMMU emulated by Xen
> - create vIOMMU in Xen hypervisor, given viommu type, register-set location
> and capabilities
> - destroy vIOMMU specified by viommu_id
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

This needs to get rebased in accordance with the proposed name changes
in the struct.

> +
> +#include "xc_private.h"
> +
> +int xc_viommu_query_cap(xc_interface *xch, domid_t dom,
> +                        uint64_t type, uint64_t *cap)
> +{
> +    int rc;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_viommu_op;
> +    domctl.domain = (domid_t)dom;

Pointless cast here and below.

> +    domctl.u.viommu_op.cmd = XEN_DOMCTL_query_viommu_caps;
> +    domctl.u.viommu_op.u.query_caps.viommu_type = type;
> +
> +    rc = do_domctl(xch, &domctl);
> +    if ( !rc )
> +        *cap = domctl.u.viommu_op.u.query_caps.capabilities;

Blank line here please.

> +    return rc;
> +}
> +
> +int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
> +                     uint64_t base_addr, uint64_t length, uint64_t cap,
> +                     uint32_t *viommu_id)
> +{
> +    int rc;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_viommu_op;
> +    domctl.domain = (domid_t)dom;
> +    domctl.u.viommu_op.cmd = XEN_DOMCTL_create_viommu;
> +    domctl.u.viommu_op.u.create_viommu.viommu_type = type;
> +    domctl.u.viommu_op.u.create_viommu.base_address = base_addr;
> +    domctl.u.viommu_op.u.create_viommu.length = length;
> +    domctl.u.viommu_op.u.create_viommu.capabilities = cap;
> +
> +    rc = do_domctl(xch, &domctl);
> +    if ( !rc )
> +        *viommu_id = domctl.u.viommu_op.u.create_viommu.viommu_id;

Ditto.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 6/25] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures
  2017-08-09 20:34 ` [PATCH V2 6/25] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
@ 2017-08-22 12:56   ` Wei Liu
  2017-08-23  2:47     ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-22 12:56 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:07PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Add dmar table structure according Chapter 8 "BIOS Considerations" of
> VTd spec Rev. 2.4.
> 
> VTd spec:http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

I checked the spec against the content; they match.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-09 20:34 ` [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
@ 2017-08-22 13:12   ` Wei Liu
  2017-08-23  2:36     ` Lan Tianyu
  2017-08-22 16:41   ` Roger Pau Monné
  1 sibling, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-22 13:12 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:08PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> The BIOS reports the remapping hardware units in a platform to system software
> through the DMA Remapping Reporting (DMAR) ACPI table.
> New fields are introduces for DMAR table. These new fields are set by
> toolstack through parsing guest's config file. construct_dmar() is added to
> build DMAR table according to the new fields.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  tools/libacpi/build.c   | 57 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/libacpi/libacpi.h |  9 ++++++++
>  2 files changed, 66 insertions(+)
> 
> diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
> index f9881c9..c7cc784 100644
> --- a/tools/libacpi/build.c
> +++ b/tools/libacpi/build.c
> @@ -28,6 +28,10 @@
>  
>  #define ACPI_MAX_SECONDARY_TABLES 16
>  
> +#define VTD_HOST_ADDRESS_WIDTH 39
> +#define I440_PSEUDO_BUS_PLATFORM 0xff
> +#define I440_PSEUDO_DEVFN_IOAPIC 0x0

I have some stupid questions. What are these I440 values? Where do they
come from?

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-08-09 20:34 ` [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
@ 2017-08-22 13:19   ` Wei Liu
  2017-08-23  2:46     ` Lan Tianyu
  2017-08-22 16:48   ` Roger Pau Monné
  1 sibling, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-22 13:19 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:09PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
[...]
> -=back 
> +=back
> +
> +=item B<viommu="VIOMMU_STRING">
> +
> +Specifies the vIOMMU which are to be provided to the guest.
> +
> +B<VIOMMU_STRING> has the form C<KEY=VALUE,KEY=VALUE,...> where:
> +
> +=over 4
> +
> +=item B<KEY=VALUE>
> +
> +Possible B<KEY>s are:
> +
> +=over 4
> +
> +=item B<type="STRING">
> +
> +Currently there is only one valid type:
> +
> +(x86 only) "intel_vtd" means providing a emulated Intel VT-d to the guest.
> +
> +=item B<intremap=BOOLEAN>
> +
> +Specifies whether the vIOMMU should support interrupt remapping
> +and default 'true'.
> +
> +=item B<x2apic=BOOLEAN>
> +
> +Specifies whether the vIOMMU should support x2apic mode and default 'true'.
> +Only valid for "intel_vtd".

Why not expose base address and length as well?

> +
> +=back
>  
>  =head3 Guest Virtual Time Controls
>  
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 8a9849c..7abd70c 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -450,6 +450,21 @@ libxl_altp2m_mode = Enumeration("altp2m_mode", [
>      (3, "limited"),
>      ], init_val = "LIBXL_ALTP2M_MODE_DISABLED")
>  
> +libxl_viommu_type = Enumeration("viommu_type", [
> +    (1, "intel_vtd"),
> +    ])
> +
> +libxl_viommu_info = Struct("viommu_info", [
> +    ("u", KeyedUnion(None, libxl_viommu_type, "type",
> +           [("intel_vtd", Struct(None, [
> +                 ("x2apic",     libxl_defbool)]))
> +           ])),
> +    ("intremap",        libxl_defbool),
> +    ("cap",             uint64),
> +    ("base_addr",       uint64),
> +    ("len",             uint64),
> +    ])
> +
>  libxl_domain_build_info = Struct("domain_build_info",[
>      ("max_vcpus",       integer),
>      ("avail_vcpus",     libxl_bitmap),
> @@ -506,6 +521,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>      # 65000 which is reserved by the toolstack.
>      ("device_tree",      string),
>      ("acpi",             libxl_defbool),
> +    ("viommu",           libxl_viommu_info),

An array please, we shouldn't limit the number of viommus.

>      ("u", KeyedUnion(None, libxl_domain_type, "type",
>                  [("hvm", Struct(None, [("firmware",         string),
>                                         ("bios",             libxl_bios_type),
> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 5c2bf17..11c4eb2 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -17,6 +17,7 @@
>  #include <limits.h>
>  #include <stdio.h>
>  #include <stdlib.h>
> +#include <xen/domctl.h>

Why is this needed?

>  #include <xen/hvm/e820.h>
>  #include <xen/hvm/params.h>
>  
> @@ -30,6 +31,9 @@
>  
>  extern void set_default_nic_values(libxl_device_nic *nic);
>  
> +#define VIOMMU_VTD_BASE_ADDR        0xfed90000ULL
> +#define VIOMMU_VTD_REGISTER_LEN     0x1000ULL
> +
>  #define ARRAY_EXTEND_INIT__CORE(array,count,initfn,more)                \
>      ({                                                                  \
>          typeof((count)) array_extend_old_count = (count);               \
> @@ -804,6 +808,61 @@ int parse_usbdev_config(libxl_device_usbdev *usbdev, char *token)
>      return 0;
>  }
>  
> +/* Parses viommu data and adds info into viommu
> + * Returns 1 if the input doesn't form a valid viommu
> + * or parsed values are not correct. Successful parse returns 0 */
> +static int parse_viommu_config(libxl_viommu_info *viommu, const char *info)
> +{
> +    char *ptr, *oparg, *saveptr = NULL, *buf = xstrdup(info);
> +
> +    ptr = strtok_r(buf, ",", &saveptr);
> +    if (MATCH_OPTION("type", ptr, oparg)) {
> +        if (!strcmp(oparg, "intel_vtd")) {
> +            viommu->type = LIBXL_VIOMMU_TYPE_INTEL_VTD;
> +        } else {
> +            fprintf(stderr, "Invalid viommu type: %s\n", oparg);
> +            return 1;
> +        }
> +    } else {
> +        fprintf(stderr, "viommu type should be set first: %s\n", oparg);
> +        return 1;
> +    }
> +
> +    ptr = strtok_r(NULL, ",", &saveptr);
> +    if (MATCH_OPTION("intremap", ptr, oparg)) {
> +        libxl_defbool_set(&viommu->intremap, !!strtoul(oparg, NULL, 0));
> +    }
> +
> +    if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> +        for (ptr = strtok_r(NULL, ",", &saveptr); ptr;
> +             ptr = strtok_r(NULL, ",", &saveptr)) {
> +            if (MATCH_OPTION("x2apic", ptr, oparg)) {
> +                libxl_defbool_set(&viommu->u.intel_vtd.x2apic,
> +                                  !!strtoul(oparg, NULL, 0));
> +            } else {
> +                fprintf(stderr, "Unknown string `%s' in viommu spec\n", ptr);
> +                return 1;
> +            }
> +        }
> +
> +        if (libxl_defbool_is_default(viommu->intremap))
> +            libxl_defbool_set(&viommu->intremap, true);
> +
> +        if (!libxl_defbool_val(viommu->intremap)) {
> +            fprintf(stderr, "Cannot create one virtual VTD without intremap\n");
> +            return 1;
> +        }
> +
> +        /* Set default values to unexposed fields */
> +        viommu->base_addr = VIOMMU_VTD_BASE_ADDR;
> +        viommu->len = VIOMMU_VTD_REGISTER_LEN;
> +

You're doing this in xl. This is not right. The default value should be
set from within libxl.

You should have a libxl_XXX_setdefault function for this type.
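
To illustrate the suggestion, here is a minimal, self-contained sketch of
what such a setdefault helper could look like. It is not libxl code: the
defbool type, struct viommu_info and viommu_info_setdefault() below are
hypothetical stand-ins for libxl_defbool, libxl_viommu_info and a
libxl-internal helper, and treating a zero base_addr/len as "unset" is an
assumption. Only the two VT-d constants are taken from the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for libxl_defbool: 0 = unset, >0 = true, <0 = false. */
typedef struct { int val; } defbool;

static bool defbool_is_default(defbool db) { return db.val == 0; }
static bool defbool_val(defbool db) { return db.val > 0; }

/* Only assign a value if the caller left the field unset. */
static void defbool_setdefault(defbool *db, bool b)
{
    if (defbool_is_default(*db))
        db->val = b ? 1 : -1;
}

/* Values taken from the quoted patch. */
#define VIOMMU_VTD_BASE_ADDR    0xfed90000ULL
#define VIOMMU_VTD_REGISTER_LEN 0x1000ULL

enum viommu_type { VIOMMU_TYPE_NONE = 0, VIOMMU_TYPE_INTEL_VTD = 1 };

/* Simplified, hypothetical mirror of libxl_viommu_info. */
struct viommu_info {
    enum viommu_type type;
    defbool intremap;
    defbool x2apic;
    uint64_t base_addr;
    uint64_t len;
};

/*
 * Fill in every field the caller left unset.
 * Returns 0 on success, -1 if the resulting configuration is unsupported.
 */
int viommu_info_setdefault(struct viommu_info *v)
{
    if (v->type != VIOMMU_TYPE_INTEL_VTD)
        return 0;                       /* no vIOMMU requested */

    defbool_setdefault(&v->intremap, true);
    defbool_setdefault(&v->x2apic, true);

    /* The patch refuses a virtual VT-d without interrupt remapping. */
    if (!defbool_val(v->intremap))
        return -1;

    /* Assumption: zero means "not specified by the user". */
    if (!v->base_addr)
        v->base_addr = VIOMMU_VTD_BASE_ADDR;
    if (!v->len)
        v->len = VIOMMU_VTD_REGISTER_LEN;

    return 0;
}
```

With something along these lines, xl would only parse the user's string
and leave unset fields alone, and every libxl caller (not just xl) would
get the same defaults.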

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-18 13:56           ` Jan Beulich
@ 2017-08-22 13:44             ` Wei Liu
  0 siblings, 0 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-22 13:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Lan Tianyu, kevin.tian, Wei Liu, andrew.cooper3, ian.jackson,
	xen-devel, julien.grall

On Fri, Aug 18, 2017 at 07:56:36AM -0600, Jan Beulich wrote:
> >>> On 18.08.17 at 15:45, <wei.liu2@citrix.com> wrote:
> > On Fri, Aug 18, 2017 at 01:45:50PM +0800, Chao Gao wrote:
> >> >
> >> >> > +            }
> >> >> > +        }
> >> >> > +    }
> >> >> 
> >> >> This still looks wrong to me. How do you know acpi_modules[0] is DMAR
> >> >> table?
> >> >> 
> >> >
> >> >Oh, I sorta see why you do this, but I still think this is wrong. The
> >> >DMAR should either be a new module or be joined to the existing one (and
> >> >with all conflicts resolved).
> >> 
> >> Hi, Wei
> >> Thanks for your comments.
> >> 
> >> iirc, HVM only supports one module;
> > 
> > This is indeed how it is stated in various comments. I'm not sure why
> > there is such restriction. Maybe x86 maintainers can shed more light on
> > this?
> 
> Not me, sorry. Maybe ask whoever has written that code?
> 

OK. I had misunderstood and thought the restriction came from hvmloader.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-18  5:45       ` Chao Gao
  2017-08-18 13:45         ` Wei Liu
@ 2017-08-22 13:48         ` Wei Liu
  2017-08-23  5:35           ` Lan Tianyu
  1 sibling, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-22 13:48 UTC (permalink / raw)
  To: Wei Liu, Lan Tianyu, xen-devel, ian.jackson, jbeulich,
	andrew.cooper3, kevin.tian, julien.grall

On Fri, Aug 18, 2017 at 01:45:50PM +0800, Chao Gao wrote:
> On Thu, Aug 17, 2017 at 01:28:21PM +0100, Wei Liu wrote:
> >On Thu, Aug 17, 2017 at 12:32:17PM +0100, Wei Liu wrote:
> >> On Wed, Aug 09, 2017 at 04:34:10PM -0400, Lan Tianyu wrote:
> >> > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> >> > index f54fd49..94c9196 100644
> >> > --- a/tools/libxl/libxl_dom.c
> >> > +++ b/tools/libxl/libxl_dom.c
> >> > @@ -1060,6 +1060,42 @@ static int libxl__domain_firmware(libxl__gc *gc,
> >> >          }
> >> >      }
> >> >  
> >> > +    /*
> >> > +     * If a guest has one virtual VTD, build DMAR table for it and joint this
> >> > +     * table with existing content in acpi_modules in order to employ HVM
> >> > +     * firmware pass-through mechanism to pass-through DMAR table.
> >> > +     */
> >> > +    if (info->viommu.type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> >> > +        datalen = 0;
> >> > +        e = libxl__dom_build_dmar(gc, info, dom, &data, &datalen);
> >> > +        if (e) {
> >> > +            LOGEV(ERROR, e, "failed to build DMAR table");
> >> > +            rc = ERROR_FAIL;
> >> > +            goto out;
> >> > +        }
> >> > +        if (datalen) {
> >> > +            libxl__ptr_add(gc, data);
> >> > +            if (!dom->acpi_modules[0].data) {
> >> > +                dom->acpi_modules[0].data = data;
> >> > +                dom->acpi_modules[0].length = (uint32_t)datalen;
> >> > +            } else {
> >> > +                /* joint tables */
> >> > +                void *newdata;
> >> > +                newdata = malloc(datalen + dom->acpi_modules[0].length);
> >> 
> >> All memory allocations in libxl should use libxl__*lloc wrappers.
> >> 
> >> > +                if (!newdata) {
> >> > +                    LOGE(ERROR, "failed to joint DMAR table to acpi modules");
> >> > +                    rc = ERROR_FAIL;
> >> > +                    goto out;
> >> > +                }
> >> > +                memcpy(newdata, dom->acpi_modules[0].data,
> >> > +                       dom->acpi_modules[0].length);
> >> > +                memcpy(newdata + dom->acpi_modules[0].length, data, datalen);
> >> > +                dom->acpi_modules[0].data = newdata;
> >> > +                dom->acpi_modules[0].length += (uint32_t)datalen;
> >
> >Also, this leaks the old pointer, right?
> 
> Yes. Will fix this.
> 
> >
> >> > +            }
> >> > +        }
> >> > +    }
> >> 
> >> This still looks wrong to me. How do you know acpi_modules[0] is DMAR
> >> table?
> >> 
> >
> >Oh, I sorta see why you do this, but I still think this is wrong. The
> >DMAR should either be a new module or be joined to the existing one (and
> >with all conflicts resolved).
> 
> Hi, Wei
> Thanks for your comments.
> 
> iirc, HVM only supports one module; DMAR cannot be a new module. Joining to
> the existing one is the approach we are taking. 
> 
> Which kind of conflicts you think should be resolved? If you mean I
> forget to free the old buf, I will fix this. If you mean the potential
> overlap between the binary passed by admin and DMAR table built here, I
> don't have much idea on this. Even without the DMAR table, the binary
> may contains MADT or other tables and tool stacks don't intrepret the
> binary and check whether there are conflicts, right?
> 

Thinking a bit more about this, when I first said "conflicts" I didn't
mean parsing the content. I was referring to the code in
libxl_x86_acpi.c, which also seems to manipulate acpi_modules.

I would like the code that generates the DMAR to take
libxl__dom_load_acpi into consideration.


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-09 20:34 ` [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
  2017-08-17 11:18   ` Wei Liu
@ 2017-08-22 14:32   ` Roger Pau Monné
  2017-08-23  6:06     ` Lan Tianyu
  1 sibling, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 14:32 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
> This patch is to introduce create, destroy and query capabilities
> command for vIOMMU. vIOMMU layer will deal with requests and call
> arch vIOMMU ops.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/domctl.c         |  3 +++
>  xen/common/viommu.c         | 43 +++++++++++++++++++++++++++++++++++++

I'm confused: I don't see this file in the repo, and the cover letter
doesn't mention this being based on top of any other series. Where
does this viommu.c file come from?

>  xen/include/public/domctl.h | 52 +++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/viommu.h    |  6 ++++++
>  4 files changed, 104 insertions(+)
> 
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index d80488b..01c3024 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -1144,6 +1144,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>          if ( !ret )
>              copyback = 1;
>          break;
> +    case XEN_DOMCTL_viommu_op:
> +        ret = viommu_domctl(d, &op->u.viommu_op, &copyback);
> +        break;

Hm, shouldn't this be protected with #ifdef CONFIG_VIOMMU?

>      default:
>          ret = arch_do_domctl(op, d, u_domctl);
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index 6874d9f..a4d004d 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -148,6 +148,49 @@ static u64 viommu_query_caps(struct domain *d, u64 type)
>      return viommu_type->ops->query_caps(d);
>  }
>  
> +int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
> +                  bool *need_copy)
> +{
> +    int rc = -EINVAL, ret;

Do you really need both ret and rc?

> +    if ( !viommu_enabled() )
> +        return rc;

EINVAL? Maybe ENODEV?

> +
> +    switch ( op->cmd )
> +    {
> +    case XEN_DOMCTL_create_viommu:
> +        ret = viommu_create(d, op->u.create_viommu.viommu_type,
> +            op->u.create_viommu.base_address,
> +            op->u.create_viommu.length,
> +            op->u.create_viommu.capabilities);

I would prefer viommu_create to simply return an error or 0, and to
store the viommu_id through a pointer parameter passed to viommu_create, i.e.:

rc = viommu_create(d, op->u.create_viommu.viommu_type,
                   op->u.create_viommu.base_address,
                   op->u.create_viommu.length,
                   op->u.create_viommu.capabilities,
                   &op->u.create_viommu.viommu_id);

> +        if ( ret >= 0 ) {
                           ^ coding style
> +            op->u.create_viommu.viommu_id = ret;
> +            *need_copy = true;
> +            rc = 0; /* return 0 if success */
> +        }
> +        break;
> +
> +    case XEN_DOMCTL_destroy_viommu:
> +        rc = viommu_destroy(d, op->u.destroy_viommu.viommu_id);
> +        break;
> +
> +    case XEN_DOMCTL_query_viommu_caps:
> +        ret = viommu_query_caps(d, op->u.query_caps.viommu_type);

Same here, I would rather pass another parameter and use the return
for error only.
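
A minimal sketch of the out-parameter pattern for both calls, under
stated assumptions: the struct domain and struct viommu below are
cut-down, hypothetical versions of the ones in the patch, calloc() stands
in for xzalloc(), and the single supported type value (1) and the
returned capability are placeholders.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_VIOMMU_PER_DOMAIN      1
#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)

/* Cut-down, hypothetical versions of the structures in the patch. */
struct viommu {
    uint64_t base_address;
    uint64_t length;
    uint64_t caps;
    uint32_t viommu_id;
};

struct domain {
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

/* Returns 0 or -errno; the new id is only written on success. */
int viommu_create(struct domain *d, uint64_t base_address, uint64_t length,
                  uint64_t caps, uint32_t *viommu_id)
{
    unsigned int i;
    struct viommu *v;

    /* Look for a free slot; a NULL entry means "not created". */
    for ( i = 0; i < NR_VIOMMU_PER_DOMAIN; i++ )
        if ( !d->viommu[i] )
            break;
    if ( i == NR_VIOMMU_PER_DOMAIN )
        return -EINVAL;

    v = calloc(1, sizeof(*v));          /* stand-in for xzalloc() */
    if ( !v )
        return -ENOMEM;

    v->base_address = base_address;
    v->length = length;
    v->caps = caps;
    v->viommu_id = i;

    d->viommu[i] = v;
    *viommu_id = i;

    return 0;
}

/* Same shape: the capability mask travels through a pointer. */
int viommu_query_caps(uint64_t type, uint64_t *caps)
{
    if ( type != 1 )                    /* placeholder for the VT-d type */
        return -EINVAL;

    *caps = VIOMMU_CAP_IRQ_REMAPPING;

    return 0;
}
```

Keeping the value and the error code in separate channels also avoids
squeezing a 64-bit capability mask, or a negative errno, through a
return type that cannot represent both.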

> +        if ( ret >= 0 )
> +        {
> +            op->u.query_caps.capabilities = ret;
> +            rc = 0;
> +        }
> +        *need_copy = true;
> +        break;
> +
> +    default:
> +        break;

Here you should return ENOSYS.
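
Putting the error-code comments on this function together, a rough
sketch of the dispatcher with a single rc could look like the following.
Everything here is a hypothetical, self-contained stand-in (the op
structure, the do_*() helpers and the viommu_enabled_flag variable are
not from the patch); it only shows the control flow: -ENODEV when the
feature is off, -ENOSYS for an unknown sub-command, and copy-back only
on success.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sub-commands, numbered as in the quoted header. */
enum { OP_CREATE = 0, OP_DESTROY = 1, OP_QUERY_CAPS = 2 };

/* Flattened, hypothetical version of xen_domctl_viommu_op. */
struct viommu_op {
    uint32_t cmd;
    uint32_t viommu_id;
    uint64_t capabilities;
};

/* Stand-in for viommu_enabled(). */
bool viommu_enabled_flag = true;

/* Trivial stand-ins for the real create/destroy/query helpers. */
static int do_create(uint32_t *id) { *id = 0; return 0; }
static int do_destroy(uint32_t id) { return id == 0 ? 0 : -EINVAL; }
static int do_query_caps(uint64_t *caps) { *caps = 1; return 0; }

int viommu_domctl_sketch(struct viommu_op *op, bool *need_copy)
{
    int rc;

    if ( !viommu_enabled_flag )
        return -ENODEV;

    switch ( op->cmd )
    {
    case OP_CREATE:
        rc = do_create(&op->viommu_id);
        if ( !rc )
            *need_copy = true;
        break;

    case OP_DESTROY:
        rc = do_destroy(op->viommu_id);
        break;

    case OP_QUERY_CAPS:
        rc = do_query_caps(&op->capabilities);
        if ( !rc )
            *need_copy = true;
        break;

    default:
        rc = -ENOSYS;
        break;
    }

    return rc;
}
```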

> +    }
> +
> +    return rc;
> +}
> +
>  int __init viommu_setup(void)
>  {
>      INIT_LIST_HEAD(&type_list);
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index ff39762..4b10f26 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1149,6 +1149,56 @@ struct xen_domctl_psr_cat_op {
>  typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
>  
> +/*  vIOMMU helper
> + *
> + *  vIOMMU interface can be used to create/destroy vIOMMU and
> + *  query vIOMMU capabilities.
> + */
> +
> +/* vIOMMU type - specify vendor vIOMMU device model */
> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)

If this is going to be used to specify the vendor only, it doesn't need
to be a bitfield, because it doesn't make sense to specify, for
example, VIOMMU_TYPE_INTEL_VTD | VIOMMU_TYPE_AMD: it's either Intel or
AMD. Do you have plans to expand this with other uses? In which case
the comment should be fixed.
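
To make the distinction concrete, here is a small sketch, assuming the
type is only ever a single vendor model. The enum values, the second
capability bit and both helper names are made up for illustration.
Mutually exclusive choices get plain numbers and are checked by
equality, while capabilities, which can be combined, stay as bit flags
and are checked by masking.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mutually exclusive: exactly one vendor model, so plain numbering. */
enum viommu_type {
    VIOMMU_TYPE_INTEL_VTD = 1,          /* hypothetical numbering */
};

/* Combinable: several capabilities can be requested at once. */
#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
#define VIOMMU_CAP_HYPOTHETICAL   (1u << 1)   /* made-up second bit */

/* A type is valid only if it names exactly one known model. */
bool viommu_type_valid(uint64_t type)
{
    return type == VIOMMU_TYPE_INTEL_VTD;
}

/* A request is valid only if it asks for nothing unsupported. */
bool viommu_caps_valid(uint64_t requested, uint64_t supported)
{
    return (requested & ~supported) == 0;
}
```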

> +
> +/* vIOMMU capabilities */
> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> +
> +struct xen_domctl_viommu_op {
> +    uint32_t cmd;
> +#define XEN_DOMCTL_create_viommu          0
> +#define XEN_DOMCTL_destroy_viommu         1
> +#define XEN_DOMCTL_query_viommu_caps      2
> +    union {
> +        struct {
> +            /* IN - vIOMMU type */
> +            uint64_t viommu_type;
> +            /* 
> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
> +             * are in charge of to check base_address and length.
> +             */
> +            uint64_t base_address;
> +            /* IN - Length of MMIO region */
> +            uint64_t length;

It seems weird that you can specify the length. Is that something
that a user would like to set? Isn't the length of the IOMMU MMIO
region fixed by the hardware spec?

> +            /* IN - Capabilities with which we want to create */
> +            uint64_t capabilities;
> +            /* OUT - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } create_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } destroy_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU type */
> +            uint64_t viommu_type;
> +            /* OUT - vIOMMU Capabilities */
> +            uint64_t capabilities;
> +        } query_caps;

This also seems weird, shouldn't you query the capabilities of an
already created vIOMMU, rather than a vIOMMU type? Shouldn't the first
field be viommu_id?

Roger.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-18  0:22     ` [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities Lan Tianyu
  2017-08-18  8:41       ` Jan Beulich
  2017-08-18 13:32       ` Wei Liu
@ 2017-08-22 15:27       ` Roger Pau Monné
  2017-08-23  7:10         ` Lan Tianyu
  2017-08-24  8:14       ` Tian, Kevin
  3 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 15:27 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Thu, Aug 17, 2017 at 08:22:16PM -0400, Lan Tianyu wrote:
> This patch is to introduct an abstract layer for arch vIOMMU implementation
> to deal with requests from dom0. Arch vIOMMU code needs to provide callback
> to perform create, destroy and query capabilities operation.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/arch/x86/Kconfig     |   1 +
>  xen/arch/x86/setup.c     |   1 +
>  xen/common/Kconfig       |   3 +
>  xen/common/Makefile      |   1 +
>  xen/common/domain.c      |   3 +
>  xen/common/viommu.c      | 165 +++++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/sched.h  |   2 +
>  xen/include/xen/viommu.h |  71 ++++++++++++++++++++
>  8 files changed, 247 insertions(+)
>  create mode 100644 xen/common/viommu.c
>  create mode 100644 xen/include/xen/viommu.h
> 
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 30c2769..1f1de96 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -23,6 +23,7 @@ config X86
>  	select HAS_PDX
>  	select NUMA
>  	select VGA
> +	select VIOMMU
>  
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index db5df69..68f1631 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1513,6 +1513,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>      early_msi_init();
>  
>      iommu_setup();    /* setup iommu if available */
> +    viommu_setup();
>  
>      smp_prepare_cpus(max_cpus);
>  
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index dc8e876..2ad2c8d 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -49,6 +49,9 @@ config HAS_CHECKPOLICY
>  	string
>  	option env="XEN_HAS_CHECKPOLICY"
>  
> +config VIOMMU
> +	bool
> +
>  config KEXEC
>  	bool "kexec support"
>  	default y
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 26c5a64..852553d 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -56,6 +56,7 @@ obj-y += time.o
>  obj-y += timer.o
>  obj-y += trace.o
>  obj-y += version.o
> +obj-$(CONFIG_VIOMMU) += viommu.o
>  obj-y += virtual_region.o
>  obj-y += vm_event.o
>  obj-y += vmap.o
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index b22aacc..d1f9b10 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -396,6 +396,9 @@ struct domain *domain_create(domid_t domid, unsigned int domcr_flags,
>          spin_unlock(&domlist_update_lock);
>      }
>  
> +    if ( (err = viommu_init_domain(d)) != 0 )
> +        goto fail;
> +
>      return d;
>  
>   fail:
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> new file mode 100644
> index 0000000..6874d9f
> --- /dev/null
> +++ b/xen/common/viommu.c
> @@ -0,0 +1,165 @@
> +/*
> + * common/viommu.c
> + * 
> + * Copyright (c) 2017 Intel Corporation
> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/sched.h>
> +#include <xen/spinlock.h>
> +#include <xen/types.h>
> +#include <xen/viommu.h>
> +
> +bool __read_mostly opt_viommu;
> +boolean_param("viommu", opt_viommu);
> +
> +static spinlock_t type_list_lock;

static DEFINE_SPINLOCK(type_list_lock);

> +static struct list_head type_list;

static LIST_HEAD(type_list);

> +
> +struct viommu_type {
> +    u64 type;
> +    struct viommu_ops *ops;
> +    struct list_head node;
> +};
> +
> +int viommu_init_domain(struct domain *d)
> +{
> +    d->viommu.nr_viommu = 0;
> +    return 0;
> +}

If you don't use the viommu_info struct you can also get rid of this.
The array entries will point to NULL which can be used to signal not
initialized.

> +
> +static struct viommu_type *viommu_get_type(u64 type)
> +{
> +    struct viommu_type *viommu_type = NULL;
> +
> +    spin_lock(&type_list_lock);
> +    list_for_each_entry( viommu_type, &type_list, node )
> +    {
> +        if ( viommu_type->type == type )
> +        {
> +            spin_unlock(&type_list_lock);
> +            return viommu_type;
> +        }
> +    }
> +    spin_unlock(&type_list_lock);
> +
> +    return NULL;
> +}
> +
> +int viommu_register_type(u64 type, struct viommu_ops * ops)
> +{
> +    struct viommu_type *viommu_type = NULL;
> +
> +    if ( !viommu_enabled() )
> +        return -EINVAL;

ENODEV is maybe better here.

> +
> +    if ( viommu_get_type(type) )
> +        return -EEXIST;
> +
> +    viommu_type = xzalloc(struct viommu_type);
> +    if ( !viommu_type )
> +        return -ENOMEM;
> +
> +    viommu_type->type = type;
> +    viommu_type->ops = ops;
> +
> +    spin_lock(&type_list_lock);
> +    list_add_tail(&viommu_type->node, &type_list);
> +    spin_unlock(&type_list_lock);
> +
> +    return 0;
> +}

Hm, I haven't seen the usage of this function, but from the looks of
it it seems like you want to use something similar to what's used by
the scheduler in order to register vIOMMU types.

See the logic around REGISTER_SCHEDULER in xen/sched-if.h.
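
For readers unfamiliar with that pattern, the following is a userspace
approximation of link-time registration, not the Xen macro itself. It
assumes GCC or Clang on an ELF target with GNU ld (or a compatible
linker), which provides __start_<name>/__stop_<name> symbols
automatically for sections whose name is a valid C identifier; in the
hypervisor the equivalent bounds would presumably come from the linker
script. The structure, macro and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops structure that carries its own type identifier. */
struct viommu_ops_sketch {
    uint64_t type;
    const char *name;
};

/* Drop a pointer to the ops structure into a dedicated section. */
#define REGISTER_VIOMMU_TYPE(x)                                     \
    static const struct viommu_ops_sketch *const x##_entry          \
    __attribute__((used, section("viommu_types"))) = &(x)

/* Section bounds, provided by the linker (see assumptions above). */
extern const struct viommu_ops_sketch *const __start_viommu_types[];
extern const struct viommu_ops_sketch *const __stop_viommu_types[];

/* A vendor model registers itself at its definition site. */
static const struct viommu_ops_sketch vtd_ops = {
    .type = 1,
    .name = "intel_vtd",
};
REGISTER_VIOMMU_TYPE(vtd_ops);

/* Lookup walks the section: no list, no lock, no runtime allocation. */
const struct viommu_ops_sketch *viommu_find_type(uint64_t type)
{
    const struct viommu_ops_sketch *const *p;

    for ( p = __start_viommu_types; p < __stop_viommu_types; p++ )
        if ( (*p)->type == type )
            return *p;

    return NULL;
}
```

Because the table is fixed once the image is linked, a scheme like this
would make the runtime list, its lock and the xzalloc() in
viommu_register_type unnecessary.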

> +
> +static int viommu_create(struct domain *d, u64 type, u64 base_address,
> +    u64 length, u64 caps)
> +{
> +    struct viommu_info *info = &d->viommu;
> +    struct viommu *viommu;
> +    struct viommu_type *viommu_type = NULL;
> +    int rc;
> +
> +    viommu_type = viommu_get_type(type);
> +    if ( !viommu_type )
> +        return -EINVAL;
> +
> +    if ( info->nr_viommu >= NR_VIOMMU_PER_DOMAIN
> +        || !viommu_type->ops || !viommu_type->ops->create )
           ^ aligned with "info" above.
> +        return -EINVAL;
> +
> +    viommu = xzalloc(struct viommu);
> +    if ( !viommu )
> +        return -ENOMEM;
> +
> +    viommu->base_address = base_address;
> +    viommu->length = length;
> +    viommu->caps = caps;
> +    viommu->ops = viommu_type->ops;
> +    viommu->viommu_id = info->nr_viommu;
> +
> +    info->viommu[info->nr_viommu] = viommu;
> +    info->nr_viommu++;
> +
> +    rc = viommu->ops->create(d, viommu);
> +    if ( rc < 0 )
> +    {
> +        xfree(viommu);
> +        info->nr_viommu--;
> +        info->viommu[info->nr_viommu] = NULL;
> +        return rc;
> +    }
> +
> +    return viommu->viommu_id;
> +}
> +
> +static int viommu_destroy(struct domain *d, u32 viommu_id)
> +{
> +    struct viommu_info *info = &d->viommu;
> +
> +    if ( viommu_id >= info->nr_viommu || !info->viommu[viommu_id] )
> +        return -EINVAL;
> +
> +    if ( info->viommu[viommu_id]->ops->destroy(info->viommu[viommu_id]) )
> +        return -EFAULT;

You should return the original return value from the
"destroy" function pointer, instead of hardcoding it to EFAULT.

> +
> +    xfree(info->viommu[viommu_id]);
> +    info->viommu[viommu_id] = NULL;
> +    return 0;
> +}
> +
> +static u64 viommu_query_caps(struct domain *d, u64 type)
> +{
> +    struct viommu_type *viommu_type = viommu_get_type(type);
> +
> +    if ( !viommu_type )
> +        return -EINVAL;
> +
> +    return viommu_type->ops->query_caps(d);
> +}
> +
> +int __init viommu_setup(void)
> +{
> +    INIT_LIST_HEAD(&type_list);
> +    spin_lock_init(&type_list_lock);
> +    return 0;
> +}

With the suggested changes to init type_list and type_list_lock at
definition time you can get rid of viommu_setup.
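
As a self-contained illustration of why definition-time initialisation
removes the need for a setup hook, here is a userspace sketch. The list
macros mimic the usual LIST_HEAD pattern, and a C11 atomic_flag is used
as a stand-in for DEFINE_SPINLOCK; the helper names and the cut-down
struct viommu_type are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* An empty list is a head that points at itself. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Both objects are usable from the first instruction: no init call. */
static LIST_HEAD(type_list);
static atomic_flag type_list_lock = ATOMIC_FLAG_INIT;  /* spinlock stand-in */

static void type_list_acquire(void)
{
    while ( atomic_flag_test_and_set_explicit(&type_list_lock,
                                              memory_order_acquire) )
        ; /* spin */
}

static void type_list_release(void)
{
    atomic_flag_clear_explicit(&type_list_lock, memory_order_release);
}

/* Cut-down version of the structure in the patch. */
struct viommu_type {
    unsigned long type;
    struct list_head node;
};

/* Append a type to the tail of the list under the lock. */
void viommu_type_add(struct viommu_type *t)
{
    type_list_acquire();
    t->node.prev = type_list.prev;
    t->node.next = &type_list;
    type_list.prev->next = &t->node;
    type_list.prev = &t->node;
    type_list_release();
}

/* Count the registered types under the lock. */
int viommu_type_count(void)
{
    const struct list_head *p;
    int n = 0;

    type_list_acquire();
    for ( p = type_list.next; p != &type_list; p = p->next )
        n++;
    type_list_release();

    return n;
}
```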

> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * End:
> + */
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 6673b27..98a965a 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -21,6 +21,7 @@
>  #include <xen/perfc.h>
>  #include <asm/atomic.h>
>  #include <xen/wait.h>
> +#include <xen/viommu.h>
>  #include <public/xen.h>
>  #include <public/domctl.h>
>  #include <public/sysctl.h>
> @@ -477,6 +478,7 @@ struct domain
>      /* vNUMA topology accesses are protected by rwlock. */
>      rwlock_t vnuma_rwlock;
>      struct vnuma_info *vnuma;
> +    struct viommu_info viommu;
>  
>      /* Common monitor options */
>      struct {
> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
> new file mode 100644
> index 0000000..506ea54
> --- /dev/null
> +++ b/xen/include/xen/viommu.h
> @@ -0,0 +1,71 @@
> +/*
> + * include/xen/viommu.h
> + *
> + * Copyright (c) 2017, Intel Corporation
> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#ifndef __XEN_VIOMMU_H__
> +#define __XEN_VIOMMU_H__
> +
> +#define NR_VIOMMU_PER_DOMAIN 1
> +
> +struct viommu;
> +
> +struct viommu_ops {
> +    u64 (*query_caps)(struct domain *d);
> +    int (*create)(struct domain *d, struct viommu *viommu);
> +    int (*destroy)(struct viommu *viommu);
> +};
> +
> +struct viommu {
> +    u64 base_address;

All those should use uint*_t instead of the u* types.

> +    u64 length;
> +    u64 caps;
> +    u32 viommu_id;
> +    const struct viommu_ops *ops;
> +    void *priv;
> +};
> +
> +struct viommu_info {
> +    u32 nr_viommu;
> +    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN]; /* viommu array*/

Seems kind of pointless to have a nr_viommu field when the array is
not dynamic, you could just use ARRAY_SIZE. And then in the domain
struct you could directly add an array of viommu structs, getting rid
of viommu_info altogether.
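
A sketch of how the destroy path might look with the array placed
directly in the domain structure, assuming a fixed-size array of
pointers where NULL means "not created". The structures are cut-down
and hypothetical, free() stands in for xfree(), and the two callbacks
exist only to exercise the code. The sketch also passes the callback's
return value through unchanged rather than mapping it to a fixed errno.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_VIOMMU_PER_DOMAIN 1
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct viommu;

struct viommu_ops {
    int (*destroy)(struct viommu *viommu);
};

struct viommu {
    const struct viommu_ops *ops;
};

/* No viommu_info wrapper and no nr_viommu counter. */
struct domain {
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

int viommu_destroy(struct domain *d, uint32_t viommu_id)
{
    int rc;

    /* The array bound replaces nr_viommu; NULL means "not created". */
    if ( viommu_id >= ARRAY_SIZE(d->viommu) || !d->viommu[viommu_id] )
        return -EINVAL;

    /* Propagate whatever the device model reports. */
    rc = d->viommu[viommu_id]->ops->destroy(d->viommu[viommu_id]);
    if ( rc )
        return rc;

    free(d->viommu[viommu_id]);         /* stand-in for xfree() */
    d->viommu[viommu_id] = NULL;

    return 0;
}

/* Example callbacks, only here to exercise the function above. */
int destroy_ok(struct viommu *viommu) { (void)viommu; return 0; }
int destroy_busy(struct viommu *viommu) { (void)viommu; return -EBUSY; }

const struct viommu_ops ok_ops = { .destroy = destroy_ok };
const struct viommu_ops busy_ops = { .destroy = destroy_busy };
```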

> +};
> +
> +#ifdef CONFIG_VIOMMU
> +extern bool_t opt_viommu;

bool

> +static inline bool viommu_enabled(void) { return opt_viommu; }

I think those static inline functions should also follow the coding
standard.

Roger.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-09 20:34 ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
  2017-08-17 11:18   ` Wei Liu
@ 2017-08-22 15:32   ` Roger Pau Monné
  2017-08-23  7:42     ` Lan Tianyu
  1 sibling, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 15:32 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:03PM -0400, Lan Tianyu wrote:
> This patch is to add irq request callback for platform implementation
> to deal with irq remapping request.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/viommu.c          | 15 +++++++++
>  xen/include/asm-x86/viommu.h | 73 ++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/viommu.h     |  9 ++++++
>  3 files changed, 97 insertions(+)
>  create mode 100644 xen/include/asm-x86/viommu.h
> 
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index a4d004d..f4d34e6 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -198,6 +198,21 @@ int __init viommu_setup(void)
>      return 0;
>  }
>  
> +int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
> +                              struct irq_remapping_request *request)
> +{
> +    struct viommu_info *info = &d->viommu;
> +
> +    if ( viommu_id >= info->nr_viommu
> +         || !info->viommu[viommu_id] )

This fits on the same line, no need to split it.

> +        return -EINVAL;
> +
> +    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
> +        return -EINVAL;
> +
> +    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
> new file mode 100644
> index 0000000..51bda72
> --- /dev/null
> +++ b/xen/include/asm-x86/viommu.h
> @@ -0,0 +1,73 @@
> +/*
> + * include/asm-x86/viommu.h
> + *
> + * Copyright (c) 2017 Intel Corporation.
> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#ifndef __ARCH_X86_VIOMMU_H__
> +#define __ARCH_X86_VIOMMU_H__
> +
> +#include <xen/viommu.h>
> +#include <asm/types.h>
> +
> +/* IRQ request type */
> +#define VIOMMU_REQUEST_IRQ_MSI          0
> +#define VIOMMU_REQUEST_IRQ_APIC         1
> +
> +struct irq_remapping_request
> +{
> +    union {
> +        /* MSI */
> +        struct {
> +            u64 addr;
> +            u32 data;
> +        } msi;
> +        /* Redirection Entry in IOAPIC */
> +        u64 rte;
> +    } msg;
> +    u16 source_id;
> +    u8 type;
> +};
> +
> +static inline void irq_request_ioapic_fill(struct irq_remapping_request *req,
> +                             uint32_t ioapic_id, uint64_t rte)
> +{
> +    ASSERT(req);
> +    req->type = VIOMMU_REQUEST_IRQ_APIC;
> +    req->source_id = ioapic_id;
> +    req->msg.rte = rte;
> +}
> +
> +static inline void irq_request_msi_fill(struct irq_remapping_request *req,
> +                          uint32_t source_id, uint64_t addr, uint32_t data)
> +{
> +    ASSERT(req);
> +    req->type = VIOMMU_REQUEST_IRQ_MSI;
> +    req->source_id = source_id;
> +    req->msg.msi.addr = addr;
> +    req->msg.msi.data = data;
> +}

What's the use of these two functions? AFAICT they don't have any
callers in this patch.

> +
> +#endif /* __ARCH_X86_VIOMMU_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * End:
> + */
> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
> index 527afb1..0be1b3a 100644
> --- a/xen/include/xen/viommu.h
> +++ b/xen/include/xen/viommu.h
> @@ -20,6 +20,8 @@
>  #ifndef __XEN_VIOMMU_H__
>  #define __XEN_VIOMMU_H__
>  
> +#include <asm/viommu.h>
> +
>  #define NR_VIOMMU_PER_DOMAIN 1
>  
>  struct viommu;
> @@ -28,6 +30,8 @@ struct viommu_ops {
>      u64 (*query_caps)(struct domain *d);
>      int (*create)(struct domain *d, struct viommu *viommu);
>      int (*destroy)(struct viommu *viommu);
> +    int (*handle_irq_request)(struct domain *d,
> +                              struct irq_remapping_request *request);

I'm slightly lost, you add the function pointer here and some inline
functions in asm-x86/viommu.h, yet I don't see them being hooked into
the struct viommu_ops in any way.
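
For illustration only, here is a minimal standalone sketch of what
hooking a callback into an ops table and dispatching through it could
look like. All demo_* names and the trimmed-down structs are
hypothetical stand-ins, not code from this series, and the callback
takes the viommu instead of a struct domain purely to keep the example
self-contained:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types: simplified, hypothetical versions of the patch's structs. */
struct irq_remapping_request { uint16_t source_id; uint8_t type; };
struct viommu;

struct viommu_ops {
    int (*handle_irq_request)(struct viommu *v,
                              struct irq_remapping_request *req);
};

struct viommu { const struct viommu_ops *ops; unsigned int handled; };

#define NR_VIOMMU 1
struct viommu_info { unsigned int nr_viommu; struct viommu *viommu[NR_VIOMMU]; };

/* Hypothetical vendor callback: only counts the requests it sees. */
static int demo_handle_irq_request(struct viommu *v,
                                   struct irq_remapping_request *req)
{
    (void)req;
    v->handled++;
    return 0;
}

/* The hook-up step: the vendor model publishes its callbacks here. */
static const struct viommu_ops demo_ops = {
    .handle_irq_request = demo_handle_irq_request,
};

/* Generic dispatcher, with the bounds check kept on a single line. */
static int demo_dispatch(struct viommu_info *info, uint32_t viommu_id,
                         struct irq_remapping_request *req)
{
    if ( viommu_id >= info->nr_viommu || !info->viommu[viommu_id] )
        return -EINVAL;

    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
        return -EINVAL;

    return info->viommu[viommu_id]->ops->handle_irq_request(
        info->viommu[viommu_id], req);
}
```

A caller would point viommu->ops at demo_ops when the vIOMMU is created
and then go through demo_dispatch() for every request.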

Roger.


* Re: [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-08-09 20:34 ` [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
  2017-08-17 11:19   ` Wei Liu
@ 2017-08-22 15:38   ` Roger Pau Monné
  2017-08-23  7:43     ` Lan Tianyu
  2017-08-23  9:25     ` Jan Beulich
  1 sibling, 2 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 15:38 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:04PM -0400, Lan Tianyu wrote:
> This patch is to add get_irq_info callback for platform implementation
> to convert irq remapping request to irq info (E,G vector, dest, dest_mode
> and so on).
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/viommu.c          | 16 ++++++++++++++++
>  xen/include/asm-x86/viommu.h |  8 ++++++++
>  xen/include/xen/viommu.h     |  9 +++++++++
>  3 files changed, 33 insertions(+)
> 
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index f4d34e6..03c879d 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -213,6 +213,22 @@ int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>      return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
>  }
>  
> +int viommu_get_irq_info(struct domain *d, u32 viommu_id,
> +                        struct irq_remapping_request *request,
> +                        struct irq_remapping_info *irq_info)

The definition of this struct seems to be arch-specific, in which case
IMHO it should be called arch_irq_remapping_info, in order to denote
it's arch-specific.

> +{
> +    struct viommu_info *info = &d->viommu;
> +
> +    if ( viommu_id >= info->nr_viommu
> +         || !info->viommu[viommu_id] )

Unneeded line break.

Roger.


* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-09 20:34 ` [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
  2017-08-17 11:19   ` Wei Liu
@ 2017-08-22 15:55   ` Roger Pau Monné
  2017-08-23  7:36     ` Lan Tianyu
  1 sibling, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 15:55 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
> This patch is to add Xen virtual IOMMU doc to introduce motivation,
> framework, vIOMMU hypercall and xl configuration.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  docs/misc/viommu.txt | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 139 insertions(+)
>  create mode 100644 docs/misc/viommu.txt
> 
> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
> new file mode 100644
> index 0000000..39455bb
> --- /dev/null
> +++ b/docs/misc/viommu.txt

IMHO, this should be the first patch in the series.

> @@ -0,0 +1,139 @@
> +Xen virtual IOMMU
> +
> +Motivation
> +==========
> +*) Enable more than 255 vcpu support

Seems like the "*)" is some kind of leftover?

> +HPC cloud service requires VM provides high performance parallel
> +computing and we hope to create a huge VM with >255 vcpu on one machine
> +to meet such requirement. Pin each vcpu to separate pcpus.

I would re-write this as:

HPC cloud services currently require VMs with a high number of CPUs
in order to achieve high performance in parallel computing.

Also, this is needed in order to create VMs with > 128 vCPUs, not 255
vCPUs. That's because the APIC ID used by Xen is CPU ID * 2 (i.e. CPU
127 has APIC ID 254, which is the last one available in xAPIC mode).
You should reword the paragraphs below in order to fix the mention of
255 vCPUs.
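
The arithmetic behind the 128 limit can be sketched as follows. This is
only a worked example of the numbers quoted above; the function and
macro names are made up for illustration and are not Xen identifiers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Last APIC ID available in xAPIC mode, per the explanation above. */
#define XAPIC_LAST_APIC_ID 254u

/* APIC ID assignment described above: APIC ID = vCPU ID * 2. */
static uint32_t vcpu_to_apic_id(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* True if the vCPU can still be addressed with an 8-bit xAPIC ID. */
static bool vcpu_fits_xapic(uint32_t vcpu_id)
{
    return vcpu_to_apic_id(vcpu_id) <= XAPIC_LAST_APIC_ID;
}

/* Number of vCPUs (IDs 0..N-1) that fit: 254 / 2 + 1 = 128. */
static uint32_t xapic_max_vcpus(void)
{
    return XAPIC_LAST_APIC_ID / 2 + 1;
}
```

vCPU 127 maps to APIC ID 254 and is the last one that fits; vCPU 128
would need APIC ID 256, which no longer fits in 8 bits.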

> +
> +To support >255 vcpus, X2APIC mode in guest is necessary because legacy
> +APIC(XAPIC) just supports 8-bit APIC ID and it only can support 255
> +vcpus at most. X2APIC mode supports 32-bit APIC ID and it requires
> +interrupt mapping function of vIOMMU.

Correct me if I'm wrong, but I don't think x2APIC requires vIOMMU. The
IOMMU is required so that you can route interrupts to all the possible
CPUs. One could imagine a setup where only CPUs with APIC IDs < 255 are
used as targets of external interrupts, and that doesn't require an
IOMMU.
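
As a rough sketch of why only low APIC IDs can be targeted without
remapping: in the compatibility MSI address format the destination ID
lives in bits 19:12, so only 8 bits are available. The helper and macro
names below are hypothetical, and 0xff is treated as unusable since it
is the xAPIC broadcast ID:

```c
#include <stdbool.h>
#include <stdint.h>

#define MSI_ADDR_BASE        0xfee00000u  /* fixed upper bits of the address */
#define MSI_ADDR_DEST_SHIFT  12           /* destination ID is bits 19:12 */
#define MSI_ADDR_DEST_MASK   0xffu        /* only 8 bits of destination ID */

/*
 * Compose a non-remapped MSI address for a given APIC ID.
 * Returns false if the ID cannot be encoded in the 8-bit field.
 */
static bool msi_compose_addr(uint32_t apic_id, uint32_t *addr)
{
    if ( apic_id >= 0xff )
        return false;

    *addr = MSI_ADDR_BASE | (apic_id << MSI_ADDR_DEST_SHIFT);
    return true;
}

/* Extract the destination ID back out of an MSI address. */
static uint32_t msi_addr_dest(uint32_t addr)
{
    return (addr >> MSI_ADDR_DEST_SHIFT) & MSI_ADDR_DEST_MASK;
}
```

Any CPU for which msi_compose_addr() fails can only receive external
interrupts through interrupt remapping.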

> +The reason for this is that there is no modification to existing PCI MSI
> +and IOAPIC with the introduction of X2APIC. PCI MSI/IOAPIC can only send
> +interrupt message containing 8-bit APIC ID, which cannot address >255
> +cpus. Interrupt remapping supports 32-bit APIC ID and so it's necessary
> +to enable >255 cpus with x2apic mode.
> +
> +
> +vIOMMU Architecture
> +===================
> +vIOMMU device model is inside Xen hypervisor for following factors
> +    1) Avoid round trips between Qemu and Xen hypervisor
> +    2) Ease of integration with the rest of hypervisor
> +    3) HVMlite/PVH doesn't use Qemu
> +
> +* Interrupt remapping overview.
> +Interrupts from virtual devices and physical devices are delivered
> +to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
> +this procedure.
> +
> ++---------------------------------------------------+
> +|Qemu                       |VM                     |
> +|                           | +----------------+    |
> +|                           | |  Device driver |    |
> +|                           | +--------+-------+    |
> +|                           |          ^            |
> +|       +----------------+  | +--------+-------+    |
> +|       | Virtual device |  | |  IRQ subsystem |    |
> +|       +-------+--------+  | +--------+-------+    |
> +|               |           |          ^            |
> +|               |           |          |            |
> ++---------------------------+-----------------------+
> +|hypervisor     |                      | VIRQ       |
> +|               |            +---------+--------+   |
> +|               |            |      vLAPIC      |   |
> +|               |VIRQ        +---------+--------+   |
> +|               |                      ^            |
> +|               |                      |            |
> +|               |            +---------+--------+   |
> +|               |            |      vIOMMU      |   |
> +|               |            +---------+--------+   |
> +|               |                      ^            |
> +|               |                      |            |
> +|               |            +---------+--------+   |
> +|               |            |   vIOAPIC/vMSI   |   |
> +|               |            +----+----+--------+   |
> +|               |                 ^    ^            |
> +|               +-----------------+    |            |
> +|                                      |            |
> ++---------------------------------------------------+
> +HW                                     |IRQ
> +                                +-------------------+
> +                                |   PCI Device      |
> +                                +-------------------+
> +
> +
> +vIOMMU hypercall
> +================
> +Introduce new domctl hypercall "xen_domctl_viommu_op" to create/destroy
            ^ a
> +vIOMMU and query vIOMMU capabilities that device model can support.
         ^ s                                ^ the
> +
> +* vIOMMU hypercall parameter structure
> +
> +/* vIOMMU type - specify vendor vIOMMU device model */
> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
> +
> +/* vIOMMU capabilities */
> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> +
> +struct xen_domctl_viommu_op {
> +    uint32_t cmd;
> +#define XEN_DOMCTL_create_viommu          0
> +#define XEN_DOMCTL_destroy_viommu         1
> +#define XEN_DOMCTL_query_viommu_caps      2
> +    union {
> +        struct {
> +            /* IN - vIOMMU type  */
> +            uint64_t viommu_type;
> +            /* IN - MMIO base address of vIOMMU. */
> +            uint64_t base_address;
> +            /* IN - Length of MMIO region */
> +            uint64_t length;
> +            /* IN - Capabilities with which we want to create */
> +            uint64_t capabilities;
> +            /* OUT - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } create_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } destroy_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU type */
> +            uint64_t viommu_type;
> +            /* OUT - vIOMMU Capabilities */
> +            uint64_t capabilities;
> +        } query_caps;
> +    } u;
> +};
> +
> +- XEN_DOMCTL_query_viommu_caps
> +    Query capabilities of vIOMMU device model. vIOMMU_type specifies
> +which vendor vIOMMU device model(E,G Intel VTD) is targeted and hypervisor
> +returns capability bits(E,G interrupt remapping bit).
> +
> +- XEN_DOMCTL_create_viommu
> +    Create vIOMMU device with vIOMMU_type, capabilities, MMIO
> +base address and length. Hypervisor returns viommu_id. Capabilities should
> +be in range of value returned by query_viommu_caps hypercall.
> +
> +- XEN_DOMCTL_destroy_viommu
> +    Destroy vIOMMU in Xen hypervisor with viommu_id as parameters.
> +
> +Now just suppport single vIOMMU for one VM and introduced domtcls are compatible
> +with multi-vIOMMU support.
> +
> +xl vIOMMU configuration

This should be "xl x86 vIOMMU configuration", since it's clearly x86
specific.

> +=======================
> +viommu="type=intel_vtd,intremap=1,x2apic=1"

Shouldn't this have some kind of array form? From the code I saw it
seems like you are adding support for domains having multiple IOMMUs,
in which case this should at least look like:

viommu = [
    'type=intel_vtd,intremap=1,x2apic=1',
    'type=intel_vtd,intremap=1,x2apic=1'
]

But then it doesn't say which PCI bus each IOMMU is attached to.

Also, why do you need the x2apic parameter? Is there any value in
providing a vIOMMU if it doesn't support x2APIC mode?

Roger.


* Re: [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc
  2017-08-09 20:34 ` [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc Lan Tianyu
  2017-08-22 11:09   ` Wei Liu
@ 2017-08-22 16:25   ` Roger Pau Monné
  1 sibling, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 16:25 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:06PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This patch adds XEN_DOMCTL_viommu_op hypercall. This hypercall
> comprises three sub-command:
                             ^ s
> - query capabilities of one specific type vIOMMU emulated by Xen
                                           ^ of
> - create vIOMMU in Xen hypervisor, given viommu type, register-set location
          ^ a            ^s/hypervisor//
> and capabilities
> - destroy vIOMMU specified by viommu_id
           ^ a
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  tools/libxc/Makefile          |  1 +
>  tools/libxc/include/xenctrl.h |  8 +++++
>  tools/libxc/xc_viommu.c       | 81 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 90 insertions(+)
>  create mode 100644 tools/libxc/xc_viommu.c
> 
> diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
> index 28b1857..8724df5 100644
> --- a/tools/libxc/Makefile
> +++ b/tools/libxc/Makefile
> @@ -51,6 +51,7 @@ CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
>  CTRL_SRCS-y       += xc_evtchn_compat.c
>  CTRL_SRCS-y       += xc_gnttab_compat.c
>  CTRL_SRCS-y       += xc_devicemodel_compat.c
> +CTRL_SRCS-y       += xc_viommu.c
>  
>  GUEST_SRCS-y :=
>  GUEST_SRCS-y += xg_private.c xc_suspend.c
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index bde8313..dfaa9d5 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -27,6 +27,7 @@
>  #define __XEN_TOOLS__ 1
>  #endif
>  
> +#include <errno.h>

I don't see the need for this include.

>  #include <unistd.h>
>  #include <stddef.h>
>  #include <stdint.h>
> @@ -2499,6 +2500,13 @@ enum xc_static_cpu_featuremask {
>  const uint32_t *xc_get_static_cpu_featuremask(enum xc_static_cpu_featuremask);
>  const uint32_t *xc_get_feature_deep_deps(uint32_t feature);
>  
> +int xc_viommu_query_cap(xc_interface *xch, domid_t dom,
> +                        uint64_t type, uint64_t *cap);
> +int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
> +                     uint64_t base_addr, uint64_t length, uint64_t cap,
> +                     uint32_t *viommu_id);
> +int xc_viommu_destroy(xc_interface *xch, domid_t dom, uint32_t viommu_id);
> +
>  #endif
>  
>  int xc_livepatch_upload(xc_interface *xch,
> diff --git a/tools/libxc/xc_viommu.c b/tools/libxc/xc_viommu.c
> new file mode 100644
> index 0000000..04aae96
> --- /dev/null
> +++ b/tools/libxc/xc_viommu.c
> @@ -0,0 +1,81 @@
> +/*
> + * xc_viommu.c
> + *
> + * viommu related API functions.
> + *
> + * Copyright (C) 2017 Intel Corporation
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License, version 2.1, as published by the Free Software Foundation.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "xc_private.h"
> +
> +int xc_viommu_query_cap(xc_interface *xch, domid_t dom,
> +                        uint64_t type, uint64_t *cap)
> +{
> +    int rc;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_viommu_op;
> +    domctl.domain = (domid_t)dom;

Pointless cast.

> +    domctl.u.viommu_op.cmd = XEN_DOMCTL_query_viommu_caps;
> +    domctl.u.viommu_op.u.query_caps.viommu_type = type;
> +
> +    rc = do_domctl(xch, &domctl);
> +    if ( !rc )
> +        *cap = domctl.u.viommu_op.u.query_caps.capabilities;
> +    return rc;
> +}
> +
> +int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
> +                     uint64_t base_addr, uint64_t length, uint64_t cap,
> +                     uint32_t *viommu_id)
> +{
> +    int rc;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_viommu_op;
> +    domctl.domain = (domid_t)dom;

Pointless cast.

> +    domctl.u.viommu_op.cmd = XEN_DOMCTL_create_viommu;
> +    domctl.u.viommu_op.u.create_viommu.viommu_type = type;
> +    domctl.u.viommu_op.u.create_viommu.base_address = base_addr;
> +    domctl.u.viommu_op.u.create_viommu.length = length;
> +    domctl.u.viommu_op.u.create_viommu.capabilities = cap;
> +
> +    rc = do_domctl(xch, &domctl);
> +    if ( !rc )
> +        *viommu_id = domctl.u.viommu_op.u.create_viommu.viommu_id;
> +    return rc;
> +}
> +
> +int xc_viommu_destroy(xc_interface *xch, domid_t dom, uint32_t viommu_id)
> +{
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_viommu_op;
> +    domctl.domain = (domid_t)dom;
> +    domctl.u.viommu_op.cmd = XEN_DOMCTL_destroy_viommu;
> +    domctl.u.viommu_op.u.destroy_viommu.viommu_id = viommu_id;
> +
> +    return do_domctl(xch, &domctl);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> -- 
> 1.8.3.1
> 
> 

* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-09 20:34 ` [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
  2017-08-22 13:12   ` Wei Liu
@ 2017-08-22 16:41   ` Roger Pau Monné
  2017-08-23  7:52     ` Lan Tianyu
  1 sibling, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 16:41 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:08PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> The BIOS reports the remapping hardware units in a platform to system software
> through the DMA Remapping Reporting (DMAR) ACPI table.
> New fields are introduces for DMAR table. These new fields are set by
                          ^ s/s/d/ to libacpi
> toolstack through parsing guest's config file. construct_dmar() is added to
> build DMAR table according to the new fields.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  tools/libacpi/build.c   | 57 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/libacpi/libacpi.h |  9 ++++++++
>  2 files changed, 66 insertions(+)
> 
> diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
> index f9881c9..c7cc784 100644
> --- a/tools/libacpi/build.c
> +++ b/tools/libacpi/build.c
> @@ -28,6 +28,10 @@
>  
>  #define ACPI_MAX_SECONDARY_TABLES 16

A comment about the meaning of the defines below might be helpful.

> +#define VTD_HOST_ADDRESS_WIDTH 39
> +#define I440_PSEUDO_BUS_PLATFORM 0xff
> +#define I440_PSEUDO_DEVFN_IOAPIC 0x0
> +
>  #define align16(sz)        (((sz) + 15) & ~15)
>  #define fixed_strcpy(d, s) strncpy((d), (s), sizeof(d))
>  
> @@ -303,6 +307,59 @@ static struct acpi_20_slit *construct_slit(struct acpi_ctxt *ctxt,
>      return slit;
>  }
>  
> +/*
> + * Only one DMA remapping hardware unit is exposed and all devices
> + * are under the remapping hardware unit. I/O APIC should be explicitly
> + * enumerated.
> + */
> +struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
> +                                 const struct acpi_config *config)
> +{
> +    struct acpi_dmar *dmar;
> +    struct acpi_dmar_hardware_unit *drhd;
> +    struct dmar_device_scope *scope;
> +    unsigned int size;
> +    unsigned int ioapic_scope_size = sizeof(*scope) + sizeof(scope->path[0]);
> +
> +    size = sizeof(*dmar) + sizeof(*drhd) + ioapic_scope_size;
> +
> +    dmar = ctxt->mem_ops.alloc(ctxt, size, 16);
> +    if ( !dmar )
> +        return NULL;
> +
> +    memset(dmar, 0, size);
> +    dmar->header.signature = ACPI_2_0_DMAR_SIGNATURE;
> +    dmar->header.revision = ACPI_2_0_DMAR_REVISION;
> +    dmar->header.length = size;
> +    fixed_strcpy(dmar->header.oem_id, ACPI_OEM_ID);
> +    fixed_strcpy(dmar->header.oem_table_id, ACPI_OEM_TABLE_ID);
> +    dmar->header.oem_revision = ACPI_OEM_REVISION;
> +    dmar->header.creator_id   = ACPI_CREATOR_ID;
> +    dmar->header.creator_revision = ACPI_CREATOR_REVISION;
> +    dmar->host_address_width = VTD_HOST_ADDRESS_WIDTH - 1;
> +    if ( config->iommu_intremap_supported )
> +        dmar->flags = ACPI_DMAR_INTR_REMAP;

Since you initialize flags to 0 I would use |= here, in case this gets
moved later and this is not the first flag to be set.
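
A small sketch of the point, using made-up flag names and values (they
are assumptions for the example, not the libacpi definitions): when
every flag is OR-ed in, the result doesn't depend on the order of the
checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits only; names and values are assumptions. */
#define DEMO_DMAR_INTR_REMAP      (1u << 0)
#define DEMO_DMAR_X2APIC_OPT_OUT  (1u << 1)

/*
 * Build the flags byte. The checks are deliberately in the opposite
 * order to the patch: with |= on both, the order no longer matters.
 */
static uint8_t demo_dmar_flags(bool intremap, bool x2apic)
{
    uint8_t flags = 0;

    if ( !x2apic )
        flags |= DEMO_DMAR_X2APIC_OPT_OUT;
    if ( intremap )
        flags |= DEMO_DMAR_INTR_REMAP;

    return flags;
}
```

With a plain assignment in the second check, the opt-out bit set by the
first one would be silently lost.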

> +    if ( !config->iommu_x2apic_supported )
> +        dmar->flags |= ACPI_DMAR_X2APIC_OPT_OUT;

I'm not sure of the reason behind not supporting x2APIC mode (I've
already commented in another patch).

> +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
> +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
> +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
> +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
> +    drhd->pci_segment = 0;
> +    drhd->base_address = config->iommu_base_addr;
> +
> +    scope = &drhd->scope[0];
> +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
> +    scope->length = ioapic_scope_size;
> +    scope->enumeration_id = config->ioapic_id;
> +    scope->bus = I440_PSEUDO_BUS_PLATFORM;
> +    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;

I'm not sure whether these constants should instead be fields in the
acpi_config struct passed down from libxl. libxc shouldn't really need
to know anything about which chipset a VM is using.

> +    set_checksum(dmar, offsetof(struct acpi_header, checksum), size);
> +    return dmar;
> +}
> +
>  static int construct_passthrough_tables(struct acpi_ctxt *ctxt,
>                                          unsigned long *table_ptrs,
>                                          int nr_tables,
> diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
> index 2ed1ecf..74778a5 100644
> --- a/tools/libacpi/libacpi.h
> +++ b/tools/libacpi/libacpi.h
> @@ -20,6 +20,8 @@
>  #ifndef __LIBACPI_H__
>  #define __LIBACPI_H__
>  
> +#include <stdbool.h>
> +
>  #define ACPI_HAS_COM1              (1<<0)
>  #define ACPI_HAS_COM2              (1<<1)
>  #define ACPI_HAS_LPT1              (1<<2)
> @@ -96,8 +98,15 @@ struct acpi_config {
>      uint32_t ioapic_base_address;
>      uint16_t pci_isa_irq_mask;
>      uint8_t ioapic_id;
> +
> +    /* Emulated IOMMU features and location */
> +    bool iommu_intremap_supported;
> +    bool iommu_x2apic_supported;
> +    uint64_t iommu_base_addr;
>  };
>  
> +struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
> +                                 const struct acpi_config *config);
>  int acpi_build_tables(struct acpi_ctxt *ctxt, struct acpi_config *config);
>  
>  #endif /* __LIBACPI_H__ */
> -- 
> 1.8.3.1
> 
> 

* Re: [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-08-09 20:34 ` [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
  2017-08-22 13:19   ` Wei Liu
@ 2017-08-22 16:48   ` Roger Pau Monné
  1 sibling, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-22 16:48 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:09PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> A field, viommu_info, is added to struct libxl_domain_build_info. Several
> attributes can be specified by guest config file for virtual IOMMU. These
> attributes are used for DMAR construction and vIOMMU creation.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  docs/man/xl.cfg.pod.5.in    | 34 ++++++++++++++++++++++-
>  tools/libxl/libxl_types.idl | 16 +++++++++++
>  tools/xl/xl_parse.c         | 66 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 115 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
> index 79cb2ea..f259e22 100644
> --- a/docs/man/xl.cfg.pod.5.in
> +++ b/docs/man/xl.cfg.pod.5.in
> @@ -1545,7 +1545,39 @@ Do not provide a VM generation ID.
>  See also "Virtual Machine Generation ID" by Microsoft:
>  L<http://www.microsoft.com/en-us/download/details.aspx?id=30707>
>  
> -=back 
> +=back

No spurious changes. Leave the extra " " as is.

> +
> +=item B<viommu="VIOMMU_STRING">
> +
> +Specifies the vIOMMU which are to be provided to the guest.
> +
> +B<VIOMMU_STRING> has the form C<KEY=VALUE,KEY=VALUE,...> where:

Should be an array of VIOMMU_STRINGs instead.

> +=over 4
> +
> +=item B<KEY=VALUE>
> +
> +Possible B<KEY>s are:
> +
> +=over 4
> +
> +=item B<type="STRING">
> +
> +Currently there is only one valid type:
> +
> +(x86 only) "intel_vtd" means providing a emulated Intel VT-d to the guest.
                                          ^ an
> +
> +=item B<intremap=BOOLEAN>
> +
> +Specifies whether the vIOMMU should support interrupt remapping
> +and default 'true'.
> +
> +=item B<x2apic=BOOLEAN>
> +
> +Specifies whether the vIOMMU should support x2apic mode and default 'true'.
> +Only valid for "intel_vtd".
> +
> +=back
>  
>  =head3 Guest Virtual Time Controls
>  
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 8a9849c..7abd70c 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -450,6 +450,21 @@ libxl_altp2m_mode = Enumeration("altp2m_mode", [
>      (3, "limited"),
>      ], init_val = "LIBXL_ALTP2M_MODE_DISABLED")
>  
> +libxl_viommu_type = Enumeration("viommu_type", [
> +    (1, "intel_vtd"),
> +    ])
> +
> +libxl_viommu_info = Struct("viommu_info", [
> +    ("u", KeyedUnion(None, libxl_viommu_type, "type",
> +           [("intel_vtd", Struct(None, [
> +                 ("x2apic",     libxl_defbool)]))
> +           ])),
> +    ("intremap",        libxl_defbool),
> +    ("cap",             uint64),
> +    ("base_addr",       uint64),
> +    ("len",             uint64),
> +    ])
> +
>  libxl_domain_build_info = Struct("domain_build_info",[
>      ("max_vcpus",       integer),
>      ("avail_vcpus",     libxl_bitmap),
> @@ -506,6 +521,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>      # 65000 which is reserved by the toolstack.
>      ("device_tree",      string),
>      ("acpi",             libxl_defbool),
> +    ("viommu",           libxl_viommu_info),
>      ("u", KeyedUnion(None, libxl_domain_type, "type",
>                  [("hvm", Struct(None, [("firmware",         string),
>                                         ("bios",             libxl_bios_type),
> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 5c2bf17..11c4eb2 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -17,6 +17,7 @@
>  #include <limits.h>
>  #include <stdio.h>
>  #include <stdlib.h>
> +#include <xen/domctl.h>
>  #include <xen/hvm/e820.h>
>  #include <xen/hvm/params.h>
>  
> @@ -30,6 +31,9 @@
>  
>  extern void set_default_nic_values(libxl_device_nic *nic);
>  
> +#define VIOMMU_VTD_BASE_ADDR        0xfed90000ULL
> +#define VIOMMU_VTD_REGISTER_LEN     0x1000ULL

I don't think those defines should be here at all.

>  #define ARRAY_EXTEND_INIT__CORE(array,count,initfn,more)                \
>      ({                                                                  \
>          typeof((count)) array_extend_old_count = (count);               \
> @@ -804,6 +808,61 @@ int parse_usbdev_config(libxl_device_usbdev *usbdev, char *token)
>      return 0;
>  }
>  
> +/* Parses viommu data and adds info into viommu
> + * Returns 1 if the input doesn't form a valid viommu
> + * or parsed values are not correct. Successful parse returns 0 */
> +static int parse_viommu_config(libxl_viommu_info *viommu, const char *info)
> +{
> +    char *ptr, *oparg, *saveptr = NULL, *buf = xstrdup(info);
> +
> +    ptr = strtok_r(buf, ",", &saveptr);
> +    if (MATCH_OPTION("type", ptr, oparg)) {
> +        if (!strcmp(oparg, "intel_vtd")) {
> +            viommu->type = LIBXL_VIOMMU_TYPE_INTEL_VTD;
> +        } else {
> +            fprintf(stderr, "Invalid viommu type: %s\n", oparg);
> +            return 1;
> +        }
> +    } else {
> +        fprintf(stderr, "viommu type should be set first: %s\n", oparg);
> +        return 1;
> +    }
> +
> +    ptr = strtok_r(NULL, ",", &saveptr);
> +    if (MATCH_OPTION("intremap", ptr, oparg)) {
> +        libxl_defbool_set(&viommu->intremap, !!strtoul(oparg, NULL, 0));
> +    }
> +
> +    if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> +        for (ptr = strtok_r(NULL, ",", &saveptr); ptr;
> +             ptr = strtok_r(NULL, ",", &saveptr)) {
> +            if (MATCH_OPTION("x2apic", ptr, oparg)) {
> +                libxl_defbool_set(&viommu->u.intel_vtd.x2apic,
> +                                  !!strtoul(oparg, NULL, 0));
> +            } else {
> +                fprintf(stderr, "Unknown string `%s' in viommu spec\n", ptr);
                                                   ^ '
> +                return 1;
> +            }
> +        }
> +
> +        if (libxl_defbool_is_default(viommu->intremap))
> +            libxl_defbool_set(&viommu->intremap, true);
> +
> +        if (!libxl_defbool_val(viommu->intremap)) {
> +            fprintf(stderr, "Cannot create one virtual VTD without intremap\n");
> +            return 1;
> +        }

Why is that an option at all if it's not possible to create an IOMMU
without intremap?

> +
> +        /* Set default values to unexposed fields */
> +        viommu->base_addr = VIOMMU_VTD_BASE_ADDR;
> +        viommu->len = VIOMMU_VTD_REGISTER_LEN;
> +
> +        /* Set desired capbilities */
> +        viommu->cap = VIOMMU_CAP_IRQ_REMAPPING;

This should be set in libxl__domain_build_info_setdefault IMHO.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-22 11:03           ` Wei Liu
@ 2017-08-23  2:06             ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  2:06 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, chao.gao

On 2017-08-22 19:03, Wei Liu wrote:
> On Tue, Aug 22, 2017 at 04:07:32PM +0800, Lan Tianyu wrote:
>> On 2017-08-18 18:15, Wei Liu wrote:
>>> On Fri, Aug 18, 2017 at 03:17:37PM +0800, Lan Tianyu wrote:
>>>> On 2017-08-17 19:19, Wei Liu wrote:
>>>>> On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
>>>>>> +Now just suppport single vIOMMU for one VM and introduced domctls are compatible
>>>>>> +with multi-vIOMMU support.
>>>>>
>>>>> Is this still true? 
>>>>
>>>> Yes, the patchset just supports single vIOMMU for one VM.
>>>>
>>>
>>> The first part of the sentence is true, but the latter is probably not.
>>> It seems to me domctl is able to cope with multiple viommu. Please
>>> correct me if I'm wrong.
>>
>> These domctls are able to support multiple vIOMMUs, but the vIOMMU device
>> model in the Xen hypervisor only supports a single vIOMMU per VM.
>>
> 
> In that case please update the document.
> 

OK. Will update.

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-22 13:12   ` Wei Liu
@ 2017-08-23  2:36     ` Lan Tianyu
  2017-08-23  8:07       ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  2:36 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, Chao Gao

On 2017-08-22 21:12, Wei Liu wrote:
> On Wed, Aug 09, 2017 at 04:34:08PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> The BIOS reports the remapping hardware units in a platform to system software
>> through the DMA Remapping Reporting (DMAR) ACPI table.
>> New fields are introduces for DMAR table. These new fields are set by
>> toolstack through parsing guest's config file. construct_dmar() is added to
>> build DMAR table according to the new fields.
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  tools/libacpi/build.c   | 57 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  tools/libacpi/libacpi.h |  9 ++++++++
>>  2 files changed, 66 insertions(+)
>>
>> diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
>> index f9881c9..c7cc784 100644
>> --- a/tools/libacpi/build.c
>> +++ b/tools/libacpi/build.c
>> @@ -28,6 +28,10 @@
>>  
>>  #define ACPI_MAX_SECONDARY_TABLES 16
>>  
>> +#define VTD_HOST_ADDRESS_WIDTH 39
>> +#define I440_PSEUDO_BUS_PLATFORM 0xff
>> +#define I440_PSEUDO_DEVFN_IOAPIC 0x0
> 
> I have some stupid questions. What are these I440 values? Where do they
> come from?
> 

Each IOAPIC device in the system reported via the ACPI MADT must be
explicitly enumerated under one specific remapping hardware unit. We
assigned the IOAPIC unit to BDF ff:00, following the Qemu vIOMMU
implementation.


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-08-22 13:19   ` Wei Liu
@ 2017-08-23  2:46     ` Lan Tianyu
  2017-08-23  8:09       ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  2:46 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, Chao Gao

On 2017-08-22 21:19, Wei Liu wrote:
>> +=over 4
>> > +
>> > +=item B<KEY=VALUE>
>> > +
>> > +Possible B<KEY>s are:
>> > +
>> > +=over 4
>> > +
>> > +=item B<type="STRING">
>> > +
>> > +Currently there is only one valid type:
>> > +
>> > +(x86 only) "intel_vtd" means providing a emulated Intel VT-d to the guest.
>> > +
>> > +=item B<intremap=BOOLEAN>
>> > +
>> > +Specifies whether the vIOMMU should support interrupt remapping
>> > +and default 'true'.
>> > +
>> > +=item B<x2apic=BOOLEAN>
>> > +
>> > +Specifies whether the vIOMMU should support x2apic mode and default 'true'.
>> > +Only valid for "intel_vtd".
> Why not expose base address and length as well?

The "base address" and "length" of the vIOMMU register region are low-level
vIOMMU configuration. I am afraid it is very hard for users to determine which
region is suitable for the vIOMMU and doesn't conflict with other device models.

> 
>> +
>> > +libxl_viommu_info = Struct("viommu_info", [
>> > +    ("u", KeyedUnion(None, libxl_viommu_type, "type",
>> > +           [("intel_vtd", Struct(None, [
>> > +                 ("x2apic",     libxl_defbool)]))
>> > +           ])),
>> > +    ("intremap",        libxl_defbool),
>> > +    ("cap",             uint64),
>> > +    ("base_addr",       uint64),
>> > +    ("len",             uint64),
>> > +    ])
>> > +
>> >  libxl_domain_build_info = Struct("domain_build_info",[
>> >      ("max_vcpus",       integer),
>> >      ("avail_vcpus",     libxl_bitmap),
>> > @@ -506,6 +521,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>> >      # 65000 which is reserved by the toolstack.
>> >      ("device_tree",      string),
>> >      ("acpi",             libxl_defbool),
>> > +    ("viommu",           libxl_viommu_info),
> An array please, we shouldn't limit the number of viommus.
> 
>> >      ("u", KeyedUnion(None, libxl_domain_type, "type",
>> >                  [("hvm", Struct(None, [("firmware",         string),
>> >                                         ("bios",             libxl_bios_type),
>> > diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
>> > index 5c2bf17..11c4eb2 100644
>> > --- a/tools/xl/xl_parse.c
>> > +++ b/tools/xl/xl_parse.c
>> > @@ -17,6 +17,7 @@
>> >  #include <limits.h>
>> >  #include <stdio.h>
>> >  #include <stdlib.h>
>> > +#include <xen/domctl.h>
> Why is this needed?
> 
>> > +        if (libxl_defbool_is_default(viommu->intremap))
>> > +            libxl_defbool_set(&viommu->intremap, true);
>> > +
>> > +        if (!libxl_defbool_val(viommu->intremap)) {
>> > +            fprintf(stderr, "Cannot create one virtual VTD without intremap\n");
>> > +            return 1;
>> > +        }
>> > +
>> > +        /* Set default values to unexposed fields */
>> > +        viommu->base_addr = VIOMMU_VTD_BASE_ADDR;
>> > +        viommu->len = VIOMMU_VTD_REGISTER_LEN;
>> > +
> You're doing this in xl. This is not right. The default value should be
> set from within libxl.
> 
> You should have a libxl_XXX_setdefault function for this type.

OK. Will update.
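
For illustration only, a minimal sketch of what such a setdefault helper
could look like once the defaults move out of xl. The types here are
stand-ins rather than the real libxl ones: "defbool" mimics libxl_defbool,
and viommu_info_setdefault() is a hypothetical name. The base address,
length and capability values are the ones from the patch above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for libxl_defbool: 0 means "not set by the user". */
typedef struct { int val; } defbool;

static bool defbool_is_default(defbool db) { return db.val == 0; }
static bool defbool_val(defbool db) { return db.val > 0; }
static void defbool_setdefault(defbool *db, bool b)
{
    if (defbool_is_default(*db))
        db->val = b ? 1 : -1;
}

enum viommu_type { VIOMMU_TYPE_NONE = 0, VIOMMU_TYPE_INTEL_VTD = 1 };

/* Values taken from the patch under review. */
#define VIOMMU_VTD_BASE_ADDR     0xfed90000ULL
#define VIOMMU_VTD_REGISTER_LEN  0x1000ULL
#define VIOMMU_CAP_IRQ_REMAPPING (1u << 0)

struct viommu_info {
    enum viommu_type type;
    defbool intremap;
    defbool x2apic;
    uint64_t cap, base_addr, len;
};

/* Hypothetical helper: returns 0 on success, -1 on an invalid config. */
static int viommu_info_setdefault(struct viommu_info *v)
{
    if (v->type != VIOMMU_TYPE_INTEL_VTD)
        return 0;                /* nothing to default for other types */

    defbool_setdefault(&v->intremap, true);
    defbool_setdefault(&v->x2apic, true);

    if (!defbool_val(v->intremap))
        return -1;               /* a virtual VTD needs intremap */

    /* Fields that are not exposed to the user. */
    v->base_addr = VIOMMU_VTD_BASE_ADDR;
    v->len = VIOMMU_VTD_REGISTER_LEN;
    v->cap = VIOMMU_CAP_IRQ_REMAPPING;
    return 0;
}
```

With this shape xl only parses the user's keys, and the library fills in
whatever was left unset.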

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 6/25] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures
  2017-08-22 12:56   ` Wei Liu
@ 2017-08-23  2:47     ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  2:47 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, Chao Gao

On 2017-08-22 20:56, Wei Liu wrote:
> On Wed, Aug 09, 2017 at 04:34:07PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> Add dmar table structure according Chapter 8 "BIOS Considerations" of
>> VTd spec Rev. 2.4.
>>
>> VTd spec:http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> I checked the spec against the content; they match.
> 

Thanks.

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-22 13:48         ` Wei Liu
@ 2017-08-23  5:35           ` Lan Tianyu
  2017-08-23  8:34             ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  5:35 UTC (permalink / raw)
  To: Wei Liu, xen-devel, ian.jackson, jbeulich, andrew.cooper3,
	kevin.tian, julien.grall, Gao, Chao

On 2017-08-22 21:48, Wei Liu wrote:
>> > Hi, Wei
>> > Thanks for your comments.
>> > 
>> > iirc, HVM only supports one module; DMAR cannot be a new module. Joining to
>> > the existing one is the approach we are taking. 
>> > 
>> > Which kind of conflicts you think should be resolved? If you mean I
>> > forget to free the old buf, I will fix this. If you mean the potential
>> > overlap between the binary passed by admin and DMAR table built here, I
>> > don't have much idea on this. Even without the DMAR table, the binary
>> > may contain MADT or other tables, and the tool stack doesn't interpret
>> > the binary and check whether there are conflicts, right?
>> > 
> Thinking a bit more about this, when I first said "conflicts" I didn't
> mean to parse the content. I was referring to the code in
> libxl_x86_acpi.c which also seems to manipulate acpi_modules.

The code in libxl_x86_acpi.c is for HVMlite/PVHv2. The code we added is
for HVM guests.

> 
> I would like the code to generate dmar take into consideration
> libxl__dom_load_acpi.
> 

If we add a DMAR table for HVMlite, we should combine the DMAR table with
the other ACPI tables and populate it into acpi_modules[2]. This is how
HVMlite adds its other ACPI tables in libxl__dom_load_acpi().


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-22 14:32   ` Roger Pau Monné
@ 2017-08-23  6:06     ` Lan Tianyu
  2017-08-23  7:22       ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  6:06 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

Hi Roger:
	Thanks for your review.

On 2017-08-22 22:32, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
>> This patch is to introduce create, destroy and query capabilities
>> command for vIOMMU. vIOMMU layer will deal with requests and call
>> arch vIOMMU ops.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/domctl.c         |  3 +++
>>  xen/common/viommu.c         | 43 +++++++++++++++++++++++++++++++++++++
> 
> I'm confused, I don't see this file in the repo, and the cover letter
> doesn't mention this being based on top of any other series, where
> does this viommu.c file come from?
> 
>>  xen/include/public/domctl.h | 52 +++++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/xen/viommu.h    |  6 ++++++
>>  4 files changed, 104 insertions(+)
>>
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index d80488b..01c3024 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -1144,6 +1144,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>          if ( !ret )
>>              copyback = 1;
>>          break;
>> +    case XEN_DOMCTL_viommu_op:
>> +        ret = viommu_domctl(d, &op->u.viommu_op, &copyback);
>> +        break;
> 
> Hm, shouldn't this be protected with #ifdef CONFIG_VIOMMU?
> 

I have added a viommu_domctl() stub that always returns -ENODEV when
CONFIG_VIOMMU is unset.
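
As a rough sketch of that pattern (not the exact header from the series):
the real prototype is declared when CONFIG_VIOMMU is set, and an inline
fallback is used otherwise, so the call site in do_domctl() needs no #ifdef.
The opaque struct declarations are only there to keep the fragment
self-contained.

```c
#include <errno.h>
#include <stdbool.h>

struct domain;                 /* opaque in this sketch */
struct xen_domctl_viommu_op;   /* opaque in this sketch */

#ifdef CONFIG_VIOMMU
/* Real implementation lives in common/viommu.c. */
int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
                  bool *need_copy);
#else
/* Fallback when vIOMMU support is compiled out. */
static inline int viommu_domctl(struct domain *d,
                                struct xen_domctl_viommu_op *op,
                                bool *need_copy)
{
    (void)d;
    (void)op;
    (void)need_copy;           /* left untouched: nothing to copy back */
    return -ENODEV;
}
#endif
```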

>>      default:
>>          ret = arch_do_domctl(op, d, u_domctl);
>> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
>> index 6874d9f..a4d004d 100644
>> --- a/xen/common/viommu.c
>> +++ b/xen/common/viommu.c
>> @@ -148,6 +148,49 @@ static u64 viommu_query_caps(struct domain *d, u64 type)
>>      return viommu_type->ops->query_caps(d);
>>  }
>>  
>> +int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
>> +                  bool *need_copy)
>> +{
>> +    int rc = -EINVAL, ret;
> 
> Do you really need both ret and rc?
> 
>> +    if ( !viommu_enabled() )
>> +        return rc;
> 
> EINVAL? Maybe ENODEV?

OK.

> 
>> +
>> +    switch ( op->cmd )
>> +    {
>> +    case XEN_DOMCTL_create_viommu:
>> +        ret = viommu_create(d, op->u.create_viommu.viommu_type,
>> +            op->u.create_viommu.base_address,
>> +            op->u.create_viommu.length,
>> +            op->u.create_viommu.capabilities);
> 
> I would rather prefer for viommu_create to simply return an error or
> 0, and store the viommu_id by passing a pointer parameter to viommu_create, ie:
> 
> rc = viommu_create(d, op->u.create_viommu.viommu_type,
>                    op->u.create_viommu.base_address,
>                    op->u.create_viommu.length,
>                    op->u.create_viommu.capabilities,
>                    &op->u.create_viommu.viommu_id);
> 

Got it. Will update in the next version.
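
A simplified sketch of the calling convention Roger describes, assuming
stand-in structures and calloc() in place of xzalloc(); the type lookup and
the ops->create() callback are left out. The return value carries only 0 or
a negative errno, and the id is written through the pointer only on success.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_VIOMMU_PER_DOMAIN 1

/* Reduced stand-ins for the structures in the patch. */
struct viommu {
    uint64_t base_address, length, caps;
    uint32_t viommu_id;
};
struct domain {
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

static int viommu_create(struct domain *d, uint64_t base_address,
                         uint64_t length, uint64_t caps,
                         uint32_t *viommu_id)
{
    struct viommu *viommu;
    uint32_t i;

    /* A NULL slot marks an unused entry. */
    for (i = 0; i < NR_VIOMMU_PER_DOMAIN; i++)
        if (!d->viommu[i])
            break;
    if (i == NR_VIOMMU_PER_DOMAIN)
        return -EINVAL;

    viommu = calloc(1, sizeof(*viommu));
    if (!viommu)
        return -ENOMEM;

    viommu->base_address = base_address;
    viommu->length = length;
    viommu->caps = caps;
    viommu->viommu_id = i;

    d->viommu[i] = viommu;
    *viommu_id = i;            /* only written on success */
    return 0;
}
```

The domctl handler can then assign the result to rc directly and set
*need_copy only when rc is 0.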

>> +        if ( ret >= 0 ) {
>                            ^ coding style
>> +            op->u.create_viommu.viommu_id = ret;
>> +            *need_copy = true;
>> +            rc = 0; /* return 0 if success */
>> +        }
>> +        break;
>> +
>> +    case XEN_DOMCTL_destroy_viommu:
>> +        rc = viommu_destroy(d, op->u.destroy_viommu.viommu_id);
>> +        break;
>> +
>> +    case XEN_DOMCTL_query_viommu_caps:
>> +        ret = viommu_query_caps(d, op->u.query_caps.viommu_type);
> 
> Same here, I would rather pass another parameter and use the return
> for error only.
> 
>> +        if ( ret >= 0 )
>> +        {
>> +            op->u.query_caps.capabilities = ret;
>> +            rc = 0;
>> +        }
>> +        *need_copy = true;
>> +        break;
>> +
>> +    default:
>> +        break;
> 
> Here you should return ENOSYS.


OK.

> 
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>>  int __init viommu_setup(void)
>>  {
>>      INIT_LIST_HEAD(&type_list);
>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>> index ff39762..4b10f26 100644
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -1149,6 +1149,56 @@ struct xen_domctl_psr_cat_op {
>>  typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
>>  
>> +/*  vIOMMU helper
>> + *
>> + *  vIOMMU interface can be used to create/destroy vIOMMU and
>> + *  query vIOMMU capabilities.
>> + */
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
> 
> If this is going to be used to specify the vendor only, it doesn't need
> to be a bitfield, because it doesn't make sense to specify for
> example VIOMMU_TYPE_INTEL_VTD | VIOMMU_TYPE_AMD, it's either Intel or
> AMD. Do you have plans to expand this with other uses? In which case
> the comment should be fixed.

Wei suggested replacing this bitfield with a plain number. Will update.

> 
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> +    uint32_t cmd;
>> +#define XEN_DOMCTL_create_viommu          0
>> +#define XEN_DOMCTL_destroy_viommu         1
>> +#define XEN_DOMCTL_query_viommu_caps      2
>> +    union {
>> +        struct {
>> +            /* IN - vIOMMU type */
>> +            uint64_t viommu_type;
>> +            /* 
>> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
>> +             * are in charge of to check base_address and length.
>> +             */
>> +            uint64_t base_address;
>> +            /* IN - Length of MMIO region */
>> +            uint64_t length;
> 
> It seems weird that you can specify the length, is that something
> that a user would like to set? Isn't the length of the IOMMU MMIO
> region fixed by the hardware spec?

Different vendors may have different IOMMU register region sizes (e.g.
VTD uses one page for its register region). The length field is there to
keep the vIOMMU device model from abusing the address space. Some
registers' offsets are reported by other registers, and these offsets are
emulated by the vIOMMU device model. If it's not necessary, we can remove
the field and add it back when there is a real requirement for it.

> 
>> +            /* IN - Capabilities with which we want to create */
>> +            uint64_t capabilities;
>> +            /* OUT - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } create_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } destroy_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU type */
>> +            uint64_t viommu_type;
>> +            /* OUT - vIOMMU Capabilities */
>> +            uint64_t capabilities;
>> +        } query_caps;
> 
> This also seems weird, shouldn't you query the capabilities of an
> already created vIOMMU, rather than a vIOMMU type? Shouldn't the first
> field be viommu_id?
> 

The query interface here is to check which capabilities the vIOMMU device
model specified by viommu_type can support before creating a vIOMMU (the
user may select different capabilities). If the capabilities returned by
the query interface don't meet the user's configuration, the tool stack
should return an error. That is why it only accepts viommu_type.
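
A minimal sketch of that tool stack check, assuming the capabilities are a
bitmask as in the patch. viommu_check_caps() is a hypothetical helper name,
and the choice of -EINVAL is an assumption.

```c
#include <errno.h>
#include <stdint.h>

#define VIOMMU_CAP_IRQ_REMAPPING (1u << 0)

/*
 * "supported" is what the query call returned for the chosen type,
 * "requested" is what the guest config asks for. Every requested bit
 * must be present in the supported set.
 */
static int viommu_check_caps(uint64_t supported, uint64_t requested)
{
    return (requested & ~supported) ? -EINVAL : 0;
}
```

The tool stack would run this between the query call and the create call,
and fail domain creation on a non-zero result.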

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-22 15:27       ` Roger Pau Monné
@ 2017-08-23  7:10         ` Lan Tianyu
  2017-08-23  7:38           ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  7:10 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On 2017-08-22 23:27, Roger Pau Monné wrote:
> On Thu, Aug 17, 2017 at 08:22:16PM -0400, Lan Tianyu wrote:
>> This patch is to introduct an abstract layer for arch vIOMMU implementation
>> to deal with requests from dom0. Arch vIOMMU code needs to provide callback
>> to perform create, destroy and query capabilities operation.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/arch/x86/Kconfig     |   1 +
>>  xen/arch/x86/setup.c     |   1 +
>>  xen/common/Kconfig       |   3 +
>>  xen/common/Makefile      |   1 +
>>  xen/common/domain.c      |   3 +
>>  xen/common/viommu.c      | 165 +++++++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/xen/sched.h  |   2 +
>>  xen/include/xen/viommu.h |  71 ++++++++++++++++++++
>>  8 files changed, 247 insertions(+)
>>  create mode 100644 xen/common/viommu.c
>>  create mode 100644 xen/include/xen/viommu.h
>>
>> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
>> index 30c2769..1f1de96 100644
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -23,6 +23,7 @@ config X86
>>  	select HAS_PDX
>>  	select NUMA
>>  	select VGA
>> +	select VIOMMU
>>  
>>  config ARCH_DEFCONFIG
>>  	string
>> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
>> index db5df69..68f1631 100644
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -1513,6 +1513,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>      early_msi_init();
>>  
>>      iommu_setup();    /* setup iommu if available */
>> +    viommu_setup();
>>  
>>      smp_prepare_cpus(max_cpus);
>>  
>> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
>> index dc8e876..2ad2c8d 100644
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -49,6 +49,9 @@ config HAS_CHECKPOLICY
>>  	string
>>  	option env="XEN_HAS_CHECKPOLICY"
>>  
>> +config VIOMMU
>> +	bool
>> +
>>  config KEXEC
>>  	bool "kexec support"
>>  	default y
>> diff --git a/xen/common/Makefile b/xen/common/Makefile
>> index 26c5a64..852553d 100644
>> --- a/xen/common/Makefile
>> +++ b/xen/common/Makefile
>> @@ -56,6 +56,7 @@ obj-y += time.o
>>  obj-y += timer.o
>>  obj-y += trace.o
>>  obj-y += version.o
>> +obj-$(CONFIG_VIOMMU) += viommu.o
>>  obj-y += virtual_region.o
>>  obj-y += vm_event.o
>>  obj-y += vmap.o
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index b22aacc..d1f9b10 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -396,6 +396,9 @@ struct domain *domain_create(domid_t domid, unsigned int domcr_flags,
>>          spin_unlock(&domlist_update_lock);
>>      }
>>  
>> +    if ( (err = viommu_init_domain(d)) != 0 )
>> +        goto fail;
>> +
>>      return d;
>>  
>>   fail:
>> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
>> new file mode 100644
>> index 0000000..6874d9f
>> --- /dev/null
>> +++ b/xen/common/viommu.c
>> @@ -0,0 +1,165 @@
>> +/*
>> + * common/viommu.c
>> + * 
>> + * Copyright (c) 2017 Intel Corporation
>> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/sched.h>
>> +#include <xen/spinlock.h>
>> +#include <xen/types.h>
>> +#include <xen/viommu.h>
>> +
>> +bool __read_mostly opt_viommu;
>> +boolean_param("viommu", opt_viommu);
>> +
>> +static spinlock_t type_list_lock;
> 
> static DEFINE_SPINLOCK(type_list_lock);
> 
>> +static struct list_head type_list;
> 
> static LIST_HEAD(type_list);
> 
>> +
>> +struct viommu_type {
>> +    u64 type;
>> +    struct viommu_ops *ops;
>> +    struct list_head node;
>> +};
>> +
>> +int viommu_init_domain(struct domain *d)
>> +{
>> +    d->viommu.nr_viommu = 0;
>> +    return 0;
>> +}
> 
> If you don't use the viommu_info struct you can also get rid of this.
> The array entries will point to NULL which can be used to signal not
> initialized.

Yes, I just checked that the memory for struct domain is set to all zeros
after allocation.

> 
>> +
>> +static struct viommu_type *viommu_get_type(u64 type)
>> +{
>> +    struct viommu_type *viommu_type = NULL;
>> +
>> +    spin_lock(&type_list_lock);
>> +    list_for_each_entry( viommu_type, &type_list, node )
>> +    {
>> +        if ( viommu_type->type == type )
>> +        {
>> +            spin_unlock(&type_list_lock);
>> +            return viommu_type;
>> +        }
>> +    }
>> +    spin_unlock(&type_list_lock);
>> +
>> +    return NULL;
>> +}
>> +
>> +int viommu_register_type(u64 type, struct viommu_ops * ops)
>> +{
>> +    struct viommu_type *viommu_type = NULL;
>> +
>> +    if ( !viommu_enabled() )
>> +        return -EINVAL;
> 
> ENODEV is maybe better here.

Will update.

> 
>> +
>> +    if ( viommu_get_type(type) )
>> +        return -EEXIST;
>> +
>> +    viommu_type = xzalloc(struct viommu_type);
>> +    if ( !viommu_type )
>> +        return -ENOMEM;
>> +
>> +    viommu_type->type = type;
>> +    viommu_type->ops = ops;
>> +
>> +    spin_lock(&type_list_lock);
>> +    list_add_tail(&viommu_type->node, &type_list);
>> +    spin_unlock(&type_list_lock);
>> +
>> +    return 0;
>> +}
> 
> Hm, I haven't seen the usage of this function, but from the looks of
> it it seems like you want to use something similar to what's used by
> the scheduler in order to register vIOMMU types.
> 
> See the logic around REGISTER_SCHEDULER in xen/sched-if.h.

Each vIOMMU device model needs to call viommu_register_type() to register
its vIOMMU type. The tool stack passes viommu_type, and the vIOMMU abstract
layer calls the vendor's vIOMMU callbacks to query vIOMMU capabilities and
to create and destroy a vIOMMU. We just need to maintain a type list.
REGISTER_SCHEDULER seems too heavy, since it reserves a scheduler array in
the Xen data section.
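
To make the "type list" idea concrete, here is a simplified sketch using a
plain singly linked list instead of Xen's list_head, calloc() instead of
xzalloc(), and no locking. It only illustrates the lookup and registration
logic, not the code in the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct viommu_ops;   /* vendor callbacks; opaque in this sketch */

struct viommu_type {
    uint64_t type;
    const struct viommu_ops *ops;
    struct viommu_type *next;
};

static struct viommu_type *type_list;   /* locking omitted for brevity */

/* Return the registered entry for "type", or NULL if there is none. */
static struct viommu_type *viommu_get_type(uint64_t type)
{
    struct viommu_type *t;

    for (t = type_list; t; t = t->next)
        if (t->type == type)
            return t;
    return NULL;
}

/* Register a vendor device model; each type may be registered once. */
static int viommu_register_type(uint64_t type, const struct viommu_ops *ops)
{
    struct viommu_type *t;

    if (viommu_get_type(type))
        return -EEXIST;

    t = calloc(1, sizeof(*t));
    if (!t)
        return -ENOMEM;

    t->type = type;
    t->ops = ops;
    t->next = type_list;     /* push onto the head of the list */
    type_list = t;
    return 0;
}
```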

> 
>> +
>> +static int viommu_create(struct domain *d, u64 type, u64 base_address,
>> +    u64 length, u64 caps)
>> +{
>> +    struct viommu_info *info = &d->viommu;
>> +    struct viommu *viommu;
>> +    struct viommu_type *viommu_type = NULL;
>> +    int rc;
>> +
>> +    viommu_type = viommu_get_type(type);
>> +    if ( !viommu_type )
>> +        return -EINVAL;
>> +
>> +    if ( info->nr_viommu >= NR_VIOMMU_PER_DOMAIN
>> +        || !viommu_type->ops || !viommu_type->ops->create )
>            ^ aligned with "info" above.
>> +        return -EINVAL;
>> +
>> +    viommu = xzalloc(struct viommu);
>> +    if ( !viommu )
>> +        return -ENOMEM;
>> +
>> +    viommu->base_address = base_address;
>> +    viommu->length = length;
>> +    viommu->caps = caps;
>> +    viommu->ops = viommu_type->ops;
>> +    viommu->viommu_id = info->nr_viommu;
>> +
>> +    info->viommu[info->nr_viommu] = viommu;
>> +    info->nr_viommu++;
>> +
>> +    rc = viommu->ops->create(d, viommu);
>> +    if ( rc < 0 )
>> +    {
>> +        xfree(viommu);
>> +        info->nr_viommu--;
>> +        info->viommu[info->nr_viommu] = NULL;
>> +        return rc;
>> +    }
>> +
>> +    return viommu->viommu_id;
>> +}
>> +
>> +static int viommu_destroy(struct domain *d, u32 viommu_id)
>> +{
>> +    struct viommu_info *info = &d->viommu;
>> +
>> +    if ( viommu_id >= info->nr_viommu || !info->viommu[viommu_id] )
>> +        return -EINVAL;
>> +
>> +    if ( info->viommu[viommu_id]->ops->destroy(info->viommu[viommu_id]) )
>> +        return -EFAULT;
> 
> You should return the original return value from the
> "destroy" function pointer, instead of hardcoding it to EFAULT.

OK. Will update.
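
A sketch of the updated error handling, assuming reduced stand-in
structures and free() in place of xfree(). The two callbacks destroy_ok()
and destroy_busy() are hypothetical and exist only to exercise both paths.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_VIOMMU_PER_DOMAIN 1

struct viommu;
struct viommu_ops {
    int (*destroy)(struct viommu *viommu);
};
struct viommu {
    const struct viommu_ops *ops;
};
struct domain {
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

/* Hypothetical vendor callbacks used for illustration. */
static int destroy_ok(struct viommu *viommu) { (void)viommu; return 0; }
static int destroy_busy(struct viommu *viommu) { (void)viommu; return -EBUSY; }

static int viommu_destroy(struct domain *d, uint32_t viommu_id)
{
    struct viommu *viommu;
    int rc;

    if (viommu_id >= NR_VIOMMU_PER_DOMAIN || !d->viommu[viommu_id])
        return -EINVAL;

    viommu = d->viommu[viommu_id];
    rc = viommu->ops->destroy(viommu);
    if (rc)
        return rc;             /* pass the callback's own error through */

    free(viommu);
    d->viommu[viommu_id] = NULL;
    return 0;
}
```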


> 
>> +
>> +    xfree(info->viommu[viommu_id]);
>> +    info->viommu[viommu_id] = NULL;
>> +    return 0;
>> +}
>> +
>> +static u64 viommu_query_caps(struct domain *d, u64 type)
>> +{
>> +    struct viommu_type *viommu_type = viommu_get_type(type);
>> +
>> +    if ( !viommu_type )
>> +        return -EINVAL;
>> +
>> +    return viommu_type->ops->query_caps(d);
>> +}
>> +
>> +int __init viommu_setup(void)
>> +{
>> +    INIT_LIST_HEAD(&type_list);
>> +    spin_lock_init(&type_list_lock);
>> +    return 0;
>> +}
> 
> With the suggested changes to init type_list and type_list_lock at
> definition time you can get rid of viommu_setup.

OK. Will update.

> 
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * End:
>> + */
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> index 6673b27..98a965a 100644
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -21,6 +21,7 @@
>>  #include <xen/perfc.h>
>>  #include <asm/atomic.h>
>>  #include <xen/wait.h>
>> +#include <xen/viommu.h>
>>  #include <public/xen.h>
>>  #include <public/domctl.h>
>>  #include <public/sysctl.h>
>> @@ -477,6 +478,7 @@ struct domain
>>      /* vNUMA topology accesses are protected by rwlock. */
>>      rwlock_t vnuma_rwlock;
>>      struct vnuma_info *vnuma;
>> +    struct viommu_info viommu;
>>  
>>      /* Common monitor options */
>>      struct {
>> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
>> new file mode 100644
>> index 0000000..506ea54
>> --- /dev/null
>> +++ b/xen/include/xen/viommu.h
>> @@ -0,0 +1,71 @@
>> +/*
>> + * include/xen/viommu.h
>> + *
>> + * Copyright (c) 2017, Intel Corporation
>> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + */
>> +#ifndef __XEN_VIOMMU_H__
>> +#define __XEN_VIOMMU_H__
>> +
>> +#define NR_VIOMMU_PER_DOMAIN 1
>> +
>> +struct viommu;
>> +
>> +struct viommu_ops {
>> +    u64 (*query_caps)(struct domain *d);
>> +    int (*create)(struct domain *d, struct viommu *viommu);
>> +    int (*destroy)(struct viommu *viommu);
>> +};
>> +
>> +struct viommu {
>> +    u64 base_address;
> 
> All those should use uint*_t instead of the u* types.
> 
>> +    u64 length;
>> +    u64 caps;
>> +    u32 viommu_id;
>> +    const struct viommu_ops *ops;
>> +    void *priv;
>> +};
>> +
>> +struct viommu_info {
>> +    u32 nr_viommu;
>> +    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN]; /* viommu array*/
> 
> Seems kind of pointless to have a nr_viommu field when the array is
> not dynamic, you could just use ARRAY_SIZE. And then in the domain
> struct you could directly add an array of viommu structs, getting rid
> of viommu_info altogether.

nr_viommu helps to allocate a new vIOMMU id and to check whether the vIOMMU
id passed by the tool stack is valid. Otherwise, we would have to walk the
vIOMMU array every time to get the number of vIOMMUs.

> 
>> +};
>> +
>> +#ifdef CONFIG_VIOMMU
>> +extern bool_t opt_viommu;
> 
> bool
> 
>> +static inline bool viommu_enabled(void) { return opt_viommu; }
> 
> I think those static inline functions should also follow the coding
> standard.

Yes, but I am not sure what is wrong here. Could you elaborate? Thanks.


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-23  6:06     ` Lan Tianyu
@ 2017-08-23  7:22       ` Roger Pau Monné
  2017-08-23  9:12         ` Lan Tianyu
  2017-08-23 10:19         ` Julien Grall
  0 siblings, 2 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  7:22 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 23, 2017 at 02:06:17PM +0800, Lan Tianyu wrote:
> Hi Roger:
> 	Thanks for your review.
> 
> On 2017-08-22 22:32, Roger Pau Monné wrote:
> > On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
> >> +
> >> +/* vIOMMU capabilities */
> >> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> >> +
> >> +struct xen_domctl_viommu_op {
> >> +    uint32_t cmd;
> >> +#define XEN_DOMCTL_create_viommu          0
> >> +#define XEN_DOMCTL_destroy_viommu         1
> >> +#define XEN_DOMCTL_query_viommu_caps      2
> >> +    union {
> >> +        struct {
> >> +            /* IN - vIOMMU type */
> >> +            uint64_t viommu_type;
> >> +            /* 
> >> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
> >> +             * are in charge of to check base_address and length.
> >> +             */
> >> +            uint64_t base_address;
> >> +            /* IN - Length of MMIO region */
> >> +            uint64_t length;
> > 
> > It seems weird that you can specify the length, is that something
> > that a user would like to set? Isn't the length of the IOMMU MMIO
> > region fixed by the hardware spec?
> 
> Different vendors may have different IOMMU register region sizes (e.g.
> VTD uses one page for its register region). The length field is there to
> keep the vIOMMU device model from abusing the address space. Some registers'
> offsets are reported by other registers, and these offsets are emulated by
> the vIOMMU device model. If it's not necessary, we can remove it and add it
> back when there is a real requirement for it.

So from my understanding the size of the IOMMU MMIO region is implicit
in the IOMMU type that the user chooses. I don't think this field is
needed.

> > 
> >> +            /* IN - Capabilities with which we want to create */
> >> +            uint64_t capabilities;
> >> +            /* OUT - vIOMMU identity */
> >> +            uint32_t viommu_id;
> >> +        } create_viommu;
> >> +
> >> +        struct {
> >> +            /* IN - vIOMMU identity */
> >> +            uint32_t viommu_id;
> >> +        } destroy_viommu;
> >> +
> >> +        struct {
> >> +            /* IN - vIOMMU type */
> >> +            uint64_t viommu_type;
> >> +            /* OUT - vIOMMU Capabilities */
> >> +            uint64_t capabilities;
> >> +        } query_caps;
> > 
> > This also seems weird, shouldn't you query the capabilities of an
> > already created vIOMMU, rather than a vIOMMU type? Shouldn't the first
> > field be viommu_id?
> > 
> 
> The query interface here is to check which capabilities the vIOMMU device
> model specified by viommu_type can support before creating the vIOMMU (the
> user may select different capabilities). If the capabilities returned by
> the query interface don't meet the user configuration, the tool stack should
> return an error. So it just accepts viommu_type.

I don't think that's needed: if the chosen capabilities are not
supported by the selected IOMMU type, simply return an error from
XEN_DOMCTL_create_viommu.

The capabilities of each IOMMU type should be listed in the man page,
and the user should select a supported set or else
XEN_DOMCTL_create_viommu will fail. Doing the checks both in the
toolstack and in XEN_DOMCTL_create_viommu seems pointless and prone to
errors.
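
A minimal sketch of a create-time check along those lines. The two
#defines are copied from the patch; viommu_supported_caps() and
viommu_create_check() are hypothetical names used only for illustration,
and the choice of error codes is an assumption:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)   /* from the patch */
#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)   /* from the patch */

/* Hypothetical: capabilities the hypervisor supports for a given type. */
static uint64_t viommu_supported_caps(uint64_t type)
{
    return type == VIOMMU_TYPE_INTEL_VTD ? VIOMMU_CAP_IRQ_REMAPPING : 0;
}

/* Hypothetical create-time check; returns 0 or a negative errno value. */
static int viommu_create_check(uint64_t type, uint64_t caps)
{
    uint64_t supported = viommu_supported_caps(type);

    if ( !supported )
        return -ENODEV;     /* unknown vIOMMU type */

    if ( caps & ~supported )
        return -EINVAL;     /* at least one requested cap is unsupported */

    return 0;
}

/* Usage: the caller only has to look at the return value of create. */
static void viommu_create_check_example(void)
{
    assert(viommu_create_check(VIOMMU_TYPE_INTEL_VTD,
                               VIOMMU_CAP_IRQ_REMAPPING) == 0);
    assert(viommu_create_check(VIOMMU_TYPE_INTEL_VTD, 1u << 1) == -EINVAL);
}
```

With this shape the toolstack needs no separate query step: an unsupported
set of capabilities simply makes the create call fail.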

Roger.


* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-22 15:55   ` Roger Pau Monné
@ 2017-08-23  7:36     ` Lan Tianyu
  2017-08-23 13:53       ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  7:36 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On 2017-08-22 23:55, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
>> This patch is to add Xen virtual IOMMU doc to introduce motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  docs/misc/viommu.txt | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 139 insertions(+)
>>  create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..39455bb
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
> 
> IMHO, this should be the first patch in the series.

OK. Will update.

> 
>> @@ -0,0 +1,139 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +*) Enable more than 255 vcpu support
> 
> Seems like the "*)" is some kind of leftover?
> 
>> +HPC cloud service requires VM provides high performance parallel
>> +computing and we hope to create a huge VM with >255 vcpu on one machine
>> +to meet such requirement. Pin each vcpu to separate pcpus.
> 
> I would re-write this as:
> 
> HPC cloud services require VMs with a high number of CPUs in order to
> achieve high performance in parallel computing.
> 
> Also, this is needed in order to create VMs with > 128 vCPUs, not 255
> vCPUs. That's because the APIC ID used by Xen is CPU ID * 2 (ie: CPU
> 127 has APIC ID 254, which is the last one available in xAPIC mode).
> You should reword the paragraphs below in order to fix the mention of
> 255 vCPUs.

Thanks for your rewrite.
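
The arithmetic behind the 128 vCPU limit can be sketched as below. This is
only an illustration: the helper names are made up, and both the
"APIC ID = CPU ID * 2" rule and the value 254 as the last xAPIC ID are
taken from the review comment above rather than checked against the code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Last APIC ID available in xAPIC mode, per the review comment. */
#define XAPIC_MAX_APIC_ID 254u

/* Assumption from the review comment: APIC ID = vCPU ID * 2. */
static uint32_t vcpu_to_apic_id(uint32_t vcpu_id)
{
    assert(vcpu_id <= UINT32_MAX / 2);   /* avoid wrapping in the sketch */
    return vcpu_id * 2;
}

/* True if the vCPU can be addressed with an 8-bit xAPIC ID. */
static bool xapic_can_address(uint32_t vcpu_id)
{
    return vcpu_to_apic_id(vcpu_id) <= XAPIC_MAX_APIC_ID;
}
```

vCPU 127 maps to APIC ID 254 and is still addressable; vCPU 128 would need
APIC ID 256, which does not fit, hence the limit of 128 vCPUs.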

> 
>> +
>> +To support >255 vcpus, X2APIC mode in guest is necessary because legacy
>> +APIC(XAPIC) just supports 8-bit APIC ID and it only can support 255
>> +vcpus at most. X2APIC mode supports 32-bit APIC ID and it requires
>> +interrupt mapping function of vIOMMU.
> 
> Correct me if I'm wrong, but I don't think x2APIC requires vIOMMU. The
> IOMMU is required so that you can route interrupts to all the possible
> CPUs. One could imagine a setup where only CPUs with APIC IDs < 255 are
> used as targets of external interrupts, and that doesn't require an
> IOMMU.

This is OS behavior. IIRC, Windows strictly requires an IOMMU when
x2APIC mode is enabled, while the Linux kernel only has such a requirement
when the number of CPUs is > 255.


> 
>> +The reason for this is that there is no modification to existing PCI MSI
>> +and IOAPIC with the introduction of X2APIC. PCI MSI/IOAPIC can only send
>> +interrupt message containing 8-bit APIC ID, which cannot address >255
>> +cpus. Interrupt remapping supports 32-bit APIC ID and so it's necessary
>> +to enable >255 cpus with x2apic mode.
>> +
>> +
>> +vIOMMU Architecture
>> +===================
>> +vIOMMU device model is inside Xen hypervisor for following factors
>> +    1) Avoid round trips between Qemu and Xen hypervisor
>> +    2) Ease of integration with the rest of hypervisor
>> +    3) HVMlite/PVH doesn't use Qemu
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from virtual devices and physical devices are delivered
>> +to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
>> +this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu                       |VM                     |
>> +|                           | +----------------+    |
>> +|                           | |  Device driver |    |
>> +|                           | +--------+-------+    |
>> +|                           |          ^            |
>> +|       +----------------+  | +--------+-------+    |
>> +|       | Virtual device |  | |  IRQ subsystem |    |
>> +|       +-------+--------+  | +--------+-------+    |
>> +|               |           |          ^            |
>> +|               |           |          |            |
>> ++---------------------------+-----------------------+
>> +|hypervisor     |                      | VIRQ       |
>> +|               |            +---------+--------+   |
>> +|               |            |      vLAPIC      |   |
>> +|               |VIRQ        +---------+--------+   |
>> +|               |                      ^            |
>> +|               |                      |            |
>> +|               |            +---------+--------+   |
>> +|               |            |      vIOMMU      |   |
>> +|               |            +---------+--------+   |
>> +|               |                      ^            |
>> +|               |                      |            |
>> +|               |            +---------+--------+   |
>> +|               |            |   vIOAPIC/vMSI   |   |
>> +|               |            +----+----+--------+   |
>> +|               |                 ^    ^            |
>> +|               +-----------------+    |            |
>> +|                                      |            |
>> ++---------------------------------------------------+
>> +HW                                     |IRQ
>> +                                +-------------------+
>> +                                |   PCI Device      |
>> +                                +-------------------+
>> +
>> +
>> +vIOMMU hypercall
>> +================
>> +Introduce new domctl hypercall "xen_domctl_viommu_op" to create/destroy
>             ^ a
>> +vIOMMU and query vIOMMU capabilities that device model can support.
>          ^ s                                ^ the
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> +    uint32_t cmd;
>> +#define XEN_DOMCTL_create_viommu          0
>> +#define XEN_DOMCTL_destroy_viommu         1
>> +#define XEN_DOMCTL_query_viommu_caps      2
>> +    union {
>> +        struct {
>> +            /* IN - vIOMMU type  */
>> +            uint64_t viommu_type;
>> +            /* IN - MMIO base address of vIOMMU. */
>> +            uint64_t base_address;
>> +            /* IN - Length of MMIO region */
>> +            uint64_t length;
>> +            /* IN - Capabilities with which we want to create */
>> +            uint64_t capabilities;
>> +            /* OUT - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } create_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } destroy_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU type */
>> +            uint64_t viommu_type;
>> +            /* OUT - vIOMMU Capabilities */
>> +            uint64_t capabilities;
>> +        } query_caps;
>> +    } u;
>> +};
>> +
>> +- XEN_DOMCTL_query_viommu_caps
>> +    Query capabilities of vIOMMU device model. vIOMMU_type specifies
>> +which vendor vIOMMU device model(E,G Intel VTD) is targeted and hypervisor
>> +returns capability bits(E,G interrupt remapping bit).
>> +
>> +- XEN_DOMCTL_create_viommu
>> +    Create vIOMMU device with vIOMMU_type, capabilities, MMIO
>> +base address and length. Hypervisor returns viommu_id. Capabilities should
>> +be in range of value returned by query_viommu_caps hypercall.
>> +
>> +- XEN_DOMCTL_destroy_viommu
>> +    Destroy vIOMMU in Xen hypervisor with viommu_id as parameters.
>> +
>> +Now just support single vIOMMU for one VM and introduced domctls are compatible
>> +with multi-vIOMMU support.
>> +
>> +xl vIOMMU configuration
> 
> This should be "xl x86 vIOMMU configuration", since it's clearly x86
> specific.

OK. Will update.

> 
>> +=======================
>> +viommu="type=intel_vtd,intremap=1,x2apic=1"
> 
> Shouldn't this have some kind of array form? From the code I saw it
> seems like you are adding support for domains having multiple IOMMUs,
> in which case this should at least look like:

No, we don't support multi-vIOMMU, but some vIOMMU data structures are
defined with multi-vIOMMU in mind.

> 
> viommu = [
>     'type=intel_vtd,intremap=1,x2apic=1',
>     'type=intel_vtd,intremap=1,x2apic=1'
> ]
> 

Wei also suggested this. Will update.

> But then it's missing to which PCI bus each IOMMU is attached.

This will be added if we really need to support multi vIOMMU.

> 
> Also, why do you need the x2apic parameter? Is there any value in
> providing a vIOMMU if it doesn't support x2APIC mode?

The user can configure whether the vIOMMU supports x2APIC mode, and the tool
stack will use this configuration to prepare the ACPI DMAR table. There is an
X2APIC_OPT_OUT bit in the DMAR table to tell the OS not to enable x2APIC mode
for the IOMMU.
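
A rough sketch of how the two xl options could map onto the DMAR flags
byte. The bit positions follow the VT-d spec's DMAR "Flags" field (bit 0
INTR_REMAP, bit 1 X2APIC_OPT_OUT); the macro names, struct viommu_cfg and
dmar_flags() are hypothetical and not taken from the libacpi patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DMAR "Flags" bits; the macro names here are illustrative. */
#define DMAR_FLAG_INTR_REMAP      (1u << 0)
#define DMAR_FLAG_X2APIC_OPT_OUT  (1u << 1)

/* Hypothetical mirror of the xl options intremap= and x2apic=. */
struct viommu_cfg {
    bool intremap;
    bool x2apic;
};

static uint8_t dmar_flags(const struct viommu_cfg *cfg)
{
    uint8_t flags = 0;

    assert(cfg);

    if ( cfg->intremap )
        flags |= DMAR_FLAG_INTR_REMAP;

    /* The opt-out bit is only set when interrupt remapping is reported. */
    if ( cfg->intremap && !cfg->x2apic )
        flags |= DMAR_FLAG_X2APIC_OPT_OUT;

    return flags;
}
```

So "intremap=1,x2apic=1" would yield only the INTR_REMAP bit, while
"intremap=1,x2apic=0" would additionally set X2APIC_OPT_OUT.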

> 
> Roger.
> 


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-23  7:10         ` Lan Tianyu
@ 2017-08-23  7:38           ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  7:38 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 23, 2017 at 03:10:48PM +0800, Lan Tianyu wrote:
> On 2017-08-22 23:27, Roger Pau Monné wrote:
> > On Thu, Aug 17, 2017 at 08:22:16PM -0400, Lan Tianyu wrote:
> >> +int viommu_register_type(u64 type, struct viommu_ops * ops)
> >> +{
> >> +    struct viommu_type *viommu_type = NULL;
> >> +
> >> +    if ( !viommu_enabled() )
> >> +        return -EINVAL;
> > 
> > ENODEV is maybe better here.
> 
> Will update.
> 
> > 
> >> +
> >> +    if ( viommu_get_type(type) )
> >> +        return -EEXIST;
> >> +
> >> +    viommu_type = xzalloc(struct viommu_type);
> >> +    if ( !viommu_type )
> >> +        return -ENOMEM;
> >> +
> >> +    viommu_type->type = type;
> >> +    viommu_type->ops = ops;
> >> +
> >> +    spin_lock(&type_list_lock);
> >> +    list_add_tail(&viommu_type->node, &type_list);
> >> +    spin_unlock(&type_list_lock);
> >> +
> >> +    return 0;
> >> +}
> > 
> > Hm, I haven't seen the usage of this function, but from the looks of
> > it it seems like you want to use something similar to what's used by
> > the scheduler in order to register vIOMMU types.
> > 
> > See the logic around REGISTER_SCHEDULER in xen/sched-if.h.
> 
> Each vIOMMU device model needs to call viommu_register_type() to register
> its vIOMMU type. The tool stack will pass viommu_type, and the vIOMMU
> abstraction layer calls the vendor's vIOMMU callbacks to query vIOMMU
> capabilities, create and destroy the vIOMMU. We just need to maintain a
> type list. REGISTER_SCHEDULER seems too heavy, as it reserves a scheduler
> array in the Xen data section.

I will let the maintainers comment, but I find it easier to use
something similar to REGISTER_SCHEDULER rather than have each IOMMU
type have its initialization function hooked up in the init procedure
and calling viommu_register_type.

From an abstract point of view I don't really see how this is
different from a scheduler, for example: the IOMMU types are fixed and
compiled into Xen, and they need to be initialized/registered at boot
time.
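
To illustrate the "fixed and compiled in" point, here is a simplified
sketch of a compile-time type table. It deliberately uses a plain static
array instead of the data-section array behind REGISTER_SCHEDULER, the ops
structure is reduced to a single callback without a struct domain
argument, and the vvtd_* names are placeholders rather than the functions
from the series:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)   /* from the patch */
#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)   /* from the patch */

/* Simplified ops: the real structure takes a struct domain pointer. */
struct viommu_ops {
    uint64_t (*query_caps)(void);
};

struct viommu_type {
    uint64_t type;
    const struct viommu_ops *ops;
};

/* Placeholder callback standing in for the virtual VTD implementation. */
static uint64_t vvtd_query_caps(void)
{
    return VIOMMU_CAP_IRQ_REMAPPING;
}

static const struct viommu_ops vvtd_ops = {
    .query_caps = vvtd_query_caps,
};

/* All built-in types, fixed at build time: no allocation, list or lock. */
static const struct viommu_type viommu_types[] = {
    { .type = VIOMMU_TYPE_INTEL_VTD, .ops = &vvtd_ops },
};

static const struct viommu_type *viommu_get_type(uint64_t type)
{
    size_t i;

    for ( i = 0; i < sizeof(viommu_types) / sizeof(viommu_types[0]); i++ )
    {
        assert(viommu_types[i].ops);   /* every entry must provide ops */
        if ( viommu_types[i].type == type )
            return &viommu_types[i];
    }

    return NULL;
}
```

Compared with the runtime list in the patch, a lookup like this cannot
fail with -ENOMEM or -EEXIST, since nothing is registered at run time.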

> >> +static inline bool viommu_enabled(void) { return opt_viommu; }
> > 
> > I think those static inline functions should also follow the coding
> > standard.
> 
> Yes, I am not sure what is wrong here. Could you elaborate? Thanks.

IMHO they should be:

static inline bool viommu_enabled(void)
{
    return opt_viommu;
}

Roger.


* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-22 15:32   ` Roger Pau Monné
@ 2017-08-23  7:42     ` Lan Tianyu
  2017-08-23  9:24       ` Jan Beulich
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  7:42 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On 2017-08-22 23:32, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:03PM -0400, Lan Tianyu wrote:
>> This patch is to add irq request callback for platform implementation
>> to deal with irq remapping request.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/viommu.c          | 15 +++++++++
>>  xen/include/asm-x86/viommu.h | 73 ++++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/xen/viommu.h     |  9 ++++++
>>  3 files changed, 97 insertions(+)
>>  create mode 100644 xen/include/asm-x86/viommu.h
>>
>> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
>> index a4d004d..f4d34e6 100644
>> --- a/xen/common/viommu.c
>> +++ b/xen/common/viommu.c
>> @@ -198,6 +198,21 @@ int __init viommu_setup(void)
>>      return 0;
>>  }
>>  
>> +int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>> +                              struct irq_remapping_request *request)
>> +{
>> +    struct viommu_info *info = &d->viommu;
>> +
>> +    if ( viommu_id >= info->nr_viommu
>> +         || !info->viommu[viommu_id] )
> 
> This fits on the same line, no need to split it.
> 
>> +        return -EINVAL;
>> +
>> +    if ( !info->viommu[viommu_id]->ops->handle_irq_request )
>> +        return -EINVAL;
>> +
>> +    return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
>> +}
>> +
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
>> new file mode 100644
>> index 0000000..51bda72
>> --- /dev/null
>> +++ b/xen/include/asm-x86/viommu.h
>> @@ -0,0 +1,73 @@
>> +/*
>> + * include/asm-x86/viommu.h
>> + *
>> + * Copyright (c) 2017 Intel Corporation.
>> + * Author: Lan Tianyu <tianyu.lan@intel.com> 
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + */
>> +#ifndef __ARCH_X86_VIOMMU_H__
>> +#define __ARCH_X86_VIOMMU_H__
>> +
>> +#include <xen/viommu.h>
>> +#include <asm/types.h>
>> +
>> +/* IRQ request type */
>> +#define VIOMMU_REQUEST_IRQ_MSI          0
>> +#define VIOMMU_REQUEST_IRQ_APIC         1
>> +
>> +struct irq_remapping_request
>> +{
>> +    union {
>> +        /* MSI */
>> +        struct {
>> +            u64 addr;
>> +            u32 data;
>> +        } msi;
>> +        /* Redirection Entry in IOAPIC */
>> +        u64 rte;
>> +    } msg;
>> +    u16 source_id;
>> +    u8 type;
>> +};
>> +
>> +static inline void irq_request_ioapic_fill(struct irq_remapping_request *req,
>> +                             uint32_t ioapic_id, uint64_t rte)
>> +{
>> +    ASSERT(req);
>> +    req->type = VIOMMU_REQUEST_IRQ_APIC;
>> +    req->source_id = ioapic_id;
>> +    req->msg.rte = rte;
>> +}
>> +
>> +static inline void irq_request_msi_fill(struct irq_remapping_request *req,
>> +                          uint32_t source_id, uint64_t addr, uint32_t data)
>> +{
>> +    ASSERT(req);
>> +    req->type = VIOMMU_REQUEST_IRQ_MSI;
>> +    req->source_id = source_id;
>> +    req->msg.msi.addr = addr;
>> +    req->msg.msi.data = data;
>> +}
> 
> What's the usage of those two functions? AFAICT they don't have any
> callers in this patch.

These functions will be called by later patches in this series: patch 22
"x86/vmsi: Hook delivering remapping format msi to guest" and patch 16
"x86/vioapic: Hook interrupt delivery of vIOAPIC".

> 
>> +
>> +#endif /* __ARCH_X86_VIOMMU_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * End:
>> + */
>> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
>> index 527afb1..0be1b3a 100644
>> --- a/xen/include/xen/viommu.h
>> +++ b/xen/include/xen/viommu.h
>> @@ -20,6 +20,8 @@
>>  #ifndef __XEN_VIOMMU_H__
>>  #define __XEN_VIOMMU_H__
>>  
>> +#include <asm/viommu.h>
>> +
>>  #define NR_VIOMMU_PER_DOMAIN 1
>>  
>>  struct viommu;
>> @@ -28,6 +30,8 @@ struct viommu_ops {
>>      u64 (*query_caps)(struct domain *d);
>>      int (*create)(struct domain *d, struct viommu *viommu);
>>      int (*destroy)(struct viommu *viommu);
>> +    int (*handle_irq_request)(struct domain *d,
>> +                              struct irq_remapping_request *request);
> 
> I'm slightly lost, you add the function pointer here and some inline
> functions in asm-x86/viommu.h, yet I don't see them being hooked into
> the struct viommu_ops in any way.
> 
> Roger.
> 


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-08-22 15:38   ` Roger Pau Monné
@ 2017-08-23  7:43     ` Lan Tianyu
  2017-08-23  9:25     ` Jan Beulich
  1 sibling, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  7:43 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On 2017-08-22 23:38, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:04PM -0400, Lan Tianyu wrote:
>> This patch is to add get_irq_info callback for platform implementation
>> to convert irq remapping request to irq info (E,G vector, dest, dest_mode
>> and so on).
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/viommu.c          | 16 ++++++++++++++++
>>  xen/include/asm-x86/viommu.h |  8 ++++++++
>>  xen/include/xen/viommu.h     |  9 +++++++++
>>  3 files changed, 33 insertions(+)
>>
>> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
>> index f4d34e6..03c879d 100644
>> --- a/xen/common/viommu.c
>> +++ b/xen/common/viommu.c
>> @@ -213,6 +213,22 @@ int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>>      return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
>>  }
>>  
>> +int viommu_get_irq_info(struct domain *d, u32 viommu_id,
>> +                        struct irq_remapping_request *request,
>> +                        struct irq_remapping_info *irq_info)
> 
> The definition of this struct seems to be arch-specific, in which case
> IMHO it should be called arch_irq_remapping_info, in order to denote
> it's arch-specific.

OK. Will update.

> 
>> +{
>> +    struct viommu_info *info = &d->viommu;
>> +
>> +    if ( viommu_id >= info->nr_viommu
>> +         || !info->viommu[viommu_id] )
> 
> Unneeded line break.
> 
> Roger.
> 



* Re: [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction
  2017-08-09 20:34 ` [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction Lan Tianyu
@ 2017-08-23  7:45   ` Roger Pau Monné
  2017-08-23  8:02     ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  7:45 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:11PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> If guest is configured to have a vIOMMU, create it during domain construction.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  tools/libxl/libxl_x86.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> index 455f6f0..ace20e5 100644
> --- a/tools/libxl/libxl_x86.c
> +++ b/tools/libxl/libxl_x86.c
> @@ -341,8 +341,36 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
>      if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM) {

I would rather change this check so it's:

d_config->b_info.type != LIBXL_DOMAIN_TYPE_PV

Is there any reason why PVH guests shouldn't get a vIOMMU?

>          unsigned long shadow = DIV_ROUNDUP(d_config->b_info.shadow_memkb,
>                                             1024);
> +        libxl_viommu_info *viommu = &d_config->b_info.viommu;
> +
>          xc_shadow_control(ctx->xch, domid, XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
>                            NULL, 0, &shadow, 0, NULL);
> +
> +        /* Check supported capbilities and create viommu */
> +        if (viommu->type) {
> +            uint32_t id;
> +            uint64_t cap;
> +
> +            if (xc_viommu_query_cap(ctx->xch, domid, viommu->type, &cap)) {
> +                LOGED(ERROR, domid, "failed to query vIOMMU's capabilities");
> +                ret = ERROR_FAIL;
> +                goto out;
> +            }
> +
> +            if ((cap & viommu->cap) != viommu->cap) {
> +                LOGED(ERROR, domid, "vIOMMU: Unsupported cap %"PRIu64, cap);
> +                ret = ERROR_FAIL;
> +                goto out;
> +            }

As said earlier, I don't think you should check the capabilities: just
try to create the vIOMMU, and if the selected capabilities are not
supported xc_viommu_create should fail.

Roger.


* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-22 16:41   ` Roger Pau Monné
@ 2017-08-23  7:52     ` Lan Tianyu
  2017-08-23  8:04       ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  7:52 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On 2017-08-23 00:41, Roger Pau Monné wrote:
>> > +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
>> > +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
>> > +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
>> > +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
>> > +    drhd->pci_segment = 0;
>> > +    drhd->base_address = config->iommu_base_addr;
>> > +
>> > +    scope = &drhd->scope[0];
>> > +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
>> > +    scope->length = ioapic_scope_size;
>> > +    scope->enumeration_id = config->ioapic_id;
>> > +    scope->bus = I440_PSEUDO_BUS_PLATFORM;
>> > +    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;
> I'm not sure whether these constants should instead be fields in the
> acpi_config struct passed down from libxl. libxc shouldn't really need
> to know anything about which chipset a VM is using.

How about renaming I440_PSEUDO_XXX to VIOMMU_PSEUDO_XXX?


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM
  2017-08-09 20:34 ` [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
@ 2017-08-23  7:58   ` Roger Pau Monné
  2017-08-24  2:16     ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  7:58 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:12PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This patch adds create/destroy/query function for the emulated VTD
> and adapts it to the common VIOMMU abstraction.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/Makefile |   7 +-
>  xen/drivers/passthrough/vtd/iommu.h  |  99 +++++++++++++++++-----
>  xen/drivers/passthrough/vtd/vvtd.c   | 158 +++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/viommu.h         |   3 +
>  4 files changed, 241 insertions(+), 26 deletions(-)
>  create mode 100644 xen/drivers/passthrough/vtd/vvtd.c
> 
> diff --git a/xen/drivers/passthrough/vtd/Makefile b/xen/drivers/passthrough/vtd/Makefile
> index f302653..163c7fe 100644
> --- a/xen/drivers/passthrough/vtd/Makefile
> +++ b/xen/drivers/passthrough/vtd/Makefile
> @@ -1,8 +1,9 @@
>  subdir-$(CONFIG_X86) += x86
>  
> -obj-y += iommu.o
>  obj-y += dmar.o
> -obj-y += utils.o
> -obj-y += qinval.o
>  obj-y += intremap.o
> +obj-y += iommu.o
> +obj-y += qinval.o
>  obj-y += quirks.o
> +obj-y += utils.o
> +obj-$(CONFIG_VIOMMU) += vvtd.o
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index 72c1a2e..55f3b6e 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -23,31 +23,54 @@
>  #include <asm/msi.h>
>  
>  /*
> - * Intel IOMMU register specification per version 1.0 public spec.
> + * Intel IOMMU register specification per version 2.4 public spec.
>   */
>  
> -#define    DMAR_VER_REG    0x0    /* Arch version supported by this IOMMU */
> -#define    DMAR_CAP_REG    0x8    /* Hardware supported capabilities */
> -#define    DMAR_ECAP_REG    0x10    /* Extended capabilities supported */
> -#define    DMAR_GCMD_REG    0x18    /* Global command register */
> -#define    DMAR_GSTS_REG    0x1c    /* Global status register */
> -#define    DMAR_RTADDR_REG    0x20    /* Root entry table */
> -#define    DMAR_CCMD_REG    0x28    /* Context command reg */
> -#define    DMAR_FSTS_REG    0x34    /* Fault Status register */
> -#define    DMAR_FECTL_REG    0x38    /* Fault control register */
> -#define    DMAR_FEDATA_REG    0x3c    /* Fault event interrupt data register */
> -#define    DMAR_FEADDR_REG    0x40    /* Fault event interrupt addr register */
> -#define    DMAR_FEUADDR_REG 0x44    /* Upper address register */
> -#define    DMAR_AFLOG_REG    0x58    /* Advanced Fault control */
> -#define    DMAR_PMEN_REG    0x64    /* Enable Protected Memory Region */
> -#define    DMAR_PLMBASE_REG 0x68    /* PMRR Low addr */
> -#define    DMAR_PLMLIMIT_REG 0x6c    /* PMRR low limit */
> -#define    DMAR_PHMBASE_REG 0x70    /* pmrr high base addr */
> -#define    DMAR_PHMLIMIT_REG 0x78    /* pmrr high limit */
> -#define    DMAR_IQH_REG    0x80    /* invalidation queue head */
> -#define    DMAR_IQT_REG    0x88    /* invalidation queue tail */
> -#define    DMAR_IQA_REG    0x90    /* invalidation queue addr */
> -#define    DMAR_IRTA_REG   0xB8    /* intr remap */
> +#define DMAR_VER_REG            0x0  /* Arch version supported by this IOMMU */
> +#define DMAR_CAP_REG            0x8  /* Hardware supported capabilities */
> +#define DMAR_ECAP_REG           0x10 /* Extended capabilities supported */
> +#define DMAR_GCMD_REG           0x18 /* Global command register */
> +#define DMAR_GSTS_REG           0x1c /* Global status register */
> +#define DMAR_RTADDR_REG         0x20 /* Root entry table */
> +#define DMAR_CCMD_REG           0x28 /* Context command reg */
> +#define DMAR_FSTS_REG           0x34 /* Fault Status register */
> +#define DMAR_FECTL_REG          0x38 /* Fault control register */
> +#define DMAR_FEDATA_REG         0x3c /* Fault event interrupt data register */
> +#define DMAR_FEADDR_REG         0x40 /* Fault event interrupt addr register */
> +#define DMAR_FEUADDR_REG        0x44 /* Upper address register */
> +#define DMAR_AFLOG_REG          0x58 /* Advanced Fault control */
> +#define DMAR_PMEN_REG           0x64 /* Enable Protected Memory Region */
> +#define DMAR_PLMBASE_REG        0x68 /* PMRR Low addr */
> +#define DMAR_PLMLIMIT_REG       0x6c /* PMRR low limit */
> +#define DMAR_PHMBASE_REG        0x70 /* pmrr high base addr */
> +#define DMAR_PHMLIMIT_REG       0x78 /* pmrr high limit */
> +#define DMAR_IQH_REG            0x80 /* invalidation queue head */
> +#define DMAR_IQT_REG            0x88 /* invalidation queue tail */
> +#define DMAR_IQT_REG_HI         0x8c
> +#define DMAR_IQA_REG            0x90 /* invalidation queue addr */
> +#define DMAR_IQA_REG_HI         0x94
> +#define DMAR_ICS_REG            0x9c /* Invalidation complete status */
> +#define DMAR_IECTL_REG          0xa0 /* Invalidation event control */
> +#define DMAR_IEDATA_REG         0xa4 /* Invalidation event data */
> +#define DMAR_IEADDR_REG         0xa8 /* Invalidation event address */
> +#define DMAR_IEUADDR_REG        0xac /* Invalidation event address */
> +#define DMAR_IRTA_REG           0xb8 /* Interrupt remapping table addr */
> +#define DMAR_IRTA_REG_HI        0xbc
> +#define DMAR_PQH_REG            0xc0 /* Page request queue head */
> +#define DMAR_PQH_REG_HI         0xc4
> +#define DMAR_PQT_REG            0xc8 /* Page request queue tail*/
> +#define DMAR_PQT_REG_HI         0xcc
> +#define DMAR_PQA_REG            0xd0 /* Page request queue address */
> +#define DMAR_PQA_REG_HI         0xd4
> +#define DMAR_PRS_REG            0xdc /* Page request status */
> +#define DMAR_PECTL_REG          0xe0 /* Page request event control */
> +#define DMAR_PEDATA_REG         0xe4 /* Page request event data */
> +#define DMAR_PEADDR_REG         0xe8 /* Page request event address */
> +#define DMAR_PEUADDR_REG        0xec /* Page event upper address */
> +#define DMAR_MTRRCAP_REG        0x100 /* MTRR capability */
> +#define DMAR_MTRRCAP_REG_HI     0x104
> +#define DMAR_MTRRDEF_REG        0x108 /* MTRR default type */
> +#define DMAR_MTRRDEF_REG_HI     0x10c
>  
>  #define OFFSET_STRIDE        (9)
>  #define dmar_readl(dmar, reg) readl((dmar) + (reg))
> @@ -58,6 +81,30 @@
>  #define VER_MAJOR(v)        (((v) & 0xf0) >> 4)
>  #define VER_MINOR(v)        ((v) & 0x0f)
>  
> +/* CAP_REG */
> +#define DMA_DOMAIN_ID_SHIFT         16  /* 16-bit domain id for 64K domains */
> +#define DMA_DOMAIN_ID_MASK          ((1UL << DMA_DOMAIN_ID_SHIFT) - 1)
> +#define DMA_CAP_ND                  (((DMA_DOMAIN_ID_SHIFT - 4) / 2) & 7ULL)
> +#define DMA_MGAW                    39  /* Maximum Guest Address Width */
> +#define DMA_CAP_MGAW                (((DMA_MGAW - 1) & 0x3fULL) << 16)
> +#define DMA_MAMV                    18ULL
> +#define DMA_CAP_MAMV                (DMA_MAMV << 48)
> +#define DMA_CAP_PSI                 (1ULL << 39)
> +#define DMA_CAP_SLLPS               ((1ULL << 34) | (1ULL << 35))
> +#define DMA_FRCD_REG_NR             1ULL
> +#define DMA_CAP_NFR                 ((DMA_FRCD_REG_NR - 1) << 40)
> +#define DMA_CAP_FRO_OFFSET          0x220ULL
> +#define DMA_CAP_FRO                 (DMA_CAP_FRO_OFFSET << 20)
> +
> +/* Supported Adjusted Guest Address Widths */
> +#define DMA_CAP_SAGAW_SHIFT         8
> +#define DMA_CAP_SAGAW_MASK          (0x1fULL << DMA_CAP_SAGAW_SHIFT)
> + /* 39-bit AGAW, 3-level page-table */
> +#define DMA_CAP_SAGAW_39bit         (0x2ULL << DMA_CAP_SAGAW_SHIFT)
> + /* 48-bit AGAW, 4-level page-table */
> +#define DMA_CAP_SAGAW_48bit         (0x4ULL << DMA_CAP_SAGAW_SHIFT)
> +#define DMA_CAP_SAGAW               DMA_CAP_SAGAW_39bit
> +
>  /*
>   * Decoding Capability Register
>   */
> @@ -89,6 +136,12 @@
>  #define cap_afl(c)        (((c) >> 3) & 1)
>  #define cap_ndoms(c)        (1 << (4 + 2 * ((c) & 0x7)))
>  
> +/* ECAP_REG */
> +#define DMA_ECAP_QI                 (1ULL << 1)
> +#define DMA_ECAP_IR                 (1ULL << 3)
> +#define DMA_ECAP_EIM                (1ULL << 4)
> +#define DMA_ECAP_MHMV               (15ULL << 20)

Wow, what's this chunk above? The description only mentions adding
functions for the VT-d IOMMU implementation, yet there seems to be some
code movement here. Please split it into a separate patch.

>  /*
>   * Extended Capability Register
>   */
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> new file mode 100644
> index 0000000..353fafe
> --- /dev/null
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -0,0 +1,158 @@
> +/*
> + * vvtd.c
> + *
> + * virtualize VTD for HVM.
> + *
> + * Copyright (C) 2017 Chao Gao, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/domain_page.h>
> +#include <xen/sched.h>
> +#include <xen/types.h>
> +#include <xen/viommu.h>
> +#include <xen/xmalloc.h>
> +#include <asm/current.h>
> +#include <asm/hvm/domain.h>
> +#include <asm/page.h>
> +
> +#include "iommu.h"
> +
> +struct hvm_hw_vvtd_regs {
> +    uint8_t data[1024];
> +};
> +
> +/* Status field of struct vvtd */
> +#define VIOMMU_STATUS_DEFAULT                   (0)
> +#define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED     (1 << 0)
> +#define VIOMMU_STATUS_DMA_REMAPPING_ENABLED     (1 << 1)
> +
> +struct vvtd {
> +    /* VIOMMU_STATUS_XXX */
> +    int status;

unsigned int if it's a bitfield.

> +    /* Address range of remapping hardware register-set */
> +    uint64_t base_addr;
> +    uint64_t length;
> +    /* Point back to the owner domain */
> +    struct domain *domain;
> +    struct hvm_hw_vvtd_regs *regs;
> +    struct page_info *regs_page;
> +};
> +
> +static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
> +                                uint32_t value)
> +{
> +    *((uint32_t *)(&vtd->regs->data[reg])) = value;
> +}
> +
> +static inline uint32_t vvtd_get_reg(struct vvtd *vtd, uint32_t reg)
> +{
> +    return *((uint32_t *)(&vtd->regs->data[reg]));
> +}
> +
> +static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
> +{
> +    return *((uint8_t *)(&vtd->regs->data[reg]));

You don't need the cast here, data is already an array of
uint8_t.

> +}
> +
> +#define vvtd_get_reg_quad(vvtd, reg, val) do {  \
> +    (val) = vvtd_get_reg(vvtd, (reg) + 4 );     \
> +    (val) = (val) << 32;                        \
> +    (val) += vvtd_get_reg(vvtd, reg);           \
> +} while(0)
> +#define vvtd_set_reg_quad(vvtd, reg, val) do {  \
> +    vvtd_set_reg(vvtd, reg, (val));             \
> +    vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
> +} while(0)

You seem to need to access hvm_hw_vvtd_regs using different sizes, why
not do:

union hvm_hw_vvtd_regs {
    uint8_t  data8[1024];
    uint16_t data16[512];
    uint32_t data32[256];
    uint64_t data64[128];
};

Then the access is much more straightforward and you don't need the
complicated helpers that you have above.
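
A minimal sketch of what the union-based accessors could look like. The
helper names (vvtd_get_reg32() and friends) and VVTD_REGS_SIZE are
hypothetical and not part of the patch; only the 1024-byte size comes
from the posted code. Note that the 64-bit view only works for offsets
that are 8-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

#define VVTD_REGS_SIZE 1024 /* matches data[1024] in the posted patch */

union hvm_hw_vvtd_regs {
    uint8_t  data8[VVTD_REGS_SIZE];
    uint32_t data32[VVTD_REGS_SIZE / 4];
    uint64_t data64[VVTD_REGS_SIZE / 8];
};

/* 'reg' is a byte offset, like the DMAR_*_REG constants. */
static inline uint32_t vvtd_get_reg32(const union hvm_hw_vvtd_regs *r,
                                      uint32_t reg)
{
    assert(reg < VVTD_REGS_SIZE && !(reg & 3));
    return r->data32[reg / 4];
}

static inline void vvtd_set_reg32(union hvm_hw_vvtd_regs *r, uint32_t reg,
                                  uint32_t val)
{
    assert(reg < VVTD_REGS_SIZE && !(reg & 3));
    r->data32[reg / 4] = val;
}

static inline uint64_t vvtd_get_reg64(const union hvm_hw_vvtd_regs *r,
                                      uint32_t reg)
{
    assert(reg < VVTD_REGS_SIZE && !(reg & 7));
    return r->data64[reg / 8];
}

static inline void vvtd_set_reg64(union hvm_hw_vvtd_regs *r, uint32_t reg,
                                  uint64_t val)
{
    assert(reg < VVTD_REGS_SIZE && !(reg & 7));
    r->data64[reg / 8] = val;
}
```

With this layout the quad helpers collapse to a single array access
instead of two 32-bit reads combined by hand.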

> +static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
> +{
> +    uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
> +                   DMA_CAP_MGAW | DMA_CAP_SAGAW | DMA_CAP_ND;
> +    uint64_t ecap = DMA_ECAP_IR | DMA_ECAP_EIM | DMA_ECAP_QI;
> +
> +    vvtd_set_reg(vvtd, DMAR_VER_REG, 0x10UL);
> +    vvtd_set_reg_quad(vvtd, DMAR_CAP_REG, cap);
> +    vvtd_set_reg_quad(vvtd, DMAR_ECAP_REG, ecap);
> +    vvtd_set_reg(vvtd, DMAR_FECTL_REG, 0x80000000UL);
> +    vvtd_set_reg(vvtd, DMAR_IECTL_REG, 0x80000000UL);
> +}
> +
> +static u64 vvtd_query_caps(struct domain *d)
> +{
> +    return VIOMMU_CAP_IRQ_REMAPPING;
> +}
> +
> +static int vvtd_create(struct domain *d, struct viommu *viommu)
> +{
> +    struct vvtd *vvtd;
> +    int ret;
> +
> +    if ( !is_hvm_domain(d) || (viommu->length != PAGE_SIZE) ||
> +        (~vvtd_query_caps(d) & viommu->caps) )
> +        return -EINVAL;
> +
> +    ret = -ENOMEM;
> +    vvtd = xmalloc_bytes(sizeof(struct vvtd));
> +    if ( !vvtd )
> +        return ret;
> +
> +    vvtd->regs_page = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !vvtd->regs_page )
> +        goto out1;
> +
> +    vvtd->regs = __map_domain_page_global(vvtd->regs_page);
> +    if ( !vvtd->regs )
> +        goto out2;
> +    clear_page(vvtd->regs);
> +
> +    vvtd_reset(vvtd, viommu->caps);
> +    vvtd->base_addr = viommu->base_address;

Don't you need to perform any checks on the base_address? It needs to
be page aligned at least.
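
Something along these lines would do as a starting point. This is only
a sketch: vvtd_base_addr_valid() is a hypothetical helper, and
PAGE_SHIFT/PAGE_SIZE are defined locally so the snippet stands alone
(in Xen they come from asm/page.h).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for Xen's definitions, assuming 4K pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Hypothetical helper: validate the guest-supplied register window. */
static inline bool vvtd_base_addr_valid(uint64_t base, uint64_t length)
{
    /* The register page must start on a page boundary. */
    if ( base & (PAGE_SIZE - 1) )
        return false;

    /* Only a single page is supported, and the range must not wrap. */
    if ( length != PAGE_SIZE || base + length < base )
        return false;

    return true;
}
```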

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction
  2017-08-23  7:45   ` Roger Pau Monné
@ 2017-08-23  8:02     ` Lan Tianyu
  2017-08-23 13:53       ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  8:02 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On 2017-08-23 15:45, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:11PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> If guest is configured to have a vIOMMU, create it during domain construction.
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  tools/libxl/libxl_x86.c | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
>> index 455f6f0..ace20e5 100644
>> --- a/tools/libxl/libxl_x86.c
>> +++ b/tools/libxl/libxl_x86.c
>> @@ -341,8 +341,36 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
>>      if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM) {
> 
> I would rather change this check so it's:
> 
> d_config->b_info.type != LIBXL_DOMAIN_TYPE_PV
> 
> Is there any reason why PVH guests shouldn't get a vIOMMU?

No, but we currently only support vIOMMU for HVM guests and don't know how
a PVH guest would enumerate the vIOMMU without an ACPI DMAR table.

-- 
Best regards
Tianyu Lan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-23  7:52     ` Lan Tianyu
@ 2017-08-23  8:04       ` Roger Pau Monné
  2017-08-23 14:11         ` Roger Pau Monné
  2017-08-24  2:33         ` Lan Tianyu
  0 siblings, 2 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  8:04 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 23, 2017 at 03:52:01PM +0800, Lan Tianyu wrote:
> On 2017-08-23 00:41, Roger Pau Monné wrote:
> >> > +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
> >> > +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
> >> > +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
> >> > +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
> >> > +    drhd->pci_segment = 0;
> >> > +    drhd->base_address = config->iommu_base_addr;
> >> > +
> >> > +    scope = &drhd->scope[0];
> >> > +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
> >> > +    scope->length = ioapic_scope_size;
> >> > +    scope->enumeration_id = config->ioapic_id;
> >> > +    scope->bus = I440_PSEUDO_BUS_PLATFORM;
> >> > +    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;
> > I'm not sure whether this constants should instead be fields in the
> > acpi_config struct passed down from libxl. libxc shouldn't really need
> > to know anything about which chipset a VM is using.
> 
> How about rename I440_PSEUDO_XXX to VIOMMU_PSEUDO_XXX?

I'm not really complaining about the naming, I'm just saying that I'm
not sure whether these constants should live in libxl. It would be
better IMHO if they were defined in some libxl x86 specific header,
and passed to libxc inside of the acpi_config struct.

In the end it is libxl which decides which chipset the VM is going to
use, not libxc.

Roger.


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-23  2:36     ` Lan Tianyu
@ 2017-08-23  8:07       ` Wei Liu
  0 siblings, 0 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-23  8:07 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 23, 2017 at 10:36:45AM +0800, Lan Tianyu wrote:
> On 2017-08-22 21:12, Wei Liu wrote:
> > On Wed, Aug 09, 2017 at 04:34:08PM -0400, Lan Tianyu wrote:
> >> From: Chao Gao <chao.gao@intel.com>
> >>
> >> The BIOS reports the remapping hardware units in a platform to system software
> >> through the DMA Remapping Reporting (DMAR) ACPI table.
> >> New fields are introduced for the DMAR table. These new fields are set by
> >> toolstack through parsing guest's config file. construct_dmar() is added to
> >> build DMAR table according to the new fields.
> >>
> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> >> ---
> >>  tools/libacpi/build.c   | 57 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>  tools/libacpi/libacpi.h |  9 ++++++++
> >>  2 files changed, 66 insertions(+)
> >>
> >> diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
> >> index f9881c9..c7cc784 100644
> >> --- a/tools/libacpi/build.c
> >> +++ b/tools/libacpi/build.c
> >> @@ -28,6 +28,10 @@
> >>  
> >>  #define ACPI_MAX_SECONDARY_TABLES 16
> >>  
> >> +#define VTD_HOST_ADDRESS_WIDTH 39
> >> +#define I440_PSEUDO_BUS_PLATFORM 0xff
> >> +#define I440_PSEUDO_DEVFN_IOAPIC 0x0
> > 
> > I have some stupid questions. What are these I440 values? Where do they
> > come from?
> > 
> 
> Each IOAPIC device in the system reported via ACPI MADT must be
> explicitly enumerated under one specific remapping hardware unit. We
> assigned the IOAPIC unit to BDF ff:00, with reference to the Qemu vIOMMU implementation.

OK, do they need to be somewhere in a public header?

(See Roger's comment in the other sub-thread)


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-08-23  2:46     ` Lan Tianyu
@ 2017-08-23  8:09       ` Wei Liu
  0 siblings, 0 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-23  8:09 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 23, 2017 at 10:46:13AM +0800, Lan Tianyu wrote:
> On 2017-08-22 21:19, Wei Liu wrote:
> >> +=over 4
> >> > +
> >> > +=item B<KEY=VALUE>
> >> > +
> >> > +Possible B<KEY>s are:
> >> > +
> >> > +=over 4
> >> > +
> >> > +=item B<type="STRING">
> >> > +
> >> > +Currently there is only one valid type:
> >> > +
> >> > +(x86 only) "intel_vtd" means providing a emulated Intel VT-d to the guest.
> >> > +
> >> > +=item B<intremap=BOOLEAN>
> >> > +
> >> > +Specifies whether the vIOMMU should support interrupt remapping
> >> > +and default 'true'.
> >> > +
> >> > +=item B<x2apic=BOOLEAN>
> >> > +
> >> > +Specifies whether the vIOMMU should support x2apic mode and default 'true'.
> >> > +Only valid for "intel_vtd".
> > Why not expose base address and length as well?
> 
> The "base address" and "length" of the vIOMMU register region are low-level
> vIOMMU configuration. I am afraid it is very hard for users to determine which
> region is suitable for the vIOMMU and doesn't conflict with other device models.

That's fair. Assuming those two values are hardware specific (as I read
in another sub-thread) I'm fine with not exposing them (should they be
needed at all).


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 12/25] x86/vvtd: Add MMIO handler for VVTD
  2017-08-09 20:34 ` [PATCH V2 12/25] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
@ 2017-08-23  8:27   ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  8:27 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:13PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This patch adds VVTD MMIO handler to deal with MMIO access.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/vvtd.c | 114 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 114 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 353fafe..94680e6 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -50,6 +50,38 @@ struct vvtd {
>      struct page_info *regs_page;
>  };
>  
> +#define __DEBUG_VVTD__
> +#ifdef __DEBUG_VVTD__

Why do you need this define? You can use NDEBUG which is the global
Xen debug define.

> +extern unsigned int vvtd_debug_level;
> +#define VVTD_DBG_INFO     1
> +#define VVTD_DBG_TRANS    (1<<1)
> +#define VVTD_DBG_RW       (1<<2)
> +#define VVTD_DBG_FAULT    (1<<3)
> +#define VVTD_DBG_EOI      (1<<4)
> +#define VVTD_DEBUG(lvl, _f, _a...) do { \
> +    if ( vvtd_debug_level & lvl ) \
> +        printk("VVTD %s:" _f "\n", __func__, ## _a);    \
> +} while(0)
> +#else
> +#define VVTD_DEBUG(fmt...) do {} while(0)
> +#endif
> +
> +unsigned int vvtd_debug_level __read_mostly;
> +integer_param("vvtd_debug", vvtd_debug_level);

I think this should be a top level option for viommu, not a vtd
specific one, like it's done for iommu. I would prefer to have
something like:

viommu=verbose,[...]

So that we can add further options to it in the future, and that
should be shared between all the vIOMMU implementations.

> +
> +struct vvtd *domain_vvtd(struct domain *d)
> +{
> +    struct viommu_info *info = &d->viommu;
> +
> +    BUILD_BUG_ON(NR_VIOMMU_PER_DOMAIN != 1);
> +    return (info && info->viommu[0]) ? info->viommu[0]->priv : NULL;
> +}
> +
> +static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
> +{
> +    return domain_vvtd(v->domain);
> +}

Do you really need the above helper? Seems quite trivial to directly
open code the call to domain_vvtd.

> +
>  static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
>                                  uint32_t value)
>  {
> @@ -76,6 +108,87 @@ static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
>      vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
>  } while(0)
>  
> +static int vvtd_range(struct vcpu *v, unsigned long addr)

bool and maybe vvtd_in_range is more descriptive.
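
As a rough illustration of the suggested signature, assuming a
cut-down struct vvtd with only the field needed here and a locally
defined PAGE_SIZE (both are stand-ins, not the real Xen definitions).
The sketch also takes the vvtd pointer directly instead of a vcpu, to
keep it self-contained:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL /* local stand-in for Xen's definition */

/* Cut-down stand-in for the patch's struct vvtd. */
struct vvtd {
    uint64_t base_addr;
};

/* Returns true when addr falls inside the single vIOMMU register page. */
static bool vvtd_in_range(const struct vvtd *vvtd, unsigned long addr)
{
    return vvtd && addr >= vvtd->base_addr &&
           addr < vvtd->base_addr + PAGE_SIZE;
}
```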

> +{
> +    struct vvtd *vvtd = vcpu_vvtd(v);
> +
> +    if ( vvtd )
> +        return (addr >= vvtd->base_addr) &&
> +               (addr < vvtd->base_addr + PAGE_SIZE);

So here you simply hardcode PAGE_SIZE, which makes me think that the
size parameter is indeed not needed.

> +    return 0;
> +}
> +
> +static int vvtd_read(struct vcpu *v, unsigned long addr,
> +                     unsigned int len, unsigned long *pval)
> +{
> +    struct vvtd *vvtd = vcpu_vvtd(v);
> +    unsigned int offset = addr - vvtd->base_addr;
> +    unsigned int offset_aligned = offset & ~3;

This is not needed IMHO.

> +
> +    VVTD_DEBUG(VVTD_DBG_RW, "READ INFO: offset %x len %d.", offset, len);
> +
> +    if ( !pval )
> +        return X86EMUL_UNHANDLEABLE;

I don't think you can get a NULL pval here.

> +
> +    if ( (offset & 3) || ((len != 4) && (len != 8)) )

Do you really intend to allow non-aligned 8 byte accesses? If so my
previous recommendation to use a union for the vvtd struct data field
is moot.

> +    {
> +        VVTD_DEBUG(VVTD_DBG_RW, "Alignment or length is not canonical");
> +        return X86EMUL_UNHANDLEABLE;

IMHO you should not return X86EMUL_UNHANDLEABLE here. The read does
indeed belong to the vIOMMU region, so what does bare-metal hardware
do when a non-aligned or non size compliant read is performed?

> +    }
> +
> +    if ( len == 4 )
> +        *pval = vvtd_get_reg(vvtd, offset_aligned);

You can just use offset here and below, the check that you do above
guarantees that offset & 3 == 0 (ie: at this point offset ==
offset_aligned).

> +    else
> +        vvtd_get_reg_quad(vvtd, offset_aligned, *pval);

Newline

> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write(struct vcpu *v, unsigned long addr,
> +                      unsigned int len, unsigned long val)
> +{
> +    struct vvtd *vvtd = vcpu_vvtd(v);
> +    unsigned int offset = addr - vvtd->base_addr;
> +    unsigned int offset_aligned = offset & ~0x3;
> +    int ret;
> +
> +    VVTD_DEBUG(VVTD_DBG_RW, "WRITE INFO: offset %x len %d val %lx.",
> +               offset, len, val);
> +
> +    if ( (offset & 3) || ((len != 4) && (len != 8)) )

Same comment about alignment.

> +    {
> +        VVTD_DEBUG(VVTD_DBG_RW, "Alignment or length is not canonical");
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +
> +    ret = X86EMUL_UNHANDLEABLE;

Same here, you should not return X86EMUL_UNHANDLEABLE but instead do
what the native hardware would do, which I guess is to just ignore the
write?

> +    if ( len == 4 )
> +    {
> +        switch ( offset_aligned )
> +        {
> +        case DMAR_IEDATA_REG:
> +        case DMAR_IEADDR_REG:
> +        case DMAR_IEUADDR_REG:
> +        case DMAR_FEDATA_REG:
> +        case DMAR_FEADDR_REG:
> +        case DMAR_FEUADDR_REG:
> +            vvtd_set_reg(vvtd, offset_aligned, val);
> +            ret = X86EMUL_OKAY;
> +            break;
> +
> +        default:
> +            break;
> +        }
> +    }
> +
> +    return ret;
> +}
> +
> +static const struct hvm_mmio_ops vvtd_mmio_ops = {
> +    .check = vvtd_range,
> +    .read = vvtd_read,
> +    .write = vvtd_write
> +};
> +
>  static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
>  {
>      uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
> @@ -122,6 +235,7 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
>      vvtd->length = viommu->length;
>      vvtd->domain = d;
>      vvtd->status = VIOMMU_STATUS_DEFAULT;
> +    register_mmio_handler(d, &vvtd_mmio_ops);

Newline.

Roger.


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-23  5:35           ` Lan Tianyu
@ 2017-08-23  8:34             ` Wei Liu
  2017-08-24  3:24               ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-23  8:34 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Gao, Chao

On Wed, Aug 23, 2017 at 01:35:17PM +0800, Lan Tianyu wrote:
> On 2017-08-22 21:48, Wei Liu wrote:
> >> > Hi, Wei
> >> > Thanks for your comments.
> >> > 
> >> > iirc, HVM only supports one module; DMAR cannot be a new module. Joining to
> >> > the existing one is the approach we are taking. 
> >> > 
> >> > Which kind of conflicts do you think should be resolved? If you mean that I
> >> > forgot to free the old buf, I will fix this. If you mean the potential
> >> > overlap between the binary passed by the admin and the DMAR table built here,
> >> > I don't have much of an idea on this. Even without the DMAR table, the binary
> >> > may contain MADT or other tables, and the tool stack doesn't interpret the
> >> > binary and check whether there are conflicts, right?
> >> > 
> > Thinking a bit more about this, when I first said "conflicts" I didn't
> > mean to parse the content. I was referring to the code in
> > libxl_x86_acpi.c which also seems to manipulate acpi_modules.
> 
> Code in libxl_x86_acpi.c works for Hvmlite/PVHv2. The code we added is
> for hvm guest.
> 

That's correct for the code as-is, but what is preventing the code there
from working with HVM, assuming correct checks and branches are added
in the appropriate places?

I'm against having multiple locations doing things that could
potentially clash with each other. In the foreseeable future PVH is
going to need similar functionality.

My expectation is that the existing code is taken into consideration
and that contributors figure out whether and how it can be modified to
suit their needs. If everyone is doing their own thing in their own
little function, Xen will eventually become unmaintainable.

> > 
> > I would like the code to generate dmar take into consideration
> > libxl__dom_load_acpi.
> > 
> 
> If we add a DMAR table for hvmlite, we should combine the DMAR table with the
> other ACPI tables and populate it into acpi_modules[2]. This is how hvmlite adds
> other ACPI tables in libxl__dom_load_acpi().
> 

Sure, that sounds plausible.

What I would like to see is to have one entry point to manipulate ACPI
tables.

Given the patch volume we're seeing now, we expect contributors to drive
the discussion forward. If you're not sure, feel free to ask more questions.

> 
> -- 
> Best regards
> Tianyu Lan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 13/25] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  2017-08-09 20:34 ` [PATCH V2 13/25] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
@ 2017-08-23  8:47   ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  8:47 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:14PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Software sets this field to set/update the interrupt remapping table pointer
> used by hardware. The interrupt remapping table pointer is specified through
> the Interrupt Remapping Table Address (IRTA_REG) register.
> 
> This patch emulates this operation and adds some new fields in VVTD to track
> info (e.g. the table's gfn and max supported entries) of interrupt remapping
> table.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  9 ++++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 73 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 81 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index 55f3b6e..102b4f3 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -192,9 +192,16 @@
>  #define DMA_GSTS_WBFS   (((u64)1) << 27)
>  #define DMA_GSTS_QIES   (((u64)1) <<26)
>  #define DMA_GSTS_IRES   (((u64)1) <<25)
> -#define DMA_GSTS_SIRTPS (((u64)1) << 24)
> +#define DMA_GSTS_SIRTPS_BIT     24

Those kinds of defines are usually suffixed with _SHIFT, not _BIT.

> +#define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_BIT)
>  #define DMA_GSTS_CFIS   (((u64)1) <<23)
>  
> +/* IRTA_REG */
> +#define DMA_IRTA_ADDR(val)      (val & ~0xfffULL)
> +#define DMA_IRTA_EIME(val)      (!!(val & (1 << 11)))

No need for the outer parentheses.

> +#define DMA_IRTA_S(val)         (val & 0xf)
> +#define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))

All those defines above seem kind of magic. Can we maybe get a small
comment describing their meaning?
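
For instance, something like the sketch below. The field layout is the
one the VT-d specification gives for the Interrupt Remapping Table
Address Register; the macro bodies here additionally parenthesize val
and use 1ULL, which are changes relative to the posted patch.

```c
#include <stdint.h>

/*
 * IRTA_REG (Interrupt Remapping Table Address Register) layout:
 *   bits 63:12  IRTA - 4K-aligned base address of the remapping table
 *   bit  11     EIME - Extended Interrupt Mode Enable (x2APIC mode)
 *   bits  3:0   S    - size field; the table holds 2^(S + 1) entries
 */
#define DMA_IRTA_ADDR(val)  ((val) & ~0xfffULL)
#define DMA_IRTA_EIME(val)  (!!((val) & (1ULL << 11)))
#define DMA_IRTA_S(val)     ((val) & 0xf)
#define DMA_IRTA_SIZE(val)  (1UL << (DMA_IRTA_S(val) + 1))
```

As an example, a register value of 0xabcde807 decodes to a table at
0xabcde000 with EIME set and S = 7, i.e. 256 entries.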

>  /* PMEN_REG */
>  #define DMA_PMEN_EPM    (((u32)1) << 31)
>  #define DMA_PMEN_PRS    (((u32)1) << 0)
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 94680e6..8e8dbe6 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -46,6 +46,13 @@ struct vvtd {
>      uint64_t length;
>      /* Point back to the owner domain */
>      struct domain *domain;
> +    /* Is in Extended Interrupt Mode? */
> +    bool eim;
> +    /* Max remapping entries in IRT */
> +    int irt_max_entry;

unsigned int.

> +    /* Interrupt remapping table base gfn */
> +    uint64_t irt;

If it's a gfn you should use gfn_t.

> +
>      struct hvm_hw_vvtd_regs *regs;
>      struct page_info *regs_page;
>  };
> @@ -82,6 +89,11 @@ static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
>      return domain_vvtd(v->domain);
>  }
>  
> +static inline void __vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)

No leading underscores, and unsigned int for nr.

> +{
> +    return __set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);

Why the return?

> +}
> +
>  static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
>                                  uint32_t value)
>  {
> @@ -108,6 +120,44 @@ static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
>      vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
>  } while(0)
>  
> +static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
> +{
> +    uint64_t irta;
> +
> +    if ( !(val & DMA_GCMD_SIRTP) )
> +        return X86EMUL_OKAY;
> +
> +    vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
> +    vvtd->irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
> +    vvtd->irt_max_entry = DMA_IRTA_SIZE(irta);
> +    vvtd->eim = DMA_IRTA_EIME(irta);
> +    VVTD_DEBUG(VVTD_DBG_RW, "Update IR info (addr=%lx eim=%d size=%d).",
> +               vvtd->irt, vvtd->eim, vvtd->irt_max_entry);
> +    __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_SIRTPS_BIT);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
> +{
> +    uint32_t orig = vvtd_get_reg(vvtd, DMAR_GSTS_REG);
> +    uint32_t changed;
> +
> +    orig = orig & 0x96ffffff;    /* reset the one-shot bits */

Some kind of define for this magic value is needed.
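
One possible shape for such a define, as a sketch. The complement of
0x96ffffff is bits 30, 29, 27 and 24, which are SRTP, SFL, WBF and
SIRTP in GCMD_REG. DMA_GCMD_ONE_SHOT_MASK and vvtd_gsts_persistent()
are hypothetical names, and the DMA_GCMD_* bits are redefined locally
as 32-bit values only so the snippet stands alone.

```c
#include <stdint.h>

/* GCMD_REG one-shot command bits (local 32-bit stand-ins). */
#define DMA_GCMD_SRTP   (1U << 30) /* Set Root Table Pointer */
#define DMA_GCMD_SFL    (1U << 29) /* Set Fault Log */
#define DMA_GCMD_WBF    (1U << 27) /* Write Buffer Flush */
#define DMA_GCMD_SIRTP  (1U << 24) /* Set Interrupt Remap Table Pointer */

/* Hypothetical name for the bits that do not persist across writes. */
#define DMA_GCMD_ONE_SHOT_MASK \
    (DMA_GCMD_SRTP | DMA_GCMD_SFL | DMA_GCMD_WBF | DMA_GCMD_SIRTP)

/* Check that the named mask matches the magic value in the patch. */
_Static_assert((uint32_t)~DMA_GCMD_ONE_SHOT_MASK == 0x96ffffffU,
               "one-shot mask must match the literal used in the patch");

/* Strip the one-shot bits from a GSTS value before comparing with GCMD. */
static inline uint32_t vvtd_gsts_persistent(uint32_t gsts)
{
    return gsts & ~DMA_GCMD_ONE_SHOT_MASK;
}
```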

> +    changed = orig ^ val;
> +
> +    if ( !changed )
> +        return X86EMUL_OKAY;
> +    if ( (changed & (changed - 1)) )
> +        VVTD_DEBUG(VVTD_DBG_RW, "Guest attempts to update multiple fields "
> +                     "of GCMD_REG in one write transation.");

Since this is a debug message it would be good to print at least the
value of changed, or possibly even better the values of both orig and
val.

> +
> +    if ( changed & DMA_GCMD_SIRTP )
> +        vvtd_handle_gcmd_sirtp(vvtd, val);
> +
> +    return X86EMUL_OKAY;
> +}
> +
>  static int vvtd_range(struct vcpu *v, unsigned long addr)
>  {
>      struct vvtd *vvtd = vcpu_vvtd(v);
> @@ -165,12 +215,18 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>      {
>          switch ( offset_aligned )
>          {
> +        case DMAR_GCMD_REG:
> +            ret = vvtd_write_gcmd(vvtd, val);
> +            break;
> +
>          case DMAR_IEDATA_REG:
>          case DMAR_IEADDR_REG:
>          case DMAR_IEUADDR_REG:
>          case DMAR_FEDATA_REG:
>          case DMAR_FEADDR_REG:
>          case DMAR_FEUADDR_REG:
> +        case DMAR_IRTA_REG:
> +        case DMAR_IRTA_REG_HI:
>              vvtd_set_reg(vvtd, offset_aligned, val);
>              ret = X86EMUL_OKAY;
>              break;
> @@ -179,6 +235,20 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>              break;
>          }
>      }
> +    else /* len == 8 */
> +    {
> +        switch ( offset_aligned )
> +        {
> +        case DMAR_IRTA_REG:
> +            vvtd_set_reg_quad(vvtd, DMAR_IRTA_REG, val);
> +            ret = X86EMUL_OKAY;
> +            break;
> +
> +        default:
> +            ret = X86EMUL_UNHANDLEABLE;

Same here, you should not return X86EMUL_UNHANDLEABLE but instead
mimic what native hardware would do in such a case.

> +            break;
> +        }
> +    }
>  
>      return ret;
>  }
> @@ -235,6 +305,9 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
>      vvtd->length = viommu->length;
>      vvtd->domain = d;
>      vvtd->status = VIOMMU_STATUS_DEFAULT;
> +    vvtd->eim = 0;
> +    vvtd->irt = 0;
> +    vvtd->irt_max_entry = 0;

Maybe you should consider using xzalloc which will already initialize
this to 0.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-23  7:22       ` Roger Pau Monné
@ 2017-08-23  9:12         ` Lan Tianyu
  2017-08-23 10:19         ` Julien Grall
  1 sibling, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  9:12 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On 2017-08-23 15:22, Roger Pau Monné wrote:
> On Wed, Aug 23, 2017 at 02:06:17PM +0800, Lan Tianyu wrote:
>> Hi Roger:
>> 	Thanks for your review.
>>
>> On 2017-08-22 22:32, Roger Pau Monné wrote:
>>> On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
>>>> +
>>>> +/* vIOMMU capabilities */
>>>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>>>> +
>>>> +struct xen_domctl_viommu_op {
>>>> +    uint32_t cmd;
>>>> +#define XEN_DOMCTL_create_viommu          0
>>>> +#define XEN_DOMCTL_destroy_viommu         1
>>>> +#define XEN_DOMCTL_query_viommu_caps      2
>>>> +    union {
>>>> +        struct {
>>>> +            /* IN - vIOMMU type */
>>>> +            uint64_t viommu_type;
>>>> +            /* 
>>>> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
>>>> +             * are in charge of to check base_address and length.
>>>> +             */
>>>> +            uint64_t base_address;
>>>> +            /* IN - Length of MMIO region */
>>>> +            uint64_t length;
>>>
>>> It seems weird that you can specify the length, is that something
>>> that a user would like to set? Isn't the length of the IOMMU MMIO
>>> region fixed by the hardware spec?
>>
>> Different vendors may have different IOMMU register region sizes (e.g.
>> VT-d uses one page for its register region). The length field is meant
>> to keep the vIOMMU device model from abusing the address space. Some
>> registers' offsets are reported by other registers, and these offsets
>> are emulated by the vIOMMU device model. If it's not necessary, we can
>> remove it and add it back when there is a real requirement for it.
> 
> So from my understanding the size of the IOMMU MMIO region is implicit
> in the IOMMU type that the user chooses. I don't think this field is
> needed.

OK. Will remove it.

-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-23  7:42     ` Lan Tianyu
@ 2017-08-23  9:24       ` Jan Beulich
  2017-08-23  9:47         ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Jan Beulich @ 2017-08-23  9:24 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, Roger Pau Monné,
	chao.gao

>>> On 23.08.17 at 09:42, <tianyu.lan@intel.com> wrote:
> On 2017-08-22 23:32, Roger Pau Monné wrote:
>> On Wed, Aug 09, 2017 at 04:34:03PM -0400, Lan Tianyu wrote:
>>> +static inline void irq_request_ioapic_fill(struct irq_remapping_request *req,
>>> +                             uint32_t ioapic_id, uint64_t rte)
>>> +{
>>> +    ASSERT(req);
>>> +    req->type = VIOMMU_REQUEST_IRQ_APIC;
>>> +    req->source_id = ioapic_id;
>>> +    req->msg.rte = rte;
>>> +}
>>> +
>>> +static inline void irq_request_msi_fill(struct irq_remapping_request *req,
>>> +                          uint32_t source_id, uint64_t addr, uint32_t data)
>>> +{
>>> +    ASSERT(req);
>>> +    req->type = VIOMMU_REQUEST_IRQ_MSI;
>>> +    req->source_id = source_id;
>>> +    req->msg.msi.addr = addr;
>>> +    req->msg.msi.data = data;
>>> +}
>> 
>> What's the usage of those two functions? AFAICT they don't have any
>> callers in this patch.
> 
> These functions will be called in the following interrupt patch 22
> "x86/vmsi: Hook delivering remapping format msi to guest" and patch 16
> "x86/vioapic: Hook interrupt delivery of vIOAPIC"

That's _far_ away. As implied by Roger's comment, please try to
avoid introducing dead code, especially when it's dead for an
extended period of time. Always remember that a series may not
be committed in one go.

Jan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-08-22 15:38   ` Roger Pau Monné
  2017-08-23  7:43     ` Lan Tianyu
@ 2017-08-23  9:25     ` Jan Beulich
  1 sibling, 0 replies; 136+ messages in thread
From: Jan Beulich @ 2017-08-23  9:25 UTC (permalink / raw)
  To: Roger Pau Monné, Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, chao.gao

>>> On 22.08.17 at 17:38, <roger.pau@citrix.com> wrote:
> On Wed, Aug 09, 2017 at 04:34:04PM -0400, Lan Tianyu wrote:
>> --- a/xen/common/viommu.c
>> +++ b/xen/common/viommu.c
>> @@ -213,6 +213,22 @@ int viommu_handle_irq_request(struct domain *d, u32 viommu_id,
>>      return info->viommu[viommu_id]->ops->handle_irq_request(d, request);
>>  }
>>  
>> +int viommu_get_irq_info(struct domain *d, u32 viommu_id,
>> +                        struct irq_remapping_request *request,
>> +                        struct irq_remapping_info *irq_info)
> 
> The definition of this struct seems to be arch-specific, in which case
> IMHO it should be called arch_irq_remapping_info, in order to denote
> it's arch-specific.

In which case it also wouldn't belong in this file.

Jan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping
  2017-08-23  9:24       ` Jan Beulich
@ 2017-08-23  9:47         ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-23  9:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, Roger Pau Monné,
	chao.gao

On 2017-08-23 17:24, Jan Beulich wrote:
>>>> On 23.08.17 at 09:42, <tianyu.lan@intel.com> wrote:
>> On 2017-08-22 23:32, Roger Pau Monné wrote:
>>> On Wed, Aug 09, 2017 at 04:34:03PM -0400, Lan Tianyu wrote:
>>>> +static inline void irq_request_ioapic_fill(struct irq_remapping_request *req,
>>>> +                             uint32_t ioapic_id, uint64_t rte)
>>>> +{
>>>> +    ASSERT(req);
>>>> +    req->type = VIOMMU_REQUEST_IRQ_APIC;
>>>> +    req->source_id = ioapic_id;
>>>> +    req->msg.rte = rte;
>>>> +}
>>>> +
>>>> +static inline void irq_request_msi_fill(struct irq_remapping_request *req,
>>>> +                          uint32_t source_id, uint64_t addr, uint32_t data)
>>>> +{
>>>> +    ASSERT(req);
>>>> +    req->type = VIOMMU_REQUEST_IRQ_MSI;
>>>> +    req->source_id = source_id;
>>>> +    req->msg.msi.addr = addr;
>>>> +    req->msg.msi.data = data;
>>>> +}
>>>
>>> What's the usage of those two functions? AFAICT they don't have any
>>> callers in this patch.
>>
>> These functions will be called in the following interrupt patch 22
>> "x86/vmsi: Hook delivering remapping format msi to guest" and patch 16
>> "x86/vioapic: Hook interrupt delivery of vIOAPIC"
> 
> That's _far_ away. As implied by Roger's comment, please try to
> avoid introducing dead code, especially when it's dead for an
> extended period of time. Always remember that a series may not
> be committed in one go.
OK. Will change order.

-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 14/25] x86/vvtd: Process interrupt remapping request
  2017-08-09 20:34 ` [PATCH V2 14/25] x86/vvtd: Process interrupt remapping request Lan Tianyu
@ 2017-08-23  9:49   ` Roger Pau Monné
  2017-08-23  9:59     ` Jan Beulich
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  9:49 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:15PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> When a remapping interrupt request arrives, remapping hardware computes the
> interrupt_index per the algorithm described in VTD spec
> "Interrupt Remapping Table", interprets the IRTE and generates a remapped
> interrupt request.
> 
> This patch introduces viommu_handle_irq_request() to emulate the process how
> remapping hardware handles a remapping interrupt request.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  21 +++
>  xen/drivers/passthrough/vtd/vtd.h   |   6 +
>  xen/drivers/passthrough/vtd/vvtd.c  | 276 +++++++++++++++++++++++++++++++++++-
>  3 files changed, 302 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index 102b4f3..70e64cf 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -244,6 +244,21 @@
>  #define dma_frcd_source_id(c) (c & 0xffff)
>  #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
>  
> +enum VTD_FAULT_TYPE
> +{
> +    /* Interrupt remapping transition faults */
> +    VTD_FR_IR_REQ_RSVD = 0x20,   /* One or more IR request reserved
> +                                  * fields set */
> +    VTD_FR_IR_INDEX_OVER = 0x21, /* Index value greater than max */
> +    VTD_FR_IR_ENTRY_P = 0x22,    /* Present (P) not set in IRTE */
> +    VTD_FR_IR_ROOT_INVAL = 0x23, /* IR Root table invalid */
> +    VTD_FR_IR_IRTE_RSVD = 0x24,  /* IRTE Rsvd field non-zero with
> +                                  * Present flag set */
> +    VTD_FR_IR_REQ_COMPAT = 0x25, /* Encountered compatible IR
> +                                  * request while disabled */
> +    VTD_FR_IR_SID_ERR = 0x26,    /* Invalid Source-ID */
> +};

Could you please align the values, like:

enum VTD_FAULT_TYPE
{
    /* Interrupt remapping transition faults */
    VTD_FR_IR_REQ_RSVD    = 0x20, /* One or more IR request reserved
                                   * fields set */
    VTD_FR_IR_INDEX_OVER  = 0x21, /* Index value greater than max */


> +
>  /*
>   * 0: Present
>   * 1-11: Reserved
> @@ -384,6 +399,12 @@ struct iremap_entry {
>  };
>  
>  /*
> + * When VT-d doesn't enable Extended Interrupt Mode. Hardware only interprets
> + * only 8-bits ([15:8]) of Destination-ID field in the IRTEs.

The above comment needs to be rewritten. My knowledge of VT-d is
limited, what is IRTE referring to? I thought it was referring to the
IO APIC redirection table registers, but the mask then doesn't make
sense.

> + */
> +#define IRTE_xAPIC_DEST_MASK 0xff00
> +
> +/*
>   * Posted-interrupt descriptor address is 64 bits with 64-byte aligned, only
>   * the upper 26 bits of lest significiant 32 bits is available.
>   */
> diff --git a/xen/drivers/passthrough/vtd/vtd.h b/xen/drivers/passthrough/vtd/vtd.h
> index bb8889f..1032b46 100644
> --- a/xen/drivers/passthrough/vtd/vtd.h
> +++ b/xen/drivers/passthrough/vtd/vtd.h
> @@ -47,6 +47,8 @@ struct IO_APIC_route_remap_entry {
>      };
>  };
>  
> +#define IOAPIC_REMAP_ENTRY_INDEX(x) ((x.index_15 << 15) + x.index_0_14)

Could you use entry instead of x? And you should add parentheses
around its usage in the macro.
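
A minimal sketch of the parenthesised form, using a simplified and
hypothetical stand-in for the remap-format RTE (only the two index fields
are modelled; the real structure has more). The EX_/ex_ names are
illustrative only:

```c
#include <stdint.h>

/* Simplified, hypothetical stand-in: only the two index fields. */
struct ex_remap_rte {
    unsigned int index_15:1;
    unsigned int index_0_14:15;
};

/* Parenthesised argument: works for `rte`, `*p`, `table[i]` and so on. */
#define EX_REMAP_ENTRY_INDEX(entry) \
    (((uint32_t)(entry).index_15 << 15) + (entry).index_0_14)

static inline uint32_t ex_index_via_pointer(const struct ex_remap_rte *p)
{
    /* Without the parentheses, `*p.index_15` would fail to compile. */
    return EX_REMAP_ENTRY_INDEX(*p);
}
```

The parentheses matter as soon as the argument is anything other than a
plain identifier, because `.` binds more tightly than unary `*`.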

> +
>  struct msi_msg_remap_entry {
>      union {
>          u32 val;
> @@ -65,4 +67,8 @@ struct msi_msg_remap_entry {
>      u32	data;		/* msi message data */
>  };
>  
> +#define MSI_REMAP_ENTRY_INDEX(x) ((x.address_lo.index_15 << 15) + \
> +                                  x.address_lo.index_0_14 + \
> +                                  (x.address_lo.SHV ? (uint16_t)x.data : 0))

Same here. And it would be clearer to place both macros together IMHO.

>  #endif // _VTD_H_
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 8e8dbe6..2bee352 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -23,11 +23,16 @@
>  #include <xen/types.h>
>  #include <xen/viommu.h>
>  #include <xen/xmalloc.h>
> +#include <asm/apic.h>
>  #include <asm/current.h>
> +#include <asm/event.h>
>  #include <asm/hvm/domain.h>
> +#include <asm/io_apic.h>
>  #include <asm/page.h>
> +#include <asm/p2m.h>
>  
>  #include "iommu.h"
> +#include "vtd.h"
>  
>  struct hvm_hw_vvtd_regs {
>      uint8_t data[1024];
> @@ -38,6 +43,9 @@ struct hvm_hw_vvtd_regs {
>  #define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED     (1 << 0)
>  #define VIOMMU_STATUS_DMA_REMAPPING_ENABLED     (1 << 1)
>  
> +#define vvtd_irq_remapping_enabled(vvtd) \
> +    (vvtd->status & VIOMMU_STATUS_IRQ_REMAPPING_ENABLED)
> +
>  struct vvtd {
>      /* VIOMMU_STATUS_XXX */
>      int status;
> @@ -120,6 +128,140 @@ static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
>      vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
>  } while(0)
>  
> +static int map_guest_page(struct domain *d, uint64_t gfn, void **virt)

You can make this function return a void * and encode the error in
there. See IS_ERR_VALUE and ERR_PTR.
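
To illustrate the idea outside of Xen, here is a self-contained sketch of
the error-pointer pattern. The ex_ helpers are local stand-ins written for
this example, not the hypervisor's own macros, and the mapper is a
hypothetical simplification of map_guest_page():

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: the top EX_MAX_ERRNO addresses are never valid mappings. */
#define EX_MAX_ERRNO 4095UL

static inline void *ex_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

static inline long ex_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int ex_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-EX_MAX_ERRNO;
}

/* Hypothetical mapper: returns the mapping or an encoded negative errno. */
static void *ex_map_guest_page(void *backing, int present)
{
    if ( !present )
        return ex_err_ptr(-EINVAL);
    if ( !backing )
        return ex_err_ptr(-ENOMEM);

    return backing;
}
```

A caller would then test the result with ex_is_err() and recover the code
with ex_ptr_err(), which removes the need for the void ** out-parameter.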

> +{
> +    struct page_info *p;
> +
> +    p = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
> +    if ( !p )
> +        return -EINVAL;
> +
> +    if ( !get_page_type(p, PGT_writable_page) )
> +    {
> +        put_page(p);
> +        return -EINVAL;
> +    }
> +
> +    *virt = __map_domain_page_global(p);
> +    if ( !*virt )
> +    {
> +        put_page_and_type(p);
> +        return -ENOMEM;
> +    }

Newline

> +    return 0;
> +}
> +
> +static void unmap_guest_page(void *virt)
> +{
> +    struct page_info *page;
> +
> +    if ( !virt )
> +        return;
> +
> +    virt = (void *)((unsigned long)virt & PAGE_MASK);

This should maybe be an ASSERT? Are you going to call unmap_guest_page
with something different than the return of map_guest_page?

> +    page = mfn_to_page(domain_page_map_to_mfn(virt));
> +
> +    unmap_domain_page_global(virt);
> +    put_page_and_type(page);
> +}
> +
> +static void vvtd_inj_irq(
> +    struct vlapic *target,
> +    uint8_t vector,
> +    uint8_t trig_mode,
> +    uint8_t delivery_mode)

Indentation.

> +{
> +    VVTD_DEBUG(VVTD_DBG_INFO, "dest=v%d, delivery_mode=%x vector=%d "
> +               "trig_mode=%d.",
> +               vlapic_vcpu(target)->vcpu_id, delivery_mode,
> +               vector, trig_mode);
> +
> +    ASSERT((delivery_mode == dest_Fixed) ||
> +           (delivery_mode == dest_LowestPrio));
> +
> +    vlapic_set_irq(target, vector, trig_mode);
> +}
> +
> +static int vvtd_delivery(
> +    struct domain *d, int vector,

uint8_t vector?

> +    uint32_t dest, uint8_t dest_mode,
> +    uint8_t delivery_mode, uint8_t trig_mode)

Indentation.

> +{
> +    struct vlapic *target;
> +    struct vcpu *v;
> +
> +    switch ( delivery_mode )
> +    {
> +    case dest_LowestPrio:
> +        target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
> +        if ( target != NULL )
> +        {
> +            vvtd_inj_irq(target, vector, trig_mode, delivery_mode);
> +            break;
> +        }
> +        VVTD_DEBUG(VVTD_DBG_INFO, "null round robin: vector=%02x\n", vector);
> +        break;
> +
> +    case dest_Fixed:
> +        for_each_vcpu ( d, v )
> +            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest,
> +                                   dest_mode) )
> +                vvtd_inj_irq(vcpu_vlapic(v), vector,
> +                             trig_mode, delivery_mode);
> +        break;
> +
> +    case dest_NMI:
> +        for_each_vcpu ( d, v )
> +            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode)
> +                 && !test_and_set_bool(v->nmi_pending) )
> +                vcpu_kick(v);
> +        break;
> +
> +    default:
> +        printk(XENLOG_G_WARNING
> +               "%pv: Unsupported VTD delivery mode %d for Dom%d\n",
> +               current, delivery_mode, d->domain_id);

gprintk? Or at least this should be rate-limited somehow, it seems
like the guest can trigger such behavior?

> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static uint32_t irq_remapping_request_index(struct irq_remapping_request *irq)

irq should be const.

> +{
> +    if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
> +    {
> +        struct msi_msg_remap_entry msi_msg = { { irq->msg.msi.addr }, 0,
> +                                               irq->msg.msi.data };

I would prefer that you use designated initializers, i.e.:

struct msi_msg_remap_entry msi_msg =
{
    .address_lo = { .val = irq->msg.msi.addr },
    .data = irq->msg.msi.data,
};

> +
> +        return MSI_REMAP_ENTRY_INDEX(msi_msg);
> +    }
> +    else if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
> +    {
> +        struct IO_APIC_route_remap_entry remap_rte = { { irq->msg.rte } };
> +
> +        return IOAPIC_REMAP_ENTRY_INDEX(remap_rte);
> +    }
> +    BUG();

ASSERT_UNREACHABLE();

And newline.

> +    return 0;
> +}
> +
> +static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)

vvtd should be const.

> +{
> +    uint64_t irta;
> +
> +    vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
> +    /* In xAPIC mode, only 8-bits([15:8]) are valid */
> +    return DMA_IRTA_EIME(irta) ? dest : MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
> +}
> +
> +static int vvtd_record_fault(struct vvtd *vvtd,
> +                             struct irq_remapping_request *irq,
> +                             int reason)
> +{
> +    return 0;
> +}

I guess this is going to be expanded later in the series? Seems
pointless to introduce it now.

> +
>  static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
>  {
>      uint64_t irta;
> @@ -259,6 +401,137 @@ static const struct hvm_mmio_ops vvtd_mmio_ops = {
>      .write = vvtd_write
>  };
>  
> +static bool ir_sid_valid(struct iremap_entry *irte, uint32_t source_id)
> +{
> +    return true;
> +}

Same here, it's pointless to introduce such helper now if it's a dummy
one anyway.

> +
> +/*
> + * 'record_fault' is a flag to indicate whether we need recording a fault
> + * and notifying guest when a fault happens during fetching vIRTE.
                                                      ^ fetch of
> + */
> +static int vvtd_get_entry(struct vvtd *vvtd,
> +                          struct irq_remapping_request *irq,

Seems like vvtd and irq could be const.

> +                          struct iremap_entry *dest,
> +                          bool record_fault)
> +{
> +    int ret;
> +    uint32_t entry = irq_remapping_request_index(irq);
> +    struct iremap_entry  *irte, *irt_page;
> +
> +    VVTD_DEBUG(VVTD_DBG_TRANS, "interpret a request with index %x", entry);
> +
> +    if ( entry > vvtd->irt_max_entry )
> +    {
> +        ret = VTD_FR_IR_INDEX_OVER;
> +        goto handle_fault;
> +    }
> +
> +    ret = map_guest_page(vvtd->domain, vvtd->irt + (entry >> IREMAP_ENTRY_ORDER),
> +                         (void**)&irt_page);
> +    if ( ret )
> +    {
> +        ret = VTD_FR_IR_ROOT_INVAL;
> +        goto handle_fault;
> +    }
> +
> +    irte = irt_page + (entry % (1 << IREMAP_ENTRY_ORDER));
> +    dest->val = irte->val;
> +    if ( !qinval_present(*irte) )
> +    {
> +        ret = VTD_FR_IR_ENTRY_P;
> +        goto unmap_handle_fault;
> +    }
> +
> +    /* Check reserved bits */
> +    if ( (irte->remap.res_1 || irte->remap.res_2 || irte->remap.res_3 ||
> +          irte->remap.res_4) )
> +    {
> +        ret = VTD_FR_IR_IRTE_RSVD;
> +        goto unmap_handle_fault;
> +    }
> +
> +    if (!ir_sid_valid(irte, irq->source_id))
> +    {
> +        ret = VTD_FR_IR_SID_ERR;
> +        goto unmap_handle_fault;
> +    }
> +    unmap_guest_page(irt_page);

Newline.

> +    return 0;
> +
> + unmap_handle_fault:
> +    unmap_guest_page(irt_page);
> + handle_fault:
> +    if ( !record_fault )
> +        return ret;
> +
> +    switch ( ret )
> +    {
> +    case VTD_FR_IR_SID_ERR:
> +    case VTD_FR_IR_IRTE_RSVD:
> +    case VTD_FR_IR_ENTRY_P:
> +        if ( qinval_fault_disable(*irte) )
> +            break;
> +    /* fall through */
> +    case VTD_FR_IR_INDEX_OVER:
> +    case VTD_FR_IR_ROOT_INVAL:
> +        vvtd_record_fault(vvtd, irq, ret);
> +        break;
> +
> +    default:
> +        gdprintk(XENLOG_G_INFO, "Can't handle VT-d fault %x\n", ret);
> +    }
> +    return ret;

In order to avoid the usage of labels I would put the code in
handle_fault in a helper function and call it on error.

> +}
> +
> +static int vvtd_irq_request_sanity_check(struct vvtd *vvtd,
> +                                         struct irq_remapping_request *irq)
> +{
> +    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
> +    {
> +        struct IO_APIC_route_remap_entry rte = { { irq->msg.rte } };
> +
> +        ASSERT(rte.format);
> +        return (!rte.reserved) ? 0 : VTD_FR_IR_REQ_RSVD;
> +    }
> +    else if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
> +    {
> +        struct msi_msg_remap_entry msi_msg = { { irq->msg.msi.addr } };
> +
> +        ASSERT(msi_msg.address_lo.format);
> +        return 0;
> +    }
> +    BUG();

ASSERT_UNREACHABLE(); and newline.

> +    return 0;
> +}
> +
> +static int vvtd_handle_irq_request(struct domain *d,
> +                                   struct irq_remapping_request *irq)
> +{
> +    struct iremap_entry irte;
> +    int ret;
> +    struct vvtd *vvtd = domain_vvtd(d);
> +
> +    if ( !vvtd || !vvtd_irq_remapping_enabled(vvtd) )
> +        return -EINVAL;

ENODEV.

> +
> +    ret = vvtd_irq_request_sanity_check(vvtd, irq);
> +    if ( ret )
> +    {
> +        vvtd_record_fault(vvtd, irq, ret);
> +        return ret;
> +    }
> +
> +    if ( !vvtd_get_entry(vvtd, irq, &irte, true) )

You are losing the return value of vvtd_get_entry here.

> +    {
> +        vvtd_delivery(vvtd->domain, irte.remap.vector,
> +                      irte_dest(vvtd, irte.remap.dst), irte.remap.dm,
> +                      irte.remap.dlm, irte.remap.tm);

You are losing the return value of vvtd_delivery.

> +        return 0;
> +    }

Newline.

> +    return -EFAULT;
> +}
> +

Roger.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 15/25] x86/vvtd: decode interrupt attribute from IRTE
  2017-08-09 20:34 ` [PATCH V2 15/25] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
@ 2017-08-23  9:57   ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  9:57 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:16PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Previously, interrupt attributes can be extracted from msi message or
> IOAPIC RTE. However, with interrupt remapping enabled, the attributes
> are enclosed in the associated IRTE. This callback is for cases in
> which the caller wants to acquire interrupt attributes.

Can you elaborate a little bit more on the usage? Is this for internal
Xen usage or the guest itself?

> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/vvtd.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 2bee352..374fd88 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -532,6 +532,25 @@ static int vvtd_handle_irq_request(struct domain *d,
>      return -EFAULT;
>  }
>  
> +static int vvtd_get_irq_info(struct domain *d,
> +                             struct irq_remapping_request *irq,

const.

> +                             struct irq_remapping_info *info)
> +{
> +    int ret;
> +    struct iremap_entry irte;
> +    struct vvtd *vvtd = domain_vvtd(d);
> +
> +    ret = vvtd_get_entry(vvtd, irq, &irte, false);
> +    if ( ret )
> +        return ret;
> +
> +    info->vector = irte.remap.vector;
> +    info->dest = irte_dest(vvtd, irte.remap.dst);
> +    info->dest_mode = irte.remap.dm;
> +    info->delivery_mode = irte.remap.dlm;
> +    return 0;
> +}
> +
>  static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
>  {
>      uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
> @@ -608,7 +627,8 @@ struct viommu_ops vvtd_hvm_vmx_ops = {

Forgot to mention in previous patches, vvtd_hvm_vmx_ops should be
const.

Roger.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 14/25] x86/vvtd: Process interrupt remapping request
  2017-08-23  9:49   ` Roger Pau Monné
@ 2017-08-23  9:59     ` Jan Beulich
  0 siblings, 0 replies; 136+ messages in thread
From: Jan Beulich @ 2017-08-23  9:59 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, julien.grall, Chao Gao

>>> On 23.08.17 at 11:49, <roger.pau@citrix.com> wrote:
> On Wed, Aug 09, 2017 at 04:34:15PM -0400, Lan Tianyu wrote:
>> @@ -384,6 +399,12 @@ struct iremap_entry {
>>  };
>>  
>>  /*
>> + * When VT-d doesn't enable Extended Interrupt Mode. Hardware only interprets
>> + * only 8-bits ([15:8]) of Destination-ID field in the IRTEs.
> 
> The above comment needs to be rewritten. My knowledge of VT-d is
> limited, what is IRTE referring to? I thought it was referring to the
> IO APIC redirection table registers, but the mask then doesn't make
> sense.

IRTE = Interrupt Remapping Table Entry (a term used quite
frequently); an IO-APIC one would normally be called just RTE.
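
For what it's worth, the masking described in the quoted comment can be
sketched like this, assuming that without Extended Interrupt Mode only bits
[15:8] of the IRTE Destination-ID are interpreted. EX_MASK_EXTR is a local
stand-in for a mask-extract helper and the other names are hypothetical:

```c
#include <stdint.h>

/* Isolate the masked field, then divide by the mask's lowest set bit
 * to shift the field down to bit 0. */
#define EX_MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))

#define EX_IRTE_XAPIC_DEST_MASK UINT32_C(0xff00)

/* eim != 0: full 32-bit destination; eim == 0: only bits [15:8]. */
static inline uint32_t ex_irte_dest(uint32_t dest, int eim)
{
    return eim ? dest : EX_MASK_EXTR(dest, EX_IRTE_XAPIC_DEST_MASK);
}
```

With dest = 0x12345678 this would yield 0x56 in xAPIC mode and the
unmodified value when EIM is enabled.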

Jan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 16/25] x86/vioapic: Hook interrupt delivery of vIOAPIC
  2017-08-09 20:34 ` [PATCH V2 16/25] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
@ 2017-08-23  9:59   ` Roger Pau Monné
  2017-08-24  5:28     ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23  9:59 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:17PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> When irq remapping is enabled, IOAPIC Redirection Entry may be in remapping
> format. If that, generate an irq_remapping_request and call the common
> VIOMMU abstraction's callback to handle this interrupt request. Device
> model is responsible for checking the request's validity.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/arch/x86/hvm/vioapic.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 72cae93..322f33c 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -30,6 +30,7 @@
>  #include <xen/lib.h>
>  #include <xen/errno.h>
>  #include <xen/sched.h>
> +#include <xen/viommu.h>
>  #include <public/hvm/ioreq.h>
>  #include <asm/hvm/io.h>
>  #include <asm/hvm/vpic.h>
> @@ -39,6 +40,8 @@
>  #include <asm/event.h>
>  #include <asm/io_apic.h>
>  
> +#include "../../../drivers/passthrough/vtd/vtd.h"

Ouch, that's not very nice. Why do you need this? I thought that you
introduced an arch-agnostic layer that should be suitable?

>  /* HACK: Route IRQ0 only to VCPU0 to prevent time jumps. */
>  #define IRQ0_SPECIAL_ROUTING 1
>  
> @@ -387,9 +390,20 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>      struct vlapic *target;
>      struct vcpu *v;
>      unsigned int irq = vioapic->base_gsi + pin;
> +    struct IO_APIC_route_remap_entry rte = { { vioapic->redirtbl[pin].bits } };

Designated initializers please.

Roger.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 17/25] x86/vvtd: Enable Queued Invalidation through GCMD
  2017-08-09 20:34 ` [PATCH V2 17/25] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
@ 2017-08-23 10:03   ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 10:03 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:18PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Software writes to QIE fields of GCMD to enable or disable queued

fields or field?

> invalidations. This patch emulates QIE fields of GCMD.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  3 ++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 22 ++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index 70e64cf..82bf6bc 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -190,7 +190,8 @@
>  #define DMA_GSTS_FLS    (((u64)1) << 29)
>  #define DMA_GSTS_AFLS   (((u64)1) << 28)
>  #define DMA_GSTS_WBFS   (((u64)1) << 27)
> -#define DMA_GSTS_QIES   (((u64)1) <<26)
> +#define DMA_GSTS_QIES_BIT       26

_SHIFT.

> +#define DMA_GSTS_QIES           (((u64)1) << DMA_GSTS_QIES_BIT)
>  #define DMA_GSTS_IRES   (((u64)1) <<25)
>  #define DMA_GSTS_SIRTPS_BIT     24
>  #define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_BIT)
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 374fd88..470bc56 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -102,6 +102,11 @@ static inline void __vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
>      return __set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
>  }
>  
> +static inline void __vvtd_clear_bit(struct vvtd *vvtd, uint32_t reg, int nr)

No underscore prefixes please.

> +{
> +    return __clear_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);

Unneeded return.

> +}
> +
>  static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
>                                  uint32_t value)
>  {
> @@ -262,6 +267,21 @@ static int vvtd_record_fault(struct vvtd *vvtd,
>      return 0;
>  }
>  
> +static int vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)
> +{
> +    VVTD_DEBUG(VVTD_DBG_RW, "%sable Queue Invalidation.",
> +               (val & DMA_GCMD_QIE) ? "En" : "Dis");
> +
> +    if ( val & DMA_GCMD_QIE )
> +        __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_BIT);
> +    else
> +    {
> +        vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, 0ULL);

0 should be fine, no need for the ull suffix.

> +        __vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_BIT);
> +    }

Newline.

> +    return X86EMUL_OKAY;
> +}
> +
>  static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
>  {
>      uint64_t irta;
> @@ -296,6 +316,8 @@ static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
>  
>      if ( changed & DMA_GCMD_SIRTP )
>          vvtd_handle_gcmd_sirtp(vvtd, val);
> +    if ( changed & DMA_GCMD_QIE )
> +        vvtd_handle_gcmd_qie(vvtd, val);

You are losing the return value of vvtd_handle_gcmd_qie. So you either
make the function void or do something with the return value.
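
For illustration only, a minimal sketch of how the helpers and the handler might look with the comments above applied. It assumes a stand-in struct vvtd whose registers are a plain uint32_t array; DMA_GSTS_QIES_SHIFT is a hypothetical rename, and the register offsets and the DMA_GCMD_QIE value are assumptions, not copied from the tree:

```c
#include <stdint.h>

/* Stand-in register file; the real struct vvtd is not reproduced here. */
#define VVTD_NR_REGS 64
struct vvtd {
    uint32_t regs[VVTD_NR_REGS];   /* indexed by byte offset / 4 */
};

/* Assumed values, for illustration only. */
#define DMAR_GSTS_REG        0x1c
#define DMAR_IQH_REG         0x80
#define DMA_GCMD_QIE         (1u << 26)
#define DMA_GSTS_QIES_SHIFT  26    /* hypothetical rename of DMA_GSTS_QIES_BIT */

/* No leading underscores, and no "return" in a void function. */
static void vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, unsigned int nr)
{
    vvtd->regs[reg / 4] |= 1u << nr;
}

static void vvtd_clear_bit(struct vvtd *vvtd, uint32_t reg, unsigned int nr)
{
    vvtd->regs[reg / 4] &= ~(1u << nr);
}

/* 64-bit register write, stored as two 32-bit halves. */
static void vvtd_set_reg_quad(struct vvtd *vvtd, uint32_t reg, uint64_t val)
{
    vvtd->regs[reg / 4] = (uint32_t)val;
    vvtd->regs[reg / 4 + 1] = (uint32_t)(val >> 32);
}

/* Returns void: nothing on this path can fail, so no value gets lost. */
static void vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)
{
    if ( val & DMA_GCMD_QIE )
        vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_SHIFT);
    else
    {
        vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, 0);
        vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_SHIFT);
    }
}
```

If the handler could ever fail, the alternative is to keep the int return type and have vvtd_write_gcmd() propagate it.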

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 18/25] x86/vvtd: Enable Interrupt Remapping through GCMD
  2017-08-09 20:34 ` [PATCH V2 18/25] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
@ 2017-08-23 10:07   ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 10:07 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:19PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Software writes this field to enable/disable interrupt remapping. This patch
> emulates IRES field of GCMD.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  3 ++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 27 +++++++++++++++++++++++++++
>  2 files changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index 82bf6bc..e323352 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -192,7 +192,8 @@
>  #define DMA_GSTS_WBFS   (((u64)1) << 27)
>  #define DMA_GSTS_QIES_BIT       26
>  #define DMA_GSTS_QIES           (((u64)1) << DMA_GSTS_QIES_BIT)
> -#define DMA_GSTS_IRES   (((u64)1) <<25)
> +#define DMA_GSTS_IRES_BIT       25

_SHIFT.

> +#define DMA_GSTS_IRES   (((u64)1) << DMA_GSTS_IRES_BIT)
>  #define DMA_GSTS_SIRTPS_BIT     24
>  #define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_BIT)
>  #define DMA_GSTS_CFIS   (((u64)1) <<23)
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 470bc56..eae8f11 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -282,6 +282,25 @@ static int vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)
>      return X86EMUL_OKAY;
>  }
>  
> +static int vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
> +{
> +    VVTD_DEBUG(VVTD_DBG_RW, "%sable Interrupt Remapping.",
> +               (val & DMA_GCMD_IRE) ? "En" : "Dis");
> +
> +    if ( val & DMA_GCMD_IRE )
> +    {
> +        vvtd->status |= VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
> +        __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
> +    }
> +    else
> +    {
> +        vvtd->status |= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;

Hm, that's not correct, you are not clearing the bit here. It should
be '&=', not '|='.
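
To make the difference concrete, a small standalone sketch (the flag value is hypothetical, chosen only for illustration): ORing in the complement sets every other bit and leaves the flag itself untouched, while ANDing with the complement clears just that flag.

```c
#include <stdint.h>

/* Hypothetical flag value, for illustration only. */
#define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED (1u << 0)

/* What the patch does: sets all other bits, flag stays as it was. */
static uint32_t disable_wrong(uint32_t status)
{
    status |= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
    return status;
}

/* Intended behaviour: clear only the remapping-enabled flag. */
static uint32_t disable_right(uint32_t status)
{
    status &= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
    return status;
}
```

With status == 1, disable_wrong() yields 0xffffffff (the flag is still set), whereas disable_right() yields 0.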

> +        __vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
> +    }
> +
> +    return X86EMUL_OKAY;
> +}
> +
>  static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
>  {
>      uint64_t irta;
> @@ -289,6 +308,10 @@ static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
>      if ( !(val & DMA_GCMD_SIRTP) )
>          return X86EMUL_OKAY;
>  
> +    if ( vvtd_irq_remapping_enabled(vvtd) )
> +        VVTD_DEBUG(VVTD_DBG_RW, "Update Interrupt Remapping Table when "
> +                   "active." );

Don't split console messages, instead add a newline and align to the
'('.

> +
>      vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
>      vvtd->irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
>      vvtd->irt_max_entry = DMA_IRTA_SIZE(irta);
> @@ -318,6 +341,10 @@ static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
>          vvtd_handle_gcmd_sirtp(vvtd, val);
>      if ( changed & DMA_GCMD_QIE )
>          vvtd_handle_gcmd_qie(vvtd, val);
> +    if ( changed & DMA_GCMD_IRE )
> +        vvtd_handle_gcmd_ire(vvtd, val);

Lost return value of vvtd_handle_gcmd_ire.

> +    if ( changed & ~(DMA_GCMD_QIE | DMA_GCMD_SIRTP | DMA_GCMD_IRE) )
> +        gdprintk(XENLOG_INFO, "Only QIE,SIRTP,IRE in GCMD_REG are handled.\n");

Missing spaces after the commas, and I think this should be a VVTD_DEBUG
in any case, although I'm not sure of its usefulness.

Roger.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-08-09 20:34 ` [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
@ 2017-08-23 10:14   ` Roger Pau Monné
  2017-08-24  6:11     ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 10:14 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:20PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> When IOAPIC RTE is in remapping format, it doesn't contain the vector of
> interrupt. For this case, the RTE contains an index of interrupt remapping
> table where the vector of interrupt is stored. This patch gets the vector
> through a vIOMMU interface.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/arch/x86/hvm/vioapic.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 322f33c..ff0742d 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -565,11 +565,27 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>  {
>      unsigned int pin;
>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
> +    struct IO_APIC_route_remap_entry rte = { { vioapic->redirtbl[pin].bits } };

Designated initialization and const.

>  
>      if ( !vioapic )
>          return -EINVAL;
>  
> -    return vioapic->redirtbl[pin].fields.vector;
> +    if ( rte.format )
> +    {
> +        int err;
> +        struct irq_remapping_request request;
> +        struct irq_remapping_info info;
> +
> +        irq_request_ioapic_fill(&request, vioapic->id, rte.val);
> +        /* Currently, only viommu 0 is supported */

This seems to be hardcoded in a bunch of places, which makes me wonder
whether having an array of vIOMMUs is the correct choice. I think that
you should remove the array and have a single vIOMMU per domain.

> +        err = viommu_get_irq_info(vioapic->domain, 0, &request, &info);
> +        return !err ? info.vector : -1;

maybe:

return err ?: info.vector;

?
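
A short sketch of what that form does, assuming a stand-in struct for irq_remapping_info. Note that "a ?: b" is a GNU C extension (accepted by GCC and Clang), and that unlike the original "-1" this propagates the actual error code to the caller:

```c
/* Stand-in for struct irq_remapping_info; only the field used here. */
struct irq_info {
    int vector;
};

/*
 * "err ?: x" evaluates to err when err is non-zero, otherwise to x,
 * and evaluates err only once.
 */
static int vector_or_error(int err, const struct irq_info *info)
{
    return err ?: info->vector;
}
```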

> +    }
> +    else
> +    {
> +        return vioapic->redirtbl[pin].fields.vector;
> +    }
> +
>  }
>  
>  int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-23  7:22       ` Roger Pau Monné
  2017-08-23  9:12         ` Lan Tianyu
@ 2017-08-23 10:19         ` Julien Grall
  2017-08-23 14:05           ` Roger Pau Monné
  1 sibling, 1 reply; 136+ messages in thread
From: Julien Grall @ 2017-08-23 10:19 UTC (permalink / raw)
  To: Roger Pau Monné, Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	jbeulich, chao.gao

Hi Roger,

On 23/08/17 08:22, Roger Pau Monné wrote:
> On Wed, Aug 23, 2017 at 02:06:17PM +0800, Lan Tianyu wrote:
>> Hi Roger:
>> 	Thanks for your review.
>>
>> On 2017-08-22 22:32, Roger Pau Monné wrote:
>>> On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
>>>> +
>>>> +/* vIOMMU capabilities */
>>>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>>>> +
>>>> +struct xen_domctl_viommu_op {
>>>> +    uint32_t cmd;
>>>> +#define XEN_DOMCTL_create_viommu          0
>>>> +#define XEN_DOMCTL_destroy_viommu         1
>>>> +#define XEN_DOMCTL_query_viommu_caps      2
>>>> +    union {
>>>> +        struct {
>>>> +            /* IN - vIOMMU type */
>>>> +            uint64_t viommu_type;
>>>> +            /*
>>>> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
>>>> +             * are in charge of to check base_address and length.
>>>> +             */
>>>> +            uint64_t base_address;
>>>> +            /* IN - Length of MMIO region */
>>>> +            uint64_t length;
>>>
>>> It seems weird that you can specify the length, is that something
>>> that a user would like to set? Isn't the length of the IOMMU MMIO
>>> region fixed by the hardware spec?
>>
>> Different vendor may have different IOMMU register region sizes. (e.g,
>> VTD has one page size for register region). The length field is to make
>> vIOMMU device model not to abuse address space. Some registers' offsets
>> are reported by other register and these offsets are emulated by vIOMMU
>> device model. If it's not necessary, we can remove it and add it when
>> there is real such requirement.
>
> So from my understanding the size of the IOMMU MMIO region is implicit
> in the IOMMU type that the user chooses. I don't think this field is
> needed.

To me, it makes more sense to care about both the base and the size 
rather than only the former.

The toolstack is in charge of the address space and should be aware of 
the size of everything. This address space may not be static and it 
makes sense to give this information to Xen and verify we had the same 
assumption.
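
As a rough sketch of the kind of check meant here (all names are hypothetical; the one-page size for VT-d is taken from the earlier reply in this thread), the hypervisor side could compare what the toolstack passed against what the device model expects:

```c
#include <errno.h>
#include <stdint.h>

#define VIOMMU_TYPE_INTEL_VTD  1u          /* hypothetical type identifier */
#define VVTD_MMIO_SIZE         0x1000ull   /* one page, as stated in the thread */

/* Hypothetical helper: verify the toolstack and Xen agree on the region. */
static int viommu_check_region(uint64_t type, uint64_t base, uint64_t length)
{
    uint64_t expected;

    switch ( type )
    {
    case VIOMMU_TYPE_INTEL_VTD:
        expected = VVTD_MMIO_SIZE;
        break;

    default:
        return -ENODEV;
    }

    /* Reject a region that is misaligned or disagrees with the model's size. */
    if ( (base & (expected - 1)) || length != expected )
        return -EINVAL;

    return 0;
}
```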

>
>>>
>>>> +            /* IN - Capabilities with which we want to create */
>>>> +            uint64_t capabilities;
>>>> +            /* OUT - vIOMMU identity */
>>>> +            uint32_t viommu_id;
>>>> +        } create_viommu;
>>>> +
>>>> +        struct {
>>>> +            /* IN - vIOMMU identity */
>>>> +            uint32_t viommu_id;
>>>> +        } destroy_viommu;
>>>> +
>>>> +        struct {
>>>> +            /* IN - vIOMMU type */
>>>> +            uint64_t viommu_type;
>>>> +            /* OUT - vIOMMU Capabilities */
>>>> +            uint64_t capabilities;
>>>> +        } query_caps;
>>>
>>> This also seems weird, shouldn't you query the capabilities of an
>>> already created vIOMMU, rather than a vIOMMU type? Shouldn't the first
>>> field be viommu_id?
>>>
>>
>> Query interface here is to check what capabilities the vIOMMU device
>> model specified by viommu_type can support before create vIOMMU (suppose
>> user may select different capabilities). If capabilities returned by
>> query interface doesn't meet user configuration, tool stack should
>> return error. So it just accepts viommu_type.
>
> I don't think that's needed, if the chosen capabilities are not
> supported by the selected IOMMU type simply return error in
> XEN_DOMCTL_create_viommu.
>
> The capabilities of each IOMMU type should be listed in the man page,
> and the user should select a supported set or else
> XEN_DOMCTL_create_viommu will fail. Doing the checks both in the
> toolstack and in XEN_DOMCTL_create_viommu seems pointless and prone to
> errors.

What if some capabilities depend on the host IOMMU? How are you going 
to report that to the user?

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq
  2017-08-09 20:34 ` [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
@ 2017-08-23 10:41   ` Roger Pau Monné
  2017-08-25  7:28     ` Chao Gao
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 10:41 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:22PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Introduce a new binding relationship and provide a new interface to
> manage the new relationship.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  tools/libxc/include/xenctrl.h |  17 ++++++
>  tools/libxc/xc_domain.c       |  53 +++++++++++++++++
>  xen/drivers/passthrough/io.c  | 135 +++++++++++++++++++++++++++++++++++-------
>  xen/include/public/domctl.h   |   7 +++
>  xen/include/xen/hvm/irq.h     |   7 +++
>  5 files changed, 198 insertions(+), 21 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index dfaa9d5..b0a9437 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1720,6 +1720,15 @@ int xc_domain_ioport_mapping(xc_interface *xch,
>                               uint32_t nr_ports,
>                               uint32_t add_mapping);
>  
> +int xc_domain_update_msi_irq_remapping(
> +    xc_interface *xch,
> +    uint32_t domid,
> +    uint32_t pirq,
> +    uint32_t source_id,
> +    uint32_t data,
> +    uint64_t addr,
> +    uint64_t gtable);
> +
>  int xc_domain_update_msi_irq(
>      xc_interface *xch,
>      uint32_t domid,
> @@ -1734,6 +1743,14 @@ int xc_domain_unbind_msi_irq(xc_interface *xch,
>                               uint32_t pirq,
>                               uint32_t gflags);
>  
> +int xc_domain_unbind_msi_irq_remapping(
> +    xc_interface *xch,
> +    uint32_t domid,
> +    uint32_t pirq,
> +    uint32_t source_id,
> +    uint32_t data,
> +    uint64_t addr);

I think this doesn't match the coding style, but it seems like the
surrounding functions also use it, so I will let the maintainers
decide whether this is fine or not.

>  int xc_domain_bind_pt_irq(xc_interface *xch,
>                            uint32_t domid,
>                            uint8_t machine_irq,
> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> index 3bab4e8..4b6a510 100644
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -1702,8 +1702,34 @@ int xc_deassign_dt_device(
>      return rc;
>  }
>  
> +int xc_domain_update_msi_irq_remapping(
> +    xc_interface *xch,
> +    uint32_t domid,
> +    uint32_t pirq,
> +    uint32_t source_id,
> +    uint32_t data,
> +    uint64_t addr,
> +    uint64_t gtable)
> +{
> +    int rc;
> +    xen_domctl_bind_pt_irq_t *bind;

No newline.

> +    DECLARE_DOMCTL;
>  
> +    domctl.cmd = XEN_DOMCTL_bind_pt_irq;
> +    domctl.domain = (domid_t)domid;
>  
> +    bind = &(domctl.u.bind_pt_irq);
> +    bind->irq_type = PT_IRQ_TYPE_MSI_IR;
> +    bind->machine_irq = pirq;
> +    bind->u.msi_ir.source_id = source_id;
> +    bind->u.msi_ir.data = data;
> +    bind->u.msi_ir.addr = addr;
> +    bind->u.msi_ir.gtable = gtable;
> +
> +    rc = do_domctl(xch, &domctl);
> +    return rc;
> +}
>  
>  int xc_domain_update_msi_irq(
>      xc_interface *xch,
> @@ -1732,6 +1758,33 @@ int xc_domain_update_msi_irq(
>      return rc;
>  }
>  
> +int xc_domain_unbind_msi_irq_remapping(
> +    xc_interface *xch,
> +    uint32_t domid,
> +    uint32_t pirq,
> +    uint32_t source_id,
> +    uint32_t data,
> +    uint64_t addr)
> +{
> +    int rc;
> +    xen_domctl_bind_pt_irq_t *bind;
> +
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_unbind_pt_irq;
> +    domctl.domain = (domid_t)domid;
> +
> +    bind = &(domctl.u.bind_pt_irq);
> +    bind->irq_type = PT_IRQ_TYPE_MSI_IR;
> +    bind->machine_irq = pirq;
> +    bind->u.msi_ir.source_id = source_id;
> +    bind->u.msi_ir.data = data;
> +    bind->u.msi_ir.addr = addr;
> +
> +    rc = do_domctl(xch, &domctl);
> +    return rc;
> +}
> +
>  int xc_domain_unbind_msi_irq(
>      xc_interface *xch,
>      uint32_t domid,
> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> index 4d457f6..0510887 100644
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -276,6 +276,92 @@ static struct vcpu *vector_hashing_dest(const struct domain *d,
>      return dest;
>  }
>  
> +static inline void set_hvm_gmsi_info(struct hvm_gmsi_info *msi,
> +                                     xen_domctl_bind_pt_irq_t *pt_irq_bind)

Inline? I would rather let the compiler decide IMHO.

> +{
> +    if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI )

A switch seems like a better choice here.

> +    {
> +        msi->legacy.gvec = pt_irq_bind->u.msi.gvec;
> +        msi->legacy.gflags = pt_irq_bind->u.msi.gflags;
> +    }
> +    else if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_IR )
> +    {
> +        msi->intremap.source_id = pt_irq_bind->u.msi_ir.source_id;
> +        msi->intremap.data = pt_irq_bind->u.msi_ir.data;
> +        msi->intremap.addr = pt_irq_bind->u.msi_ir.addr;
> +    }
> +    else
> +        BUG();

ASSERT_UNREACHABLE();
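
Roughly what the two comments above would give, as a standalone sketch. The types are simplified stand-ins for the ones in the patch, and assert() stands in for ASSERT_UNREACHABLE():

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the types touched by the patch. */
enum pt_irq_type { PT_IRQ_TYPE_MSI, PT_IRQ_TYPE_MSI_IR };

struct bind {
    enum pt_irq_type irq_type;
    union {
        struct { uint8_t gvec; uint32_t gflags; } msi;
        struct { uint32_t source_id, data; uint64_t addr; } msi_ir;
    } u;
};

struct hvm_gmsi_info {
    union {
        struct { uint8_t gvec; uint32_t gflags; } legacy;
        struct { uint32_t source_id, data; uint64_t addr; } intremap;
    };
};

/* No "inline": leave that decision to the compiler. */
static void set_hvm_gmsi_info(struct hvm_gmsi_info *msi, const struct bind *b)
{
    switch ( b->irq_type )
    {
    case PT_IRQ_TYPE_MSI:
        msi->legacy.gvec = b->u.msi.gvec;
        msi->legacy.gflags = b->u.msi.gflags;
        break;

    case PT_IRQ_TYPE_MSI_IR:
        msi->intremap.source_id = b->u.msi_ir.source_id;
        msi->intremap.data = b->u.msi_ir.data;
        msi->intremap.addr = b->u.msi_ir.addr;
        break;

    default:
        /* Stand-in for ASSERT_UNREACHABLE(). */
        assert(!"unreachable");
        break;
    }
}
```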

> +}
> +
> +static inline void clear_hvm_gmsi_info(struct hvm_gmsi_info *msi, int irq_type)

No inline.

> +{
> +    if ( irq_type == PT_IRQ_TYPE_MSI )

Same here (switch + ASSERT). Maybe a memset would be faster here?

> +    {
> +        msi->legacy.gvec = 0;
> +        msi->legacy.gflags = 0;
> +    }
> +    else if ( irq_type == PT_IRQ_TYPE_MSI_IR )
> +    {
> +        msi->intremap.source_id = 0;
> +        msi->intremap.data = 0;
> +        msi->intremap.addr = 0;
> +    }
> +    else
> +        BUG();
> +}
> +
> +static inline bool hvm_gmsi_info_need_update(struct hvm_gmsi_info *msi,

No inline.

> +                                         xen_domctl_bind_pt_irq_t *pt_irq_bind)
> +{
> +    if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI )

switch please.

> +        return ((msi->legacy.gvec != pt_irq_bind->u.msi.gvec) ||
> +                (msi->legacy.gflags != pt_irq_bind->u.msi.gflags));
> +    else if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_IR )

Unneeded else.

> +        return ((msi->intremap.source_id != pt_irq_bind->u.msi_ir.source_id) ||
> +                (msi->intremap.data != pt_irq_bind->u.msi_ir.data) ||
> +                (msi->intremap.addr != pt_irq_bind->u.msi_ir.addr));
> +    BUG();

ASSERT_UNREACHABLE and newline.

> +    return 0;
> +}
> +
> +static int pirq_dpci_2_msi_attr(struct domain *d,
> +                                struct hvm_pirq_dpci *pirq_dpci, uint8_t *gvec,
> +                                uint8_t *dest, uint8_t *dm, uint8_t *dlm)
> +{
> +    int rc = 0;
> +
> +    if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI )
> +    {
> +        *gvec = pirq_dpci->gmsi.legacy.gvec;
> +        *dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
> +        *dm = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
> +        *dlm = (pirq_dpci->gmsi.legacy.gflags & VMSI_DELIV_MASK) >>
> +                GFLAGS_SHIFT_DELIV_MODE;

MASK_EXTR.
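
For reference, a standalone sketch of the equivalence. The macro is written the way I believe Xen defines it (divide by the lowest set bit of the mask), and the mask and shift values are assumed for illustration:

```c
#include <stdint.h>

/* Extract the field selected by mask m from v, shifted down to bit 0. */
#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))

/* Values assumed for illustration. */
#define VMSI_DELIV_MASK          0x7000
#define GFLAGS_SHIFT_DELIV_MODE  12

/* Open-coded form used in the patch. */
static uint8_t delivery_mode_shift(uint32_t gflags)
{
    return (gflags & VMSI_DELIV_MASK) >> GFLAGS_SHIFT_DELIV_MODE;
}

/* Same result with no separate shift constant to keep in sync. */
static uint8_t delivery_mode_extr(uint32_t gflags)
{
    return MASK_EXTR(gflags, VMSI_DELIV_MASK);
}
```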

> +    }
> +    else if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI_IR )
> +    {
> +        struct irq_remapping_request request;
> +        struct irq_remapping_info irq_info;
> +
> +        irq_request_msi_fill(&request, pirq_dpci->gmsi.intremap.source_id,
> +                             pirq_dpci->gmsi.intremap.addr,
> +                             pirq_dpci->gmsi.intremap.data);
> +        /* Currently, only viommu 0 is supported */
> +        rc = viommu_get_irq_info(d, 0, &request, &irq_info);
> +        if ( !rc )

I don't like the !rc construct, I would rather have:

if ( rc )
    return rc;

*gvec = ...;

But that's my personal taste, you should wait for maintainers to
express their opinions.

> +        {
> +            *gvec = irq_info.vector;
> +            *dest = irq_info.dest;
> +            *dm = irq_info.dest_mode;
> +            *dlm = irq_info.delivery_mode;
> +        }
> +    }
> +    else
> +        BUG();

ASSERT_UNREACHABLE();

> +    return rc;
> +}
> +
>  int pt_irq_create_bind(
>      struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
>  {
> @@ -339,17 +425,21 @@ int pt_irq_create_bind(
>      switch ( pt_irq_bind->irq_type )
>      {
>      case PT_IRQ_TYPE_MSI:
> +    case PT_IRQ_TYPE_MSI_IR:
>      {
> -        uint8_t dest, dest_mode, delivery_mode;
> +        uint8_t dest = 0, dest_mode = 0, delivery_mode = 0, gvec;

Why do you need those initializations now? They were unneeded before
and I don't see you using those variables.

>          int dest_vcpu_id;
>          const struct vcpu *vcpu;
> +        bool ir = (pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_IR);
> +        uint64_t gtable = ir ? pt_irq_bind->u.msi_ir.gtable :
> +                          pt_irq_bind->u.msi.gtable;
>  
>          if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
>          {
>              pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED | HVM_IRQ_DPCI_MACH_MSI |
> -                               HVM_IRQ_DPCI_GUEST_MSI;
> -            pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
> -            pirq_dpci->gmsi.legacy.gflags = pt_irq_bind->u.msi.gflags;
> +                               (ir ? HVM_IRQ_DPCI_GUEST_MSI_IR :
> +                                HVM_IRQ_DPCI_GUEST_MSI);
> +            set_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind);
>              /*
>               * 'pt_irq_create_bind' can be called after 'pt_irq_destroy_bind'.
>               * The 'pirq_cleanup_check' which would free the structure is only
> @@ -364,9 +454,9 @@ int pt_irq_create_bind(
>              pirq_dpci->dom = d;
>              /* bind after hvm_irq_dpci is setup to avoid race with irq handler*/
>              rc = pirq_guest_bind(d->vcpu[0], info, 0);
> -            if ( rc == 0 && pt_irq_bind->u.msi.gtable )
> +            if ( rc == 0 && gtable )
>              {
> -                rc = msixtbl_pt_register(d, info, pt_irq_bind->u.msi.gtable);
> +                rc = msixtbl_pt_register(d, info, gtable);
>                  if ( unlikely(rc) )
>                  {
>                      pirq_guest_unbind(d, info);
> @@ -381,8 +471,7 @@ int pt_irq_create_bind(
>              }
>              if ( unlikely(rc) )
>              {
> -                pirq_dpci->gmsi.legacy.gflags = 0;
> -                pirq_dpci->gmsi.legacy.gvec = 0;
> +                clear_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind->irq_type);
>                  pirq_dpci->dom = NULL;
>                  pirq_dpci->flags = 0;
>                  pirq_cleanup_check(info, d);
> @@ -392,7 +481,8 @@ int pt_irq_create_bind(
>          }
>          else
>          {
> -            uint32_t mask = HVM_IRQ_DPCI_MACH_MSI | HVM_IRQ_DPCI_GUEST_MSI;
> +            uint32_t mask = HVM_IRQ_DPCI_MACH_MSI |
> +                     (ir ? HVM_IRQ_DPCI_GUEST_MSI_IR : HVM_IRQ_DPCI_GUEST_MSI);

Maybe:

uint32_t mask = (ir ? HVM_IRQ_DPCI_GUEST_MSI_IR
                    : HVM_IRQ_DPCI_GUEST_MSI) |
                HVM_IRQ_DPCI_MACH_MSI;

>  
>              if ( (pirq_dpci->flags & mask) != mask )
>              {
> @@ -401,29 +491,31 @@ int pt_irq_create_bind(
>              }
>  
>              /* If pirq is already mapped as vmsi, update guest data/addr. */
> -            if ( pirq_dpci->gmsi.legacy.gvec != pt_irq_bind->u.msi.gvec ||
> -                 pirq_dpci->gmsi.legacy.gflags != pt_irq_bind->u.msi.gflags )
> +            if ( hvm_gmsi_info_need_update(&pirq_dpci->gmsi, pt_irq_bind) )
>              {
>                  /* Directly clear pending EOIs before enabling new MSI info. */
>                  pirq_guest_eoi(info);
>  
> -                pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
> -                pirq_dpci->gmsi.legacy.gflags = pt_irq_bind->u.msi.gflags;
> +                set_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind);
>              }
>          }
>          /* Calculate dest_vcpu_id for MSI-type pirq migration. */
> -        dest = pirq_dpci->gmsi.legacy.gflags & VMSI_DEST_ID_MASK;
> -        dest_mode = !!(pirq_dpci->gmsi.legacy.gflags & VMSI_DM_MASK);
> -        delivery_mode = (pirq_dpci->gmsi.legacy.gflags & VMSI_DELIV_MASK) >>
> -                         GFLAGS_SHIFT_DELIV_MODE;
> -
> -        dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
> +        rc = pirq_dpci_2_msi_attr(d, pirq_dpci, &gvec, &dest, &dest_mode,
> +                                  &delivery_mode);
> +        if ( unlikely(rc) )
> +        {
> +            spin_unlock(&d->event_lock);
> +            return -EFAULT;

Return rc? Or else you are losing the return value for no apparent
reason.

> +        }
> +        else

Unneeded else branch.

> +            dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
>          pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
>          spin_unlock(&d->event_lock);
>  
>          pirq_dpci->gmsi.posted = false;
>          vcpu = (dest_vcpu_id >= 0) ? d->vcpu[dest_vcpu_id] : NULL;
> -        if ( iommu_intpost )
> +        /* Currently, don't use interrupt posting for guest's remapping MSIs */
> +        if ( iommu_intpost && !ir )
>          {
>              if ( delivery_mode == dest_LowestPrio )
>                  vcpu = vector_hashing_dest(d, dest, dest_mode,
> @@ -435,7 +527,7 @@ int pt_irq_create_bind(
>              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
>  
>          /* Use interrupt posting if it is supported. */
> -        if ( iommu_intpost )
> +        if ( iommu_intpost && !ir )

So with interrupt remapping posted interrupts are not available...

>              pi_update_irte(vcpu ? &vcpu->arch.hvm_vmx.pi_desc : NULL,
>                             info, pirq_dpci->gmsi.legacy.gvec);
>  
> @@ -627,6 +719,7 @@ int pt_irq_destroy_bind(
>          }
>          break;
>      case PT_IRQ_TYPE_MSI:
> +    case PT_IRQ_TYPE_MSI_IR:
>          break;
>      default:
>          return -EOPNOTSUPP;
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 4b10f26..1adf032 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -555,6 +555,7 @@ typedef enum pt_irq_type_e {
>      PT_IRQ_TYPE_MSI,
>      PT_IRQ_TYPE_MSI_TRANSLATE,
>      PT_IRQ_TYPE_SPI,    /* ARM: valid range 32-1019 */
> +    PT_IRQ_TYPE_MSI_IR,
>  } pt_irq_type_t;
>  struct xen_domctl_bind_pt_irq {
>      uint32_t machine_irq;
> @@ -575,6 +576,12 @@ struct xen_domctl_bind_pt_irq {
>              uint64_aligned_t gtable;
>          } msi;
>          struct {
> +            uint32_t source_id;
> +            uint32_t data;
> +            uint64_t addr;
> +            uint64_aligned_t gtable;

uint64_aligned_t? Please use uint64_t.

> +        } msi_ir;
> +        struct {
>              uint16_t spi;
>          } spi;
>      } u;
> diff --git a/xen/include/xen/hvm/irq.h b/xen/include/xen/hvm/irq.h
> index 5e736f8..884e092 100644
> --- a/xen/include/xen/hvm/irq.h
> +++ b/xen/include/xen/hvm/irq.h
> @@ -41,6 +41,7 @@ struct dev_intx_gsi_link {
>  #define _HVM_IRQ_DPCI_GUEST_PCI_SHIFT           4
>  #define _HVM_IRQ_DPCI_GUEST_MSI_SHIFT           5
>  #define _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT        6
> +#define _HVM_IRQ_DPCI_GUEST_MSI_IR_SHIFT        7 

Trailing space.

Roger.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH V2 22/25] x86/vmsi: Hook delivering remapping format msi to guest
  2017-08-09 20:34 ` [PATCH V2 22/25] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
@ 2017-08-23 10:55   ` Roger Pau Monné
  2017-08-23 12:17     ` Jan Beulich
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 10:55 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, andrew.cooper3, xen-devel, julien.grall, jbeulich, Chao Gao

On Wed, Aug 09, 2017 at 04:34:23PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> In two situations, hypervisor delivers a msi to a hvm guest. One is
> when qemu sends a request to hypervisor through XEN_DMOP_inject_msi.
> The other is when a physical interrupt arrives and it has been bound
> to a guest msi.
> 
> For the former, the msi is routed to common vIOMMU layer if it is in
> remapping format. For the latter, if the pt irq is bound to a guest
> remapping msi, a new remapping msi is constructed based on the binding
> information and routed to common vIOMMU layer.

After looking at the code below, I'm wondering whether it would make
sense to add a new flag that's HVM_IRQ_DPCI_GUEST_REMAPPED or similar,
so that you would use:

HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_GUEST_REMAPPED

In order to designate a remapped MSI. It seems like it would avoid
some of the changes below (where you are just adding
HVM_IRQ_DPCI_GUEST_MSI_IR to code paths already used by
HVM_IRQ_DPCI_GUEST_MSI). More of a suggestion rather than a request
for you to change the code.
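
A minimal sketch of the idea, with a hypothetical bit position for the new flag (only the GUEST_MSI shift of 5 comes from the existing header): existing GUEST_MSI checks keep matching, and only the paths that really differ test the extra bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define HVM_IRQ_DPCI_GUEST_MSI       (1u << 5)
#define HVM_IRQ_DPCI_GUEST_REMAPPED  (1u << 7)   /* hypothetical new flag */

/* Unchanged check: true for both legacy and remapped guest MSIs. */
static bool is_guest_msi(uint32_t flags)
{
    return flags & HVM_IRQ_DPCI_GUEST_MSI;
}

/* Only code that must treat remapped MSIs differently needs this one. */
static bool is_remapped_msi(uint32_t flags)
{
    return is_guest_msi(flags) && (flags & HVM_IRQ_DPCI_GUEST_REMAPPED);
}
```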

> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/arch/x86/hvm/irq.c       | 11 ++++++++++
>  xen/arch/x86/hvm/vmsi.c      | 14 ++++++++++--
>  xen/drivers/passthrough/io.c | 51 +++++++++++++++++++++++++++++++++-----------
>  xen/include/asm-x86/msi.h    |  3 +++
>  4 files changed, 65 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
> index e425df9..12d83b3 100644
> --- a/xen/arch/x86/hvm/irq.c
> +++ b/xen/arch/x86/hvm/irq.c
> @@ -26,6 +26,7 @@
>  #include <asm/hvm/domain.h>
>  #include <asm/hvm/support.h>
>  #include <asm/msi.h>
> +#include <asm/viommu.h>
>  
>  /* Must be called with hvm_domain->irq_lock hold */
>  static void assert_gsi(struct domain *d, unsigned ioapic_gsi)
> @@ -340,6 +341,16 @@ int hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
>          >> MSI_DATA_TRIGGER_SHIFT;
>      uint8_t vector = data & MSI_DATA_VECTOR_MASK;
>  
> +    if ( addr & MSI_ADDR_INTEFORMAT_MASK )
> +    {
> +        struct irq_remapping_request request;
> +
> +        irq_request_msi_fill(&request, 0, addr, data);
> +        /* Currently, only viommu 0 is supported */
> +        viommu_handle_irq_request(d, 0, &request);
> +        return 0;
> +    }
> +
>      if ( !vector )
>      {
>          int pirq = ((addr >> 32) & 0xffffff00) | dest;
> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index c4ec0ad..75ceb19 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -114,9 +114,19 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
>                  "vector=%x trig_mode=%x\n",
>                  dest, dest_mode, delivery_mode, vector, trig_mode);
>  
> -    ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
> +    ASSERT(pirq_dpci->flags & (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_GUEST_MSI_IR));

Line break.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults
  2017-08-09 20:34 ` [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults Lan Tianyu
@ 2017-08-23 11:51   ` Roger Pau Monné
  2017-08-25  7:17     ` Chao Gao
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 11:51 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: kevin.tian, Chao Gao, julien.grall, xen-devel

On Wed, Aug 09, 2017 at 04:34:24PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Interrupt translation faults are non-recoverable fault. When faults
                                                   ^ faults
> are triggered, it needs to populate fault info to Fault Recording
> Registers and inject vIOMMU msi interrupt to notify guest IOMMU driver
> to deal with faults.
> 
> This patch emulates hardware's handling interrupt translation
> faults (more information about the process can be found in VT-d spec,
> chipter "Translation Faults", section "Non-Recoverable Fault
  ^ chapter
> Reporting" and section "Non-Recoverable Logging").
> Specifically, viommu_record_fault() records the fault information and
> viommu_report_non_recoverable_fault() reports faults to software.
> Currently, only Primary Fault Logging is supported and the Number of
> Fault-recording Registers is 1.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  60 +++++++--
>  xen/drivers/passthrough/vtd/vvtd.c  | 238 +++++++++++++++++++++++++++++++++++-
>  2 files changed, 286 insertions(+), 12 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index e323352..a9e905b 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -226,26 +226,66 @@
>  #define DMA_CCMD_CAIG_MASK(x) (((u64)x) & ((u64) 0x3 << 59))
>  
>  /* FECTL_REG */
> -#define DMA_FECTL_IM (((u64)1) << 31)
> +#define DMA_FECTL_IM_BIT 31

_SHIFT (here and below).

> +#define DMA_FECTL_IM (1U << DMA_FECTL_IM_BIT)
> +#define DMA_FECTL_IP_BIT 30
> +#define DMA_FECTL_IP (1U << DMA_FECTL_IP_BIT)
>  
>  /* FSTS_REG */
> -#define DMA_FSTS_PFO ((u64)1 << 0)
> -#define DMA_FSTS_PPF ((u64)1 << 1)
> -#define DMA_FSTS_AFO ((u64)1 << 2)
> -#define DMA_FSTS_APF ((u64)1 << 3)
> -#define DMA_FSTS_IQE ((u64)1 << 4)
> -#define DMA_FSTS_ICE ((u64)1 << 5)
> -#define DMA_FSTS_ITE ((u64)1 << 6)
> -#define DMA_FSTS_FAULTS    DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE
> +#define DMA_FSTS_PFO_BIT 0
> +#define DMA_FSTS_PFO (1U << DMA_FSTS_PFO_BIT)
> +#define DMA_FSTS_PPF_BIT 1
> +#define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_BIT)
> +#define DMA_FSTS_AFO (1U << 2)
> +#define DMA_FSTS_APF (1U << 3)
> +#define DMA_FSTS_IQE (1U << 4)
> +#define DMA_FSTS_ICE (1U << 5)
> +#define DMA_FSTS_ITE (1U << 6)
> +#define DMA_FSTS_PRO_BIT 7
> +#define DMA_FSTS_PRO (1U << DMA_FSTS_PRO_BIT)
> +#define DMA_FSTS_FAULTS    (DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE | DMA_FSTS_PRO)
> +#define DMA_FSTS_RW1CS     (DMA_FSTS_PFO | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE | DMA_FSTS_PRO)

Please split into separate lines.

>  #define dma_fsts_fault_record_index(s) (((s) >> 8) & 0xff)
>  
>  /* FRCD_REG, 32 bits access */
> -#define DMA_FRCD_F (((u64)1) << 31)
> +#define DMA_FRCD_LEN            0x10
> +#define DMA_FRCD0_OFFSET        0x0

0

> +#define DMA_FRCD1_OFFSET        0x4
> +#define DMA_FRCD2_OFFSET        0x8
> +#define DMA_FRCD3_OFFSET        0xc
> +#define DMA_FRCD3_FR_MASK       0xffUL
> +#define DMA_FRCD_F_BIT 31
> +#define DMA_FRCD_F ((u64)1 << DMA_FRCD_F_BIT)
> +#define DMA_FRCD(idx, offset) (DMA_CAP_FRO_OFFSET + DMA_FRCD_LEN * idx + offset)

idx and offset need parentheses.
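
A minimal illustration of why, using stand-in values (the 0x20 base
and the FRCD_UNSAFE/FRCD_SAFE names are made up for this sketch, not
taken from the patch): with an expression such as i + 1 as the index,
the unparenthesized form multiplies only part of the argument.

```c
/* Stand-in values, not the real register layout. */
#define FRO_OFFSET  0x20
#define FRCD_LEN    0x10

/* As posted: arguments are substituted without parentheses. */
#define FRCD_UNSAFE(idx, offset) (FRO_OFFSET + FRCD_LEN * idx + offset)

/* With each argument parenthesized. */
#define FRCD_SAFE(idx, offset) (FRO_OFFSET + FRCD_LEN * (idx) + (offset))

/* Register at byte offset 0xc of the record following record i. */
static unsigned int next_unsafe(unsigned int i)
{
    /* Expands to 0x20 + 0x10 * i + 1 + 0xc: only i is multiplied. */
    return FRCD_UNSAFE(i + 1, 0xc);
}

static unsigned int next_safe(unsigned int i)
{
    /* Expands to 0x20 + 0x10 * (i + 1) + (0xc). */
    return FRCD_SAFE(i + 1, 0xc);
}
```

The current callers only pass plain identifiers and constants, so
nothing is broken today, but the macro should not rely on that.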

>  #define dma_frcd_type(d) ((d >> 30) & 1)
>  #define dma_frcd_fault_reason(c) (c & 0xff)
>  #define dma_frcd_source_id(c) (c & 0xffff)
>  #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
>  
> +struct vtd_fault_record_register
> +{
> +    union {
> +        struct {
> +            u64 lo;

uint64_t here and below.

> +            u64 hi;
> +        } bits;

s/bits/raw/? I don't have a strong opinion though.

> +        struct {
> +            u64 rsvd0   :12,
> +                FI      :52; /* Fault Info */
> +            u64 SID     :16, /* Source Identifier */
> +                rsvd1   :9,
> +                PRIV    :1,  /* Privilege Mode Requested */
> +                EXE     :1,  /* Execute Permission Requested */
> +                PP      :1,  /* PASID Present */
> +                FR      :8,  /* Fault Reason */
> +                PV      :20, /* PASID Value */
> +                AT      :2,  /* Address Type */
> +                T       :1,  /* Type. (0) Write (1) Read/AtomicOp */
> +                F       :1;  /* Fault */

I don't think we use capital letters for struct fields. Also some of
them could be more descriptive IMHO, like T -> type, F -> fault, FI ->
fault_info...

> +        } fields;
> +    };
> +};
> +
>  enum VTD_FAULT_TYPE
>  {
>      /* Interrupt remapping transition faults */
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index eae8f11..f1e6d01 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -19,6 +19,7 @@
>   */
>  
>  #include <xen/domain_page.h>
> +#include <xen/lib.h>
>  #include <xen/sched.h>
>  #include <xen/types.h>
>  #include <xen/viommu.h>
> @@ -30,6 +31,7 @@
>  #include <asm/io_apic.h>
>  #include <asm/page.h>
>  #include <asm/p2m.h>
> +#include <asm/system.h>

system.h already includes xen/lib.h IIRC.

>  
>  #include "iommu.h"
>  #include "vtd.h"
> @@ -49,6 +51,8 @@ struct hvm_hw_vvtd_regs {
>  struct vvtd {
>      /* VIOMMU_STATUS_XXX */
>      int status;
> +    /* Fault Recording index */
> +    int frcd_idx;

fault_index?

>      /* Address range of remapping hardware register-set */
>      uint64_t base_addr;
>      uint64_t length;
> @@ -97,6 +101,23 @@ static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
>      return domain_vvtd(v->domain);
>  }
>  
> +static inline int vvtd_test_and_set_bit(struct vvtd *vvtd, uint32_t reg,
> +                                        int nr)

unsigned int for nr, and I'm not really sure about the usefulness of
these helpers. In any case, inline should not be used; let the
compiler optimize this instead.

> +{
> +    return test_and_set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
> +}
> +
> +static inline int vvtd_test_and_clear_bit(struct vvtd *vvtd, uint32_t reg,
> +                                          int nr)
> +{
> +    return test_and_clear_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
> +}
> +
> +static inline int vvtd_test_bit(struct vvtd *vvtd, uint32_t reg, int nr)
> +{
> +    return test_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
> +}
> +
>  static inline void __vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
>  {
>      return __set_bit(nr, (uint32_t *)&vvtd->regs->data[reg]);
> @@ -232,6 +253,24 @@ static int vvtd_delivery(
>      return 0;
>  }
>  
> +void vvtd_generate_interrupt(struct vvtd *vvtd,

const.

> +                             uint32_t addr,
> +                             uint32_t data)
> +{
> +    uint8_t dest, dm, dlm, tm, vector;
> +
> +    VVTD_DEBUG(VVTD_DBG_FAULT, "Sending interrupt %x %x to d%d",
> +               addr, data, vvtd->domain->domain_id);
> +
> +    dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;

MASK_EXTR (here and below).
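
For reference, a sketch of what that buys. The macro is reproduced
from memory and the two MSI constants are restated locally so the
snippet stands alone; treat both as assumptions and check them against
the tree. The helper names are made up.

```c
#include <stdint.h>

/* Reproduced from memory; the in-tree definition is authoritative. */
#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))

/* Local copies for the sketch: destination ID in address bits 19:12. */
#define MSI_ADDR_DEST_ID_SHIFT 12
#define MSI_ADDR_DEST_ID_MASK  0x000ff000u

/* Open-coded mask and shift, as in the patch. */
static uint8_t dest_open_coded(uint32_t addr)
{
    return (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
}

/* Same result, but only the mask has to be named. */
static uint8_t dest_mask_extr(uint32_t addr)
{
    return MASK_EXTR(addr, MSI_ADDR_DEST_ID_MASK);
}
```

Dividing by the lowest set bit of the mask is equivalent to shifting
right by the position of that bit, so the shift cannot get out of sync
with the mask.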

> +    dm = !!(addr & MSI_ADDR_DESTMODE_MASK);
> +    dlm = (data & MSI_DATA_DELIVERY_MODE_MASK) >> MSI_DATA_DELIVERY_MODE_SHIFT;
> +    tm = (data & MSI_DATA_TRIGGER_MASK) >> MSI_DATA_TRIGGER_SHIFT;
> +    vector = data & MSI_DATA_VECTOR_MASK;
> +
> +    vvtd_delivery(vvtd->domain, vector, dest, dm, dlm, tm);
> +}
> +
>  static uint32_t irq_remapping_request_index(struct irq_remapping_request *irq)
>  {
>      if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
> @@ -260,11 +299,189 @@ static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
>      return DMA_IRTA_EIME(irta) ? dest : MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
>  }
>  
> +static void vvtd_report_non_recoverable_fault(struct vvtd *vvtd, int reason)
> +{
> +    uint32_t fsts;
> +
> +    ASSERT(reason & DMA_FSTS_FAULTS);
> +    fsts = vvtd_get_reg(vvtd, DMAR_FSTS_REG);
> +    __vvtd_set_bit(vvtd, DMAR_FSTS_REG, reason);

I don't understand this, is reason a bit position or a mask?

DMA_FSTS_FAULTS seems to be a mask, that should be set into DMAR_FSTS_REG?
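
To spell out the concern with the numbers from this patch: the ASSERT
treats reason as a mask, while __vvtd_set_bit() and the callers
further down (which pass DMA_FSTS_PFO_BIT and DMA_FSTS_PPF_BIT) treat
it as a bit number. A standalone sketch, with the defines restated
from the hunk above and assert_condition() being a made-up name:

```c
#define DMA_FSTS_PFO_BIT 0
#define DMA_FSTS_PFO (1U << DMA_FSTS_PFO_BIT)
#define DMA_FSTS_PPF_BIT 1
#define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_BIT)
#define DMA_FSTS_AFO (1U << 2)
#define DMA_FSTS_APF (1U << 3)
#define DMA_FSTS_IQE (1U << 4)
#define DMA_FSTS_ICE (1U << 5)
#define DMA_FSTS_ITE (1U << 6)
#define DMA_FSTS_PRO_BIT 7
#define DMA_FSTS_PRO (1U << DMA_FSTS_PRO_BIT)
#define DMA_FSTS_FAULTS (DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | \
                         DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | \
                         DMA_FSTS_ITE | DMA_FSTS_PRO)

/* The condition checked by ASSERT(reason & DMA_FSTS_FAULTS). */
static int assert_condition(int reason)
{
    return (reason & DMA_FSTS_FAULTS) != 0;
}
```

Passing the bit number DMA_FSTS_PFO_BIT (0) makes the condition false,
and DMA_FSTS_PPF_BIT (1) only passes because the value 1 happens to
coincide with the PFO mask.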

> +
> +    /*
> +     * Accoroding to VT-d spec "Non-Recoverable Fault Event" chapter, if
> +     * there are any previously reported interrupt conditions that are yet to
> +     * be sevices by software, the Fault Event interrrupt is not generated.
> +     */
> +    if ( fsts & DMA_FSTS_FAULTS )
> +        return;
> +
> +    __vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
> +    if ( !vvtd_test_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_BIT) )
> +    {
> +        uint32_t fe_data, fe_addr;

Newline.

> +        fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
> +        fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);
> +        vvtd_generate_interrupt(vvtd, fe_addr, fe_data);
> +        __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
> +    }
> +}
> +
> +static void vvtd_recomputing_ppf(struct vvtd *vvtd)

recompute, or maybe better update?

> +{
> +    int i;

unsigned int.

> +
> +    for ( i = 0; i < DMA_FRCD_REG_NR; i++ )
> +    {
> +        if ( vvtd_test_bit(vvtd, DMA_FRCD(i, DMA_FRCD3_OFFSET),
> +                           DMA_FRCD_F_BIT) )
> +        {
> +            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PPF_BIT);
> +            return;
> +        }
> +    }
> +    /*
> +     * No Primary Fault is in Fault Record Registers, thus clear PPF bit in
> +     * FSTS.
> +     */
> +    __vvtd_clear_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PPF_BIT);
> +
> +    /* If no fault is in FSTS, clear pending bit in FECTL. */
> +    if ( !(vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS) )
> +        __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
> +}
> +
> +/*
> + * Commit a frcd to emulated Fault Record Registers.

frcd is not really helpful here, you should expand this into something
that's readable.

> + */
> +static void vvtd_commit_frcd(struct vvtd *vvtd, int idx,
> +                             struct vtd_fault_record_register *frcd)
> +{
> +    vvtd_set_reg_quad(vvtd, DMA_FRCD(idx, DMA_FRCD0_OFFSET), frcd->bits.lo);
> +    vvtd_set_reg_quad(vvtd, DMA_FRCD(idx, DMA_FRCD2_OFFSET), frcd->bits.hi);
> +    vvtd_recomputing_ppf(vvtd);
> +}
> +
> +/*
> + * Allocate a FRCD for the caller. If success, return the FRI. Or, return -1
> + * when failure.
> + */
> +static int vvtd_alloc_frcd(struct vvtd *vvtd)
> +{
> +    int prev;
> +
> +    /* Set the F bit to indicate the FRCD is in use. */
> +    if ( vvtd_test_and_set_bit(vvtd, DMA_FRCD(vvtd->frcd_idx, DMA_FRCD3_OFFSET),
> +                               DMA_FRCD_F_BIT) )

Shouldn't this be !vvtd_test_and_set_bit?
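
The reasoning, as a sketch: test_and_set_bit() returns the previous
value of the bit, so a nonzero return means the record was already in
use. The plain C stand-in below is not atomic, and the names tas_bit()
and try_claim() are made up; it only models the return-value
convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Non-atomic stand-in: set bit nr and return its previous value. */
static int tas_bit(unsigned int nr, uint32_t *word)
{
    uint32_t mask = UINT32_C(1) << nr;
    int old = (*word & mask) != 0;

    *word |= mask;
    return old;
}

/* Claim a record by setting its in-use bit; true only if it was free. */
static bool try_claim(uint32_t *record, unsigned int in_use_bit)
{
    return !tas_bit(in_use_bit, record);
}
```

As posted, the success branch is the one taken when the F bit was
already set, which looks inverted.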

> +    {
> +        prev = vvtd->frcd_idx;
> +        vvtd->frcd_idx = (prev + 1) % DMA_FRCD_REG_NR;
> +        return vvtd->frcd_idx;
> +    }
> +    return -1;

-ENOMEM?

AFAICT this happens when you cannot find a free register?

> +}
> +
> +static void vvtd_free_frcd(struct vvtd *vvtd, int i)
> +{
> +    __vvtd_clear_bit(vvtd, DMA_FRCD(i, DMA_FRCD3_OFFSET), DMA_FRCD_F_BIT);
> +}
> +
>  static int vvtd_record_fault(struct vvtd *vvtd,
> -                             struct irq_remapping_request *irq,
> +                             struct irq_remapping_request *request,
>                               int reason)
>  {
> -    return 0;
> +    struct vtd_fault_record_register frcd;
> +    int frcd_idx;
> +
> +    switch(reason)
> +    {
> +    case VTD_FR_IR_REQ_RSVD:
> +    case VTD_FR_IR_INDEX_OVER:
> +    case VTD_FR_IR_ENTRY_P:
> +    case VTD_FR_IR_ROOT_INVAL:
> +    case VTD_FR_IR_IRTE_RSVD:
> +    case VTD_FR_IR_REQ_COMPAT:
> +    case VTD_FR_IR_SID_ERR:
> +        if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PFO_BIT) )
> +            return X86EMUL_OKAY;
> +
> +        /* No available Fault Record means Fault overflowed */
> +        frcd_idx = vvtd_alloc_frcd(vvtd);
> +        if ( frcd_idx == -1 )
> +        {
> +            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PFO_BIT);
> +            return X86EMUL_OKAY;
> +        }
> +        memset(&frcd, 0, sizeof(frcd));
> +        frcd.fields.FR = (u8)reason;
> +        frcd.fields.FI = ((u64)irq_remapping_request_index(request)) << 36;
> +        frcd.fields.SID = (u16)request->source_id;
> +        frcd.fields.F = 1;
> +        vvtd_commit_frcd(vvtd, frcd_idx, &frcd);
> +        return X86EMUL_OKAY;
> +
> +    default:

Other reasons are just ignored? Should this have an ASSERT_UNREACHABLE
maybe?

> +        break;
> +    }
> +
> +    gdprintk(XENLOG_ERR, "Can't handle vVTD Fault (reason 0x%x).", reason);
> +    domain_crash(vvtd->domain);

Oh, I see. Is it expected that such faults with unhandled reasons can
be somehow generated by the domain itself?

> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
> +{
> +    /* Writing a 1 means clear fault */
> +    if ( val & DMA_FRCD_F )
> +    {
> +        vvtd_free_frcd(vvtd, 0);
> +        vvtd_recomputing_ppf(vvtd);
> +    }
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
> +{
> +    /*
> +     * Only DMA_FECTL_IM bit is writable. Generate pending event when unmask.
> +     */
> +    if ( !(val & DMA_FECTL_IM) )
> +    {
> +        /* Clear IM */
> +        __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_BIT);
> +        if ( vvtd_test_and_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT) )
> +        {
> +            uint32_t fe_data, fe_addr;

Newline.

> +            fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
> +            fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);
> +            vvtd_generate_interrupt(vvtd, fe_addr, fe_data);
> +        }
> +    }
> +    else
> +        __vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_BIT);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
> +{
> +    int i, max_fault_index = DMA_FSTS_PRO_BIT;

unsigned int.

> +    uint64_t bits_to_clear = val & DMA_FSTS_RW1CS;
> +
> +    i = find_first_bit(&bits_to_clear, max_fault_index / 8 + 1);
> +    while ( i <= max_fault_index )

Shouldn't you also check that bits_to_clear is not 0? And I don't
remember, but is the return of find_first_bit based on 0 or 1 (i.e. is
bit 0 reported as 0 or 1)?
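
For what it's worth, the usual contract of find_first_bit() in
Linux-derived bitops is that the second argument is a size in bits,
the result is a 0-based bit index, and the size itself is returned
when no bit is set. Assuming Xen's version follows the same contract
(that needs checking), the stand-in below models it; first_set_bit()
is a made-up name.

```c
#include <stdint.h>

/*
 * Portable stand-in for the assumed contract: scan the low 'size' bits
 * of *word and return the 0-based index of the first set bit, or 'size'
 * if none of them is set.
 */
static unsigned int first_set_bit(const uint64_t *word, unsigned int size)
{
    unsigned int i;

    for ( i = 0; i < size && i < 64; i++ )
        if ( *word & (UINT64_C(1) << i) )
            return i;

    return size;
}
```

Under that contract an empty mask yields the size value, which the
loop condition has to exclude, and a size of max_fault_index / 8 + 1
(that is, 1) would mean only bit 0 is examined.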

Roger.


* Re: [PATCH V2 24/25] x86/vvtd: Add queued invalidation (QI) support
  2017-08-09 20:34 ` [PATCH V2 24/25] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
@ 2017-08-23 12:16   ` Roger Pau Monné
  2017-08-24  6:31     ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 12:16 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: kevin.tian, Chao Gao, julien.grall, xen-devel

On Wed, Aug 09, 2017 at 04:34:25PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Queued Invalidation Interface is an expanded invalidation interface with
> extended capabilities. Hardware implementations report support for queued
> invalidation interface through the Extended Capability Register. The queued
> invalidation interface uses an Invalidation Queue (IQ), which is a circular
> buffer in system memory. Software submits commands by writing Invalidation
> Descriptors to the IQ.
> 
> In this patch, a new function viommu_process_iq() is used for emulating how
> hardware handles invalidation requests through QI.

It seems like this is an extended feature, which is not needed for
basic functionality. Would it be possible to have this series focus on
the bare-minimum functionality, leaving everything else to a separate
series?

> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  29 ++++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 244 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 272 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index a9e905b..eac0fbe 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -204,6 +204,32 @@
>  #define DMA_IRTA_S(val)         (val & 0xf)
>  #define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))
>  
> +/* IQH_REG */
> +#define DMA_IQH_QH_SHIFT        4
> +#define DMA_IQH_QH(val)         ((val >> 4) & 0x7fffULL)

Missing parentheses around val (here and below).

> +
> +/* IQT_REG */
> +#define DMA_IQT_QT_SHIFT        4
> +#define DMA_IQT_QT(val)         ((val >> 4) & 0x7fffULL)
> +#define DMA_IQT_RSVD            0xfffffffffff80007ULL

~0x7fffULL?
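
One way to avoid hand-writing the constant is to derive it from the
field definition. A sketch, assuming (as the DMA_IQT_QT() accessor
above implies) that the tail index is the 15-bit field at bits 18:4;
DMA_IQT_QT_MASK, DMA_IQT_RSVD_DERIVED and iqt_sets_reserved() are
names invented here:

```c
#include <stdint.h>

#define DMA_IQT_QT_SHIFT      4
/* Hypothetical helper: the 15-bit tail index field, in place. */
#define DMA_IQT_QT_MASK       (UINT64_C(0x7fff) << DMA_IQT_QT_SHIFT)
/* Everything outside the field is treated as reserved. */
#define DMA_IQT_RSVD_DERIVED  (~DMA_IQT_QT_MASK)

/* Nonzero if a guest write touches bits outside the tail index field. */
static int iqt_sets_reserved(uint64_t val)
{
    return (val & DMA_IQT_RSVD_DERIVED) != 0;
}
```

The derived value is 0xfffffffffff8000f, which differs from the posted
constant in bit 3, so one of the two needs checking against the spec.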

> +/* IQA_REG */
> +#define DMA_MGAW                39  /* Maximum Guest Address Width */

I've got the feeling this is also in the CPUID info, in which case
shouldn't this match what's reported there?

Or is the IOMMU expected to report support for an address width
different from the processor's?

> +#define DMA_IQA_ADDR(val)       (val & ~0xfffULL)
> +#define DMA_IQA_QS(val)         (val & 0x7)
> +#define DMA_IQA_ENTRY_PER_PAGE  (1 << 8)
> +#define DMA_IQA_RSVD            (~((1ULL << DMA_MGAW) -1 ) | 0xff8ULL)

There seems to be a fair amount of magic constants here, shouldn't
those be added as defines?

> +
> +/* IECTL_REG */
> +#define DMA_IECTL_IM_BIT 31

_SHIFT.

> +#define DMA_IECTL_IM            (1 << DMA_IECTL_IM_BIT)
> +#define DMA_IECTL_IP_BIT 30
> +#define DMA_IECTL_IP (((u64)1) << DMA_IECTL_IP_BIT)
> +
> +/* ICS_REG */
> +#define DMA_ICS_IWC_BIT         0
> +#define DMA_ICS_IWC             (1 << DMA_ICS_IWC_BIT)
> +
>  /* PMEN_REG */
>  #define DMA_PMEN_EPM    (((u32)1) << 31)
>  #define DMA_PMEN_PRS    (((u32)1) << 0)
> @@ -238,7 +264,8 @@
>  #define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_BIT)
>  #define DMA_FSTS_AFO (1U << 2)
>  #define DMA_FSTS_APF (1U << 3)
> -#define DMA_FSTS_IQE (1U << 4)
> +#define DMA_FSTS_IQE_BIT 4
> +#define DMA_FSTS_IQE (1U << DMA_FSTS_IQE_BIT)
>  #define DMA_FSTS_ICE (1U << 5)
>  #define DMA_FSTS_ITE (1U << 6)
>  #define DMA_FSTS_PRO_BIT 7
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index f1e6d01..4f5e28e 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -428,6 +428,185 @@ static int vvtd_record_fault(struct vvtd *vvtd,
>      return X86EMUL_OKAY;
>  }
>  
> +/*
> + * Process a invalidation descriptor. Currently, only two types descriptors,
> + * Interrupt Entry Cache Invalidation Descritor and Invalidation Wait
> + * Descriptor are handled.
> + * @vvtd: the virtual vtd instance
> + * @i: the index of the invalidation descriptor to be processed
> + *
> + * If success return 0, or return -1 when failure.
> + */
> +static int process_iqe(struct vvtd *vvtd, int i)
> +{
> +    uint64_t iqa, addr;
> +    struct qinval_entry *qinval_page;
> +    void *pg;
> +    int ret;
> +
> +    vvtd_get_reg_quad(vvtd, DMAR_IQA_REG, iqa);
> +    ret = map_guest_page(vvtd->domain, DMA_IQA_ADDR(iqa)>>PAGE_SHIFT,
> +                         (void**)&qinval_page);
> +    if ( ret )
> +    {
> +        gdprintk(XENLOG_ERR, "Can't map guest IRT (rc %d)", ret);

VVTD_DEBUG?

> +        return -1;

return ret;

> +    }
> +
> +    switch ( qinval_page[i].q.inv_wait_dsc.lo.type )
> +    {
> +    case TYPE_INVAL_WAIT:
> +        if ( qinval_page[i].q.inv_wait_dsc.lo.sw )
> +        {
> +            addr = (qinval_page[i].q.inv_wait_dsc.hi.saddr << 2);
> +            ret = map_guest_page(vvtd->domain, addr >> PAGE_SHIFT, &pg);
> +            if ( ret )
> +            {
> +                gdprintk(XENLOG_ERR, "Can't map guest memory to inform guest "
> +                         "IWC completion (rc %d)", ret);
> +                goto error;
> +            }
> +            *(uint32_t *)((uint64_t)pg + (addr & ~PAGE_MASK)) =
                             ^ no need to cast AFAICT

> +                qinval_page[i].q.inv_wait_dsc.lo.sdata;
> +            unmap_guest_page(pg);

Since this is kind of a sporadic usage, maybe you could use
hvm_copy_to_guest_phys?

> +        }
> +
> +        /*
> +         * The following code generates an invalidation completion event
> +         * indicating the invalidation wait descriptor completion. Note that
> +         * the following code fragment is not tested properly.
> +         */
> +        if ( qinval_page[i].q.inv_wait_dsc.lo.iflag )
> +        {
> +            uint32_t ie_data, ie_addr;
> +            if ( !vvtd_test_and_set_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_BIT) )
> +            {
> +                __vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT);
> +                if ( !vvtd_test_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_BIT) )
> +                {
> +                    ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
> +                    ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
> +                    vvtd_generate_interrupt(vvtd, ie_addr, ie_data);
> +                    __vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT);
> +                }
> +            }
> +        }
> +        break;
> +
> +    case TYPE_INVAL_IEC:
> +        /*
> +         * Currently, no cache is preserved in hypervisor. Only need to update
> +         * pIRTEs which are modified in binding process.
> +         */
> +        break;
> +
> +    default:
> +        goto error;
> +    }
> +
> +    unmap_guest_page((void*)qinval_page);
> +    return 0;
> +
> + error:
> +    unmap_guest_page((void*)qinval_page);
> +    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
> +    domain_crash(vvtd->domain);

Is it really needed to crash the guest?

> +    return -1;

return ret;

> +}
> +
> +/*
> + * Invalidate all the descriptors in Invalidation Queue.
> + */
> +static void vvtd_process_iq(struct vvtd *vvtd)
> +{
> +    uint64_t iqh, iqt, iqa, max_entry, i;
> +    int ret = 0;
> +
> +    /*
> +     * No new descriptor is fetched from the Invalidation Queue until
> +     * software clears the IQE field in the Fault Status Register
> +     */
> +    if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_BIT) )
> +        return;
> +
> +    vvtd_get_reg_quad(vvtd, DMAR_IQH_REG, iqh);
> +    vvtd_get_reg_quad(vvtd, DMAR_IQT_REG, iqt);
> +    vvtd_get_reg_quad(vvtd, DMAR_IQA_REG, iqa);
> +
> +    max_entry = DMA_IQA_ENTRY_PER_PAGE << DMA_IQA_QS(iqa);
> +    iqh = DMA_IQH_QH(iqh);
> +    iqt = DMA_IQT_QT(iqt);
> +
> +    ASSERT(iqt < max_entry);

Is the guest allowed to write to DMAR_IQT_REG? Is there a chance it
can write a value that could make the ASSERT trigger?
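
If any guest-visible register write can make the condition false (for
example, if the queue size in IQA can change after a tail value was
accepted), an ASSERT is the wrong tool and the check should fail
gracefully instead. A sketch of such a check, with the macros restated
from this patch with parenthesized arguments and iq_tail_valid() being
a made-up name:

```c
#include <stdbool.h>
#include <stdint.h>

#define DMA_IQT_QT(val)         (((val) >> 4) & 0x7fffULL)
#define DMA_IQA_QS(val)         ((val) & 0x7)
#define DMA_IQA_ENTRY_PER_PAGE  (1 << 8)

/*
 * Check a guest-supplied tail against the queue size currently
 * programmed in IQA, returning false rather than asserting when it is
 * out of range.
 */
static bool iq_tail_valid(uint64_t iqt_reg, uint64_t iqa_reg)
{
    uint64_t max_entry = (uint64_t)DMA_IQA_ENTRY_PER_PAGE
                         << DMA_IQA_QS(iqa_reg);

    return DMA_IQT_QT(iqt_reg) < max_entry;
}
```

The caller could then skip processing (or raise a fault towards the
guest) when the check fails, instead of taking down a debug build of
the hypervisor.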

> +    if ( iqh == iqt )
> +        return;
> +
> +    i = iqh;
> +    while ( i != iqt )

This looks like it wants to be a for loop.

> +    {
> +        ret = process_iqe(vvtd, i);
> +        if ( ret )
> +            break;
> +        else

Unneeded else.

> +            i = (i + 1) % max_entry;
> +        vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, i << DMA_IQH_QH_SHIFT);

Can't you do this at the end of the loop instead of doing it on every
iteration?
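
Putting this and the two previous remarks together, the loop could
take roughly the following shape. This is a sketch with made-up names
(drain_queue(), process_fn, fail_at()), not a drop-in replacement: it
walks a circular queue from head to tail and reports the final head so
the caller writes the register once.

```c
#include <stdint.h>

/* Hypothetical per-entry handler: returns 0 on success, nonzero on error. */
typedef int (*process_fn)(uint64_t index, void *ctx);

/*
 * Walk entries [head, tail) of a circular queue with max_entry slots.
 * Stops at the first failing entry. *new_head receives the index of the
 * entry that failed, or tail if everything was processed, so the caller
 * can update the head register once after the loop.
 */
static int drain_queue(uint64_t head, uint64_t tail, uint64_t max_entry,
                       process_fn process, void *ctx, uint64_t *new_head)
{
    uint64_t i;
    int ret = 0;

    for ( i = head; i != tail; i = (i + 1) % max_entry )
    {
        ret = process(i, ctx);
        if ( ret )
            break;
    }

    *new_head = i;
    return ret;
}

/* Example handler for the sketch: fails on the index stored in *ctx. */
static int fail_at(uint64_t index, void *ctx)
{
    return index == *(const uint64_t *)ctx ? -1 : 0;
}
```

On error the head is left pointing at the failing descriptor, which
matches the comment below about IQH when IQE is set.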

> +    }
> +
> +    /*
> +     * When IQE set, IQH references the desriptor associated with the error.
> +     */
> +    if ( ret )
> +        vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_IQE_BIT);
> +}
> +
> +static int vvtd_write_iqt(struct vvtd *vvtd, unsigned long val)
> +{
> +    uint64_t iqa;
> +
> +    if ( val & DMA_IQT_RSVD )
> +    {
> +        VVTD_DEBUG(VVTD_DBG_RW, "Attempt to set reserved bits in "
> +                   "Invalidation Queue Tail.");

Please try not to split debug messages across separate lines; it
makes grepping for them harder.

> +        return X86EMUL_OKAY;
> +    }
> +
> +    vvtd_get_reg_quad(vvtd, DMAR_IQA_REG, iqa);
> +    if ( DMA_IQT_QT(val) >= DMA_IQA_ENTRY_PER_PAGE << DMA_IQA_QS(iqa) )
> +    {
> +        VVTD_DEBUG(VVTD_DBG_RW, "IQT: Value %lx exceeded supported max "
> +                   "index.", val);

Same here.

> +        return X86EMUL_OKAY;
> +    }
> +
> +    vvtd_set_reg_quad(vvtd, DMAR_IQT_REG, val);
> +    vvtd_process_iq(vvtd);

Newline.

> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_iqa(struct vvtd *vvtd, unsigned long val)
> +{
> +    if ( val & DMA_IQA_RSVD )
> +    {
> +        VVTD_DEBUG(VVTD_DBG_RW, "Attempt to set reserved bits in "
> +                   "Invalidation Queue Address.");
> +        return X86EMUL_OKAY;
> +    }
> +
> +    vvtd_set_reg_quad(vvtd, DMAR_IQA_REG, val);
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_ics(struct vvtd *vvtd, uint32_t val)
> +{
> +    if ( val & DMA_ICS_IWC )
> +    {
> +        __vvtd_clear_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_BIT);
> +        /*When IWC field is cleared, the IP field needs to be cleared */
> +        __vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT);
> +    }
> +    return X86EMUL_OKAY;
> +}
> +
>  static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
>  {
>      /* Writing a 1 means clear fault */
> @@ -439,6 +618,29 @@ static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
>      return X86EMUL_OKAY;
>  }
>  
> +static int vvtd_write_iectl(struct vvtd *vvtd, uint32_t val)
> +{
> +    /*
> +     * Only DMA_IECTL_IM bit is writable. Generate pending event when unmask.
> +     */
> +    if ( !(val & DMA_IECTL_IM) )
> +    {
> +        /* Clear IM and clear IP */
> +        __vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_BIT);
> +        if ( vvtd_test_and_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_BIT) )
> +        {
> +            uint32_t ie_data, ie_addr;

Newline.

> +            ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
> +            ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
> +            vvtd_generate_interrupt(vvtd, ie_addr, ie_data);
> +        }
> +    }
> +    else
> +        __vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_BIT);
> +
> +    return X86EMUL_OKAY;
> +}
> +
>  static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
>  {
>      /*
> @@ -481,6 +683,10 @@ static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
>      if ( !((vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS)) )
>          __vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_BIT);
>  
> +    /* Continue to deal invalidation when IQE is clear */
> +    if ( !vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_BIT) )
> +        vvtd_process_iq(vvtd);
> +
>      return X86EMUL_OKAY;
>  }
>  
> @@ -639,6 +845,36 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>              ret = vvtd_write_frcd3(vvtd, val);
>              break;
>  
> +        case DMAR_IECTL_REG:
> +            ret = vvtd_write_iectl(vvtd, val);
> +            break;
> +
> +        case DMAR_ICS_REG:
> +            ret = vvtd_write_ics(vvtd, val);
> +            break;
> +
> +        case DMAR_IQT_REG:
> +            ret = vvtd_write_iqt(vvtd, (uint32_t)val);
> +            break;
> +
> +        case DMAR_IQA_REG:
> +        {
> +            uint32_t iqa_hi;
> +
> +            iqa_hi = vvtd_get_reg(vvtd, DMAR_IQA_REG_HI);
> +            ret = vvtd_write_iqa(vvtd, (uint32_t)val | ((uint64_t)iqa_hi << 32));

Line length.

> +            break;
> +        }
> +
> +        case DMAR_IQA_REG_HI:
> +        {
> +            uint32_t iqa_lo;
> +
> +            iqa_lo = vvtd_get_reg(vvtd, DMAR_IQA_REG);
> +            ret = vvtd_write_iqa(vvtd, (val << 32) | iqa_lo);
> +            break;
> +        }
> +
>          case DMAR_IEDATA_REG:
>          case DMAR_IEADDR_REG:
>          case DMAR_IEUADDR_REG:
> @@ -668,6 +904,14 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>              ret = vvtd_write_frcd3(vvtd, val >> 32);
>              break;
>  
> +        case DMAR_IQT_REG:
> +            ret = vvtd_write_iqt(vvtd, val);
> +            break;
> +
> +        case DMAR_IQA_REG:
> +            ret = vvtd_write_iqa(vvtd, val);
> +            break;
> +
>          default:
>              ret = X86EMUL_UNHANDLEABLE;
>              break;
> -- 
> 1.8.3.1
> 
> 

* Re: [PATCH V2 22/25] x86/vmsi: Hook delivering remapping format msi to guest
  2017-08-23 10:55   ` Roger Pau Monné
@ 2017-08-23 12:17     ` Jan Beulich
  0 siblings, 0 replies; 136+ messages in thread
From: Jan Beulich @ 2017-08-23 12:17 UTC (permalink / raw)
  To: Roger Pau Monné, Lan Tianyu
  Cc: andrew.cooper3, julien.grall, xen-devel, kevin.tian, Chao Gao

>>> On 23.08.17 at 12:55, <roger.pau@citrix.com> wrote:
> On Wed, Aug 09, 2017 at 04:34:23PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> In two situations, hypervisor delivers a msi to a hvm guest. One is
>> when qemu sends a request to hypervisor through XEN_DMOP_inject_msi.
>> The other is when a physical interrupt arrives and it has been bound
>> to a guest msi.
>> 
>> For the former, the msi is routed to common vIOMMU layer if it is in
>> remapping format. For the latter, if the pt irq is bound to a guest
>> remapping msi, a new remapping msi is constructed based on the binding
>> information and routed to common vIOMMU layer.
> 
> After looking at the code below, I'm wondering whether it would make
> sense to add a new flag that's HVM_IRQ_DPCI_GUEST_REMAPPED or similar,
> so that you would use:
> 
> HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_GUEST_REMAPPED
> 
> In order to designate a remapped MSI. It seems like it would avoid
> some of the changes below (where you are just adding
> HVM_IRQ_DPCI_GUEST_MSI_IR to code paths already used by
> HVM_IRQ_DPCI_GUEST_MSI). More of a suggestion rather than a request
> for you to change the code.

I think this is a pretty good suggestion.

Jan



* Re: [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d
  2017-08-09 20:34 ` [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d Lan Tianyu
@ 2017-08-23 12:19   ` Roger Pau Monné
  2017-08-25  6:35     ` Chao Gao
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 12:19 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: kevin.tian, Chao Gao, julien.grall, xen-devel

On Wed, Aug 09, 2017 at 04:34:26PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Wrap some useful status in a new structure hvm_hw_vvtd, following
> the customs of vlapic, vioapic and etc. Provide two save-restore
> pairs to save/restore registers and non-register status.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/vvtd.c     | 98 ++++++++++++++++++++++------------
>  xen/include/public/arch-x86/hvm/save.h | 24 ++++++++-
>  2 files changed, 88 insertions(+), 34 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 4f5e28e..dd6be83 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -20,6 +20,7 @@
>  
>  #include <xen/domain_page.h>
>  #include <xen/lib.h>
> +#include <xen/hvm/save.h>
>  #include <xen/sched.h>
>  #include <xen/types.h>
>  #include <xen/viommu.h>
> @@ -32,39 +33,26 @@
>  #include <asm/page.h>
>  #include <asm/p2m.h>
>  #include <asm/system.h>
> +#include <public/hvm/save.h>
>  
>  #include "iommu.h"
>  #include "vtd.h"
>  
> -struct hvm_hw_vvtd_regs {
> -    uint8_t data[1024];
> -};
> -
>  /* Status field of struct vvtd */
>  #define VIOMMU_STATUS_DEFAULT                   (0)
>  #define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED     (1 << 0)
>  #define VIOMMU_STATUS_DMA_REMAPPING_ENABLED     (1 << 1)
>  
>  #define vvtd_irq_remapping_enabled(vvtd) \
> -    (vvtd->status & VIOMMU_STATUS_IRQ_REMAPPING_ENABLED)
> +    (vvtd->hw.status & VIOMMU_STATUS_IRQ_REMAPPING_ENABLED)
>  
>  struct vvtd {
> -    /* VIOMMU_STATUS_XXX */
> -    int status;
> -    /* Fault Recording index */
> -    int frcd_idx;
>      /* Address range of remapping hardware register-set */
>      uint64_t base_addr;
>      uint64_t length;
>      /* Point back to the owner domain */
>      struct domain *domain;
> -    /* Is in Extended Interrupt Mode? */
> -    bool eim;
> -    /* Max remapping entries in IRT */
> -    int irt_max_entry;
> -    /* Interrupt remapping table base gfn */
> -    uint64_t irt;
> -
> +    struct hvm_hw_vvtd hw;

This should have been done in the first patch IMHO, rather than moving
things around now. Define hvm_hw_vvtd directly there instead of
introducing it now.

>      struct hvm_hw_vvtd_regs *regs;
>      struct page_info *regs_page;
>  };
> @@ -370,12 +358,12 @@ static int vvtd_alloc_frcd(struct vvtd *vvtd)
>      int prev;
>  
>      /* Set the F bit to indicate the FRCD is in use. */
> -    if ( vvtd_test_and_set_bit(vvtd, DMA_FRCD(vvtd->frcd_idx, DMA_FRCD3_OFFSET),
> +    if ( vvtd_test_and_set_bit(vvtd, DMA_FRCD(vvtd->hw.frcd_idx, DMA_FRCD3_OFFSET),
>                                 DMA_FRCD_F_BIT) )
>      {
> -        prev = vvtd->frcd_idx;
> -        vvtd->frcd_idx = (prev + 1) % DMA_FRCD_REG_NR;
> -        return vvtd->frcd_idx;
> +        prev = vvtd->hw.frcd_idx;
> +        vvtd->hw.frcd_idx = (prev + 1) % DMA_FRCD_REG_NR;
> +        return vvtd->hw.frcd_idx;
>      }
>      return -1;
>  }
> @@ -712,12 +700,12 @@ static int vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
>  
>      if ( val & DMA_GCMD_IRE )
>      {
> -        vvtd->status |= VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
> +        vvtd->hw.status |= VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
>          __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
>      }
>      else
>      {
> -        vvtd->status |= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
> +        vvtd->hw.status |= ~VIOMMU_STATUS_IRQ_REMAPPING_ENABLED;
>          __vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_BIT);
>      }
>  
> @@ -736,11 +724,11 @@ static int vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
>                     "active." );
>  
>      vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG, irta);
> -    vvtd->irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
> -    vvtd->irt_max_entry = DMA_IRTA_SIZE(irta);
> -    vvtd->eim = DMA_IRTA_EIME(irta);
> +    vvtd->hw.irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
> +    vvtd->hw.irt_max_entry = DMA_IRTA_SIZE(irta);
> +    vvtd->hw.eim = DMA_IRTA_EIME(irta);
>      VVTD_DEBUG(VVTD_DBG_RW, "Update IR info (addr=%lx eim=%d size=%d).",
> -               vvtd->irt, vvtd->eim, vvtd->irt_max_entry);
> +               vvtd->hw.irt, vvtd->hw.eim, vvtd->hw.irt_max_entry);
>      __vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_SIRTPS_BIT);
>  
>      return X86EMUL_OKAY;
> @@ -947,13 +935,13 @@ static int vvtd_get_entry(struct vvtd *vvtd,
>  
>      VVTD_DEBUG(VVTD_DBG_TRANS, "interpret a request with index %x", entry);
>  
> -    if ( entry > vvtd->irt_max_entry )
> +    if ( entry > vvtd->hw.irt_max_entry )
>      {
>          ret = VTD_FR_IR_INDEX_OVER;
>          goto handle_fault;
>      }
>  
> -    ret = map_guest_page(vvtd->domain, vvtd->irt + (entry >> IREMAP_ENTRY_ORDER),
> +    ret = map_guest_page(vvtd->domain, vvtd->hw.irt + (entry >> IREMAP_ENTRY_ORDER),

Line length.

>                           (void**)&irt_page);
>      if ( ret )
>      {
> @@ -1077,6 +1065,49 @@ static int vvtd_get_irq_info(struct domain *d,
>      return 0;
>  }
>  
> +static int vvtd_load_regs(struct domain *d, hvm_domain_context_t *h)
> +{
> +    if ( !domain_vvtd(d) )
> +        return -ENODEV;
> +
> +    if ( hvm_load_entry(IOMMU_REGS, h, domain_vvtd(d)->regs) )
> +        return -EINVAL;
> +
> +    return 0;
> +}
> +
> +static int vvtd_save_regs(struct domain *d, hvm_domain_context_t *h)
> +{
> +    if ( !domain_vvtd(d) )
> +        return 0;
> +
> +    return hvm_save_entry(IOMMU_REGS, 0, h, domain_vvtd(d)->regs);
> +}
> +
> +static int vvtd_load_hidden(struct domain *d, hvm_domain_context_t *h)
> +{
> +    if ( !domain_vvtd(d) )
> +        return -ENODEV;
> +
> +    if ( hvm_load_entry(IOMMU, h, &domain_vvtd(d)->hw) )
> +        return -EINVAL;
> +
> +    return 0;
> +}
> +
> +static int vvtd_save_hidden(struct domain *d, hvm_domain_context_t *h)
> +{
> +    if ( !domain_vvtd(d) )
> +        return 0;
> +
> +    return hvm_save_entry(IOMMU, 0, h, &domain_vvtd(d)->hw);
> +}
> +
> +HVM_REGISTER_SAVE_RESTORE(IOMMU, vvtd_save_hidden, vvtd_load_hidden,
> +                          1, HVMSR_PER_DOM);
> +HVM_REGISTER_SAVE_RESTORE(IOMMU_REGS, vvtd_save_regs, vvtd_load_regs,
> +                          1, HVMSR_PER_DOM);
> +
>  static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
>  {
>      uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
> @@ -1122,12 +1153,13 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
>      vvtd->base_addr = viommu->base_address;
>      vvtd->length = viommu->length;
>      vvtd->domain = d;
> -    vvtd->status = VIOMMU_STATUS_DEFAULT;
> -    vvtd->eim = 0;
> -    vvtd->irt = 0;
> -    vvtd->irt_max_entry = 0;
> -    vvtd->frcd_idx = 0;
> +    vvtd->hw.status = VIOMMU_STATUS_DEFAULT;
> +    vvtd->hw.eim = 0;
> +    vvtd->hw.irt = 0;
> +    vvtd->hw.irt_max_entry = 0;
> +    vvtd->hw.frcd_idx = 0;
>      register_mmio_handler(d, &vvtd_mmio_ops);
> +    viommu->priv = (void *)vvtd;
>      return 0;
>  
>   out2:
> diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
> index fd7bf3f..10536cb 100644
> --- a/xen/include/public/arch-x86/hvm/save.h
> +++ b/xen/include/public/arch-x86/hvm/save.h
> @@ -639,10 +639,32 @@ struct hvm_msr {
>  
>  #define CPU_MSR_CODE  20
>  
> +struct hvm_hw_vvtd_regs {
> +    uint8_t data[1024];
> +};
> +
> +DECLARE_HVM_SAVE_TYPE(IOMMU_REGS, 21, struct hvm_hw_vvtd_regs);
> +
> +struct hvm_hw_vvtd
> +{
> +    /* VIOMMU_STATUS_XXX */
> +    uint32_t status;
> +    /* Fault Recording index */
> +    uint32_t frcd_idx;
> +    /* Is in Extended Interrupt Mode? */
> +    uint32_t eim;
> +    /* Max remapping entries in IRT */
> +    uint32_t irt_max_entry;
> +    /* Interrupt remapping table base gfn */
> +    uint64_t irt;
> +};
> +
> +DECLARE_HVM_SAVE_TYPE(IOMMU, 22, struct hvm_hw_vvtd);

Why two separate structures? It should be the same structure.
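
A rough sketch of what a single record could look like, assuming the
register block is copied into the record at save time (in the patch, regs
is a pointer to a separately mapped page). The struct and helper names are
hypothetical:

```c
#include <stdint.h>
#include <string.h>

#define VVTD_REG_BYTES 1024

/* Hypothetical merged save record: hidden state plus the register block. */
struct hvm_hw_vvtd_merged {
    uint32_t status;         /* VIOMMU_STATUS_XXX */
    uint32_t frcd_idx;       /* Fault Recording index */
    uint32_t eim;            /* Extended Interrupt Mode? */
    uint32_t irt_max_entry;  /* Max remapping entries in IRT */
    uint64_t irt;            /* Interrupt remapping table base gfn */
    uint8_t  regs[VVTD_REG_BYTES];
};

/* Copy the emulated register page into the record before saving it. */
static void vvtd_merged_set_regs(struct hvm_hw_vvtd_merged *rec,
                                 const uint8_t *regs)
{
    memcpy(rec->regs, regs, VVTD_REG_BYTES);
}
```

A layout like this would need only one save type code and one
save/restore pair instead of two.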

Roger.


* Re: [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc
  2017-08-23  7:36     ` Lan Tianyu
@ 2017-08-23 13:53       ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 13:53 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, chao.gao

On Wed, Aug 23, 2017 at 03:36:19PM +0800, Lan Tianyu wrote:
> On 2017年08月22日 23:55, Roger Pau Monné wrote:
> > On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
> >> This patch is to add Xen virtual IOMMU doc to introduce motivation,
> >> framework, vIOMMU hypercall and xl configuration.
> >>
> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> >> ---
> >>  docs/misc/viommu.txt | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 139 insertions(+)
> >>  create mode 100644 docs/misc/viommu.txt
> >>
> >> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
> >> new file mode 100644
> >> index 0000000..39455bb
> >> --- /dev/null
> >> +++ b/docs/misc/viommu.txt
> >> +
> >> +To support >255 vcpus, X2APIC mode in guest is necessary because legacy
> >> +APIC(XAPIC) just supports 8-bit APIC ID and it only can support 255
> >> +vcpus at most. X2APIC mode supports 32-bit APIC ID and it requires
> >> +interrupt mapping function of vIOMMU.
> > 
> > Correct me if I'm wrong, but I don't think x2APIC requires vIOMMU. The
> > IOMMU is required so that you can route interrupts to all the possible
> > CPUs. One could imagine a setup where only CPUs with APIC IDs < 255 are
> > used as targets of external interrupts, and that doesn't require an
> > IOMMU.
> 
> This is OS behavior. IIRC, Windows strictly requires an IOMMU when
> enabling x2APIC mode, and the Linux kernel only has such a requirement
> when the number of CPUs is > 255.

But this document doesn't speak about OSes, it speaks about the IOMMU
implementation. What I think is wrong is the following sentence:

"x2APIC mode supports 32-bit APIC ID and it requires interrupt mapping
function of vIOMMU."

IMHO it should be:

"x2APIC mode supports 32-bit APIC ID and it requires the interrupt
remapping functionality of a vIOMMU if the guest wishes to route
interrupts to all available vCPUs."

> > 
> > Also, why do you need the x2apic parameter? Is there any value in
> > providing a vIOMMU if it doesn't support x2APIC mode?
> 
> The user can configure whether the vIOMMU supports x2APIC mode, and the
> tool stack will use this configuration to prepare the ACPI DMAR table.
> There is an X2APIC_OPT_OUT bit in the DMAR table to tell the OS not to
> enable x2APIC mode for the IOMMU.

Let me rephrase my question, what's the value in implementing the
xAPIC support for vIOMMU?

The vIOMMU work is done so that Xen can create guests with > 128 vCPUs
(> 255 APIC IDs), at which point you _must_ use x2APIC mode. Is there
any value in providing a vIOMMU implementation that doesn't support
x2APIC?
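
To make the numbers concrete, a small sketch. It assumes the guest APIC ID
is twice the vCPU ID (which is what the "> 128 vCPUs (> 255 APIC IDs)"
figure above implies) and that an interrupt delivered without remapping
carries only an 8-bit destination, with 0xff unusable as a unicast target.
The helper names are made up:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: APIC ID = 2 * vCPU ID for HVM guests. */
static uint32_t vcpu_to_apic_id(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/*
 * Without interrupt remapping the destination field is 8 bits wide and
 * 0xff is the broadcast ID, so only APIC IDs below 255 can be targeted.
 */
static bool reachable_without_remapping(uint32_t apic_id)
{
    return apic_id < 0xff;
}
```

Under these assumptions vCPU 127 (APIC ID 254) is still reachable, while
vCPU 128 (APIC ID 256) needs interrupt remapping to be a target.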

Roger.


* Re: [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction
  2017-08-23  8:02     ` Lan Tianyu
@ 2017-08-23 13:53       ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 13:53 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Wed, Aug 23, 2017 at 04:02:40PM +0800, Lan Tianyu wrote:
> On 2017年08月23日 15:45, Roger Pau Monné wrote:
> > On Wed, Aug 09, 2017 at 04:34:11PM -0400, Lan Tianyu wrote:
> >> From: Chao Gao <chao.gao@intel.com>
> >>
> >> If guest is configured to have a vIOMMU, create it during domain construction.
> >>
> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> >> ---
> >>  tools/libxl/libxl_x86.c | 28 ++++++++++++++++++++++++++++
> >>  1 file changed, 28 insertions(+)
> >>
> >> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> >> index 455f6f0..ace20e5 100644
> >> --- a/tools/libxl/libxl_x86.c
> >> +++ b/tools/libxl/libxl_x86.c
> >> @@ -341,8 +341,36 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
> >>      if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM) {
> > 
> > I would rather change this check so it's:
> > 
> > d_config->b_info.type != LIBXL_DOMAIN_TYPE_PV
> > 
> > Is there any reason why PVH guests shouldn't get a vIOMMU?
> 
> No, but we currently only support vIOMMU for HVM guests and don't know
> how a PVH guest would enumerate the vIOMMU without an ACPI DMAR table.

PVH guests have ACPI tables, you just need to add a DMAR table, like
you are doing for HVM guests.

Roger.


* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-23 10:19         ` Julien Grall
@ 2017-08-23 14:05           ` Roger Pau Monné
  2017-08-24 14:03             ` Julien Grall
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 14:05 UTC (permalink / raw)
  To: Julien Grall
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich, chao.gao

On Wed, Aug 23, 2017 at 11:19:01AM +0100, Julien Grall wrote:
> Hi Roger,
> 
> On 23/08/17 08:22, Roger Pau Monné wrote:
> > On Wed, Aug 23, 2017 at 02:06:17PM +0800, Lan Tianyu wrote:
> > > Hi Roger:
> > > 	Thanks for your review.
> > > 
> > > On 2017年08月22日 22:32, Roger Pau Monné wrote:
> > > > On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
> > > > > +
> > > > > +/* vIOMMU capabilities */
> > > > > +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> > > > > +
> > > > > +struct xen_domctl_viommu_op {
> > > > > +    uint32_t cmd;
> > > > > +#define XEN_DOMCTL_create_viommu          0
> > > > > +#define XEN_DOMCTL_destroy_viommu         1
> > > > > +#define XEN_DOMCTL_query_viommu_caps      2
> > > > > +    union {
> > > > > +        struct {
> > > > > +            /* IN - vIOMMU type */
> > > > > +            uint64_t viommu_type;
> > > > > +            /*
> > > > > +             * IN - MMIO base address of vIOMMU. vIOMMU device models
> > > > > +             * are in charge of to check base_address and length.
> > > > > +             */
> > > > > +            uint64_t base_address;
> > > > > +            /* IN - Length of MMIO region */
> > > > > +            uint64_t length;
> > > > 
> > > > It seems weird that you can specify the length, is that something
> > > > that a user would like to set? Isn't the length of the IOMMU MMIO
> > > > region fixed by the hardware spec?
> > > 
> > > Different vendors may have different IOMMU register region sizes (e.g.
> > > VTD has a one-page register region). The length field is there to keep
> > > the vIOMMU device model from abusing the address space. Some registers'
> > > offsets are reported by other registers, and these offsets are emulated
> > > by the vIOMMU device model. If it's not necessary, we can remove it and
> > > add it back when there is a real requirement for it.
> > 
> > So from my understanding the size of the IOMMU MMIO region is implicit
> > in the IOMMU type that the user chooses. I don't think this field is
> > needed.
> 
> To me, it makes more sense to care about both the base and the size rather
> than only the former.
> 
> The toolstack is in charge of the address space and should be aware of the
> size of everything. This address space may not be static and it makes sense
> to give this information to Xen and verify we had the same assumption.

Does this imply that we will have variable size vIOMMU MMIO regions?

If not, the toolstack should know the size of the MMIO region at all
times, unless you are running a toolstack version != Xen version,
which is not supported.
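
As a sketch of what "implicit in the type" could mean: the hypervisor
derives the region length from the vIOMMU type, so the hypercall only needs
the base address. The enum and function names are hypothetical; the
one-page size for VT-d is the figure quoted earlier in this thread:

```c
#include <stdint.h>

/* Hypothetical type identifiers, for illustration only. */
enum viommu_type {
    VIOMMU_TYPE_INTEL_VTD = 1,
};

/* Return the fixed MMIO region length for a type, or 0 if unknown. */
static uint64_t viommu_mmio_length(unsigned int type)
{
    switch ( type )
    {
    case VIOMMU_TYPE_INTEL_VTD:
        return 4096; /* one page of registers */
    default:
        return 0;
    }
}
```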

> > 
> > > > 
> > > > > +            /* IN - Capabilities with which we want to create */
> > > > > +            uint64_t capabilities;
> > > > > +            /* OUT - vIOMMU identity */
> > > > > +            uint32_t viommu_id;
> > > > > +        } create_viommu;
> > > > > +
> > > > > +        struct {
> > > > > +            /* IN - vIOMMU identity */
> > > > > +            uint32_t viommu_id;
> > > > > +        } destroy_viommu;
> > > > > +
> > > > > +        struct {
> > > > > +            /* IN - vIOMMU type */
> > > > > +            uint64_t viommu_type;
> > > > > +            /* OUT - vIOMMU Capabilities */
> > > > > +            uint64_t capabilities;
> > > > > +        } query_caps;
> > > > 
> > > > This also seems weird, shouldn't you query the capabilities of an
> > > > already created vIOMMU, rather than a vIOMMU type? Shouldn't the first
> > > > field be viommu_id?
> > > > 
> > > 
> > > The query interface here is to check which capabilities the vIOMMU
> > > device model specified by viommu_type can support before creating the
> > > vIOMMU (the user may select different capabilities). If the capabilities
> > > returned by the query interface don't meet the user configuration, the
> > > tool stack should return an error. So it just accepts viommu_type.
> > 
> > I don't think that's needed, if the chosen capabilities are not
> > supported by the selected IOMMU type simply return error in
> > XEN_DOMCTL_create_viommu.
> > 
> > The capabilities of each IOMMU type should be listed in the man page,
> > and the user should select a supported set or else
> > XEN_DOMCTL_create_viommu will fail. Doing the checks both in the
> > toolstack and in XEN_DOMCTL_create_viommu seems pointless and prone to
> > errors.
> 
> What if some capabilities depend on the host IOMMU? How are you going to
> report that to the user?

I would print a message on the hypervisor console. I don't see the
value of doing the same check in the toolstack that Xen will also need
to do in XEN_DOMCTL_create_viommu.

I would see value in having such a query hypercall once we have an
implementation that indeed has different capabilities depending on the
hardware, and once an xl command to fetch and print such capabilities
is introduced.

As said, the above query is only used to perform the checks done in
XEN_DOMCTL_create_viommu on the toolstack.
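
For reference, the check in question boils down to rejecting any requested
capability bit that the device model does not offer. A standalone sketch
(function names are made up; only VIOMMU_CAP_IRQ_REMAPPING comes from the
patch):

```c
#include <errno.h>
#include <stdint.h>

#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)

/* Capabilities offered by the (hypothetical) device model. */
static uint64_t viommu_supported_caps(void)
{
    return VIOMMU_CAP_IRQ_REMAPPING;
}

/* Fail creation if any requested bit is outside the supported set. */
static int viommu_check_caps(uint64_t requested)
{
    if ( requested & ~viommu_supported_caps() )
        return -EINVAL;

    return 0;
}
```

Doing this once, at creation time, gives the toolstack the same answer the
separate query would, as an error code.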

Roger.


* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-23  8:04       ` Roger Pau Monné
@ 2017-08-23 14:11         ` Roger Pau Monné
  2017-08-24  2:33         ` Lan Tianyu
  1 sibling, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-23 14:11 UTC (permalink / raw)
  To: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, julien.grall, jbeulich, Chao Gao

Small mistake in my message.

On Wed, Aug 23, 2017 at 09:04:06AM +0100, Roger Pau Monné wrote:
> On Wed, Aug 23, 2017 at 03:52:01PM +0800, Lan Tianyu wrote:
> > On 2017年08月23日 00:41, Roger Pau Monné wrote:
> > >> > +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
> > >> > +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
> > >> > +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
> > >> > +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
> > >> > +    drhd->pci_segment = 0;
> > >> > +    drhd->base_address = config->iommu_base_addr;
> > >> > +
> > >> > +    scope = &drhd->scope[0];
> > >> > +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
> > >> > +    scope->length = ioapic_scope_size;
> > >> > +    scope->enumeration_id = config->ioapic_id;
> > >> > +    scope->bus = I440_PSEUDO_BUS_PLATFORM;
> > >> > +    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;
> > > I'm not sure whether these constants should instead be fields in the
> > > acpi_config struct passed down from libxl. libxc shouldn't really need
> > > to know anything about which chipset a VM is using.
> > 
> > How about renaming I440_PSEUDO_XXX to VIOMMU_PSEUDO_XXX?
> 
> I'm not really complaining about the naming, I'm just saying that I'm
> not sure whether these constants should live in libxl. It would be
                                                  ^ libxc
> better IMHO if they were defined in some libxl x86 specific header,
> and passed to libxc inside of the acpi_config struct.
> 
> At the end it is libxl which decides which chipset the VM is going to
> use, not libxc.
> 
> Roger.

* Re: [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM
  2017-08-23  7:58   ` Roger Pau Monné
@ 2017-08-24  2:16     ` Lan Tianyu
  2017-08-24  8:49       ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  2:16 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On 2017年08月23日 15:58, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:12PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> This patch adds create/destroy/query function for the emulated VTD
>> and adapts it to the common VIOMMU abstraction.
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/drivers/passthrough/vtd/Makefile |   7 +-
>>  xen/drivers/passthrough/vtd/iommu.h  |  99 +++++++++++++++++-----
>>  xen/drivers/passthrough/vtd/vvtd.c   | 158 +++++++++++++++++++++++++++++++++++
>>  xen/include/asm-x86/viommu.h         |   3 +
>>  4 files changed, 241 insertions(+), 26 deletions(-)
>>  create mode 100644 xen/drivers/passthrough/vtd/vvtd.c
>>
>> diff --git a/xen/drivers/passthrough/vtd/Makefile b/xen/drivers/passthrough/vtd/Makefile
>> index f302653..163c7fe 100644
>> --- a/xen/drivers/passthrough/vtd/Makefile
>> +++ b/xen/drivers/passthrough/vtd/Makefile
>> @@ -1,8 +1,9 @@
>>  subdir-$(CONFIG_X86) += x86
>>  
>> -obj-y += iommu.o
>>  obj-y += dmar.o
>> -obj-y += utils.o
>> -obj-y += qinval.o
>>  obj-y += intremap.o
>> +obj-y += iommu.o
>> +obj-y += qinval.o
>>  obj-y += quirks.o
>> +obj-y += utils.o
>> +obj-$(CONFIG_VIOMMU) += vvtd.o
>> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
>> index 72c1a2e..55f3b6e 100644
>> --- a/xen/drivers/passthrough/vtd/iommu.h
>> +++ b/xen/drivers/passthrough/vtd/iommu.h
>> @@ -23,31 +23,54 @@
>>  #include <asm/msi.h>
>>  
>>  /*
>> - * Intel IOMMU register specification per version 1.0 public spec.
>> + * Intel IOMMU register specification per version 2.4 public spec.
>>   */
>>  
>> -#define    DMAR_VER_REG    0x0    /* Arch version supported by this IOMMU */
>> -#define    DMAR_CAP_REG    0x8    /* Hardware supported capabilities */
>> -#define    DMAR_ECAP_REG    0x10    /* Extended capabilities supported */
>> -#define    DMAR_GCMD_REG    0x18    /* Global command register */
>> -#define    DMAR_GSTS_REG    0x1c    /* Global status register */
>> -#define    DMAR_RTADDR_REG    0x20    /* Root entry table */
>> -#define    DMAR_CCMD_REG    0x28    /* Context command reg */
>> -#define    DMAR_FSTS_REG    0x34    /* Fault Status register */
>> -#define    DMAR_FECTL_REG    0x38    /* Fault control register */
>> -#define    DMAR_FEDATA_REG    0x3c    /* Fault event interrupt data register */
>> -#define    DMAR_FEADDR_REG    0x40    /* Fault event interrupt addr register */
>> -#define    DMAR_FEUADDR_REG 0x44    /* Upper address register */
>> -#define    DMAR_AFLOG_REG    0x58    /* Advanced Fault control */
>> -#define    DMAR_PMEN_REG    0x64    /* Enable Protected Memory Region */
>> -#define    DMAR_PLMBASE_REG 0x68    /* PMRR Low addr */
>> -#define    DMAR_PLMLIMIT_REG 0x6c    /* PMRR low limit */
>> -#define    DMAR_PHMBASE_REG 0x70    /* pmrr high base addr */
>> -#define    DMAR_PHMLIMIT_REG 0x78    /* pmrr high limit */
>> -#define    DMAR_IQH_REG    0x80    /* invalidation queue head */
>> -#define    DMAR_IQT_REG    0x88    /* invalidation queue tail */
>> -#define    DMAR_IQA_REG    0x90    /* invalidation queue addr */
>> -#define    DMAR_IRTA_REG   0xB8    /* intr remap */
>> +#define DMAR_VER_REG            0x0  /* Arch version supported by this IOMMU */
>> +#define DMAR_CAP_REG            0x8  /* Hardware supported capabilities */
>> +#define DMAR_ECAP_REG           0x10 /* Extended capabilities supported */
>> +#define DMAR_GCMD_REG           0x18 /* Global command register */
>> +#define DMAR_GSTS_REG           0x1c /* Global status register */
>> +#define DMAR_RTADDR_REG         0x20 /* Root entry table */
>> +#define DMAR_CCMD_REG           0x28 /* Context command reg */
>> +#define DMAR_FSTS_REG           0x34 /* Fault Status register */
>> +#define DMAR_FECTL_REG          0x38 /* Fault control register */
>> +#define DMAR_FEDATA_REG         0x3c /* Fault event interrupt data register */
>> +#define DMAR_FEADDR_REG         0x40 /* Fault event interrupt addr register */
>> +#define DMAR_FEUADDR_REG        0x44 /* Upper address register */
>> +#define DMAR_AFLOG_REG          0x58 /* Advanced Fault control */
>> +#define DMAR_PMEN_REG           0x64 /* Enable Protected Memory Region */
>> +#define DMAR_PLMBASE_REG        0x68 /* PMRR Low addr */
>> +#define DMAR_PLMLIMIT_REG       0x6c /* PMRR low limit */
>> +#define DMAR_PHMBASE_REG        0x70 /* pmrr high base addr */
>> +#define DMAR_PHMLIMIT_REG       0x78 /* pmrr high limit */
>> +#define DMAR_IQH_REG            0x80 /* invalidation queue head */
>> +#define DMAR_IQT_REG            0x88 /* invalidation queue tail */
>> +#define DMAR_IQT_REG_HI         0x8c
>> +#define DMAR_IQA_REG            0x90 /* invalidation queue addr */
>> +#define DMAR_IQA_REG_HI         0x94
>> +#define DMAR_ICS_REG            0x9c /* Invalidation complete status */
>> +#define DMAR_IECTL_REG          0xa0 /* Invalidation event control */
>> +#define DMAR_IEDATA_REG         0xa4 /* Invalidation event data */
>> +#define DMAR_IEADDR_REG         0xa8 /* Invalidation event address */
>> +#define DMAR_IEUADDR_REG        0xac /* Invalidation event address */
>> +#define DMAR_IRTA_REG           0xb8 /* Interrupt remapping table addr */
>> +#define DMAR_IRTA_REG_HI        0xbc
>> +#define DMAR_PQH_REG            0xc0 /* Page request queue head */
>> +#define DMAR_PQH_REG_HI         0xc4
>> +#define DMAR_PQT_REG            0xc8 /* Page request queue tail*/
>> +#define DMAR_PQT_REG_HI         0xcc
>> +#define DMAR_PQA_REG            0xd0 /* Page request queue address */
>> +#define DMAR_PQA_REG_HI         0xd4
>> +#define DMAR_PRS_REG            0xdc /* Page request status */
>> +#define DMAR_PECTL_REG          0xe0 /* Page request event control */
>> +#define DMAR_PEDATA_REG         0xe4 /* Page request event data */
>> +#define DMAR_PEADDR_REG         0xe8 /* Page request event address */
>> +#define DMAR_PEUADDR_REG        0xec /* Page event upper address */
>> +#define DMAR_MTRRCAP_REG        0x100 /* MTRR capability */
>> +#define DMAR_MTRRCAP_REG_HI     0x104
>> +#define DMAR_MTRRDEF_REG        0x108 /* MTRR default type */
>> +#define DMAR_MTRRDEF_REG_HI     0x10c
>>  
>>  #define OFFSET_STRIDE        (9)
>>  #define dmar_readl(dmar, reg) readl((dmar) + (reg))
>> @@ -58,6 +81,30 @@
>>  #define VER_MAJOR(v)        (((v) & 0xf0) >> 4)
>>  #define VER_MINOR(v)        ((v) & 0x0f)
>>  
>> +/* CAP_REG */
>> +#define DMA_DOMAIN_ID_SHIFT         16  /* 16-bit domain id for 64K domains */
>> +#define DMA_DOMAIN_ID_MASK          ((1UL << DMA_DOMAIN_ID_SHIFT) - 1)
>> +#define DMA_CAP_ND                  (((DMA_DOMAIN_ID_SHIFT - 4) / 2) & 7ULL)
>> +#define DMA_MGAW                    39  /* Maximum Guest Address Width */
>> +#define DMA_CAP_MGAW                (((DMA_MGAW - 1) & 0x3fULL) << 16)
>> +#define DMA_MAMV                    18ULL
>> +#define DMA_CAP_MAMV                (DMA_MAMV << 48)
>> +#define DMA_CAP_PSI                 (1ULL << 39)
>> +#define DMA_CAP_SLLPS               ((1ULL << 34) | (1ULL << 35))
>> +#define DMA_FRCD_REG_NR             1ULL
>> +#define DMA_CAP_NFR                 ((DMA_FRCD_REG_NR - 1) << 40)
>> +#define DMA_CAP_FRO_OFFSET          0x220ULL
>> +#define DMA_CAP_FRO                 (DMA_CAP_FRO_OFFSET << 20)
>> +
>> +/* Supported Adjusted Guest Address Widths */
>> +#define DMA_CAP_SAGAW_SHIFT         8
>> +#define DMA_CAP_SAGAW_MASK          (0x1fULL << DMA_CAP_SAGAW_SHIFT)
>> + /* 39-bit AGAW, 3-level page-table */
>> +#define DMA_CAP_SAGAW_39bit         (0x2ULL << DMA_CAP_SAGAW_SHIFT)
>> + /* 48-bit AGAW, 4-level page-table */
>> +#define DMA_CAP_SAGAW_48bit         (0x4ULL << DMA_CAP_SAGAW_SHIFT)
>> +#define DMA_CAP_SAGAW               DMA_CAP_SAGAW_39bit
>> +
>>  /*
>>   * Decoding Capability Register
>>   */
>> @@ -89,6 +136,12 @@
>>  #define cap_afl(c)        (((c) >> 3) & 1)
>>  #define cap_ndoms(c)        (1 << (4 + 2 * ((c) & 0x7)))
>>  
>> +/* ECAP_REG */
>> +#define DMA_ECAP_QI                 (1ULL << 1)
>> +#define DMA_ECAP_IR                 (1ULL << 3)
>> +#define DMA_ECAP_EIM                (1ULL << 4)
>> +#define DMA_ECAP_MHMV               (15ULL << 20)
> 
> Wow, what's this chunk above? The description only mentions adding
> functions for the VTD IOMMU implementation, yet there seems to be some
> code movement here. Please split it into a separate patch.
> 

Sure. Will split.

>>  /*
>>   * Extended Capability Register
>>   */
>> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
>> new file mode 100644
>> index 0000000..353fafe
>> --- /dev/null
>> +++ b/xen/drivers/passthrough/vtd/vvtd.c
>> @@ -0,0 +1,158 @@
>> +/*
>> + * vvtd.c
>> + *
>> + * virtualize VTD for HVM.
>> + *
>> + * Copyright (C) 2017 Chao Gao, Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public
>> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/domain_page.h>
>> +#include <xen/sched.h>
>> +#include <xen/types.h>
>> +#include <xen/viommu.h>
>> +#include <xen/xmalloc.h>
>> +#include <asm/current.h>
>> +#include <asm/hvm/domain.h>
>> +#include <asm/page.h>
>> +
>> +#include "iommu.h"
>> +
>> +struct hvm_hw_vvtd_regs {
>> +    uint8_t data[1024];
>> +};
>> +
>> +/* Status field of struct vvtd */
>> +#define VIOMMU_STATUS_DEFAULT                   (0)
>> +#define VIOMMU_STATUS_IRQ_REMAPPING_ENABLED     (1 << 0)
>> +#define VIOMMU_STATUS_DMA_REMAPPING_ENABLED     (1 << 1)
>> +
>> +struct vvtd {
>> +    /* VIOMMU_STATUS_XXX */
>> +    int status;
> 
> unsigned int if it's a bitfield.
> 
>> +    /* Address range of remapping hardware register-set */
>> +    uint64_t base_addr;
>> +    uint64_t length;
>> +    /* Point back to the owner domain */
>> +    struct domain *domain;
>> +    struct hvm_hw_vvtd_regs *regs;
>> +    struct page_info *regs_page;
>> +};
>> +
>> +static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg,
>> +                                uint32_t value)
>> +{
>> +    *((uint32_t *)(&vtd->regs->data[reg])) = value;
>> +}
>> +
>> +static inline uint32_t vvtd_get_reg(struct vvtd *vtd, uint32_t reg)
>> +{
>> +    return *((uint32_t *)(&vtd->regs->data[reg]));
>> +}
>> +
>> +static inline uint8_t vvtd_get_reg_byte(struct vvtd *vtd, uint32_t reg)
>> +{
>> +    return *((uint8_t *)(&vtd->regs->data[reg]));
> 
> You don't need a cast here, data is already an array of
> uint8_t.
> 
>> +}
>> +
>> +#define vvtd_get_reg_quad(vvtd, reg, val) do {  \
>> +    (val) = vvtd_get_reg(vvtd, (reg) + 4 );     \
>> +    (val) = (val) << 32;                        \
>> +    (val) += vvtd_get_reg(vvtd, reg);           \
>> +} while(0)
>> +#define vvtd_set_reg_quad(vvtd, reg, val) do {  \
>> +    vvtd_set_reg(vvtd, reg, (val));             \
>> +    vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
>> +} while(0)
> 
> You seem to need to access hvm_hw_vvtd_regs using different sizes, why
> not do:
> 
> union hvm_hw_vvtd_regs {
>     uint8_t  data8[1024];
>     uint16_t data16[512];
>     uint32_t data32[256];
>     uint64_t data64[128];
> };
> 
> Then the access is much more straightforward and you don't need the
> complicated helpers that you have above.

Yes, that will be simpler.
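
A minimal sketch of what the union-based accessors could look like. The
helper names (vvtd_get_reg32() and friends) are illustrative assumptions,
not code from the patch; 'reg' is a byte offset and is assumed to be
naturally aligned for the access size.

```c
#include <stdint.h>

/* Register file viewed at several access widths, as suggested above. */
union hvm_hw_vvtd_regs {
    uint8_t  data8[1024];
    uint16_t data16[512];
    uint32_t data32[256];
    uint64_t data64[128];
};

/* Hypothetical helpers: 'reg' is a byte offset aligned to the access size. */
static inline uint32_t vvtd_get_reg32(const union hvm_hw_vvtd_regs *regs,
                                      uint32_t reg)
{
    return regs->data32[reg / sizeof(uint32_t)];
}

static inline void vvtd_set_reg32(union hvm_hw_vvtd_regs *regs,
                                  uint32_t reg, uint32_t value)
{
    regs->data32[reg / sizeof(uint32_t)] = value;
}

static inline uint64_t vvtd_get_reg64(const union hvm_hw_vvtd_regs *regs,
                                      uint32_t reg)
{
    return regs->data64[reg / sizeof(uint64_t)];
}

static inline void vvtd_set_reg64(union hvm_hw_vvtd_regs *regs,
                                  uint32_t reg, uint64_t value)
{
    regs->data64[reg / sizeof(uint64_t)] = value;
}
```

With this layout the quad accessors no longer need to be macros built from
two 32-bit accesses.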

> 
>> +static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
>> +{
>> +    uint64_t cap = DMA_CAP_NFR | DMA_CAP_SLLPS | DMA_CAP_FRO |
>> +                   DMA_CAP_MGAW | DMA_CAP_SAGAW | DMA_CAP_ND;
>> +    uint64_t ecap = DMA_ECAP_IR | DMA_ECAP_EIM | DMA_ECAP_QI;
>> +
>> +    vvtd_set_reg(vvtd, DMAR_VER_REG, 0x10UL);
>> +    vvtd_set_reg_quad(vvtd, DMAR_CAP_REG, cap);
>> +    vvtd_set_reg_quad(vvtd, DMAR_ECAP_REG, ecap);
>> +    vvtd_set_reg(vvtd, DMAR_FECTL_REG, 0x80000000UL);
>> +    vvtd_set_reg(vvtd, DMAR_IECTL_REG, 0x80000000UL);
>> +}
>> +
>> +static u64 vvtd_query_caps(struct domain *d)
>> +{
>> +    return VIOMMU_CAP_IRQ_REMAPPING;
>> +}
>> +
>> +static int vvtd_create(struct domain *d, struct viommu *viommu)
>> +{
>> +    struct vvtd *vvtd;
>> +    int ret;
>> +
>> +    if ( !is_hvm_domain(d) || (viommu->length != PAGE_SIZE) ||
>> +        (~vvtd_query_caps(d) & viommu->caps) )
>> +        return -EINVAL;
>> +
>> +    ret = -ENOMEM;
>> +    vvtd = xmalloc_bytes(sizeof(struct vvtd));
>> +    if ( !vvtd )
>> +        return ret;
>> +
>> +    vvtd->regs_page = alloc_domheap_page(d, MEMF_no_owner);
>> +    if ( !vvtd->regs_page )
>> +        goto out1;
>> +
>> +    vvtd->regs = __map_domain_page_global(vvtd->regs_page);
>> +    if ( !vvtd->regs )
>> +        goto out2;
>> +    clear_page(vvtd->regs);
>> +
>> +    vvtd_reset(vvtd, viommu->caps);
>> +    vvtd->base_addr = viommu->base_address;
> 
> Don't you need to perform any checks on the base_address? It needs to
> be page aligned at least.

Yes, it should be checked.
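
A rough sketch of such a check, assuming a 4 KiB page. VVTD_PAGE_SIZE
and vvtd_range_valid() are hypothetical names standing in for PAGE_SIZE
and whatever helper the next version ends up using.

```c
#include <stdbool.h>
#include <stdint.h>

#define VVTD_PAGE_SIZE 4096ULL   /* assumed 4 KiB, stand-in for PAGE_SIZE */

/* Hypothetical helper: true if the register window is one aligned page. */
static bool vvtd_range_valid(uint64_t base, uint64_t length)
{
    if ( length != VVTD_PAGE_SIZE )
        return false;

    /* The base must be page aligned. */
    if ( base & (VVTD_PAGE_SIZE - 1) )
        return false;

    /* Reject a window that wraps around the 64-bit address space. */
    return base + length > base;
}
```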

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-23  8:04       ` Roger Pau Monné
  2017-08-23 14:11         ` Roger Pau Monné
@ 2017-08-24  2:33         ` Lan Tianyu
  2017-08-24  6:54           ` Jan Beulich
  1 sibling, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  2:33 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On 2017年08月23日 16:04, Roger Pau Monné wrote:
> On Wed, Aug 23, 2017 at 03:52:01PM +0800, Lan Tianyu wrote:
>> On 2017年08月23日 00:41, Roger Pau Monné wrote:
>>>>> +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
>>>>> +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
>>>>> +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
>>>>> +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
>>>>> +    drhd->pci_segment = 0;
>>>>> +    drhd->base_address = config->iommu_base_addr;
>>>>> +
>>>>> +    scope = &drhd->scope[0];
>>>>> +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
>>>>> +    scope->length = ioapic_scope_size;
>>>>> +    scope->enumeration_id = config->ioapic_id;
>>>>> +    scope->bus = I440_PSEUDO_BUS_PLATFORM;
>>>>> +    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;
>>> I'm not sure whether this constants should instead be fields in the
>>> acpi_config struct passed down from libxl. libxc shouldn't really need
>>> to know anything about which chipset a VM is using.
>>
>> How about rename I440_PSEUDO_XXX to VIOMMU_PSEUDO_XXX?
> 
> I'm not really complaining about the naming, I'm just saying that I'm
> not sure whether this constants should live in libxl. It would be
> better IMHO if they where defined in some libxl x86 specific header,
> and passed to libxc inside of the acpi_config struct.
> 
> At the end it is libxl which decides which chipset the VM is going to
> use, not libxc.

We can do that, but the BDF is reserved for the IOAPIC and should be the
same across chipsets. Do we still need to pass it via acpi_config?


> 
> Roger.
> 


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-23  8:34             ` Wei Liu
@ 2017-08-24  3:24               ` Lan Tianyu
  2017-08-24 11:08                 ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  3:24 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, Gao, Chao

On 2017年08月23日 16:34, Wei Liu wrote:
> On Wed, Aug 23, 2017 at 01:35:17PM +0800, Lan Tianyu wrote:
>> On 2017年08月22日 21:48, Wei Liu wrote:
>>>>> Hi, Wei
>>>>> Thanks for your comments.
>>>>>
>>>>> iirc, HVM only supports one module; DMAR cannot be a new module. Joining to
>>>>> the existing one is the approach we are taking. 
>>>>>
>>>>> Which kind of conflicts you think should be resolved? If you mean I
>>>>> forget to free the old buf, I will fix this. If you mean the potential
>>>>> overlap between the binary passed by admin and DMAR table built here, I
>>>>> don't have much idea on this. Even without the DMAR table, the binary
>>>>> may contains MADT or other tables and tool stacks don't intrepret the
>>>>> binary and check whether there are conflicts, right?
>>>>>
>>> Thinking a bit more about this, when I first said "conflicts" I didn't
>>> mean to parse the content. I was referring to the code in
>>> libxl_x86_apci.c which also seems to manipulate acpi_modules.
>>
>> Code in libxl_x86_acpi.c works for Hvmlite/PVHv2. The code we added is
>> for hvm guest.
>>
> 
> That's correct for the code as-is but what is preventing the code there
> from working with HVM? Assuming correct checks and branches are added
> to appropriate places?
> 
> I'm against having multiple locations doing things that could
> potentially clash with each other. In the foreseeable future PVH is
> going to get need similar functionality.
> 
> My expectation is that if the existing code needs to be taken into
> consideration and the contributors need to figure out if and how it can
> be modified to suite their needs. If everyone is doing their own thing
> in their own little function Xen will eventually become unmaintainable.
> 
>>>
>>> I would like the code to generate dmar take into consideration
>>> libxl__dom_load_acpi.
>>>
>>
>> If add dmar table for hvmlite, we should combine dmar table with other
>> ACPI table and populate into acpi_modules[2]. This is how hvmlite add
>> other ACPI tables in libxl__dom_load_acpi().
>>
> 
> Sure, that sounds plausible.
> 
> What I would like to see is to have one entry point to manipulate APCI
> tables.
> 
> Given the patch volume we're seeing now, we expect contributors to drive
> the discussion forward. If you're not sure, feel free to ask more questions.

I am not sure whether I understood correctly.

PVHv2 builds all ACPI tables in the tool stack and uses
acpi_modules[0, 1, 2] to pass the related table content.

HVM builds ACPI tables in hvmloader and just uses acpi_modules[0] to
pass additional ACPI firmware or tables.

These two modes use acpi_modules[] in different ways, so I think we
can't combine them, right?

To build the DMAR table, we have introduced construct_dmar() in
libacpi, and PVHv2 can also use it in libxl__dom_load_acpi().


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 16/25] x86/vioapic: Hook interrupt delivery of vIOAPIC
  2017-08-23  9:59   ` Roger Pau Monné
@ 2017-08-24  5:28     ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  5:28 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On 2017年08月23日 17:59, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:17PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> When irq remapping is enabled, IOAPIC Redirection Entry may be in remapping
>> format. If that, generate an irq_remapping_request and call the common
>> VIOMMU abstraction's callback to handle this interrupt request. Device
>> model is responsible for checking the request's validity.
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/arch/x86/hvm/vioapic.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>> index 72cae93..322f33c 100644
>> --- a/xen/arch/x86/hvm/vioapic.c
>> +++ b/xen/arch/x86/hvm/vioapic.c
>> @@ -30,6 +30,7 @@
>>  #include <xen/lib.h>
>>  #include <xen/errno.h>
>>  #include <xen/sched.h>
>> +#include <xen/viommu.h>
>>  #include <public/hvm/ioreq.h>
>>  #include <asm/hvm/io.h>
>>  #include <asm/hvm/vpic.h>
>> @@ -39,6 +40,8 @@
>>  #include <asm/event.h>
>>  #include <asm/io_apic.h>
>>  
>> +#include "../../../drivers/passthrough/vtd/vtd.h"
> 
> Ouch, that's not very nice. Why do you need this? I though that you
> introduced an arch-agnostic layer that should be suitable?

Yes, agreed. My current plan is to introduce a callback in the viommu
ops that checks the remapping mode, and let the vIOMMU device model
decide whether the vIOAPIC is in interrupt remapping mode. The device
model can then use the Intel or AMD IOAPIC remapping format to parse
the IOAPIC entry.

> 
>>  /* HACK: Route IRQ0 only to VCPU0 to prevent time jumps. */
>>  #define IRQ0_SPECIAL_ROUTING 1
>>  
>> @@ -387,9 +390,20 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>>      struct vlapic *target;
>>      struct vcpu *v;
>>      unsigned int irq = vioapic->base_gsi + pin;
>> +    struct IO_APIC_route_remap_entry rte = { { vioapic->redirtbl[pin].bits } };
> 
> Designated initializers please.
> 
> Roger.
> 


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-08-23 10:14   ` Roger Pau Monné
@ 2017-08-24  6:11     ` Lan Tianyu
  2017-08-24  6:59       ` Jan Beulich
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  6:11 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On 2017年08月23日 18:14, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:20PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> When IOAPIC RTE is in remapping format, it doesn't contain the vector of
>> interrupt. For this case, the RTE contains an index of interrupt remapping
>> table where the vector of interrupt is stored. This patchs gets the vector
>> through a vIOMMU interface.
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/arch/x86/hvm/vioapic.c | 18 +++++++++++++++++-
>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
>> index 322f33c..ff0742d 100644
>> --- a/xen/arch/x86/hvm/vioapic.c
>> +++ b/xen/arch/x86/hvm/vioapic.c
>> @@ -565,11 +565,27 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>>  {
>>      unsigned int pin;
>>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
>> +    struct IO_APIC_route_remap_entry rte = { { vioapic->redirtbl[pin].bits } };
> 
> Designated initialization and const.
> 
>>  
>>      if ( !vioapic )
>>          return -EINVAL;
>>  
>> -    return vioapic->redirtbl[pin].fields.vector;
>> +    if ( rte.format )
>> +    {
>> +        int err;
>> +        struct irq_remapping_request request;
>> +        struct irq_remapping_info info;
>> +
>> +        irq_request_ioapic_fill(&request, vioapic->id, rte.val);
>> +        /* Currently, only viommu 0 is supported */
> 
> This seems to be hardcoded in a bunch of places, which makes me wonder
> whether having an array of vIOMMUs is the correct choice. I think that
> you should remove the array and have a single vIOMMU per domain.

The array is reserved for multi-vIOMMU support, but so far there is no
such requirement as far as I know. At the design stage, someone
commented that we should take multi-vIOMMU support into account.

We may add a callback to the vIOMMU ops for getting the vIOMMU, and let
the vIOMMU device model return the associated vIOMMU instance according
to the irq remapping information (e.g. source id). One VM is supposed
to have only one vIOMMU type. When multi-vIOMMU support is added, this
logic can still be applied.

For the current scenario, the device model should return the first
vIOMMU directly.

> 
>> +        err = viommu_get_irq_info(vioapic->domain, 0, &request, &info);
>> +        return !err ? info.vector : -1;
> 
> maybe:
> 
> return err ?: info.vector;
> 
> ?
> 
>> +    }
>> +    else
>> +    {
>> +        return vioapic->redirtbl[pin].fields.vector;
>> +    }
>> +
>>  }
>>  
>>  int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
>> -- 
>> 1.8.3.1
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> https://lists.xen.org/xen-devel


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 24/25] x86/vvtd: Add queued invalidation (QI) support
  2017-08-23 12:16   ` Roger Pau Monné
@ 2017-08-24  6:31     ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  6:31 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: kevin.tian, Chao Gao, julien.grall, xen-devel

On 2017年08月23日 20:16, Roger Pau Monné wrote:
> On Wed, Aug 09, 2017 at 04:34:25PM -0400, Lan Tianyu wrote:
>> > From: Chao Gao <chao.gao@intel.com>
>> > 
>> > Queued Invalidation Interface is an expanded invalidation interface with
>> > extended capabilities. Hardware implementations report support for queued
>> > invalidation interface through the Extended Capability Register. The queued
>> > invalidation interface uses an Invalidation Queue (IQ), which is a circular
>> > buffer in system memory. Software submits commands by writing Invalidation
>> > Descriptors to the IQ.
>> > 
>> > In this patch, a new function viommu_process_iq() is used for emulating how
>> > hardware handles invalidation requests through QI.
> It seems like this is an extended feature, which is not needed for
> basic functionality. Would it be possible to have this series focus on
> the bare-minimum functionality, leaving everything else to a separate
> series?
> 

No. According to the VT-d spec, an IOMMU that supports interrupt
remapping must also support Queued Invalidation (QI).

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-24  2:33         ` Lan Tianyu
@ 2017-08-24  6:54           ` Jan Beulich
  2017-08-24  8:36             ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Jan Beulich @ 2017-08-24  6:54 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, Roger Pau Monné,
	Chao Gao

>>> On 24.08.17 at 04:33, <tianyu.lan@intel.com> wrote:
> On 2017年08月23日 16:04, Roger Pau Monné wrote:
>> On Wed, Aug 23, 2017 at 03:52:01PM +0800, Lan Tianyu wrote:
>>> On 2017年08月23日 00:41, Roger Pau Monné wrote:
>>>>>> +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
>>>>>> +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
>>>>>> +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
>>>>>> +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
>>>>>> +    drhd->pci_segment = 0;
>>>>>> +    drhd->base_address = config->iommu_base_addr;
>>>>>> +
>>>>>> +    scope = &drhd->scope[0];
>>>>>> +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
>>>>>> +    scope->length = ioapic_scope_size;
>>>>>> +    scope->enumeration_id = config->ioapic_id;
>>>>>> +    scope->bus = I440_PSEUDO_BUS_PLATFORM;
>>>>>> +    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;
>>>> I'm not sure whether this constants should instead be fields in the
>>>> acpi_config struct passed down from libxl. libxc shouldn't really need
>>>> to know anything about which chipset a VM is using.
>>>
>>> How about rename I440_PSEUDO_XXX to VIOMMU_PSEUDO_XXX?
>> 
>> I'm not really complaining about the naming, I'm just saying that I'm
>> not sure whether this constants should live in libxl. It would be
>> better IMHO if they where defined in some libxl x86 specific header,
>> and passed to libxc inside of the acpi_config struct.
>> 
>> At the end it is libxl which decides which chipset the VM is going to
>> use, not libxc.
> 
> We can do that but the bdf is reserved for IOAPIC and should be same for
> different chipset. Do we still need to pass it via acpi_config?

Well, which value is the right (reserved) one surely can - at least
in theory - depend on the chipset. Which means that it should
come from the same place which determines the chipset to be
emulated for the guest.
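
One way this could look, as a sketch only: the structures below are
cut-down stand-ins for acpi_config and the DMAR device scope entry, and
the ioapic_bus / ioapic_devfn fields are hypothetical additions that the
chipset-selecting code would fill in.

```c
#include <stdint.h>

/* Cut-down stand-in for acpi_config; the last two fields are hypothetical. */
struct acpi_config_sketch {
    uint64_t iommu_base_addr;
    uint8_t  ioapic_id;
    uint8_t  ioapic_bus;     /* chosen by whoever picks the chipset */
    uint8_t  ioapic_devfn;   /* chosen by whoever picks the chipset */
};

/* Cut-down stand-in for the DMAR device scope entry. */
struct dmar_scope_sketch {
    uint8_t enumeration_id;
    uint8_t bus;
    uint8_t path[1];
};

/* The table builder only copies values; it has no chipset knowledge. */
static void fill_ioapic_scope(struct dmar_scope_sketch *scope,
                              const struct acpi_config_sketch *config)
{
    scope->enumeration_id = config->ioapic_id;
    scope->bus = config->ioapic_bus;
    scope->path[0] = config->ioapic_devfn;
}
```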

Jan


* Re: [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-08-24  6:11     ` Lan Tianyu
@ 2017-08-24  6:59       ` Jan Beulich
  2017-08-24  8:04         ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Jan Beulich @ 2017-08-24  6:59 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, Roger Pau Monné,
	Chao Gao

>>> On 24.08.17 at 08:11, <tianyu.lan@intel.com> wrote:
> On 2017年08月23日 18:14, Roger Pau Monné wrote:
>> On Wed, Aug 09, 2017 at 04:34:20PM -0400, Lan Tianyu wrote:
>>> --- a/xen/arch/x86/hvm/vioapic.c
>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>> @@ -565,11 +565,27 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>>>  {
>>>      unsigned int pin;
>>>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
>>> +    struct IO_APIC_route_remap_entry rte = { { vioapic->redirtbl[pin].bits } };
>> 
>> Designated initialization and const.
>> 
>>>  
>>>      if ( !vioapic )
>>>          return -EINVAL;
>>>  
>>> -    return vioapic->redirtbl[pin].fields.vector;
>>> +    if ( rte.format )
>>> +    {
>>> +        int err;
>>> +        struct irq_remapping_request request;
>>> +        struct irq_remapping_info info;
>>> +
>>> +        irq_request_ioapic_fill(&request, vioapic->id, rte.val);
>>> +        /* Currently, only viommu 0 is supported */
>> 
>> This seems to be hardcoded in a bunch of places, which makes me wonder
>> whether having an array of vIOMMUs is the correct choice. I think that
>> you should remove the array and have a single vIOMMU per domain.
> 
> The array is reserved for mult-vIOMMU support but so far no such
> requirement as I know. In design stage, someone commented we should take
> mult-vIOMMU support into account.

It _may_ suffice to do so at the public interface level. I'm not
against using a single entry array right away, but then the rest
of the code needs to be written as if the array bound was not
fixed at 1, i.e. no hard coded uses of zero as the only valid
array index should occur.
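
A sketch of what "written as if the bound was not fixed" could mean in
practice. Everything here is illustrative: the structure layout,
NR_VIOMMU_PER_DOMAIN and viommu_find() are assumptions, with only
nr_viommu taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_VIOMMU_PER_DOMAIN 1   /* assumed bound; callers never rely on it */

struct viommu {
    uint64_t base_address;
    uint64_t length;
};

struct viommu_info {
    unsigned int nr_viommu;
    struct viommu *viommu[NR_VIOMMU_PER_DOMAIN];
};

/* Hypothetical lookup: find the vIOMMU whose MMIO window covers 'addr'. */
static struct viommu *viommu_find(struct viommu_info *info, uint64_t addr)
{
    unsigned int i;

    /* Iterate up to nr_viommu instead of indexing entry 0 directly. */
    for ( i = 0; i < info->nr_viommu; i++ )
    {
        struct viommu *v = info->viommu[i];

        if ( v && addr >= v->base_address &&
             addr - v->base_address < v->length )
            return v;
    }

    return NULL;
}
```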

Jan


* Re: [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-08-24  6:59       ` Jan Beulich
@ 2017-08-24  8:04         ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  8:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, Roger Pau Monné,
	Chao Gao

On 2017年08月24日 14:59, Jan Beulich wrote:
>>>> On 24.08.17 at 08:11, <tianyu.lan@intel.com> wrote:
>> On 2017年08月23日 18:14, Roger Pau Monné wrote:
>>> On Wed, Aug 09, 2017 at 04:34:20PM -0400, Lan Tianyu wrote:
>>>> --- a/xen/arch/x86/hvm/vioapic.c
>>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>>> @@ -565,11 +565,27 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>>>>  {
>>>>      unsigned int pin;
>>>>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
>>>> +    struct IO_APIC_route_remap_entry rte = { { vioapic->redirtbl[pin].bits } };
>>>
>>> Designated initialization and const.
>>>
>>>>  
>>>>      if ( !vioapic )
>>>>          return -EINVAL;
>>>>  
>>>> -    return vioapic->redirtbl[pin].fields.vector;
>>>> +    if ( rte.format )
>>>> +    {
>>>> +        int err;
>>>> +        struct irq_remapping_request request;
>>>> +        struct irq_remapping_info info;
>>>> +
>>>> +        irq_request_ioapic_fill(&request, vioapic->id, rte.val);
>>>> +        /* Currently, only viommu 0 is supported */
>>>
>>> This seems to be hardcoded in a bunch of places, which makes me wonder
>>> whether having an array of vIOMMUs is the correct choice. I think that
>>> you should remove the array and have a single vIOMMU per domain.
>>
>> The array is reserved for mult-vIOMMU support but so far no such
>> requirement as I know. In design stage, someone commented we should take
>> mult-vIOMMU support into account.
> 
> It _may_ suffice to do so at the public interface level. I'm not
> against using a single entry array right away, but then the rest
> of the code needs to be written as if the array bound was not
> fixed at 1, i.e. no hard coded uses of zero as the only valid
> array index should occur.

Hi Jan:

I am not sure whether we can hide the single-vIOMMU logic in the device
model. When the vIOMMU instance is created in the device model, store
it in a global variable of the virtual VT-d code. Then provide a
callback for getting the vIOMMU instance in the vIOMMU ops, plus a
helper function in the vIOMMU abstraction layer that takes the
interrupt information as a parameter. The new callback always returns
the stored vIOMMU instance. This seems to avoid hard-coded uses of
zero.

If this can't be accepted, removing the array seems to be the only
feasible way.

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities
  2017-08-18  0:22     ` [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities Lan Tianyu
                         ` (2 preceding siblings ...)
  2017-08-22 15:27       ` Roger Pau Monné
@ 2017-08-24  8:14       ` Tian, Kevin
  3 siblings, 0 replies; 136+ messages in thread
From: Tian, Kevin @ 2017-08-24  8:14 UTC (permalink / raw)
  To: Lan, Tianyu, xen-devel
  Cc: wei.liu2, andrew.cooper3, ian.jackson, julien.grall, jbeulich, Gao, Chao

> From: Lan, Tianyu
> Sent: Friday, August 18, 2017 8:22 AM
> 
> This patch is to introduct an abstract layer for arch vIOMMU
> implementation
> to deal with requests from dom0. Arch vIOMMU code needs to provide
> callback
> to perform create, destroy and query capabilities operation.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/arch/x86/Kconfig     |   1 +
>  xen/arch/x86/setup.c     |   1 +
>  xen/common/Kconfig       |   3 +
>  xen/common/Makefile      |   1 +
>  xen/common/domain.c      |   3 +
>  xen/common/viommu.c      | 165
> +++++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/sched.h  |   2 +
>  xen/include/xen/viommu.h |  71 ++++++++++++++++++++
>  8 files changed, 247 insertions(+)
>  create mode 100644 xen/common/viommu.c
>  create mode 100644 xen/include/xen/viommu.h
> 
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 30c2769..1f1de96 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -23,6 +23,7 @@ config X86
>  	select HAS_PDX
>  	select NUMA
>  	select VGA
> +	select VIOMMU
> 
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index db5df69..68f1631 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1513,6 +1513,7 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>      early_msi_init();
> 
>      iommu_setup();    /* setup iommu if available */
> +    viommu_setup();

start_xen is about physical bits; why place viommu stuff here?

> 
>      smp_prepare_cpus(max_cpus);
> 
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index dc8e876..2ad2c8d 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -49,6 +49,9 @@ config HAS_CHECKPOLICY
>  	string
>  	option env="XEN_HAS_CHECKPOLICY"
> 
> +config VIOMMU
> +	bool
> +
>  config KEXEC
>  	bool "kexec support"
>  	default y
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 26c5a64..852553d 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -56,6 +56,7 @@ obj-y += time.o
>  obj-y += timer.o
>  obj-y += trace.o
>  obj-y += version.o
> +obj-$(CONFIG_VIOMMU) += viommu.o
>  obj-y += virtual_region.o
>  obj-y += vm_event.o
>  obj-y += vmap.o
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index b22aacc..d1f9b10 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -396,6 +396,9 @@ struct domain *domain_create(domid_t domid,
> unsigned int domcr_flags,
>          spin_unlock(&domlist_update_lock);
>      }
> 
> +    if ( (err = viommu_init_domain(d)) != 0 )
> +        goto fail;
> +

Though the viommu framework is common, it's better to have
arch-specific code invoke the above function based on its own
requirements (e.g. if any dependency needs to be enforced).

>      return d;
> 
>   fail:
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> new file mode 100644
> index 0000000..6874d9f
> --- /dev/null
> +++ b/xen/common/viommu.c
> @@ -0,0 +1,165 @@
> +/*
> + * common/viommu.c
> + *
> + * Copyright (c) 2017 Intel Corporation
> + * Author: Lan Tianyu <tianyu.lan@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/sched.h>
> +#include <xen/spinlock.h>
> +#include <xen/types.h>
> +#include <xen/viommu.h>
> +
> +bool __read_mostly opt_viommu;
> +boolean_param("viommu", opt_viommu);
> +
> +static spinlock_t type_list_lock;
> +static struct list_head type_list;
> +
> +struct viommu_type {
> +    u64 type;

Please add a comment explaining what 'type' stands for.

> +    struct viommu_ops *ops;
> +    struct list_head node;
> +};
> +
> +int viommu_init_domain(struct domain *d)
> +{
> +    d->viommu.nr_viommu = 0;
> +    return 0;
> +}
> +
> +static struct viommu_type *viommu_get_type(u64 type)
> +{
> +    struct viommu_type *viommu_type = NULL;
> +
> +    spin_lock(&type_list_lock);
> +    list_for_each_entry( viommu_type, &type_list, node )
> +    {
> +        if ( viommu_type->type == type )
> +        {
> +            spin_unlock(&type_list_lock);
> +            return viommu_type;
> +        }
> +    }
> +    spin_unlock(&type_list_lock);
> +
> +    return NULL;
> +}
> +
> +int viommu_register_type(u64 type, struct viommu_ops * ops)
> +{
> +    struct viommu_type *viommu_type = NULL;
> +
> +    if ( !viommu_enabled() )
> +        return -EINVAL;

Shouldn't the above be a domain-specific check?

Thanks
Kevin


* Re: [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-08-24  6:54           ` Jan Beulich
@ 2017-08-24  8:36             ` Lan Tianyu
  0 siblings, 0 replies; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  8:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, Roger Pau Monné,
	Chao Gao

On 2017年08月24日 14:54, Jan Beulich wrote:
>>>> On 24.08.17 at 04:33, <tianyu.lan@intel.com> wrote:
>> On 2017年08月23日 16:04, Roger Pau Monné wrote:
>>> On Wed, Aug 23, 2017 at 03:52:01PM +0800, Lan Tianyu wrote:
>>>> On 2017年08月23日 00:41, Roger Pau Monné wrote:
>>>>>>> +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
>>>>>>> +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
>>>>>>> +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
>>>>>>> +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
>>>>>>> +    drhd->pci_segment = 0;
>>>>>>> +    drhd->base_address = config->iommu_base_addr;
>>>>>>> +
>>>>>>> +    scope = &drhd->scope[0];
>>>>>>> +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
>>>>>>> +    scope->length = ioapic_scope_size;
>>>>>>> +    scope->enumeration_id = config->ioapic_id;
>>>>>>> +    scope->bus = I440_PSEUDO_BUS_PLATFORM;
>>>>>>> +    scope->path[0] = I440_PSEUDO_DEVFN_IOAPIC;
>>>>> I'm not sure whether this constants should instead be fields in the
>>>>> acpi_config struct passed down from libxl. libxc shouldn't really need
>>>>> to know anything about which chipset a VM is using.
>>>>
>>>> How about rename I440_PSEUDO_XXX to VIOMMU_PSEUDO_XXX?
>>>
>>> I'm not really complaining about the naming, I'm just saying that I'm
>>> not sure whether this constants should live in libxl. It would be
>>> better IMHO if they where defined in some libxl x86 specific header,
>>> and passed to libxc inside of the acpi_config struct.
>>>
>>> At the end it is libxl which decides which chipset the VM is going to
>>> use, not libxc.
>>
>> We can do that but the bdf is reserved for IOAPIC and should be same for
>> different chipset. Do we still need to pass it via acpi_config?
> 
> Well, which value is the right (reserved) one surely can - at least
> in theory - depend on the chipset. Which means that it should
> come from the same place which determines the chipset to be
> emulated for the guest.
> 

OK. Will update.


-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM
  2017-08-24  2:16     ` Lan Tianyu
@ 2017-08-24  8:49       ` Roger Pau Monné
  2017-08-24  8:54         ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-24  8:49 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Thu, Aug 24, 2017 at 10:16:32AM +0800, Lan Tianyu wrote:
> On 2017年08月23日 15:58, Roger Pau Monné wrote:
> > On Wed, Aug 09, 2017 at 04:34:12PM -0400, Lan Tianyu wrote:
> >> From: Chao Gao <chao.gao@intel.com>
> >> +}
> >> +
> >> +#define vvtd_get_reg_quad(vvtd, reg, val) do {  \
> >> +    (val) = vvtd_get_reg(vvtd, (reg) + 4 );     \
> >> +    (val) = (val) << 32;                        \
> >> +    (val) += vvtd_get_reg(vvtd, reg);           \
> >> +} while(0)
> >> +#define vvtd_set_reg_quad(vvtd, reg, val) do {  \
> >> +    vvtd_set_reg(vvtd, reg, (val));             \
> >> +    vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
> >> +} while(0)
> > 
> > You seem to need to access hvm_hw_vvtd_regs using different sizes, why
> > not do:
> > 
> > union hvm_hw_vvtd_regs {
> >     uint8_t  data8[1024];
> >     uint16_t data16[512];
> >     uint32_t data32[256];
> >     uint64_t data64[128];
> > };
> > 
> > Then the access is much more straightforward and you don't need the
> > complicated helpers that you have above.
> 
> Yes, that will be simpler.

Keep in mind (as said in another patch) that this approach will only
work correctly as long as you force accesses to be size aligned, which
you are not doing at the moment.
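
As a rough sketch of what the union-based access could look like: the
accessor names below are hypothetical and not taken from the patch, and
the offsets are assumed to be byte offsets that the caller has already
checked to be in range and aligned to the access size.

```c
#include <stdint.h>

/* Register file viewed at four access widths, as proposed above. */
union hvm_hw_vvtd_regs {
    uint8_t  data8[1024];
    uint16_t data16[512];
    uint32_t data32[256];
    uint64_t data64[128];
};

/*
 * Hypothetical accessors (not from the patch). 'off' is a byte offset that
 * the caller has already checked to be in range and aligned to the access
 * size; otherwise off / size would silently round down to another register.
 */
static inline uint32_t vvtd_get_reg32(const union hvm_hw_vvtd_regs *r,
                                      uint32_t off)
{
    return r->data32[off / sizeof(uint32_t)];
}

static inline void vvtd_set_reg32(union hvm_hw_vvtd_regs *r, uint32_t off,
                                  uint32_t val)
{
    r->data32[off / sizeof(uint32_t)] = val;
}

static inline uint64_t vvtd_get_reg64(const union hvm_hw_vvtd_regs *r,
                                      uint32_t off)
{
    return r->data64[off / sizeof(uint64_t)];
}

static inline void vvtd_set_reg64(union hvm_hw_vvtd_regs *r, uint32_t off,
                                  uint64_t val)
{
    r->data64[off / sizeof(uint64_t)] = val;
}
```

Note that an unaligned offset such as 0x0c for an 8-byte access would be
rounded down to index 1, i.e. offset 0x08, which is why the alignment
has to be enforced before indexing.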

I've looked at the VT-d spec, but I cannot find any section that
explains the restrictions on access sizes and alignments.

Roger.


* Re: [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM
  2017-08-24  8:49       ` Roger Pau Monné
@ 2017-08-24  8:54         ` Lan Tianyu
  2017-08-24  9:02           ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-24  8:54 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On 2017-08-24 16:49, Roger Pau Monné wrote:
> On Thu, Aug 24, 2017 at 10:16:32AM +0800, Lan Tianyu wrote:
>> On 2017年08月23日 15:58, Roger Pau Monné wrote:
>>> On Wed, Aug 09, 2017 at 04:34:12PM -0400, Lan Tianyu wrote:
>>>> From: Chao Gao <chao.gao@intel.com>
>>>> +}
>>>> +
>>>> +#define vvtd_get_reg_quad(vvtd, reg, val) do {  \
>>>> +    (val) = vvtd_get_reg(vvtd, (reg) + 4 );     \
>>>> +    (val) = (val) << 32;                        \
>>>> +    (val) += vvtd_get_reg(vvtd, reg);           \
>>>> +} while(0)
>>>> +#define vvtd_set_reg_quad(vvtd, reg, val) do {  \
>>>> +    vvtd_set_reg(vvtd, reg, (val));             \
>>>> +    vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
>>>> +} while(0)
>>>
>>> You seem to need to access hvm_hw_vvtd_regs using different sizes, why
>>> not do:
>>>
>>> union hvm_hw_vvtd_regs {
>>>     uint8_t  data8[1024];
>>>     uint16_t data16[512];
>>>     uint32_t data32[256];
>>>     uint64_t data64[128];
>>> };
>>>
>>> Then the access is much more straightforward and you don't need the
>>> complicated helpers that you have above.
>>
>> Yes, that will be simpler.
> 
> Keep in mind (as said in another patch) that this approach will only
> work correctly as long as you force accesses to be size aligned, which
> you where not doing now.
> 
> I've looked at the VT-d spec, but I cannot find any section that
> explains the restrictions on access sizes and alignments.
> 

10.2 Software Access to Registers?

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM
  2017-08-24  8:54         ` Lan Tianyu
@ 2017-08-24  9:02           ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-24  9:02 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, wei.liu2, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Chao Gao

On Thu, Aug 24, 2017 at 04:54:25PM +0800, Lan Tianyu wrote:
> On 2017年08月24日 16:49, Roger Pau Monné wrote:
> > On Thu, Aug 24, 2017 at 10:16:32AM +0800, Lan Tianyu wrote:
> >> On 2017年08月23日 15:58, Roger Pau Monné wrote:
> >>> On Wed, Aug 09, 2017 at 04:34:12PM -0400, Lan Tianyu wrote:
> >>>> From: Chao Gao <chao.gao@intel.com>
> >>>> +}
> >>>> +
> >>>> +#define vvtd_get_reg_quad(vvtd, reg, val) do {  \
> >>>> +    (val) = vvtd_get_reg(vvtd, (reg) + 4 );     \
> >>>> +    (val) = (val) << 32;                        \
> >>>> +    (val) += vvtd_get_reg(vvtd, reg);           \
> >>>> +} while(0)
> >>>> +#define vvtd_set_reg_quad(vvtd, reg, val) do {  \
> >>>> +    vvtd_set_reg(vvtd, reg, (val));             \
> >>>> +    vvtd_set_reg(vvtd, (reg) + 4, (val) >> 32); \
> >>>> +} while(0)
> >>>
> >>> You seem to need to access hvm_hw_vvtd_regs using different sizes, why
> >>> not do:
> >>>
> >>> union hvm_hw_vvtd_regs {
> >>>     uint8_t  data8[1024];
> >>>     uint16_t data16[512];
> >>>     uint32_t data32[256];
> >>>     uint64_t data64[128];
> >>> };
> >>>
> >>> Then the access is much more straightforward and you don't need the
> >>> complicated helpers that you have above.
> >>
> >> Yes, that will be simpler.
> > 
> > Keep in mind (as said in another patch) that this approach will only
> > work correctly as long as you force accesses to be size aligned, which
> > you where not doing now.
> > 
> > I've looked at the VT-d spec, but I cannot find any section that
> > explains the restrictions on access sizes and alignments.
> > 
> 
> 10.2 Software Access to Registers?

Oh, thanks. I was grepping for "access sizes".

So yes, only size aligned accesses are allowed. You should fix the
patch where you check the access size/alignment so that it checks
(addr & (len - 1)) (IIRC you were checking addr & 3).
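
A minimal sketch of such a check follows. The function name is made up
for illustration, and restricting the length to 4 or 8 bytes is an
assumption about which access widths the emulation wants to accept, not
a quote from the spec.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if an MMIO access of 'len' bytes at
 * 'addr' is naturally (size) aligned. Accepting only 4- and 8-byte
 * accesses is an assumption made for this sketch.
 */
static inline bool vvtd_access_ok(uint64_t addr, unsigned int len)
{
    if ( len != 4 && len != 8 )
        return false;

    /* len is a power of two, so len - 1 is the alignment mask. */
    return (addr & (len - 1)) == 0;
}
```

For example, an 8-byte access at offset 0x0c passes an addr & 3 test but
is rejected here, because 0x0c & 7 is non-zero.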

Roger.


* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-24  3:24               ` Lan Tianyu
@ 2017-08-24 11:08                 ` Wei Liu
  2017-08-25  3:19                   ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Wei Liu @ 2017-08-24 11:08 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Gao, Chao

On Thu, Aug 24, 2017 at 11:24:12AM +0800, Lan Tianyu wrote:
> On 2017年08月23日 16:34, Wei Liu wrote:
> >>>
> >>> I would like the code to generate dmar take into consideration
> >>> libxl__dom_load_acpi.
> >>>
> >>
> >> If add dmar table for hvmlite, we should combine dmar table with other
> >> ACPI table and populate into acpi_modules[2]. This is how hvmlite add
> >> other ACPI tables in libxl__dom_load_acpi().
> >>
> > 
> > Sure, that sounds plausible.
> > 
> > What I would like to see is to have one entry point to manipulate APCI
> > tables.
> > 
> > Given the patch volume we're seeing now, we expect contributors to drive
> > the discussion forward. If you're not sure, feel free to ask more questions.
> 
> I am not sure whether I understood correctly.
> 
> PVHv2 builds all ACPI table in tool stack and uses acpi_module[0, 1, 2]
> to pass related table content.
> 
> HVM builds ACPI tables in hvmloader and just use acpi_module[0] to pass
> additional ACPI firmware or table.
> 
> These two modes have different way to use acpi_modules[]. So I think we
> can't combine them, right?
> 

There might be some misunderstanding.  We probably don't want to
manipulate the content of the tables in libxl.

> For build dmar table, we have introduced construct_dmar() in under
> libacpi to build dmar table and PVHv2 also can use it in
> libxl__dom_load_acpi().
> 

My major complaint is that there are now two functions, in two different
locations and in two different phases of domain construction, that
manipulate ACPI tables. I would like to have only one.

The function you're currently modifying, libxl__domain_firmware, is not
the right place. Its primary function is to load files from disk.

You should be able to call the function you introduced in
libxl__dom_load_acpi, provided appropriate checks are added.


* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-23 14:05           ` Roger Pau Monné
@ 2017-08-24 14:03             ` Julien Grall
  2017-08-24 14:25               ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Julien Grall @ 2017-08-24 14:03 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich, chao.gao

Hi,

On 23/08/17 15:05, Roger Pau Monné wrote:
> On Wed, Aug 23, 2017 at 11:19:01AM +0100, Julien Grall wrote:
>> Hi Roger,
>>
>> On 23/08/17 08:22, Roger Pau Monné wrote:
>>> On Wed, Aug 23, 2017 at 02:06:17PM +0800, Lan Tianyu wrote:
>>>> Hi Roger:
>>>> 	Thanks for your review.
>>>>
>>>> On 2017年08月22日 22:32, Roger Pau Monné wrote:
>>>>> On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
>>>>>> +
>>>>>> +/* vIOMMU capabilities */
>>>>>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>>>>>> +
>>>>>> +struct xen_domctl_viommu_op {
>>>>>> +    uint32_t cmd;
>>>>>> +#define XEN_DOMCTL_create_viommu          0
>>>>>> +#define XEN_DOMCTL_destroy_viommu         1
>>>>>> +#define XEN_DOMCTL_query_viommu_caps      2
>>>>>> +    union {
>>>>>> +        struct {
>>>>>> +            /* IN - vIOMMU type */
>>>>>> +            uint64_t viommu_type;
>>>>>> +            /*
>>>>>> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
>>>>>> +             * are in charge of to check base_address and length.
>>>>>> +             */
>>>>>> +            uint64_t base_address;
>>>>>> +            /* IN - Length of MMIO region */
>>>>>> +            uint64_t length;
>>>>>
>>>>> It seems weird that you can specify the length, is that something
>>>>> that a user would like to set? Isn't the length of the IOMMU MMIO
>>>>> region fixed by the hardware spec?
>>>>
>>>> Different vendor may have different IOMMU register region sizes. (e.g,
>>>> VTD has one page size for register region). The length field is to make
>>>> vIOMMU device model not to abuse address space. Some registers' offsets
>>>> are reported by other register and these offsets are emulated by vIOMMU
>>>> device model. If it's not necessary, we can remove it and add it when
>>>> there is real such requirement.
>>>
>>> So from my understanding the size of the IOMMU MMIO region is implicit
>>> in the IOMMU type that the user chooses. I don't think this field is
>>> needed.
>>
>> To me, it makes more sense to care both the base and the size rather than
>> only the former.
>>
>> The toolstack is in charge of the address space and should be aware of the
>> size of everything. This address space may not be static and it makes sense
>> to give this information to Xen and verify we had the same assumption.
>
> Does this imply that we will have variable size vIOMMU MMIO regions?

There are existing IOMMUs with multiple MMIO regions. This is the case for
the Nvidia SMMU. Whether we will emulate them is another question. But
for completeness, I would use address/size.

Note that we haven't decided on ARM whether we will emulate the IOMMU or 
use a PV based solution.

>
> If not the toolstack should know the size of the MMIO region at all
> times, unless you are running a toolstack version != Xen version,
> which is not supported.
>
>>>
>>>>>
>>>>>> +            /* IN - Capabilities with which we want to create */
>>>>>> +            uint64_t capabilities;
>>>>>> +            /* OUT - vIOMMU identity */
>>>>>> +            uint32_t viommu_id;
>>>>>> +        } create_viommu;
>>>>>> +
>>>>>> +        struct {
>>>>>> +            /* IN - vIOMMU identity */
>>>>>> +            uint32_t viommu_id;
>>>>>> +        } destroy_viommu;
>>>>>> +
>>>>>> +        struct {
>>>>>> +            /* IN - vIOMMU type */
>>>>>> +            uint64_t viommu_type;
>>>>>> +            /* OUT - vIOMMU Capabilities */
>>>>>> +            uint64_t capabilities;
>>>>>> +        } query_caps;
>>>>>
>>>>> This also seems weird, shouldn't you query the capabilities of an
>>>>> already created vIOMMU, rather than a vIOMMU type? Shouldn't the first
>>>>> field be viommu_id?
>>>>>
>>>>
>>>> Query interface here is to check what capabilities the vIOMMU device
>>>> model specified by viommu_type can support before create vIOMMU (suppose
>>>> user may select different capabilities). If capabilities returned by
>>>> query interface doesn't meet user configuration, tool stack should
>>>> return error. So it just accepts viommu_type.
>>>
>>> I don't think that's needed, if the chosen capabilities are not
>>> supported by the selected IOMMU type simply return error in
>>> XEN_DOMCTL_create_viommu.
>>>
>>> The capabilities of each IOMMU type should be listed in the man page,
>>> and the user should select a supported set or else
>>> XEN_DOMCTL_create_viommu will fail. Doing the checks both in the
>>> toolstack and in XEN_DOMCTL_create_viommu seems pointless and prone to
>>> errors.
>>
>> What if the some capabilities depends on host IOMMU? How are you going to
>> report that to the user?
>
> I would print a message on the hypervisor console, I don't see the
> value of doing the same check in the toolstack that Xen will also need
> to do in XEN_DOMCTL_create_viommu.
>
> I would see value on having such a query hypercall once we have an
> implementation that indeed has different capabilities depending on the
> hardware, and once a xl command to fetch and print such capabilities
> is introduced.

Fair enough.


> As said, the above query is only used to perform the checks done in
> XEN_DOMCTL_create_viommu on the toolstack.

Cheers,

-- 
Julien Grall


* Re: [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-08-24 14:03             ` Julien Grall
@ 2017-08-24 14:25               ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-24 14:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich, chao.gao

On Thu, Aug 24, 2017 at 03:03:49PM +0100, Julien Grall wrote:
> Hi,
> 
> On 23/08/17 15:05, Roger Pau Monné wrote:
> > On Wed, Aug 23, 2017 at 11:19:01AM +0100, Julien Grall wrote:
> > > Hi Roger,
> > > 
> > > On 23/08/17 08:22, Roger Pau Monné wrote:
> > > > On Wed, Aug 23, 2017 at 02:06:17PM +0800, Lan Tianyu wrote:
> > > > > Hi Roger:
> > > > > 	Thanks for your review.
> > > > > 
> > > > > On 2017年08月22日 22:32, Roger Pau Monné wrote:
> > > > > > On Wed, Aug 09, 2017 at 04:34:02PM -0400, Lan Tianyu wrote:
> > > > > > > +
> > > > > > > +/* vIOMMU capabilities */
> > > > > > > +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> > > > > > > +
> > > > > > > +struct xen_domctl_viommu_op {
> > > > > > > +    uint32_t cmd;
> > > > > > > +#define XEN_DOMCTL_create_viommu          0
> > > > > > > +#define XEN_DOMCTL_destroy_viommu         1
> > > > > > > +#define XEN_DOMCTL_query_viommu_caps      2
> > > > > > > +    union {
> > > > > > > +        struct {
> > > > > > > +            /* IN - vIOMMU type */
> > > > > > > +            uint64_t viommu_type;
> > > > > > > +            /*
> > > > > > > +             * IN - MMIO base address of vIOMMU. vIOMMU device models
> > > > > > > +             * are in charge of to check base_address and length.
> > > > > > > +             */
> > > > > > > +            uint64_t base_address;
> > > > > > > +            /* IN - Length of MMIO region */
> > > > > > > +            uint64_t length;
> > > > > > 
> > > > > > It seems weird that you can specify the length, is that something
> > > > > > that a user would like to set? Isn't the length of the IOMMU MMIO
> > > > > > region fixed by the hardware spec?
> > > > > 
> > > > > Different vendor may have different IOMMU register region sizes. (e.g,
> > > > > VTD has one page size for register region). The length field is to make
> > > > > vIOMMU device model not to abuse address space. Some registers' offsets
> > > > > are reported by other register and these offsets are emulated by vIOMMU
> > > > > device model. If it's not necessary, we can remove it and add it when
> > > > > there is real such requirement.
> > > > 
> > > > So from my understanding the size of the IOMMU MMIO region is implicit
> > > > in the IOMMU type that the user chooses. I don't think this field is
> > > > needed.
> > > 
> > > To me, it makes more sense to care both the base and the size rather than
> > > only the former.
> > > 
> > > The toolstack is in charge of the address space and should be aware of the
> > > size of everything. This address space may not be static and it makes sense
> > > to give this information to Xen and verify we had the same assumption.
> > 
> > Does this imply that we will have variable size vIOMMU MMIO regions?
> 
> There are existing IOMMU with multiple MMIO regions. This is the case of the
> Nvidia SMMU. Whether we will emulate then is another question. But for
> completeness, I would use address/size.

The proposed implementation does not support multiple MMIO regions
anyway. I'm not going to oppose this anymore, but I don't see much
value in implementing things just for completeness without having a
real use case, especially when this is a domctl interface that is not
stable, i.e. we can always modify it later on without any issues.

Roger.


* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-24 11:08                 ` Wei Liu
@ 2017-08-25  3:19                   ` Lan Tianyu
  2017-08-25  7:33                     ` Lan Tianyu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-25  3:19 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, Gao, Chao

On 2017-08-24 19:08, Wei Liu wrote:
>>>> If add dmar table for hvmlite, we should combine dmar table with other
>>>> ACPI table and populate into acpi_modules[2]. This is how hvmlite add
>>>> other ACPI tables in libxl__dom_load_acpi().
>>>>
>>>
>>> Sure, that sounds plausible.
>>>
>>> What I would like to see is to have one entry point to manipulate APCI
>>> tables.
>>>
>>> Given the patch volume we're seeing now, we expect contributors to drive
>>> the discussion forward. If you're not sure, feel free to ask more questions.
>>
>> I am not sure whether I understood correctly.
>>
>> PVHv2 builds all ACPI table in tool stack and uses acpi_module[0, 1, 2]
>> to pass related table content.
>>
>> HVM builds ACPI tables in hvmloader and just use acpi_module[0] to pass
>> additional ACPI firmware or table.
>>
>> These two modes have different way to use acpi_modules[]. So I think we
>> can't combine them, right?
>>
> There might be some misunderstanding.  We probably don't want to
> manipulate the content of the tables in libxl.
>
>> For build dmar table, we have introduced construct_dmar() in under
>> libacpi to build dmar table and PVHv2 also can use it in
>> libxl__dom_load_acpi().
>>
> My major complain is now there are two functions and in two different
> locations, in two different phases of domain construction that would
> manipulate ACPI tables. I would like to have only one.
> 
> The function you're currently modifying libxl__domain_firmware is not
> the right place. It's primary function is to load files from disks.
> 
> You should be able to call the function you introduced in
> libxl__dom_load_acpi, provided appropriate checks are added.

But libxl__dom_load_acpi() isn't called on the HVM guest code path. It only
works for PVHv2/HVMlite and has some conflicts with the HVM guest
configuration (i.e. acpi_module).


int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
                                               libxl_domain_build_info *info,
                                               struct xc_dom_image *dom)
{
    int rc = 0;

    if ((info->type == LIBXL_DOMAIN_TYPE_HVM) &&
        (info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)) {
        rc = libxl__dom_load_acpi(gc, info, dom);
        if (rc != 0)
            LOGE(ERROR, "libxl_dom_load_acpi failed");
    }

    return rc;
}



-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d
  2017-08-23 12:19   ` Roger Pau Monné
@ 2017-08-25  6:35     ` Chao Gao
  2017-08-25  9:00       ` Jan Beulich
  0 siblings, 1 reply; 136+ messages in thread
From: Chao Gao @ 2017-08-25  6:35 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Lan Tianyu, kevin.tian, julien.grall, xen-devel

On Wed, Aug 23, 2017 at 01:19:41PM +0100, Roger Pau Monné wrote:
>On Wed, Aug 09, 2017 at 04:34:26PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> Wrap some useful status in a new structure hvm_hw_vvtd, following
>> the customs of vlapic, vioapic and etc. Provide two save-restore
>> pairs to save/restore registers and non-register status.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>> diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
>> index fd7bf3f..10536cb 100644
>> --- a/xen/include/public/arch-x86/hvm/save.h
>> +++ b/xen/include/public/arch-x86/hvm/save.h
>> @@ -639,10 +639,32 @@ struct hvm_msr {
>>  
>>  #define CPU_MSR_CODE  20
>>  
>> +struct hvm_hw_vvtd_regs {
>> +    uint8_t data[1024];
>> +};
>> +
>> +DECLARE_HVM_SAVE_TYPE(IOMMU_REGS, 21, struct hvm_hw_vvtd_regs);
>> +
>> +struct hvm_hw_vvtd
>> +{
>> +    /* VIOMMU_STATUS_XXX */
>> +    uint32_t status;
>> +    /* Fault Recording index */
>> +    uint32_t frcd_idx;
>> +    /* Is in Extended Interrupt Mode? */
>> +    uint32_t eim;
>> +    /* Max remapping entries in IRT */
>> +    uint32_t irt_max_entry;
>> +    /* Interrupt remapping table base gfn */
>> +    uint64_t irt;
>> +};
>> +
>> +DECLARE_HVM_SAVE_TYPE(IOMMU, 22, struct hvm_hw_vvtd);
>
>Why two separate structures? It should be the same structure.

Hi, Roger.

Thank you for your review. I agree with most of your comments on the
whole series. I will only reply to some points I think still need
discussion.

Here we use two separate structures because some fields cannot be inferred
from struct hvm_hw_vvtd_regs. For example, 'irt' is the gfn of the base
address of the Interrupt Remapping Table. The field is set by:
1. setting the register DMAR_IRTE_REG in hvm_hw_vvtd_regs;
2. sending a command to vtd by writing another command register.

Suppose the current base address is A, and the guest wants to update the
base address to B and has finished only the first step. If saving and
restoring happen at this point, the register already holds B while the
value in use is still A. In this case, we need struct hvm_hw_vvtd to
correctly restore this information.
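
To illustrate the window described above, here is a simplified sketch.
The structure and function names are hypothetical and do not come from
the patch, and the 4K page shift used to derive the gfn is an assumption.

```c
#include <stdint.h>

/* Hypothetical, simplified model of the two pieces of state. */
struct vvtd_sketch {
    uint64_t table_addr_reg; /* guest-visible register, written in step 1 */
    uint64_t irt;            /* base gfn in use, latched only by step 2   */
};

/* Step 1: the guest writes the table address register. */
static inline void sketch_write_table_addr(struct vvtd_sketch *v, uint64_t val)
{
    v->table_addr_reg = val;
}

/* Step 2: the guest issues the command that makes the new base take effect. */
static inline void sketch_cmd_set_table(struct vvtd_sketch *v)
{
    v->irt = v->table_addr_reg >> 12; /* assumes 4K pages */
}
```

If a save happens between the two steps, table_addr_reg holds the new
address while irt still refers to the old table, so the register
contents alone are not enough to reconstruct irt on restore.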

Thanks
Chao


* Re: [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults
  2017-08-23 11:51   ` Roger Pau Monné
@ 2017-08-25  7:17     ` Chao Gao
  2017-08-25  9:43       ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Chao Gao @ 2017-08-25  7:17 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Lan Tianyu, kevin.tian, julien.grall, xen-devel

On Wed, Aug 23, 2017 at 12:51:27PM +0100, Roger Pau Monné wrote:
>On Wed, Aug 09, 2017 at 04:34:24PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> Interrupt translation faults are non-recoverable fault. When faults
>                                                   ^ faults
>> are triggered, it needs to populate fault info to Fault Recording
>> Registers and inject vIOMMU msi interrupt to notify guest IOMMU driver
>> to deal with faults.
>> 
>> This patch emulates hardware's handling interrupt translation
>> faults (more information about the process can be found in VT-d spec,
>> chipter "Translation Faults", section "Non-Recoverable Fault
>  ^ chapter
>> Reporting" and section "Non-Recoverable Logging").
>> Specifically, viommu_record_fault() records the fault information and
>> viommu_report_non_recoverable_fault() reports faults to software.
>> Currently, only Primary Fault Logging is supported and the Number of
>> Fault-recording Registers is 1.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>
>>      /* Address range of remapping hardware register-set */
>>      uint64_t base_addr;
>>      uint64_t length;
>> @@ -97,6 +101,23 @@ static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
>>      return domain_vvtd(v->domain);
>>  }
>>  
>> +static inline int vvtd_test_and_set_bit(struct vvtd *vvtd, uint32_t reg,
>> +                                        int nr)
>
>unsigned int for nr, and I'm not really sure the usefulness of this
>helpers. In any case inline should not be used and instead let the
>compiler optimize this.
>

I think the compiler doesn't know how frequently these functions are
called. Explicitly marking a function inline can sometimes help in cases
where the compiler would not otherwise inline a short, frequently used
function.

>> +static void vvtd_report_non_recoverable_fault(struct vvtd *vvtd, int reason)
>> +{
>> +    uint32_t fsts;
>> +
>> +    ASSERT(reason & DMA_FSTS_FAULTS);
>> +    fsts = vvtd_get_reg(vvtd, DMAR_FSTS_REG);
>> +    __vvtd_set_bit(vvtd, DMAR_FSTS_REG, reason);
>
>I don't understand this, is reason a bit position or a mask?
>
>DMA_FSTS_FAULTS seems to be a mask, that should be set into DMAR_FSTS_REG?

According to VT-d spec section 10.4.9, each kind of fault is denoted by
one bit in DMAR_FSTS_REG.
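
The distinction between a bit position and a mask can be sketched as
follows. The macro and function names are made up for this example, and
the bit positions (0 for primary fault overflow, 1 for primary pending
fault) are used for illustration only and are not copied from the patch.

```c
#include <stdint.h>

/* Illustrative bit positions; names are hypothetical. */
#define SKETCH_FSTS_PFO_BIT 0
#define SKETCH_FSTS_PPF_BIT 1

/* Masks derived from the bit positions. */
#define SKETCH_FSTS_PFO     (1u << SKETCH_FSTS_PFO_BIT)
#define SKETCH_FSTS_PPF     (1u << SKETCH_FSTS_PPF_BIT)
#define SKETCH_FSTS_FAULTS  (SKETCH_FSTS_PFO | SKETCH_FSTS_PPF)

/* Set the fault bit at position 'nr' in a copy of the status register. */
static inline uint32_t sketch_set_fault_bit(uint32_t fsts, unsigned int nr)
{
    return fsts | (1u << nr);
}
```

With these definitions, testing a bit position directly against the
mask does not work: position 0 ANDed with any mask is 0. The position
has to be converted with (1u << nr) before it is compared with a mask.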

>>  static int vvtd_record_fault(struct vvtd *vvtd,
>> -                             struct irq_remapping_request *irq,
>> +                             struct irq_remapping_request *request,
>>                               int reason)
>>  {
>> -    return 0;
>> +    struct vtd_fault_record_register frcd;
>> +    int frcd_idx;
>> +
>> +    switch(reason)
>> +    {
>> +    case VTD_FR_IR_REQ_RSVD:
>> +    case VTD_FR_IR_INDEX_OVER:
>> +    case VTD_FR_IR_ENTRY_P:
>> +    case VTD_FR_IR_ROOT_INVAL:
>> +    case VTD_FR_IR_IRTE_RSVD:
>> +    case VTD_FR_IR_REQ_COMPAT:
>> +    case VTD_FR_IR_SID_ERR:
>> +        if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PFO_BIT) )
>> +            return X86EMUL_OKAY;
>> +
>> +        /* No available Fault Record means Fault overflowed */
>> +        frcd_idx = vvtd_alloc_frcd(vvtd);
>> +        if ( frcd_idx == -1 )
>> +        {
>> +            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PFO_BIT);
>> +            return X86EMUL_OKAY;
>> +        }
>> +        memset(&frcd, 0, sizeof(frcd));
>> +        frcd.fields.FR = (u8)reason;
>> +        frcd.fields.FI = ((u64)irq_remapping_request_index(request)) << 36;
>> +        frcd.fields.SID = (u16)request->source_id;
>> +        frcd.fields.F = 1;
>> +        vvtd_commit_frcd(vvtd, frcd_idx, &frcd);
>> +        return X86EMUL_OKAY;
>> +
>> +    default:
>
>Other reasons are just ignored? Should this have an ASSERT_UNREACHABLE
>maybe?

It can have one, since all the faults are raised by vvtd. When vvtd
starts generating a new kind of fault, the corresponding handler should
be added as well.

>
>> +        break;
>> +    }
>> +
>> +    gdprintk(XENLOG_ERR, "Can't handle vVTD Fault (reason 0x%x).", reason);
>> +    domain_crash(vvtd->domain);
>
>Oh, I see. Is it expected that such faults with unhandled reasons can
>be somehow generated by the domain itself?
>

No. Faults are generated by vvtd. We only add interrupt translation
faults. Other faults can be added when adding other features (e.g. DMA
remapping). 



* Re: [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq
  2017-08-23 10:41   ` Roger Pau Monné
@ 2017-08-25  7:28     ` Chao Gao
  2017-08-25  9:35       ` Roger Pau Monné
  0 siblings, 1 reply; 136+ messages in thread
From: Chao Gao @ 2017-08-25  7:28 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, kevin.tian, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, julien.grall, jbeulich

On Wed, Aug 23, 2017 at 11:41:25AM +0100, Roger Pau Monné wrote:
>On Wed, Aug 09, 2017 at 04:34:22PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> Introduce a new binding relationship and provide a new interface to
>> manage the new relationship.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>          pirq_dpci->gmsi.posted = false;
>>          vcpu = (dest_vcpu_id >= 0) ? d->vcpu[dest_vcpu_id] : NULL;
>> -        if ( iommu_intpost )
>> +        /* Currently, don't use interrupt posting for guest's remapping MSIs */
>> +        if ( iommu_intpost && !ir )
>>          {
>>              if ( delivery_mode == dest_LowestPrio )
>>                  vcpu = vector_hashing_dest(d, dest, dest_mode,
>> @@ -435,7 +527,7 @@ int pt_irq_create_bind(
>>              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
>>  
>>          /* Use interrupt posting if it is supported. */
>> -        if ( iommu_intpost )
>> +        if ( iommu_intpost && !ir )
>
>So with interrupt remapping posted interrupts are not available...

Yes. We want to keep things simple. Currently, no vIRTE is cached by
vvtd, and thus we needn't do anything when the guest tries to flush a
vIRTE. If we used posted interrupts here, some information would be
cached by the physical VT-d. In that case, we would need extra work to
flush the corresponding physical IRTE. We don't include these patches in
this series.

Thanks
Chao



* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-25  3:19                   ` Lan Tianyu
@ 2017-08-25  7:33                     ` Lan Tianyu
  2017-08-25  9:11                       ` Wei Liu
  0 siblings, 1 reply; 136+ messages in thread
From: Lan Tianyu @ 2017-08-25  7:33 UTC (permalink / raw)
  To: Wei Liu
  Cc: kevin.tian, andrew.cooper3, ian.jackson, xen-devel, julien.grall,
	jbeulich, Gao, Chao

On 2017年08月25日 11:19, Lan Tianyu wrote:
> On 2017年08月24日 19:08, Wei Liu wrote:
>>>>> If add dmar table for hvmlite, we should combine dmar table with other
>>>>>>>> ACPI table and populate into acpi_modules[2]. This is how hvmlite add
>>>>>>>> other ACPI tables in libxl__dom_load_acpi().
>>>>>>>>
>>>>>>
>>>>>> Sure, that sounds plausible.
>>>>>>
>>>>>> What I would like to see is to have one entry point to manipulate ACPI
>>>>>> tables.
>>>>>>
>>>>>> Given the patch volume we're seeing now, we expect contributors to drive
>>>>>> the discussion forward. If you're not sure, feel free to ask more questions.
>>>>
>>>> I am not sure whether I understood correctly.
>>>>
>>>> PVHv2 builds all ACPI table in tool stack and uses acpi_module[0, 1, 2]
>>>> to pass related table content.
>>>>
>>>> HVM builds ACPI tables in hvmloader and just use acpi_module[0] to pass
>>>> additional ACPI firmware or table.
>>>>
>>>> These two modes have different way to use acpi_modules[]. So I think we
>>>> can't combine them, right?
>>>>
>> There might be some misunderstanding.  We probably don't want to
>> manipulate the content of the tables in libxl.
>>
>>>> For build dmar table, we have introduced construct_dmar() in under
>>>> libacpi to build dmar table and PVHv2 also can use it in
>>>> libxl__dom_load_acpi().
>>>>
>> My major complaint is that there are now two functions, in two different
>> locations and in two different phases of domain construction, that
>> manipulate ACPI tables. I would like to have only one.
>>
>> The function you're currently modifying, libxl__domain_firmware, is not
>> the right place. Its primary function is to load files from disks.
>>
>> You should be able to call the function you introduced in
>> libxl__dom_load_acpi, provided appropriate checks are added.
> 
> But libxl__dom_load_acpi() isn't called on the HVM guest code path. It
> only works for PVHv2/HVMlite and conflicts with the HVM guest
> configuration (i.e., acpi_module).
> 
> 
> int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>                                                libxl_domain_build_info *info,
>                                                struct xc_dom_image *dom)
> {
>     int rc = 0;
> 
>     if ((info->type == LIBXL_DOMAIN_TYPE_HVM) &&
>         (info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)) {
>         rc = libxl__dom_load_acpi(gc, info, dom);
>         if (rc != 0)
>             LOGE(ERROR, "libxl_dom_load_acpi failed");
>     }
> 
>     return rc;
> }

We could remove the check and move the newly introduced code into
libxl__dom_load_acpi(), running the new code only for HVM guests. Does
this make sense?

-- 
Best regards
Tianyu Lan


* Re: [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d
  2017-08-25  9:00       ` Jan Beulich
@ 2017-08-25  8:27         ` Chao Gao
  0 siblings, 0 replies; 136+ messages in thread
From: Chao Gao @ 2017-08-25  8:27 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Lan Tianyu, julien.grall, xen-devel, kevin.tian, Roger Pau Monné

On Fri, Aug 25, 2017 at 03:00:32AM -0600, Jan Beulich wrote:
>>>> On 25.08.17 at 08:35, <chao.gao@intel.com> wrote:
>> On Wed, Aug 23, 2017 at 01:19:41PM +0100, Roger Pau Monné wrote:
>>>On Wed, Aug 09, 2017 at 04:34:26PM -0400, Lan Tianyu wrote:
>>>> From: Chao Gao <chao.gao@intel.com>
>>>> 
>>>> Wrap some useful status in a new structure hvm_hw_vvtd, following
>>>> the customs of vlapic, vioapic and etc. Provide two save-restore
>>>> pairs to save/restore registers and non-register status.
>>>> 
>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>>>> ---
>>>> diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
>>>> index fd7bf3f..10536cb 100644
>>>> --- a/xen/include/public/arch-x86/hvm/save.h
>>>> +++ b/xen/include/public/arch-x86/hvm/save.h
>>>> @@ -639,10 +639,32 @@ struct hvm_msr {
>>>>  
>>>>  #define CPU_MSR_CODE  20
>>>>  
>>>> +struct hvm_hw_vvtd_regs {
>>>> +    uint8_t data[1024];
>>>> +};
>>>> +
>>>> +DECLARE_HVM_SAVE_TYPE(IOMMU_REGS, 21, struct hvm_hw_vvtd_regs);
>>>> +
>>>> +struct hvm_hw_vvtd
>>>> +{
>>>> +    /* VIOMMU_STATUS_XXX */
>>>> +    uint32_t status;
>>>> +    /* Fault Recording index */
>>>> +    uint32_t frcd_idx;
>>>> +    /* Is in Extended Interrupt Mode? */
>>>> +    uint32_t eim;
>>>> +    /* Max remapping entries in IRT */
>>>> +    uint32_t irt_max_entry;
>>>> +    /* Interrupt remapping table base gfn */
>>>> +    uint64_t irt;
>>>> +};
>>>> +
>>>> +DECLARE_HVM_SAVE_TYPE(IOMMU, 22, struct hvm_hw_vvtd);
>>>
>>>Why two separate structures? It should be the same structure.
>> 
>> Hi, Roger.
>> 
>> Thank you for your review. I agree with most of your comments on the
>> whole series. I will only reply to some points I think still need
>> discussion.
>> 
>> Here we use two separate structures because some fields cannot be
>> inferred from struct hvm_hw_vvtd_regs. For example, 'irt' is the gfn of
>> the Interrupt Remapping Table base address. The field is set by:
>> 1. setting the register DMAR_IRTE_REG in hvm_hw_vvtd_regs.
>> 2. sending a command to vtd by writing another command register.
>> 
>> If the current base address is A and the guest wants to update it to
>> B, a save and restore may happen after the guest has finished only
>> the first step. In this case, we need struct hvm_hw_vvtd to restore
>> the state correctly.
>
>Hmm, the way I've understood Roger's question is why you
>don't combine the two structures into one, not whether one
>of the two can be omitted.

It seems likely that they can be combined. I will give it a try.
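
A rough sketch of what a single combined record might look like,
offered only as an illustration. The field names come from the patch;
the struct name, the helper names and the register offset below are
assumptions made for this sketch, and DECLARE_HVM_SAVE_TYPE is left out
so that the fragment stays self-contained. The point it shows is that
the committed table base (irt) and the half-written register value can
travel in the same record.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VVTD_REG_BYTES 1024u  /* register image size used in the patch */
#define SKETCH_IRTA_OFFSET    0xb8u  /* assumed offset of the IRT address register */

/* Hypothetical merged save record: non-register state plus register image. */
struct sketch_hvm_hw_vvtd {
    uint32_t status;         /* VIOMMU_STATUS_XXX */
    uint32_t frcd_idx;       /* Fault Recording index */
    uint32_t eim;            /* Is in Extended Interrupt Mode? */
    uint32_t irt_max_entry;  /* Max remapping entries in IRT */
    uint64_t irt;            /* Interrupt remapping table base gfn */
    uint8_t  regs[SKETCH_VVTD_REG_BYTES];
};

/* Step 1 of the guest's update: only the register image changes. */
static void sketch_write_irta(struct sketch_hvm_hw_vvtd *s, uint64_t addr)
{
    assert(SKETCH_IRTA_OFFSET + sizeof(addr) <= SKETCH_VVTD_REG_BYTES);
    memcpy(&s->regs[SKETCH_IRTA_OFFSET], &addr, sizeof(addr));
}

static uint64_t sketch_read_irta(const struct sketch_hvm_hw_vvtd *s)
{
    uint64_t addr;

    memcpy(&addr, &s->regs[SKETCH_IRTA_OFFSET], sizeof(addr));
    return addr;
}

/* With one record, save and restore are each a single copy. */
static void sketch_save(const struct sketch_hvm_hw_vvtd *live,
                        struct sketch_hvm_hw_vvtd *rec)
{
    *rec = *live;
}

static void sketch_load(struct sketch_hvm_hw_vvtd *live,
                        const struct sketch_hvm_hw_vvtd *rec)
{
    *live = *rec;
}
```

If a save happens between step 1 and step 2, the restored state still
has irt pointing at A while the register image already holds B, which
is the case described above.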

Thanks
Chao


* Re: [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d
  2017-08-25  6:35     ` Chao Gao
@ 2017-08-25  9:00       ` Jan Beulich
  2017-08-25  8:27         ` Chao Gao
  0 siblings, 1 reply; 136+ messages in thread
From: Jan Beulich @ 2017-08-25  9:00 UTC (permalink / raw)
  To: Chao Gao
  Cc: Lan Tianyu, julien.grall, xen-devel, kevin.tian, Roger Pau Monné

>>> On 25.08.17 at 08:35, <chao.gao@intel.com> wrote:
> On Wed, Aug 23, 2017 at 01:19:41PM +0100, Roger Pau Monné wrote:
>>On Wed, Aug 09, 2017 at 04:34:26PM -0400, Lan Tianyu wrote:
>>> From: Chao Gao <chao.gao@intel.com>
>>> 
>>> Wrap some useful status in a new structure hvm_hw_vvtd, following
>>> the customs of vlapic, vioapic and etc. Provide two save-restore
>>> pairs to save/restore registers and non-register status.
>>> 
>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>>> ---
>>> diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
>>> index fd7bf3f..10536cb 100644
>>> --- a/xen/include/public/arch-x86/hvm/save.h
>>> +++ b/xen/include/public/arch-x86/hvm/save.h
>>> @@ -639,10 +639,32 @@ struct hvm_msr {
>>>  
>>>  #define CPU_MSR_CODE  20
>>>  
>>> +struct hvm_hw_vvtd_regs {
>>> +    uint8_t data[1024];
>>> +};
>>> +
>>> +DECLARE_HVM_SAVE_TYPE(IOMMU_REGS, 21, struct hvm_hw_vvtd_regs);
>>> +
>>> +struct hvm_hw_vvtd
>>> +{
>>> +    /* VIOMMU_STATUS_XXX */
>>> +    uint32_t status;
>>> +    /* Fault Recording index */
>>> +    uint32_t frcd_idx;
>>> +    /* Is in Extended Interrupt Mode? */
>>> +    uint32_t eim;
>>> +    /* Max remapping entries in IRT */
>>> +    uint32_t irt_max_entry;
>>> +    /* Interrupt remapping table base gfn */
>>> +    uint64_t irt;
>>> +};
>>> +
>>> +DECLARE_HVM_SAVE_TYPE(IOMMU, 22, struct hvm_hw_vvtd);
>>
>>Why two separate structures? It should be the same structure.
> 
> Hi, Roger.
> 
> Thank you for your review. I agree with most of your comments on the
> whole series. I will only reply to some points I think still need
> discussion.
> 
> Here we use two separate structures because some fields cannot be
> inferred from struct hvm_hw_vvtd_regs. For example, 'irt' is the gfn of
> the Interrupt Remapping Table base address. The field is set by:
> 1. setting the register DMAR_IRTE_REG in hvm_hw_vvtd_regs.
> 2. sending a command to vtd by writing another command register.
> 
> If the current base address is A and the guest wants to update it to
> B, a save and restore may happen after the guest has finished only
> the first step. In this case, we need struct hvm_hw_vvtd to restore
> the state correctly.

Hmm, the way I've understood Roger's question is why you
don't combine the two structures into one, not whether one
of the two can be omitted.

Jan


* Re: [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-08-25  7:33                     ` Lan Tianyu
@ 2017-08-25  9:11                       ` Wei Liu
  0 siblings, 0 replies; 136+ messages in thread
From: Wei Liu @ 2017-08-25  9:11 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: kevin.tian, Wei Liu, andrew.cooper3, ian.jackson, xen-devel,
	julien.grall, jbeulich, Gao, Chao

On Fri, Aug 25, 2017 at 03:33:47PM +0800, Lan Tianyu wrote:
> > 
> > int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
> >                                                libxl_domain_build_info *info,
> >                                                struct xc_dom_image *dom)
> > {
> >     int rc = 0;
> > 
> >     if ((info->type == LIBXL_DOMAIN_TYPE_HVM) &&
> >         (info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)) {
> >         rc = libxl__dom_load_acpi(gc, info, dom);
> >         if (rc != 0)
> >             LOGE(ERROR, "libxl_dom_load_acpi failed");
> >     }
> > 
> >     return rc;
> > }
> 
> We could remove the check and move the newly introduced code into
> libxl__dom_load_acpi(), running the new code only for HVM guests. Does
> this make sense?
> 

More or less. Push the check down to libxl__dom_load_acpi
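
As a rough sketch of what pushing the check down could look like: the
caller becomes unconditional and the callee decides which tables to
build. Every type, field and function name below is a stand-in invented
for this illustration (the real libxl structures and libacpi calls are
not reproduced), and the "viommu" flag is an assumed way of saying the
guest has a virtual VT-d.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_domain_type { SKETCH_TYPE_PV, SKETCH_TYPE_HVM };
enum sketch_dm_version  { SKETCH_DM_NONE, SKETCH_DM_QEMU };

struct sketch_build_info {
    enum sketch_domain_type type;
    enum sketch_dm_version  device_model_version;
    bool viommu;               /* hypothetical: guest has a virtual VT-d */
};

struct sketch_dom_image {
    bool full_acpi_built;      /* all tables built by the toolstack */
    bool dmar_built;           /* DMAR table built by the toolstack */
};

/* Stand-in for libxl__dom_load_acpi() with the check pushed down. */
static int sketch_dom_load_acpi(const struct sketch_build_info *info,
                                struct sketch_dom_image *dom)
{
    assert(info && dom);

    if ( info->type != SKETCH_TYPE_HVM )
        return 0;

    /* Formerly checked by the caller: no device model, build everything. */
    if ( info->device_model_version == SKETCH_DM_NONE )
        dom->full_acpi_built = true;

    /* New code: any HVM guest with a vIOMMU gets a DMAR table. */
    if ( info->viommu )
        dom->dmar_built = true;

    return 0;
}

/* Stand-in for the arch hook: a single, unconditional entry point. */
static int sketch_finalise_hw_description(const struct sketch_build_info *info,
                                          struct sketch_dom_image *dom)
{
    return sketch_dom_load_acpi(info, dom);
}
```

This keeps one entry point for ACPI table handling while leaving the
existing device-model-less behaviour unchanged.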

> -- 
> Best regards
> Tianyu Lan


* Re: [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq
  2017-08-25  7:28     ` Chao Gao
@ 2017-08-25  9:35       ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-25  9:35 UTC (permalink / raw)
  To: Lan Tianyu, xen-devel, kevin.tian, wei.liu2, andrew.cooper3,
	ian.jackson, julien.grall, jbeulich

On Fri, Aug 25, 2017 at 03:28:46PM +0800, Chao Gao wrote:
> On Wed, Aug 23, 2017 at 11:41:25AM +0100, Roger Pau Monné wrote:
> >On Wed, Aug 09, 2017 at 04:34:22PM -0400, Lan Tianyu wrote:
> >> From: Chao Gao <chao.gao@intel.com>
> >> 
> >> Introduce a new binding relationship and provide a new interface to
> >> manage the new relationship.
> >> 
> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> >> ---
> >>          pirq_dpci->gmsi.posted = false;
> >>          vcpu = (dest_vcpu_id >= 0) ? d->vcpu[dest_vcpu_id] : NULL;
> >> -        if ( iommu_intpost )
> >> +        /* Currently, don't use interrupt posting for guest's remapping MSIs */
> >> +        if ( iommu_intpost && !ir )
> >>          {
> >>              if ( delivery_mode == dest_LowestPrio )
> >>                  vcpu = vector_hashing_dest(d, dest, dest_mode,
> >> @@ -435,7 +527,7 @@ int pt_irq_create_bind(
> >>              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
> >>  
> >>          /* Use interrupt posting if it is supported. */
> >> -        if ( iommu_intpost )
> >> +        if ( iommu_intpost && !ir )
> >
> >So with interrupt remapping posted interrupts are not available...
> 
> Yes. We want to keep things simple. Currently, no vIRTE is cached by
> vvtd, so nothing needs to be done when the guest flushes a vIRTE. If we
> used posted interrupts here, some information would be cached by the
> physical VT-d, and we would then have to flush the corresponding
> physical IRTE. The patches for that are not included in this series.

Might be worth adding a "FIXME".

Thanks, Roger.


* Re: [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults
  2017-08-25  7:17     ` Chao Gao
@ 2017-08-25  9:43       ` Roger Pau Monné
  0 siblings, 0 replies; 136+ messages in thread
From: Roger Pau Monné @ 2017-08-25  9:43 UTC (permalink / raw)
  To: Lan Tianyu, xen-devel, kevin.tian, julien.grall

On Fri, Aug 25, 2017 at 03:17:24PM +0800, Chao Gao wrote:
> On Wed, Aug 23, 2017 at 12:51:27PM +0100, Roger Pau Monné wrote:
> >On Wed, Aug 09, 2017 at 04:34:24PM -0400, Lan Tianyu wrote:
> >> From: Chao Gao <chao.gao@intel.com>
> >> 
> >> Interrupt translation faults are non-recoverable fault. When faults
> >                                                   ^ faults
> >> are triggered, it needs to populate fault info to Fault Recording
> >> Registers and inject vIOMMU msi interrupt to notify guest IOMMU driver
> >> to deal with faults.
> >> 
> >> This patch emulates hardware's handling interrupt translation
> >> faults (more information about the process can be found in VT-d spec,
> >> chipter "Translation Faults", section "Non-Recoverable Fault
> >  ^ chapter
> >> Reporting" and section "Non-Recoverable Logging").
> >> Specifically, viommu_record_fault() records the fault information and
> >> viommu_report_non_recoverable_fault() reports faults to software.
> >> Currently, only Primary Fault Logging is supported and the Number of
> >> Fault-recording Registers is 1.
> >> 
> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> >> ---
> >
> >>      /* Address range of remapping hardware register-set */
> >>      uint64_t base_addr;
> >>      uint64_t length;
> >> @@ -97,6 +101,23 @@ static inline struct vvtd *vcpu_vvtd(struct vcpu *v)
> >>      return domain_vvtd(v->domain);
> >>  }
> >>  
> >> +static inline int vvtd_test_and_set_bit(struct vvtd *vvtd, uint32_t reg,
> >> +                                        int nr)
> >
> >unsigned int for nr, and I'm not really sure about the usefulness of
> >these helpers. In any case inline should not be used and instead let
> >the compiler optimize this.
> >
> 
> I think the compiler doesn't know how frequently these functions are
> called. Explicitly marking a short, frequently used function inline
> covers the cases where the compiler would not inline it by itself.

I will leave that to the maintainers. IMHO I wouldn't add inline
unless I see some figures that back up the supposed speed increase.

> >> +static void vvtd_report_non_recoverable_fault(struct vvtd *vvtd, int reason)
> >> +{
> >> +    uint32_t fsts;
> >> +
> >> +    ASSERT(reason & DMA_FSTS_FAULTS);
> >> +    fsts = vvtd_get_reg(vvtd, DMAR_FSTS_REG);
> >> +    __vvtd_set_bit(vvtd, DMAR_FSTS_REG, reason);
> >
> >I don't understand this, is reason a bit position or a mask?
> >
> >DMA_FSTS_FAULTS seems to be a mask, that should be set into DMAR_FSTS_REG?
> 
> According to VT-d spec 10.4.9, each kind of fault is denoted by one
> bit in DMAR_FSTS_REG.

So 'reason' is a bit position in this context? It needs to be unsigned
int then.
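
To make the distinction concrete, here is a small sketch with stand-in
names (none of these helpers or constants are the vvtd ones, and the
bit numbers are only illustrative). A bit position has to be shifted
into a mask before it can be compared against a mask of fault bits;
testing the raw position against the mask gives the wrong answer for
bit 0.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions within a fault status register. */
#define SKETCH_FSTS_PFO_BIT 0u
#define SKETCH_FSTS_PPF_BIT 1u

/* Mask of the fault bits above. */
#define SKETCH_FSTS_FAULTS \
    ((UINT32_C(1) << SKETCH_FSTS_PFO_BIT) | (UINT32_C(1) << SKETCH_FSTS_PPF_BIT))

/* nr is a bit position, hence unsigned int. */
static void sketch_set_bit(uint32_t *reg, unsigned int nr)
{
    assert(nr < 32);
    *reg |= UINT32_C(1) << nr;
}

static int sketch_test_bit(uint32_t reg, unsigned int nr)
{
    assert(nr < 32);
    return (int)((reg >> nr) & 1u);
}

/* Correct check for a position: convert it to a mask first. */
static int sketch_is_fault_bit(unsigned int nr)
{
    assert(nr < 32);
    return ((UINT32_C(1) << nr) & SKETCH_FSTS_FAULTS) != 0;
}
```

With these definitions, `SKETCH_FSTS_PFO_BIT & SKETCH_FSTS_FAULTS` is
zero even though bit 0 is a fault bit, which is why an assertion of the
form `reason & mask` does not fit a bit position.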

> >>  static int vvtd_record_fault(struct vvtd *vvtd,
> >> -                             struct irq_remapping_request *irq,
> >> +                             struct irq_remapping_request *request,
> >>                               int reason)
> >>  {
> >> -    return 0;
> >> +    struct vtd_fault_record_register frcd;
> >> +    int frcd_idx;
> >> +
> >> +    switch(reason)
> >> +    {
> >> +    case VTD_FR_IR_REQ_RSVD:
> >> +    case VTD_FR_IR_INDEX_OVER:
> >> +    case VTD_FR_IR_ENTRY_P:
> >> +    case VTD_FR_IR_ROOT_INVAL:
> >> +    case VTD_FR_IR_IRTE_RSVD:
> >> +    case VTD_FR_IR_REQ_COMPAT:
> >> +    case VTD_FR_IR_SID_ERR:
> >> +        if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PFO_BIT) )
> >> +            return X86EMUL_OKAY;
> >> +
> >> +        /* No available Fault Record means Fault overflowed */
> >> +        frcd_idx = vvtd_alloc_frcd(vvtd);
> >> +        if ( frcd_idx == -1 )
> >> +        {
> >> +            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PFO_BIT);
> >> +            return X86EMUL_OKAY;
> >> +        }
> >> +        memset(&frcd, 0, sizeof(frcd));
> >> +        frcd.fields.FR = (u8)reason;
> >> +        frcd.fields.FI = ((u64)irq_remapping_request_index(request)) << 36;
> >> +        frcd.fields.SID = (u16)request->source_id;
> >> +        frcd.fields.F = 1;
> >> +        vvtd_commit_frcd(vvtd, frcd_idx, &frcd);
> >> +        return X86EMUL_OKAY;
> >> +
> >> +    default:
> >
> >Other reasons are just ignored? Should this have an ASSERT_UNREACHABLE
> >maybe?
> 
> It can have one, since all the faults are raised by vvtd. When vvtd
> generates a new kind of fault, the corresponding handler should also
> be added.

Do you mean that with the code above all the possible fault types are
covered?

In which case it seems you want to add an ASSERT_UNREACHABLE to the
default case.

> >
> >> +        break;
> >> +    }
> >> +
> >> +    gdprintk(XENLOG_ERR, "Can't handle vVTD Fault (reason 0x%x).", reason);
> >> +    domain_crash(vvtd->domain);
> >
> >Oh, I see. Is it expected that such faults with unhandled reasons can
> >be somehow generated by the domain itself?
> >
> 
> No. Faults are generated by vvtd. We only add interrupt translation
> faults. Other faults can be added when adding other features (e.g. DMA
> remapping). 

OK, so then it looks like you want an ASSERT above, and I'm not sure
whether you want to kill the domain. In the end, if Xen ever gets here,
it means there's a bug in the vvtd implementation. I think the proper
solution is to add the ASSERT above and simply return X86EMUL_OKAY
here.
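
Roughly, the shape being suggested is sketched below. This is only an
illustration: the enum values, the struct and the macro are stand-ins
defined for this fragment (the macro mimics a debug-only assertion and
is not Xen's ASSERT_UNREACHABLE), and the fault record bookkeeping is
reduced to a counter.

```c
#include <assert.h>

#define SKETCH_X86EMUL_OKAY 0

/* Debug-only stand-in: aborts in debug builds, compiled out with NDEBUG. */
#define SKETCH_ASSERT_UNREACHABLE() assert(0 && "unreachable")

/* Illustrative subset of interrupt translation fault reasons. */
enum sketch_fault_reason {
    SKETCH_FR_IR_REQ_RSVD,
    SKETCH_FR_IR_INDEX_OVER,
    SKETCH_FR_IR_ENTRY_P,
};

struct sketch_vvtd {
    unsigned int recorded;     /* number of faults recorded so far */
};

static int sketch_record_fault(struct sketch_vvtd *vvtd, int reason)
{
    switch ( reason )
    {
    case SKETCH_FR_IR_REQ_RSVD:
    case SKETCH_FR_IR_INDEX_OVER:
    case SKETCH_FR_IR_ENTRY_P:
        vvtd->recorded++;
        return SKETCH_X86EMUL_OKAY;

    default:
        /* Faults are only raised by the emulation, so this is its bug. */
        SKETCH_ASSERT_UNREACHABLE();
        /* Release builds: carry on instead of crashing the domain. */
        return SKETCH_X86EMUL_OKAY;
    }
}
```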

Roger.


end of thread, other threads:[~2017-08-25  9:43 UTC | newest]

Thread overview: 136+ messages
2017-08-09 20:34 [PATCH V2 00/25] xen/vIOMMU: Add vIOMMU support with irq remapping fucntion of virtual vtd Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 1/25] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
2017-08-17 11:18   ` Wei Liu
2017-08-18  2:57     ` Lan Tianyu
2017-08-22 14:32   ` Roger Pau Monné
2017-08-23  6:06     ` Lan Tianyu
2017-08-23  7:22       ` Roger Pau Monné
2017-08-23  9:12         ` Lan Tianyu
2017-08-23 10:19         ` Julien Grall
2017-08-23 14:05           ` Roger Pau Monné
2017-08-24 14:03             ` Julien Grall
2017-08-24 14:25               ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
2017-08-17 11:18   ` Wei Liu
2017-08-18  0:22     ` [PATCH V2 1/25] VIOMMU: Add vIOMMU helper functions to create, destroy and query capabilities Lan Tianyu
2017-08-18  8:41       ` Jan Beulich
2017-08-18  8:50         ` Lan Tianyu
2017-08-18 13:32       ` Wei Liu
2017-08-22 15:27       ` Roger Pau Monné
2017-08-23  7:10         ` Lan Tianyu
2017-08-23  7:38           ` Roger Pau Monné
2017-08-24  8:14       ` Tian, Kevin
2017-08-18  7:09     ` [PATCH V2 2/25] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
2017-08-18 10:13       ` Wei Liu
2017-08-22  8:04         ` Lan Tianyu
2017-08-22  8:42           ` Wei Liu
2017-08-22 10:39             ` Lan Tianyu
2017-08-22 10:53               ` Wei Liu
2017-08-22 10:54                 ` Lan Tianyu
2017-08-22 15:32   ` Roger Pau Monné
2017-08-23  7:42     ` Lan Tianyu
2017-08-23  9:24       ` Jan Beulich
2017-08-23  9:47         ` Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 3/25] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
2017-08-17 11:19   ` Wei Liu
2017-08-22 15:38   ` Roger Pau Monné
2017-08-23  7:43     ` Lan Tianyu
2017-08-23  9:25     ` Jan Beulich
2017-08-09 20:34 ` [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
2017-08-17 11:19   ` Wei Liu
2017-08-18  7:17     ` Lan Tianyu
2017-08-18 10:15       ` Wei Liu
2017-08-22  8:07         ` Lan Tianyu
2017-08-22 11:03           ` Wei Liu
2017-08-23  2:06             ` Lan Tianyu
2017-08-22 15:55   ` Roger Pau Monné
2017-08-23  7:36     ` Lan Tianyu
2017-08-23 13:53       ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 5/25] tools/libxc: Add viommu operations in libxc Lan Tianyu
2017-08-22 11:09   ` Wei Liu
2017-08-22 16:25   ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 6/25] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
2017-08-22 12:56   ` Wei Liu
2017-08-23  2:47     ` Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 7/25] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
2017-08-22 13:12   ` Wei Liu
2017-08-23  2:36     ` Lan Tianyu
2017-08-23  8:07       ` Wei Liu
2017-08-22 16:41   ` Roger Pau Monné
2017-08-23  7:52     ` Lan Tianyu
2017-08-23  8:04       ` Roger Pau Monné
2017-08-23 14:11         ` Roger Pau Monné
2017-08-24  2:33         ` Lan Tianyu
2017-08-24  6:54           ` Jan Beulich
2017-08-24  8:36             ` Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 8/25] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
2017-08-22 13:19   ` Wei Liu
2017-08-23  2:46     ` Lan Tianyu
2017-08-23  8:09       ` Wei Liu
2017-08-22 16:48   ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 9/25] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
2017-08-17 11:32   ` Wei Liu
2017-08-17 12:28     ` Wei Liu
2017-08-18  5:45       ` Chao Gao
2017-08-18 13:45         ` Wei Liu
2017-08-18 13:56           ` Jan Beulich
2017-08-22 13:44             ` Wei Liu
2017-08-22 13:48         ` Wei Liu
2017-08-23  5:35           ` Lan Tianyu
2017-08-23  8:34             ` Wei Liu
2017-08-24  3:24               ` Lan Tianyu
2017-08-24 11:08                 ` Wei Liu
2017-08-25  3:19                   ` Lan Tianyu
2017-08-25  7:33                     ` Lan Tianyu
2017-08-25  9:11                       ` Wei Liu
2017-08-09 20:34 ` [PATCH V2 10/25] tools/libxl: create vIOMMU during domain construction Lan Tianyu
2017-08-23  7:45   ` Roger Pau Monné
2017-08-23  8:02     ` Lan Tianyu
2017-08-23 13:53       ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 11/25] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
2017-08-23  7:58   ` Roger Pau Monné
2017-08-24  2:16     ` Lan Tianyu
2017-08-24  8:49       ` Roger Pau Monné
2017-08-24  8:54         ` Lan Tianyu
2017-08-24  9:02           ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 12/25] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
2017-08-23  8:27   ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 13/25] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
2017-08-23  8:47   ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 14/25] x86/vvtd: Process interrupt remapping request Lan Tianyu
2017-08-23  9:49   ` Roger Pau Monné
2017-08-23  9:59     ` Jan Beulich
2017-08-09 20:34 ` [PATCH V2 15/25] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
2017-08-23  9:57   ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 16/25] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
2017-08-23  9:59   ` Roger Pau Monné
2017-08-24  5:28     ` Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 17/25] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
2017-08-23 10:03   ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 18/25] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
2017-08-23 10:07   ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 19/25] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
2017-08-23 10:14   ` Roger Pau Monné
2017-08-24  6:11     ` Lan Tianyu
2017-08-24  6:59       ` Jan Beulich
2017-08-24  8:04         ` Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 20/25] passthrough: move some fields of hvm_gmsi_info to a sub-structure Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 21/25] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
2017-08-23 10:41   ` Roger Pau Monné
2017-08-25  7:28     ` Chao Gao
2017-08-25  9:35       ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 22/25] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
2017-08-23 10:55   ` Roger Pau Monné
2017-08-23 12:17     ` Jan Beulich
2017-08-09 20:34 ` [PATCH V2 23/25] x86/vvtd: Handle interrupt translation faults Lan Tianyu
2017-08-23 11:51   ` Roger Pau Monné
2017-08-25  7:17     ` Chao Gao
2017-08-25  9:43       ` Roger Pau Monné
2017-08-09 20:34 ` [PATCH V2 24/25] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
2017-08-23 12:16   ` Roger Pau Monné
2017-08-24  6:31     ` Lan Tianyu
2017-08-09 20:34 ` [PATCH V2 25/25] x86/vvtd: save and restore emulated VT-d Lan Tianyu
2017-08-23 12:19   ` Roger Pau Monné
2017-08-25  6:35     ` Chao Gao
2017-08-25  9:00       ` Jan Beulich
2017-08-25  8:27         ` Chao Gao
