All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3 00/29]
@ 2017-09-22  3:01 Lan Tianyu
  2017-09-22  3:01 ` [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
                   ` (29 more replies)
  0 siblings, 30 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, chao.gao

Change since v2:
       1) Remove vIOMMU hypercall of query capabilities and introduce when necessary.
       2) Remove length field of vIOMMU create parameter of vIOMMU hypercall
       3) Introduce irq remapping mode callback to vIOMMU framework and vIOMMU device models
can check irq remapping mode by vendor specific ways.
       4) Update vIOMMU docs.
       5) Other changes please see patches' change logs.

Change since v1:
       1) Fix coding style issues
       2) Add definitions for vIOMMU type and capabilities
       3) Change vIOMMU kconfig and select vIOMMU default on x86
       4) Put vIOMMU creation in libxl__arch_domain_create()
       5) Make vIOMMU structure of tool stack more general for both PV and HVM.

Change since RFC v2:
       1) Move vvtd.c to drivers/passthrough/vtd directroy. 
       2) Make vIOMMU always built in on x86
       3) Add new boot cmd "viommu" to enable viommu function
       4) Fix some code stype issues.

Change since RFC v1:
       1) Add Xen virtual IOMMU doc docs/misc/viommu.txt
       2) Move vIOMMU hypercall of create/destroy vIOMMU and query  
capabilities from dmop to domctl suggested by Paul Durrant. Because
these hypercalls can be done in tool stack and more VM mode(E,G PVH
or other modes don't use Qemu) can be benefit.
       3) Add check of input MMIO address and length.
       4) Add iommu_type in vIOMMU hypercall parameter to specify
vendor vIOMMU device model(E,G Intel VTD, AMD or ARM IOMMU. So far
only support Intel VTD).
       5) Add save and restore support for vvtd


This patchset is to introduce vIOMMU framework and add virtual VTD's
interrupt remapping support according "Xen virtual IOMMU high level
design doc V3"(https://lists.xenproject.org/archives/html/xen-devel/
2016-11/msg01391.html).

- vIOMMU framework
New framework provides viommu_ops and help functions to abstract
vIOMMU operations(E,G create, destroy, handle irq remapping request
and so on). Vendors(Intel, ARM, AMD and son) can implement their
vIOMMU callbacks.

- Virtual VTD
We enable irq remapping function and covers both
MSI and IOAPIC interrupts. Don't support post interrupt mode emulation
and post interrupt mode enabled on host with virtual VTD. will add
later.

Repo:
https://github.com/lantianyu/Xen/tree/xen_viommu_v3


Chao Gao (23):
  tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table
    structures
  tools/libacpi: Add new fields in acpi_config for DMAR table
  tools/libxl: Add a user configurable parameter to control vIOMMU
    attributes
  tools/libxl: build DMAR table for a guest with one virtual VTD
  tools/libxl: create vIOMMU during domain construction
  tools/libxc: Add viommu operations in libxc
  vtd: add and align register definitions
  x86/hvm: Introduce a emulated VTD for HVM
  x86/vvtd: Add MMIO handler for VVTD
  x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  x86/vvtd: Enable Interrupt Remapping through GCMD
  x86/vvtd: Process interrupt remapping request
  x86/vvtd: decode interrupt attribute from IRTE
  x86/vvtd: add a helper function to decide the interrupt format
  x86/vioapic: Hook interrupt delivery of vIOAPIC
  x86/vioapic: extend vioapic_get_vector() to support remapping format
    RTE
  passthrough: move some fields of hvm_gmsi_info to a sub-structure
  tools/libxc: Add a new interface to bind remapping format msi with
    pirq
  x86/vmsi: Hook delivering remapping format msi to guest
  x86/vvtd: Handle interrupt translation faults
  x86/vvtd: Enable Queued Invalidation through GCMD
  x86/vvtd: Add queued invalidation (QI) support
  x86/vvtd: save and restore emulated VT-d

Lan Tianyu (6):
  Xen/doc: Add Xen virtual IOMMU doc
  VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  VIOMMU: Add irq request callback to deal with irq remapping
  VIOMMU: Add get irq info callback to convert irq remapping request
  VIOMMU: Introduce callback of checking irq remapping mode

 docs/man/xl.cfg.pod.5.in               |   27 +
 docs/misc/viommu.txt                   |  136 ++++
 docs/misc/xen-command-line.markdown    |    7 +
 tools/libacpi/acpi2_0.h                |   61 ++
 tools/libacpi/build.c                  |   53 ++
 tools/libacpi/libacpi.h                |   12 +
 tools/libxc/Makefile                   |    1 +
 tools/libxc/include/xenctrl.h          |   21 +
 tools/libxc/xc_domain.c                |   53 ++
 tools/libxc/xc_viommu.c                |   64 ++
 tools/libxl/libxl_create.c             |   52 ++
 tools/libxl/libxl_types.idl            |   12 +
 tools/libxl/libxl_x86.c                |   20 +-
 tools/libxl/libxl_x86_acpi.c           |   98 ++-
 tools/xl/xl_parse.c                    |   52 +-
 xen/arch/x86/Kconfig                   |    1 +
 xen/arch/x86/hvm/irq.c                 |    7 +
 xen/arch/x86/hvm/vioapic.c             |   26 +-
 xen/arch/x86/hvm/vmsi.c                |   18 +-
 xen/common/Kconfig                     |    3 +
 xen/common/Makefile                    |    1 +
 xen/common/domain.c                    |    4 +
 xen/common/domctl.c                    |    6 +
 xen/common/viommu.c                    |  220 ++++++
 xen/drivers/passthrough/io.c           |  178 ++++-
 xen/drivers/passthrough/vtd/Makefile   |    7 +-
 xen/drivers/passthrough/vtd/iommu.h    |  193 ++++--
 xen/drivers/passthrough/vtd/vvtd.c     | 1178 ++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/irq.h          |   15 +-
 xen/include/asm-x86/viommu.h           |   80 +++
 xen/include/public/arch-x86/hvm/save.h |   25 +-
 xen/include/public/domctl.h            |   49 ++
 xen/include/xen/sched.h                |    8 +
 xen/include/xen/viommu.h               |  100 +++
 34 files changed, 2698 insertions(+), 90 deletions(-)
 create mode 100644 docs/misc/viommu.txt
 create mode 100644 tools/libxc/xc_viommu.c
 create mode 100644 xen/common/viommu.c
 create mode 100644 xen/drivers/passthrough/vtd/vvtd.c
 create mode 100644 xen/include/asm-x86/viommu.h
 create mode 100644 xen/include/xen/viommu.h

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-18 13:26   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance Lan Tianyu
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, chao.gao

This patch is to add Xen virtual IOMMU doc to introduce motivation,
framework, vIOMMU hypercall and xl configuration.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 docs/misc/viommu.txt | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100644 docs/misc/viommu.txt

diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
new file mode 100644
index 0000000..348e8c4
--- /dev/null
+++ b/docs/misc/viommu.txt
@@ -0,0 +1,136 @@
+Xen virtual IOMMU
+
+Motivation
+==========
+Enable more than 128 vcpu support
+
+The current requirements of HPC cloud service requires VM with a high
+number of CPUs in order to achieve high performance in parallel
+computing.
+
+To support >128 vcpus, X2APIC mode in guest is necessary because legacy
+APIC(XAPIC) just supports 8-bit APIC ID. The APIC ID used by Xen is
+CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last one available
+in xAPIC mode) and so it only can support 128 vcpus at most. x2APIC mode
+supports 32-bit APIC ID and it requires the interrupt remapping functionality
+of a vIOMMU if the guest wishes to route interrupts to all available vCPUs
+
+The reason for this is that there is no modification for existing PCI MSI
+and IOAPIC when introduce X2APIC. PCI MSI/IOAPIC can only send interrupt
+message containing 8-bit APIC ID, which cannot address cpus with >254
+APIC ID. Interrupt remapping supports 32-bit APIC ID and so it's necessary
+for >128 vcpus support.
+
+
+vIOMMU Architecture
+===================
+vIOMMU device model is inside Xen hypervisor for following factors
+    1) Avoid round trips between Qemu and Xen hypervisor
+    2) Ease of integration with the rest of hypervisor
+    3) HVMlite/PVH doesn't use Qemu
+
+* Interrupt remapping overview.
+Interrupts from virtual devices and physical devices are delivered
+to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
+this procedure.
+
++---------------------------------------------------+
+|Qemu                       |VM                     |
+|                           | +----------------+    |
+|                           | |  Device driver |    |
+|                           | +--------+-------+    |
+|                           |          ^            |
+|       +----------------+  | +--------+-------+    |
+|       | Virtual device |  | |  IRQ subsystem |    |
+|       +-------+--------+  | +--------+-------+    |
+|               |           |          ^            |
+|               |           |          |            |
++---------------------------+-----------------------+
+|hypervisor     |                      | VIRQ       |
+|               |            +---------+--------+   |
+|               |            |      vLAPIC      |   |
+|               |VIRQ        +---------+--------+   |
+|               |                      ^            |
+|               |                      |            |
+|               |            +---------+--------+   |
+|               |            |      vIOMMU      |   |
+|               |            +---------+--------+   |
+|               |                      ^            |
+|               |                      |            |
+|               |            +---------+--------+   |
+|               |            |   vIOAPIC/vMSI   |   |
+|               |            +----+----+--------+   |
+|               |                 ^    ^            |
+|               +-----------------+    |            |
+|                                      |            |
++---------------------------------------------------+
+HW                                     |IRQ
+                                +-------------------+
+                                |   PCI Device      |
+                                +-------------------+
+
+
+vIOMMU hypercall
+================
+Introduce a new domctl hypercall "xen_domctl_viommu_op" to create/destroy
+vIOMMUs.
+
+* vIOMMU hypercall parameter structure
+
+/* vIOMMU type - specify vendor vIOMMU device model */
+#define VIOMMU_TYPE_INTEL_VTD	       0
+
+/* vIOMMU capabilities */
+#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
+
+struct xen_domctl_viommu_op {
+    uint32_t cmd;
+#define XEN_DOMCTL_create_viommu          0
+#define XEN_DOMCTL_destroy_viommu         1
+    union {
+        struct {
+            /* IN - vIOMMU type  */
+            uint64_t viommu_type;
+            /* IN - MMIO base address of vIOMMU. */
+            uint64_t base_address;
+            /* IN - Capabilities with which we want to create */
+            uint64_t capabilities;
+            /* OUT - vIOMMU identity */
+            uint32_t viommu_id;
+        } create_viommu;
+
+        struct {
+            /* IN - vIOMMU identity */
+            uint32_t viommu_id;
+        } destroy_viommu;
+    } u;
+};
+
+- XEN_DOMCTL_create_viommu
+    Create vIOMMU device with vIOMMU_type, capabilities and MMIO base
+address. Hypervisor allocates viommu_id for new vIOMMU instance and return
+back. The vIOMMU device model in hypervisor should check whether it can
+support the input capabilities and return error if not.
+
+- XEN_DOMCTL_destroy_viommu
+    Destroy vIOMMU in Xen hypervisor with viommu_id as parameter.
+
+These vIOMMU domctl and vIOMMU option in configure file consider multi-vIOMMU
+support for single VM.(e.g, parameters of create/destroy vIOMMU includes
+vIOMMU id). But function implementation only supports one vIOMMU per VM so far.
+
+Xen hypervisor vIOMMU command
+=============================
+Introduce vIOMMU command "viommu=1" to enable vIOMMU function in hypervisor.
+It's default disabled.
+
+xl x86 vIOMMU configuration"
+============================
+viommu = [
+    'type=intel_vtd,intremap=1',
+    ...
+]
+
+"type" - Specify vIOMMU device model type. Currently only supports Intel vtd
+device model.
+"intremap" - Enable vIOMMU interrupt remapping function.
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
  2017-09-22  3:01 ` [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-18 14:05   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 3/29] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, chao.gao

This patch is to introduce an abstract layer for arch vIOMMU implementation
to deal with requests from dom0. Arch vIOMMU code needs to provide callback
to do create and destroy operation.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 docs/misc/xen-command-line.markdown |   7 ++
 xen/arch/x86/Kconfig                |   1 +
 xen/common/Kconfig                  |   3 +
 xen/common/Makefile                 |   1 +
 xen/common/domain.c                 |   4 +
 xen/common/viommu.c                 | 144 ++++++++++++++++++++++++++++++++++++
 xen/include/xen/sched.h             |   8 ++
 xen/include/xen/viommu.h            |  63 ++++++++++++++++
 8 files changed, 231 insertions(+)
 create mode 100644 xen/common/viommu.c
 create mode 100644 xen/include/xen/viommu.h

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 9797c8d..dfd1db5 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1825,3 +1825,10 @@ mode.
 > Default: `true`
 
 Permit use of the `xsave/xrstor` instructions.
+
+### viommu
+> `= <boolean>`
+
+> Default: `false`
+
+Permit use of viommu interface to create and destroy viommu device model.
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 30c2769..1f1de96 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -23,6 +23,7 @@ config X86
 	select HAS_PDX
 	select NUMA
 	select VGA
+	select VIOMMU
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index dc8e876..2ad2c8d 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -49,6 +49,9 @@ config HAS_CHECKPOLICY
 	string
 	option env="XEN_HAS_CHECKPOLICY"
 
+config VIOMMU
+	bool
+
 config KEXEC
 	bool "kexec support"
 	default y
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 39e2614..da32f71 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -56,6 +56,7 @@ obj-y += time.o
 obj-y += timer.o
 obj-y += trace.o
 obj-y += version.o
+obj-$(CONFIG_VIOMMU) += viommu.o
 obj-y += virtual_region.o
 obj-y += vm_event.o
 obj-y += vmap.o
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 5aebcf2..cdb1c9d 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -814,6 +814,10 @@ static void complete_domain_destroy(struct rcu_head *head)
 
     sched_destroy_domain(d);
 
+#ifdef CONFIG_VIOMMU
+    viommu_destroy_domain(d);
+#endif
+
     /* Free page used by xen oprofile buffer. */
 #ifdef CONFIG_XENOPROF
     free_xenoprof_pages(d);
diff --git a/xen/common/viommu.c b/xen/common/viommu.c
new file mode 100644
index 0000000..64d91e6
--- /dev/null
+++ b/xen/common/viommu.c
@@ -0,0 +1,144 @@
+/*
+ * common/viommu.c
+ *
+ * Copyright (c) 2017 Intel Corporation
+ * Author: Lan Tianyu <tianyu.lan@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/sched.h>
+#include <xen/spinlock.h>
+#include <xen/types.h>
+#include <xen/viommu.h>
+
+bool __read_mostly opt_viommu;
+boolean_param("viommu", opt_viommu);
+
+static DEFINE_SPINLOCK(type_list_lock);
+static LIST_HEAD(type_list);
+
+struct viommu_type {
+    uint64_t type;
+    struct viommu_ops *ops;
+    struct list_head node;
+};
+
+int viommu_destroy_domain(struct domain *d)
+{
+    int ret;
+
+    if ( !d->viommu )
+        return -EINVAL;
+
+    ret = d->viommu->ops->destroy(d->viommu);
+    if ( ret < 0 )
+        return ret;
+
+    xfree(d->viommu);
+    d->viommu = NULL;
+    return 0;
+}
+
+static struct viommu_type *viommu_get_type(uint64_t type)
+{
+    struct viommu_type *viommu_type = NULL;
+
+    spin_lock(&type_list_lock);
+    list_for_each_entry( viommu_type, &type_list, node )
+    {
+        if ( viommu_type->type == type )
+        {
+            spin_unlock(&type_list_lock);
+            return viommu_type;
+        }
+    }
+    spin_unlock(&type_list_lock);
+
+    return NULL;
+}
+
+int viommu_register_type(uint64_t type, struct viommu_ops *ops)
+{
+    struct viommu_type *viommu_type = NULL;
+
+    if ( !viommu_enabled() )
+        return -ENODEV;
+
+    if ( viommu_get_type(type) )
+        return -EEXIST;
+
+    viommu_type = xzalloc(struct viommu_type);
+    if ( !viommu_type )
+        return -ENOMEM;
+
+    viommu_type->type = type;
+    viommu_type->ops = ops;
+
+    spin_lock(&type_list_lock);
+    list_add_tail(&viommu_type->node, &type_list);
+    spin_unlock(&type_list_lock);
+
+    return 0;
+}
+
+static int viommu_create(struct domain *d, uint64_t type,
+                         uint64_t base_address, uint64_t caps,
+                         uint32_t *viommu_id)
+{
+    struct viommu *viommu;
+    struct viommu_type *viommu_type = NULL;
+    int rc;
+
+    /* Only support one vIOMMU per domain. */
+    if ( d->viommu )
+        return -E2BIG;
+
+    viommu_type = viommu_get_type(type);
+    if ( !viommu_type )
+        return -EINVAL;
+
+    if ( !viommu_type->ops || !viommu_type->ops->create )
+        return -EINVAL;
+
+    viommu = xzalloc(struct viommu);
+    if ( !viommu )
+        return -ENOMEM;
+
+    viommu->base_address = base_address;
+    viommu->caps = caps;
+    viommu->ops = viommu_type->ops;
+
+    rc = viommu->ops->create(d, viommu);
+    if ( rc < 0 )
+    {
+        xfree(viommu);
+        return rc;
+    }
+
+    d->viommu = viommu;
+
+    /* Only support one vIOMMU per domain. */
+    *viommu_id = 0;
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 5b8f8c6..750f235 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -33,6 +33,10 @@
 DEFINE_XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t);
 #endif
 
+#ifdef CONFIG_VIOMMU
+#include <xen/viommu.h>
+#endif
+
 /*
  * Stats
  *
@@ -479,6 +483,10 @@ struct domain
     rwlock_t vnuma_rwlock;
     struct vnuma_info *vnuma;
 
+#ifdef CONFIG_VIOMMU
+    struct viommu *viommu;
+#endif
+
     /* Common monitor options */
     struct {
         unsigned int guest_request_enabled       : 1;
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
new file mode 100644
index 0000000..636a2a3
--- /dev/null
+++ b/xen/include/xen/viommu.h
@@ -0,0 +1,63 @@
+/*
+ * include/xen/viommu.h
+ *
+ * Copyright (c) 2017, Intel Corporation
+ * Author: Lan Tianyu <tianyu.lan@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#ifndef __XEN_VIOMMU_H__
+#define __XEN_VIOMMU_H__
+
+struct viommu;
+
+struct viommu_ops {
+    int (*create)(struct domain *d, struct viommu *viommu);
+    int (*destroy)(struct viommu *viommu);
+};
+
+struct viommu {
+    uint64_t base_address;
+    uint64_t caps;
+    const struct viommu_ops *ops;
+    void *priv;
+};
+
+#ifdef CONFIG_VIOMMU
+extern bool opt_viommu;
+static inline bool viommu_enabled(void)
+{
+    return opt_viommu;
+}
+
+int viommu_register_type(uint64_t type, struct viommu_ops *ops);
+int viommu_destroy_domain(struct domain *d);
+#else
+static inline int viommu_register_type(uint64_t type, struct viommu_ops *ops)
+{
+    return -EINVAL;
+}
+#endif
+
+#endif /* __XEN_VIOMMU_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 3/29] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
  2017-09-22  3:01 ` [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
  2017-09-22  3:01 ` [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-18 14:18   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 4/29] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
                   ` (26 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, chao.gao

This patch is to introduce create, destroy and query capabilities
command for vIOMMU. vIOMMU layer will deal with requests and call
arch vIOMMU ops.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/domctl.c         |  6 ++++++
 xen/common/viommu.c         | 30 ++++++++++++++++++++++++++++++
 xen/include/public/domctl.h | 42 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/viommu.h    |  2 ++
 4 files changed, 80 insertions(+)

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 42658e5..7e28237 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1149,6 +1149,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             copyback = 1;
         break;
 
+#ifdef CONFIG_VIOMMU
+    case XEN_DOMCTL_viommu_op:
+        ret = viommu_domctl(d, &op->u.viommu_op, &copyback);
+        break;
+#endif
+
     default:
         ret = arch_do_domctl(op, d, u_domctl);
         break;
diff --git a/xen/common/viommu.c b/xen/common/viommu.c
index 64d91e6..55feb5d 100644
--- a/xen/common/viommu.c
+++ b/xen/common/viommu.c
@@ -133,6 +133,36 @@ static int viommu_create(struct domain *d, uint64_t type,
     return 0;
 }
 
+int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
+                  bool *need_copy)
+{
+    int rc = -EINVAL;
+
+    if ( !viommu_enabled() )
+        return -ENODEV;
+
+    switch ( op->cmd )
+    {
+    case XEN_DOMCTL_create_viommu:
+        rc = viommu_create(d, op->u.create.viommu_type,
+                           op->u.create.base_address,
+                           op->u.create.capabilities,
+                           &op->u.create.viommu_id);
+        if ( !rc )
+            *need_copy = true;
+        break;
+
+    case XEN_DOMCTL_destroy_viommu:
+        rc = viommu_destroy_domain(d);
+        break;
+
+    default:
+        return -ENOSYS;
+    }
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 50ff58f..68854b6 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1163,6 +1163,46 @@ struct xen_domctl_psr_cat_op {
 typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
 
+/*  vIOMMU helper
+ *
+ *  vIOMMU interface can be used to create/destroy vIOMMU and
+ *  query vIOMMU capabilities.
+ */
+
+/* vIOMMU type - specify vendor vIOMMU device model */
+#define VIOMMU_TYPE_INTEL_VTD           0
+
+/* vIOMMU capabilities */
+#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
+
+struct xen_domctl_viommu_op {
+    uint32_t cmd;
+#define XEN_DOMCTL_create_viommu          0
+#define XEN_DOMCTL_destroy_viommu         1
+    union {
+        struct {
+            /* IN - vIOMMU type */
+            uint64_t viommu_type;
+            /* 
+             * IN - MMIO base address of vIOMMU. vIOMMU device models
+             * are in charge of to check base_address.
+             */
+            uint64_t base_address;
+            /* IN - Capabilities with which we want to create */
+            uint64_t capabilities;
+            /* OUT - vIOMMU identity */
+            uint32_t viommu_id;
+        } create;
+
+        struct {
+            /* IN - vIOMMU identity */
+            uint32_t viommu_id;
+        } destroy;
+    } u;
+};
+typedef struct xen_domctl_viommu_op xen_domctl_viommu_op;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_viommu_op);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1240,6 +1280,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_monitor_op                    77
 #define XEN_DOMCTL_psr_cat_op                    78
 #define XEN_DOMCTL_soft_reset                    79
+#define XEN_DOMCTL_viommu_op                     80
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1302,6 +1343,7 @@ struct xen_domctl {
         struct xen_domctl_psr_cmt_op        psr_cmt_op;
         struct xen_domctl_monitor_op        monitor_op;
         struct xen_domctl_psr_cat_op        psr_cat_op;
+        struct xen_domctl_viommu_op         viommu_op;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
index 636a2a3..baa8ab7 100644
--- a/xen/include/xen/viommu.h
+++ b/xen/include/xen/viommu.h
@@ -43,6 +43,8 @@ static inline bool viommu_enabled(void)
 
 int viommu_register_type(uint64_t type, struct viommu_ops *ops);
 int viommu_destroy_domain(struct domain *d);
+int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
+                  bool_t *need_copy);
 #else
 static inline int viommu_register_type(uint64_t type, struct viommu_ops *ops)
 {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 4/29] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (2 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 3/29] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-18 14:36   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Add dmar table structure according Chapter 8 "BIOS Considerations" of
VTd spec Rev. 2.4.

VTd spec:http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 tools/libacpi/acpi2_0.h | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/tools/libacpi/acpi2_0.h b/tools/libacpi/acpi2_0.h
index 2619ba3..758a823 100644
--- a/tools/libacpi/acpi2_0.h
+++ b/tools/libacpi/acpi2_0.h
@@ -422,6 +422,65 @@ struct acpi_20_slit {
 };
 
 /*
+ * DMA Remapping Table header definition (DMAR)
+ */
+
+/*
+ * DMAR Flags.
+ */
+#define ACPI_DMAR_INTR_REMAP        (1 << 0)
+#define ACPI_DMAR_X2APIC_OPT_OUT    (1 << 1)
+
+struct acpi_dmar {
+    struct acpi_header header;
+    uint8_t host_address_width;
+    uint8_t flags;
+    uint8_t reserved[10];
+};
+
+/*
+ * Device Scope Types
+ */
+#define ACPI_DMAR_DEVICE_SCOPE_PCI_ENDPOINT             0x01
+#define ACPI_DMAR_DEVICE_SCOPE_PCI_SUB_HIERARACHY       0x01
+#define ACPI_DMAR_DEVICE_SCOPE_IOAPIC                   0x03
+#define ACPI_DMAR_DEVICE_SCOPE_HPET                     0x04
+#define ACPI_DMAR_DEVICE_SCOPE_ACPI_NAMESPACE_DEVICE    0x05
+
+struct dmar_device_scope {
+    uint8_t type;
+    uint8_t length;
+    uint8_t reserved[2];
+    uint8_t enumeration_id;
+    uint8_t bus;
+    uint16_t path[0];
+};
+
+/*
+ * DMA Remapping Hardware Unit Types
+ */
+#define ACPI_DMAR_TYPE_HARDWARE_UNIT        0x00
+#define ACPI_DMAR_TYPE_RESERVED_MEMORY      0x01
+#define ACPI_DMAR_TYPE_ATSR                 0x02
+#define ACPI_DMAR_TYPE_HARDWARE_AFFINITY    0x03
+#define ACPI_DMAR_TYPE_ANDD                 0x04
+
+/*
+ * DMA Remapping Hardware Unit Flags. All other bits are reserved and must be 0.
+ */
+#define ACPI_DMAR_INCLUDE_PCI_ALL   (1 << 0)
+
+struct acpi_dmar_hardware_unit {
+    uint16_t type;
+    uint16_t length;
+    uint8_t flags;
+    uint8_t reserved;
+    uint16_t pci_segment;
+    uint64_t base_address;
+    struct dmar_device_scope scope[0];
+};
+
+/*
  * Table Signatures.
  */
 #define ACPI_2_0_RSDP_SIGNATURE ASCII64('R','S','D',' ','P','T','R',' ')
@@ -435,6 +494,7 @@ struct acpi_20_slit {
 #define ACPI_2_0_WAET_SIGNATURE ASCII32('W','A','E','T')
 #define ACPI_2_0_SRAT_SIGNATURE ASCII32('S','R','A','T')
 #define ACPI_2_0_SLIT_SIGNATURE ASCII32('S','L','I','T')
+#define ACPI_2_0_DMAR_SIGNATURE ASCII32('D','M','A','R')
 
 /*
  * Table revision numbers.
@@ -449,6 +509,7 @@ struct acpi_20_slit {
 #define ACPI_1_0_FADT_REVISION 0x01
 #define ACPI_2_0_SRAT_REVISION 0x01
 #define ACPI_2_0_SLIT_REVISION 0x01
+#define ACPI_2_0_DMAR_REVISION 0x01
 
 #pragma pack ()
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (3 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 4/29] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-18 15:12   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 6/29] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

The BIOS reports the remapping hardware units in a platform to system software
through the DMA Remapping Reporting (DMAR) ACPI table.
New fields are introduces for DMAR table. These new fields are set by
toolstack through parsing guest's config file. construct_dmar() is added to
build DMAR table according to the new fields.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
v3:
 - Remove chip-set specific IOAPIC BDF. Instead, let IOAPIC-related
 info be passed by struct acpi_config.

---
 tools/libacpi/build.c   | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/libacpi/libacpi.h | 12 +++++++++++
 2 files changed, 65 insertions(+)

diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
index f9881c9..5ee8fcd 100644
--- a/tools/libacpi/build.c
+++ b/tools/libacpi/build.c
@@ -303,6 +303,59 @@ static struct acpi_20_slit *construct_slit(struct acpi_ctxt *ctxt,
     return slit;
 }
 
+/*
+ * Only one DMA remapping hardware unit is exposed and all devices
+ * are under the remapping hardware unit. I/O APIC should be explicitly
+ * enumerated.
+ */
+struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
+                                 const struct acpi_config *config)
+{
+    struct acpi_dmar *dmar;
+    struct acpi_dmar_hardware_unit *drhd;
+    struct dmar_device_scope *scope;
+    unsigned int size;
+    unsigned int ioapic_scope_size = sizeof(*scope) + sizeof(scope->path[0]);
+
+    size = sizeof(*dmar) + sizeof(*drhd) + ioapic_scope_size;
+
+    dmar = ctxt->mem_ops.alloc(ctxt, size, 16);
+    if ( !dmar )
+        return NULL;
+
+    memset(dmar, 0, size);
+    dmar->header.signature = ACPI_2_0_DMAR_SIGNATURE;
+    dmar->header.revision = ACPI_2_0_DMAR_REVISION;
+    dmar->header.length = size;
+    fixed_strcpy(dmar->header.oem_id, ACPI_OEM_ID);
+    fixed_strcpy(dmar->header.oem_table_id, ACPI_OEM_TABLE_ID);
+    dmar->header.oem_revision = ACPI_OEM_REVISION;
+    dmar->header.creator_id   = ACPI_CREATOR_ID;
+    dmar->header.creator_revision = ACPI_CREATOR_REVISION;
+    dmar->host_address_width = config->host_addr_width - 1;
+    if ( config->iommu_intremap_supported )
+        dmar->flags |= ACPI_DMAR_INTR_REMAP;
+    if ( !config->iommu_x2apic_supported )
+        dmar->flags |= ACPI_DMAR_X2APIC_OPT_OUT;
+
+    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
+    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
+    drhd->length = sizeof(*drhd) + ioapic_scope_size;
+    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
+    drhd->pci_segment = 0;
+    drhd->base_address = config->iommu_base_addr;
+
+    scope = &drhd->scope[0];
+    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
+    scope->length = ioapic_scope_size;
+    scope->enumeration_id = config->ioapic_id;
+    scope->bus = config->ioapic_bus;
+    scope->path[0] = config->ioapic_devfn;
+
+    set_checksum(dmar, offsetof(struct acpi_header, checksum), size);
+    return dmar;
+}
+
 static int construct_passthrough_tables(struct acpi_ctxt *ctxt,
                                         unsigned long *table_ptrs,
                                         int nr_tables,
diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
index a2efd23..fdd6a78 100644
--- a/tools/libacpi/libacpi.h
+++ b/tools/libacpi/libacpi.h
@@ -20,6 +20,8 @@
 #ifndef __LIBACPI_H__
 #define __LIBACPI_H__
 
+#include <stdbool.h>
+
 #define ACPI_HAS_COM1              (1<<0)
 #define ACPI_HAS_COM2              (1<<1)
 #define ACPI_HAS_LPT1              (1<<2)
@@ -96,8 +98,18 @@ struct acpi_config {
     uint32_t ioapic_base_address;
     uint16_t pci_isa_irq_mask;
     uint8_t ioapic_id;
+
+    /* Emulated IOMMU features, location and IOAPIC under the scope of IOMMU */
+    bool iommu_intremap_supported;
+    bool iommu_x2apic_supported;
+    uint8_t host_addr_width;
+    uint8_t ioapic_bus;
+    uint16_t ioapic_devfn;
+    uint64_t iommu_base_addr;
 };
 
+struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
+                                 const struct acpi_config *config);
 int acpi_build_tables(struct acpi_ctxt *ctxt, struct acpi_config *config);
 
 #endif /* __LIBACPI_H__ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 6/29] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (4 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19  9:49   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 7/29] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
                   ` (23 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

A field, viommu_info, is added to struct libxl_domain_build_info. Several
attributes can be specified by guest config file for virtual IOMMU. These
attributes are used for DMAR construction and vIOMMU creation.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
v3:
 - allow an array of viommu other than only one viommu to present to guest.
 During domain building, an error would be raised for
 multiple viommus case since we haven't implemented this yet.
 - provide a libxl__viommu_set_default() for viommu

---
 docs/man/xl.cfg.pod.5.in    | 27 +++++++++++++++++++++++
 tools/libxl/libxl_create.c  | 52 +++++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_types.idl | 12 +++++++++++
 tools/xl/xl_parse.c         | 52 ++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
index 79cb2ea..9cd7dd7 100644
--- a/docs/man/xl.cfg.pod.5.in
+++ b/docs/man/xl.cfg.pod.5.in
@@ -1547,6 +1547,33 @@ L<http://www.microsoft.com/en-us/download/details.aspx?id=30707>
 
 =back 
 
+=item B<viommu=[ "VIOMMU_STRING", "VIOMMU_STRING", ...]>
+
+Specifies the vIOMMUs which are to be provided to the guest.
+
+B<VIOMMU_STRING> has the form C<KEY=VALUE,KEY=VALUE,...> where:
+
+=over 4
+
+=item B<KEY=VALUE>
+
+Possible B<KEY>s are:
+
+=over 4
+
+=item B<type="STRING">
+
+Currently there is only one valid type:
+
+(x86 only) "intel_vtd" means providing a emulated Intel VT-d to the guest.
+
+=item B<intremap=BOOLEAN>
+
+Specifies whether the vIOMMU should support interrupt remapping
+and default 'true'.
+
+=back
+
 =head3 Guest Virtual Time Controls
 
 =over 4
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 9123585..decd7a8 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -27,6 +27,8 @@
 
 #include <xen-xsm/flask/flask.h>
 
+#define VIOMMU_VTD_BASE_ADDR        0xfed90000ULL
+
 int libxl__domain_create_info_setdefault(libxl__gc *gc,
                                          libxl_domain_create_info *c_info)
 {
@@ -59,6 +61,47 @@ void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
                             LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT;
 }
 
+static int libxl__viommu_set_default(libxl__gc *gc,
+                                     libxl_domain_build_info *b_info)
+{
+    int i;
+
+    if (!b_info->num_viommus)
+        return 0;
+
+    for (i = 0; i < b_info->num_viommus; i++) {
+        libxl_viommu_info *viommu = &b_info->viommu[i];
+
+        if (libxl_defbool_is_default(viommu->intremap))
+            libxl_defbool_set(&viommu->intremap, true);
+
+        if (!libxl_defbool_val(viommu->intremap)) {
+            LOGE(ERROR, "Cannot create one virtual VTD without intremap");
+            return ERROR_INVAL;
+        }
+
+        if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
+            /*
+             * If there are multiple vIOMMUs, we need arrange all vIOMMUs to
+             * avoid overlap. Put a check here in case we get here for multiple
+             * vIOMMUs case.
+             */
+            if (b_info->num_viommus > 1) {
+                LOGE(ERROR, "Multiple vIOMMUs support is under implementation");
+                return ERROR_INVAL;
+            }
+
+            /* Set default values to unexposed fields */
+            viommu->base_addr = VIOMMU_VTD_BASE_ADDR;
+
+            /* Set desired capbilities */
+            viommu->cap = VIOMMU_CAP_IRQ_REMAPPING;
+        }
+    }
+
+    return 0;
+}
+
 int libxl__domain_build_info_setdefault(libxl__gc *gc,
                                         libxl_domain_build_info *b_info)
 {
@@ -214,6 +257,9 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 
     libxl__arch_domain_build_info_acpi_setdefault(b_info);
 
+    if (libxl__viommu_set_default(gc, b_info))
+        return ERROR_FAIL;
+
     switch (b_info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
         if (b_info->shadow_memkb == LIBXL_MEMKB_DEFAULT)
@@ -890,6 +936,12 @@ static void initiate_domain_create(libxl__egc *egc,
         goto error_out;
     }
 
+    if (d_config->b_info.num_viommus > 1) {
+        ret = ERROR_INVAL;
+        LOGD(ERROR, domid, "Cannot support multiple vIOMMUs");
+        goto error_out;
+    }
+
     ret = libxl__domain_create_info_setdefault(gc, &d_config->c_info);
     if (ret) {
         LOGD(ERROR, domid, "Unable to set domain create info defaults");
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 173d70a..286c960 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -450,6 +450,17 @@ libxl_altp2m_mode = Enumeration("altp2m_mode", [
     (3, "limited"),
     ], init_val = "LIBXL_ALTP2M_MODE_DISABLED")
 
+libxl_viommu_type = Enumeration("viommu_type", [
+    (1, "intel_vtd"),
+    ])
+
+libxl_viommu_info = Struct("viommu_info", [
+    ("type",            libxl_viommu_type),
+    ("intremap",        libxl_defbool),
+    ("cap",             uint64),
+    ("base_addr",       uint64),
+    ])
+
 libxl_domain_build_info = Struct("domain_build_info",[
     ("max_vcpus",       integer),
     ("avail_vcpus",     libxl_bitmap),
@@ -506,6 +517,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     # 65000 which is reserved by the toolstack.
     ("device_tree",      string),
     ("acpi",             libxl_defbool),
+    ("viommu",           Array(libxl_viommu_info, "num_viommus")),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
                 [("hvm", Struct(None, [("firmware",         string),
                                        ("bios",             libxl_bios_type),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 02ddd2e..34f8128 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -804,6 +804,38 @@ int parse_usbdev_config(libxl_device_usbdev *usbdev, char *token)
     return 0;
 }
 
+/* Parses viommu data and adds info into viommu
+ * Returns 1 if the input doesn't form a valid viommu
+ * or parsed values are not correct. Successful parse returns 0 */
+static int parse_viommu_config(libxl_viommu_info *viommu, const char *info)
+{
+    char *ptr, *oparg, *saveptr = NULL, *buf = xstrdup(info);
+
+    ptr = strtok_r(buf, ",", &saveptr);
+    if (MATCH_OPTION("type", ptr, oparg)) {
+        if (!strcmp(oparg, "intel_vtd")) {
+            viommu->type = LIBXL_VIOMMU_TYPE_INTEL_VTD;
+        } else {
+            fprintf(stderr, "Invalid viommu type: %s\n", oparg);
+            return 1;
+        }
+    } else {
+        fprintf(stderr, "viommu type should be set first: %s\n", oparg);
+        return 1;
+    }
+
+    for (ptr = strtok_r(NULL, ",", &saveptr); ptr;
+         ptr = strtok_r(NULL, ",", &saveptr)) {
+        if (MATCH_OPTION("intremap", ptr, oparg)) {
+            libxl_defbool_set(&viommu->intremap, !!strtoul(oparg, NULL, 0));
+        } else {
+            fprintf(stderr, "Unknown string `%s' in viommu spec\n", ptr);
+            return 1;
+        }
+    }
+    return 0;
+}
+
 void parse_config_data(const char *config_source,
                        const char *config_data,
                        int config_len,
@@ -813,7 +845,7 @@ void parse_config_data(const char *config_source,
     long l, vcpus = 0;
     XLU_Config *config;
     XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms,
-                   *usbctrls, *usbdevs, *p9devs;
+                   *usbctrls, *usbdevs, *p9devs, *iommus;
     XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian, *dtdevs,
                    *mca_caps;
     int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps;
@@ -1037,6 +1069,24 @@ void parse_config_data(const char *config_source,
     xlu_cfg_get_defbool(config, "driver_domain", &c_info->driver_domain, 0);
     xlu_cfg_get_defbool(config, "acpi", &b_info->acpi, 0);
 
+    if (!xlu_cfg_get_list (config, "viommu", &iommus, 0, 0)) {
+        b_info->num_viommus = 0;
+        b_info->viommu = NULL;
+        while ((buf = xlu_cfg_get_listitem (iommus, b_info->num_viommus))
+                != NULL) {
+            libxl_viommu_info *viommu;
+
+            viommu = ARRAY_EXTEND_INIT_NODEVID(b_info->viommu,
+                                               b_info->num_viommus,
+                                               libxl_viommu_info_init);
+
+            if (parse_viommu_config(viommu, buf)) {
+                fprintf(stderr, "ERROR: invalid viommu setting\n");
+                exit (1);
+            }
+        }
+    }
+
     switch(b_info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
         kernel_basename = libxl_basename(b_info->kernel);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 7/29] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (5 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 6/29] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 10:00   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction Lan Tianyu
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

A new logic is added to build ACPI DMAR table in tool stack for a guest
with one virtual VTD and pass through it to guest via existing mechanism. If
there already are ACPI tables needed to pass through, we joint the tables.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
v3:
 - build dmar and initialize related acpi_modules struct in
 libxl_x86_acpi.c, keeping in accordance with pvh.

---
 tools/libxl/libxl_x86.c      |  3 +-
 tools/libxl/libxl_x86_acpi.c | 98 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 96 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 455f6f0..23c9a55 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -381,8 +381,7 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
 {
     int rc = 0;
 
-    if ((info->type == LIBXL_DOMAIN_TYPE_HVM) &&
-        (info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)) {
+    if (info->type == LIBXL_DOMAIN_TYPE_HVM) {
         rc = libxl__dom_load_acpi(gc, info, dom);
         if (rc != 0)
             LOGE(ERROR, "libxl_dom_load_acpi failed");
diff --git a/tools/libxl/libxl_x86_acpi.c b/tools/libxl/libxl_x86_acpi.c
index 1761756..adf02f4 100644
--- a/tools/libxl/libxl_x86_acpi.c
+++ b/tools/libxl/libxl_x86_acpi.c
@@ -16,6 +16,7 @@
 #include "libxl_arch.h"
 #include <xen/hvm/hvm_info_table.h>
 #include <xen/hvm/e820.h>
+#include "libacpi/acpi2_0.h"
 #include "libacpi/libacpi.h"
 
 #include <xc_dom.h>
@@ -161,9 +162,9 @@ out:
     return rc;
 }
 
-int libxl__dom_load_acpi(libxl__gc *gc,
-                         const libxl_domain_build_info *b_info,
-                         struct xc_dom_image *dom)
+static int libxl__dom_load_acpi_pvh(libxl__gc *gc,
+                                    const libxl_domain_build_info *b_info,
+                                    struct xc_dom_image *dom)
 {
     struct acpi_config config = {0};
     struct libxl_acpi_ctxt libxl_ctxt;
@@ -236,6 +237,97 @@ out:
     return rc;
 }
 
+static void *acpi_memalign(struct acpi_ctxt *ctxt, uint32_t size,
+                           uint32_t align)
+{
+    int ret;
+    void *ptr;
+
+    ret = posix_memalign(&ptr, align, size);
+    if (ret != 0 || !ptr)
+        return NULL;
+
+    return ptr;
+}
+
+/*
+ * For hvm, we don't need build acpi in libxl. Instead, it's built in hvmloader.
+ * But if one hvm has virtual VTD(s), we build DMAR table for it and joint this
+ * table with existing content in acpi_modules in order to employ HVM
+ * firmware pass-through mechanism to pass-through DMAR table.
+ */
+static int libxl__dom_load_acpi_hvm(libxl__gc *gc,
+                                    const libxl_domain_build_info *b_info,
+                                    struct xc_dom_image *dom)
+{
+    struct acpi_config config = { 0 };
+    struct acpi_ctxt ctxt;
+    void *table;
+    uint32_t len;
+
+    if ((b_info->type != LIBXL_DOMAIN_TYPE_HVM) ||
+        (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) ||
+        (b_info->num_viommus != 1) ||
+        (b_info->viommu[0].type != LIBXL_VIOMMU_TYPE_INTEL_VTD))
+        return 0;
+
+    ctxt.mem_ops.alloc = acpi_memalign;
+    ctxt.mem_ops.v2p = virt_to_phys;
+    ctxt.mem_ops.free = acpi_mem_free;
+
+    if (libxl_defbool_val(b_info->viommu[0].intremap))
+        config.iommu_intremap_supported = true;
+    /* x2apic is always enabled since in no case we must disable it */
+    config.iommu_x2apic_supported = true;
+    config.iommu_base_addr = b_info->viommu[0].base_addr;
+
+    /* IOAPIC id and PSEUDO BDF */
+    config.ioapic_id = 1;
+    config.ioapic_bus = 0xff;
+    config.ioapic_devfn = 0x0;
+
+    config.host_addr_width = 39;
+
+    table = construct_dmar(&ctxt, &config);
+    if ( !table )
+        return ERROR_NOMEM;
+    len = ((struct acpi_header *)table)->length;
+
+    if (len) {
+        libxl__ptr_add(gc, table);
+        if (!dom->acpi_modules[0].data) {
+            dom->acpi_modules[0].data = table;
+            dom->acpi_modules[0].length = len;
+        } else {
+            /* joint tables */
+            void *newdata;
+
+            newdata = libxl__malloc(gc, len + dom->acpi_modules[0].length);
+            memcpy(newdata, dom->acpi_modules[0].data,
+                   dom->acpi_modules[0].length);
+            memcpy(newdata + dom->acpi_modules[0].length, table, len);
+
+            free(dom->acpi_modules[0].data);
+            dom->acpi_modules[0].data = newdata;
+            dom->acpi_modules[0].length += len;
+        }
+    }
+    return 0;
+}
+
+int libxl__dom_load_acpi(libxl__gc *gc,
+                         const libxl_domain_build_info *b_info,
+                         struct xc_dom_image *dom)
+{
+
+    if (b_info->type != LIBXL_DOMAIN_TYPE_HVM)
+        return 0;
+
+    if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)
+        return libxl__dom_load_acpi_pvh(gc, b_info, dom);
+    else
+        return libxl__dom_load_acpi_hvm(gc, b_info, dom);
+}
 /*
  * Local variables:
  * mode: C
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (6 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 7/29] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 10:13   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 9/29] tools/libxc: Add viommu operations in libxc Lan Tianyu
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

If guest is configured to have a vIOMMU, create it during domain construction.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
v3:
 - Remove the process of querying capabilities.
---
 tools/libxl/libxl_x86.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 23c9a55..25cae5f 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -341,8 +341,25 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
     if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM) {
         unsigned long shadow = DIV_ROUNDUP(d_config->b_info.shadow_memkb,
                                            1024);
+        int i;
+
         xc_shadow_control(ctx->xch, domid, XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
                           NULL, 0, &shadow, 0, NULL);
+
+        for (i = 0; i < d_config->b_info.num_viommus; i++) {
+            uint32_t id;
+            libxl_viommu_info *viommu = d_config->b_info.viommu + i;
+
+            if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
+                ret = xc_viommu_create(ctx->xch, domid, VIOMMU_TYPE_INTEL_VTD,
+                                       viommu->base_addr, viommu->cap, &id);
+                if (ret) {
+                    LOGED(ERROR, domid, "create vIOMMU fail");
+                    ret = ERROR_FAIL;
+                    goto out;
+                }
+            }
+        }
     }
 
     if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV &&
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 9/29] tools/libxc: Add viommu operations in libxc
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (7 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 10:17   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 10/29] vtd: add and align register definitions Lan Tianyu
                   ` (20 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

This patch adds XEN_DOMCTL_viommu_op hypercall. This hypercall
comprises two sub-commands:
- create(): create a vIOMMU in Xen, given viommu type, register-set
            location and capabilities
- destroy(): destroy a vIOMMU specified by viommu_id

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
v3:
 - Remove API for querying viommu capabilities
 - Remove pointless cast
 - Polish commit message
 - Coding style
---
 tools/libxc/Makefile          |  1 +
 tools/libxc/include/xenctrl.h |  4 +++
 tools/libxc/xc_viommu.c       | 64 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+)
 create mode 100644 tools/libxc/xc_viommu.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 9a019e8..7d8c4b4 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -51,6 +51,7 @@ CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
 CTRL_SRCS-y       += xc_evtchn_compat.c
 CTRL_SRCS-y       += xc_gnttab_compat.c
 CTRL_SRCS-y       += xc_devicemodel_compat.c
+CTRL_SRCS-y       += xc_viommu.c
 
 GUEST_SRCS-y :=
 GUEST_SRCS-y += xg_private.c xc_suspend.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 43151cb..bedca1f 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2501,6 +2501,10 @@ enum xc_static_cpu_featuremask {
 const uint32_t *xc_get_static_cpu_featuremask(enum xc_static_cpu_featuremask);
 const uint32_t *xc_get_feature_deep_deps(uint32_t feature);
 
+int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
+                     uint64_t base_addr, uint64_t cap, uint32_t *viommu_id);
+int xc_viommu_destroy(xc_interface *xch, domid_t dom, uint32_t viommu_id);
+
 #endif
 
 int xc_livepatch_upload(xc_interface *xch,
diff --git a/tools/libxc/xc_viommu.c b/tools/libxc/xc_viommu.c
new file mode 100644
index 0000000..17507c5
--- /dev/null
+++ b/tools/libxc/xc_viommu.c
@@ -0,0 +1,64 @@
+/*
+ * xc_viommu.c
+ *
+ * viommu related API functions.
+ *
+ * Copyright (C) 2017 Intel Corporation
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License, version 2.1, as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xc_private.h"
+
+int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
+                     uint64_t base_addr, uint64_t cap, uint32_t *viommu_id)
+{
+    int rc;
+
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_viommu_op;
+    domctl.domain = dom;
+    domctl.u.viommu_op.cmd = XEN_DOMCTL_create_viommu;
+    domctl.u.viommu_op.u.create.viommu_type = type;
+    domctl.u.viommu_op.u.create.base_address = base_addr;
+    domctl.u.viommu_op.u.create.capabilities = cap;
+
+    rc = do_domctl(xch, &domctl);
+    if ( !rc )
+        *viommu_id = domctl.u.viommu_op.u.create.viommu_id;
+
+    return rc;
+}
+
+int xc_viommu_destroy(xc_interface *xch, domid_t dom, uint32_t viommu_id)
+{
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_viommu_op;
+    domctl.domain = dom;
+    domctl.u.viommu_op.cmd = XEN_DOMCTL_destroy_viommu;
+    domctl.u.viommu_op.u.destroy.viommu_id = viommu_id;
+
+    return do_domctl(xch, &domctl);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 10/29] vtd: add and align register definitions
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (8 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 9/29] tools/libxc: Add viommu operations in libxc Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 10:21   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

No functional changes.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
 xen/drivers/passthrough/vtd/iommu.h | 54 +++++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 72c1a2e..d7e433e 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -23,31 +23,39 @@
 #include <asm/msi.h>
 
 /*
- * Intel IOMMU register specification per version 1.0 public spec.
+ * Intel IOMMU register specification per version 2.4 public spec.
  */
 
-#define    DMAR_VER_REG    0x0    /* Arch version supported by this IOMMU */
-#define    DMAR_CAP_REG    0x8    /* Hardware supported capabilities */
-#define    DMAR_ECAP_REG    0x10    /* Extended capabilities supported */
-#define    DMAR_GCMD_REG    0x18    /* Global command register */
-#define    DMAR_GSTS_REG    0x1c    /* Global status register */
-#define    DMAR_RTADDR_REG    0x20    /* Root entry table */
-#define    DMAR_CCMD_REG    0x28    /* Context command reg */
-#define    DMAR_FSTS_REG    0x34    /* Fault Status register */
-#define    DMAR_FECTL_REG    0x38    /* Fault control register */
-#define    DMAR_FEDATA_REG    0x3c    /* Fault event interrupt data register */
-#define    DMAR_FEADDR_REG    0x40    /* Fault event interrupt addr register */
-#define    DMAR_FEUADDR_REG 0x44    /* Upper address register */
-#define    DMAR_AFLOG_REG    0x58    /* Advanced Fault control */
-#define    DMAR_PMEN_REG    0x64    /* Enable Protected Memory Region */
-#define    DMAR_PLMBASE_REG 0x68    /* PMRR Low addr */
-#define    DMAR_PLMLIMIT_REG 0x6c    /* PMRR low limit */
-#define    DMAR_PHMBASE_REG 0x70    /* pmrr high base addr */
-#define    DMAR_PHMLIMIT_REG 0x78    /* pmrr high limit */
-#define    DMAR_IQH_REG    0x80    /* invalidation queue head */
-#define    DMAR_IQT_REG    0x88    /* invalidation queue tail */
-#define    DMAR_IQA_REG    0x90    /* invalidation queue addr */
-#define    DMAR_IRTA_REG   0xB8    /* intr remap */
+#define DMAR_VER_REG            0x0  /* Arch version supported by this IOMMU */
+#define DMAR_CAP_REG            0x8  /* Hardware supported capabilities */
+#define DMAR_ECAP_REG           0x10 /* Extended capabilities supported */
+#define DMAR_GCMD_REG           0x18 /* Global command register */
+#define DMAR_GSTS_REG           0x1c /* Global status register */
+#define DMAR_RTADDR_REG         0x20 /* Root entry table */
+#define DMAR_CCMD_REG           0x28 /* Context command reg */
+#define DMAR_FSTS_REG           0x34 /* Fault Status register */
+#define DMAR_FECTL_REG          0x38 /* Fault control register */
+#define DMAR_FEDATA_REG         0x3c /* Fault event interrupt data register */
+#define DMAR_FEADDR_REG         0x40 /* Fault event interrupt addr register */
+#define DMAR_FEUADDR_REG        0x44 /* Upper address register */
+#define DMAR_AFLOG_REG          0x58 /* Advanced Fault control */
+#define DMAR_PMEN_REG           0x64 /* Enable Protected Memory Region */
+#define DMAR_PLMBASE_REG        0x68 /* PMRR Low addr */
+#define DMAR_PLMLIMIT_REG       0x6c /* PMRR low limit */
+#define DMAR_PHMBASE_REG        0x70 /* pmrr high base addr */
+#define DMAR_PHMLIMIT_REG       0x78 /* pmrr high limit */
+#define DMAR_IQH_REG            0x80 /* invalidation queue head */
+#define DMAR_IQT_REG            0x88 /* invalidation queue tail */
+#define DMAR_IQT_REG_HI         0x8c
+#define DMAR_IQA_REG            0x90 /* invalidation queue addr */
+#define DMAR_IQA_REG_HI         0x94
+#define DMAR_ICS_REG            0x9c /* Invalidation complete status */
+#define DMAR_IECTL_REG          0xa0 /* Invalidation event control */
+#define DMAR_IEDATA_REG         0xa4 /* Invalidation event data */
+#define DMAR_IEADDR_REG         0xa8 /* Invalidation event address */
+#define DMAR_IEUADDR_REG        0xac /* Invalidation event address */
+#define DMAR_IRTA_REG           0xb8 /* Interrupt remapping table addr */
+#define DMAR_IRTA_REG_HI        0xbc
 
 #define OFFSET_STRIDE        (9)
 #define dmar_readl(dmar, reg) readl((dmar) + (reg))
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (9 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 10/29] vtd: add and align register definitions Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 11:20   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
                   ` (18 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

This patch adds create/destroy function for the emulated VTD
and adapts it to the common VIOMMU abstraction.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/Makefile |   7 +-
 xen/drivers/passthrough/vtd/iommu.h  |  23 +++++-
 xen/drivers/passthrough/vtd/vvtd.c   | 147 +++++++++++++++++++++++++++++++++++
 3 files changed, 170 insertions(+), 7 deletions(-)
 create mode 100644 xen/drivers/passthrough/vtd/vvtd.c

diff --git a/xen/drivers/passthrough/vtd/Makefile b/xen/drivers/passthrough/vtd/Makefile
index f302653..163c7fe 100644
--- a/xen/drivers/passthrough/vtd/Makefile
+++ b/xen/drivers/passthrough/vtd/Makefile
@@ -1,8 +1,9 @@
 subdir-$(CONFIG_X86) += x86
 
-obj-y += iommu.o
 obj-y += dmar.o
-obj-y += utils.o
-obj-y += qinval.o
 obj-y += intremap.o
+obj-y += iommu.o
+obj-y += qinval.o
 obj-y += quirks.o
+obj-y += utils.o
+obj-$(CONFIG_VIOMMU) += vvtd.o
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index d7e433e..ef038c9 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -66,6 +66,12 @@
 #define VER_MAJOR(v)        (((v) & 0xf0) >> 4)
 #define VER_MINOR(v)        ((v) & 0x0f)
 
+/* Supported Adjusted Guest Address Widths */
+#define DMA_CAP_SAGAW_SHIFT         8
+ /* 39-bit AGAW, 3-level page-table */
+#define DMA_CAP_SAGAW_39bit         (0x2ULL << DMA_CAP_SAGAW_SHIFT)
+#define DMA_CAP_ND_64K              6ULL
+
 /*
  * Decoding Capability Register
  */
@@ -74,6 +80,7 @@
 #define cap_write_drain(c)     (((c) >> 54) & 1)
 #define cap_max_amask_val(c)   (((c) >> 48) & 0x3f)
 #define cap_num_fault_regs(c)  ((((c) >> 40) & 0xff) + 1)
+#define cap_set_num_fault_regs(c)  ((((c) - 1) & 0xff) << 40)
 #define cap_pgsel_inv(c)       (((c) >> 39) & 1)
 
 #define cap_super_page_val(c)  (((c) >> 34) & 0xf)
@@ -85,11 +92,13 @@
 #define cap_sps_1tb(c)         ((c >> 37) & 1)
 
 #define cap_fault_reg_offset(c)    ((((c) >> 24) & 0x3ff) * 16)
+#define cap_set_fault_reg_offset(c) ((((c) / 16) & 0x3ff) << 24 )
 
 #define cap_isoch(c)        (((c) >> 23) & 1)
 #define cap_qos(c)        (((c) >> 22) & 1)
 #define cap_mgaw(c)        ((((c) >> 16) & 0x3f) + 1)
-#define cap_sagaw(c)        (((c) >> 8) & 0x1f)
+#define cap_set_mgaw(c)     ((((c) - 1) & 0x3f) << 16)
+#define cap_sagaw(c)        (((c) >> DMA_CAP_SAGAW_SHIFT) & 0x1f)
 #define cap_caching_mode(c)    (((c) >> 7) & 1)
 #define cap_phmr(c)        (((c) >> 6) & 1)
 #define cap_plmr(c)        (((c) >> 5) & 1)
@@ -104,10 +113,16 @@
 #define ecap_niotlb_iunits(e)    ((((e) >> 24) & 0xff) + 1)
 #define ecap_iotlb_offset(e)     ((((e) >> 8) & 0x3ff) * 16)
 #define ecap_coherent(e)         ((e >> 0) & 0x1)
-#define ecap_queued_inval(e)     ((e >> 1) & 0x1)
+#define DMA_ECAP_QI_SHIFT        1
+#define DMA_ECAP_QI              (1ULL << DMA_ECAP_QI_SHIFT)
+#define ecap_queued_inval(e)     ((e >> DMA_ECAP_QI_SHIFT) & 0x1)
 #define ecap_dev_iotlb(e)        ((e >> 2) & 0x1)
-#define ecap_intr_remap(e)       ((e >> 3) & 0x1)
-#define ecap_eim(e)              ((e >> 4) & 0x1)
+#define DMA_ECAP_IR_SHIFT        3
+#define DMA_ECAP_IR              (1ULL << DMA_ECAP_IR_SHIFT)
+#define ecap_intr_remap(e)       ((e >> DMA_ECAP_IR_SHIFT) & 0x1)
+#define DMA_ECAP_EIM_SHIFT       4
+#define DMA_ECAP_EIM             (1ULL << DMA_ECAP_EIM_SHIFT)
+#define ecap_eim(e)              ((e >> DMA_ECAP_EIM_SHIFT) & 0x1)
 #define ecap_cache_hints(e)      ((e >> 5) & 0x1)
 #define ecap_pass_thru(e)        ((e >> 6) & 0x1)
 #define ecap_snp_ctl(e)          ((e >> 7) & 0x1)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
new file mode 100644
index 0000000..c851ec7
--- /dev/null
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -0,0 +1,147 @@
+/*
+ * vvtd.c
+ *
+ * virtualize VTD for HVM.
+ *
+ * Copyright (C) 2017 Chao Gao, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/domain_page.h>
+#include <xen/sched.h>
+#include <xen/types.h>
+#include <xen/viommu.h>
+#include <xen/xmalloc.h>
+#include <asm/current.h>
+#include <asm/hvm/domain.h>
+#include <asm/page.h>
+
+#include "iommu.h"
+
+/* Supported capabilities by vvtd */
+unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
+
+union hvm_hw_vvtd_regs {
+    uint32_t data32[256];
+    uint64_t data64[128];
+};
+
+struct vvtd {
+    /* Address range of remapping hardware register-set */
+    uint64_t base_addr;
+    uint64_t length;
+    /* Point back to the owner domain */
+    struct domain *domain;
+    union hvm_hw_vvtd_regs *regs;
+    struct page_info *regs_page;
+};
+
+static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
+{
+    vtd->regs->data32[reg/sizeof(uint32_t)] = value;
+}
+
+static inline uint32_t vvtd_get_reg(struct vvtd *vtd, uint32_t reg)
+{
+    return vtd->regs->data32[reg/sizeof(uint32_t)];
+}
+
+static inline void vvtd_set_reg_quad(struct vvtd *vtd, uint32_t reg,
+                                     uint64_t value)
+{
+    vtd->regs->data64[reg/sizeof(uint64_t)] = value;
+}
+
+static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
+{
+    return vtd->regs->data64[reg/sizeof(uint64_t)];
+}
+
+static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
+{
+    uint64_t cap = cap_set_num_fault_regs(1ULL) |
+                   cap_set_fault_reg_offset(0x220ULL) |
+                   cap_set_mgaw(39ULL) | DMA_CAP_SAGAW_39bit |
+                   DMA_CAP_ND_64K;
+    uint64_t ecap = DMA_ECAP_IR | DMA_ECAP_EIM | DMA_ECAP_QI;
+
+    vvtd_set_reg(vvtd, DMAR_VER_REG, 0x10UL);
+    vvtd_set_reg_quad(vvtd, DMAR_CAP_REG, cap);
+    vvtd_set_reg_quad(vvtd, DMAR_ECAP_REG, ecap);
+    vvtd_set_reg(vvtd, DMAR_FECTL_REG, 0x80000000UL);
+    vvtd_set_reg(vvtd, DMAR_IECTL_REG, 0x80000000UL);
+}
+
+static int vvtd_create(struct domain *d, struct viommu *viommu)
+{
+    struct vvtd *vvtd;
+    int ret;
+
+    if ( !is_hvm_domain(d) || (viommu->base_address & (PAGE_SIZE - 1)) ||
+        (~vvtd_caps & viommu->caps) )
+        return -EINVAL;
+
+    ret = -ENOMEM;
+    vvtd = xzalloc_bytes(sizeof(struct vvtd));
+    if ( !vvtd )
+        return ret;
+
+    vvtd->regs_page = alloc_domheap_page(d, MEMF_no_owner);
+    if ( !vvtd->regs_page )
+        goto out1;
+
+    vvtd->regs = __map_domain_page_global(vvtd->regs_page);
+    if ( !vvtd->regs )
+        goto out2;
+    clear_page(vvtd->regs);
+
+    vvtd_reset(vvtd, viommu->caps);
+    vvtd->base_addr = viommu->base_address;
+    vvtd->domain = d;
+
+    viommu->priv = vvtd;
+
+    return 0;
+
+ out2:
+    free_domheap_page(vvtd->regs_page);
+ out1:
+    xfree(vvtd);
+    return ret;
+}
+
+static int vvtd_destroy(struct viommu *viommu)
+{
+    struct vvtd *vvtd = viommu->priv;
+
+    if ( vvtd )
+    {
+        unmap_domain_page_global(vvtd->regs);
+        free_domheap_page(vvtd->regs_page);
+        xfree(vvtd);
+    }
+    return 0;
+}
+
+struct viommu_ops vvtd_hvm_vmx_ops = {
+    .create = vvtd_create,
+    .destroy = vvtd_destroy
+};
+
+static int vvtd_register(void)
+{
+    viommu_register_type(VIOMMU_TYPE_INTEL_VTD, &vvtd_hvm_vmx_ops);
+    return 0;
+}
+__initcall(vvtd_register);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (10 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 11:34   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
                   ` (17 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

This patch adds VVTD MMIO handler to deal with MMIO access.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/vvtd.c | 91 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)

diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index c851ec7..a3002c3 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -47,6 +47,29 @@ struct vvtd {
     struct page_info *regs_page;
 };
 
+/* Setting viommu_verbose enables debugging messages of vIOMMU */
+bool __read_mostly viommu_verbose;
+boolean_runtime_param("viommu_verbose", viommu_verbose);
+
+#ifndef NDEBUG
+#define vvtd_info(fmt...) do {                    \
+    if ( viommu_verbose )                         \
+        gprintk(XENLOG_G_INFO, ## fmt);           \
+} while(0)
+#define vvtd_debug(fmt...) do {                   \
+    if ( viommu_verbose && printk_ratelimit() )   \
+        printk(XENLOG_G_DEBUG fmt);               \
+} while(0)
+#else
+#define vvtd_info(fmt...) do {} while(0)
+#define vvtd_debug(fmt...) do {} while(0)
+#endif
+
+struct vvtd *domain_vvtd(struct domain *d)
+{
+    return (d->viommu) ? d->viommu->priv : NULL;
+}
+
 static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
 {
     vtd->regs->data32[reg/sizeof(uint32_t)] = value;
@@ -68,6 +91,73 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
     return vtd->regs->data64[reg/sizeof(uint64_t)];
 }
 
+static int vvtd_in_range(struct vcpu *v, unsigned long addr)
+{
+    struct vvtd *vvtd = domain_vvtd(v->domain);
+
+    if ( vvtd )
+        return (addr >= vvtd->base_addr) &&
+               (addr < vvtd->base_addr + PAGE_SIZE);
+    return 0;
+}
+
+static int vvtd_read(struct vcpu *v, unsigned long addr,
+                     unsigned int len, unsigned long *pval)
+{
+    struct vvtd *vvtd = domain_vvtd(v->domain);
+    unsigned int offset = addr - vvtd->base_addr;
+
+    vvtd_info("Read offset %x len %d\n", offset, len);
+
+    if ( (len != 4 && len != 8) || (offset & (len - 1)) )
+        return X86EMUL_OKAY;
+
+    if ( len == 4 )
+        *pval = vvtd_get_reg(vvtd, offset);
+    else
+        *pval = vvtd_get_reg_quad(vvtd, offset);
+
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write(struct vcpu *v, unsigned long addr,
+                      unsigned int len, unsigned long val)
+{
+    struct vvtd *vvtd = domain_vvtd(v->domain);
+    unsigned int offset = addr - vvtd->base_addr;
+
+    vvtd_info("Write offset %x len %d val %lx\n", offset, len, val);
+
+    if ( (len != 4 && len != 8) || (offset & (len - 1)) )
+        return X86EMUL_OKAY;
+
+    if ( len == 4 )
+    {
+        switch ( offset )
+        {
+        case DMAR_IEDATA_REG:
+        case DMAR_IEADDR_REG:
+        case DMAR_IEUADDR_REG:
+        case DMAR_FEDATA_REG:
+        case DMAR_FEADDR_REG:
+        case DMAR_FEUADDR_REG:
+            vvtd_set_reg(vvtd, offset, val);
+            break;
+
+        default:
+            break;
+        }
+    }
+
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_mmio_ops vvtd_mmio_ops = {
+    .check = vvtd_in_range,
+    .read = vvtd_read,
+    .write = vvtd_write
+};
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = cap_set_num_fault_regs(1ULL) |
@@ -109,6 +199,7 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
     vvtd_reset(vvtd, viommu->caps);
     vvtd->base_addr = viommu->base_address;
     vvtd->domain = d;
+    register_mmio_handler(d, &vvtd_mmio_ops);
 
     viommu->priv = vvtd;
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (11 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 11:56   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 14/29] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
                   ` (16 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Software sets this field to set/update the interrupt remapping table pointer
used by hardware. The interrupt remapping table pointer is specified through
the Interrupt Remapping Table Address (IRTA_REG) register.

This patch emulates this operation and adds some new fields in VVTD to track
info (e.g. the table's gfn and max supported entries) of interrupt remapping
table.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
v3:
 - ignore unaligned r/w of vt-d hardware registers and return X86EMUL_OK
---
 xen/drivers/passthrough/vtd/iommu.h | 12 ++++++-
 xen/drivers/passthrough/vtd/vvtd.c  | 69 +++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index ef038c9..a0d5ec8 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -153,6 +153,8 @@
 #define DMA_GCMD_IRE    (((u64)1) << 25)
 #define DMA_GCMD_SIRTP  (((u64)1) << 24)
 #define DMA_GCMD_CFI    (((u64)1) << 23)
+/* mask of one-shot bits */
+#define DMA_GCMD_ONE_SHOT_MASK 0x96ffffff 
 
 /* GSTS_REG */
 #define DMA_GSTS_TES    (((u64)1) << 31)
@@ -162,9 +164,17 @@
 #define DMA_GSTS_WBFS   (((u64)1) << 27)
 #define DMA_GSTS_QIES   (((u64)1) <<26)
 #define DMA_GSTS_IRES   (((u64)1) <<25)
-#define DMA_GSTS_SIRTPS (((u64)1) << 24)
+#define DMA_GSTS_SIRTPS_SHIFT   24
+#define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_SHIFT)
 #define DMA_GSTS_CFIS   (((u64)1) <<23)
 
+/* IRTA_REG */
+/* The base of 4KB aligned interrupt remapping table */
+#define DMA_IRTA_ADDR(val)      ((val) & ~0xfffULL)
+/* The size of remapping table is 2^(x+1), where x is the size field in IRTA */
+#define DMA_IRTA_S(val)         (val & 0xf)
+#define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))
+
 /* PMEN_REG */
 #define DMA_PMEN_EPM    (((u32)1) << 31)
 #define DMA_PMEN_PRS    (((u32)1) << 0)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index a3002c3..6736956 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -32,6 +32,13 @@
 /* Supported capabilities by vvtd */
 unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
 
+struct hvm_hw_vvtd_status {
+    uint32_t eim_enabled : 1;
+    uint32_t irt_max_entry;
+    /* Interrupt remapping table base gfn */
+    uint64_t irt;
+};
+
 union hvm_hw_vvtd_regs {
     uint32_t data32[256];
     uint64_t data64[128];
@@ -43,6 +50,8 @@ struct vvtd {
     uint64_t length;
     /* Point back to the owner domain */
     struct domain *domain;
+
+    struct hvm_hw_vvtd_status status;
     union hvm_hw_vvtd_regs *regs;
     struct page_info *regs_page;
 };
@@ -70,6 +79,11 @@ struct vvtd *domain_vvtd(struct domain *d)
     return (d->viommu) ? d->viommu->priv : NULL;
 }
 
+static inline void vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
+{
+    __set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
+}
+
 static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
 {
     vtd->regs->data32[reg/sizeof(uint32_t)] = value;
@@ -91,6 +105,44 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
     return vtd->regs->data64[reg/sizeof(uint64_t)];
 }
 
+static void vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
+{
+    uint64_t irta = vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG);
+
+    if ( !(val & DMA_GCMD_SIRTP) )
+        return;
+
+    vvtd->status.irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
+    vvtd->status.irt_max_entry = DMA_IRTA_SIZE(irta);
+    vvtd->status.eim_enabled = !!(irta & IRTA_EIME);
+    vvtd_info("Update IR info (addr=%lx eim=%d size=%d).",
+              vvtd->status.irt, vvtd->status.eim_enabled,
+              vvtd->status.irt_max_entry);
+    vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_SIRTPS_SHIFT);
+}
+
+static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
+{
+    uint32_t orig = vvtd_get_reg(vvtd, DMAR_GSTS_REG);
+    uint32_t changed;
+
+    orig = orig & DMA_GCMD_ONE_SHOT_MASK;   /* reset the one-shot bits */
+    changed = orig ^ val;
+
+    if ( !changed )
+        return X86EMUL_OKAY;
+
+    if ( changed & (changed - 1) )
+        vvtd_info("Guest attempts to write %x to GCMD (current GSTS is %x)," 
+                  "it would lead to update multiple fields",
+                  val, orig);
+
+    if ( changed & DMA_GCMD_SIRTP )
+        vvtd_handle_gcmd_sirtp(vvtd, val);
+
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_in_range(struct vcpu *v, unsigned long addr)
 {
     struct vvtd *vvtd = domain_vvtd(v->domain);
@@ -135,12 +187,17 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
     {
         switch ( offset )
         {
+        case DMAR_GCMD_REG:
+            return vvtd_write_gcmd(vvtd, val);
+
         case DMAR_IEDATA_REG:
         case DMAR_IEADDR_REG:
         case DMAR_IEUADDR_REG:
         case DMAR_FEDATA_REG:
         case DMAR_FEADDR_REG:
         case DMAR_FEUADDR_REG:
+        case DMAR_IRTA_REG:
+        case DMAR_IRTA_REG_HI:
             vvtd_set_reg(vvtd, offset, val);
             break;
 
@@ -148,6 +205,18 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             break;
         }
     }
+    else /* len == 8 */
+    {
+        switch ( offset )
+        {
+        case DMAR_IRTA_REG:
+            vvtd_set_reg_quad(vvtd, DMAR_IRTA_REG, val);
+            break;
+
+        default:
+            break;
+        }
+    }
 
     return X86EMUL_OKAY;
 }
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 14/29] x86/vvtd: Enable Interrupt Remapping through GCMD
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (12 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 13:42   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request Lan Tianyu
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Software writes this field to enable/disable interrupt reampping. This patch
emulate IRES field of GCMD.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  3 ++-
 xen/drivers/passthrough/vtd/vvtd.c  | 30 +++++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index a0d5ec8..703726f 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -163,7 +163,8 @@
 #define DMA_GSTS_AFLS   (((u64)1) << 28)
 #define DMA_GSTS_WBFS   (((u64)1) << 27)
 #define DMA_GSTS_QIES   (((u64)1) <<26)
-#define DMA_GSTS_IRES   (((u64)1) <<25)
+#define DMA_GSTS_IRES_SHIFT     25
+#define DMA_GSTS_IRES   (((u64)1) << DMA_GSTS_IRES_SHIFT)
 #define DMA_GSTS_SIRTPS_SHIFT   24
 #define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_SHIFT)
 #define DMA_GSTS_CFIS   (((u64)1) <<23)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 6736956..a0f63e9 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -33,7 +33,8 @@
 unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
 
 struct hvm_hw_vvtd_status {
-    uint32_t eim_enabled : 1;
+    uint32_t eim_enabled : 1,
+             intremap_enabled : 1;
     uint32_t irt_max_entry;
     /* Interrupt remapping table base gfn */
     uint64_t irt;
@@ -84,6 +85,11 @@ static inline void vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
     __set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
 }
 
+static inline void vvtd_clear_bit(struct vvtd *vvtd, uint32_t reg, int nr)
+{
+    __clear_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
+}
+
 static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
 {
     vtd->regs->data32[reg/sizeof(uint32_t)] = value;
@@ -105,6 +111,23 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
     return vtd->regs->data64[reg/sizeof(uint64_t)];
 }
 
+static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
+{
+    vvtd_info("%sable Interrupt Remapping",
+              (val & DMA_GCMD_IRE) ? "En" : "Dis");
+
+    if ( val & DMA_GCMD_IRE )
+    {
+        vvtd->status.intremap_enabled = true;
+        vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_SHIFT);
+    }
+    else
+    {
+        vvtd->status.intremap_enabled = false;
+        vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_SHIFT);
+    }
+}
+
 static void vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
 {
     uint64_t irta = vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG);
@@ -112,6 +135,9 @@ static void vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
     if ( !(val & DMA_GCMD_SIRTP) )
         return;
 
+    if ( vvtd->status.intremap_enabled )
+        vvtd_info("Update Interrupt Remapping Table when active\n");
+
     vvtd->status.irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
     vvtd->status.irt_max_entry = DMA_IRTA_SIZE(irta);
     vvtd->status.eim_enabled = !!(irta & IRTA_EIME);
@@ -139,6 +165,8 @@ static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
 
     if ( changed & DMA_GCMD_SIRTP )
         vvtd_handle_gcmd_sirtp(vvtd, val);
+    if ( changed & DMA_GCMD_IRE )
+        vvtd_handle_gcmd_ire(vvtd, val);
 
     return X86EMUL_OKAY;
 }
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (13 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 14/29] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 14:26   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 16/29] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

When a remapping interrupt request arrives, remapping hardware computes the
interrupt_index per the algorithm described in VTD spec
"Interrupt Remapping Table", interprets the IRTE and generates a remapped
interrupt request.

This patch introduces viommu_handle_irq_request() to emulate the process how
remapping hardware handles a remapping interrupt request.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
v3:
 - Encode map_guest_page()'s error into void* to avoid using another parameter
---
 xen/drivers/passthrough/vtd/iommu.h |  21 +++
 xen/drivers/passthrough/vtd/vvtd.c  | 264 +++++++++++++++++++++++++++++++++++-
 2 files changed, 284 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 703726f..790384f 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -218,6 +218,21 @@
 #define dma_frcd_source_id(c) (c & 0xffff)
 #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
 
+enum VTD_FAULT_TYPE
+{
+    /* Interrupt remapping transition faults */
+    VTD_FR_IR_REQ_RSVD      = 0x20, /* One or more IR request reserved
+                                     * fields set */
+    VTD_FR_IR_INDEX_OVER    = 0x21, /* Index value greater than max */
+    VTD_FR_IR_ENTRY_P       = 0x22, /* Present (P) not set in IRTE */
+    VTD_FR_IR_ROOT_INVAL    = 0x23, /* IR Root table invalid */
+    VTD_FR_IR_IRTE_RSVD     = 0x24, /* IRTE Rsvd field non-zero with
+                                     * Present flag set */
+    VTD_FR_IR_REQ_COMPAT    = 0x25, /* Encountered compatible IR
+                                     * request while disabled */
+    VTD_FR_IR_SID_ERR       = 0x26, /* Invalid Source-ID */
+};
+
 /*
  * 0: Present
  * 1-11: Reserved
@@ -358,6 +373,12 @@ struct iremap_entry {
 };
 
 /*
+ * When VT-d doesn't enable Extended Interrupt Mode. Hardware only interprets
+ * only 8-bits ([15:8]) of Destination-ID field in the IRTEs.
+ */
+#define IRTE_xAPIC_DEST_MASK 0xff00
+
+/*
  * Posted-interrupt descriptor address is 64 bits with 64-byte aligned, only
  * the upper 26 bits of lest significiant 32 bits is available.
  */
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index a0f63e9..90c00f5 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -23,11 +23,17 @@
 #include <xen/types.h>
 #include <xen/viommu.h>
 #include <xen/xmalloc.h>
+#include <asm/apic.h>
 #include <asm/current.h>
+#include <asm/event.h>
 #include <asm/hvm/domain.h>
+#include <asm/io_apic.h>
 #include <asm/page.h>
+#include <asm/p2m.h>
+#include <asm/viommu.h>
 
 #include "iommu.h"
+#include "vtd.h"
 
 /* Supported capabilities by vvtd */
 unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
@@ -111,6 +117,132 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
     return vtd->regs->data64[reg/sizeof(uint64_t)];
 }
 
+static void* map_guest_page(struct domain *d, uint64_t gfn)
+{
+    struct page_info *p;
+    void *ret;
+
+    p = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
+    if ( !p )
+        return ERR_PTR(-EINVAL);
+
+    if ( !get_page_type(p, PGT_writable_page) )
+    {
+        put_page(p);
+        return ERR_PTR(-EINVAL);
+    }
+
+    ret = __map_domain_page_global(p);
+    if ( !ret )
+    {
+        put_page_and_type(p);
+        return ERR_PTR(-ENOMEM);
+    }
+
+    return ret;
+}
+
+static void unmap_guest_page(void *virt)
+{
+    struct page_info *page;
+
+    ASSERT((unsigned long)virt & PAGE_MASK);
+    page = mfn_to_page(domain_page_map_to_mfn(virt));
+
+    unmap_domain_page_global(virt);
+    put_page_and_type(page);
+}
+
+static void vvtd_inj_irq(struct vlapic *target, uint8_t vector,
+                         uint8_t trig_mode, uint8_t delivery_mode)
+{
+    vvtd_debug("dest=v%d, delivery_mode=%x vector=%d trig_mode=%d\n",
+               vlapic_vcpu(target)->vcpu_id, delivery_mode, vector, trig_mode);
+
+    ASSERT((delivery_mode == dest_Fixed) ||
+           (delivery_mode == dest_LowestPrio));
+
+    vlapic_set_irq(target, vector, trig_mode);
+}
+
+static int vvtd_delivery(struct domain *d, uint8_t vector,
+                         uint32_t dest, uint8_t dest_mode,
+                         uint8_t delivery_mode, uint8_t trig_mode)
+{
+    struct vlapic *target;
+    struct vcpu *v;
+
+    switch ( delivery_mode )
+    {
+    case dest_LowestPrio:
+        target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
+        if ( target != NULL )
+        {
+            vvtd_inj_irq(target, vector, trig_mode, delivery_mode);
+            break;
+        }
+        vvtd_debug("null round robin: vector=%02x\n", vector);
+        break;
+
+    case dest_Fixed:
+        for_each_vcpu ( d, v )
+            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode) )
+                vvtd_inj_irq(vcpu_vlapic(v), vector, trig_mode, delivery_mode);
+        break;
+
+    case dest_NMI:
+        for_each_vcpu ( d, v )
+            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode) &&
+                 !test_and_set_bool(v->nmi_pending) )
+                vcpu_kick(v);
+        break;
+
+    default:
+        gdprintk(XENLOG_WARNING, "Unsupported VTD delivery mode %d\n",
+                 delivery_mode);
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static uint32_t irq_remapping_request_index(
+    const struct arch_irq_remapping_request *irq)
+{
+    if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
+    {
+        uint32_t index;
+        struct msi_msg_remap_entry msi_msg =
+        {
+            .address_lo = { .val = irq->msg.msi.addr },
+            .data = irq->msg.msi.data,
+        };
+
+        index = (msi_msg.address_lo.index_15 << 15) +
+                msi_msg.address_lo.index_0_14;
+        if ( msi_msg.address_lo.SHV )
+            index += (uint16_t)msi_msg.data;
+
+        return index;
+    }
+    else if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
+    {
+        struct IO_APIC_route_remap_entry remap_rte = { .val = irq->msg.rte };
+
+        return (remap_rte.index_15 << 15) + remap_rte.index_0_14;
+    }
+    ASSERT_UNREACHABLE();
+
+    return 0;
+}
+
+static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
+{
+    /* In xAPIC mode, only 8-bits([15:8]) are valid */
+    return vvtd->status.eim_enabled ? dest :
+           MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
+}
+
 static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
 {
     vvtd_info("%sable Interrupt Remapping",
@@ -255,6 +387,135 @@ static const struct hvm_mmio_ops vvtd_mmio_ops = {
     .write = vvtd_write
 };
 
+static void vvtd_handle_fault(struct vvtd *vvtd,
+                              struct arch_irq_remapping_request *irq,
+                              struct iremap_entry *irte,
+                              unsigned int fault,
+                              bool record_fault)
+{
+   if ( !record_fault )
+        return;
+
+    switch ( fault )
+    {
+    case VTD_FR_IR_SID_ERR:
+    case VTD_FR_IR_IRTE_RSVD:
+    case VTD_FR_IR_ENTRY_P:
+        if ( qinval_fault_disable(*irte) )
+            break;
+    /* fall through */
+    case VTD_FR_IR_INDEX_OVER:
+    case VTD_FR_IR_ROOT_INVAL:
+        /* TODO: handle fault (e.g. record and report this fault to VM */
+        break;
+
+    default:
+        gdprintk(XENLOG_INFO, "Can't handle VT-d fault %x\n", fault);
+    }
+    return;
+}
+
+static bool vvtd_irq_request_sanity_check(const struct vvtd *vvtd,
+                                          struct arch_irq_remapping_request *irq)
+{
+    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
+    {
+        struct IO_APIC_route_remap_entry rte = { .val = irq->msg.rte };
+
+        ASSERT(rte.format);
+        return !!rte.reserved;
+    }
+    else if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
+    {
+        struct msi_msg_remap_entry msi_msg =
+        { .address_lo = { .val = irq->msg.msi.addr } };
+
+        ASSERT(msi_msg.address_lo.format);
+        return 0;
+    }
+    ASSERT_UNREACHABLE();
+
+    return 0;
+}
+
+/*
+ * 'record_fault' is a flag to indicate whether we need recording a fault
+ * and notifying guest when a fault happens during fetching vIRTE.
+ */
+static int vvtd_get_entry(struct vvtd *vvtd,
+                          struct arch_irq_remapping_request *irq,
+                          struct iremap_entry *dest,
+                          bool record_fault)
+{
+    uint32_t entry = irq_remapping_request_index(irq);
+    struct iremap_entry  *irte, *irt_page;
+
+    vvtd_debug("interpret a request with index %x\n", entry);
+
+    if ( vvtd_irq_request_sanity_check(vvtd, irq) )
+    {
+        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_REQ_RSVD, record_fault);
+        return -EINVAL;
+    }
+
+    if ( entry > vvtd->status.irt_max_entry )
+    {
+        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_INDEX_OVER, record_fault);
+        return -EACCES;
+    }
+
+    irt_page = map_guest_page(vvtd->domain,
+                              vvtd->status.irt + (entry >> IREMAP_ENTRY_ORDER));
+    if ( IS_ERR(irt_page) )
+    {
+        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_ROOT_INVAL, record_fault);
+        return PTR_ERR(irt_page);
+    }
+
+    irte = irt_page + (entry % (1 << IREMAP_ENTRY_ORDER));
+    dest->val = irte->val;
+    if ( !qinval_present(*irte) )
+    {
+        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_ENTRY_P, record_fault);
+        unmap_guest_page(irt_page);
+        return -ENOENT;
+    }
+
+    /* Check reserved bits */
+    if ( (irte->remap.res_1 || irte->remap.res_2 || irte->remap.res_3 ||
+          irte->remap.res_4) )
+    {
+        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_IRTE_RSVD, record_fault);
+        unmap_guest_page(irt_page);
+        return -EINVAL;
+    }
+
+    /* FIXME: We don't check against the source ID */
+    unmap_guest_page(irt_page);
+
+    return 0;
+}
+
+static int vvtd_handle_irq_request(struct domain *d,
+                                   struct arch_irq_remapping_request *irq)
+{
+    struct iremap_entry irte;
+    int ret;
+    struct vvtd *vvtd = domain_vvtd(d);
+
+    if ( !vvtd || !vvtd->status.intremap_enabled )
+        return -ENODEV;
+
+    ret = vvtd_get_entry(vvtd, irq, &irte, true);
+    if ( ret )
+        return ret;
+
+    return vvtd_delivery(vvtd->domain, irte.remap.vector,
+                         irte_dest(vvtd, irte.remap.dst),
+                         irte.remap.dm, irte.remap.dlm,
+                         irte.remap.tm);
+}
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = cap_set_num_fault_regs(1ULL) |
@@ -324,7 +585,8 @@ static int vvtd_destroy(struct viommu *viommu)
 
 struct viommu_ops vvtd_hvm_vmx_ops = {
     .create = vvtd_create,
-    .destroy = vvtd_destroy
+    .destroy = vvtd_destroy,
+    .handle_irq_request = vvtd_handle_irq_request
 };
 
 static int vvtd_register(void)
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 16/29] x86/vvtd: decode interrupt attribute from IRTE
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (14 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 14:39   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 17/29] x86/vvtd: add a helper function to decide the interrupt format Lan Tianyu
                   ` (13 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Without interrupt remapping, interrupt attributes can be extracted from
msi message or IOAPIC RTE. However, with interrupt remapping enabled,
the attributes are enclosed in the associated IRTE. This callback is
for cases in which the caller wants to acquire interrupt attributes, for
example:
1. vioapic_get_vector(). With vIOMMU, the RTE may don't contain vector.
2. perform EOI which is always based on the interrupt vector.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
v3:
 - add example cases in which we will use this function.
---
 xen/drivers/passthrough/vtd/vvtd.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 90c00f5..5e22ace 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -516,6 +516,26 @@ static int vvtd_handle_irq_request(struct domain *d,
                          irte.remap.tm);
 }
 
+static int vvtd_get_irq_info(struct domain *d,
+                             struct arch_irq_remapping_request *irq,
+                             struct arch_irq_remapping_info *info)
+{
+    int ret;
+    struct iremap_entry irte;
+    struct vvtd *vvtd = domain_vvtd(d);
+
+    ret = vvtd_get_entry(vvtd, irq, &irte, false);
+    if ( ret )
+        return ret;
+
+    info->vector = irte.remap.vector;
+    info->dest = irte_dest(vvtd, irte.remap.dst);
+    info->dest_mode = irte.remap.dm;
+    info->delivery_mode = irte.remap.dlm;
+
+    return 0;
+}
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = cap_set_num_fault_regs(1ULL) |
@@ -586,7 +606,8 @@ static int vvtd_destroy(struct viommu *viommu)
 struct viommu_ops vvtd_hvm_vmx_ops = {
     .create = vvtd_create,
     .destroy = vvtd_destroy,
-    .handle_irq_request = vvtd_handle_irq_request
+    .handle_irq_request = vvtd_handle_irq_request,
+    .get_irq_info = vvtd_get_irq_info
 };
 
 static int vvtd_register(void)
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 17/29] x86/vvtd: add a helper function to decide the interrupt format
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (15 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 16/29] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 14:43   ` Roger Pau Monné
  2017-09-22  3:01 ` [PATCH V3 18/29] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

Different platform may use different method to distinguish
remapping format interrupt and normal format interrupt.

Intel uses one bit in IOAPIC RTE or MSI address register to
indicate the interrupt is remapping format. vvtd will handle
all the interrupts when .check_irq_remapping() return true.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/vvtd.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 5e22ace..bd1cadd 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -536,6 +536,28 @@ static int vvtd_get_irq_info(struct domain *d,
     return 0;
 }
 
+/* Probe whether the interrupt request is an remapping format */
+static bool vvtd_is_remapping(struct domain *d,
+                              struct arch_irq_remapping_request *irq)
+{
+    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
+    {
+        struct IO_APIC_route_remap_entry rte = { .val = irq->msg.rte };
+
+        return rte.format;
+    }
+    else if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
+    {
+        struct msi_msg_remap_entry msi_msg =
+        { .address_lo = { .val = irq->msg.msi.addr } };
+
+        return msi_msg.address_lo.format;
+    }
+    ASSERT_UNREACHABLE();
+
+    return 0;
+}
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = cap_set_num_fault_regs(1ULL) |
@@ -607,7 +629,8 @@ struct viommu_ops vvtd_hvm_vmx_ops = {
     .create = vvtd_create,
     .destroy = vvtd_destroy,
     .handle_irq_request = vvtd_handle_irq_request,
-    .get_irq_info = vvtd_get_irq_info
+    .get_irq_info = vvtd_get_irq_info,
+    .check_irq_remapping = vvtd_is_remapping
 };
 
 static int vvtd_register(void)
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 18/29] VIOMMU: Add irq request callback to deal with irq remapping
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (16 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 17/29] x86/vvtd: add a helper function to decide the interrupt format Lan Tianyu
@ 2017-09-22  3:01 ` Lan Tianyu
  2017-10-19 15:00   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 19/29] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, chao.gao

This patch is to add irq request callback for platform implementation
to deal with irq remapping request.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/viommu.c          | 15 +++++++++
 xen/include/asm-x86/viommu.h | 72 ++++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/viommu.h     | 11 +++++++
 3 files changed, 98 insertions(+)
 create mode 100644 xen/include/asm-x86/viommu.h

diff --git a/xen/common/viommu.c b/xen/common/viommu.c
index 55feb5d..b517158 100644
--- a/xen/common/viommu.c
+++ b/xen/common/viommu.c
@@ -163,6 +163,21 @@ int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
     return rc;
 }
 
+int viommu_handle_irq_request(struct domain *d,
+                              struct arch_irq_remapping_request *request)
+{
+    struct viommu *viommu = d->viommu;
+
+    if ( !viommu )
+        return -EINVAL;
+
+    ASSERT(viommu->ops);
+    if ( !viommu->ops->handle_irq_request )
+        return -EINVAL;
+
+    return viommu->ops->handle_irq_request(d, request);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
new file mode 100644
index 0000000..366fbb6
--- /dev/null
+++ b/xen/include/asm-x86/viommu.h
@@ -0,0 +1,72 @@
+/*
+ * include/asm-x86/viommu.h
+ *
+ * Copyright (c) 2017 Intel Corporation.
+ * Author: Lan Tianyu <tianyu.lan@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#ifndef __ARCH_X86_VIOMMU_H__
+#define __ARCH_X86_VIOMMU_H__
+
+/* IRQ request type */
+#define VIOMMU_REQUEST_IRQ_MSI          0
+#define VIOMMU_REQUEST_IRQ_APIC         1
+
+struct arch_irq_remapping_request
+{
+    union {
+        /* MSI */
+        struct {
+            uint64_t addr;
+            uint32_t data;
+        } msi;
+        /* Redirection Entry in IOAPIC */
+        uint64_t rte;
+    } msg;
+    uint16_t source_id;
+    uint8_t type;
+};
+
+static inline void irq_request_ioapic_fill(struct arch_irq_remapping_request *req,
+                                           uint32_t ioapic_id, uint64_t rte)
+{
+    ASSERT(req);
+    req->type = VIOMMU_REQUEST_IRQ_APIC;
+    req->source_id = ioapic_id;
+    req->msg.rte = rte;
+}
+
+static inline void irq_request_msi_fill(struct arch_irq_remapping_request *req,
+                                        uint32_t source_id, uint64_t addr,
+                                        uint32_t data)
+{
+    ASSERT(req);
+    req->type = VIOMMU_REQUEST_IRQ_MSI;
+    req->source_id = source_id;
+    req->msg.msi.addr = addr;
+    req->msg.msi.data = data;
+}
+
+#endif /* __ARCH_X86_VIOMMU_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
index baa8ab7..230f6b1 100644
--- a/xen/include/xen/viommu.h
+++ b/xen/include/xen/viommu.h
@@ -21,10 +21,13 @@
 #define __XEN_VIOMMU_H__
 
 struct viommu;
+struct arch_irq_remapping_request;
 
 struct viommu_ops {
     int (*create)(struct domain *d, struct viommu *viommu);
     int (*destroy)(struct viommu *viommu);
+    int (*handle_irq_request)(struct domain *d,
+                              struct arch_irq_remapping_request *request);
 };
 
 struct viommu {
@@ -45,11 +48,19 @@ int viommu_register_type(uint64_t type, struct viommu_ops *ops);
 int viommu_destroy_domain(struct domain *d);
 int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
                   bool_t *need_copy);
+int viommu_handle_irq_request(struct domain *d,
+                              struct arch_irq_remapping_request *request);
 #else
 static inline int viommu_register_type(uint64_t type, struct viommu_ops *ops)
 {
     return -EINVAL;
 }
+static inline int
+viommu_handle_irq_request(struct domain *d,
+                          struct arch_irq_remapping_request *request)
+{
+    return -EINVAL;
+}
 #endif
 
 #endif /* __XEN_VIOMMU_H__ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 19/29] x86/vioapic: Hook interrupt delivery of vIOAPIC
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (17 preceding siblings ...)
  2017-09-22  3:01 ` [PATCH V3 18/29] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-19 15:37   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
                   ` (10 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

When irq remapping is enabled, IOAPIC Redirection Entry may be in remapping
format. If that, generate an irq_remapping_request and call the common
VIOMMU abstraction's callback to handle this interrupt request. Device
model is responsible for checking the request's validity.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
v3:
 - use the new interface to check remapping format.
---
 xen/arch/x86/hvm/vioapic.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 72cae93..5d0d1cd 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -30,6 +30,7 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/viommu.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/vpic.h>
@@ -38,6 +39,7 @@
 #include <asm/current.h>
 #include <asm/event.h>
 #include <asm/io_apic.h>
+#include <asm/viommu.h>
 
 /* HACK: Route IRQ0 only to VCPU0 to prevent time jumps. */
 #define IRQ0_SPECIAL_ROUTING 1
@@ -387,9 +389,17 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
     struct vlapic *target;
     struct vcpu *v;
     unsigned int irq = vioapic->base_gsi + pin;
+    struct arch_irq_remapping_request request;
 
     ASSERT(spin_is_locked(&d->arch.hvm_domain.irq_lock));
 
+    irq_request_ioapic_fill(&request, vioapic->id, vioapic->redirtbl[pin].bits);
+    if ( viommu_check_irq_remapping(d, &request) )
+    {
+        viommu_handle_irq_request(d, &request);
+        return;
+    }
+
     HVM_DBG_LOG(DBG_LEVEL_IOAPIC,
                 "dest=%x dest_mode=%x delivery_mode=%x "
                 "vector=%x trig_mode=%x",
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (18 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 19/29] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-19 15:42   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 21/29] VIOMMU: Introduce callback of checking irq remapping mode Lan Tianyu
                   ` (9 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, chao.gao

This patch is to add get_irq_info callback for platform implementation
to convert irq remapping request to irq info (E,G vector, dest, dest_mode
and so on).

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/viommu.c          | 16 ++++++++++++++++
 xen/include/asm-x86/viommu.h |  8 ++++++++
 xen/include/xen/viommu.h     | 14 ++++++++++++++
 3 files changed, 38 insertions(+)

diff --git a/xen/common/viommu.c b/xen/common/viommu.c
index b517158..0708e43 100644
--- a/xen/common/viommu.c
+++ b/xen/common/viommu.c
@@ -178,6 +178,22 @@ int viommu_handle_irq_request(struct domain *d,
     return viommu->ops->handle_irq_request(d, request);
 }
 
+int viommu_get_irq_info(struct domain *d,
+                        struct arch_irq_remapping_request *request,
+                        struct arch_irq_remapping_info *irq_info)
+{
+    struct viommu *viommu = d->viommu;
+
+    if ( !viommu )
+        return -EINVAL;
+
+    ASSERT(viommu->ops);
+    if ( !viommu->ops->get_irq_info )
+        return -EINVAL;
+
+    return viommu->ops->get_irq_info(d, request, irq_info);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
index 366fbb6..586b6bd 100644
--- a/xen/include/asm-x86/viommu.h
+++ b/xen/include/asm-x86/viommu.h
@@ -24,6 +24,14 @@
 #define VIOMMU_REQUEST_IRQ_MSI          0
 #define VIOMMU_REQUEST_IRQ_APIC         1
 
+struct arch_irq_remapping_info
+{
+    uint8_t  vector;
+    uint32_t dest;
+    uint32_t dest_mode:1;
+    uint32_t delivery_mode:3;
+};
+
 struct arch_irq_remapping_request
 {
     union {
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
index 230f6b1..beb40cd 100644
--- a/xen/include/xen/viommu.h
+++ b/xen/include/xen/viommu.h
@@ -21,6 +21,7 @@
 #define __XEN_VIOMMU_H__
 
 struct viommu;
+struct arch_irq_remapping_info;
 struct arch_irq_remapping_request;
 
 struct viommu_ops {
@@ -28,6 +29,9 @@ struct viommu_ops {
     int (*destroy)(struct viommu *viommu);
     int (*handle_irq_request)(struct domain *d,
                               struct arch_irq_remapping_request *request);
+    int (*get_irq_info)(struct domain *d,
+                        struct arch_irq_remapping_request *request,
+                        struct arch_irq_remapping_info *info);
 };
 
 struct viommu {
@@ -50,6 +54,9 @@ int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
                   bool_t *need_copy);
 int viommu_handle_irq_request(struct domain *d,
                               struct arch_irq_remapping_request *request);
+int viommu_get_irq_info(struct domain *d,
+                        struct arch_irq_remapping_request *request,
+                        struct arch_irq_remapping_info *irq_info);
 #else
 static inline int viommu_register_type(uint64_t type, struct viommu_ops *ops)
 {
@@ -61,6 +68,13 @@ viommu_handle_irq_request(struct domain *d,
 {
     return -EINVAL;
 }
+static inline int
+viommu_get_irq_info(struct domain *d,
+                    struct arch_irq_remapping_request *request,
+                    struct arch_irq_remapping_info *irq_info);
+{
+    return -EINVAL;
+}
 #endif
 
 #endif /* __XEN_VIOMMU_H__ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 21/29] VIOMMU: Introduce callback of checking irq remapping mode
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (19 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-19 15:43   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
                   ` (8 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, chao.gao

This patch is to add callback for vIOAPIC and vMSI to check whether interrupt
remapping is enabled.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/viommu.c      | 15 +++++++++++++++
 xen/include/xen/viommu.h | 10 ++++++++++
 2 files changed, 25 insertions(+)

diff --git a/xen/common/viommu.c b/xen/common/viommu.c
index 0708e43..ff95465 100644
--- a/xen/common/viommu.c
+++ b/xen/common/viommu.c
@@ -194,6 +194,21 @@ int viommu_get_irq_info(struct domain *d,
     return viommu->ops->get_irq_info(d, request, irq_info);
 }
 
+bool viommu_check_irq_remapping(struct domain *d,
+                                struct arch_irq_remapping_request *request)
+{
+    struct viommu *viommu = d->viommu;
+
+    if ( !viommu )
+        return false;
+
+    ASSERT(viommu->ops);
+    if ( !viommu->ops->check_irq_remapping )
+        return false;
+
+    return viommu->ops->check_irq_remapping(d, request);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
index beb40cd..b5ac1e6 100644
--- a/xen/include/xen/viommu.h
+++ b/xen/include/xen/viommu.h
@@ -26,6 +26,8 @@ struct arch_irq_remapping_request;
 
 struct viommu_ops {
     int (*create)(struct domain *d, struct viommu *viommu);
+    bool (*check_irq_remapping)(struct domain *d,
+                                struct arch_irq_remapping_request *request);
     int (*destroy)(struct viommu *viommu);
     int (*handle_irq_request)(struct domain *d,
                               struct arch_irq_remapping_request *request);
@@ -57,6 +59,8 @@ int viommu_handle_irq_request(struct domain *d,
 int viommu_get_irq_info(struct domain *d,
                         struct arch_irq_remapping_request *request,
                         struct arch_irq_remapping_info *irq_info);
+bool viommu_check_irq_remapping(struct domain *d,
+                                struct arch_irq_remapping_request *request);
 #else
 static inline int viommu_register_type(uint64_t type, struct viommu_ops *ops)
 {
@@ -75,6 +79,12 @@ viommu_get_irq_info(struct domain *d,
 {
     return -EINVAL;
 }
+static inline bool
+viommu_check_irq_remapping(struct domain *d,
+                           struct arch_irq_remapping_request *request)
+{
+    return false;
+}
 #endif
 
 #endif /* __XEN_VIOMMU_H__ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (20 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 21/29] VIOMMU: Introduce callback of checking irq remapping mode Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-19 15:49   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 23/29] passthrough: move some fields of hvm_gmsi_info to a sub-structure Lan Tianyu
                   ` (7 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

When IOAPIC RTE is in remapping format, it doesn't contain the vector of
interrupt. For this case, the RTE contains an index of interrupt remapping
table where the vector of interrupt is stored. This patchs gets the vector
through a vIOMMU interface.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/arch/x86/hvm/vioapic.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 5d0d1cd..9e47ef4 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -561,11 +561,25 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
 {
     unsigned int pin;
     const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
+    struct arch_irq_remapping_request request;
 
     if ( !vioapic )
         return -EINVAL;
 
-    return vioapic->redirtbl[pin].fields.vector;
+    irq_request_ioapic_fill(&request, vioapic->id, vioapic->redirtbl[pin].bits);
+    if ( viommu_check_irq_remapping(vioapic->domain, &request) )
+    {
+        int err;
+        struct arch_irq_remapping_info info;
+
+        err = viommu_get_irq_info(vioapic->domain, &request, &info);
+        return !err ? info.vector : err;
+    }
+    else
+    {
+        return vioapic->redirtbl[pin].fields.vector;
+    }
+
 }
 
 int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 23/29] passthrough: move some fields of hvm_gmsi_info to a sub-structure
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (21 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-09-22  3:02 ` [PATCH V3 24/29] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

No functional change. It is a preparation for introducing new fields in
hvm_gmsi_info to manage remapping format msi bound to a physical msi.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/arch/x86/hvm/vmsi.c       |  4 ++--
 xen/drivers/passthrough/io.c  | 34 ++++++++++++++++++----------------
 xen/include/asm-x86/hvm/irq.h |  8 ++++++--
 3 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 9b35e9b..7f21853 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -101,8 +101,8 @@ int vmsi_deliver(
 
 void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 {
-    uint32_t flags = pirq_dpci->gmsi.gflags;
-    int vector = pirq_dpci->gmsi.gvec;
+    uint32_t flags = pirq_dpci->gmsi.legacy.gflags;
+    int vector = pirq_dpci->gmsi.legacy.gvec;
     uint8_t dest = (uint8_t)flags;
     bool dest_mode = flags & XEN_DOMCTL_VMSI_X86_DM_MASK;
     uint8_t delivery_mode = MASK_EXTR(flags, XEN_DOMCTL_VMSI_X86_DELIV_MASK);
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index ec9f41a..fb44223 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -350,8 +350,8 @@ int pt_irq_create_bind(
         {
             pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED | HVM_IRQ_DPCI_MACH_MSI |
                                HVM_IRQ_DPCI_GUEST_MSI;
-            pirq_dpci->gmsi.gvec = pt_irq_bind->u.msi.gvec;
-            pirq_dpci->gmsi.gflags = gflags;
+            pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
+            pirq_dpci->gmsi.legacy.gflags = gflags;
             /*
              * 'pt_irq_create_bind' can be called after 'pt_irq_destroy_bind'.
              * The 'pirq_cleanup_check' which would free the structure is only
@@ -383,8 +383,8 @@ int pt_irq_create_bind(
             }
             if ( unlikely(rc) )
             {
-                pirq_dpci->gmsi.gflags = 0;
-                pirq_dpci->gmsi.gvec = 0;
+                pirq_dpci->gmsi.legacy.gflags = 0;
+                pirq_dpci->gmsi.legacy.gvec = 0;
                 pirq_dpci->dom = NULL;
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
@@ -403,21 +403,22 @@ int pt_irq_create_bind(
             }
 
             /* If pirq is already mapped as vmsi, update guest data/addr. */
-            if ( pirq_dpci->gmsi.gvec != pt_irq_bind->u.msi.gvec ||
-                 pirq_dpci->gmsi.gflags != gflags )
+            if ( pirq_dpci->gmsi.legacy.gvec != pt_irq_bind->u.msi.gvec ||
+                 pirq_dpci->gmsi.legacy.gflags != gflags )
             {
                 /* Directly clear pending EOIs before enabling new MSI info. */
                 pirq_guest_eoi(info);
 
-                pirq_dpci->gmsi.gvec = pt_irq_bind->u.msi.gvec;
-                pirq_dpci->gmsi.gflags = gflags;
+        }
+                pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
+                pirq_dpci->gmsi.legacy.gflags = gflags;
             }
         }
         /* Calculate dest_vcpu_id for MSI-type pirq migration. */
-        dest = MASK_EXTR(pirq_dpci->gmsi.gflags,
+        dest = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
                          XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
-        dest_mode = pirq_dpci->gmsi.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
-        delivery_mode = MASK_EXTR(pirq_dpci->gmsi.gflags,
+        dest_mode = pirq_dpci->gmsi.legacy.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
+        delivery_mode = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
                                   XEN_DOMCTL_VMSI_X86_DELIV_MASK);
 
         dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
@@ -430,7 +431,7 @@ int pt_irq_create_bind(
         {
             if ( delivery_mode == dest_LowestPrio )
                 vcpu = vector_hashing_dest(d, dest, dest_mode,
-                                           pirq_dpci->gmsi.gvec);
+                                           pirq_dpci->gmsi.legacy.gvec);
             if ( vcpu )
                 pirq_dpci->gmsi.posted = true;
         }
@@ -440,7 +441,7 @@ int pt_irq_create_bind(
         /* Use interrupt posting if it is supported. */
         if ( iommu_intpost )
             pi_update_irte(vcpu ? &vcpu->arch.hvm_vmx.pi_desc : NULL,
-                           info, pirq_dpci->gmsi.gvec);
+                           info, pirq_dpci->gmsi.legacy.gvec);
 
         if ( pt_irq_bind->u.msi.gflags & XEN_DOMCTL_VMSI_X86_UNMASKED )
         {
@@ -835,11 +836,12 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
     int vector = (long)arg;
 
     if ( (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
-         (pirq_dpci->gmsi.gvec == vector) )
+         (pirq_dpci->gmsi.legacy.gvec == vector) )
     {
-        unsigned int dest = MASK_EXTR(pirq_dpci->gmsi.gflags,
+        unsigned int dest = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
                                       XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
-        bool dest_mode = pirq_dpci->gmsi.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
+        bool dest_mode = pirq_dpci->gmsi.legacy.gflags &
+                         XEN_DOMCTL_VMSI_X86_DM_MASK;
 
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
                                dest_mode) )
diff --git a/xen/include/asm-x86/hvm/irq.h b/xen/include/asm-x86/hvm/irq.h
index 3b6b4bd..bd8a918 100644
--- a/xen/include/asm-x86/hvm/irq.h
+++ b/xen/include/asm-x86/hvm/irq.h
@@ -132,8 +132,12 @@ struct dev_intx_gsi_link {
 #define HVM_IRQ_DPCI_TRANSLATE       (1u << _HVM_IRQ_DPCI_TRANSLATE_SHIFT)
 
 struct hvm_gmsi_info {
-    uint32_t gvec;
-    uint32_t gflags;
+    union {
+        struct {
+            uint32_t gvec;
+            uint32_t gflags;
+        } legacy;
+    };
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
     bool posted; /* directly deliver to guest via VT-d PI? */
 };
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 24/29] tools/libxc: Add a new interface to bind remapping format msi with pirq
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (22 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 23/29] passthrough: move some fields of hvm_gmsi_info to a sub-structure Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-19 16:03   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 25/29] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
                   ` (5 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, sstabellini, wei.liu2, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich, roger.pau, Chao Gao

From: Chao Gao <chao.gao@intel.com>

When exposing vIOMMU (vvtd) to guest, guest can configure the msi to
remapping format. For pass-through device, the physical interrupt now
can be bound with remapping format msi. This patch introduce a flag,
HVM_IRQ_DPCI_GUEST_REMAPPED, which indicate a physical interrupt is
bound with remapping format guest interrupt. Thus, we can use
(HVM_IRQ_DPCI_GUEST_REMAPPED | HVM_IRQ_DPCI_GUEST_MSI) to show the new
binding type. Also provide an new interface to manage the new binding.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

---
v3:
 - Introduce a new flag HVM_IRQ_DPCI_GUEST_REMAPPED
 - Remove the flag HVM_IRQ_DPCI_GUEST_MSI_IR
---
 tools/libxc/include/xenctrl.h |  17 +++++
 tools/libxc/xc_domain.c       |  53 +++++++++++++++
 xen/drivers/passthrough/io.c  | 155 +++++++++++++++++++++++++++++++++++-------
 xen/include/asm-x86/hvm/irq.h |   7 ++
 xen/include/public/domctl.h   |   7 ++
 5 files changed, 216 insertions(+), 23 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index bedca1f..1a17974 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1720,6 +1720,15 @@ int xc_domain_ioport_mapping(xc_interface *xch,
                              uint32_t nr_ports,
                              uint32_t add_mapping);
 
+int xc_domain_update_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr,
+    uint64_t gtable);
+
 int xc_domain_update_msi_irq(
     xc_interface *xch,
     uint32_t domid,
@@ -1734,6 +1743,14 @@ int xc_domain_unbind_msi_irq(xc_interface *xch,
                              uint32_t pirq,
                              uint32_t gflags);
 
+int xc_domain_unbind_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr);
+
 int xc_domain_bind_pt_irq(xc_interface *xch,
                           uint32_t domid,
                           uint8_t machine_irq,
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 3bab4e8..4b6a510 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1702,8 +1702,34 @@ int xc_deassign_dt_device(
     return rc;
 }
 
+int xc_domain_update_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr,
+    uint64_t gtable)
+{
+    int rc;
+    xen_domctl_bind_pt_irq_t *bind;
+
+    DECLARE_DOMCTL;
 
+    domctl.cmd = XEN_DOMCTL_bind_pt_irq;
+    domctl.domain = (domid_t)domid;
 
+    bind = &(domctl.u.bind_pt_irq);
+    bind->irq_type = PT_IRQ_TYPE_MSI_IR;
+    bind->machine_irq = pirq;
+    bind->u.msi_ir.source_id = source_id;
+    bind->u.msi_ir.data = data;
+    bind->u.msi_ir.addr = addr;
+    bind->u.msi_ir.gtable = gtable;
+
+    rc = do_domctl(xch, &domctl);
+    return rc;
+}
 
 int xc_domain_update_msi_irq(
     xc_interface *xch,
@@ -1732,6 +1758,33 @@ int xc_domain_update_msi_irq(
     return rc;
 }
 
+int xc_domain_unbind_msi_irq_remapping(
+    xc_interface *xch,
+    uint32_t domid,
+    uint32_t pirq,
+    uint32_t source_id,
+    uint32_t data,
+    uint64_t addr)
+{
+    int rc;
+    xen_domctl_bind_pt_irq_t *bind;
+
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_unbind_pt_irq;
+    domctl.domain = (domid_t)domid;
+
+    bind = &(domctl.u.bind_pt_irq);
+    bind->irq_type = PT_IRQ_TYPE_MSI_IR;
+    bind->machine_irq = pirq;
+    bind->u.msi_ir.source_id = source_id;
+    bind->u.msi_ir.data = data;
+    bind->u.msi_ir.addr = addr;
+
+    rc = do_domctl(xch, &domctl);
+    return rc;
+}
+
 int xc_domain_unbind_msi_irq(
     xc_interface *xch,
     uint32_t domid,
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index fb44223..6196334 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -21,9 +21,11 @@
 #include <xen/iommu.h>
 #include <xen/cpu.h>
 #include <xen/irq.h>
+#include <xen/viommu.h>
 #include <asm/hvm/irq.h>
 #include <asm/hvm/support.h>
 #include <asm/io_apic.h>
+#include <asm/viommu.h>
 
 static DEFINE_PER_CPU(struct list_head, dpci_list);
 
@@ -275,6 +277,106 @@ static struct vcpu *vector_hashing_dest(const struct domain *d,
     return dest;
 }
 
+static void set_hvm_gmsi_info(struct hvm_gmsi_info *msi,
+                              xen_domctl_bind_pt_irq_t *pt_irq_bind)
+{
+    switch (pt_irq_bind->irq_type)
+    {
+    case PT_IRQ_TYPE_MSI:
+        msi->legacy.gvec = pt_irq_bind->u.msi.gvec;
+        msi->legacy.gflags = pt_irq_bind->u.msi.gflags;
+        break;
+
+    case PT_IRQ_TYPE_MSI_IR:
+        msi->intremap.source_id = pt_irq_bind->u.msi_ir.source_id;
+        msi->intremap.data = pt_irq_bind->u.msi_ir.data;
+        msi->intremap.addr = pt_irq_bind->u.msi_ir.addr;
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+    }
+}
+
+static void clear_hvm_gmsi_info(struct hvm_gmsi_info *msi, int irq_type)
+{
+    switch (irq_type)
+    {
+    case PT_IRQ_TYPE_MSI:
+        msi->legacy.gvec = 0;
+        msi->legacy.gflags = 0;
+        break;
+
+    case PT_IRQ_TYPE_MSI_IR:
+        msi->intremap.source_id = 0;
+        msi->intremap.data = 0;
+        msi->intremap.addr = 0;
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+    }
+}
+
+static bool hvm_gmsi_info_need_update(struct hvm_gmsi_info *msi,
+                                      xen_domctl_bind_pt_irq_t *pt_irq_bind)
+{
+    switch (pt_irq_bind->irq_type)
+    {
+    case PT_IRQ_TYPE_MSI:
+        return ((msi->legacy.gvec != pt_irq_bind->u.msi.gvec) ||
+                (msi->legacy.gflags != pt_irq_bind->u.msi.gflags));
+
+    case PT_IRQ_TYPE_MSI_IR:
+        return ((msi->intremap.source_id != pt_irq_bind->u.msi_ir.source_id) ||
+                (msi->intremap.data != pt_irq_bind->u.msi_ir.data) ||
+                (msi->intremap.addr != pt_irq_bind->u.msi_ir.addr));
+
+    default:
+        ASSERT_UNREACHABLE();
+    }
+
+    return 0;
+}
+
+static int pirq_dpci_2_msi_attr(struct domain *d,
+                                struct hvm_pirq_dpci *pirq_dpci, uint8_t *gvec,
+                                uint32_t *dest, bool *dm, uint8_t *dlm)
+{
+    int rc = 0;
+
+    if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_REMAPPED )
+    {
+        struct arch_irq_remapping_request request;
+        struct arch_irq_remapping_info irq_info;
+
+        irq_request_msi_fill(&request, pirq_dpci->gmsi.intremap.source_id,
+                             pirq_dpci->gmsi.intremap.addr,
+                             pirq_dpci->gmsi.intremap.data);
+        rc = viommu_get_irq_info(d, &request, &irq_info);
+        if ( rc )
+            return rc;
+
+        *gvec = irq_info.vector;
+        *dest = irq_info.dest;
+        *dm = irq_info.dest_mode;
+        *dlm = irq_info.delivery_mode;
+    }
+    else if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI )
+    {
+        *gvec = pirq_dpci->gmsi.legacy.gvec;
+        *dest = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
+                          XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
+        *dm = pirq_dpci->gmsi.legacy.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
+        *dlm = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
+                         XEN_DOMCTL_VMSI_X86_DELIV_MASK);
+    }
+    else
+        ASSERT_UNREACHABLE();
+
+    return rc;
+}
+
 int pt_irq_create_bind(
     struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
 {
@@ -338,20 +440,24 @@ int pt_irq_create_bind(
     switch ( pt_irq_bind->irq_type )
     {
     case PT_IRQ_TYPE_MSI:
+    case PT_IRQ_TYPE_MSI_IR:
     {
-        uint8_t dest, delivery_mode;
+        uint8_t delivery_mode, gvec;
+        uint32_t dest;
         bool dest_mode;
         int dest_vcpu_id;
         const struct vcpu *vcpu;
-        uint32_t gflags = pt_irq_bind->u.msi.gflags &
-                          ~XEN_DOMCTL_VMSI_X86_UNMASKED;
+        bool ir = (pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_IR);
+        uint64_t gtable = ir ? pt_irq_bind->u.msi_ir.gtable :
+                          pt_irq_bind->u.msi.gtable;
 
         if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
         {
             pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED | HVM_IRQ_DPCI_MACH_MSI |
                                HVM_IRQ_DPCI_GUEST_MSI;
-            pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
-            pirq_dpci->gmsi.legacy.gflags = gflags;
+            if ( ir )
+                pirq_dpci->flags |= HVM_IRQ_DPCI_GUEST_REMAPPED;
+            set_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind);
             /*
              * 'pt_irq_create_bind' can be called after 'pt_irq_destroy_bind'.
              * The 'pirq_cleanup_check' which would free the structure is only
@@ -366,9 +472,9 @@ int pt_irq_create_bind(
             pirq_dpci->dom = d;
             /* bind after hvm_irq_dpci is setup to avoid race with irq handler*/
             rc = pirq_guest_bind(d->vcpu[0], info, 0);
-            if ( rc == 0 && pt_irq_bind->u.msi.gtable )
+            if ( rc == 0 && gtable )
             {
-                rc = msixtbl_pt_register(d, info, pt_irq_bind->u.msi.gtable);
+                rc = msixtbl_pt_register(d, info, gtable);
                 if ( unlikely(rc) )
                 {
                     pirq_guest_unbind(d, info);
@@ -383,8 +489,7 @@ int pt_irq_create_bind(
             }
             if ( unlikely(rc) )
             {
-                pirq_dpci->gmsi.legacy.gflags = 0;
-                pirq_dpci->gmsi.legacy.gvec = 0;
+                clear_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind->irq_type);
                 pirq_dpci->dom = NULL;
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
@@ -396,6 +501,9 @@ int pt_irq_create_bind(
         {
             uint32_t mask = HVM_IRQ_DPCI_MACH_MSI | HVM_IRQ_DPCI_GUEST_MSI;
 
+            if ( ir )
+                mask |= HVM_IRQ_DPCI_GUEST_REMAPPED;
+
             if ( (pirq_dpci->flags & mask) != mask )
             {
                 spin_unlock(&d->event_lock);
@@ -403,31 +511,30 @@ int pt_irq_create_bind(
             }
 
             /* If pirq is already mapped as vmsi, update guest data/addr. */
-            if ( pirq_dpci->gmsi.legacy.gvec != pt_irq_bind->u.msi.gvec ||
-                 pirq_dpci->gmsi.legacy.gflags != gflags )
+            if ( hvm_gmsi_info_need_update(&pirq_dpci->gmsi, pt_irq_bind) )
             {
                 /* Directly clear pending EOIs before enabling new MSI info. */
                 pirq_guest_eoi(info);
 
-        }
-                pirq_dpci->gmsi.legacy.gvec = pt_irq_bind->u.msi.gvec;
-                pirq_dpci->gmsi.legacy.gflags = gflags;
+                set_hvm_gmsi_info(&pirq_dpci->gmsi, pt_irq_bind);
             }
         }
         /* Calculate dest_vcpu_id for MSI-type pirq migration. */
-        dest = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
-                         XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
-        dest_mode = pirq_dpci->gmsi.legacy.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
-        delivery_mode = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
-                                  XEN_DOMCTL_VMSI_X86_DELIV_MASK);
-
+        rc = pirq_dpci_2_msi_attr(d, pirq_dpci, &gvec, &dest, &dest_mode,
+                                  &delivery_mode);
+        if ( unlikely(rc) )
+        {
+            spin_unlock(&d->event_lock);
+            return rc;
+        }
         dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
         pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
         spin_unlock(&d->event_lock);
 
         pirq_dpci->gmsi.posted = false;
         vcpu = (dest_vcpu_id >= 0) ? d->vcpu[dest_vcpu_id] : NULL;
-        if ( iommu_intpost )
+        /* FIXME: won't use interrupt posting for guest's remapping MSIs */
+        if ( iommu_intpost && !ir )
         {
             if ( delivery_mode == dest_LowestPrio )
                 vcpu = vector_hashing_dest(d, dest, dest_mode,
@@ -439,7 +546,7 @@ int pt_irq_create_bind(
             hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
 
         /* Use interrupt posting if it is supported. */
-        if ( iommu_intpost )
+        if ( iommu_intpost && !ir )
             pi_update_irte(vcpu ? &vcpu->arch.hvm_vmx.pi_desc : NULL,
                            info, pirq_dpci->gmsi.legacy.gvec);
 
@@ -646,6 +753,7 @@ int pt_irq_destroy_bind(
         }
         break;
     case PT_IRQ_TYPE_MSI:
+    case PT_IRQ_TYPE_MSI_IR:
         break;
     default:
         return -EOPNOTSUPP;
@@ -664,7 +772,8 @@ int pt_irq_destroy_bind(
     pirq = pirq_info(d, machine_gsi);
     pirq_dpci = pirq_dpci(pirq);
 
-    if ( hvm_irq_dpci && pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI )
+    if ( hvm_irq_dpci && pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI &&
+         pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI_IR )
     {
         unsigned int bus = pt_irq_bind->u.pci.bus;
         unsigned int device = pt_irq_bind->u.pci.device;
diff --git a/xen/include/asm-x86/hvm/irq.h b/xen/include/asm-x86/hvm/irq.h
index bd8a918..4f5d37b 100644
--- a/xen/include/asm-x86/hvm/irq.h
+++ b/xen/include/asm-x86/hvm/irq.h
@@ -121,6 +121,7 @@ struct dev_intx_gsi_link {
 #define _HVM_IRQ_DPCI_GUEST_PCI_SHIFT           4
 #define _HVM_IRQ_DPCI_GUEST_MSI_SHIFT           5
 #define _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT        6
+#define _HVM_IRQ_DPCI_GUEST_REMAPPED_SHIFT      7
 #define _HVM_IRQ_DPCI_TRANSLATE_SHIFT          15
 #define HVM_IRQ_DPCI_MACH_PCI        (1u << _HVM_IRQ_DPCI_MACH_PCI_SHIFT)
 #define HVM_IRQ_DPCI_MACH_MSI        (1u << _HVM_IRQ_DPCI_MACH_MSI_SHIFT)
@@ -128,6 +129,7 @@ struct dev_intx_gsi_link {
 #define HVM_IRQ_DPCI_EOI_LATCH       (1u << _HVM_IRQ_DPCI_EOI_LATCH_SHIFT)
 #define HVM_IRQ_DPCI_GUEST_PCI       (1u << _HVM_IRQ_DPCI_GUEST_PCI_SHIFT)
 #define HVM_IRQ_DPCI_GUEST_MSI       (1u << _HVM_IRQ_DPCI_GUEST_MSI_SHIFT)
+#define HVM_IRQ_DPCI_GUEST_REMAPPED  (1u << _HVM_IRQ_DPCI_GUEST_REMAPPED_SHIFT)
 #define HVM_IRQ_DPCI_IDENTITY_GSI    (1u << _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT)
 #define HVM_IRQ_DPCI_TRANSLATE       (1u << _HVM_IRQ_DPCI_TRANSLATE_SHIFT)
 
@@ -137,6 +139,11 @@ struct hvm_gmsi_info {
             uint32_t gvec;
             uint32_t gflags;
         } legacy;
+        struct {
+            uint32_t source_id;
+            uint32_t data;
+            uint64_t addr;
+        } intremap;
     };
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
     bool posted; /* directly deliver to guest via VT-d PI? */
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 68854b6..8c59cfc 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -559,6 +559,7 @@ typedef enum pt_irq_type_e {
     PT_IRQ_TYPE_MSI,
     PT_IRQ_TYPE_MSI_TRANSLATE,
     PT_IRQ_TYPE_SPI,    /* ARM: valid range 32-1019 */
+    PT_IRQ_TYPE_MSI_IR,
 } pt_irq_type_t;
 struct xen_domctl_bind_pt_irq {
     uint32_t machine_irq;
@@ -586,6 +587,12 @@ struct xen_domctl_bind_pt_irq {
             uint64_aligned_t gtable;
         } msi;
         struct {
+            uint32_t source_id;
+            uint32_t data;
+            uint64_t addr;
+            uint64_t gtable;
+        } msi_ir;
+        struct {
             uint16_t spi;
         } spi;
     } u;
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 25/29] x86/vmsi: Hook delivering remapping format msi to guest
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (23 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 24/29] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-19 16:07   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults Lan Tianyu
                   ` (4 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, jbeulich, Chao Gao, roger.pau

From: Chao Gao <chao.gao@intel.com>

In two situations, hypervisor delivers a msi to a hvm guest. One is
when qemu sends a request to hypervisor through XEN_DMOP_inject_msi.
The other is when a physical interrupt arrives and it has been bound
to a guest msi.

For the former, the msi is routed to common vIOMMU layer if it is in
remapping format. For the latter, if the pt irq is bound to a guest
remapping msi, a new remapping msi is constructed based on the binding
information and routed to common vIOMMU layer.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/arch/x86/hvm/irq.c       |  7 +++++++
 xen/arch/x86/hvm/vmsi.c      | 14 +++++++++++++-
 xen/drivers/passthrough/io.c | 21 ++++++++++-----------
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index e425df9..e99ba7d 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -23,9 +23,11 @@
 #include <xen/sched.h>
 #include <xen/irq.h>
 #include <xen/keyhandler.h>
+#include <xen/viommu.h>
 #include <asm/hvm/domain.h>
 #include <asm/hvm/support.h>
 #include <asm/msi.h>
+#include <asm/viommu.h>
 
 /* Must be called with hvm_domain->irq_lock hold */
 static void assert_gsi(struct domain *d, unsigned ioapic_gsi)
@@ -339,6 +341,11 @@ int hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
     uint8_t trig_mode = (data & MSI_DATA_TRIGGER_MASK)
         >> MSI_DATA_TRIGGER_SHIFT;
     uint8_t vector = data & MSI_DATA_VECTOR_MASK;
+    struct arch_irq_remapping_request request;
+
+    irq_request_msi_fill(&request, 0, addr, data);
+    if ( viommu_check_irq_remapping(d, &request) )
+        return viommu_handle_irq_request(d, &request);
 
     if ( !vector )
     {
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 7f21853..1244df1 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -31,6 +31,7 @@
 #include <xen/errno.h>
 #include <xen/sched.h>
 #include <xen/irq.h>
+#include <xen/viommu.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/vpic.h>
@@ -39,6 +40,7 @@
 #include <asm/current.h>
 #include <asm/event.h>
 #include <asm/io_apic.h>
+#include <asm/viommu.h>
 
 static void vmsi_inj_irq(
     struct vlapic *target,
@@ -115,7 +117,17 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 
     ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
 
-    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
+    if ( pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_REMAPPED )
+    {
+        struct arch_irq_remapping_request request;
+
+        irq_request_msi_fill(&request, pirq_dpci->gmsi.intremap.source_id,
+                             pirq_dpci->gmsi.intremap.addr,
+                             pirq_dpci->gmsi.intremap.data);
+        viommu_handle_irq_request(d, &request);
+    }
+    else
+        vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
 }
 
 /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 6196334..349a8cf 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -942,21 +942,20 @@ static void __msi_pirq_eoi(struct hvm_pirq_dpci *pirq_dpci)
 static int _hvm_dpci_msi_eoi(struct domain *d,
                              struct hvm_pirq_dpci *pirq_dpci, void *arg)
 {
-    int vector = (long)arg;
+    uint8_t vector, dlm, vector_target = (long)arg;
+    uint32_t dest;
+    bool dm;
 
-    if ( (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
-         (pirq_dpci->gmsi.legacy.gvec == vector) )
+    if ( pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI )
     {
-        unsigned int dest = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
-                                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
-        bool dest_mode = pirq_dpci->gmsi.legacy.gflags &
-                         XEN_DOMCTL_VMSI_X86_DM_MASK;
+        if ( pirq_dpci_2_msi_attr(d, pirq_dpci, &vector, &dest, &dm, &dlm) )
+            return 0;
 
-        if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
-                               dest_mode) )
+        if ( vector == vector_target &&
+             vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest, dm) )
         {
-            __msi_pirq_eoi(pirq_dpci);
-            return 1;
+                __msi_pirq_eoi(pirq_dpci);
+                return 1;
         }
     }
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (24 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 25/29] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-19 16:31   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 27/29] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
                   ` (3 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, jbeulich, Chao Gao, roger.pau

From: Chao Gao <chao.gao@intel.com>

Interrupt translation faults are non-recoverable fault. When faults
are triggered, it needs to populate fault info to Fault Recording
Registers and inject vIOMMU msi interrupt to notify guest IOMMU driver
to deal with faults.

This patch emulates hardware's handling interrupt translation
faults (more information about the process can be found in VT-d spec,
chipter "Translation Faults", section "Non-Recoverable Fault
Reporting" and section "Non-Recoverable Logging").
Specifically, viommu_record_fault() records the fault information and
viommu_report_non_recoverable_fault() reports faults to software.
Currently, only Primary Fault Logging is supported and the Number of
Fault-recording Registers is 1.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  60 +++++++--
 xen/drivers/passthrough/vtd/vvtd.c  | 252 +++++++++++++++++++++++++++++++++++-
 2 files changed, 301 insertions(+), 11 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 790384f..e19b045 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -198,26 +198,66 @@
 #define DMA_CCMD_CAIG_MASK(x) (((u64)x) & ((u64) 0x3 << 59))
 
 /* FECTL_REG */
-#define DMA_FECTL_IM (((u64)1) << 31)
+#define DMA_FECTL_IM_SHIFT 31
+#define DMA_FECTL_IM (1U << DMA_FECTL_IM_SHIFT)
+#define DMA_FECTL_IP_SHIFT 30
+#define DMA_FECTL_IP (1U << DMA_FECTL_IP_SHIFT)
 
 /* FSTS_REG */
-#define DMA_FSTS_PFO ((u64)1 << 0)
-#define DMA_FSTS_PPF ((u64)1 << 1)
-#define DMA_FSTS_AFO ((u64)1 << 2)
-#define DMA_FSTS_APF ((u64)1 << 3)
-#define DMA_FSTS_IQE ((u64)1 << 4)
-#define DMA_FSTS_ICE ((u64)1 << 5)
-#define DMA_FSTS_ITE ((u64)1 << 6)
-#define DMA_FSTS_FAULTS    DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE
+#define DMA_FSTS_PFO_SHIFT 0
+#define DMA_FSTS_PFO (1U << DMA_FSTS_PFO_SHIFT)
+#define DMA_FSTS_PPF_SHIFT 1
+#define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_SHIFT)
+#define DMA_FSTS_AFO (1U << 2)
+#define DMA_FSTS_APF (1U << 3)
+#define DMA_FSTS_IQE (1U << 4)
+#define DMA_FSTS_ICE (1U << 5)
+#define DMA_FSTS_ITE (1U << 6)
+#define DMA_FSTS_PRO_SHIFT 7
+#define DMA_FSTS_PRO (1U << DMA_FSTS_PRO_SHIFT)
+#define DMA_FSTS_FAULTS    (DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | \
+                            DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | \
+                            DMA_FSTS_ITE | DMA_FSTS_PRO)
+#define DMA_FSTS_RW1CS     (DMA_FSTS_PFO | DMA_FSTS_AFO | DMA_FSTS_APF | \
+                            DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE | \
+                            DMA_FSTS_PRO)
 #define dma_fsts_fault_record_index(s) (((s) >> 8) & 0xff)
 
 /* FRCD_REG, 32 bits access */
-#define DMA_FRCD_F (((u64)1) << 31)
+#define DMA_FRCD_LEN            0x10
+#define DMA_FRCD2_OFFSET        0x8
+#define DMA_FRCD3_OFFSET        0xc
+#define DMA_FRCD_F_SHIFT        31
+#define DMA_FRCD_F ((u64)1 << DMA_FRCD_F_SHIFT)
 #define dma_frcd_type(d) ((d >> 30) & 1)
 #define dma_frcd_fault_reason(c) (c & 0xff)
 #define dma_frcd_source_id(c) (c & 0xffff)
 #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
 
+struct vtd_fault_record_register
+{
+    union {
+        struct {
+            uint64_t lo;
+            uint64_t hi;
+        } bits;
+        struct {
+            uint64_t rsvd0          :12,
+                     fault_info     :52;
+            uint64_t source_id      :16,
+                     rsvd1          :9,
+                     pmr            :1,  /* Privilege Mode Requested */
+                     exe            :1,  /* Execute Permission Requested */
+                     pasid_p        :1,  /* PASID Present */
+                     fault_reason   :8,  /* Fault Reason */
+                     pasid_val      :20, /* PASID Value */
+                     addr_type      :2,  /* Address Type */
+                     type           :1,  /* Type. (0) Write (1) Read/AtomicOp */
+                     fault          :1;  /* Fault */
+        } fields;
+    };
+};
+
 enum VTD_FAULT_TYPE
 {
     /* Interrupt remapping transition faults */
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index bd1cadd..745941c 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -19,6 +19,7 @@
  */
 
 #include <xen/domain_page.h>
+#include <xen/lib.h>
 #include <xen/sched.h>
 #include <xen/types.h>
 #include <xen/viommu.h>
@@ -41,6 +42,7 @@ unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
 struct hvm_hw_vvtd_status {
     uint32_t eim_enabled : 1,
              intremap_enabled : 1;
+    uint32_t fault_index;
     uint32_t irt_max_entry;
     /* Interrupt remapping table base gfn */
     uint64_t irt;
@@ -86,6 +88,22 @@ struct vvtd *domain_vvtd(struct domain *d)
     return (d->viommu) ? d->viommu->priv : NULL;
 }
 
+static inline int vvtd_test_and_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
+{
+    return test_and_set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
+}
+
+static inline int vvtd_test_and_clear_bit(struct vvtd *vvtd, uint32_t reg,
+                                          int nr)
+{
+    return test_and_clear_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
+}
+
+static inline int vvtd_test_bit(struct vvtd *vvtd, uint32_t reg, int nr)
+{
+    return test_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
+}
+
 static inline void vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
 {
     __set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
@@ -206,6 +224,23 @@ static int vvtd_delivery(struct domain *d, uint8_t vector,
     return 0;
 }
 
+void vvtd_generate_interrupt(const struct vvtd *vvtd, uint32_t addr,
+                             uint32_t data)
+{
+    uint8_t dest, dm, dlm, tm, vector;
+
+    vvtd_debug("Sending interrupt %x %x to d%d",
+               addr, data, vvtd->domain->domain_id);
+
+    dest = MASK_EXTR(addr, MSI_ADDR_DEST_ID_MASK);
+    dm = !!(addr & MSI_ADDR_DESTMODE_MASK);
+    dlm = MASK_EXTR(data, MSI_DATA_DELIVERY_MODE_MASK);
+    tm = MASK_EXTR(data, MSI_DATA_TRIGGER_MASK);
+    vector = data & MSI_DATA_VECTOR_MASK;
+
+    vvtd_delivery(vvtd->domain, vector, dest, dm, dlm, tm);
+}
+
 static uint32_t irq_remapping_request_index(
     const struct arch_irq_remapping_request *irq)
 {
@@ -243,6 +278,207 @@ static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
            MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
 }
 
+static void vvtd_report_non_recoverable_fault(struct vvtd *vvtd, int reason)
+{
+    uint32_t fsts;
+
+    fsts = vvtd_get_reg(vvtd, DMAR_FSTS_REG);
+    vvtd_set_bit(vvtd, DMAR_FSTS_REG, reason);
+
+    /*
+     * Accoroding to VT-d spec "Non-Recoverable Fault Event" chapter, if
+     * there are any previously reported interrupt conditions that are yet to
+     * be sevices by software, the Fault Event interrrupt is not generated.
+     */
+    if ( fsts & DMA_FSTS_FAULTS )
+        return;
+
+    vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
+    if ( !vvtd_test_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_SHIFT) )
+    {
+        uint32_t fe_data, fe_addr;
+        fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
+        fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);
+        vvtd_generate_interrupt(vvtd, fe_addr, fe_data);
+        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
+    }
+}
+
+static void vvtd_update_ppf(struct vvtd *vvtd)
+{
+    int i;
+    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
+    unsigned int base = cap_fault_reg_offset(cap);
+
+    for ( i = 0; i < cap_num_fault_regs(cap); i++ )
+    {
+        if ( vvtd_test_bit(vvtd, base + i * DMA_FRCD_LEN + DMA_FRCD3_OFFSET,
+                           DMA_FRCD_F_SHIFT) )
+        {
+            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PPF_SHIFT);
+            return;
+        }
+    }
+    /*
+     * No Primary Fault is in Fault Record Registers, thus clear PPF bit in
+     * FSTS.
+     */
+    vvtd_clear_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PPF_SHIFT);
+
+    /* If no fault is in FSTS, clear pending bit in FECTL. */
+    if ( !(vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS) )
+        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
+}
+
+/*
+ * Commit a fault to emulated Fault Record Registers.
+ */
+static void vvtd_commit_frcd(struct vvtd *vvtd, int idx,
+                             struct vtd_fault_record_register *frcd)
+{
+    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
+    unsigned int base = cap_fault_reg_offset(cap);
+
+    vvtd_set_reg_quad(vvtd, base + idx * DMA_FRCD_LEN, frcd->bits.lo);
+    vvtd_set_reg_quad(vvtd, base + idx * DMA_FRCD_LEN + 8, frcd->bits.hi);
+    vvtd_update_ppf(vvtd);
+}
+
+/*
+ * Allocate a FRCD for the caller. If success, return the FRI. Or, return -1
+ * when failure.
+ */
+static int vvtd_alloc_frcd(struct vvtd *vvtd)
+{
+    int prev;
+    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
+    unsigned int base = cap_fault_reg_offset(cap);
+
+    /* Set the F bit to indicate the FRCD is in use. */
+    if ( !vvtd_test_and_set_bit(vvtd,
+                                base + vvtd->status.fault_index * DMA_FRCD_LEN +
+                                DMA_FRCD3_OFFSET, DMA_FRCD_F_SHIFT) )
+    {
+        prev = vvtd->status.fault_index;
+        vvtd->status.fault_index = (prev + 1) % cap_num_fault_regs(cap);
+        return vvtd->status.fault_index;
+    }
+    return -ENOMEM;
+}
+
+static void vvtd_free_frcd(struct vvtd *vvtd, int i)
+{
+    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
+    unsigned int base = cap_fault_reg_offset(cap);
+
+    vvtd_clear_bit(vvtd, base + i * DMA_FRCD_LEN + DMA_FRCD3_OFFSET,
+                   DMA_FRCD_F_SHIFT);
+}
+
+static int vvtd_record_fault(struct vvtd *vvtd,
+                             struct arch_irq_remapping_request *request,
+                             int reason)
+{
+    struct vtd_fault_record_register frcd;
+    int fault_index;
+
+    switch(reason)
+    {
+    case VTD_FR_IR_REQ_RSVD:
+    case VTD_FR_IR_INDEX_OVER:
+    case VTD_FR_IR_ENTRY_P:
+    case VTD_FR_IR_ROOT_INVAL:
+    case VTD_FR_IR_IRTE_RSVD:
+    case VTD_FR_IR_REQ_COMPAT:
+    case VTD_FR_IR_SID_ERR:
+        if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PFO_SHIFT) )
+            return X86EMUL_OKAY;
+
+        /* No available Fault Record means Fault overflowed */
+        fault_index = vvtd_alloc_frcd(vvtd);
+        if ( fault_index == -1 )
+        {
+            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PFO_SHIFT);
+            return X86EMUL_OKAY;
+        }
+        memset(&frcd, 0, sizeof(frcd));
+        frcd.fields.fault_reason = (uint8_t)reason;
+        frcd.fields.fault_info = ((uint64_t)irq_remapping_request_index(request)) << 36;
+        frcd.fields.source_id = (uint16_t)request->source_id;
+        frcd.fields.fault = 1;
+        vvtd_commit_frcd(vvtd, fault_index, &frcd);
+        return X86EMUL_OKAY;
+
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
+
+    gdprintk(XENLOG_ERR, "Can't handle vVTD Fault (reason 0x%x).", reason);
+    domain_crash(vvtd->domain);
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
+{
+    /* Writing a 1 means clear fault */
+    if ( val & DMA_FRCD_F )
+    {
+        vvtd_free_frcd(vvtd, 0);
+        vvtd_update_ppf(vvtd);
+    }
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
+{
+    /*
+     * Only DMA_FECTL_IM bit is writable. Generate pending event when unmask.
+     */
+    if ( !(val & DMA_FECTL_IM) )
+    {
+        /* Clear IM */
+        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_SHIFT);
+        if ( vvtd_test_and_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT) )
+        {
+            uint32_t fe_data, fe_addr;
+
+            fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
+            fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);
+            vvtd_generate_interrupt(vvtd, fe_addr, fe_data);
+        }
+    }
+    else
+        vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_SHIFT);
+
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
+{
+    int i, max_fault_index = DMA_FSTS_PRO_SHIFT;
+    uint64_t bits_to_clear = val & DMA_FSTS_RW1CS;
+
+    if ( bits_to_clear )
+    {
+        i = find_first_bit(&bits_to_clear, max_fault_index / 8 + 1);
+        while ( i <= max_fault_index )
+        {
+            vvtd_clear_bit(vvtd, DMAR_FSTS_REG, i);
+            i = find_next_bit(&bits_to_clear, max_fault_index / 8 + 1, i + 1);
+        }
+    }
+
+    /*
+     * Clear IP field when all status fields in the Fault Status Register
+     * being clear.
+     */
+    if ( !((vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS)) )
+        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
+
+    return X86EMUL_OKAY;
+}
+
 static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
 {
     vvtd_info("%sable Interrupt Remapping",
@@ -336,7 +572,9 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
                       unsigned int len, unsigned long val)
 {
     struct vvtd *vvtd = domain_vvtd(v->domain);
+    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
     unsigned int offset = addr - vvtd->base_addr;
+    unsigned int fault_offset = cap_fault_reg_offset(cap);
 
     vvtd_info("Write offset %x len %d val %lx\n", offset, len, val);
 
@@ -350,6 +588,12 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
         case DMAR_GCMD_REG:
             return vvtd_write_gcmd(vvtd, val);
 
+        case DMAR_FSTS_REG:
+            return vvtd_write_fsts(vvtd, val);
+
+        case DMAR_FECTL_REG:
+            return vvtd_write_fectl(vvtd, val);
+
         case DMAR_IEDATA_REG:
         case DMAR_IEADDR_REG:
         case DMAR_IEUADDR_REG:
@@ -362,6 +606,9 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             break;
 
         default:
+            if ( offset == fault_offset + DMA_FRCD3_OFFSET )
+                return vvtd_write_frcd3(vvtd, val);
+
             break;
         }
     }
@@ -374,6 +621,9 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             break;
 
         default:
+            if ( offset == fault_offset + DMA_FRCD2_OFFSET )
+                return vvtd_write_frcd3(vvtd, val >> 32);
+
             break;
         }
     }
@@ -406,7 +656,7 @@ static void vvtd_handle_fault(struct vvtd *vvtd,
     /* fall through */
     case VTD_FR_IR_INDEX_OVER:
     case VTD_FR_IR_ROOT_INVAL:
-        /* TODO: handle fault (e.g. record and report this fault to VM */
+        vvtd_record_fault(vvtd, irq, fault);
         break;
 
     default:
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 27/29] x86/vvtd: Enable Queued Invalidation through GCMD
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (25 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-20 10:30   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
                   ` (2 subsequent siblings)
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, jbeulich, Chao Gao, roger.pau

From: Chao Gao <chao.gao@intel.com>

Software writes to QIE field of GCMD to enable or disable queued
invalidations. This patch emulates QIE field of GCMD.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  3 ++-
 xen/drivers/passthrough/vtd/vvtd.c  | 17 +++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index e19b045..c69cd21 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -162,7 +162,8 @@
 #define DMA_GSTS_FLS    (((u64)1) << 29)
 #define DMA_GSTS_AFLS   (((u64)1) << 28)
 #define DMA_GSTS_WBFS   (((u64)1) << 27)
-#define DMA_GSTS_QIES   (((u64)1) <<26)
+#define DMA_GSTS_QIES_SHIFT     26
+#define DMA_GSTS_QIES   (((u64)1) << DMA_GSTS_QIES_SHIFT)
 #define DMA_GSTS_IRES_SHIFT     25
 #define DMA_GSTS_IRES   (((u64)1) << DMA_GSTS_IRES_SHIFT)
 #define DMA_GSTS_SIRTPS_SHIFT   24
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 745941c..55f7a46 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -496,6 +496,19 @@ static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
     }
 }
 
+static void vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)
+{
+    vvtd_info("%sable Queue Invalidation", (val & DMA_GCMD_QIE) ? "En" : "Dis");
+
+    if ( val & DMA_GCMD_QIE )
+        vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_SHIFT);
+    else
+    {
+        vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, 0);
+        vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_SHIFT);
+    }
+}
+
 static void vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
 {
     uint64_t irta = vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG);
@@ -535,6 +548,10 @@ static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
         vvtd_handle_gcmd_sirtp(vvtd, val);
     if ( changed & DMA_GCMD_IRE )
         vvtd_handle_gcmd_ire(vvtd, val);
+    if ( changed & DMA_GCMD_QIE )
+        vvtd_handle_gcmd_qie(vvtd, val);
+    if ( changed & ~(DMA_GCMD_SIRTP | DMA_GCMD_IRE | DMA_GCMD_QIE) )
+        vvtd_info("Only SIRTP, IRE, QIE in GCMD are handled");
 
     return X86EMUL_OKAY;
 }
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (26 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 27/29] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-20 11:20   ` Roger Pau Monné
  2017-09-22  3:02 ` [PATCH V3 29/29] x86/vvtd: save and restore emulated VT-d Lan Tianyu
  2017-10-20 11:36 ` [PATCH V3 00/29] Roger Pau Monné
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, jbeulich, Chao Gao, roger.pau

From: Chao Gao <chao.gao@intel.com>

Queued Invalidation Interface is an expanded invalidation interface with
extended capabilities. Hardware implementations report support for queued
invalidation interface through the Extended Capability Register. The queued
invalidation interface uses an Invalidation Queue (IQ), which is a circular
buffer in system memory. Software submits commands by writing Invalidation
Descriptors to the IQ.

In this patch, a new function viommu_process_iq() is used for emulating how
hardware handles invalidation requests through QI.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.h |  19 ++-
 xen/drivers/passthrough/vtd/vvtd.c  | 232 ++++++++++++++++++++++++++++++++++++
 2 files changed, 250 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index c69cd21..c2b83f1 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -177,6 +177,21 @@
 #define DMA_IRTA_S(val)         (val & 0xf)
 #define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))
 
+/* IQA_REG */
+#define DMA_IQA_ADDR(val)       (val & ~0xfffULL)
+#define DMA_IQA_QS(val)         (val & 0x7)
+#define DMA_IQA_RSVD            0xff8ULL
+
+/* IECTL_REG */
+#define DMA_IECTL_IM_SHIFT 31
+#define DMA_IECTL_IM            (1 << DMA_IECTL_IM_SHIFT)
+#define DMA_IECTL_IP_SHIFT 30
+#define DMA_IECTL_IP            (1 << DMA_IECTL_IP_SHIFT)
+
+/* ICS_REG */
+#define DMA_ICS_IWC_SHIFT       0
+#define DMA_ICS_IWC             (1 << DMA_ICS_IWC_SHIFT)
+
 /* PMEN_REG */
 #define DMA_PMEN_EPM    (((u32)1) << 31)
 #define DMA_PMEN_PRS    (((u32)1) << 0)
@@ -211,7 +226,8 @@
 #define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_SHIFT)
 #define DMA_FSTS_AFO (1U << 2)
 #define DMA_FSTS_APF (1U << 3)
-#define DMA_FSTS_IQE (1U << 4)
+#define DMA_FSTS_IQE_SHIFT 4
+#define DMA_FSTS_IQE (1U << DMA_FSTS_IQE_SHIFT)
 #define DMA_FSTS_ICE (1U << 5)
 #define DMA_FSTS_ITE (1U << 6)
 #define DMA_FSTS_PRO_SHIFT 7
@@ -562,6 +578,7 @@ struct qinval_entry {
 
 /* Queue invalidation head/tail shift */
 #define QINVAL_INDEX_SHIFT 4
+#define QINVAL_INDEX_MASK  0x7fff0ULL
 
 #define qinval_present(v) ((v).lo & 1)
 #define qinval_fault_disable(v) (((v).lo >> 1) & 1)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 55f7a46..668d0c9 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -28,6 +28,7 @@
 #include <asm/current.h>
 #include <asm/event.h>
 #include <asm/hvm/domain.h>
+#include <asm/hvm/support.h>
 #include <asm/io_apic.h>
 #include <asm/page.h>
 #include <asm/p2m.h>
@@ -419,6 +420,177 @@ static int vvtd_record_fault(struct vvtd *vvtd,
     return X86EMUL_OKAY;
 }
 
+/*
+ * Process a invalidation descriptor. Currently, only two types descriptors,
+ * Interrupt Entry Cache Invalidation Descritor and Invalidation Wait
+ * Descriptor are handled.
+ * @vvtd: the virtual vtd instance
+ * @i: the index of the invalidation descriptor to be processed
+ *
+ * If success return 0, or return non-zero when failure.
+ */
+static int process_iqe(struct vvtd *vvtd, int i)
+{
+    uint64_t iqa;
+    struct qinval_entry *qinval_page;
+    int ret = 0;
+
+    iqa = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
+    qinval_page = map_guest_page(vvtd->domain, DMA_IQA_ADDR(iqa)>>PAGE_SHIFT);
+    if ( IS_ERR(qinval_page) )
+    {
+        gdprintk(XENLOG_ERR, "Can't map guest IRT (rc %ld)",
+                 PTR_ERR(qinval_page));
+        return PTR_ERR(qinval_page);
+    }
+
+    switch ( qinval_page[i].q.inv_wait_dsc.lo.type )
+    {
+    case TYPE_INVAL_WAIT:
+        if ( qinval_page[i].q.inv_wait_dsc.lo.sw )
+        {
+            uint32_t data = qinval_page[i].q.inv_wait_dsc.lo.sdata;
+            uint64_t addr = (qinval_page[i].q.inv_wait_dsc.hi.saddr << 2);
+
+            ret = hvm_copy_to_guest_phys(addr, &data, sizeof(data), current);
+            if ( ret )
+                vvtd_info("Failed to write status address");
+        }
+
+        /*
+         * The following code generates an invalidation completion event
+         * indicating the invalidation wait descriptor completion. Note that
+         * the following code fragment is not tested properly.
+         */
+        if ( qinval_page[i].q.inv_wait_dsc.lo.iflag )
+        {
+            uint32_t ie_data, ie_addr;
+            if ( !vvtd_test_and_set_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_SHIFT) )
+            {
+                vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
+                if ( !vvtd_test_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT) )
+                {
+                    ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
+                    ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
+                    vvtd_generate_interrupt(vvtd, ie_addr, ie_data);
+                    vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
+                }
+            }
+        }
+        break;
+
+    case TYPE_INVAL_IEC:
+        /*
+         * Currently, no cache is preserved in hypervisor. Only need to update
+         * pIRTEs which are modified in binding process.
+         */
+        break;
+
+    default:
+        goto error;
+    }
+
+    unmap_guest_page((void*)qinval_page);
+    return ret;
+
+ error:
+    unmap_guest_page((void*)qinval_page);
+    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
+    domain_crash(vvtd->domain);
+    return ret;
+}
+
+/*
+ * Invalidate all the descriptors in Invalidation Queue.
+ */
+static void vvtd_process_iq(struct vvtd *vvtd)
+{
+    uint64_t iqh, iqt, iqa, max_entry, i;
+    int err = 0;
+
+    /*
+     * No new descriptor is fetched from the Invalidation Queue until
+     * software clears the IQE field in the Fault Status Register
+     */
+    if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_SHIFT) )
+        return;
+
+    iqh = vvtd_get_reg_quad(vvtd, DMAR_IQH_REG);
+    iqt = vvtd_get_reg_quad(vvtd, DMAR_IQT_REG);
+    iqa = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
+
+    max_entry = 1 << (QINVAL_ENTRY_ORDER + DMA_IQA_QS(iqa));
+    iqh = MASK_EXTR(iqh, QINVAL_INDEX_MASK);
+    iqt = MASK_EXTR(iqt, QINVAL_INDEX_MASK);
+
+    ASSERT(iqt < max_entry);
+    if ( iqh == iqt )
+        return;
+
+    for ( i = iqh; i != iqt; i = (i + 1) % max_entry )
+    {
+        err = process_iqe(vvtd, i);
+        if ( err )
+            break;
+    }
+    vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, i << QINVAL_INDEX_SHIFT);
+
+    /*
+     * When IQE set, IQH references the desriptor associated with the error.
+     */
+    if ( err )
+        vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_IQE_SHIFT);
+}
+
+static int vvtd_write_iqt(struct vvtd *vvtd, unsigned long val)
+{
+    uint64_t max_entry, iqa = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
+
+    if ( val & ~QINVAL_INDEX_MASK )
+    {
+        vvtd_info("Attempt to set reserved bits in Invalidation Queue Tail");
+        return X86EMUL_OKAY;
+    }
+
+    max_entry = 1 << (QINVAL_ENTRY_ORDER + DMA_IQA_QS(iqa));
+    if ( MASK_EXTR(val, QINVAL_INDEX_MASK) >= max_entry )
+    {
+        vvtd_info("IQT: Value %lx exceeded supported max index.", val);
+        return X86EMUL_OKAY;
+    }
+
+    vvtd_set_reg_quad(vvtd, DMAR_IQT_REG, val);
+    vvtd_process_iq(vvtd);
+
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_iqa(struct vvtd *vvtd, unsigned long val)
+{
+    uint32_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
+    unsigned int guest_max_addr_width = cap_mgaw(cap);
+
+    if ( val & (~((1ULL << guest_max_addr_width) - 1) | DMA_IQA_RSVD) )
+    {
+        vvtd_info("Attempt to set reserved bits in Invalidation Queue Address");
+        return X86EMUL_OKAY;
+    }
+
+    vvtd_set_reg_quad(vvtd, DMAR_IQA_REG, val);
+    return X86EMUL_OKAY;
+}
+
+static int vvtd_write_ics(struct vvtd *vvtd, uint32_t val)
+{
+    if ( val & DMA_ICS_IWC )
+    {
+        vvtd_clear_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_SHIFT);
+        /*When IWC field is cleared, the IP field needs to be cleared */
+        vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
+    }
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
 {
     /* Writing a 1 means clear fault */
@@ -430,6 +602,30 @@ static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
     return X86EMUL_OKAY;
 }
 
+static int vvtd_write_iectl(struct vvtd *vvtd, uint32_t val)
+{
+    /*
+     * Only DMA_IECTL_IM bit is writable. Generate pending event when unmask.
+     */
+    if ( !(val & DMA_IECTL_IM) )
+    {
+        /* Clear IM and clear IP */
+        vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT);
+        if ( vvtd_test_and_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT) )
+        {
+            uint32_t ie_data, ie_addr;
+
+            ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
+            ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
+            vvtd_generate_interrupt(vvtd, ie_addr, ie_data);
+        }
+    }
+    else
+        vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT);
+
+    return X86EMUL_OKAY;
+}
+
 static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
 {
     /*
@@ -476,6 +672,10 @@ static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
     if ( !((vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS)) )
         vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
 
+    /* Continue to deal invalidation when IQE is clear */
+    if ( !vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_SHIFT) )
+        vvtd_process_iq(vvtd);
+
     return X86EMUL_OKAY;
 }
 
@@ -611,6 +811,32 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
         case DMAR_FECTL_REG:
             return vvtd_write_fectl(vvtd, val);
 
+        case DMAR_IECTL_REG:
+            return vvtd_write_iectl(vvtd, val);
+
+        case DMAR_ICS_REG:
+            return vvtd_write_ics(vvtd, val);
+
+        case DMAR_IQT_REG:
+            return vvtd_write_iqt(vvtd, (uint32_t)val);
+
+        case DMAR_IQA_REG:
+        {
+            uint32_t iqa_hi;
+
+            iqa_hi = vvtd_get_reg(vvtd, DMAR_IQA_REG_HI);
+            return vvtd_write_iqa(vvtd,
+                                 (uint32_t)val | ((uint64_t)iqa_hi << 32));
+        }
+
+        case DMAR_IQA_REG_HI:
+        {
+            uint32_t iqa_lo;
+
+            iqa_lo = vvtd_get_reg(vvtd, DMAR_IQA_REG);
+            return vvtd_write_iqa(vvtd, (val << 32) | iqa_lo);
+        }
+
         case DMAR_IEDATA_REG:
         case DMAR_IEADDR_REG:
         case DMAR_IEUADDR_REG:
@@ -637,6 +863,12 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
             vvtd_set_reg_quad(vvtd, DMAR_IRTA_REG, val);
             break;
 
+        case DMAR_IQT_REG:
+            return vvtd_write_iqt(vvtd, val);
+
+        case DMAR_IQA_REG:
+            return vvtd_write_iqa(vvtd, val);
+
         default:
             if ( offset == fault_offset + DMA_FRCD2_OFFSET )
                 return vvtd_write_frcd3(vvtd, val >> 32);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH V3 29/29] x86/vvtd: save and restore emulated VT-d
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (27 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
@ 2017-09-22  3:02 ` Lan Tianyu
  2017-10-20 11:25   ` Roger Pau Monné
  2017-10-20 11:36 ` [PATCH V3 00/29] Roger Pau Monné
  29 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-09-22  3:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Lan Tianyu, kevin.tian, andrew.cooper3, jbeulich, Chao Gao, roger.pau

From: Chao Gao <chao.gao@intel.com>

Provide a save-restore pair to save/restore registers and non-register
status.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
v3:
 - use one entry to save both vvtd registers and other intermediate
 state
---
 xen/drivers/passthrough/vtd/vvtd.c     | 66 ++++++++++++++++++++++++++--------
 xen/include/public/arch-x86/hvm/save.h | 25 ++++++++++++-
 2 files changed, 76 insertions(+), 15 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index 668d0c9..2aecd93 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -28,11 +28,13 @@
 #include <asm/current.h>
 #include <asm/event.h>
 #include <asm/hvm/domain.h>
+#include <asm/hvm/save.h>
 #include <asm/hvm/support.h>
 #include <asm/io_apic.h>
 #include <asm/page.h>
 #include <asm/p2m.h>
 #include <asm/viommu.h>
+#include <public/hvm/save.h>
 
 #include "iommu.h"
 #include "vtd.h"
@@ -40,20 +42,6 @@
 /* Supported capabilities by vvtd */
 unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
 
-struct hvm_hw_vvtd_status {
-    uint32_t eim_enabled : 1,
-             intremap_enabled : 1;
-    uint32_t fault_index;
-    uint32_t irt_max_entry;
-    /* Interrupt remapping table base gfn */
-    uint64_t irt;
-};
-
-union hvm_hw_vvtd_regs {
-    uint32_t data32[256];
-    uint64_t data64[128];
-};
-
 struct vvtd {
     /* Address range of remapping hardware register-set */
     uint64_t base_addr;
@@ -1057,6 +1045,56 @@ static bool vvtd_is_remapping(struct domain *d,
     return 0;
 }
 
+static int vvtd_load(struct domain *d, hvm_domain_context_t *h)
+{
+    struct hvm_hw_vvtd *hw_vvtd;
+
+    if ( !domain_vvtd(d) )
+        return -ENODEV;
+
+    hw_vvtd = xmalloc(struct hvm_hw_vvtd);
+    if ( !hw_vvtd )
+        return -ENOMEM;
+
+    if ( hvm_load_entry(VVTD, h, hw_vvtd) )
+    {
+        xfree(hw_vvtd);
+        return -EINVAL;
+    }
+
+    memcpy(&domain_vvtd(d)->status, &hw_vvtd->status,
+           sizeof(struct hvm_hw_vvtd_status));
+    memcpy(domain_vvtd(d)->regs, &hw_vvtd->regs,
+           sizeof(union hvm_hw_vvtd_regs));
+    xfree(hw_vvtd);
+
+    return 0;
+}
+
+static int vvtd_save(struct domain *d, hvm_domain_context_t *h)
+{
+    struct hvm_hw_vvtd *hw_vvtd;
+    int ret;
+
+    if ( !domain_vvtd(d) )
+        return 0;
+
+    hw_vvtd = xmalloc(struct hvm_hw_vvtd);
+    if ( !hw_vvtd )
+        return -ENOMEM;
+
+    memcpy(&hw_vvtd->status, &domain_vvtd(d)->status,
+           sizeof(struct hvm_hw_vvtd_status));
+    memcpy(&hw_vvtd->regs, domain_vvtd(d)->regs,
+           sizeof(union hvm_hw_vvtd_regs));
+    ret = hvm_save_entry(VVTD, 0, h, hw_vvtd);
+    xfree(hw_vvtd);
+
+    return ret;
+}
+
+HVM_REGISTER_SAVE_RESTORE(VVTD, vvtd_save, vvtd_load, 1, HVMSR_PER_DOM);
+
 static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
 {
     uint64_t cap = cap_set_num_fault_regs(1ULL) |
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index fd7bf3f..181abb2 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -639,10 +639,33 @@ struct hvm_msr {
 
 #define CPU_MSR_CODE  20
 
+union hvm_hw_vvtd_regs {
+    uint32_t data32[256];
+    uint64_t data64[128];
+};
+
+struct hvm_hw_vvtd_status
+{
+    uint32_t eim_enabled : 1,
+             intremap_enabled : 1;
+    uint32_t fault_index;
+    uint32_t irt_max_entry;
+    /* Interrupt remapping table base gfn */
+    uint64_t irt;
+};
+
+struct hvm_hw_vvtd
+{
+    union hvm_hw_vvtd_regs regs;
+    struct hvm_hw_vvtd_status status;
+};
+
+DECLARE_HVM_SAVE_TYPE(VVTD, 21, struct hvm_hw_vvtd);
+
 /* 
  * Largest type-code in use
  */
-#define HVM_SAVE_CODE_MAX 20
+#define HVM_SAVE_CODE_MAX 21
 
 #endif /* __XEN_PUBLIC_HVM_SAVE_X86_H__ */
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
  2017-09-22  3:01 ` [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
@ 2017-10-18 13:26   ` Roger Pau Monné
  2017-10-19  2:26     ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-18 13:26 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
> This patch is to add Xen virtual IOMMU doc to introduce motivation,
> framework, vIOMMU hypercall and xl configuration.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  docs/misc/viommu.txt | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 136 insertions(+)
>  create mode 100644 docs/misc/viommu.txt
> 
> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
> new file mode 100644
> index 0000000..348e8c4
> --- /dev/null
> +++ b/docs/misc/viommu.txt
> @@ -0,0 +1,136 @@
> +Xen virtual IOMMU
> +
> +Motivation
> +==========
> +Enable more than 128 vcpu support
> +
> +The current requirements of HPC cloud service requires VM with a high
> +number of CPUs in order to achieve high performance in parallel
> +computing.
> +
> +To support >128 vcpus, X2APIC mode in guest is necessary because legacy
> +APIC(XAPIC) just supports 8-bit APIC ID. The APIC ID used by Xen is
> +CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last one available
> +in xAPIC mode) and so it only can support 128 vcpus at most. x2APIC mode
> +supports 32-bit APIC ID and it requires the interrupt remapping functionality
> +of a vIOMMU if the guest wishes to route interrupts to all available vCPUs
> +
> +The reason for this is that there is no modification for existing PCI MSI
> +and IOAPIC when introduce X2APIC.

I'm not sure the above sentence makes much sense. IMHO I would just
remove it.

> PCI MSI/IOAPIC can only send interrupt
> +message containing 8-bit APIC ID, which cannot address cpus with >254
> +APIC ID. Interrupt remapping supports 32-bit APIC ID and so it's necessary
> +for >128 vcpus support.
> +
> +
> +vIOMMU Architecture
> +===================
> +vIOMMU device model is inside Xen hypervisor for following factors
> +    1) Avoid round trips between Qemu and Xen hypervisor
> +    2) Ease of integration with the rest of hypervisor
> +    3) HVMlite/PVH doesn't use Qemu

Just use PVH here, HVMlite == PVH now.

> +
> +* Interrupt remapping overview.
> +Interrupts from virtual devices and physical devices are delivered
> +to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
> +this procedure.
> +
> ++---------------------------------------------------+
> +|Qemu                       |VM                     |
> +|                           | +----------------+    |
> +|                           | |  Device driver |    |
> +|                           | +--------+-------+    |
> +|                           |          ^            |
> +|       +----------------+  | +--------+-------+    |
> +|       | Virtual device |  | |  IRQ subsystem |    |
> +|       +-------+--------+  | +--------+-------+    |
> +|               |           |          ^            |
> +|               |           |          |            |
> ++---------------------------+-----------------------+
> +|hypervisor     |                      | VIRQ       |
> +|               |            +---------+--------+   |
> +|               |            |      vLAPIC      |   |
> +|               |VIRQ        +---------+--------+   |
> +|               |                      ^            |
> +|               |                      |            |
> +|               |            +---------+--------+   |
> +|               |            |      vIOMMU      |   |
> +|               |            +---------+--------+   |
> +|               |                      ^            |
> +|               |                      |            |
> +|               |            +---------+--------+   |
> +|               |            |   vIOAPIC/vMSI   |   |
> +|               |            +----+----+--------+   |
> +|               |                 ^    ^            |
> +|               +-----------------+    |            |
> +|                                      |            |
> ++---------------------------------------------------+
> +HW                                     |IRQ
> +                                +-------------------+
> +                                |   PCI Device      |
> +                                +-------------------+
> +
> +
> +vIOMMU hypercall
> +================
> +Introduce a new domctl hypercall "xen_domctl_viommu_op" to create/destroy
> +vIOMMUs.
> +
> +* vIOMMU hypercall parameter structure
> +
> +/* vIOMMU type - specify vendor vIOMMU device model */
> +#define VIOMMU_TYPE_INTEL_VTD	       0
> +
> +/* vIOMMU capabilities */
> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> +
> +struct xen_domctl_viommu_op {
> +    uint32_t cmd;
> +#define XEN_DOMCTL_create_viommu          0
> +#define XEN_DOMCTL_destroy_viommu         1

I would invert the order of the domctl names:

#define XEN_DOMCTL_viommu_create          0
#define XEN_DOMCTL_viommu_destroy         1

It's clearer if the operation is the last part of the name.

> +    union {
> +        struct {
> +            /* IN - vIOMMU type  */
> +            uint64_t viommu_type;

Hm, do we really need a uint64_t for the IOMMU type? A uint8_t should
be more that enough (256 different IOMMU implementations).

> +            /* IN - MMIO base address of vIOMMU. */
> +            uint64_t base_address;
> +            /* IN - Capabilities with which we want to create */
> +            uint64_t capabilities;
> +            /* OUT - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } create_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } destroy_viommu;

Do you really need the destroy operation? Do we expect to hot-unplug
vIOMMUs? Otherwise vIOMMUs should be removed when the domain is
destroyed.

> +    } u;
> +};
> +
> +- XEN_DOMCTL_create_viommu
> +    Create vIOMMU device with vIOMMU_type, capabilities and MMIO base
> +address. Hypervisor allocates viommu_id for new vIOMMU instance and return
> +back. The vIOMMU device model in hypervisor should check whether it can
> +support the input capabilities and return error if not.
> +
> +- XEN_DOMCTL_destroy_viommu
> +    Destroy vIOMMU in Xen hypervisor with viommu_id as parameter.
> +
> +These vIOMMU domctl and vIOMMU option in configure file consider multi-vIOMMU
> +support for single VM.(e.g, parameters of create/destroy vIOMMU includes
> +vIOMMU id). But function implementation only supports one vIOMMU per VM so far.
> +
> +Xen hypervisor vIOMMU command
> +=============================
> +Introduce vIOMMU command "viommu=1" to enable vIOMMU function in hypervisor.
> +It's default disabled.

Hm, I'm not sure we really need this. At the end viommu will be
disabled by default for guests, unless explicitly enabled in the
config file.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-09-22  3:01 ` [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance Lan Tianyu
@ 2017-10-18 14:05   ` Roger Pau Monné
  2017-10-19  6:31     ` Lan Tianyu
  2017-10-30  1:51     ` Lan Tianyu
  0 siblings, 2 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-18 14:05 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Sep 21, 2017 at 11:01:43PM -0400, Lan Tianyu wrote:
> This patch is to introduce an abstract layer for arch vIOMMU implementation
> to deal with requests from dom0. Arch vIOMMU code needs to provide callback
> to do create and destroy operation.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  docs/misc/xen-command-line.markdown |   7 ++
>  xen/arch/x86/Kconfig                |   1 +
>  xen/common/Kconfig                  |   3 +
>  xen/common/Makefile                 |   1 +
>  xen/common/domain.c                 |   4 +
>  xen/common/viommu.c                 | 144 ++++++++++++++++++++++++++++++++++++
>  xen/include/xen/sched.h             |   8 ++
>  xen/include/xen/viommu.h            |  63 ++++++++++++++++
>  8 files changed, 231 insertions(+)
>  create mode 100644 xen/common/viommu.c
>  create mode 100644 xen/include/xen/viommu.h
> 
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 9797c8d..dfd1db5 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1825,3 +1825,10 @@ mode.
>  > Default: `true`
>  
>  Permit use of the `xsave/xrstor` instructions.
> +
> +### viommu
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Permit use of viommu interface to create and destroy viommu device model.
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 30c2769..1f1de96 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -23,6 +23,7 @@ config X86
>  	select HAS_PDX
>  	select NUMA
>  	select VGA
> +	select VIOMMU
>  
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index dc8e876..2ad2c8d 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -49,6 +49,9 @@ config HAS_CHECKPOLICY
>  	string
>  	option env="XEN_HAS_CHECKPOLICY"
>  
> +config VIOMMU
> +	bool
> +
>  config KEXEC
>  	bool "kexec support"
>  	default y
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 39e2614..da32f71 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -56,6 +56,7 @@ obj-y += time.o
>  obj-y += timer.o
>  obj-y += trace.o
>  obj-y += version.o
> +obj-$(CONFIG_VIOMMU) += viommu.o
>  obj-y += virtual_region.o
>  obj-y += vm_event.o
>  obj-y += vmap.o
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 5aebcf2..cdb1c9d 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -814,6 +814,10 @@ static void complete_domain_destroy(struct rcu_head *head)
>  
>      sched_destroy_domain(d);
>  
> +#ifdef CONFIG_VIOMMU
> +    viommu_destroy_domain(d);
> +#endif
> +
>      /* Free page used by xen oprofile buffer. */
>  #ifdef CONFIG_XENOPROF
>      free_xenoprof_pages(d);
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> new file mode 100644
> index 0000000..64d91e6
> --- /dev/null
> +++ b/xen/common/viommu.c
> @@ -0,0 +1,144 @@
> +/*
> + * common/viommu.c
> + *
> + * Copyright (c) 2017 Intel Corporation
> + * Author: Lan Tianyu <tianyu.lan@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/sched.h>
> +#include <xen/spinlock.h>
> +#include <xen/types.h>
> +#include <xen/viommu.h>
> +
> +bool __read_mostly opt_viommu;
> +boolean_param("viommu", opt_viommu);
> +
> +static DEFINE_SPINLOCK(type_list_lock);
> +static LIST_HEAD(type_list);
> +
> +struct viommu_type {
> +    uint64_t type;

The comment I've made about type being uint64_t in the other patch
stands here.

> +    struct viommu_ops *ops;
> +    struct list_head node;
> +};
> +
> +int viommu_destroy_domain(struct domain *d)
> +{
> +    int ret;
> +
> +    if ( !d->viommu )
> +        return -EINVAL;

ENODEV would be better.

> +
> +    ret = d->viommu->ops->destroy(d->viommu);
> +    if ( ret < 0 )
> +        return ret;
> +
> +    xfree(d->viommu);
> +    d->viommu = NULL;

Newline preferably.

> +    return 0;
> +}
> +
> +static struct viommu_type *viommu_get_type(uint64_t type)
> +{
> +    struct viommu_type *viommu_type = NULL;
> +
> +    spin_lock(&type_list_lock);
> +    list_for_each_entry( viommu_type, &type_list, node )
> +    {
> +        if ( viommu_type->type == type )
> +        {
> +            spin_unlock(&type_list_lock);
> +            return viommu_type;
> +        }
> +    }
> +    spin_unlock(&type_list_lock);

Why do you need a lock here, and a list at all?

AFAICT vIOMMU types will never be added at runtime.

> +
> +    return NULL;
> +}
> +
> +int viommu_register_type(uint64_t type, struct viommu_ops *ops)
> +{
> +    struct viommu_type *viommu_type = NULL;
> +
> +    if ( !viommu_enabled() )
> +        return -ENODEV;
> +
> +    if ( viommu_get_type(type) )
> +        return -EEXIST;
> +
> +    viommu_type = xzalloc(struct viommu_type);
> +    if ( !viommu_type )
> +        return -ENOMEM;
> +
> +    viommu_type->type = type;
> +    viommu_type->ops = ops;
> +
> +    spin_lock(&type_list_lock);
> +    list_add_tail(&viommu_type->node, &type_list);
> +    spin_unlock(&type_list_lock);
> +
> +    return 0;
> +}

As mentioned above, I think this viommu_register_type helper could be
avoided. I would rather use a macro similar to REGISTER_SCHEDULER in
order to populate an array at link time, and then just iterate over
it.

> +
> +static int viommu_create(struct domain *d, uint64_t type,
> +                         uint64_t base_address, uint64_t caps,
> +                         uint32_t *viommu_id)

I'm quite sure this doesn't compile, you are adding a static function
here that's not used at all in this patch. Please be careful and don't
introduce patches that will break the build.

> +{
> +    struct viommu *viommu;
> +    struct viommu_type *viommu_type = NULL;
> +    int rc;
> +
> +    /* Only support one vIOMMU per domain. */
> +    if ( d->viommu )
> +        return -E2BIG;
> +
> +    viommu_type = viommu_get_type(type);
> +    if ( !viommu_type )
> +        return -EINVAL;
> +
> +    if ( !viommu_type->ops || !viommu_type->ops->create )
> +        return -EINVAL;

Can this really happen? What's the point in having a iommu_type
without ops or without the create op? I think this should be an ASSERT
instead.

> +
> +    viommu = xzalloc(struct viommu);
> +    if ( !viommu )
> +        return -ENOMEM;
> +
> +    viommu->base_address = base_address;
> +    viommu->caps = caps;
> +    viommu->ops = viommu_type->ops;
> +
> +    rc = viommu->ops->create(d, viommu);
> +    if ( rc < 0 )
> +    {
> +        xfree(viommu);
> +        return rc;
> +    }
> +
> +    d->viommu = viommu;
> +
> +    /* Only support one vIOMMU per domain. */
> +    *viommu_id = 0;
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 5b8f8c6..750f235 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -33,6 +33,10 @@
>  DEFINE_XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t);
>  #endif
>  
> +#ifdef CONFIG_VIOMMU
> +#include <xen/viommu.h>
> +#endif

I would suggest you place the CONFIG_VIOMMU inside of the header
itself.

> +
>  /*
>   * Stats
>   *
> @@ -479,6 +483,10 @@ struct domain
>      rwlock_t vnuma_rwlock;
>      struct vnuma_info *vnuma;
>  
> +#ifdef CONFIG_VIOMMU
> +    struct viommu *viommu;
> +#endif

Shouldn't this go inside of x86/hvm/domain.h? (hvm_domain) PV guests
will certainly never be able to use it.

> +
>      /* Common monitor options */
>      struct {
>          unsigned int guest_request_enabled       : 1;
> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
> new file mode 100644
> index 0000000..636a2a3
> --- /dev/null
> +++ b/xen/include/xen/viommu.h
> @@ -0,0 +1,63 @@
> +/*
> + * include/xen/viommu.h
> + *
> + * Copyright (c) 2017, Intel Corporation
> + * Author: Lan Tianyu <tianyu.lan@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#ifndef __XEN_VIOMMU_H__
> +#define __XEN_VIOMMU_H__
> +
> +struct viommu;
> +
> +struct viommu_ops {
> +    int (*create)(struct domain *d, struct viommu *viommu);
> +    int (*destroy)(struct viommu *viommu);
> +};
> +
> +struct viommu {
> +    uint64_t base_address;
> +    uint64_t caps;
> +    const struct viommu_ops *ops;
> +    void *priv;
> +};
> +
> +#ifdef CONFIG_VIOMMU

Why do you only protect certain parts of the file with
CONFIG_VIOMMU?

> +extern bool opt_viommu;
> +static inline bool viommu_enabled(void)
> +{
> +    return opt_viommu;
> +}
> +
> +int viommu_register_type(uint64_t type, struct viommu_ops *ops);
> +int viommu_destroy_domain(struct domain *d);
> +#else
> +static inline int viommu_register_type(uint64_t type, struct viommu_ops *ops)
> +{
> +    return -EINVAL;
> +}

Why don't you also provide a dummy viommu_destroy_domain helper to be
used in domain.c?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 3/29] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-09-22  3:01 ` [PATCH V3 3/29] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
@ 2017-10-18 14:18   ` Roger Pau Monné
  2017-10-19  6:41     ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-18 14:18 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Sep 21, 2017 at 11:01:44PM -0400, Lan Tianyu wrote:
> This patch is to introduce create, destroy and query capabilities
> command for vIOMMU. vIOMMU layer will deal with requests and call
> arch vIOMMU ops.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/domctl.c         |  6 ++++++
>  xen/common/viommu.c         | 30 ++++++++++++++++++++++++++++++
>  xen/include/public/domctl.h | 42 ++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/viommu.h    |  2 ++
>  4 files changed, 80 insertions(+)
> 
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 42658e5..7e28237 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -1149,6 +1149,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>              copyback = 1;
>          break;
>  
> +#ifdef CONFIG_VIOMMU
> +    case XEN_DOMCTL_viommu_op:
> +        ret = viommu_domctl(d, &op->u.viommu_op, &copyback);

IMHO, I'm not really sure if it's worth to pass the copyback parameter
around. Can you just do the copy if !ret?

> +        break;
> +#endif

Instead of guarding every call to a viommu related function with
CONFIG_VIOMMU I would rather add dummy replacements for them in the
!CONFIG_VIOMMU case in the viommu.h header.

> +
>      default:
>          ret = arch_do_domctl(op, d, u_domctl);
>          break;
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index 64d91e6..55feb5d 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -133,6 +133,36 @@ static int viommu_create(struct domain *d, uint64_t type,
>      return 0;
>  }
>  
> +int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
> +                  bool *need_copy)
> +{
> +    int rc = -EINVAL;

Why do you need to set rc to EINVAL? AFAICT there's no path that would
return rc without being initialized.

> +
> +    if ( !viommu_enabled() )
> +        return -ENODEV;
> +
> +    switch ( op->cmd )
> +    {
> +    case XEN_DOMCTL_create_viommu:
> +        rc = viommu_create(d, op->u.create.viommu_type,
> +                           op->u.create.base_address,
> +                           op->u.create.capabilities,
> +                           &op->u.create.viommu_id);
> +        if ( !rc )
> +            *need_copy = true;
> +        break;
> +
> +    case XEN_DOMCTL_destroy_viommu:
> +        rc = viommu_destroy_domain(d);
> +        break;
> +
> +    default:
> +        return -ENOSYS;
> +    }
> +
> +    return rc;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 50ff58f..68854b6 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1163,6 +1163,46 @@ struct xen_domctl_psr_cat_op {
>  typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
>  
> +/*  vIOMMU helper
> + *
> + *  vIOMMU interface can be used to create/destroy vIOMMU and
> + *  query vIOMMU capabilities.
> + */
> +
> +/* vIOMMU type - specify vendor vIOMMU device model */
> +#define VIOMMU_TYPE_INTEL_VTD           0
> +
> +/* vIOMMU capabilities */
> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)

Please put those two defines next to the fields they belong to.

> +
> +struct xen_domctl_viommu_op {
> +    uint32_t cmd;
> +#define XEN_DOMCTL_create_viommu          0
> +#define XEN_DOMCTL_destroy_viommu         1

Would be nice if the values where right aligned.

> +    union {
> +        struct {
> +            /* IN - vIOMMU type */
> +            uint64_t viommu_type;
> +            /* 
> +             * IN - MMIO base address of vIOMMU. vIOMMU device models
> +             * are in charge of to check base_address.
> +             */
> +            uint64_t base_address;
> +            /* IN - Capabilities with which we want to create */
> +            uint64_t capabilities;
> +            /* OUT - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } create;
> +
> +        struct {
> +            /* IN - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } destroy;
> +    } u;
> +};

See my comments about the struct in patch 01/29.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 4/29] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures
  2017-09-22  3:01 ` [PATCH V3 4/29] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
@ 2017-10-18 14:36   ` Roger Pau Monné
  2017-10-19  6:46     ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-18 14:36 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:45PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Add dmar table structure according Chapter 8 "BIOS Considerations" of
> VTd spec Rev. 2.4.
> 
> VTd spec:http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  tools/libacpi/acpi2_0.h | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 61 insertions(+)
> 
> diff --git a/tools/libacpi/acpi2_0.h b/tools/libacpi/acpi2_0.h
> index 2619ba3..758a823 100644
> --- a/tools/libacpi/acpi2_0.h
> +++ b/tools/libacpi/acpi2_0.h
> @@ -422,6 +422,65 @@ struct acpi_20_slit {
>  };
>  
>  /*
> + * DMA Remapping Table header definition (DMAR)
> + */
> +
> +/*
> + * DMAR Flags.
> + */
> +#define ACPI_DMAR_INTR_REMAP        (1 << 0)
> +#define ACPI_DMAR_X2APIC_OPT_OUT    (1 << 1)
> +
> +struct acpi_dmar {
> +    struct acpi_header header;
> +    uint8_t host_address_width;
> +    uint8_t flags;
> +    uint8_t reserved[10];
> +};
> +
> +/*
> + * Device Scope Types
> + */
> +#define ACPI_DMAR_DEVICE_SCOPE_PCI_ENDPOINT             0x01
> +#define ACPI_DMAR_DEVICE_SCOPE_PCI_SUB_HIERARACHY       0x01
                                                           ^0x02
> +#define ACPI_DMAR_DEVICE_SCOPE_IOAPIC                   0x03
> +#define ACPI_DMAR_DEVICE_SCOPE_HPET                     0x04
> +#define ACPI_DMAR_DEVICE_SCOPE_ACPI_NAMESPACE_DEVICE    0x05

Maybe you could try to reduce the length of the defines?

> +
> +struct dmar_device_scope {
> +    uint8_t type;
> +    uint8_t length;
> +    uint8_t reserved[2];
> +    uint8_t enumeration_id;
> +    uint8_t bus;
> +    uint16_t path[0];
> +};
> +
> +/*
> + * DMA Remapping Hardware Unit Types
> + */
> +#define ACPI_DMAR_TYPE_HARDWARE_UNIT        0x00
> +#define ACPI_DMAR_TYPE_RESERVED_MEMORY      0x01
> +#define ACPI_DMAR_TYPE_ATSR                 0x02
> +#define ACPI_DMAR_TYPE_HARDWARE_AFFINITY    0x03
> +#define ACPI_DMAR_TYPE_ANDD                 0x04

I think you either use acronyms for all of them (like ATSR and ANDD)
or not. But mixing acronyms with full names is confusing.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-09-22  3:01 ` [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
@ 2017-10-18 15:12   ` Roger Pau Monné
  2017-10-19  8:09     ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-18 15:12 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:46PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> The BIOS reports the remapping hardware units in a platform to system software
> through the DMA Remapping Reporting (DMAR) ACPI table.
> New fields are introduces for DMAR table. These new fields are set by
                 ^ introduced
> toolstack through parsing guest's config file. construct_dmar() is added to
> build DMAR table according to the new fields.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
> v3:
>  - Remove chip-set specific IOAPIC BDF. Instead, let IOAPIC-related
>  info be passed by struct acpi_config.
> 
> ---
>  tools/libacpi/build.c   | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/libacpi/libacpi.h | 12 +++++++++++
>  2 files changed, 65 insertions(+)
> 
> diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
> index f9881c9..5ee8fcd 100644
> --- a/tools/libacpi/build.c
> +++ b/tools/libacpi/build.c
> @@ -303,6 +303,59 @@ static struct acpi_20_slit *construct_slit(struct acpi_ctxt *ctxt,
>      return slit;
>  }
>  
> +/*
> + * Only one DMA remapping hardware unit is exposed and all devices
> + * are under the remapping hardware unit. I/O APIC should be explicitly
> + * enumerated.
> + */
> +struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
> +                                 const struct acpi_config *config)
> +{
> +    struct acpi_dmar *dmar;
> +    struct acpi_dmar_hardware_unit *drhd;
> +    struct dmar_device_scope *scope;
> +    unsigned int size;
> +    unsigned int ioapic_scope_size = sizeof(*scope) + sizeof(scope->path[0]);

I'm not sure I follow why you need to add the size of a uint16_t here.

> +
> +    size = sizeof(*dmar) + sizeof(*drhd) + ioapic_scope_size;

size can be initialized at declaration time.

> +
> +    dmar = ctxt->mem_ops.alloc(ctxt, size, 16);

Even dmar can be initialized at declaration time.

> +    if ( !dmar )
> +        return NULL;
> +
> +    memset(dmar, 0, size);
> +    dmar->header.signature = ACPI_2_0_DMAR_SIGNATURE;
> +    dmar->header.revision = ACPI_2_0_DMAR_REVISION;
> +    dmar->header.length = size;
> +    fixed_strcpy(dmar->header.oem_id, ACPI_OEM_ID);
> +    fixed_strcpy(dmar->header.oem_table_id, ACPI_OEM_TABLE_ID);
> +    dmar->header.oem_revision = ACPI_OEM_REVISION;
> +    dmar->header.creator_id   = ACPI_CREATOR_ID;
> +    dmar->header.creator_revision = ACPI_CREATOR_REVISION;
> +    dmar->host_address_width = config->host_addr_width - 1;
> +    if ( config->iommu_intremap_supported )
> +        dmar->flags |= ACPI_DMAR_INTR_REMAP;
> +    if ( !config->iommu_x2apic_supported )
> +        dmar->flags |= ACPI_DMAR_X2APIC_OPT_OUT;

Is there any reason why we would want to create a guest with a vIOMMU
but not x2APIC support?

> +
> +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
                                                      ^ space
> +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
> +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
> +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
> +    drhd->pci_segment = 0;
> +    drhd->base_address = config->iommu_base_addr;
> +
> +    scope = &drhd->scope[0];
> +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
> +    scope->length = ioapic_scope_size;
> +    scope->enumeration_id = config->ioapic_id;
> +    scope->bus = config->ioapic_bus;
> +    scope->path[0] = config->ioapic_devfn;
> +
> +    set_checksum(dmar, offsetof(struct acpi_header, checksum), size);
> +    return dmar;
> +}
> +
>  static int construct_passthrough_tables(struct acpi_ctxt *ctxt,
>                                          unsigned long *table_ptrs,
>                                          int nr_tables,
> diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
> index a2efd23..fdd6a78 100644
> --- a/tools/libacpi/libacpi.h
> +++ b/tools/libacpi/libacpi.h
> @@ -20,6 +20,8 @@
>  #ifndef __LIBACPI_H__
>  #define __LIBACPI_H__
>  
> +#include <stdbool.h>

I'm quite sure you shouldn't add this here, see how headers are added
using LIBACPI_STDUTILS.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
  2017-10-18 13:26   ` Roger Pau Monné
@ 2017-10-19  2:26     ` Lan Tianyu
  2017-10-19  8:49       ` Roger Pau Monné
  0 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-10-19  2:26 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

Hi Roger:
     Thanks for review.

On 2017年10月18日 21:26, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
>> This patch is to add Xen virtual IOMMU doc to introduce motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  docs/misc/viommu.txt | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 136 insertions(+)
>>  create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..348e8c4
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
>> @@ -0,0 +1,136 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +Enable more than 128 vcpu support
>> +
>> +The current requirements of HPC cloud service requires VM with a high
>> +number of CPUs in order to achieve high performance in parallel
>> +computing.
>> +
>> +To support >128 vcpus, X2APIC mode in guest is necessary because legacy
>> +APIC(XAPIC) just supports 8-bit APIC ID. The APIC ID used by Xen is
>> +CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last one available
>> +in xAPIC mode) and so it only can support 128 vcpus at most. x2APIC mode
>> +supports 32-bit APIC ID and it requires the interrupt remapping functionality
>> +of a vIOMMU if the guest wishes to route interrupts to all available vCPUs
>> +
>> +The reason for this is that there is no modification for existing PCI MSI
>> +and IOAPIC when introduce X2APIC.
> 
> I'm not sure the above sentence makes much sense. IMHO I would just
> remove it.

OK. Will remove.

> 
>> PCI MSI/IOAPIC can only send interrupt
>> +message containing 8-bit APIC ID, which cannot address cpus with >254
>> +APIC ID. Interrupt remapping supports 32-bit APIC ID and so it's necessary
>> +for >128 vcpus support.
>> +
>> +
>> +vIOMMU Architecture
>> +===================
>> +vIOMMU device model is inside Xen hypervisor for following factors
>> +    1) Avoid round trips between Qemu and Xen hypervisor
>> +    2) Ease of integration with the rest of hypervisor
>> +    3) HVMlite/PVH doesn't use Qemu
> 
> Just use PVH here, HVMlite == PVH now.

OK.

> 
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from virtual devices and physical devices are delivered
>> +to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
>> +this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu                       |VM                     |
>> +|                           | +----------------+    |
>> +|                           | |  Device driver |    |
>> +|                           | +--------+-------+    |
>> +|                           |          ^            |
>> +|       +----------------+  | +--------+-------+    |
>> +|       | Virtual device |  | |  IRQ subsystem |    |
>> +|       +-------+--------+  | +--------+-------+    |
>> +|               |           |          ^            |
>> +|               |           |          |            |
>> ++---------------------------+-----------------------+
>> +|hypervisor     |                      | VIRQ       |
>> +|               |            +---------+--------+   |
>> +|               |            |      vLAPIC      |   |
>> +|               |VIRQ        +---------+--------+   |
>> +|               |                      ^            |
>> +|               |                      |            |
>> +|               |            +---------+--------+   |
>> +|               |            |      vIOMMU      |   |
>> +|               |            +---------+--------+   |
>> +|               |                      ^            |
>> +|               |                      |            |
>> +|               |            +---------+--------+   |
>> +|               |            |   vIOAPIC/vMSI   |   |
>> +|               |            +----+----+--------+   |
>> +|               |                 ^    ^            |
>> +|               +-----------------+    |            |
>> +|                                      |            |
>> ++---------------------------------------------------+
>> +HW                                     |IRQ
>> +                                +-------------------+
>> +                                |   PCI Device      |
>> +                                +-------------------+
>> +
>> +
>> +vIOMMU hypercall
>> +================
>> +Introduce a new domctl hypercall "xen_domctl_viommu_op" to create/destroy
>> +vIOMMUs.
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD	       0
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> +    uint32_t cmd;
>> +#define XEN_DOMCTL_create_viommu          0
>> +#define XEN_DOMCTL_destroy_viommu         1
> 
> I would invert the order of the domctl names:
> 
> #define XEN_DOMCTL_viommu_create          0
> #define XEN_DOMCTL_viommu_destroy         1
> 
> It's clearer if the operation is the last part of the name.

OK. Will update.

> 
>> +    union {
>> +        struct {
>> +            /* IN - vIOMMU type  */
>> +            uint64_t viommu_type;
> 
> Hm, do we really need a uint64_t for the IOMMU type? A uint8_t should
> be more that enough (256 different IOMMU implementations).

OK. Will update.

> 
>> +            /* IN - MMIO base address of vIOMMU. */
>> +            uint64_t base_address;
>> +            /* IN - Capabilities with which we want to create */
>> +            uint64_t capabilities;
>> +            /* OUT - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } create_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } destroy_viommu;
> 
> Do you really need the destroy operation? Do we expect to hot-unplug
> vIOMMUs? Otherwise vIOMMUs should be removed when the domain is
> destroyed.

Yes. no such requirement so far and added it just for multi-vIOMMU
consideration. I will remove it and add back when it's really needed.

> 
>> +    } u;
>> +};
>> +
>> +- XEN_DOMCTL_create_viommu
>> +    Create vIOMMU device with vIOMMU_type, capabilities and MMIO base
>> +address. Hypervisor allocates viommu_id for new vIOMMU instance and return
>> +back. The vIOMMU device model in hypervisor should check whether it can
>> +support the input capabilities and return error if not.
>> +
>> +- XEN_DOMCTL_destroy_viommu
>> +    Destroy vIOMMU in Xen hypervisor with viommu_id as parameter.
>> +
>> +These vIOMMU domctl and vIOMMU option in configure file consider multi-vIOMMU
>> +support for single VM.(e.g, parameters of create/destroy vIOMMU includes
>> +vIOMMU id). But function implementation only supports one vIOMMU per VM so far.
>> +
>> +Xen hypervisor vIOMMU command
>> +=============================
>> +Introduce vIOMMU command "viommu=1" to enable vIOMMU function in hypervisor.
>> +It's default disabled.
> 
> Hm, I'm not sure we really need this. At the end viommu will be
> disabled by default for guests, unless explicitly enabled in the
> config file.

This is according to Jan's early comments on RFC patch
https://patchwork.kernel.org/patch/9733869/.

"It's actually a question whether in our current scheme a Kconfig
option is appropriate here in the first place. I'd rather see this be
an always built feature which needs enabling on the command line
for the time being."


> 
> Thanks, Roger.
> 


-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-10-18 14:05   ` Roger Pau Monné
@ 2017-10-19  6:31     ` Lan Tianyu
  2017-10-19  8:47       ` Roger Pau Monné
  2017-10-30  1:51     ` Lan Tianyu
  1 sibling, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-10-19  6:31 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On 2017年10月18日 22:05, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:43PM -0400, Lan Tianyu wrote:
>> +int viommu_destroy_domain(struct domain *d)
>> +{
>> +    int ret;
>> +
>> +    if ( !d->viommu )
>> +        return -EINVAL;
> 
> ENODEV would be better.

OK. Will update.

> 
>> +
>> +    ret = d->viommu->ops->destroy(d->viommu);
>> +    if ( ret < 0 )
>> +        return ret;
>> +
>> +    xfree(d->viommu);
>> +    d->viommu = NULL;
> 
> Newline preferably.

OK.

> 
>> +    return 0;
>> +}
>> +
>> +static struct viommu_type *viommu_get_type(uint64_t type)
>> +{
>> +    struct viommu_type *viommu_type = NULL;
>> +
>> +    spin_lock(&type_list_lock);
>> +    list_for_each_entry( viommu_type, &type_list, node )
>> +    {
>> +        if ( viommu_type->type == type )
>> +        {
>> +            spin_unlock(&type_list_lock);
>> +            return viommu_type;
>> +        }
>> +    }
>> +    spin_unlock(&type_list_lock);
> 
> Why do you need a lock here, and a list at all?
> 
> AFAICT vIOMMU types will never be added at runtime.

Yes, will remove it.

> 
>> +
>> +    return NULL;
>> +}
>> +
>> +int viommu_register_type(uint64_t type, struct viommu_ops *ops)
>> +{
>> +    struct viommu_type *viommu_type = NULL;
>> +
>> +    if ( !viommu_enabled() )
>> +        return -ENODEV;
>> +
>> +    if ( viommu_get_type(type) )
>> +        return -EEXIST;
>> +
>> +    viommu_type = xzalloc(struct viommu_type);
>> +    if ( !viommu_type )
>> +        return -ENOMEM;
>> +
>> +    viommu_type->type = type;
>> +    viommu_type->ops = ops;
>> +
>> +    spin_lock(&type_list_lock);
>> +    list_add_tail(&viommu_type->node, &type_list);
>> +    spin_unlock(&type_list_lock);
>> +
>> +    return 0;
>> +}
> 
> As mentioned above, I think this viommu_register_type helper could be
> avoided. I would rather use a macro similar to REGISTER_SCHEDULER in
> order to populate an array at link time, and then just iterate over
> it.
> 
>> +
>> +static int viommu_create(struct domain *d, uint64_t type,
>> +                         uint64_t base_address, uint64_t caps,
>> +                         uint32_t *viommu_id)
> 
> I'm quite sure this doesn't compile, you are adding a static function
> here that's not used at all in this patch. Please be careful and don't
> introduce patches that will break the build.

This function will be used in the next patch. "DOMCTL: Introduce new
DOMCTL commands for vIOMMU support.". So this doesn't break patchset
build. Will combine these two patches to avoid such issue.


> 
>> +{
>> +    struct viommu *viommu;
>> +    struct viommu_type *viommu_type = NULL;
>> +    int rc;
>> +
>> +    /* Only support one vIOMMU per domain. */
>> +    if ( d->viommu )
>> +        return -E2BIG;
>> +
>> +    viommu_type = viommu_get_type(type);
>> +    if ( !viommu_type )
>> +        return -EINVAL;
>> +
>> +    if ( !viommu_type->ops || !viommu_type->ops->create )
>> +        return -EINVAL;
> 
> Can this really happen? What's the point in having a iommu_type
> without ops or without the create op? I think this should be an ASSERT
> instead.

How about add ASSERT(viommu_type->ops->create) here?

> 
>> +
>> +    viommu = xzalloc(struct viommu);
>> +    if ( !viommu )
>> +        return -ENOMEM;
>> +
>> +    viommu->base_address = base_address;
>> +    viommu->caps = caps;
>> +    viommu->ops = viommu_type->ops;
>> +
>> +    rc = viommu->ops->create(d, viommu);
>> +    if ( rc < 0 )
>> +    {
>> +        xfree(viommu);
>> +        return rc;
>> +    }
>> +
>> +    d->viommu = viommu;
>> +
>> +    /* Only support one vIOMMU per domain. */
>> +    *viommu_id = 0;
>> +    return 0;
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> index 5b8f8c6..750f235 100644
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -33,6 +33,10 @@
>>  DEFINE_XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t);
>>  #endif
>>  
>> +#ifdef CONFIG_VIOMMU
>> +#include <xen/viommu.h>
>> +#endif
> 
> I would suggest you place the CONFIG_VIOMMU inside of the header
> itself.
> 
>> +
>>  /*
>>   * Stats
>>   *
>> @@ -479,6 +483,10 @@ struct domain
>>      rwlock_t vnuma_rwlock;
>>      struct vnuma_info *vnuma;
>>  
>> +#ifdef CONFIG_VIOMMU
>> +    struct viommu *viommu;
>> +#endif
> 
> Shouldn't this go inside of x86/hvm/domain.h? (hvm_domain) PV guests
> will certainly never be able to use it.

vIOMMU framework should be generic for all platforms and so didn't put
this in arch/x86.

> 
>> +
>>      /* Common monitor options */
>>      struct {
>>          unsigned int guest_request_enabled       : 1;
>> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
>> new file mode 100644
>> index 0000000..636a2a3
>> --- /dev/null
>> +++ b/xen/include/xen/viommu.h
>> @@ -0,0 +1,63 @@
>> +/*
>> + * include/xen/viommu.h
>> + *
>> + * Copyright (c) 2017, Intel Corporation
>> + * Author: Lan Tianyu <tianyu.lan@intel.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + */
>> +#ifndef __XEN_VIOMMU_H__
>> +#define __XEN_VIOMMU_H__
>> +
>> +struct viommu;
>> +
>> +struct viommu_ops {
>> +    int (*create)(struct domain *d, struct viommu *viommu);
>> +    int (*destroy)(struct viommu *viommu);
>> +};
>> +
>> +struct viommu {
>> +    uint64_t base_address;
>> +    uint64_t caps;
>> +    const struct viommu_ops *ops;
>> +    void *priv;
>> +};
>> +
>> +#ifdef CONFIG_VIOMMU
> 
> Why do you only protect certain parts of the file with
> CONFIG_VIOMMU?

After some considerations, CONFIG_VIOMMU should protect all field(new
structure definition and function declaration) in the file except some
dummy function. This will help to remove some CONFIG_VIOMMU check in
other places.

> 
>> +extern bool opt_viommu;
>> +static inline bool viommu_enabled(void)
>> +{
>> +    return opt_viommu;
>> +}
>> +
>> +int viommu_register_type(uint64_t type, struct viommu_ops *ops);
>> +int viommu_destroy_domain(struct domain *d);
>> +#else
>> +static inline int viommu_register_type(uint64_t type, struct viommu_ops *ops)
>> +{
>> +    return -EINVAL;
>> +}
> 
> Why don't you also provide a dummy viommu_destroy_domain helper to be
> used in domain.c?
> 

After above change,  I think we just need viommu_destroy_domain() and
viommu_domctl() which is called in the common code path for x86 and ARM.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 3/29] DOMCTL: Introduce new DOMCTL commands for vIOMMU support
  2017-10-18 14:18   ` Roger Pau Monné
@ 2017-10-19  6:41     ` Lan Tianyu
  0 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-19  6:41 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On 2017年10月18日 22:18, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:44PM -0400, Lan Tianyu wrote:
>> This patch is to introduce create, destroy and query capabilities
>> command for vIOMMU. vIOMMU layer will deal with requests and call
>> arch vIOMMU ops.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/domctl.c         |  6 ++++++
>>  xen/common/viommu.c         | 30 ++++++++++++++++++++++++++++++
>>  xen/include/public/domctl.h | 42 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/xen/viommu.h    |  2 ++
>>  4 files changed, 80 insertions(+)
>>
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index 42658e5..7e28237 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -1149,6 +1149,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>              copyback = 1;
>>          break;
>>  
>> +#ifdef CONFIG_VIOMMU
>> +    case XEN_DOMCTL_viommu_op:
>> +        ret = viommu_domctl(d, &op->u.viommu_op, &copyback);
> 
> IMHO, I'm not really sure if it's worth to pass the copyback parameter
> around. Can you just do the copy if !ret?

Yes, will update.

> 
>> +        break;
>> +#endif
> 
> Instead of guarding every call to a viommu related function with
> CONFIG_VIOMMU I would rather add dummy replacements for them in the
> !CONFIG_VIOMMU case in the viommu.h header.


OK.

> 
>> +
>>      default:
>>          ret = arch_do_domctl(op, d, u_domctl);
>>          break;
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>> index 50ff58f..68854b6 100644
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -1163,6 +1163,46 @@ struct xen_domctl_psr_cat_op {
>>  typedef struct xen_domctl_psr_cat_op xen_domctl_psr_cat_op_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cat_op_t);
>>  
>> +/*  vIOMMU helper
>> + *
>> + *  vIOMMU interface can be used to create/destroy vIOMMU and
>> + *  query vIOMMU capabilities.
>> + */
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD           0
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> 
> Please put those two defines next to the fields they belong to.

OK.




-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 4/29] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures
  2017-10-18 14:36   ` Roger Pau Monné
@ 2017-10-19  6:46     ` Lan Tianyu
  0 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-19  6:46 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On 2017年10月18日 22:36, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:45PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> Add dmar table structure according Chapter 8 "BIOS Considerations" of
>> VTd spec Rev. 2.4.
>>
>> VTd spec:http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  tools/libacpi/acpi2_0.h | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 61 insertions(+)
>>
>> diff --git a/tools/libacpi/acpi2_0.h b/tools/libacpi/acpi2_0.h
>> index 2619ba3..758a823 100644
>> --- a/tools/libacpi/acpi2_0.h
>> +++ b/tools/libacpi/acpi2_0.h
>> @@ -422,6 +422,65 @@ struct acpi_20_slit {
>>  };
>>  
>>  /*
>> + * DMA Remapping Table header definition (DMAR)
>> + */
>> +
>> +/*
>> + * DMAR Flags.
>> + */
>> +#define ACPI_DMAR_INTR_REMAP        (1 << 0)
>> +#define ACPI_DMAR_X2APIC_OPT_OUT    (1 << 1)
>> +
>> +struct acpi_dmar {
>> +    struct acpi_header header;
>> +    uint8_t host_address_width;
>> +    uint8_t flags;
>> +    uint8_t reserved[10];
>> +};
>> +
>> +/*
>> + * Device Scope Types
>> + */
>> +#define ACPI_DMAR_DEVICE_SCOPE_PCI_ENDPOINT             0x01
>> +#define ACPI_DMAR_DEVICE_SCOPE_PCI_SUB_HIERARACHY       0x01
>                                                            ^0x02
>> +#define ACPI_DMAR_DEVICE_SCOPE_IOAPIC                   0x03
>> +#define ACPI_DMAR_DEVICE_SCOPE_HPET                     0x04
>> +#define ACPI_DMAR_DEVICE_SCOPE_ACPI_NAMESPACE_DEVICE    0x05
> 
> Maybe you could try to reduce the length of the defines?

Sure. Will update.

> 
>> +
>> +struct dmar_device_scope {
>> +    uint8_t type;
>> +    uint8_t length;
>> +    uint8_t reserved[2];
>> +    uint8_t enumeration_id;
>> +    uint8_t bus;
>> +    uint16_t path[0];
>> +};
>> +
>> +/*
>> + * DMA Remapping Hardware Unit Types
>> + */
>> +#define ACPI_DMAR_TYPE_HARDWARE_UNIT        0x00
>> +#define ACPI_DMAR_TYPE_RESERVED_MEMORY      0x01
>> +#define ACPI_DMAR_TYPE_ATSR                 0x02
>> +#define ACPI_DMAR_TYPE_HARDWARE_AFFINITY    0x03
>> +#define ACPI_DMAR_TYPE_ANDD                 0x04
> 
> I think you either use acronyms for all of them (like ATSR and ANDD)
> or not. But mixing acronyms with full names is confusing.

OK. Will update.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-10-18 15:12   ` Roger Pau Monné
@ 2017-10-19  8:09     ` Lan Tianyu
  2017-10-19  8:40       ` Roger Pau Monné
  2017-10-19 11:31       ` Jan Beulich
  0 siblings, 2 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-19  8:09 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On 2017年10月18日 23:12, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:46PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>>
>> The BIOS reports the remapping hardware units in a platform to system software
>> through the DMA Remapping Reporting (DMAR) ACPI table.
>> New fields are introduces for DMAR table. These new fields are set by
>                  ^ introduced
>> toolstack through parsing guest's config file. construct_dmar() is added to
>> build DMAR table according to the new fields.
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>> v3:
>>  - Remove chip-set specific IOAPIC BDF. Instead, let IOAPIC-related
>>  info be passed by struct acpi_config.
>>
>> ---
>>  tools/libacpi/build.c   | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  tools/libacpi/libacpi.h | 12 +++++++++++
>>  2 files changed, 65 insertions(+)
>>
>> diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
>> index f9881c9..5ee8fcd 100644
>> --- a/tools/libacpi/build.c
>> +++ b/tools/libacpi/build.c
>> @@ -303,6 +303,59 @@ static struct acpi_20_slit *construct_slit(struct acpi_ctxt *ctxt,
>>      return slit;
>>  }
>>  
>> +/*
>> + * Only one DMA remapping hardware unit is exposed and all devices
>> + * are under the remapping hardware unit. I/O APIC should be explicitly
>> + * enumerated.
>> + */
>> +struct acpi_dmar *construct_dmar(struct acpi_ctxt *ctxt,
>> +                                 const struct acpi_config *config)
>> +{
>> +    struct acpi_dmar *dmar;
>> +    struct acpi_dmar_hardware_unit *drhd;
>> +    struct dmar_device_scope *scope;
>> +    unsigned int size;
>> +    unsigned int ioapic_scope_size = sizeof(*scope) + sizeof(scope->path[0]);
> 
> I'm not sure I follow why you need to add the size of a uint16_t here.
> 
>> +
>> +    size = sizeof(*dmar) + sizeof(*drhd) + ioapic_scope_size;
> 
> size can be initialized at declaration time.
> 
>> +
>> +    dmar = ctxt->mem_ops.alloc(ctxt, size, 16);
> 
> Even dmar can be initialized at declaration time.
> 

OK. Will update.

>> +    if ( !dmar )
>> +        return NULL;
>> +
>> +    memset(dmar, 0, size);
>> +    dmar->header.signature = ACPI_2_0_DMAR_SIGNATURE;
>> +    dmar->header.revision = ACPI_2_0_DMAR_REVISION;
>> +    dmar->header.length = size;
>> +    fixed_strcpy(dmar->header.oem_id, ACPI_OEM_ID);
>> +    fixed_strcpy(dmar->header.oem_table_id, ACPI_OEM_TABLE_ID);
>> +    dmar->header.oem_revision = ACPI_OEM_REVISION;
>> +    dmar->header.creator_id   = ACPI_CREATOR_ID;
>> +    dmar->header.creator_revision = ACPI_CREATOR_REVISION;
>> +    dmar->host_address_width = config->host_addr_width - 1;
>> +    if ( config->iommu_intremap_supported )
>> +        dmar->flags |= ACPI_DMAR_INTR_REMAP;
>> +    if ( !config->iommu_x2apic_supported )
>> +        dmar->flags |= ACPI_DMAR_X2APIC_OPT_OUT;
> 
> Is there any reason why we would want to create a guest with a vIOMMU
> but not x2APIC support?

Will remove this.

> 
>> +
>> +    drhd = (struct acpi_dmar_hardware_unit *)((void*)dmar + sizeof(*dmar));
>                                                       ^ space
>> +    drhd->type = ACPI_DMAR_TYPE_HARDWARE_UNIT;
>> +    drhd->length = sizeof(*drhd) + ioapic_scope_size;
>> +    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
>> +    drhd->pci_segment = 0;
>> +    drhd->base_address = config->iommu_base_addr;
>> +
>> +    scope = &drhd->scope[0];
>> +    scope->type = ACPI_DMAR_DEVICE_SCOPE_IOAPIC;
>> +    scope->length = ioapic_scope_size;
>> +    scope->enumeration_id = config->ioapic_id;
>> +    scope->bus = config->ioapic_bus;
>> +    scope->path[0] = config->ioapic_devfn;
>> +
>> +    set_checksum(dmar, offsetof(struct acpi_header, checksum), size);
>> +    return dmar;
>> +}
>> +
>>  static int construct_passthrough_tables(struct acpi_ctxt *ctxt,
>>                                          unsigned long *table_ptrs,
>>                                          int nr_tables,
>> diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
>> index a2efd23..fdd6a78 100644
>> --- a/tools/libacpi/libacpi.h
>> +++ b/tools/libacpi/libacpi.h
>> @@ -20,6 +20,8 @@
>>  #ifndef __LIBACPI_H__
>>  #define __LIBACPI_H__
>>  
>> +#include <stdbool.h>
> 
> I'm quite sure you shouldn't add this here, see how headers are added
> using LIBACPI_STDUTILS.
> 

We may replace bool with uint8_t xxx:1 to avoid introduce new head file.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-10-19  8:09     ` Lan Tianyu
@ 2017-10-19  8:40       ` Roger Pau Monné
  2017-10-25  6:06         ` Lan Tianyu
  2017-10-19 11:31       ` Jan Beulich
  1 sibling, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19  8:40 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Oct 19, 2017 at 04:09:02PM +0800, Lan Tianyu wrote:
> On 2017年10月18日 23:12, Roger Pau Monné wrote:
> >> diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
> >> index a2efd23..fdd6a78 100644
> >> --- a/tools/libacpi/libacpi.h
> >> +++ b/tools/libacpi/libacpi.h
> >> @@ -20,6 +20,8 @@
> >>  #ifndef __LIBACPI_H__
> >>  #define __LIBACPI_H__
> >>  
> >> +#include <stdbool.h>
> > 
> > I'm quite sure you shouldn't add this here, see how headers are added
> > using LIBACPI_STDUTILS.
> > 
> 
> We may replace bool with uint8_t xxx:1 to avoid introduce new head file.

Did you check whether including stdbool is actually required? AFAICT
hvmloader util.h already includes it, and you would only have to
introduce it in libxl if it's not there yet.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-10-19  6:31     ` Lan Tianyu
@ 2017-10-19  8:47       ` Roger Pau Monné
  2017-10-25  1:43         ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19  8:47 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Oct 19, 2017 at 02:31:22PM +0800, Lan Tianyu wrote:
> On 2017年10月18日 22:05, Roger Pau Monné wrote:
> > On Thu, Sep 21, 2017 at 11:01:43PM -0400, Lan Tianyu wrote:
> >> +static int viommu_create(struct domain *d, uint64_t type,
> >> +                         uint64_t base_address, uint64_t caps,
> >> +                         uint32_t *viommu_id)
> > 
> > I'm quite sure this doesn't compile, you are adding a static function
> > here that's not used at all in this patch. Please be careful and don't
> > introduce patches that will break the build.
> 
> This function will be used in the next patch. "DOMCTL: Introduce new
> DOMCTL commands for vIOMMU support.". So this doesn't break patchset
> build. Will combine these two patches to avoid such issue.

If it's used in the next patch, then simply introduce it there.

> 
> 
> > 
> >> +{
> >> +    struct viommu *viommu;
> >> +    struct viommu_type *viommu_type = NULL;
> >> +    int rc;
> >> +
> >> +    /* Only support one vIOMMU per domain. */
> >> +    if ( d->viommu )
> >> +        return -E2BIG;
> >> +
> >> +    viommu_type = viommu_get_type(type);
> >> +    if ( !viommu_type )
> >> +        return -EINVAL;
> >> +
> >> +    if ( !viommu_type->ops || !viommu_type->ops->create )
> >> +        return -EINVAL;
> > 
> > Can this really happen? What's the point in having a iommu_type
> > without ops or without the create op? I think this should be an ASSERT
> > instead.
> 
> How about add ASSERT(viommu_type->ops->create) here?

Since ops is already a pointer I would rather do

ASSERT(viommu_type->ops && viommu_type->ops->create);

Or else you risk a NULL pointer dereference.

> >> +
> >>  /*
> >>   * Stats
> >>   *
> >> @@ -479,6 +483,10 @@ struct domain
> >>      rwlock_t vnuma_rwlock;
> >>      struct vnuma_info *vnuma;
> >>  
> >> +#ifdef CONFIG_VIOMMU
> >> +    struct viommu *viommu;
> >> +#endif
> > 
> > Shouldn't this go inside of x86/hvm/domain.h? (hvm_domain) PV guests
> > will certainly never be able to use it.
> 
> vIOMMU framework should be generic for all platforms and so didn't put
> this in arch/x86.

For all platforms supporting HVM, for PV I don't think it makes sense.
Since AFAIK ARM guest type is also HVM I would rather introduce this
field in the hvm_domain structure rather than the generic domain
structure.

You might want to wait for feedback from others regarding this issue.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
  2017-10-19  2:26     ` Lan Tianyu
@ 2017-10-19  8:49       ` Roger Pau Monné
  2017-10-19 11:28         ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19  8:49 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Oct 19, 2017 at 10:26:36AM +0800, Lan Tianyu wrote:
> Hi Roger:
>      Thanks for review.
> 
> On 2017年10月18日 21:26, Roger Pau Monné wrote:
> > On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
> >> +Xen hypervisor vIOMMU command
> >> +=============================
> >> +Introduce vIOMMU command "viommu=1" to enable vIOMMU function in hypervisor.
> >> +It's default disabled.
> > 
> > Hm, I'm not sure we really need this. At the end viommu will be
> > disabled by default for guests, unless explicitly enabled in the
> > config file.
> 
> This is according to Jan's early comments on RFC patch
> https://patchwork.kernel.org/patch/9733869/.
> 
> "It's actually a question whether in our current scheme a Kconfig
> option is appropriate here in the first place. I'd rather see this be
> an always built feature which needs enabling on the command line
> for the time being."

So if I read this correctly Jan wanted you to ditch the Kconfig option
and instead rely on the command line option to enable/disable it.

I don't have a strong opinion here, so it's fine for me if you want to
keep both the Kconfig option and the command line one.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 6/29] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-09-22  3:01 ` [PATCH V3 6/29] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
@ 2017-10-19  9:49   ` Roger Pau Monné
  2017-10-20  1:36     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19  9:49 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:47PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> A field, viommu_info, is added to struct libxl_domain_build_info. Several
> attributes can be specified by guest config file for virtual IOMMU. These
> attributes are used for DMAR construction and vIOMMU creation.

IMHO this should come much later in the series, ideally you would
introduce the xl/libxl code in the last patches, together with the
xl.cfg man page change.

> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> ---
> v3:
>  - allow an array of viommu other than only one viommu to present to guest.
>  During domain building, an error would be raised for
>  multiple viommus case since we haven't implemented this yet.
>  - provide a libxl__viommu_set_default() for viommu
> 
> ---
>  docs/man/xl.cfg.pod.5.in    | 27 +++++++++++++++++++++++
>  tools/libxl/libxl_create.c  | 52 +++++++++++++++++++++++++++++++++++++++++++++
>  tools/libxl/libxl_types.idl | 12 +++++++++++
>  tools/xl/xl_parse.c         | 52 ++++++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 142 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
> index 79cb2ea..9cd7dd7 100644
> --- a/docs/man/xl.cfg.pod.5.in
> +++ b/docs/man/xl.cfg.pod.5.in
> @@ -1547,6 +1547,33 @@ L<http://www.microsoft.com/en-us/download/details.aspx?id=30707>
>  
>  =back 
>  
> +=item B<viommu=[ "VIOMMU_STRING", "VIOMMU_STRING", ...]>
> +
> +Specifies the vIOMMUs which are to be provided to the guest.
> +
> +B<VIOMMU_STRING> has the form C<KEY=VALUE,KEY=VALUE,...> where:
> +
> +=over 4
> +
> +=item B<KEY=VALUE>
> +
> +Possible B<KEY>s are:
> +
> +=over 4
> +
> +=item B<type="STRING">
> +
> +Currently there is only one valid type:
> +
> +(x86 only) "intel_vtd" means providing a emulated Intel VT-d to the guest.
> +
> +=item B<intremap=BOOLEAN>
> +
> +Specifies whether the vIOMMU should support interrupt remapping
> +and default 'true'.
> +
> +=back
> +
>  =head3 Guest Virtual Time Controls
>  
>  =over 4
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 9123585..decd7a8 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -27,6 +27,8 @@
>  
>  #include <xen-xsm/flask/flask.h>
>  
> +#define VIOMMU_VTD_BASE_ADDR        0xfed90000ULL

This should be in libxl_arch.h see LAPIC_BASE_ADDRESS.

> +
>  int libxl__domain_create_info_setdefault(libxl__gc *gc,
>                                           libxl_domain_create_info *c_info)
>  {
> @@ -59,6 +61,47 @@ void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
>                              LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT;
>  }
>  
> +static int libxl__viommu_set_default(libxl__gc *gc,
> +                                     libxl_domain_build_info *b_info)
> +{
> +    int i;
> +
> +    if (!b_info->num_viommus)
> +        return 0;
> +
> +    for (i = 0; i < b_info->num_viommus; i++) {
> +        libxl_viommu_info *viommu = &b_info->viommu[i];
> +
> +        if (libxl_defbool_is_default(viommu->intremap))
> +            libxl_defbool_set(&viommu->intremap, true);
> +
> +        if (!libxl_defbool_val(viommu->intremap)) {
> +            LOGE(ERROR, "Cannot create one virtual VTD without intremap");
> +            return ERROR_INVAL;
> +        }
> +
> +        if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> +            /*
> +             * If there are multiple vIOMMUs, we need arrange all vIOMMUs to
> +             * avoid overlap. Put a check here in case we get here for multiple
> +             * vIOMMUs case.
> +             */
> +            if (b_info->num_viommus > 1) {
> +                LOGE(ERROR, "Multiple vIOMMUs support is under implementation");

s/LOGE/LOG/ LOGE should only be used when errno is set (which is not
the case here).

> +                return ERROR_INVAL;
> +            }
> +
> +            /* Set default values to unexposed fields */
> +            viommu->base_addr = VIOMMU_VTD_BASE_ADDR;
> +
> +            /* Set desired capbilities */
> +            viommu->cap = VIOMMU_CAP_IRQ_REMAPPING;

I'm not sure whether this code should be in libxl_x86.c, but
libxl__domain_build_info_setdefault is already quite messed up, so I
guess it's fine.

> +        }

Shouldn't this be:

switch(viommu->type) {
case LIBXL_VIOMMU_TYPE_INTEL_VTD:
    ...
    break;

default:
    return ERROR_INVAL;
}

So that you catch type being set to an invalid vIOMMU type?

> +    }
> +
> +    return 0;
> +}
> +
>  int libxl__domain_build_info_setdefault(libxl__gc *gc,
>                                          libxl_domain_build_info *b_info)
>  {
> @@ -214,6 +257,9 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>  
>      libxl__arch_domain_build_info_acpi_setdefault(b_info);
>  
> +    if (libxl__viommu_set_default(gc, b_info))
> +        return ERROR_FAIL;
> +
>      switch (b_info->type) {
>      case LIBXL_DOMAIN_TYPE_HVM:
>          if (b_info->shadow_memkb == LIBXL_MEMKB_DEFAULT)
> @@ -890,6 +936,12 @@ static void initiate_domain_create(libxl__egc *egc,
>          goto error_out;
>      }
>  
> +    if (d_config->b_info.num_viommus > 1) {
> +        ret = ERROR_INVAL;
> +        LOGD(ERROR, domid, "Cannot support multiple vIOMMUs");
> +        goto error_out;
> +    }

Er, you already have this check in libxl__viommu_set_default, and in
any case I would just rely on the hypervisor failing to create more
than one vIOMMU per domain, rather than adding the same check here.

> +
>      ret = libxl__domain_create_info_setdefault(gc, &d_config->c_info);
>      if (ret) {
>          LOGD(ERROR, domid, "Unable to set domain create info defaults");
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 173d70a..286c960 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -450,6 +450,17 @@ libxl_altp2m_mode = Enumeration("altp2m_mode", [
>      (3, "limited"),
>      ], init_val = "LIBXL_ALTP2M_MODE_DISABLED")
>  
> +libxl_viommu_type = Enumeration("viommu_type", [
> +    (1, "intel_vtd"),
> +    ])
> +
> +libxl_viommu_info = Struct("viommu_info", [
> +    ("type",            libxl_viommu_type),
> +    ("intremap",        libxl_defbool),
> +    ("cap",             uint64),
> +    ("base_addr",       uint64),
> +    ])
> +
>  libxl_domain_build_info = Struct("domain_build_info",[
>      ("max_vcpus",       integer),
>      ("avail_vcpus",     libxl_bitmap),
> @@ -506,6 +517,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>      # 65000 which is reserved by the toolstack.
>      ("device_tree",      string),
>      ("acpi",             libxl_defbool),
> +    ("viommu",           Array(libxl_viommu_info, "num_viommus")),
>      ("u", KeyedUnion(None, libxl_domain_type, "type",
>                  [("hvm", Struct(None, [("firmware",         string),
>                                         ("bios",             libxl_bios_type),
> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 02ddd2e..34f8128 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -804,6 +804,38 @@ int parse_usbdev_config(libxl_device_usbdev *usbdev, char *token)
>      return 0;
>  }
>  
> +/* Parses viommu data and adds info into viommu
> + * Returns 1 if the input doesn't form a valid viommu
> + * or parsed values are not correct. Successful parse returns 0 */
> +static int parse_viommu_config(libxl_viommu_info *viommu, const char *info)
> +{
> +    char *ptr, *oparg, *saveptr = NULL, *buf = xstrdup(info);
> +
> +    ptr = strtok_r(buf, ",", &saveptr);
> +    if (MATCH_OPTION("type", ptr, oparg)) {
> +        if (!strcmp(oparg, "intel_vtd")) {
> +            viommu->type = LIBXL_VIOMMU_TYPE_INTEL_VTD;
> +        } else {
> +            fprintf(stderr, "Invalid viommu type: %s\n", oparg);
> +            return 1;
> +        }
> +    } else {
> +        fprintf(stderr, "viommu type should be set first: %s\n", oparg);
> +        return 1;
> +    }
> +
> +    for (ptr = strtok_r(NULL, ",", &saveptr); ptr;
> +         ptr = strtok_r(NULL, ",", &saveptr)) {
> +        if (MATCH_OPTION("intremap", ptr, oparg)) {
> +            libxl_defbool_set(&viommu->intremap, !!strtoul(oparg, NULL, 0));

No need for the !!.

> +        } else {
> +            fprintf(stderr, "Unknown string `%s' in viommu spec\n", ptr);
> +            return 1;
> +        }
> +    }
> +    return 0;
> +}
> +
>  void parse_config_data(const char *config_source,
>                         const char *config_data,
>                         int config_len,
> @@ -813,7 +845,7 @@ void parse_config_data(const char *config_source,
>      long l, vcpus = 0;
>      XLU_Config *config;
>      XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms,
> -                   *usbctrls, *usbdevs, *p9devs;
> +                   *usbctrls, *usbdevs, *p9devs, *iommus;
>      XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian, *dtdevs,
>                     *mca_caps;
>      int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps;
> @@ -1037,6 +1069,24 @@ void parse_config_data(const char *config_source,
>      xlu_cfg_get_defbool(config, "driver_domain", &c_info->driver_domain, 0);
>      xlu_cfg_get_defbool(config, "acpi", &b_info->acpi, 0);
>  
> +    if (!xlu_cfg_get_list (config, "viommu", &iommus, 0, 0)) {
> +        b_info->num_viommus = 0;
> +        b_info->viommu = NULL;

This should not be needed, num_viommus and viommu should already be
zeroed by the initialize functions.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 7/29] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-09-22  3:01 ` [PATCH V3 7/29] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
@ 2017-10-19 10:00   ` Roger Pau Monné
  2017-10-20  1:44     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 10:00 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:48PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> A new logic is added to build ACPI DMAR table in tool stack for a guest
> with one virtual VTD and pass through it to guest via existing mechanism. If
> there already are ACPI tables needed to pass through, we joint the tables.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> ---
> v3:
>  - build dmar and initialize related acpi_modules struct in
>  libxl_x86_acpi.c, keeping in accordance with pvh.
> 
> ---
>  tools/libxl/libxl_x86.c      |  3 +-
>  tools/libxl/libxl_x86_acpi.c | 98 ++++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 96 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> index 455f6f0..23c9a55 100644
> --- a/tools/libxl/libxl_x86.c
> +++ b/tools/libxl/libxl_x86.c
> @@ -381,8 +381,7 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>  {
>      int rc = 0;
>  
> -    if ((info->type == LIBXL_DOMAIN_TYPE_HVM) &&
> -        (info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)) {
> +    if (info->type == LIBXL_DOMAIN_TYPE_HVM) {

You will have to rebase this on top of current staging,
LIBXL_DEVICE_MODEL_VERSION_NONE is now gone.

>          rc = libxl__dom_load_acpi(gc, info, dom);
>          if (rc != 0)
>              LOGE(ERROR, "libxl_dom_load_acpi failed");
> diff --git a/tools/libxl/libxl_x86_acpi.c b/tools/libxl/libxl_x86_acpi.c
> index 1761756..adf02f4 100644
> --- a/tools/libxl/libxl_x86_acpi.c
> +++ b/tools/libxl/libxl_x86_acpi.c
> @@ -16,6 +16,7 @@
>  #include "libxl_arch.h"
>  #include <xen/hvm/hvm_info_table.h>
>  #include <xen/hvm/e820.h>
> +#include "libacpi/acpi2_0.h"
>  #include "libacpi/libacpi.h"
>  
>  #include <xc_dom.h>
> @@ -161,9 +162,9 @@ out:
>      return rc;
>  }
>  
> -int libxl__dom_load_acpi(libxl__gc *gc,
> -                         const libxl_domain_build_info *b_info,
> -                         struct xc_dom_image *dom)
> +static int libxl__dom_load_acpi_pvh(libxl__gc *gc,
> +                                    const libxl_domain_build_info *b_info,
> +                                    struct xc_dom_image *dom)
>  {
>      struct acpi_config config = {0};
>      struct libxl_acpi_ctxt libxl_ctxt;
> @@ -236,6 +237,97 @@ out:
>      return rc;
>  }
>  
> +static void *acpi_memalign(struct acpi_ctxt *ctxt, uint32_t size,
> +                           uint32_t align)
> +{
> +    int ret;
> +    void *ptr;
> +
> +    ret = posix_memalign(&ptr, align, size);
> +    if (ret != 0 || !ptr)
> +        return NULL;
> +
> +    return ptr;
> +}
> +
> +/*
> + * For hvm, we don't need build acpi in libxl. Instead, it's built in hvmloader.
> + * But if one hvm has virtual VTD(s), we build DMAR table for it and joint this
> + * table with existing content in acpi_modules in order to employ HVM
> + * firmware pass-through mechanism to pass-through DMAR table.
> + */
> +static int libxl__dom_load_acpi_hvm(libxl__gc *gc,
> +                                    const libxl_domain_build_info *b_info,
> +                                    struct xc_dom_image *dom)
> +{

AFAICT there's some code duplication between libxl__dom_load_acpi_hvm
and libxl__dom_load_acpi_pvh, isn't there a chance you could put this
in a common function?

> +    struct acpi_config config = { 0 };
> +    struct acpi_ctxt ctxt;
> +    void *table;
> +    uint32_t len;
> +
> +    if ((b_info->type != LIBXL_DOMAIN_TYPE_HVM) ||
> +        (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) ||
> +        (b_info->num_viommus != 1) ||
> +        (b_info->viommu[0].type != LIBXL_VIOMMU_TYPE_INTEL_VTD))
> +        return 0;
> +
> +    ctxt.mem_ops.alloc = acpi_memalign;
> +    ctxt.mem_ops.v2p = virt_to_phys;
> +    ctxt.mem_ops.free = acpi_mem_free;
> +
> +    if (libxl_defbool_val(b_info->viommu[0].intremap))
> +        config.iommu_intremap_supported = true;
> +    /* x2apic is always enabled since in no case we must disable it */
> +    config.iommu_x2apic_supported = true;
> +    config.iommu_base_addr = b_info->viommu[0].base_addr;

I don't see libxl__dom_load_acpi_pvh setting any of the vIOMMU fields.

> +
> +    /* IOAPIC id and PSEUDO BDF */
> +    config.ioapic_id = 1;
> +    config.ioapic_bus = 0xff;
> +    config.ioapic_devfn = 0x0;
> +
> +    config.host_addr_width = 39;
> +
> +    table = construct_dmar(&ctxt, &config);
> +    if ( !table )
> +        return ERROR_NOMEM;
> +    len = ((struct acpi_header *)table)->length;
> +
> +    if (len) {
> +        libxl__ptr_add(gc, table);
> +        if (!dom->acpi_modules[0].data) {
> +            dom->acpi_modules[0].data = table;
> +            dom->acpi_modules[0].length = len;
> +        } else {
> +            /* joint tables */
> +            void *newdata;
> +
> +            newdata = libxl__malloc(gc, len + dom->acpi_modules[0].length);
> +            memcpy(newdata, dom->acpi_modules[0].data,
> +                   dom->acpi_modules[0].length);
> +            memcpy(newdata + dom->acpi_modules[0].length, table, len);
> +
> +            free(dom->acpi_modules[0].data);
> +            dom->acpi_modules[0].data = newdata;
> +            dom->acpi_modules[0].length += len;
> +        }
> +    }
> +    return 0;
> +}
> +
> +int libxl__dom_load_acpi(libxl__gc *gc,
> +                         const libxl_domain_build_info *b_info,
> +                         struct xc_dom_image *dom)
> +{
> +
> +    if (b_info->type != LIBXL_DOMAIN_TYPE_HVM)
> +        return 0;

Keep in mind a new PVH domain type has been introduced recently in
libxl, you will have to change this to b_info->type == LIBXL_DOMAIN_TYPE_PV.

> +
> +    if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)
> +        return libxl__dom_load_acpi_pvh(gc, b_info, dom);
> +    else
> +        return libxl__dom_load_acpi_hvm(gc, b_info, dom);
> +}
>  /*
>   * Local variables:
>   * mode: C
> -- 
> 1.8.3.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction
  2017-09-22  3:01 ` [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction Lan Tianyu
@ 2017-10-19 10:13   ` Roger Pau Monné
  2017-10-26 12:05     ` Wei Liu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 10:13 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:49PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> If guest is configured to have a vIOMMU, create it during domain construction.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> ---
> v3:
>  - Remove the process of querying capabilities.
> ---
>  tools/libxl/libxl_x86.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> index 23c9a55..25cae5f 100644
> --- a/tools/libxl/libxl_x86.c
> +++ b/tools/libxl/libxl_x86.c
> @@ -341,8 +341,25 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
>      if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM) {
>          unsigned long shadow = DIV_ROUNDUP(d_config->b_info.shadow_memkb,
>                                             1024);
> +        int i;

unsigned int.

> +
>          xc_shadow_control(ctx->xch, domid, XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION,
>                            NULL, 0, &shadow, 0, NULL);
> +
> +        for (i = 0; i < d_config->b_info.num_viommus; i++) {
> +            uint32_t id;
> +            libxl_viommu_info *viommu = d_config->b_info.viommu + i;

Since this is an array I would rather prefer that you use
&d_config->b_info.viommu[i].

> +
> +            if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> +                ret = xc_viommu_create(ctx->xch, domid, VIOMMU_TYPE_INTEL_VTD,
> +                                       viommu->base_addr, viommu->cap, &id);

As said in another patch: this will break compilation because
xc_viommu_create is introduced in patch 9.

Please organize the patches in a way that the code always compiles and
works fine. Keep in mind that the Xen tree should be bisectable
always.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 9/29] tools/libxc: Add viommu operations in libxc
  2017-09-22  3:01 ` [PATCH V3 9/29] tools/libxc: Add viommu operations in libxc Lan Tianyu
@ 2017-10-19 10:17   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 10:17 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:50PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This patch adds XEN_DOMCTL_viommu_op hypercall. This hypercall
> comprises two sub-commands:

Patch description doesn't match actual code. This patch doesn't add
any new hypercalls, it just adds libxc helpers for
XEN_DOMCTL_viommu_op.

> - create(): create a vIOMMU in Xen, given viommu type, register-set
>             location and capabilities
> - destroy(): destroy a vIOMMU specified by viommu_id
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
> v3:
>  - Remove API for querying viommu capabilities
>  - Remove pointless cast
>  - Polish commit message
>  - Coding style
> ---
>  tools/libxc/Makefile          |  1 +
>  tools/libxc/include/xenctrl.h |  4 +++
>  tools/libxc/xc_viommu.c       | 64 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 69 insertions(+)
>  create mode 100644 tools/libxc/xc_viommu.c
> 
> diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
> index 9a019e8..7d8c4b4 100644
> --- a/tools/libxc/Makefile
> +++ b/tools/libxc/Makefile
> @@ -51,6 +51,7 @@ CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
>  CTRL_SRCS-y       += xc_evtchn_compat.c
>  CTRL_SRCS-y       += xc_gnttab_compat.c
>  CTRL_SRCS-y       += xc_devicemodel_compat.c
> +CTRL_SRCS-y       += xc_viommu.c
>  
>  GUEST_SRCS-y :=
>  GUEST_SRCS-y += xg_private.c xc_suspend.c
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 43151cb..bedca1f 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2501,6 +2501,10 @@ enum xc_static_cpu_featuremask {
>  const uint32_t *xc_get_static_cpu_featuremask(enum xc_static_cpu_featuremask);
>  const uint32_t *xc_get_feature_deep_deps(uint32_t feature);
>  
> +int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
> +                     uint64_t base_addr, uint64_t cap, uint32_t *viommu_id);
> +int xc_viommu_destroy(xc_interface *xch, domid_t dom, uint32_t viommu_id);
> +
>  #endif
>  
>  int xc_livepatch_upload(xc_interface *xch,
> diff --git a/tools/libxc/xc_viommu.c b/tools/libxc/xc_viommu.c
> new file mode 100644
> index 0000000..17507c5
> --- /dev/null
> +++ b/tools/libxc/xc_viommu.c
> @@ -0,0 +1,64 @@
> +/*
> + * xc_viommu.c
> + *
> + * viommu related API functions.
> + *
> + * Copyright (C) 2017 Intel Corporation
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License, version 2.1, as published by the Free Software Foundation.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "xc_private.h"
> +
> +int xc_viommu_create(xc_interface *xch, domid_t dom, uint64_t type,
> +                     uint64_t base_addr, uint64_t cap, uint32_t *viommu_id)
> +{
> +    int rc;
> +

Extra newline.

> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_viommu_op;
> +    domctl.domain = dom;
> +    domctl.u.viommu_op.cmd = XEN_DOMCTL_create_viommu;
> +    domctl.u.viommu_op.u.create.viommu_type = type;

Please remove the "viommu_" prefix from the field, it's not needed
and just using "type" is already clear.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 10/29] vtd: add and align register definitions
  2017-09-22  3:01 ` [PATCH V3 10/29] vtd: add and align register definitions Lan Tianyu
@ 2017-10-19 10:21   ` Roger Pau Monné
  2017-10-20  1:47     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 10:21 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:51PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> No functional changes.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Would have been nice to maybe split this into two, one patch that
simply fixes the alignment and another one that introduces the new
defines (or even introduce the new defines when they are actually
needed).

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM
  2017-09-22  3:01 ` [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
@ 2017-10-19 11:20   ` Roger Pau Monné
  2017-10-20  2:46     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 11:20 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:52PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This patch adds create/destroy function for the emulated VTD
> and adapts it to the common VIOMMU abstraction.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/Makefile |   7 +-
>  xen/drivers/passthrough/vtd/iommu.h  |  23 +++++-
>  xen/drivers/passthrough/vtd/vvtd.c   | 147 +++++++++++++++++++++++++++++++++++
>  3 files changed, 170 insertions(+), 7 deletions(-)
>  create mode 100644 xen/drivers/passthrough/vtd/vvtd.c
> 
> diff --git a/xen/drivers/passthrough/vtd/Makefile b/xen/drivers/passthrough/vtd/Makefile
> index f302653..163c7fe 100644
> --- a/xen/drivers/passthrough/vtd/Makefile
> +++ b/xen/drivers/passthrough/vtd/Makefile
> @@ -1,8 +1,9 @@
>  subdir-$(CONFIG_X86) += x86
>  
> -obj-y += iommu.o
>  obj-y += dmar.o
> -obj-y += utils.o
> -obj-y += qinval.o
>  obj-y += intremap.o
> +obj-y += iommu.o
> +obj-y += qinval.o
>  obj-y += quirks.o
> +obj-y += utils.o

Why do you need to shuffle the list above?

Also I'm not sure the Intel vIOMMU implementation should live here. As
you can see the path is:

xen/drivers/passthrough/vtd/

The vIOMMU is not tied to passthrough at all, so I would rather place
it in:

xen/drivers/vvtd/

Or maybe you can create something like:

xen/drivers/viommu/

So that all vIOMMU implementations can share some code.

> +obj-$(CONFIG_VIOMMU) += vvtd.o
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index d7e433e..ef038c9 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -66,6 +66,12 @@
>  #define VER_MAJOR(v)        (((v) & 0xf0) >> 4)
>  #define VER_MINOR(v)        ((v) & 0x0f)
>  
> +/* Supported Adjusted Guest Address Widths */
> +#define DMA_CAP_SAGAW_SHIFT         8
> + /* 39-bit AGAW, 3-level page-table */
> +#define DMA_CAP_SAGAW_39bit         (0x2ULL << DMA_CAP_SAGAW_SHIFT)
> +#define DMA_CAP_ND_64K              6ULL
> +
>  /*
>   * Decoding Capability Register
>   */
> @@ -74,6 +80,7 @@
>  #define cap_write_drain(c)     (((c) >> 54) & 1)
>  #define cap_max_amask_val(c)   (((c) >> 48) & 0x3f)
>  #define cap_num_fault_regs(c)  ((((c) >> 40) & 0xff) + 1)
> +#define cap_set_num_fault_regs(c)  ((((c) - 1) & 0xff) << 40)
>  #define cap_pgsel_inv(c)       (((c) >> 39) & 1)
>  
>  #define cap_super_page_val(c)  (((c) >> 34) & 0xf)
> @@ -85,11 +92,13 @@
>  #define cap_sps_1tb(c)         ((c >> 37) & 1)
>  
>  #define cap_fault_reg_offset(c)    ((((c) >> 24) & 0x3ff) * 16)
> +#define cap_set_fault_reg_offset(c) ((((c) / 16) & 0x3ff) << 24 )
>  
>  #define cap_isoch(c)        (((c) >> 23) & 1)
>  #define cap_qos(c)        (((c) >> 22) & 1)
>  #define cap_mgaw(c)        ((((c) >> 16) & 0x3f) + 1)
> -#define cap_sagaw(c)        (((c) >> 8) & 0x1f)
> +#define cap_set_mgaw(c)     ((((c) - 1) & 0x3f) << 16)
> +#define cap_sagaw(c)        (((c) >> DMA_CAP_SAGAW_SHIFT) & 0x1f)
>  #define cap_caching_mode(c)    (((c) >> 7) & 1)
>  #define cap_phmr(c)        (((c) >> 6) & 1)
>  #define cap_plmr(c)        (((c) >> 5) & 1)
> @@ -104,10 +113,16 @@
>  #define ecap_niotlb_iunits(e)    ((((e) >> 24) & 0xff) + 1)
>  #define ecap_iotlb_offset(e)     ((((e) >> 8) & 0x3ff) * 16)
>  #define ecap_coherent(e)         ((e >> 0) & 0x1)
> -#define ecap_queued_inval(e)     ((e >> 1) & 0x1)
> +#define DMA_ECAP_QI_SHIFT        1
> +#define DMA_ECAP_QI              (1ULL << DMA_ECAP_QI_SHIFT)
> +#define ecap_queued_inval(e)     ((e >> DMA_ECAP_QI_SHIFT) & 0x1)

Looks like this could be based on MASK_EXTR instead, but seeing how
the file is full of open-coded mask extracts I'm not sure it's worth
it anymore.

>  #define ecap_dev_iotlb(e)        ((e >> 2) & 0x1)
> -#define ecap_intr_remap(e)       ((e >> 3) & 0x1)
> -#define ecap_eim(e)              ((e >> 4) & 0x1)
> +#define DMA_ECAP_IR_SHIFT        3
> +#define DMA_ECAP_IR              (1ULL << DMA_ECAP_IR_SHIFT)
> +#define ecap_intr_remap(e)       ((e >> DMA_ECAP_IR_SHIFT) & 0x1)
> +#define DMA_ECAP_EIM_SHIFT       4
> +#define DMA_ECAP_EIM             (1ULL << DMA_ECAP_EIM_SHIFT)
> +#define ecap_eim(e)              ((e >> DMA_ECAP_EIM_SHIFT) & 0x1)

Maybe worth placing all the DMA_ECAP_* defines in a separate section?
Seems like how it's done for other features like DMA_FSTS or
DMA_CCMD.

>  #define ecap_cache_hints(e)      ((e >> 5) & 0x1)
>  #define ecap_pass_thru(e)        ((e >> 6) & 0x1)
>  #define ecap_snp_ctl(e)          ((e >> 7) & 0x1)
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> new file mode 100644
> index 0000000..c851ec7
> --- /dev/null
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -0,0 +1,147 @@
> +/*
> + * vvtd.c
> + *
> + * virtualize VTD for HVM.
> + *
> + * Copyright (C) 2017 Chao Gao, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/domain_page.h>
> +#include <xen/sched.h>
> +#include <xen/types.h>
> +#include <xen/viommu.h>
> +#include <xen/xmalloc.h>
> +#include <asm/current.h>
> +#include <asm/hvm/domain.h>
> +#include <asm/page.h>
> +
> +#include "iommu.h"
> +
> +/* Supported capabilities by vvtd */
> +unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;

static?

Or even better, why is this not a define like VIOMMU_MAX_CAPS or
similar.

> +
> +union hvm_hw_vvtd_regs {
> +    uint32_t data32[256];
> +    uint64_t data64[128];
> +};

Do you really need to store all the register space instead of only
storing specific registers?

> +
> +struct vvtd {
> +    /* Address range of remapping hardware register-set */
> +    uint64_t base_addr;
> +    uint64_t length;

The length field doesn't seem to be used below.

> +    /* Point back to the owner domain */
> +    struct domain *domain;
> +    union hvm_hw_vvtd_regs *regs;

Does this need to be a pointer?

> +    struct page_info *regs_page;
> +};
> +
> +static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
> +{
> +    vtd->regs->data32[reg/sizeof(uint32_t)] = value;
> +}
> +
> +static inline uint32_t vvtd_get_reg(struct vvtd *vtd, uint32_t reg)
> +{
> +    return vtd->regs->data32[reg/sizeof(uint32_t)];
> +}
> +
> +static inline void vvtd_set_reg_quad(struct vvtd *vtd, uint32_t reg,
> +                                     uint64_t value)
> +{
> +    vtd->regs->data64[reg/sizeof(uint64_t)] = value;
> +}
> +
> +static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
> +{
> +    return vtd->regs->data64[reg/sizeof(uint64_t)];
> +}
> +
> +static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
> +{
> +    uint64_t cap = cap_set_num_fault_regs(1ULL) |
> +                   cap_set_fault_reg_offset(0x220ULL) |
> +                   cap_set_mgaw(39ULL) | DMA_CAP_SAGAW_39bit |
> +                   DMA_CAP_ND_64K;
> +    uint64_t ecap = DMA_ECAP_IR | DMA_ECAP_EIM | DMA_ECAP_QI;
> +
> +    vvtd_set_reg(vvtd, DMAR_VER_REG, 0x10UL);
> +    vvtd_set_reg_quad(vvtd, DMAR_CAP_REG, cap);
> +    vvtd_set_reg_quad(vvtd, DMAR_ECAP_REG, ecap);
> +    vvtd_set_reg(vvtd, DMAR_FECTL_REG, 0x80000000UL);
> +    vvtd_set_reg(vvtd, DMAR_IECTL_REG, 0x80000000UL);
> +}
> +
> +static int vvtd_create(struct domain *d, struct viommu *viommu)
> +{
> +    struct vvtd *vvtd;
> +    int ret;
> +
> +    if ( !is_hvm_domain(d) || (viommu->base_address & (PAGE_SIZE - 1)) ||
> +        (~vvtd_caps & viommu->caps) )
> +        return -EINVAL;
> +
> +    ret = -ENOMEM;
> +    vvtd = xzalloc_bytes(sizeof(struct vvtd));
> +    if ( !vvtd )
> +        return ret;
> +
> +    vvtd->regs_page = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !vvtd->regs_page )
> +        goto out1;
> +
> +    vvtd->regs = __map_domain_page_global(vvtd->regs_page);
> +    if ( !vvtd->regs )
> +        goto out2;
> +    clear_page(vvtd->regs);

Not sure why vvtd->regs needs to be a pointer, and why it needs to use
a full page. AFAICT the size of hvm_hw_vvtd_regs is 1024B, so you are
wasting 3/4 of a page.

> +
> +    vvtd_reset(vvtd, viommu->caps);
> +    vvtd->base_addr = viommu->base_address;
> +    vvtd->domain = d;
> +
> +    viommu->priv = vvtd;
> +
> +    return 0;
> +
> + out2:
> +    free_domheap_page(vvtd->regs_page);
> + out1:
> +    xfree(vvtd);
> +    return ret;

You should try to avoid using labels. I think this can be solved by
not allocating a separate page for regs.

> +}
> +
> +static int vvtd_destroy(struct viommu *viommu)
> +{
> +    struct vvtd *vvtd = viommu->priv;
> +
> +    if ( vvtd )
> +    {
> +        unmap_domain_page_global(vvtd->regs);
> +        free_domheap_page(vvtd->regs_page);
> +        xfree(vvtd);
> +    }
> +    return 0;
> +}
> +
> +struct viommu_ops vvtd_hvm_vmx_ops = {
> +    .create = vvtd_create,
> +    .destroy = vvtd_destroy
> +};
> +
> +static int vvtd_register(void)
> +{
> +    viommu_register_type(VIOMMU_TYPE_INTEL_VTD, &vvtd_hvm_vmx_ops);
> +    return 0;
> +}
> +__initcall(vvtd_register);

As commented in another patch I think the vIOMMU types should be
registered using a method similar to REGISTER_SCHEDULER.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
  2017-10-19  8:49       ` Roger Pau Monné
@ 2017-10-19 11:28         ` Jan Beulich
  2017-10-24  7:16           ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2017-10-19 11:28 UTC (permalink / raw)
  To: Roger Pau Monné, Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, chao.gao

>>> On 19.10.17 at 10:49, <roger.pau@citrix.com> wrote:
> On Thu, Oct 19, 2017 at 10:26:36AM +0800, Lan Tianyu wrote:
>> Hi Roger:
>>      Thanks for review.
>> 
>> On 2017年10月18日 21:26, Roger Pau Monné wrote:
>> > On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
>> >> +Xen hypervisor vIOMMU command
>> >> +=============================
>> >> +Introduce vIOMMU command "viommu=1" to enable vIOMMU function in 
> hypervisor.
>> >> +It's default disabled.
>> > 
>> > Hm, I'm not sure we really need this. At the end viommu will be
>> > disabled by default for guests, unless explicitly enabled in the
>> > config file.
>> 
>> This is according to Jan's early comments on RFC patch
>> https://patchwork.kernel.org/patch/9733869/.
>> 
>> "It's actually a question whether in our current scheme a Kconfig
>> option is appropriate here in the first place. I'd rather see this be
>> an always built feature which needs enabling on the command line
>> for the time being."
> 
> So if I read this correctly Jan wanted you to ditch the Kconfig option
> and instead rely on the command line option to enable/disable it.

Yes.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-10-19  8:09     ` Lan Tianyu
  2017-10-19  8:40       ` Roger Pau Monné
@ 2017-10-19 11:31       ` Jan Beulich
  1 sibling, 0 replies; 108+ messages in thread
From: Jan Beulich @ 2017-10-19 11:31 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Roger Pau Monné,
	Chao Gao

>>> On 19.10.17 at 10:09, <tianyu.lan@intel.com> wrote:
> On 2017年10月18日 23:12, Roger Pau Monné wrote:
>> On Thu, Sep 21, 2017 at 11:01:46PM -0400, Lan Tianyu wrote:
>>> --- a/tools/libacpi/libacpi.h
>>> +++ b/tools/libacpi/libacpi.h
>>> @@ -20,6 +20,8 @@
>>>  #ifndef __LIBACPI_H__
>>>  #define __LIBACPI_H__
>>>  
>>> +#include <stdbool.h>
>> 
>> I'm quite sure you shouldn't add this here, see how headers are added
>> using LIBACPI_STDUTILS.
> 
> We may replace bool with uint8_t xxx:1 to avoid introduce new head file.

Please don't - if you mean boolean, please use bool. Roger's
remark, aiui, wasn't about you using bool, but how you do
the header inclusion.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD
  2017-09-22  3:01 ` [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
@ 2017-10-19 11:34   ` Roger Pau Monné
  2017-10-20  2:58     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 11:34 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:53PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This patch adds VVTD MMIO handler to deal with MMIO access.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/vvtd.c | 91 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 91 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index c851ec7..a3002c3 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -47,6 +47,29 @@ struct vvtd {
>      struct page_info *regs_page;
>  };
>  
> +/* Setting viommu_verbose enables debugging messages of vIOMMU */
> +bool __read_mostly viommu_verbose;
> +boolean_runtime_param("viommu_verbose", viommu_verbose);
> +
> +#ifndef NDEBUG
> +#define vvtd_info(fmt...) do {                    \
> +    if ( viommu_verbose )                         \
> +        gprintk(XENLOG_G_INFO, ## fmt);           \

If you use gprintk you should use XENLOG_INFO, the '_G_' variants are
only used with plain printk.

> +} while(0)
> +#define vvtd_debug(fmt...) do {                   \
> +    if ( viommu_verbose && printk_ratelimit() )   \

Not sure why you need printk_ratelimit, XENLOG_G_DEBUG is already
rate-limited.

> +        printk(XENLOG_G_DEBUG fmt);               \

Any reason why vvtd_info uses gprintk and here you use printk?

> +} while(0)
> +#else
> +#define vvtd_info(fmt...) do {} while(0)
> +#define vvtd_debug(fmt...) do {} while(0)

No need for 'fmt...' just '...' will suffice since you are discarding
the parameters anyway.

> +#endif
> +
> +struct vvtd *domain_vvtd(struct domain *d)
> +{
> +    return (d->viommu) ? d->viommu->priv : NULL;

Unneeded parentheses around d->viommu.

Also, it seems wring to call domain_vvtd with !d->viommu. So I think
this helper should just be removed, and d->viommu->priv fetched
directly.

> +}
> +
>  static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
>  {
>      vtd->regs->data32[reg/sizeof(uint32_t)] = value;
> @@ -68,6 +91,73 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
>      return vtd->regs->data64[reg/sizeof(uint64_t)];
>  }
>  
> +static int vvtd_in_range(struct vcpu *v, unsigned long addr)
> +{
> +    struct vvtd *vvtd = domain_vvtd(v->domain);
> +
> +    if ( vvtd )
> +        return (addr >= vvtd->base_addr) &&
> +               (addr < vvtd->base_addr + PAGE_SIZE);

So the register set covers a PAGE_SIZE, but hvm_hw_vvtd_regs only
covers from 0 to 1024B, it seems like there's something wrong here...

> +    return 0;
> +}
> +
> +static int vvtd_read(struct vcpu *v, unsigned long addr,
> +                     unsigned int len, unsigned long *pval)
> +{
> +    struct vvtd *vvtd = domain_vvtd(v->domain);
> +    unsigned int offset = addr - vvtd->base_addr;
> +
> +    vvtd_info("Read offset %x len %d\n", offset, len);
> +
> +    if ( (len != 4 && len != 8) || (offset & (len - 1)) )

What value does hardware return when performing unaligned reads or
reads with wrong size?

Here you return with pval not set, which is dangerous.

> +        return X86EMUL_OKAY;
> +
> +    if ( len == 4 )
> +        *pval = vvtd_get_reg(vvtd, offset);
> +    else
> +        *pval = vvtd_get_reg_quad(vvtd, offset);

...yet here you don't check for offset < 1024.

> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write(struct vcpu *v, unsigned long addr,
> +                      unsigned int len, unsigned long val)
> +{
> +    struct vvtd *vvtd = domain_vvtd(v->domain);
> +    unsigned int offset = addr - vvtd->base_addr;
> +
> +    vvtd_info("Write offset %x len %d val %lx\n", offset, len, val);
> +
> +    if ( (len != 4 && len != 8) || (offset & (len - 1)) )
> +        return X86EMUL_OKAY;
> +
> +    if ( len == 4 )
> +    {
> +        switch ( offset )
> +        {
> +        case DMAR_IEDATA_REG:
> +        case DMAR_IEADDR_REG:
> +        case DMAR_IEUADDR_REG:
> +        case DMAR_FEDATA_REG:
> +        case DMAR_FEADDR_REG:
> +        case DMAR_FEUADDR_REG:
> +            vvtd_set_reg(vvtd, offset, val);

Hm, so you are using a full page when you only care for 6 4B
registers? Seem like quite of a waste of memory.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  2017-09-22  3:01 ` [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
@ 2017-10-19 11:56   ` Roger Pau Monné
  2017-10-20  4:08     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 11:56 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:54PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Software sets this field to set/update the interrupt remapping table pointer
> used by hardware. The interrupt remapping table pointer is specified through
> the Interrupt Remapping Table Address (IRTA_REG) register.
> 
> This patch emulates this operation and adds some new fields in VVTD to track
> info (e.g. the table's gfn and max supported entries) of interrupt remapping
> table.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> ---
> v3:
>  - ignore unaligned r/w of vt-d hardware registers and return X86EMUL_OK
> ---
>  xen/drivers/passthrough/vtd/iommu.h | 12 ++++++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 69 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 80 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index ef038c9..a0d5ec8 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -153,6 +153,8 @@
>  #define DMA_GCMD_IRE    (((u64)1) << 25)
>  #define DMA_GCMD_SIRTP  (((u64)1) << 24)
>  #define DMA_GCMD_CFI    (((u64)1) << 23)
> +/* mask of one-shot bits */
> +#define DMA_GCMD_ONE_SHOT_MASK 0x96ffffff 

Trailing white space.

>  
>  /* GSTS_REG */
>  #define DMA_GSTS_TES    (((u64)1) << 31)
> @@ -162,9 +164,17 @@
>  #define DMA_GSTS_WBFS   (((u64)1) << 27)
>  #define DMA_GSTS_QIES   (((u64)1) <<26)
>  #define DMA_GSTS_IRES   (((u64)1) <<25)
> -#define DMA_GSTS_SIRTPS (((u64)1) << 24)
> +#define DMA_GSTS_SIRTPS_SHIFT   24
> +#define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_SHIFT)
>  #define DMA_GSTS_CFIS   (((u64)1) <<23)
>  
> +/* IRTA_REG */
> +/* The base of 4KB aligned interrupt remapping table */
> +#define DMA_IRTA_ADDR(val)      ((val) & ~0xfffULL)
> +/* The size of remapping table is 2^(x+1), where x is the size field in IRTA */
> +#define DMA_IRTA_S(val)         (val & 0xf)
> +#define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))
> +
>  /* PMEN_REG */
>  #define DMA_PMEN_EPM    (((u32)1) << 31)
>  #define DMA_PMEN_PRS    (((u32)1) << 0)
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index a3002c3..6736956 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -32,6 +32,13 @@
>  /* Supported capabilities by vvtd */
>  unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
>  
> +struct hvm_hw_vvtd_status {
> +    uint32_t eim_enabled : 1;

bool maybe?

> +    uint32_t irt_max_entry;
> +    /* Interrupt remapping table base gfn */
> +    uint64_t irt;

If it's a gfn, use gfn_t as the type.

> +};
> +
>  union hvm_hw_vvtd_regs {
>      uint32_t data32[256];
>      uint64_t data64[128];
> @@ -43,6 +50,8 @@ struct vvtd {
>      uint64_t length;
>      /* Point back to the owner domain */
>      struct domain *domain;
> +
> +    struct hvm_hw_vvtd_status status;

Why you need a sub-struct for this, can't this just be placed inside
of hvm_hw_vvtd_regs directly?

>      union hvm_hw_vvtd_regs *regs;
>      struct page_info *regs_page;
>  };
> @@ -70,6 +79,11 @@ struct vvtd *domain_vvtd(struct domain *d)
>      return (d->viommu) ? d->viommu->priv : NULL;
>  }
>  
> +static inline void vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
> +{
> +    __set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
> +}
> +
>  static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
>  {
>      vtd->regs->data32[reg/sizeof(uint32_t)] = value;
> @@ -91,6 +105,44 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
>      return vtd->regs->data64[reg/sizeof(uint64_t)];
>  }
>  
> +static void vvtd_handle_gcmd_sirtp(struct vvtd *vvtd, uint32_t val)
> +{
> +    uint64_t irta = vvtd_get_reg_quad(vvtd, DMAR_IRTA_REG);
> +
> +    if ( !(val & DMA_GCMD_SIRTP) )
> +        return;
> +
> +    vvtd->status.irt = DMA_IRTA_ADDR(irta) >> PAGE_SHIFT;
> +    vvtd->status.irt_max_entry = DMA_IRTA_SIZE(irta);
> +    vvtd->status.eim_enabled = !!(irta & IRTA_EIME);

If you use a bool you don't need the '!!'.

> +    vvtd_info("Update IR info (addr=%lx eim=%d size=%d).",

The final '.' is unneeded IMHO.

> +              vvtd->status.irt, vvtd->status.eim_enabled,
> +              vvtd->status.irt_max_entry);
> +    vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_SIRTPS_SHIFT);
> +}
> +
> +static int vvtd_write_gcmd(struct vvtd *vvtd, uint32_t val)
> +{
> +    uint32_t orig = vvtd_get_reg(vvtd, DMAR_GSTS_REG);
> +    uint32_t changed;
> +
> +    orig = orig & DMA_GCMD_ONE_SHOT_MASK;   /* reset the one-shot bits */
> +    changed = orig ^ val;
> +
> +    if ( !changed )
> +        return X86EMUL_OKAY;
> +
> +    if ( changed & (changed - 1) )
> +        vvtd_info("Guest attempts to write %x to GCMD (current GSTS is %x)," 

Trailing white-space.

> +                  "it would lead to update multiple fields",

Also try to reduce the size of the message, so it can fit in a single
line.

> +                  val, orig);
> +
> +    if ( changed & DMA_GCMD_SIRTP )
> +        vvtd_handle_gcmd_sirtp(vvtd, val);
> +
> +    return X86EMUL_OKAY;
> +}
> +
>  static int vvtd_in_range(struct vcpu *v, unsigned long addr)
>  {
>      struct vvtd *vvtd = domain_vvtd(v->domain);
> @@ -135,12 +187,17 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>      {
>          switch ( offset )
>          {
> +        case DMAR_GCMD_REG:
> +            return vvtd_write_gcmd(vvtd, val);
> +
>          case DMAR_IEDATA_REG:
>          case DMAR_IEADDR_REG:
>          case DMAR_IEUADDR_REG:
>          case DMAR_FEDATA_REG:
>          case DMAR_FEADDR_REG:
>          case DMAR_FEUADDR_REG:
> +        case DMAR_IRTA_REG:
> +        case DMAR_IRTA_REG_HI:
>              vvtd_set_reg(vvtd, offset, val);
>              break;
>  
> @@ -148,6 +205,18 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>              break;
>          }
>      }
> +    else /* len == 8 */
> +    {
> +        switch ( offset )
> +        {
> +        case DMAR_IRTA_REG:
> +            vvtd_set_reg_quad(vvtd, DMAR_IRTA_REG, val);

I have kind of a generic comment regarding the handlers in general,
which I will just make here. Don't you need some kind of locking to
prevent concurrent read/write accesses to the registers?

Also the 'if' to handle different sized accesses to the same registers
seems quite cumbersome. I would think there's a better way to handle
this with a single switch statement.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 14/29] x86/vvtd: Enable Interrupt Remapping through GCMD
  2017-09-22  3:01 ` [PATCH V3 14/29] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
@ 2017-10-19 13:42   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 13:42 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:55PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Software writes this field to enable/disable interrupt reampping. This patch
> emulate IRES field of GCMD.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  3 ++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 30 +++++++++++++++++++++++++++++-
>  2 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index a0d5ec8..703726f 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -163,7 +163,8 @@
>  #define DMA_GSTS_AFLS   (((u64)1) << 28)
>  #define DMA_GSTS_WBFS   (((u64)1) << 27)
>  #define DMA_GSTS_QIES   (((u64)1) <<26)
> -#define DMA_GSTS_IRES   (((u64)1) <<25)
> +#define DMA_GSTS_IRES_SHIFT     25
> +#define DMA_GSTS_IRES   (((u64)1) << DMA_GSTS_IRES_SHIFT)
>  #define DMA_GSTS_SIRTPS_SHIFT   24
>  #define DMA_GSTS_SIRTPS (((u64)1) << DMA_GSTS_SIRTPS_SHIFT)
>  #define DMA_GSTS_CFIS   (((u64)1) <<23)
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 6736956..a0f63e9 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -33,7 +33,8 @@
>  unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
>  
>  struct hvm_hw_vvtd_status {
> -    uint32_t eim_enabled : 1;
> +    uint32_t eim_enabled : 1,
> +             intremap_enabled : 1;

Again please use bool.

>      uint32_t irt_max_entry;
>      /* Interrupt remapping table base gfn */
>      uint64_t irt;
> @@ -84,6 +85,11 @@ static inline void vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
>      __set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
>  }
>  
> +static inline void vvtd_clear_bit(struct vvtd *vvtd, uint32_t reg, int nr)
> +{
> +    __clear_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
> +}

I'm not sure this functions are helpful, maybe it would be better to
just have a macro to get vvtd->regs->data32[reg/sizeof(uint32_t)]
instead, which seems to be the cumbersome part of the expression above
(and in vvtd_set_bit).

> +
>  static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
>  {
>      vtd->regs->data32[reg/sizeof(uint32_t)] = value;
> @@ -105,6 +111,23 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
>      return vtd->regs->data64[reg/sizeof(uint64_t)];
>  }
>  
> +static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
> +{
> +    vvtd_info("%sable Interrupt Remapping",
> +              (val & DMA_GCMD_IRE) ? "En" : "Dis");
> +
> +    if ( val & DMA_GCMD_IRE )
> +    {
> +        vvtd->status.intremap_enabled = true;
> +        vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_SHIFT);
> +    }
> +    else
> +    {
> +        vvtd->status.intremap_enabled = false;
> +        vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_SHIFT);
> +    }


You cna write the above like:

vvtd->status.intremap_enabled = val & DMA_GCMD_IRE;
(val & DMA_GCMD_IRE) ? vvtd_set_bit : vvtd_clear_bit
    (vvtd, DMAR_GSTS_REG, DMA_GSTS_IRES_SHIFT);

Or similar (certainly setting vvtd->status.intremap_enabled doesn't
need to be branched).

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request
  2017-09-22  3:01 ` [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request Lan Tianyu
@ 2017-10-19 14:26   ` Roger Pau Monné
  2017-10-20  5:16     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 14:26 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:56PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> When a remapping interrupt request arrives, remapping hardware computes the
> interrupt_index per the algorithm described in VTD spec
> "Interrupt Remapping Table", interprets the IRTE and generates a remapped
> interrupt request.
> 
> This patch introduces viommu_handle_irq_request() to emulate the process how
> remapping hardware handles a remapping interrupt request.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> ---
> v3:
>  - Encode map_guest_page()'s error into void* to avoid using another parameter
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  21 +++
>  xen/drivers/passthrough/vtd/vvtd.c  | 264 +++++++++++++++++++++++++++++++++++-
>  2 files changed, 284 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index 703726f..790384f 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -218,6 +218,21 @@
>  #define dma_frcd_source_id(c) (c & 0xffff)
>  #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
>  
> +enum VTD_FAULT_TYPE
> +{
> +    /* Interrupt remapping transition faults */
> +    VTD_FR_IR_REQ_RSVD      = 0x20, /* One or more IR request reserved
> +                                     * fields set */
> +    VTD_FR_IR_INDEX_OVER    = 0x21, /* Index value greater than max */
> +    VTD_FR_IR_ENTRY_P       = 0x22, /* Present (P) not set in IRTE */
> +    VTD_FR_IR_ROOT_INVAL    = 0x23, /* IR Root table invalid */
> +    VTD_FR_IR_IRTE_RSVD     = 0x24, /* IRTE Rsvd field non-zero with
> +                                     * Present flag set */
> +    VTD_FR_IR_REQ_COMPAT    = 0x25, /* Encountered compatible IR
> +                                     * request while disabled */
> +    VTD_FR_IR_SID_ERR       = 0x26, /* Invalid Source-ID */
> +};

Why does this need to be an enum? Plus enum type names should not be
all in uppercase.

In any case, I would just use defines, like it's done for all other
values in the file.

> +
>  /*
>   * 0: Present
>   * 1-11: Reserved
> @@ -358,6 +373,12 @@ struct iremap_entry {
>  };
>  
>  /*
> + * When VT-d doesn't enable Extended Interrupt Mode. Hardware only interprets
> + * only 8-bits ([15:8]) of Destination-ID field in the IRTEs.
> + */
> +#define IRTE_xAPIC_DEST_MASK 0xff00
> +
> +/*
>   * Posted-interrupt descriptor address is 64 bits with 64-byte aligned, only
>   * the upper 26 bits of lest significiant 32 bits is available.
>   */
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index a0f63e9..90c00f5 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -23,11 +23,17 @@
>  #include <xen/types.h>
>  #include <xen/viommu.h>
>  #include <xen/xmalloc.h>
> +#include <asm/apic.h>
>  #include <asm/current.h>
> +#include <asm/event.h>
>  #include <asm/hvm/domain.h>
> +#include <asm/io_apic.h>
>  #include <asm/page.h>
> +#include <asm/p2m.h>
> +#include <asm/viommu.h>
>  
>  #include "iommu.h"
> +#include "vtd.h"
>  
>  /* Supported capabilities by vvtd */
>  unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
> @@ -111,6 +117,132 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
>      return vtd->regs->data64[reg/sizeof(uint64_t)];
>  }
>  
> +static void* map_guest_page(struct domain *d, uint64_t gfn)

gfn_t seems like a better type then uint64_t.

> +{
> +    struct page_info *p;
> +    void *ret;
> +
> +    p = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);

You can do the initialization of p at declaration.

> +    if ( !p )
> +        return ERR_PTR(-EINVAL);
> +
> +    if ( !get_page_type(p, PGT_writable_page) )
> +    {
> +        put_page(p);
> +        return ERR_PTR(-EINVAL);
> +    }
> +
> +    ret = __map_domain_page_global(p);
> +    if ( !ret )
> +    {
> +        put_page_and_type(p);
> +        return ERR_PTR(-ENOMEM);
> +    }
> +
> +    return ret;
> +}
> +
> +static void unmap_guest_page(void *virt)
> +{
> +    struct page_info *page;
> +
> +    ASSERT((unsigned long)virt & PAGE_MASK);

I'm not sure I get the point of the check above.

> +    page = mfn_to_page(domain_page_map_to_mfn(virt));
> +
> +    unmap_domain_page_global(virt);
> +    put_page_and_type(page);
> +}
> +
> +static void vvtd_inj_irq(struct vlapic *target, uint8_t vector,
> +                         uint8_t trig_mode, uint8_t delivery_mode)
> +{
> +    vvtd_debug("dest=v%d, delivery_mode=%x vector=%d trig_mode=%d\n",
> +               vlapic_vcpu(target)->vcpu_id, delivery_mode, vector, trig_mode);
> +
> +    ASSERT((delivery_mode == dest_Fixed) ||
> +           (delivery_mode == dest_LowestPrio));
> +
> +    vlapic_set_irq(target, vector, trig_mode);
> +}
> +
> +static int vvtd_delivery(struct domain *d, uint8_t vector,
> +                         uint32_t dest, uint8_t dest_mode,
> +                         uint8_t delivery_mode, uint8_t trig_mode)
> +{
> +    struct vlapic *target;
> +    struct vcpu *v;
> +
> +    switch ( delivery_mode )
> +    {
> +    case dest_LowestPrio:
> +        target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
> +        if ( target != NULL )
> +        {
> +            vvtd_inj_irq(target, vector, trig_mode, delivery_mode);
> +            break;
> +        }
> +        vvtd_debug("null round robin: vector=%02x\n", vector);
> +        break;
> +
> +    case dest_Fixed:
> +        for_each_vcpu ( d, v )
> +            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode) )
> +                vvtd_inj_irq(vcpu_vlapic(v), vector, trig_mode, delivery_mode);
> +        break;
> +
> +    case dest_NMI:
> +        for_each_vcpu ( d, v )
> +            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode) &&
> +                 !test_and_set_bool(v->nmi_pending) )
> +                vcpu_kick(v);
> +        break;
> +
> +    default:
> +        gdprintk(XENLOG_WARNING, "Unsupported VTD delivery mode %d\n",
> +                 delivery_mode);
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static uint32_t irq_remapping_request_index(
> +    const struct arch_irq_remapping_request *irq)
> +{
> +    if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
> +    {
> +        uint32_t index;
> +        struct msi_msg_remap_entry msi_msg =
> +        {
> +            .address_lo = { .val = irq->msg.msi.addr },
> +            .data = irq->msg.msi.data,
> +        };
> +
> +        index = (msi_msg.address_lo.index_15 << 15) +
> +                msi_msg.address_lo.index_0_14;
> +        if ( msi_msg.address_lo.SHV )
> +            index += (uint16_t)msi_msg.data;
> +
> +        return index;
> +    }
> +    else if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
> +    {
> +        struct IO_APIC_route_remap_entry remap_rte = { .val = irq->msg.rte };
> +
> +        return (remap_rte.index_15 << 15) + remap_rte.index_0_14;
> +    }

IMHO a switch with a single return would be better here:

uint32_t index = 0;

switch ( irq->type )
{
case ...:
    index = ...;
    break;
}

return index;

> +    ASSERT_UNREACHABLE();
> +
> +    return 0;
> +}
> +
> +static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
> +{
> +    /* In xAPIC mode, only 8-bits([15:8]) are valid */
> +    return vvtd->status.eim_enabled ? dest
                                       : MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);

It's easier to read style wise.

> +}
> +
>  static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
>  {
>      vvtd_info("%sable Interrupt Remapping",
> @@ -255,6 +387,135 @@ static const struct hvm_mmio_ops vvtd_mmio_ops = {
>      .write = vvtd_write
>  };
>  
> +static void vvtd_handle_fault(struct vvtd *vvtd,
> +                              struct arch_irq_remapping_request *irq,
> +                              struct iremap_entry *irte,
> +                              unsigned int fault,
> +                              bool record_fault)
> +{
> +   if ( !record_fault )
> +        return;
> +
> +    switch ( fault )
> +    {
> +    case VTD_FR_IR_SID_ERR:
> +    case VTD_FR_IR_IRTE_RSVD:
> +    case VTD_FR_IR_ENTRY_P:
> +        if ( qinval_fault_disable(*irte) )
> +            break;
> +    /* fall through */
> +    case VTD_FR_IR_INDEX_OVER:
> +    case VTD_FR_IR_ROOT_INVAL:
> +        /* TODO: handle fault (e.g. record and report this fault to VM */
> +        break;
> +
> +    default:
> +        gdprintk(XENLOG_INFO, "Can't handle VT-d fault %x\n", fault);

You already defined some vvtd specific debug helpers, why are those
not used here? gdprintk (as the 'd' denotes) is only for debug
purposes.

> +    }
> +    return;
> +}
> +
> +static bool vvtd_irq_request_sanity_check(const struct vvtd *vvtd,
> +                                          struct arch_irq_remapping_request *irq)
> +{
> +    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
> +    {
> +        struct IO_APIC_route_remap_entry rte = { .val = irq->msg.rte };
> +
> +        ASSERT(rte.format);

Is it fine to ASSERT here? Can't the guest set rte.format to whatever
it wants?

> +        return !!rte.reserved;
> +    }
> +    else if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
> +    {
> +        struct msi_msg_remap_entry msi_msg =
> +        { .address_lo = { .val = irq->msg.msi.addr } };
> +
> +        ASSERT(msi_msg.address_lo.format);
> +        return 0;
> +    }
> +    ASSERT_UNREACHABLE();

Again I think a switch would be better here.

> +
> +    return 0;
> +}
> +
> +/*
> + * 'record_fault' is a flag to indicate whether we need recording a fault
> + * and notifying guest when a fault happens during fetching vIRTE.
> + */
> +static int vvtd_get_entry(struct vvtd *vvtd,
> +                          struct arch_irq_remapping_request *irq,
> +                          struct iremap_entry *dest,
> +                          bool record_fault)
> +{
> +    uint32_t entry = irq_remapping_request_index(irq);
> +    struct iremap_entry  *irte, *irt_page;
> +
> +    vvtd_debug("interpret a request with index %x\n", entry);
> +
> +    if ( vvtd_irq_request_sanity_check(vvtd, irq) )
> +    {
> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_REQ_RSVD, record_fault);
> +        return -EINVAL;
> +    }
> +
> +    if ( entry > vvtd->status.irt_max_entry )
> +    {
> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_INDEX_OVER, record_fault);
> +        return -EACCES;
> +    }
> +
> +    irt_page = map_guest_page(vvtd->domain,
> +                              vvtd->status.irt + (entry >> IREMAP_ENTRY_ORDER));

Since AFAICT you have to read this page(s) every time an interrupt
needs to be delivered, wouldn't it make sense for performance reasons
to have the page permanently mapped?

What's the maximum number of pages that can be used here?

> +    if ( IS_ERR(irt_page) )
> +    {
> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_ROOT_INVAL, record_fault);
> +        return PTR_ERR(irt_page);
> +    }
> +
> +    irte = irt_page + (entry % (1 << IREMAP_ENTRY_ORDER));
> +    dest->val = irte->val;

Not that it matters much, but for coherency reasons I would only set
dest->val after all the checks have been performed.

> +    if ( !qinval_present(*irte) )
> +    {
> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_ENTRY_P, record_fault);
> +        unmap_guest_page(irt_page);
> +        return -ENOENT;
> +    }
> +
> +    /* Check reserved bits */
> +    if ( (irte->remap.res_1 || irte->remap.res_2 || irte->remap.res_3 ||
> +          irte->remap.res_4) )
> +    {
> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_IRTE_RSVD, record_fault);
> +        unmap_guest_page(irt_page);
> +        return -EINVAL;
> +    }
> +
> +    /* FIXME: We don't check against the source ID */
> +    unmap_guest_page(irt_page);
> +
> +    return 0;
> +}
> +
> +static int vvtd_handle_irq_request(struct domain *d,
> +                                   struct arch_irq_remapping_request *irq)
> +{
> +    struct iremap_entry irte;
> +    int ret;
> +    struct vvtd *vvtd = domain_vvtd(d);
> +
> +    if ( !vvtd || !vvtd->status.intremap_enabled )
> +        return -ENODEV;
> +
> +    ret = vvtd_get_entry(vvtd, irq, &irte, true);
> +    if ( ret )
> +        return ret;
> +
> +    return vvtd_delivery(vvtd->domain, irte.remap.vector,
> +                         irte_dest(vvtd, irte.remap.dst),
> +                         irte.remap.dm, irte.remap.dlm,
> +                         irte.remap.tm);
> +}
> +
>  static void vvtd_reset(struct vvtd *vvtd, uint64_t capability)
>  {
>      uint64_t cap = cap_set_num_fault_regs(1ULL) |
> @@ -324,7 +585,8 @@ static int vvtd_destroy(struct viommu *viommu)
>  
>  struct viommu_ops vvtd_hvm_vmx_ops = {
>      .create = vvtd_create,
> -    .destroy = vvtd_destroy
> +    .destroy = vvtd_destroy,
> +    .handle_irq_request = vvtd_handle_irq_request

You can add a ',' at the end, so that further additions don't need to
change two lines (and the same should be done with vvtd_destroy).

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 16/29] x86/vvtd: decode interrupt attribute from IRTE
  2017-09-22  3:01 ` [PATCH V3 16/29] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
@ 2017-10-19 14:39   ` Roger Pau Monné
  2017-10-20  5:22     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 14:39 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:57PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Without interrupt remapping, interrupt attributes can be extracted from
> msi message or IOAPIC RTE. However, with interrupt remapping enabled,
> the attributes are enclosed in the associated IRTE. This callback is
> for cases in which the caller wants to acquire interrupt attributes, for
> example:
> 1. vioapic_get_vector(). With vIOMMU, the RTE may don't contain vector.
                                                    ^ not
> 2. perform EOI which is always based on the interrupt vector.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
> v3:
>  - add example cases in which we will use this function.
> ---
>  xen/drivers/passthrough/vtd/vvtd.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 90c00f5..5e22ace 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -516,6 +516,26 @@ static int vvtd_handle_irq_request(struct domain *d,
>                           irte.remap.tm);
>  }
>  
> +static int vvtd_get_irq_info(struct domain *d,
> +                             struct arch_irq_remapping_request *irq,
> +                             struct arch_irq_remapping_info *info)
> +{
> +    int ret;
> +    struct iremap_entry irte;
> +    struct vvtd *vvtd = domain_vvtd(d);

I've realized that some of the helpers perform a if (!vvtd ) return
check, while others don't (like this one). Are some handlers expected
to be called without a vIOMMU? If so it would be good to list them
clearly.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 17/29] x86/vvtd: add a helper function to decide the interrupt format
  2017-09-22  3:01 ` [PATCH V3 17/29] x86/vvtd: add a helper function to decide the interrupt format Lan Tianyu
@ 2017-10-19 14:43   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 14:43 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:01:58PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Different platform may use different method to distinguish
> remapping format interrupt and normal format interrupt.
> 
> Intel uses one bit in IOAPIC RTE or MSI address register to
> indicate the interrupt is remapping format. vvtd will handle
> all the interrupts when .check_irq_remapping() return true.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/vvtd.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 5e22ace..bd1cadd 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -536,6 +536,28 @@ static int vvtd_get_irq_info(struct domain *d,
>      return 0;
>  }
>  
> +/* Probe whether the interrupt request is an remapping format */
> +static bool vvtd_is_remapping(struct domain *d,
> +                              struct arch_irq_remapping_request *irq)
> +{
> +    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
> +    {
> +        struct IO_APIC_route_remap_entry rte = { .val = irq->msg.rte };
> +
> +        return rte.format;
> +    }
> +    else if ( irq->type == VIOMMU_REQUEST_IRQ_MSI )
> +    {
> +        struct msi_msg_remap_entry msi_msg =
> +        { .address_lo = { .val = irq->msg.msi.addr } };
> +
> +        return msi_msg.address_lo.format;
> +    }
> +    ASSERT_UNREACHABLE();

Switch please.

Also there's a bunch of temporary IO_APIC_route_remap_entry and
msi_msg_remap_entry local structures. Why don't you just create some
kind of union in arch_irq_remapping_request so that you don't need to
do this each time?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 18/29] VIOMMU: Add irq request callback to deal with irq remapping
  2017-09-22  3:01 ` [PATCH V3 18/29] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
@ 2017-10-19 15:00   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 15:00 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Sep 21, 2017 at 11:01:59PM -0400, Lan Tianyu wrote:
> This patch is to add irq request callback for platform implementation
> to deal with irq remapping request.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/viommu.c          | 15 +++++++++
>  xen/include/asm-x86/viommu.h | 72 ++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/viommu.h     | 11 +++++++
>  3 files changed, 98 insertions(+)
>  create mode 100644 xen/include/asm-x86/viommu.h
> 
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index 55feb5d..b517158 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -163,6 +163,21 @@ int viommu_domctl(struct domain *d, struct xen_domctl_viommu_op *op,
>      return rc;
>  }
>  
> +int viommu_handle_irq_request(struct domain *d,
> +                              struct arch_irq_remapping_request *request)
> +{
> +    struct viommu *viommu = d->viommu;
> +
> +    if ( !viommu )
> +        return -EINVAL;

ENODEV

> +
> +    ASSERT(viommu->ops);
> +    if ( !viommu->ops->handle_irq_request )
> +        return -EINVAL;
> +
> +    return viommu->ops->handle_irq_request(d, request);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
> new file mode 100644
> index 0000000..366fbb6
> --- /dev/null
> +++ b/xen/include/asm-x86/viommu.h
> @@ -0,0 +1,72 @@
> +/*
> + * include/asm-x86/viommu.h
> + *
> + * Copyright (c) 2017 Intel Corporation.
> + * Author: Lan Tianyu <tianyu.lan@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#ifndef __ARCH_X86_VIOMMU_H__
> +#define __ARCH_X86_VIOMMU_H__
> +
> +/* IRQ request type */
> +#define VIOMMU_REQUEST_IRQ_MSI          0
> +#define VIOMMU_REQUEST_IRQ_APIC         1
> +
> +struct arch_irq_remapping_request

Oh, so you have been using arch_irq_remapping_request in previous
patches without it being introduced. This is becoming more and more
hard to review. I will try to finish reviewing the whole series but
please, in the future make sure that each patch compiles on it's
own.

It's impossible to properly review a series when you use a structure
that has not yet been introduced.

> +{
> +    union {
> +        /* MSI */
> +        struct {
> +            uint64_t addr;
> +            uint32_t data;
> +        } msi;
> +        /* Redirection Entry in IOAPIC */
> +        uint64_t rte;
> +    } msg;
> +    uint16_t source_id;
> +    uint8_t type;

Why don't you make this an enum?

> +};
> +
> +static inline void irq_request_ioapic_fill(struct arch_irq_remapping_request *req,
> +                                           uint32_t ioapic_id, uint64_t rte)
> +{
> +    ASSERT(req);
> +    req->type = VIOMMU_REQUEST_IRQ_APIC;
> +    req->source_id = ioapic_id;
> +    req->msg.rte = rte;
> +}
> +
> +static inline void irq_request_msi_fill(struct arch_irq_remapping_request *req,
> +                                        uint32_t source_id, uint64_t addr,
> +                                        uint32_t data)
> +{
> +    ASSERT(req);
> +    req->type = VIOMMU_REQUEST_IRQ_MSI;
> +    req->source_id = source_id;
> +    req->msg.msi.addr = addr;
> +    req->msg.msi.data = data;
> +}

You are introducing two functions here that are not used in this
patch. They should be added when they are used, or else it's very hard
to review.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 19/29] x86/vioapic: Hook interrupt delivery of vIOAPIC
  2017-09-22  3:02 ` [PATCH V3 19/29] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
@ 2017-10-19 15:37   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 15:37 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:02:00PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> When irq remapping is enabled, IOAPIC Redirection Entry may be in remapping
> format. If that, generate an irq_remapping_request and call the common
> VIOMMU abstraction's callback to handle this interrupt request. Device
> model is responsible for checking the request's validity.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> ---
> v3:
>  - use the new interface to check remapping format.
> ---
>  xen/arch/x86/hvm/vioapic.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 72cae93..5d0d1cd 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -30,6 +30,7 @@
>  #include <xen/lib.h>
>  #include <xen/errno.h>
>  #include <xen/sched.h>
> +#include <xen/viommu.h>
>  #include <public/hvm/ioreq.h>
>  #include <asm/hvm/io.h>
>  #include <asm/hvm/vpic.h>
> @@ -38,6 +39,7 @@
>  #include <asm/current.h>
>  #include <asm/event.h>
>  #include <asm/io_apic.h>
> +#include <asm/viommu.h>

I think asm/viommu should be included by viommu.h.

>  
>  /* HACK: Route IRQ0 only to VCPU0 to prevent time jumps. */
>  #define IRQ0_SPECIAL_ROUTING 1
> @@ -387,9 +389,17 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>      struct vlapic *target;
>      struct vcpu *v;
>      unsigned int irq = vioapic->base_gsi + pin;
> +    struct arch_irq_remapping_request request;
>  
>      ASSERT(spin_is_locked(&d->arch.hvm_domain.irq_lock));
>  
> +    irq_request_ioapic_fill(&request, vioapic->id, vioapic->redirtbl[pin].bits);

So the macro introduced in the previous patch should instead be
introduced here.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-09-22  3:02 ` [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
@ 2017-10-19 15:42   ` Roger Pau Monné
  2017-10-25  7:30     ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 15:42 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Sep 21, 2017 at 11:02:01PM -0400, Lan Tianyu wrote:
> This patch is to add get_irq_info callback for platform implementation
> to convert irq remapping request to irq info (E,G vector, dest, dest_mode
> and so on).
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/viommu.c          | 16 ++++++++++++++++
>  xen/include/asm-x86/viommu.h |  8 ++++++++
>  xen/include/xen/viommu.h     | 14 ++++++++++++++
>  3 files changed, 38 insertions(+)
> 
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index b517158..0708e43 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -178,6 +178,22 @@ int viommu_handle_irq_request(struct domain *d,
>      return viommu->ops->handle_irq_request(d, request);
>  }
>  
> +int viommu_get_irq_info(struct domain *d,
> +                        struct arch_irq_remapping_request *request,
> +                        struct arch_irq_remapping_info *irq_info)
> +{
> +    struct viommu *viommu = d->viommu;
> +
> +    if ( !viommu )
> +        return -EINVAL;

OK, here there's a check for !viommu. Can we please have this written
down in the header? (ie: which functions are safe/expected to be
called without a viommu)

> +
> +    ASSERT(viommu->ops);
> +    if ( !viommu->ops->get_irq_info )
> +        return -EINVAL;
> +
> +    return viommu->ops->get_irq_info(d, request, irq_info);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
> index 366fbb6..586b6bd 100644
> --- a/xen/include/asm-x86/viommu.h
> +++ b/xen/include/asm-x86/viommu.h
> @@ -24,6 +24,14 @@
>  #define VIOMMU_REQUEST_IRQ_MSI          0
>  #define VIOMMU_REQUEST_IRQ_APIC         1
>  
> +struct arch_irq_remapping_info
> +{
> +    uint8_t  vector;
> +    uint32_t dest;
> +    uint32_t dest_mode:1;
> +    uint32_t delivery_mode:3;

Why uint32_t for this two last fields? Also please sort them so that
the padding is limited at the end of the structure.

> +};
> +
>  struct arch_irq_remapping_request
>  {
>      union {
> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
> index 230f6b1..beb40cd 100644
> --- a/xen/include/xen/viommu.h
> +++ b/xen/include/xen/viommu.h
> @@ -21,6 +21,7 @@
>  #define __XEN_VIOMMU_H__
>  
>  struct viommu;
> +struct arch_irq_remapping_info;
>  struct arch_irq_remapping_request;

If you include asm/viommu.h in viommu.h you don't need to forward
declarations.

>  
>  struct viommu_ops {
> @@ -28,6 +29,9 @@ struct viommu_ops {
>      int (*destroy)(struct viommu *viommu);
>      int (*handle_irq_request)(struct domain *d,
>                                struct arch_irq_remapping_request *request);
> +    int (*get_irq_info)(struct domain *d,
> +                        struct arch_irq_remapping_request *request,

AFAICT d and request should be constified.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 21/29] VIOMMU: Introduce callback of checking irq remapping mode
  2017-09-22  3:02 ` [PATCH V3 21/29] VIOMMU: Introduce callback of checking irq remapping mode Lan Tianyu
@ 2017-10-19 15:43   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 15:43 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Sep 21, 2017 at 11:02:02PM -0400, Lan Tianyu wrote:
> This patch is to add callback for vIOAPIC and vMSI to check whether interrupt
> remapping is enabled.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/viommu.c      | 15 +++++++++++++++
>  xen/include/xen/viommu.h | 10 ++++++++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
> index 0708e43..ff95465 100644
> --- a/xen/common/viommu.c
> +++ b/xen/common/viommu.c
> @@ -194,6 +194,21 @@ int viommu_get_irq_info(struct domain *d,
>      return viommu->ops->get_irq_info(d, request, irq_info);
>  }
>  
> +bool viommu_check_irq_remapping(struct domain *d,
> +                                struct arch_irq_remapping_request *request)

Both should be constified.

> +{
> +    struct viommu *viommu = d->viommu;
> +
> +    if ( !viommu )
> +        return false;
> +
> +    ASSERT(viommu->ops);
> +    if ( !viommu->ops->check_irq_remapping )
> +        return false;
> +
> +    return viommu->ops->check_irq_remapping(d, request);

IMHO this helper should be introduced together with the vvtd
implementation of check_irq_remapping.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-09-22  3:02 ` [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
@ 2017-10-19 15:49   ` Roger Pau Monné
  2017-10-19 15:56     ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 15:49 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:02:03PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> When IOAPIC RTE is in remapping format, it doesn't contain the vector of
> interrupt. For this case, the RTE contains an index of interrupt remapping
> table where the vector of interrupt is stored. This patchs gets the vector
> through a vIOMMU interface.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/arch/x86/hvm/vioapic.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 5d0d1cd..9e47ef4 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -561,11 +561,25 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>  {
>      unsigned int pin;
>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
> +    struct arch_irq_remapping_request request;
>  
>      if ( !vioapic )
>          return -EINVAL;
>  
> -    return vioapic->redirtbl[pin].fields.vector;
> +    irq_request_ioapic_fill(&request, vioapic->id, vioapic->redirtbl[pin].bits);
> +    if ( viommu_check_irq_remapping(vioapic->domain, &request) )
> +    {
> +        int err;
> +        struct arch_irq_remapping_info info;
> +
> +        err = viommu_get_irq_info(vioapic->domain, &request, &info);
> +        return !err ? info.vector : err;

You can simplify this as return err :? info.vector;

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-10-19 15:49   ` Roger Pau Monné
@ 2017-10-19 15:56     ` Jan Beulich
  2017-10-20  1:04       ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Jan Beulich @ 2017-10-19 15:56 UTC (permalink / raw)
  To: Roger Pau Monné, Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, Chao Gao

>>> On 19.10.17 at 17:49, <roger.pau@citrix.com> wrote:
> On Thu, Sep 21, 2017 at 11:02:03PM -0400, Lan Tianyu wrote:
>> --- a/xen/arch/x86/hvm/vioapic.c
>> +++ b/xen/arch/x86/hvm/vioapic.c
>> @@ -561,11 +561,25 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>>  {
>>      unsigned int pin;
>>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
>> +    struct arch_irq_remapping_request request;
>>  
>>      if ( !vioapic )
>>          return -EINVAL;
>>  
>> -    return vioapic->redirtbl[pin].fields.vector;
>> +    irq_request_ioapic_fill(&request, vioapic->id, vioapic->redirtbl[pin].bits);
>> +    if ( viommu_check_irq_remapping(vioapic->domain, &request) )
>> +    {
>> +        int err;
>> +        struct arch_irq_remapping_info info;
>> +
>> +        err = viommu_get_irq_info(vioapic->domain, &request, &info);
>> +        return !err ? info.vector : err;
> 
> You can simplify this as return err :? info.vector;

At which point the local variable becomes pretty pointless.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 24/29] tools/libxc: Add a new interface to bind remapping format msi with pirq
  2017-09-22  3:02 ` [PATCH V3 24/29] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
@ 2017-10-19 16:03   ` Roger Pau Monné
  2017-10-20  5:39     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 16:03 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Sep 21, 2017 at 11:02:05PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>

The title for this patch it's wrong, it modifies both the hypervisor
and libxc. Please fix it.

> When exposing vIOMMU (vvtd) to guest, guest can configure the msi to
> remapping format. For pass-through device, the physical interrupt now
> can be bound with remapping format msi. This patch introduce a flag,
> HVM_IRQ_DPCI_GUEST_REMAPPED, which indicate a physical interrupt is
> bound with remapping format guest interrupt. Thus, we can use
> (HVM_IRQ_DPCI_GUEST_REMAPPED | HVM_IRQ_DPCI_GUEST_MSI) to show the new
> binding type. Also provide an new interface to manage the new binding.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> 
> ---
> diff --git a/xen/include/asm-x86/hvm/irq.h b/xen/include/asm-x86/hvm/irq.h
> index bd8a918..4f5d37b 100644
> --- a/xen/include/asm-x86/hvm/irq.h
> +++ b/xen/include/asm-x86/hvm/irq.h
> @@ -121,6 +121,7 @@ struct dev_intx_gsi_link {
>  #define _HVM_IRQ_DPCI_GUEST_PCI_SHIFT           4
>  #define _HVM_IRQ_DPCI_GUEST_MSI_SHIFT           5
>  #define _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT        6
> +#define _HVM_IRQ_DPCI_GUEST_REMAPPED_SHIFT      7
>  #define _HVM_IRQ_DPCI_TRANSLATE_SHIFT          15
>  #define HVM_IRQ_DPCI_MACH_PCI        (1u << _HVM_IRQ_DPCI_MACH_PCI_SHIFT)
>  #define HVM_IRQ_DPCI_MACH_MSI        (1u << _HVM_IRQ_DPCI_MACH_MSI_SHIFT)
> @@ -128,6 +129,7 @@ struct dev_intx_gsi_link {
>  #define HVM_IRQ_DPCI_EOI_LATCH       (1u << _HVM_IRQ_DPCI_EOI_LATCH_SHIFT)
>  #define HVM_IRQ_DPCI_GUEST_PCI       (1u << _HVM_IRQ_DPCI_GUEST_PCI_SHIFT)
>  #define HVM_IRQ_DPCI_GUEST_MSI       (1u << _HVM_IRQ_DPCI_GUEST_MSI_SHIFT)
> +#define HVM_IRQ_DPCI_GUEST_REMAPPED  (1u << _HVM_IRQ_DPCI_GUEST_REMAPPED_SHIFT)
>  #define HVM_IRQ_DPCI_IDENTITY_GSI    (1u << _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT)
>  #define HVM_IRQ_DPCI_TRANSLATE       (1u << _HVM_IRQ_DPCI_TRANSLATE_SHIFT)

Please keep this sorted. It should go after the _GSI one.

>  
> @@ -137,6 +139,11 @@ struct hvm_gmsi_info {
>              uint32_t gvec;
>              uint32_t gflags;
>          } legacy;
> +        struct {
> +            uint32_t source_id;
> +            uint32_t data;
> +            uint64_t addr;
> +        } intremap;
>      };
>      int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
>      bool posted; /* directly deliver to guest via VT-d PI? */
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 68854b6..8c59cfc 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -559,6 +559,7 @@ typedef enum pt_irq_type_e {
>      PT_IRQ_TYPE_MSI,
>      PT_IRQ_TYPE_MSI_TRANSLATE,
>      PT_IRQ_TYPE_SPI,    /* ARM: valid range 32-1019 */
> +    PT_IRQ_TYPE_MSI_IR,

Introducing a new irq type seems dubious, at the end this is still a
MSI interrupt.

>  } pt_irq_type_t;
>  struct xen_domctl_bind_pt_irq {
>      uint32_t machine_irq;
> @@ -586,6 +587,12 @@ struct xen_domctl_bind_pt_irq {
>              uint64_aligned_t gtable;
>          } msi;
>          struct {
> +            uint32_t source_id;
> +            uint32_t data;
> +            uint64_t addr;
> +            uint64_t gtable;
> +        } msi_ir;

Have you tried to expand gflags somehow so that you don't need a new
type together with a new structure?

It seems quite cumbersome and also involves adding more handlers to
libxc.

At the end this is a domctl interface, so you should be able to modify
it at will.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 25/29] x86/vmsi: Hook delivering remapping format msi to guest
  2017-09-22  3:02 ` [PATCH V3 25/29] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
@ 2017-10-19 16:07   ` Roger Pau Monné
  2017-10-20  6:48     ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 16:07 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: andrew.cooper3, kevin.tian, Chao Gao, jbeulich, xen-devel

On Thu, Sep 21, 2017 at 11:02:06PM -0400, Lan Tianyu wrote:
> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> index 6196334..349a8cf 100644
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -942,21 +942,20 @@ static void __msi_pirq_eoi(struct hvm_pirq_dpci *pirq_dpci)
>  static int _hvm_dpci_msi_eoi(struct domain *d,
>                               struct hvm_pirq_dpci *pirq_dpci, void *arg)
>  {
> -    int vector = (long)arg;
> +    uint8_t vector, dlm, vector_target = (long)arg;

Since you are changing this, please cast to (uint8_t) instead.

> +    uint32_t dest;
> +    bool dm;

Why are you moving dest, dm, dlm and vector here?

>  
> -    if ( (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
> -         (pirq_dpci->gmsi.legacy.gvec == vector) )
> +    if ( pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI )
>      {
> -        unsigned int dest = MASK_EXTR(pirq_dpci->gmsi.legacy.gflags,
> -                                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
> -        bool dest_mode = pirq_dpci->gmsi.legacy.gflags &
> -                         XEN_DOMCTL_VMSI_X86_DM_MASK;

AFAICT their scope is limited to this if.

> +        if ( pirq_dpci_2_msi_attr(d, pirq_dpci, &vector, &dest, &dm, &dlm) )
> +            return 0;
>  
> -        if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
> -                               dest_mode) )
> +        if ( vector == vector_target &&
> +             vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest, dm) )
>          {
> -            __msi_pirq_eoi(pirq_dpci);
> -            return 1;
> +                __msi_pirq_eoi(pirq_dpci);
> +                return 1;
>          }
>      }
>  
> -- 
> 1.8.3.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults
  2017-09-22  3:02 ` [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults Lan Tianyu
@ 2017-10-19 16:31   ` Roger Pau Monné
  2017-10-20  5:54     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-19 16:31 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: andrew.cooper3, kevin.tian, Chao Gao, jbeulich, xen-devel

On Thu, Sep 21, 2017 at 11:02:07PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Interrupt translation faults are non-recoverable fault. When faults
> are triggered, it needs to populate fault info to Fault Recording
> Registers and inject vIOMMU msi interrupt to notify guest IOMMU driver
> to deal with faults.
> 
> This patch emulates hardware's handling interrupt translation
> faults (more information about the process can be found in VT-d spec,
> chipter "Translation Faults", section "Non-Recoverable Fault
> Reporting" and section "Non-Recoverable Logging").
> Specifically, viommu_record_fault() records the fault information and
> viommu_report_non_recoverable_fault() reports faults to software.
> Currently, only Primary Fault Logging is supported and the Number of
> Fault-recording Registers is 1.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  60 +++++++--
>  xen/drivers/passthrough/vtd/vvtd.c  | 252 +++++++++++++++++++++++++++++++++++-
>  2 files changed, 301 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index 790384f..e19b045 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -198,26 +198,66 @@
>  #define DMA_CCMD_CAIG_MASK(x) (((u64)x) & ((u64) 0x3 << 59))
>  
>  /* FECTL_REG */
> -#define DMA_FECTL_IM (((u64)1) << 31)
> +#define DMA_FECTL_IM_SHIFT 31
> +#define DMA_FECTL_IM (1U << DMA_FECTL_IM_SHIFT)
> +#define DMA_FECTL_IP_SHIFT 30
> +#define DMA_FECTL_IP (1U << DMA_FECTL_IP_SHIFT)

Is it fine to change those from uint64_t to unsigned int?

>  
>  /* FSTS_REG */
> -#define DMA_FSTS_PFO ((u64)1 << 0)
> -#define DMA_FSTS_PPF ((u64)1 << 1)
> -#define DMA_FSTS_AFO ((u64)1 << 2)
> -#define DMA_FSTS_APF ((u64)1 << 3)
> -#define DMA_FSTS_IQE ((u64)1 << 4)
> -#define DMA_FSTS_ICE ((u64)1 << 5)
> -#define DMA_FSTS_ITE ((u64)1 << 6)
> -#define DMA_FSTS_FAULTS    DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE
> +#define DMA_FSTS_PFO_SHIFT 0
> +#define DMA_FSTS_PFO (1U << DMA_FSTS_PFO_SHIFT)
> +#define DMA_FSTS_PPF_SHIFT 1
> +#define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_SHIFT)
> +#define DMA_FSTS_AFO (1U << 2)
> +#define DMA_FSTS_APF (1U << 3)
> +#define DMA_FSTS_IQE (1U << 4)
> +#define DMA_FSTS_ICE (1U << 5)
> +#define DMA_FSTS_ITE (1U << 6)

This seemingly non-functional changes should be done in a separate
patch.

> +#define DMA_FSTS_PRO_SHIFT 7
> +#define DMA_FSTS_PRO (1U << DMA_FSTS_PRO_SHIFT)
> +#define DMA_FSTS_FAULTS    (DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | \
> +                            DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | \
> +                            DMA_FSTS_ITE | DMA_FSTS_PRO)
> +#define DMA_FSTS_RW1CS     (DMA_FSTS_PFO | DMA_FSTS_AFO | DMA_FSTS_APF | \
> +                            DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE | \
> +                            DMA_FSTS_PRO)
>  #define dma_fsts_fault_record_index(s) (((s) >> 8) & 0xff)
>  
>  /* FRCD_REG, 32 bits access */
> -#define DMA_FRCD_F (((u64)1) << 31)
> +#define DMA_FRCD_LEN            0x10
> +#define DMA_FRCD2_OFFSET        0x8
> +#define DMA_FRCD3_OFFSET        0xc
> +#define DMA_FRCD_F_SHIFT        31
> +#define DMA_FRCD_F ((u64)1 << DMA_FRCD_F_SHIFT)
>  #define dma_frcd_type(d) ((d >> 30) & 1)
>  #define dma_frcd_fault_reason(c) (c & 0xff)
>  #define dma_frcd_source_id(c) (c & 0xffff)
>  #define dma_frcd_page_addr(d) (d & (((u64)-1) << 12)) /* low 64 bit */
>  
> +struct vtd_fault_record_register
> +{
> +    union {
> +        struct {
> +            uint64_t lo;
> +            uint64_t hi;
> +        } bits;
> +        struct {
> +            uint64_t rsvd0          :12,
> +                     fault_info     :52;
> +            uint64_t source_id      :16,
> +                     rsvd1          :9,
> +                     pmr            :1,  /* Privilege Mode Requested */
> +                     exe            :1,  /* Execute Permission Requested */
> +                     pasid_p        :1,  /* PASID Present */
> +                     fault_reason   :8,  /* Fault Reason */
> +                     pasid_val      :20, /* PASID Value */
> +                     addr_type      :2,  /* Address Type */
> +                     type           :1,  /* Type. (0) Write (1) Read/AtomicOp */
> +                     fault          :1;  /* Fault */
> +        } fields;
> +    };
> +};
> +
>  enum VTD_FAULT_TYPE
>  {
>      /* Interrupt remapping transition faults */
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index bd1cadd..745941c 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -19,6 +19,7 @@
>   */
>  
>  #include <xen/domain_page.h>
> +#include <xen/lib.h>
>  #include <xen/sched.h>
>  #include <xen/types.h>
>  #include <xen/viommu.h>
> @@ -41,6 +42,7 @@ unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
>  struct hvm_hw_vvtd_status {
>      uint32_t eim_enabled : 1,
>               intremap_enabled : 1;
> +    uint32_t fault_index;
>      uint32_t irt_max_entry;
>      /* Interrupt remapping table base gfn */
>      uint64_t irt;
> @@ -86,6 +88,22 @@ struct vvtd *domain_vvtd(struct domain *d)
>      return (d->viommu) ? d->viommu->priv : NULL;
>  }
>  
> +static inline int vvtd_test_and_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
> +{
> +    return test_and_set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
> +}
> +
> +static inline int vvtd_test_and_clear_bit(struct vvtd *vvtd, uint32_t reg,
> +                                          int nr)
> +{
> +    return test_and_clear_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
> +}
> +
> +static inline int vvtd_test_bit(struct vvtd *vvtd, uint32_t reg, int nr)
> +{
> +    return test_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);
> +}
> +
>  static inline void vvtd_set_bit(struct vvtd *vvtd, uint32_t reg, int nr)
>  {
>      __set_bit(nr, &vvtd->regs->data32[reg/sizeof(uint32_t)]);

You seem to use a mix of __ (locked) and non-locked bitopts, as said
before, please get your locking straight, and use the bitops that you
need accordingly.

> @@ -206,6 +224,23 @@ static int vvtd_delivery(struct domain *d, uint8_t vector,
>      return 0;
>  }
>  
> +void vvtd_generate_interrupt(const struct vvtd *vvtd, uint32_t addr,
> +                             uint32_t data)
> +{
> +    uint8_t dest, dm, dlm, tm, vector;
> +
> +    vvtd_debug("Sending interrupt %x %x to d%d",
> +               addr, data, vvtd->domain->domain_id);
> +
> +    dest = MASK_EXTR(addr, MSI_ADDR_DEST_ID_MASK);
> +    dm = !!(addr & MSI_ADDR_DESTMODE_MASK);

dm wants to be bool instead.

> +    dlm = MASK_EXTR(data, MSI_DATA_DELIVERY_MODE_MASK);
> +    tm = MASK_EXTR(data, MSI_DATA_TRIGGER_MASK);
> +    vector = data & MSI_DATA_VECTOR_MASK;

Please use MASK_EXTR.

You can also initialize all of them at declaration.

> +
> +    vvtd_delivery(vvtd->domain, vector, dest, dm, dlm, tm);
> +}
> +
>  static uint32_t irq_remapping_request_index(
>      const struct arch_irq_remapping_request *irq)
>  {
> @@ -243,6 +278,207 @@ static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
>             MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
>  }
>  
> +static void vvtd_report_non_recoverable_fault(struct vvtd *vvtd, int reason)
> +{
> +    uint32_t fsts;
> +
> +    fsts = vvtd_get_reg(vvtd, DMAR_FSTS_REG);

Initialize at declaration time.

> +    vvtd_set_bit(vvtd, DMAR_FSTS_REG, reason);
> +
> +    /*
> +     * Accoroding to VT-d spec "Non-Recoverable Fault Event" chapter, if
> +     * there are any previously reported interrupt conditions that are yet to
> +     * be sevices by software, the Fault Event interrrupt is not generated.
> +     */
> +    if ( fsts & DMA_FSTS_FAULTS )
> +        return;
> +
> +    vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
> +    if ( !vvtd_test_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_SHIFT) )
> +    {
> +        uint32_t fe_data, fe_addr;

Missing newline.

> +        fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
> +        fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);

Initialize at declaration.

> +        vvtd_generate_interrupt(vvtd, fe_addr, fe_data);
> +        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
> +    }
> +}
> +
> +static void vvtd_update_ppf(struct vvtd *vvtd)
> +{
> +    int i;
> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);

This returns a uint32_t. But I see cap is used only once, so there's
no point in having a local variable for it.

> +    unsigned int base = cap_fault_reg_offset(cap);
> +
> +    for ( i = 0; i < cap_num_fault_regs(cap); i++ )
> +    {
> +        if ( vvtd_test_bit(vvtd, base + i * DMA_FRCD_LEN + DMA_FRCD3_OFFSET,
> +                           DMA_FRCD_F_SHIFT) )
> +        {
> +            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PPF_SHIFT);
> +            return;
> +        }
> +    }
> +    /*
> +     * No Primary Fault is in Fault Record Registers, thus clear PPF bit in
> +     * FSTS.
> +     */
> +    vvtd_clear_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PPF_SHIFT);
> +
> +    /* If no fault is in FSTS, clear pending bit in FECTL. */
> +    if ( !(vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS) )
> +        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
> +}
> +
> +/*
> + * Commit a fault to emulated Fault Record Registers.
> + */
> +static void vvtd_commit_frcd(struct vvtd *vvtd, int idx,
> +                             struct vtd_fault_record_register *frcd)

frcd wants to be const.

> +{
> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
> +    unsigned int base = cap_fault_reg_offset(cap);

Same here.

> +
> +    vvtd_set_reg_quad(vvtd, base + idx * DMA_FRCD_LEN, frcd->bits.lo);
> +    vvtd_set_reg_quad(vvtd, base + idx * DMA_FRCD_LEN + 8, frcd->bits.hi);
> +    vvtd_update_ppf(vvtd);
> +}
> +
> +/*
> + * Allocate a FRCD for the caller. If success, return the FRI. Or, return -1
> + * when failure.
> + */
> +static int vvtd_alloc_frcd(struct vvtd *vvtd)
> +{
> +    int prev;
> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
> +    unsigned int base = cap_fault_reg_offset(cap);
> +
> +    /* Set the F bit to indicate the FRCD is in use. */
> +    if ( !vvtd_test_and_set_bit(vvtd,
> +                                base + vvtd->status.fault_index * DMA_FRCD_LEN +
> +                                DMA_FRCD3_OFFSET, DMA_FRCD_F_SHIFT) )
> +    {
> +        prev = vvtd->status.fault_index;
> +        vvtd->status.fault_index = (prev + 1) % cap_num_fault_regs(cap);
> +        return vvtd->status.fault_index;

I would prefer that you return the index as an unsigned int parameter
passed by reference rather than as the return value of the function,
but that might not be the preference of others.

> +    }
> +    return -ENOMEM;
> +}
> +
> +static void vvtd_free_frcd(struct vvtd *vvtd, int i)
> +{
> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
> +    unsigned int base = cap_fault_reg_offset(cap);
> +
> +    vvtd_clear_bit(vvtd, base + i * DMA_FRCD_LEN + DMA_FRCD3_OFFSET,
> +                   DMA_FRCD_F_SHIFT);
> +}
> +
> +static int vvtd_record_fault(struct vvtd *vvtd,
> +                             struct arch_irq_remapping_request *request,
> +                             int reason)
> +{
> +    struct vtd_fault_record_register frcd;
> +    int fault_index;
> +
> +    switch(reason)
> +    {
> +    case VTD_FR_IR_REQ_RSVD:
> +    case VTD_FR_IR_INDEX_OVER:
> +    case VTD_FR_IR_ENTRY_P:
> +    case VTD_FR_IR_ROOT_INVAL:
> +    case VTD_FR_IR_IRTE_RSVD:
> +    case VTD_FR_IR_REQ_COMPAT:
> +    case VTD_FR_IR_SID_ERR:
> +        if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PFO_SHIFT) )
> +            return X86EMUL_OKAY;
> +
> +        /* No available Fault Record means Fault overflowed */
> +        fault_index = vvtd_alloc_frcd(vvtd);
> +        if ( fault_index == -1 )

Erm, wouldn't vvtd_alloc_frcd return -ENOMEM in case of error? Ie: you
should check if ( fault_index < 0 ).

> +        {
> +            vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_PFO_SHIFT);
> +            return X86EMUL_OKAY;
> +        }
> +        memset(&frcd, 0, sizeof(frcd));
> +        frcd.fields.fault_reason = (uint8_t)reason;
> +        frcd.fields.fault_info = ((uint64_t)irq_remapping_request_index(request)) << 36;

This line is clearly too long.

> +        frcd.fields.source_id = (uint16_t)request->source_id;

Why do you need the casting for reason and source_id?

> +        frcd.fields.fault = 1;
> +        vvtd_commit_frcd(vvtd, fault_index, &frcd);
> +        return X86EMUL_OKAY;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +        break;
> +    }
> +
> +    gdprintk(XENLOG_ERR, "Can't handle vVTD Fault (reason 0x%x).", reason);
> +    domain_crash(vvtd->domain);
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
> +{
> +    /* Writing a 1 means clear fault */
> +    if ( val & DMA_FRCD_F )
> +    {
> +        vvtd_free_frcd(vvtd, 0);
> +        vvtd_update_ppf(vvtd);
> +    }
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
> +{
> +    /*
> +     * Only DMA_FECTL_IM bit is writable. Generate pending event when unmask.
> +     */
> +    if ( !(val & DMA_FECTL_IM) )
> +    {
> +        /* Clear IM */
> +        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_SHIFT);
> +        if ( vvtd_test_and_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT) )
> +        {
> +            uint32_t fe_data, fe_addr;
> +
> +            fe_data = vvtd_get_reg(vvtd, DMAR_FEDATA_REG);
> +            fe_addr = vvtd_get_reg(vvtd, DMAR_FEADDR_REG);
> +            vvtd_generate_interrupt(vvtd, fe_addr, fe_data);

You don't need all this local variables, just put the calls to
vvtd_get_reg at vvtd_generate_interrupt.

> +        }
> +    }
> +    else
> +        vvtd_set_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IM_SHIFT);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
> +{
> +    int i, max_fault_index = DMA_FSTS_PRO_SHIFT;
> +    uint64_t bits_to_clear = val & DMA_FSTS_RW1CS;
> +
> +    if ( bits_to_clear )
> +    {

i wants to be unsigned int and declared here, inside of the if.

> +        i = find_first_bit(&bits_to_clear, max_fault_index / 8 + 1);
> +        while ( i <= max_fault_index )
> +        {
> +            vvtd_clear_bit(vvtd, DMAR_FSTS_REG, i);
> +            i = find_next_bit(&bits_to_clear, max_fault_index / 8 + 1, i + 1);
> +        }

A for would be more suitable for this loop.

> +    }
> +
> +    /*
> +     * Clear IP field when all status fields in the Fault Status Register
> +     * being clear.
> +     */
> +    if ( !((vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS)) )
> +        vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
> +
> +    return X86EMUL_OKAY;
> +}
> +
>  static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
>  {
>      vvtd_info("%sable Interrupt Remapping",
> @@ -336,7 +572,9 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>                        unsigned int len, unsigned long val)
>  {
>      struct vvtd *vvtd = domain_vvtd(v->domain);
> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
>      unsigned int offset = addr - vvtd->base_addr;
> +    unsigned int fault_offset = cap_fault_reg_offset(cap);

Again vvtd_get_reg return a uint32_t, and you don't seem to use it
elsewhere apart from cap_fault_reg_offset so please ditch it.

>      vvtd_info("Write offset %x len %d val %lx\n", offset, len, val);
>  
> @@ -350,6 +588,12 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>          case DMAR_GCMD_REG:
>              return vvtd_write_gcmd(vvtd, val);
>  
> +        case DMAR_FSTS_REG:
> +            return vvtd_write_fsts(vvtd, val);
> +
> +        case DMAR_FECTL_REG:
> +            return vvtd_write_fectl(vvtd, val);
> +
>          case DMAR_IEDATA_REG:
>          case DMAR_IEADDR_REG:
>          case DMAR_IEUADDR_REG:
> @@ -362,6 +606,9 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>              break;
>  
>          default:
> +            if ( offset == fault_offset + DMA_FRCD3_OFFSET )

Parentheses around the addition.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE
  2017-10-19 15:56     ` Jan Beulich
@ 2017-10-20  1:04       ` Chao Gao
  0 siblings, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-20  1:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Roger Pau Monné

On Thu, Oct 19, 2017 at 09:56:34AM -0600, Jan Beulich wrote:
>>>> On 19.10.17 at 17:49, <roger.pau@citrix.com> wrote:
>> On Thu, Sep 21, 2017 at 11:02:03PM -0400, Lan Tianyu wrote:
>>> --- a/xen/arch/x86/hvm/vioapic.c
>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>> @@ -561,11 +561,25 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>>>  {
>>>      unsigned int pin;
>>>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
>>> +    struct arch_irq_remapping_request request;
>>>  
>>>      if ( !vioapic )
>>>          return -EINVAL;
>>>  
>>> -    return vioapic->redirtbl[pin].fields.vector;
>>> +    irq_request_ioapic_fill(&request, vioapic->id, vioapic->redirtbl[pin].bits);
>>> +    if ( viommu_check_irq_remapping(vioapic->domain, &request) )
>>> +    {
>>> +        int err;
>>> +        struct arch_irq_remapping_info info;
>>> +
>>> +        err = viommu_get_irq_info(vioapic->domain, &request, &info);
>>> +        return !err ? info.vector : err;
>> 
>> You can simplify this as return err :? info.vector;
>
>At which point the local variable becomes pretty pointless.

Maybe we can remove 'err' and return
unlikely(viommu_get_irq_info(...)) ?: info.vector;

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 6/29] tools/libxl: Add a user configurable parameter to control vIOMMU attributes
  2017-10-19  9:49   ` Roger Pau Monné
@ 2017-10-20  1:36     ` Chao Gao
  0 siblings, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-20  1:36 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 10:49:22AM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:47PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> A field, viommu_info, is added to struct libxl_domain_build_info. Several
>> attributes can be specified by guest config file for virtual IOMMU. These
>> attributes are used for DMAR construction and vIOMMU creation.
>
>IMHO this should come much later in the series, ideally you would
>introduce the xl/libxl code in the last patches, together with the
>xl.cfg man page change.

It can be put to the end of this series. But I prefer to introduce the
vIOMMU from up to down (means the use interface goes first and then how
to implement a vIOMMU step by step) for it may be easier to understand. 

>
>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> index 9123585..decd7a8 100644
>> --- a/tools/libxl/libxl_create.c
>> +++ b/tools/libxl/libxl_create.c
>> @@ -27,6 +27,8 @@
>>  
>>  #include <xen-xsm/flask/flask.h>
>>  
>> +#define VIOMMU_VTD_BASE_ADDR        0xfed90000ULL
>
>This should be in libxl_arch.h see LAPIC_BASE_ADDRESS.

Agree.

>
>> +
>>  int libxl__domain_create_info_setdefault(libxl__gc *gc,
>>                                           libxl_domain_create_info *c_info)
>>  {
>> @@ -59,6 +61,47 @@ void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
>>                              LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT;
>>  }
>>  
>> +static int libxl__viommu_set_default(libxl__gc *gc,
>> +                                     libxl_domain_build_info *b_info)
>> +{
>> +    int i;
>> +
>> +    if (!b_info->num_viommus)
>> +        return 0;
>> +
>> +    for (i = 0; i < b_info->num_viommus; i++) {
>> +        libxl_viommu_info *viommu = &b_info->viommu[i];
>> +
>> +        if (libxl_defbool_is_default(viommu->intremap))
>> +            libxl_defbool_set(&viommu->intremap, true);
>> +
>> +        if (!libxl_defbool_val(viommu->intremap)) {
>> +            LOGE(ERROR, "Cannot create one virtual VTD without intremap");
>> +            return ERROR_INVAL;
>> +        }
>> +
>> +        if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
>> +            /*
>> +             * If there are multiple vIOMMUs, we need arrange all vIOMMUs to
>> +             * avoid overlap. Put a check here in case we get here for multiple
>> +             * vIOMMUs case.
>> +             */
>> +            if (b_info->num_viommus > 1) {
>> +                LOGE(ERROR, "Multiple vIOMMUs support is under implementation");
>
>s/LOGE/LOG/ LOGE should only be used when errno is set (which is not
>the case here).

yes.

>
>> +                return ERROR_INVAL;
>> +            }
>> +
>> +            /* Set default values to unexposed fields */
>> +            viommu->base_addr = VIOMMU_VTD_BASE_ADDR;
>> +
>> +            /* Set desired capbilities */
>> +            viommu->cap = VIOMMU_CAP_IRQ_REMAPPING;
>
>I'm not sure whether this code should be in libxl_x86.c, but
>libxl__domain_build_info_setdefault is already quite messed up, so I
>guess it's fine.
>
>> +        }
>
>Shouldn't this be:
>
>switch(viommu->type) {
>case LIBXL_VIOMMU_TYPE_INTEL_VTD:
>    ...
>    break;
>
>default:
>    return ERROR_INVAL;
>}
>
>So that you catch type being set to an invalid vIOMMU type?

sure. Will update.

>
>> +    if (d_config->b_info.num_viommus > 1) {
>> +        ret = ERROR_INVAL;
>> +        LOGD(ERROR, domid, "Cannot support multiple vIOMMUs");
>> +        goto error_out;
>> +    }
>
>Er, you already have this check in libxl__viommu_set_default, and in
>any case I would just rely on the hypervisor failing to create more
>than one vIOMMU per domain, rather than adding the same check here.

It is fine to me. Will remove all checks against viommu numbers in
toolstack.

Thanks
chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 7/29] tools/libxl: build DMAR table for a guest with one virtual VTD
  2017-10-19 10:00   ` Roger Pau Monné
@ 2017-10-20  1:44     ` Chao Gao
  0 siblings, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-20  1:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 11:00:27AM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:48PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> A new logic is added to build ACPI DMAR table in tool stack for a guest
>> with one virtual VTD and pass through it to guest via existing mechanism. If
>> there already are ACPI tables needed to pass through, we joint the tables.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> 
>> ---
>> +/*
>> + * For hvm, we don't need build acpi in libxl. Instead, it's built in hvmloader.
>> + * But if one hvm has virtual VTD(s), we build DMAR table for it and joint this
>> + * table with existing content in acpi_modules in order to employ HVM
>> + * firmware pass-through mechanism to pass-through DMAR table.
>> + */
>> +static int libxl__dom_load_acpi_hvm(libxl__gc *gc,
>> +                                    const libxl_domain_build_info *b_info,
>> +                                    struct xc_dom_image *dom)
>> +{
>
>AFAICT there's some code duplication between libxl__dom_load_acpi_hvm
>and libxl__dom_load_acpi_pvh, isn't there a chance you could put this
>in a common function?

Will give it a shot.

>
>> +    struct acpi_config config = { 0 };
>> +    struct acpi_ctxt ctxt;
>> +    void *table;
>> +    uint32_t len;
>> +
>> +    if ((b_info->type != LIBXL_DOMAIN_TYPE_HVM) ||
>> +        (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) ||
>> +        (b_info->num_viommus != 1) ||
>> +        (b_info->viommu[0].type != LIBXL_VIOMMU_TYPE_INTEL_VTD))
>> +        return 0;
>> +
>> +    ctxt.mem_ops.alloc = acpi_memalign;
>> +    ctxt.mem_ops.v2p = virt_to_phys;
>> +    ctxt.mem_ops.free = acpi_mem_free;
>> +
>> +    if (libxl_defbool_val(b_info->viommu[0].intremap))
>> +        config.iommu_intremap_supported = true;
>> +    /* x2apic is always enabled since in no case we must disable it */
>> +    config.iommu_x2apic_supported = true;
>> +    config.iommu_base_addr = b_info->viommu[0].base_addr;
>
>I don't see libxl__dom_load_acpi_pvh setting any of the vIOMMU fields.

I didn't try to enable vIOMMU for PVH. I will attemp to add vIOMMU
support for PVH and put those patches at the end of this series. 

>
>> +int libxl__dom_load_acpi(libxl__gc *gc,
>> +                         const libxl_domain_build_info *b_info,
>> +                         struct xc_dom_image *dom)
>> +{
>> +
>> +    if (b_info->type != LIBXL_DOMAIN_TYPE_HVM)
>> +        return 0;
>
>Keep in mind a new PVH domain type has been introduced recently in
>libxl, you will have to change this to b_info->type == LIBXL_DOMAIN_TYPE_PV.

Thanks for your kind reminder.

Chao


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 10/29] vtd: add and align register definitions
  2017-10-19 10:21   ` Roger Pau Monné
@ 2017-10-20  1:47     ` Chao Gao
  0 siblings, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-20  1:47 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 11:21:35AM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:51PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> No functional changes.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>
>Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks

>
>Would have been nice to maybe split this into two, one patch that
>simply fixes the alignment and another one that introduces the new
>defines (or even introduce the new defines when they are actually
>needed).

Will divide it into two parts.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM
  2017-10-19 11:20   ` Roger Pau Monné
@ 2017-10-20  2:46     ` Chao Gao
  2017-10-20  6:56       ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Chao Gao @ 2017-10-20  2:46 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 12:20:35PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:52PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> This patch adds create/destroy function for the emulated VTD
>> and adapts it to the common VIOMMU abstraction.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  
>> -obj-y += iommu.o
>>  obj-y += dmar.o
>> -obj-y += utils.o
>> -obj-y += qinval.o
>>  obj-y += intremap.o
>> +obj-y += iommu.o
>> +obj-y += qinval.o
>>  obj-y += quirks.o
>> +obj-y += utils.o
>
>Why do you need to shuffle the list above?

I placed them in alphabetic order.

>
>Also I'm not sure the Intel vIOMMU implementation should live here. As
>you can see the path is:
>
>xen/drivers/passthrough/vtd/
>
>The vIOMMU is not tied to passthrough at all, so I would rather place
>it in:
>
>xen/drivers/vvtd/
>
>Or maybe you can create something like:
>
>xen/drivers/viommu/
>
>So that all vIOMMU implementations can share some code.
>

vvtd and vtd use the same header files (i.g. vtd.h). That is why we put
it there.  If that, we shoule move the related header files to a public
directory.

>>  #define cap_isoch(c)        (((c) >> 23) & 1)
>>  #define cap_qos(c)        (((c) >> 22) & 1)
>>  #define cap_mgaw(c)        ((((c) >> 16) & 0x3f) + 1)
>> -#define cap_sagaw(c)        (((c) >> 8) & 0x1f)
>> +#define cap_set_mgaw(c)     ((((c) - 1) & 0x3f) << 16)
>> +#define cap_sagaw(c)        (((c) >> DMA_CAP_SAGAW_SHIFT) & 0x1f)
>>  #define cap_caching_mode(c)    (((c) >> 7) & 1)
>>  #define cap_phmr(c)        (((c) >> 6) & 1)
>>  #define cap_plmr(c)        (((c) >> 5) & 1)
>> @@ -104,10 +113,16 @@
>>  #define ecap_niotlb_iunits(e)    ((((e) >> 24) & 0xff) + 1)
>>  #define ecap_iotlb_offset(e)     ((((e) >> 8) & 0x3ff) * 16)
>>  #define ecap_coherent(e)         ((e >> 0) & 0x1)
>> -#define ecap_queued_inval(e)     ((e >> 1) & 0x1)
>> +#define DMA_ECAP_QI_SHIFT        1
>> +#define DMA_ECAP_QI              (1ULL << DMA_ECAP_QI_SHIFT)
>> +#define ecap_queued_inval(e)     ((e >> DMA_ECAP_QI_SHIFT) & 0x1)
>
>Looks like this could be based on MASK_EXTR instead, but seeing how
>the file is full of open-coded mask extracts I'm not sure it's worth
>it anymore.
>
>>  #define ecap_dev_iotlb(e)        ((e >> 2) & 0x1)
>> -#define ecap_intr_remap(e)       ((e >> 3) & 0x1)
>> -#define ecap_eim(e)              ((e >> 4) & 0x1)
>> +#define DMA_ECAP_IR_SHIFT        3
>> +#define DMA_ECAP_IR              (1ULL << DMA_ECAP_IR_SHIFT)
>> +#define ecap_intr_remap(e)       ((e >> DMA_ECAP_IR_SHIFT) & 0x1)
>> +#define DMA_ECAP_EIM_SHIFT       4
>> +#define DMA_ECAP_EIM             (1ULL << DMA_ECAP_EIM_SHIFT)
>> +#define ecap_eim(e)              ((e >> DMA_ECAP_EIM_SHIFT) & 0x1)
>
>Maybe worth placing all the DMA_ECAP_* defines in a separate section?
>Seems like how it's done for other features like DMA_FSTS or
>DMA_CCMD.

Got it.

>> +
>> +/* Supported capabilities by vvtd */
>> +unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
>
>static?
>
>Or even better, why is this not a define like VIOMMU_MAX_CAPS or
>similar.

Yeah. It should be renamed to VVTD_MAX_CAPS.

>
>> +
>> +union hvm_hw_vvtd_regs {
>> +    uint32_t data32[256];
>> +    uint64_t data64[128];
>> +};
>
>Do you really need to store all the register space instead of only
>storing specific registers?

I prefer to store all the registers for we don't need a trick to map
the real offset in hardware to the index in the array.

>
>> +
>> +struct vvtd {
>> +    /* Address range of remapping hardware register-set */
>> +    uint64_t base_addr;
>> +    uint64_t length;
>
>The length field doesn't seem to be used below.

will remove it.

>
>> +    /* Point back to the owner domain */
>> +    struct domain *domain;
>> +    union hvm_hw_vvtd_regs *regs;
>
>Does this need to be a pointer?

Seems not.
>
>> +    struct page_info *regs_page;
>> +};
>> +
>> +static int vvtd_create(struct domain *d, struct viommu *viommu)
>> +{
>> +    struct vvtd *vvtd;
>> +    int ret;
>> +
>> +    if ( !is_hvm_domain(d) || (viommu->base_address & (PAGE_SIZE - 1)) ||
>> +        (~vvtd_caps & viommu->caps) )
>> +        return -EINVAL;
>> +
>> +    ret = -ENOMEM;
>> +    vvtd = xzalloc_bytes(sizeof(struct vvtd));
>> +    if ( !vvtd )
>> +        return ret;
>> +
>> +    vvtd->regs_page = alloc_domheap_page(d, MEMF_no_owner);
>> +    if ( !vvtd->regs_page )
>> +        goto out1;
>> +
>> +    vvtd->regs = __map_domain_page_global(vvtd->regs_page);
>> +    if ( !vvtd->regs )
>> +        goto out2;
>> +    clear_page(vvtd->regs);
>
>Not sure why vvtd->regs needs to be a pointer, and why it needs to use
>a full page. AFAICT the size of hvm_hw_vvtd_regs is 1024B, so you are
>wasting 3/4 of a page.

I will define registers as an array directly and 
shrink the size to the number we are really used now.

>> +struct viommu_ops vvtd_hvm_vmx_ops = {
>> +    .create = vvtd_create,
>> +    .destroy = vvtd_destroy
>> +};
>> +
>> +static int vvtd_register(void)
>> +{
>> +    viommu_register_type(VIOMMU_TYPE_INTEL_VTD, &vvtd_hvm_vmx_ops);
>> +    return 0;
>> +}
>> +__initcall(vvtd_register);
>
>As commented in another patch I think the vIOMMU types should be
>registered using a method similar to REGISTER_SCHEDULER.

Both are ok to me. Will follow your suggestion.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD
  2017-10-19 11:34   ` Roger Pau Monné
@ 2017-10-20  2:58     ` Chao Gao
  2017-10-20  9:51       ` Roger Pau Monné
  0 siblings, 1 reply; 108+ messages in thread
From: Chao Gao @ 2017-10-20  2:58 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 12:34:54PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:53PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> This patch adds VVTD MMIO handler to deal with MMIO access.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/drivers/passthrough/vtd/vvtd.c | 91 ++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 91 insertions(+)
>> 
>> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
>> index c851ec7..a3002c3 100644
>> --- a/xen/drivers/passthrough/vtd/vvtd.c
>> +++ b/xen/drivers/passthrough/vtd/vvtd.c
>> @@ -47,6 +47,29 @@ struct vvtd {
>>      struct page_info *regs_page;
>>  };
>>  
>> +/* Setting viommu_verbose enables debugging messages of vIOMMU */
>> +bool __read_mostly viommu_verbose;
>> +boolean_runtime_param("viommu_verbose", viommu_verbose);
>> +
>> +#ifndef NDEBUG
>> +#define vvtd_info(fmt...) do {                    \
>> +    if ( viommu_verbose )                         \
>> +        gprintk(XENLOG_G_INFO, ## fmt);           \
>
>If you use gprintk you should use XENLOG_INFO, the '_G_' variants are
>only used with plain printk.
>
>> +} while(0)
>> +#define vvtd_debug(fmt...) do {                   \
>> +    if ( viommu_verbose && printk_ratelimit() )   \
>
>Not sure why you need printk_ratelimit, XENLOG_G_DEBUG is already
>rate-limited.
>
>> +        printk(XENLOG_G_DEBUG fmt);               \
>
>Any reason why vvtd_info uses gprintk and here you use printk?
>
>> +} while(0)
>> +#else
>> +#define vvtd_info(fmt...) do {} while(0)
>> +#define vvtd_debug(fmt...) do {} while(0)
>
>No need for 'fmt...' just '...' will suffice since you are discarding
>the parameters anyway.
>
>> +#endif
>> +
>> +struct vvtd *domain_vvtd(struct domain *d)
>> +{
>> +    return (d->viommu) ? d->viommu->priv : NULL;
>
>Unneeded parentheses around d->viommu.
>
>Also, it seems wring to call domain_vvtd with !d->viommu. So I think
>this helper should just be removed, and d->viommu->priv fetched
>directly.
>
>> +}
>> +
>>  static inline void vvtd_set_reg(struct vvtd *vtd, uint32_t reg, uint32_t value)
>>  {
>>      vtd->regs->data32[reg/sizeof(uint32_t)] = value;
>> @@ -68,6 +91,73 @@ static inline uint64_t vvtd_get_reg_quad(struct vvtd *vtd, uint32_t reg)
>>      return vtd->regs->data64[reg/sizeof(uint64_t)];
>>  }
>>  
>> +static int vvtd_in_range(struct vcpu *v, unsigned long addr)
>> +{
>> +    struct vvtd *vvtd = domain_vvtd(v->domain);
>> +
>> +    if ( vvtd )
>> +        return (addr >= vvtd->base_addr) &&
>> +               (addr < vvtd->base_addr + PAGE_SIZE);
>
>So the register set covers a PAGE_SIZE, but hvm_hw_vvtd_regs only
>covers from 0 to 1024B, it seems like there's something wrong here...
>
>> +    return 0;
>> +}
>> +
>> +static int vvtd_read(struct vcpu *v, unsigned long addr,
>> +                     unsigned int len, unsigned long *pval)
>> +{
>> +    struct vvtd *vvtd = domain_vvtd(v->domain);
>> +    unsigned int offset = addr - vvtd->base_addr;
>> +
>> +    vvtd_info("Read offset %x len %d\n", offset, len);
>> +
>> +    if ( (len != 4 && len != 8) || (offset & (len - 1)) )
>
>What value does hardware return when performing unaligned reads or
>reads with wrong size?

According to VT-d spec section 10.2, "Software must access 64-bit and
128-bit registers as either aligned quadwords or aligned doublewords".
I am afraid there is no specific hardware action for unaligned access
information. We can treat it as undefined? Then do nothing.
But I did see windows driver has such accesses. We need to add a
workaround for windows later.

>
>Here you return with pval not set, which is dangerous.

Indeed. But I need check whether the pval is initialized by the caller.
If that, it is safe.

>
>> +        return X86EMUL_OKAY;
>> +
>> +    if ( len == 4 )
>> +        *pval = vvtd_get_reg(vvtd, offset);
>> +    else
>> +        *pval = vvtd_get_reg_quad(vvtd, offset);
>
>...yet here you don't check for offset < 1024.
>
>> +
>> +    return X86EMUL_OKAY;
>> +}
>> +
>> +static int vvtd_write(struct vcpu *v, unsigned long addr,
>> +                      unsigned int len, unsigned long val)
>> +{
>> +    struct vvtd *vvtd = domain_vvtd(v->domain);
>> +    unsigned int offset = addr - vvtd->base_addr;
>> +
>> +    vvtd_info("Write offset %x len %d val %lx\n", offset, len, val);
>> +
>> +    if ( (len != 4 && len != 8) || (offset & (len - 1)) )
>> +        return X86EMUL_OKAY;
>> +
>> +    if ( len == 4 )
>> +    {
>> +        switch ( offset )
>> +        {
>> +        case DMAR_IEDATA_REG:
>> +        case DMAR_IEADDR_REG:
>> +        case DMAR_IEUADDR_REG:
>> +        case DMAR_FEDATA_REG:
>> +        case DMAR_FEADDR_REG:
>> +        case DMAR_FEUADDR_REG:
>> +            vvtd_set_reg(vvtd, offset, val);
>
>Hm, so you are using a full page when you only care for 6 4B
>registers? Seem like quite of a waste of memory.

Registers are added here when according features are introduced.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  2017-10-19 11:56   ` Roger Pau Monné
@ 2017-10-20  4:08     ` Chao Gao
  2017-10-20  6:57       ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Chao Gao @ 2017-10-20  4:08 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 12:56:45PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:54PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> Software sets this field to set/update the interrupt remapping table pointer
>> used by hardware. The interrupt remapping table pointer is specified through
>> the Interrupt Remapping Table Address (IRTA_REG) register.
>> 
>> This patch emulates this operation and adds some new fields in VVTD to track
>> info (e.g. the table's gfn and max supported entries) of interrupt remapping
>> table.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> 
>> ---
>> @@ -148,6 +205,18 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>>              break;
>>          }
>>      }
>> +    else /* len == 8 */
>> +    {
>> +        switch ( offset )
>> +        {
>> +        case DMAR_IRTA_REG:
>> +            vvtd_set_reg_quad(vvtd, DMAR_IRTA_REG, val);
>
>I have kind of a generic comment regarding the handlers in general,
>which I will just make here. Don't you need some kind of locking to
>prevent concurrent read/write accesses to the registers?

I think guest should be responsible to avoid concurrency.
Xen only needs to not be fooled (crashed) by a malicious guest.

>
>Also the 'if' to handle different sized accesses to the same registers
>seems quite cumbersome. I would think there's a better way to handle
>this with a single switch statement.

Will use only one switch statement and maybe add if-else for the
cases which can be accessed with different size.

Thanks
chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request
  2017-10-19 14:26   ` Roger Pau Monné
@ 2017-10-20  5:16     ` Chao Gao
  2017-10-20 10:01       ` Roger Pau Monné
  0 siblings, 1 reply; 108+ messages in thread
From: Chao Gao @ 2017-10-20  5:16 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 03:26:30PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:56PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> When a remapping interrupt request arrives, remapping hardware computes the
>> interrupt_index per the algorithm described in VTD spec
>> "Interrupt Remapping Table", interprets the IRTE and generates a remapped
>> interrupt request.
>> 
>> This patch introduces viommu_handle_irq_request() to emulate the process how
>> remapping hardware handles a remapping interrupt request.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> 
>> ---
>>  
>> +enum VTD_FAULT_TYPE
>> +{
>> +    /* Interrupt remapping transition faults */
>> +    VTD_FR_IR_REQ_RSVD      = 0x20, /* One or more IR request reserved
>> +                                     * fields set */
>> +    VTD_FR_IR_INDEX_OVER    = 0x21, /* Index value greater than max */
>> +    VTD_FR_IR_ENTRY_P       = 0x22, /* Present (P) not set in IRTE */
>> +    VTD_FR_IR_ROOT_INVAL    = 0x23, /* IR Root table invalid */
>> +    VTD_FR_IR_IRTE_RSVD     = 0x24, /* IRTE Rsvd field non-zero with
>> +                                     * Present flag set */
>> +    VTD_FR_IR_REQ_COMPAT    = 0x25, /* Encountered compatible IR
>> +                                     * request while disabled */
>> +    VTD_FR_IR_SID_ERR       = 0x26, /* Invalid Source-ID */
>> +};
>
>Why does this need to be an enum? Plus enum type names should not be
>all in uppercase.
>
>In any case, I would just use defines, like it's done for all other
>values in the file.

Sure. Will follow your suggestion.

>> +static void unmap_guest_page(void *virt)
>> +{
>> +    struct page_info *page;
>> +
>> +    ASSERT((unsigned long)virt & PAGE_MASK);
>
>I'm not sure I get the point of the check above.

I intended to check the address is 4K-page aligned. It should be

ASSERT(!((unsigned long)virt & (PAGE_SIZE - 1)))

>> +}
>> +
>> +static inline uint32_t irte_dest(struct vvtd *vvtd, uint32_t dest)
>> +{
>> +    /* In xAPIC mode, only 8-bits([15:8]) are valid */
>> +    return vvtd->status.eim_enabled ? dest
>                                       : MASK_EXTR(dest, IRTE_xAPIC_DEST_MASK);
>
>It's easier to read style wise.

sure.

>
>> +}
>> +
>>  static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
>>  {
>>      vvtd_info("%sable Interrupt Remapping",
>> @@ -255,6 +387,135 @@ static const struct hvm_mmio_ops vvtd_mmio_ops = {
>>      .write = vvtd_write
>>  };
>>  
>> +static void vvtd_handle_fault(struct vvtd *vvtd,
>> +                              struct arch_irq_remapping_request *irq,
>> +                              struct iremap_entry *irte,
>> +                              unsigned int fault,
>> +                              bool record_fault)
>> +{
>> +   if ( !record_fault )
>> +        return;
>> +
>> +    switch ( fault )
>> +    {
>> +    case VTD_FR_IR_SID_ERR:
>> +    case VTD_FR_IR_IRTE_RSVD:
>> +    case VTD_FR_IR_ENTRY_P:
>> +        if ( qinval_fault_disable(*irte) )
>> +            break;
>> +    /* fall through */
>> +    case VTD_FR_IR_INDEX_OVER:
>> +    case VTD_FR_IR_ROOT_INVAL:
>> +        /* TODO: handle fault (e.g. record and report this fault to VM */
>> +        break;
>> +
>> +    default:
>> +        gdprintk(XENLOG_INFO, "Can't handle VT-d fault %x\n", fault);
>
>You already defined some vvtd specific debug helpers, why are those
>not used here? gdprintk (as the 'd' denotes) is only for debug
>purposes.

The default case means we encounter a bug in our code. I want to output
this kind of message even for non-debug version. I should use gprintk.

>
>> +    }
>> +    return;
>> +}
>> +
>> +static bool vvtd_irq_request_sanity_check(const struct vvtd *vvtd,
>> +                                          struct arch_irq_remapping_request *irq)
>> +{
>> +    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
>> +    {
>> +        struct IO_APIC_route_remap_entry rte = { .val = irq->msg.rte };
>> +
>> +        ASSERT(rte.format);
>
>Is it fine to ASSERT here? Can't the guest set rte.format to whatever
>it wants?

Guest can use legacy format interrupt (i.e. rte.format = 0). However,
we only reach here when callback 'check_irq_remapping' return true and
for vvtd, 'check_irq_remapping' just returns the format bit of irq request.
If here ret.format isn't true, there must be a bug in our code.

>> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_REQ_RSVD, record_fault);
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( entry > vvtd->status.irt_max_entry )
>> +    {
>> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_INDEX_OVER, record_fault);
>> +        return -EACCES;
>> +    }
>> +
>> +    irt_page = map_guest_page(vvtd->domain,
>> +                              vvtd->status.irt + (entry >> IREMAP_ENTRY_ORDER));
>
>Since AFAICT you have to read this page(s) every time an interrupt
>needs to be delivered, wouldn't it make sense for performance reasons
>to have the page permanently mapped?

Yes. It is. Actually, we have a draft patch to do this. But to justify
the necessity, I should run some benchmark at first. Mapping a guest
page is slow on x86, right?

>
>What's the maximum number of pages that can be used here?

VT-d current support 2^16 entries at most. The size of each entry is 128
byte. Thus, we need 2^11 pages at most.

>
>> +    if ( IS_ERR(irt_page) )
>> +    {
>> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_ROOT_INVAL, record_fault);
>> +        return PTR_ERR(irt_page);
>> +    }
>> +
>> +    irte = irt_page + (entry % (1 << IREMAP_ENTRY_ORDER));
>> +    dest->val = irte->val;
>
>Not that it matters much, but for coherency reasons I would only set
>dest->val after all the checks have been performed.

agree.

Thanks
chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 16/29] x86/vvtd: decode interrupt attribute from IRTE
  2017-10-19 14:39   ` Roger Pau Monné
@ 2017-10-20  5:22     ` Chao Gao
  0 siblings, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-20  5:22 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 03:39:44PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:01:57PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> Without interrupt remapping, interrupt attributes can be extracted from
>> msi message or IOAPIC RTE. However, with interrupt remapping enabled,
>> the attributes are enclosed in the associated IRTE. This callback is
>> for cases in which the caller wants to acquire interrupt attributes, for
>> example:
>> 1. vioapic_get_vector(). With vIOMMU, the RTE may don't contain vector.
>                                                    ^ not
>> 2. perform EOI which is always based on the interrupt vector.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>> v3:
>>  - add example cases in which we will use this function.
>> ---
>>  xen/drivers/passthrough/vtd/vvtd.c | 23 ++++++++++++++++++++++-
>>  1 file changed, 22 insertions(+), 1 deletion(-)
>> 
>> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
>> index 90c00f5..5e22ace 100644
>> --- a/xen/drivers/passthrough/vtd/vvtd.c
>> +++ b/xen/drivers/passthrough/vtd/vvtd.c
>> @@ -516,6 +516,26 @@ static int vvtd_handle_irq_request(struct domain *d,
>>                           irte.remap.tm);
>>  }
>>  
>> +static int vvtd_get_irq_info(struct domain *d,
>> +                             struct arch_irq_remapping_request *irq,
>> +                             struct arch_irq_remapping_info *info)
>> +{
>> +    int ret;
>> +    struct iremap_entry irte;
>> +    struct vvtd *vvtd = domain_vvtd(d);
>
>I've realized that some of the helpers perform a if (!vvtd ) return
>check, while others don't (like this one). Are some handlers expected
>to be called without a vIOMMU?

No. I forgot to check the existence of a vIOMMU here.

Thanks
chao


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 24/29] tools/libxc: Add a new interface to bind remapping format msi with pirq
  2017-10-19 16:03   ` Roger Pau Monné
@ 2017-10-20  5:39     ` Chao Gao
  0 siblings, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-20  5:39 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Thu, Oct 19, 2017 at 05:03:26PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:02:05PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>
>The title for this patch it's wrong, it modifies both the hypervisor
>and libxc. Please fix it.
>
>> When exposing vIOMMU (vvtd) to guest, guest can configure the msi to
>> remapping format. For pass-through device, the physical interrupt now
>> can be bound with remapping format msi. This patch introduce a flag,
>> HVM_IRQ_DPCI_GUEST_REMAPPED, which indicate a physical interrupt is
>> bound with remapping format guest interrupt. Thus, we can use
>> (HVM_IRQ_DPCI_GUEST_REMAPPED | HVM_IRQ_DPCI_GUEST_MSI) to show the new
>> binding type. Also provide an new interface to manage the new binding.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> 
>> ---
>> diff --git a/xen/include/asm-x86/hvm/irq.h b/xen/include/asm-x86/hvm/irq.h
>> index bd8a918..4f5d37b 100644
>> --- a/xen/include/asm-x86/hvm/irq.h
>> +++ b/xen/include/asm-x86/hvm/irq.h
>> @@ -121,6 +121,7 @@ struct dev_intx_gsi_link {
>>  #define _HVM_IRQ_DPCI_GUEST_PCI_SHIFT           4
>>  #define _HVM_IRQ_DPCI_GUEST_MSI_SHIFT           5
>>  #define _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT        6
>> +#define _HVM_IRQ_DPCI_GUEST_REMAPPED_SHIFT      7
>>  #define _HVM_IRQ_DPCI_TRANSLATE_SHIFT          15
>>  #define HVM_IRQ_DPCI_MACH_PCI        (1u << _HVM_IRQ_DPCI_MACH_PCI_SHIFT)
>>  #define HVM_IRQ_DPCI_MACH_MSI        (1u << _HVM_IRQ_DPCI_MACH_MSI_SHIFT)
>> @@ -128,6 +129,7 @@ struct dev_intx_gsi_link {
>>  #define HVM_IRQ_DPCI_EOI_LATCH       (1u << _HVM_IRQ_DPCI_EOI_LATCH_SHIFT)
>>  #define HVM_IRQ_DPCI_GUEST_PCI       (1u << _HVM_IRQ_DPCI_GUEST_PCI_SHIFT)
>>  #define HVM_IRQ_DPCI_GUEST_MSI       (1u << _HVM_IRQ_DPCI_GUEST_MSI_SHIFT)
>> +#define HVM_IRQ_DPCI_GUEST_REMAPPED  (1u << _HVM_IRQ_DPCI_GUEST_REMAPPED_SHIFT)
>>  #define HVM_IRQ_DPCI_IDENTITY_GSI    (1u << _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT)
>>  #define HVM_IRQ_DPCI_TRANSLATE       (1u << _HVM_IRQ_DPCI_TRANSLATE_SHIFT)
>
>Please keep this sorted. It should go after the _GSI one.
>
>>  
>> @@ -137,6 +139,11 @@ struct hvm_gmsi_info {
>>              uint32_t gvec;
>>              uint32_t gflags;
>>          } legacy;
>> +        struct {
>> +            uint32_t source_id;
>> +            uint32_t data;
>> +            uint64_t addr;
>> +        } intremap;
>>      };
>>      int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
>>      bool posted; /* directly deliver to guest via VT-d PI? */
>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>> index 68854b6..8c59cfc 100644
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -559,6 +559,7 @@ typedef enum pt_irq_type_e {
>>      PT_IRQ_TYPE_MSI,
>>      PT_IRQ_TYPE_MSI_TRANSLATE,
>>      PT_IRQ_TYPE_SPI,    /* ARM: valid range 32-1019 */
>> +    PT_IRQ_TYPE_MSI_IR,
>
>Introducing a new irq type seems dubious, at the end this is still a
>MSI interrupt.
>
>>  } pt_irq_type_t;
>>  struct xen_domctl_bind_pt_irq {
>>      uint32_t machine_irq;
>> @@ -586,6 +587,12 @@ struct xen_domctl_bind_pt_irq {
>>              uint64_aligned_t gtable;
>>          } msi;
>>          struct {
>> +            uint32_t source_id;
>> +            uint32_t data;
>> +            uint64_t addr;
>> +            uint64_t gtable;
>> +        } msi_ir;
>
>Have you tried to expand gflags somehow so that you don't need a new
>type together with a new structure?

gflags doesn't have enough bits to contain so much information.

>
>It seems quite cumbersome and also involves adding more handlers to
>libxc.
>
>At the end this is a domctl interface, so you should be able to modify
>it at will.

Considering gtable and gflags are also needed for 'msi_ir', 
modifying the existing interface seems better than adding an new one.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults
  2017-10-19 16:31   ` Roger Pau Monné
@ 2017-10-20  5:54     ` Chao Gao
  2017-10-20 10:08       ` Roger Pau Monné
  0 siblings, 1 reply; 108+ messages in thread
From: Chao Gao @ 2017-10-20  5:54 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, andrew.cooper3, kevin.tian, jbeulich, xen-devel

On Thu, Oct 19, 2017 at 05:31:37PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:02:07PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> Interrupt translation faults are non-recoverable fault. When faults
>> are triggered, it needs to populate fault info to Fault Recording
>> Registers and inject vIOMMU msi interrupt to notify guest IOMMU driver
>> to deal with faults.
>> 
>> This patch emulates hardware's handling interrupt translation
>> faults (more information about the process can be found in VT-d spec,
>> chipter "Translation Faults", section "Non-Recoverable Fault
>> Reporting" and section "Non-Recoverable Logging").
>> Specifically, viommu_record_fault() records the fault information and
>> viommu_report_non_recoverable_fault() reports faults to software.
>> Currently, only Primary Fault Logging is supported and the Number of
>> Fault-recording Registers is 1.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/drivers/passthrough/vtd/iommu.h |  60 +++++++--
>>  xen/drivers/passthrough/vtd/vvtd.c  | 252 +++++++++++++++++++++++++++++++++++-
>>  2 files changed, 301 insertions(+), 11 deletions(-)
>> 
>> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
>> index 790384f..e19b045 100644
>> --- a/xen/drivers/passthrough/vtd/iommu.h
>> +++ b/xen/drivers/passthrough/vtd/iommu.h
>> @@ -198,26 +198,66 @@
>>  #define DMA_CCMD_CAIG_MASK(x) (((u64)x) & ((u64) 0x3 << 59))
>>  
>>  /* FECTL_REG */
>> -#define DMA_FECTL_IM (((u64)1) << 31)
>> +#define DMA_FECTL_IM_SHIFT 31
>> +#define DMA_FECTL_IM (1U << DMA_FECTL_IM_SHIFT)
>> +#define DMA_FECTL_IP_SHIFT 30
>> +#define DMA_FECTL_IP (1U << DMA_FECTL_IP_SHIFT)
>
>Is it fine to change those from uint64_t to unsigned int?

Yes. The FECTL and FSTS are 32-bit registers.

>
>>  
>>  /* FSTS_REG */
>> -#define DMA_FSTS_PFO ((u64)1 << 0)
>> -#define DMA_FSTS_PPF ((u64)1 << 1)
>> -#define DMA_FSTS_AFO ((u64)1 << 2)
>> -#define DMA_FSTS_APF ((u64)1 << 3)
>> -#define DMA_FSTS_IQE ((u64)1 << 4)
>> -#define DMA_FSTS_ICE ((u64)1 << 5)
>> -#define DMA_FSTS_ITE ((u64)1 << 6)
>> -#define DMA_FSTS_FAULTS    DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_AFO | DMA_FSTS_APF | DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE
>> +#define DMA_FSTS_PFO_SHIFT 0
>> +#define DMA_FSTS_PFO (1U << DMA_FSTS_PFO_SHIFT)
>> +#define DMA_FSTS_PPF_SHIFT 1
>> +#define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_SHIFT)
>> +#define DMA_FSTS_AFO (1U << 2)
>> +#define DMA_FSTS_APF (1U << 3)
>> +#define DMA_FSTS_IQE (1U << 4)
>> +#define DMA_FSTS_ICE (1U << 5)
>> +#define DMA_FSTS_ITE (1U << 6)
>
>This seemingly non-functional changes should be done in a separate
>patch.

sure.

>> +static int vvtd_alloc_frcd(struct vvtd *vvtd)
>> +{
>> +    int prev;
>> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
>> +    unsigned int base = cap_fault_reg_offset(cap);
>> +
>> +    /* Set the F bit to indicate the FRCD is in use. */
>> +    if ( !vvtd_test_and_set_bit(vvtd,
>> +                                base + vvtd->status.fault_index * DMA_FRCD_LEN +
>> +                                DMA_FRCD3_OFFSET, DMA_FRCD_F_SHIFT) )
>> +    {
>> +        prev = vvtd->status.fault_index;
>> +        vvtd->status.fault_index = (prev + 1) % cap_num_fault_regs(cap);
>> +        return vvtd->status.fault_index;
>
>I would prefer that you return the index as an unsigned int parameter
>passed by reference rather than as the return value of the function,
>but that might not be the preference of others.

What are the pros and cons?

>> +static int vvtd_record_fault(struct vvtd *vvtd,
>> +                             struct arch_irq_remapping_request *request,
>> +                             int reason)
>> +{
>> +    struct vtd_fault_record_register frcd;
>> +    int fault_index;
>> +
>> +    switch(reason)
>> +    {
>> +    case VTD_FR_IR_REQ_RSVD:
>> +    case VTD_FR_IR_INDEX_OVER:
>> +    case VTD_FR_IR_ENTRY_P:
>> +    case VTD_FR_IR_ROOT_INVAL:
>> +    case VTD_FR_IR_IRTE_RSVD:
>> +    case VTD_FR_IR_REQ_COMPAT:
>> +    case VTD_FR_IR_SID_ERR:
>> +        if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_PFO_SHIFT) )
>> +            return X86EMUL_OKAY;
>> +
>> +        /* No available Fault Record means Fault overflowed */
>> +        fault_index = vvtd_alloc_frcd(vvtd);
>> +        if ( fault_index == -1 )
>
>Erm, wouldn't vvtd_alloc_frcd return -ENOMEM in case of error? Ie: you
>should check if ( fault_index < 0 ).

It is a mistake.

Thanks
Chao


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM
  2017-10-20  6:56       ` Jan Beulich
@ 2017-10-20  6:12         ` Chao Gao
  2017-10-20  8:37         ` Lan Tianyu
  1 sibling, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-20  6:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Roger Pau Monné

On Fri, Oct 20, 2017 at 12:56:03AM -0600, Jan Beulich wrote:
>>>> On 20.10.17 at 04:46, <chao.gao@intel.com> wrote:
>> On Thu, Oct 19, 2017 at 12:20:35PM +0100, Roger Pau Monné wrote:
>>>On Thu, Sep 21, 2017 at 11:01:52PM -0400, Lan Tianyu wrote:
>>>> From: Chao Gao <chao.gao@intel.com>
>>>> 
>>>> This patch adds create/destroy function for the emulated VTD
>>>> and adapts it to the common VIOMMU abstraction.
>>>> 
>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>>>> ---
>>>>  
>>>> -obj-y += iommu.o
>>>>  obj-y += dmar.o
>>>> -obj-y += utils.o
>>>> -obj-y += qinval.o
>>>>  obj-y += intremap.o
>>>> +obj-y += iommu.o
>>>> +obj-y += qinval.o
>>>>  obj-y += quirks.o
>>>> +obj-y += utils.o
>>>
>>>Why do you need to shuffle the list above?
>> 
>> I placed them in alphabetic order.
>
>Which is appreciated. But this being non-essential for the patch, it
>would avoid (valid) reviewer questions if you said in the description
>this is an intended but non-essential change.

Sure. I will keep this in mind.

>
>>>Also I'm not sure the Intel vIOMMU implementation should live here. As
>>>you can see the path is:
>>>
>>>xen/drivers/passthrough/vtd/
>>>
>>>The vIOMMU is not tied to passthrough at all, so I would rather place
>>>it in:
>
>Hmm, is vIOMMU usable without an actual backing IOMMU?

I think yes. Now, All vIOMMU features are emulated.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 25/29] x86/vmsi: Hook delivering remapping format msi to guest
  2017-10-19 16:07   ` Roger Pau Monné
@ 2017-10-20  6:48     ` Jan Beulich
  0 siblings, 0 replies; 108+ messages in thread
From: Jan Beulich @ 2017-10-20  6:48 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, andrew.cooper3, kevin.tian, xen-devel, Chao Gao

>>> On 19.10.17 at 18:07, <roger.pau@citrix.com> wrote:
> On Thu, Sep 21, 2017 at 11:02:06PM -0400, Lan Tianyu wrote:
>> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
>> index 6196334..349a8cf 100644
>> --- a/xen/drivers/passthrough/io.c
>> +++ b/xen/drivers/passthrough/io.c
>> @@ -942,21 +942,20 @@ static void __msi_pirq_eoi(struct hvm_pirq_dpci *pirq_dpci)
>>  static int _hvm_dpci_msi_eoi(struct domain *d,
>>                               struct hvm_pirq_dpci *pirq_dpci, void *arg)
>>  {
>> -    int vector = (long)arg;
>> +    uint8_t vector, dlm, vector_target = (long)arg;
> 
> Since you are changing this, please cast to (uint8_t) instead.

That would cause a compiler warning - long (or unsigned long) is
the right type to use (on Xen and Linux at least) when wanting to
convert a pointer to an integer.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM
  2017-10-20  2:46     ` Chao Gao
@ 2017-10-20  6:56       ` Jan Beulich
  2017-10-20  6:12         ` Chao Gao
  2017-10-20  8:37         ` Lan Tianyu
  0 siblings, 2 replies; 108+ messages in thread
From: Jan Beulich @ 2017-10-20  6:56 UTC (permalink / raw)
  To: Roger Pau Monné, Chao Gao
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel

>>> On 20.10.17 at 04:46, <chao.gao@intel.com> wrote:
> On Thu, Oct 19, 2017 at 12:20:35PM +0100, Roger Pau Monné wrote:
>>On Thu, Sep 21, 2017 at 11:01:52PM -0400, Lan Tianyu wrote:
>>> From: Chao Gao <chao.gao@intel.com>
>>> 
>>> This patch adds create/destroy function for the emulated VTD
>>> and adapts it to the common VIOMMU abstraction.
>>> 
>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>>> ---
>>>  
>>> -obj-y += iommu.o
>>>  obj-y += dmar.o
>>> -obj-y += utils.o
>>> -obj-y += qinval.o
>>>  obj-y += intremap.o
>>> +obj-y += iommu.o
>>> +obj-y += qinval.o
>>>  obj-y += quirks.o
>>> +obj-y += utils.o
>>
>>Why do you need to shuffle the list above?
> 
> I placed them in alphabetic order.

Which is appreciated. But this being non-essential for the patch, it
would avoid (valid) reviewer questions if you said in the description
this is an intended but non-essential change.

>>Also I'm not sure the Intel vIOMMU implementation should live here. As
>>you can see the path is:
>>
>>xen/drivers/passthrough/vtd/
>>
>>The vIOMMU is not tied to passthrough at all, so I would rather place
>>it in:

Hmm, is vIOMMU usable without an actual backing IOMMU?

>>xen/drivers/vvtd/
>>
>>Or maybe you can create something like:
>>
>>xen/drivers/viommu/
>>
>>So that all vIOMMU implementations can share some code.
>>
> 
> vvtd and vtd use the same header files (i.g. vtd.h). That is why we put
> it there.  If that, we shoule move the related header files to a public
> directory.

And AMD (long ago) had placed their (still incomplete) virtual
implementation into the same directory as well. I.e. at this point
I'm not really opposed to the proposed placement here, albeit
I can see the point of Roger's argument.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD
  2017-10-20  4:08     ` Chao Gao
@ 2017-10-20  6:57       ` Jan Beulich
  0 siblings, 0 replies; 108+ messages in thread
From: Jan Beulich @ 2017-10-20  6:57 UTC (permalink / raw)
  To: Chao Gao
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Roger Pau Monné

>>> On 20.10.17 at 06:08, <chao.gao@intel.com> wrote:
> On Thu, Oct 19, 2017 at 12:56:45PM +0100, Roger Pau Monné wrote:
>>On Thu, Sep 21, 2017 at 11:01:54PM -0400, Lan Tianyu wrote:
>>> @@ -148,6 +205,18 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>>>              break;
>>>          }
>>>      }
>>> +    else /* len == 8 */
>>> +    {
>>> +        switch ( offset )
>>> +        {
>>> +        case DMAR_IRTA_REG:
>>> +            vvtd_set_reg_quad(vvtd, DMAR_IRTA_REG, val);
>>
>>I have kind of a generic comment regarding the handlers in general,
>>which I will just make here. Don't you need some kind of locking to
>>prevent concurrent read/write accesses to the registers?
> 
> I think guest should be responsible to avoid concurrency.
> Xen only needs to not be fooled (crashed) by a malicious guest.

But can you assure this without doing some locking yourself?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM
  2017-10-20  6:56       ` Jan Beulich
  2017-10-20  6:12         ` Chao Gao
@ 2017-10-20  8:37         ` Lan Tianyu
  1 sibling, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-20  8:37 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné, Chao Gao
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel

On 2017年10月20日 14:56, Jan Beulich wrote:
>>>> On 20.10.17 at 04:46, <chao.gao@intel.com> wrote:
>> On Thu, Oct 19, 2017 at 12:20:35PM +0100, Roger Pau Monné wrote:
>>> On Thu, Sep 21, 2017 at 11:01:52PM -0400, Lan Tianyu wrote:
>>>> From: Chao Gao <chao.gao@intel.com>
>>>>
>>>> This patch adds create/destroy function for the emulated VTD
>>>> and adapts it to the common VIOMMU abstraction.
>>>>
>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>>>> ---
>>>>  
>>>> -obj-y += iommu.o
>>>>  obj-y += dmar.o
>>>> -obj-y += utils.o
>>>> -obj-y += qinval.o
>>>>  obj-y += intremap.o
>>>> +obj-y += iommu.o
>>>> +obj-y += qinval.o
>>>>  obj-y += quirks.o
>>>> +obj-y += utils.o
>>>
>>> Why do you need to shuffle the list above?
>>
>> I placed them in alphabetic order.
> 
> Which is appreciated. But this being non-essential for the patch, it
> would avoid (valid) reviewer questions if you said in the description
> this is an intended but non-essential change.
> 
>>> Also I'm not sure the Intel vIOMMU implementation should live here. As
>>> you can see the path is:
>>>
>>> xen/drivers/passthrough/vtd/
>>>
>>> The vIOMMU is not tied to passthrough at all, so I would rather place
>>> it in:
> 
> Hmm, is vIOMMU usable without an actual backing IOMMU?

For interrupt remapping support, we can emulate it without physical IOMMU.

> 
>>> xen/drivers/vvtd/
>>>
>>> Or maybe you can create something like:
>>>
>>> xen/drivers/viommu/
>>>
>>> So that all vIOMMU implementations can share some code.
>>>
>>
>> vvtd and vtd use the same header files (i.g. vtd.h). That is why we put
>> it there.  If that, we shoule move the related header files to a public
>> directory.
> 
> And AMD (long ago) had placed their (still incomplete) virtual
> implementation into the same directory as well. I.e. at this point
> I'm not really opposed to the proposed placement here, albeit
> I can see the point of Roger's argument.
> 
> Jan
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD
  2017-10-20  2:58     ` Chao Gao
@ 2017-10-20  9:51       ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-20  9:51 UTC (permalink / raw)
  To: Chao Gao, Lan Tianyu, xen-devel, andrew.cooper3, George.Dunlap,
	ian.jackson, jbeulich, konrad.wilk, sstabellini, tim, wei.liu2,
	kevin.tian

On Fri, Oct 20, 2017 at 10:58:32AM +0800, Chao Gao wrote:
> On Thu, Oct 19, 2017 at 12:34:54PM +0100, Roger Pau Monné wrote:
> >On Thu, Sep 21, 2017 at 11:01:53PM -0400, Lan Tianyu wrote:
> >> +    return 0;
> >> +}
> >> +
> >> +static int vvtd_read(struct vcpu *v, unsigned long addr,
> >> +                     unsigned int len, unsigned long *pval)
> >> +{
> >> +    struct vvtd *vvtd = domain_vvtd(v->domain);
> >> +    unsigned int offset = addr - vvtd->base_addr;
> >> +
> >> +    vvtd_info("Read offset %x len %d\n", offset, len);
> >> +
> >> +    if ( (len != 4 && len != 8) || (offset & (len - 1)) )
> >
> >What value does hardware return when performing unaligned reads or
> >reads with wrong size?
> 
> According to VT-d spec section 10.2, "Software must access 64-bit and
> 128-bit registers as either aligned quadwords or aligned doublewords".
> I am afraid there is no specific hardware action for unaligned access
> information. We can treat it as undefined? Then do nothing.
> But I did see windows driver has such accesses. We need to add a
> workaround for windows later.

I would recommend that you do *pval = ~0ul; in that case then.

> >
> >Here you return with pval not set, which is dangerous.
> 
> Indeed. But I need check whether the pval is initialized by the caller.
> If that, it is safe.

Yes, this was recently fixed as part of a XSA, but I would rather
prefer to set pval here in the error case.

> >
> >> +        return X86EMUL_OKAY;
> >> +
> >> +    if ( len == 4 )
> >> +        *pval = vvtd_get_reg(vvtd, offset);
> >> +    else
> >> +        *pval = vvtd_get_reg_quad(vvtd, offset);
> >
> >...yet here you don't check for offset < 1024.
> >
> >> +
> >> +    return X86EMUL_OKAY;
> >> +}
> >> +
> >> +static int vvtd_write(struct vcpu *v, unsigned long addr,
> >> +                      unsigned int len, unsigned long val)
> >> +{
> >> +    struct vvtd *vvtd = domain_vvtd(v->domain);
> >> +    unsigned int offset = addr - vvtd->base_addr;
> >> +
> >> +    vvtd_info("Write offset %x len %d val %lx\n", offset, len, val);
> >> +
> >> +    if ( (len != 4 && len != 8) || (offset & (len - 1)) )
> >> +        return X86EMUL_OKAY;
> >> +
> >> +    if ( len == 4 )
> >> +    {
> >> +        switch ( offset )
> >> +        {
> >> +        case DMAR_IEDATA_REG:
> >> +        case DMAR_IEADDR_REG:
> >> +        case DMAR_IEUADDR_REG:
> >> +        case DMAR_FEDATA_REG:
> >> +        case DMAR_FEADDR_REG:
> >> +        case DMAR_FEUADDR_REG:
> >> +            vvtd_set_reg(vvtd, offset, val);
> >
> >Hm, so you are using a full page when you only care for 6 4B
> >registers? Seem like quite of a waste of memory.
> 
> Registers are added here when according features are introduced.

Even at the end of the series it doesn't seem like you are adding
support to 256 registers. From the code it seems like you allow writes
to 16 4B registers, and although you allow read access to all the
register space it seems quite dubious that you need 256 registers.
Hence the question about trying to minimize memory usage to what's
really needed.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request
  2017-10-20  5:16     ` Chao Gao
@ 2017-10-20 10:01       ` Roger Pau Monné
  2017-10-23  6:44         ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-20 10:01 UTC (permalink / raw)
  To: Chao Gao, Lan Tianyu, xen-devel, andrew.cooper3, George.Dunlap,
	ian.jackson, jbeulich, konrad.wilk, sstabellini, tim, wei.liu2,
	kevin.tian

On Fri, Oct 20, 2017 at 01:16:37PM +0800, Chao Gao wrote:
> On Thu, Oct 19, 2017 at 03:26:30PM +0100, Roger Pau Monné wrote:
> >On Thu, Sep 21, 2017 at 11:01:56PM -0400, Lan Tianyu wrote:
> >> +static void unmap_guest_page(void *virt)
> >> +{
> >> +    struct page_info *page;
> >> +
> >> +    ASSERT((unsigned long)virt & PAGE_MASK);
> >
> >I'm not sure I get the point of the check above.
> 
> I intended to check the address is 4K-page aligned. It should be
> 
> ASSERT(!((unsigned long)virt & (PAGE_SIZE - 1)))

Please use the IS_ALIGNED macro.

> >
> >> +    }
> >> +    return;
> >> +}
> >> +
> >> +static bool vvtd_irq_request_sanity_check(const struct vvtd *vvtd,
> >> +                                          struct arch_irq_remapping_request *irq)
> >> +{
> >> +    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
> >> +    {
> >> +        struct IO_APIC_route_remap_entry rte = { .val = irq->msg.rte };
> >> +
> >> +        ASSERT(rte.format);
> >
> >Is it fine to ASSERT here? Can't the guest set rte.format to whatever
> >it wants?
> 
> Guest can use legacy format interrupt (i.e. rte.format = 0). However,
> we only reach here when callback 'check_irq_remapping' return true and
> for vvtd, 'check_irq_remapping' just returns the format bit of irq request.
> If here ret.format isn't true, there must be a bug in our code.

Are you sure the correct locks are hold here to prevent the guest
from changing rte while all this processing is happening?

> >> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_REQ_RSVD, record_fault);
> >> +        return -EINVAL;
> >> +    }
> >> +
> >> +    if ( entry > vvtd->status.irt_max_entry )
> >> +    {
> >> +        vvtd_handle_fault(vvtd, irq, NULL, VTD_FR_IR_INDEX_OVER, record_fault);
> >> +        return -EACCES;
> >> +    }
> >> +
> >> +    irt_page = map_guest_page(vvtd->domain,
> >> +                              vvtd->status.irt + (entry >> IREMAP_ENTRY_ORDER));
> >
> >Since AFAICT you have to read this page(s) every time an interrupt
> >needs to be delivered, wouldn't it make sense for performance reasons
> >to have the page permanently mapped?
> 
> Yes. It is. Actually, we have a draft patch to do this. But to justify
> the necessity, I should run some benchmark at first. Mapping a guest
> page is slow on x86, right?

The issue is the tblflush, not the actual modifications of the page
tables.

> >
> >What's the maximum number of pages that can be used here?
> 
> VT-d current support 2^16 entries at most. The size of each entry is 128
> byte. Thus, we need 2^11 pages at most.

Those are guest pages at the end, so it shouldn't be a problem.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults
  2017-10-20  5:54     ` Chao Gao
@ 2017-10-20 10:08       ` Roger Pau Monné
  2017-10-20 14:20         ` Jan Beulich
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-20 10:08 UTC (permalink / raw)
  To: Chao Gao, Lan Tianyu, xen-devel, kevin.tian, jbeulich, andrew.cooper3

On Fri, Oct 20, 2017 at 01:54:15PM +0800, Chao Gao wrote:
> On Thu, Oct 19, 2017 at 05:31:37PM +0100, Roger Pau Monné wrote:
> >On Thu, Sep 21, 2017 at 11:02:07PM -0400, Lan Tianyu wrote:
> >> +static int vvtd_alloc_frcd(struct vvtd *vvtd)
> >> +{
> >> +    int prev;
> >> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
> >> +    unsigned int base = cap_fault_reg_offset(cap);
> >> +
> >> +    /* Set the F bit to indicate the FRCD is in use. */
> >> +    if ( !vvtd_test_and_set_bit(vvtd,
> >> +                                base + vvtd->status.fault_index * DMA_FRCD_LEN +
> >> +                                DMA_FRCD3_OFFSET, DMA_FRCD_F_SHIFT) )
> >> +    {
> >> +        prev = vvtd->status.fault_index;
> >> +        vvtd->status.fault_index = (prev + 1) % cap_num_fault_regs(cap);
> >> +        return vvtd->status.fault_index;
> >
> >I would prefer that you return the index as an unsigned int parameter
> >passed by reference rather than as the return value of the function,
> >but that might not be the preference of others.
> 
> What are the pros and cons?

I personally don't like return values that have different meanings
depending on it's sign. Here < 0 means error, while >= 0 is used to
deliver some information.

What I didn't like here specifically (apart from the rant above) is
that I would prefer index to be unsigned int, but I'm not sure that's
enough to ask you to change the function prototype. Just leave it
as-is unless someone else complains.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 27/29] x86/vvtd: Enable Queued Invalidation through GCMD
  2017-09-22  3:02 ` [PATCH V3 27/29] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
@ 2017-10-20 10:30   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-20 10:30 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: andrew.cooper3, kevin.tian, Chao Gao, jbeulich, xen-devel

On Thu, Sep 21, 2017 at 11:02:08PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Software writes to QIE field of GCMD to enable or disable queued
> invalidations. This patch emulates QIE field of GCMD.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  3 ++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index e19b045..c69cd21 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -162,7 +162,8 @@
>  #define DMA_GSTS_FLS    (((u64)1) << 29)
>  #define DMA_GSTS_AFLS   (((u64)1) << 28)
>  #define DMA_GSTS_WBFS   (((u64)1) << 27)
> -#define DMA_GSTS_QIES   (((u64)1) <<26)
> +#define DMA_GSTS_QIES_SHIFT     26
> +#define DMA_GSTS_QIES   (((u64)1) << DMA_GSTS_QIES_SHIFT)
>  #define DMA_GSTS_IRES_SHIFT     25
>  #define DMA_GSTS_IRES   (((u64)1) << DMA_GSTS_IRES_SHIFT)
>  #define DMA_GSTS_SIRTPS_SHIFT   24
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 745941c..55f7a46 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -496,6 +496,19 @@ static void vvtd_handle_gcmd_ire(struct vvtd *vvtd, uint32_t val)
>      }
>  }
>  
> +static void vvtd_handle_gcmd_qie(struct vvtd *vvtd, uint32_t val)

I would use 'write' intead of 'handle', since this is only used by the
write path.

Also you should consider dropping the vvtd prefixes from the static
functions. It's quite clear they are vvtd related, and since they are
static there's no need to add such a prefix.

> +{
> +    vvtd_info("%sable Queue Invalidation", (val & DMA_GCMD_QIE) ? "En" : "Dis");
> +
> +    if ( val & DMA_GCMD_QIE )
> +        vvtd_set_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_SHIFT);
> +    else
> +    {
> +        vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, 0);
> +        vvtd_clear_bit(vvtd, DMAR_GSTS_REG, DMA_GSTS_QIES_SHIFT);
> +    }
> +}

Since I've seen this pattern in other functions, it might be worth
adding a helper that does:

VVTD_SET_BIT(reg, bit, val)
{
    if ( val )
        set_bit(...);
    else
        clear_bit(...);
}

Then the above function could be reduced to:

VVTD_SET_BIT(reg, bit, val);
if ( !val )
    vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, 0);

I expect other functions can also be simplified by this macro.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support
  2017-09-22  3:02 ` [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
@ 2017-10-20 11:20   ` Roger Pau Monné
  2017-10-23  7:50     ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-20 11:20 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: andrew.cooper3, kevin.tian, Chao Gao, jbeulich, xen-devel

On Thu, Sep 21, 2017 at 11:02:09PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Queued Invalidation Interface is an expanded invalidation interface with
> extended capabilities. Hardware implementations report support for queued
> invalidation interface through the Extended Capability Register. The queued
> invalidation interface uses an Invalidation Queue (IQ), which is a circular
> buffer in system memory. Software submits commands by writing Invalidation
> Descriptors to the IQ.
> 
> In this patch, a new function viommu_process_iq() is used for emulating how
> hardware handles invalidation requests through QI.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.h |  19 ++-
>  xen/drivers/passthrough/vtd/vvtd.c  | 232 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 250 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
> index c69cd21..c2b83f1 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -177,6 +177,21 @@
>  #define DMA_IRTA_S(val)         (val & 0xf)
>  #define DMA_IRTA_SIZE(val)      (1UL << (DMA_IRTA_S(val) + 1))
>  
> +/* IQA_REG */
> +#define DMA_IQA_ADDR(val)       (val & ~0xfffULL)
> +#define DMA_IQA_QS(val)         (val & 0x7)
> +#define DMA_IQA_RSVD            0xff8ULL
> +
> +/* IECTL_REG */
> +#define DMA_IECTL_IM_SHIFT 31
> +#define DMA_IECTL_IM            (1 << DMA_IECTL_IM_SHIFT)

Isn't this undefined behavior? It should be 1u.

You should consider using u for all the defines added.

> +#define DMA_IECTL_IP_SHIFT 30
> +#define DMA_IECTL_IP            (1 << DMA_IECTL_IP_SHIFT)
> +
> +/* ICS_REG */
> +#define DMA_ICS_IWC_SHIFT       0
> +#define DMA_ICS_IWC             (1 << DMA_ICS_IWC_SHIFT)
> +
>  /* PMEN_REG */
>  #define DMA_PMEN_EPM    (((u32)1) << 31)
>  #define DMA_PMEN_PRS    (((u32)1) << 0)
> @@ -211,7 +226,8 @@
>  #define DMA_FSTS_PPF (1U << DMA_FSTS_PPF_SHIFT)
>  #define DMA_FSTS_AFO (1U << 2)
>  #define DMA_FSTS_APF (1U << 3)
> -#define DMA_FSTS_IQE (1U << 4)
> +#define DMA_FSTS_IQE_SHIFT 4
> +#define DMA_FSTS_IQE (1U << DMA_FSTS_IQE_SHIFT)
>  #define DMA_FSTS_ICE (1U << 5)
>  #define DMA_FSTS_ITE (1U << 6)
>  #define DMA_FSTS_PRO_SHIFT 7
> @@ -562,6 +578,7 @@ struct qinval_entry {
>  
>  /* Queue invalidation head/tail shift */
>  #define QINVAL_INDEX_SHIFT 4
> +#define QINVAL_INDEX_MASK  0x7fff0ULL
>  
>  #define qinval_present(v) ((v).lo & 1)
>  #define qinval_fault_disable(v) (((v).lo >> 1) & 1)
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 55f7a46..668d0c9 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -28,6 +28,7 @@
>  #include <asm/current.h>
>  #include <asm/event.h>
>  #include <asm/hvm/domain.h>
> +#include <asm/hvm/support.h>
>  #include <asm/io_apic.h>
>  #include <asm/page.h>
>  #include <asm/p2m.h>
> @@ -419,6 +420,177 @@ static int vvtd_record_fault(struct vvtd *vvtd,
>      return X86EMUL_OKAY;
>  }
>  
> +/*
> + * Process a invalidation descriptor. Currently, only two types descriptors,
> + * Interrupt Entry Cache Invalidation Descritor and Invalidation Wait
> + * Descriptor are handled.
> + * @vvtd: the virtual vtd instance
> + * @i: the index of the invalidation descriptor to be processed
> + *
> + * If success return 0, or return non-zero when failure.
> + */
> +static int process_iqe(struct vvtd *vvtd, int i)

unsigned int.

> +{
> +    uint64_t iqa;
> +    struct qinval_entry *qinval_page;
> +    int ret = 0;
> +
> +    iqa = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
> +    qinval_page = map_guest_page(vvtd->domain, DMA_IQA_ADDR(iqa)>>PAGE_SHIFT);

PFN_DOWN instead of open coding the shift. Both can be initialized
at declaration. Also AFAICT iqa is only used once, so the local
variable is not needed.

> +    if ( IS_ERR(qinval_page) )
> +    {
> +        gdprintk(XENLOG_ERR, "Can't map guest IRT (rc %ld)",
> +                 PTR_ERR(qinval_page));
> +        return PTR_ERR(qinval_page);
> +    }
> +
> +    switch ( qinval_page[i].q.inv_wait_dsc.lo.type )
> +    {
> +    case TYPE_INVAL_WAIT:
> +        if ( qinval_page[i].q.inv_wait_dsc.lo.sw )
> +        {
> +            uint32_t data = qinval_page[i].q.inv_wait_dsc.lo.sdata;
> +            uint64_t addr = (qinval_page[i].q.inv_wait_dsc.hi.saddr << 2);

Unneeded parentheses.

> +
> +            ret = hvm_copy_to_guest_phys(addr, &data, sizeof(data), current);
> +            if ( ret )
> +                vvtd_info("Failed to write status address");

Don't you need to return or do something here? (like raise some kind
of error?)

> +        }
> +
> +        /*
> +         * The following code generates an invalidation completion event
> +         * indicating the invalidation wait descriptor completion. Note that
> +         * the following code fragment is not tested properly.
> +         */
> +        if ( qinval_page[i].q.inv_wait_dsc.lo.iflag )
> +        {
> +            uint32_t ie_data, ie_addr;

Missing newline, but...

> +            if ( !vvtd_test_and_set_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_SHIFT) )
> +            {
> +                vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
> +                if ( !vvtd_test_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT) )
> +                {
> +                    ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
> +                    ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
> +                    vvtd_generate_interrupt(vvtd, ie_addr, ie_data);

...you don't seem two need the two local variables. They are used only
once.

> +                    vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
> +                }
> +            }
> +        }
> +        break;
> +
> +    case TYPE_INVAL_IEC:
> +        /*
> +         * Currently, no cache is preserved in hypervisor. Only need to update
> +         * pIRTEs which are modified in binding process.
> +         */
> +        break;
> +
> +    default:
> +        goto error;

There's no reason to use a label that's only used for the default
case. Simply place the code in the error label here.

> +    }
> +
> +    unmap_guest_page((void*)qinval_page);
> +    return ret;
> +
> + error:
> +    unmap_guest_page((void*)qinval_page);
> +    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
> +    domain_crash(vvtd->domain);

Do you really need to crash the domain in such case?

> +    return ret;
> +}
> +
> +/*
> + * Invalidate all the descriptors in Invalidation Queue.
> + */
> +static void vvtd_process_iq(struct vvtd *vvtd)
> +{
> +    uint64_t iqh, iqt, iqa, max_entry, i;
> +    int err = 0;
> +
> +    /*
> +     * No new descriptor is fetched from the Invalidation Queue until
> +     * software clears the IQE field in the Fault Status Register
> +     */
> +    if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_SHIFT) )
> +        return;
> +
> +    iqh = vvtd_get_reg_quad(vvtd, DMAR_IQH_REG);
> +    iqt = vvtd_get_reg_quad(vvtd, DMAR_IQT_REG);
> +    iqa = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
> +
> +    max_entry = 1 << (QINVAL_ENTRY_ORDER + DMA_IQA_QS(iqa));
> +    iqh = MASK_EXTR(iqh, QINVAL_INDEX_MASK);
> +    iqt = MASK_EXTR(iqt, QINVAL_INDEX_MASK);

This should be done above, when they are initialized.

> +
> +    ASSERT(iqt < max_entry);
> +    if ( iqh == iqt )
> +        return;
> +
> +    for ( i = iqh; i != iqt; i = (i + 1) % max_entry )
> +    {
> +        err = process_iqe(vvtd, i);

process_iqe takes an int parameter, and here you are feeding it a
uint64_t.

> +        if ( err )
> +            break;
> +    }
> +    vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, i << QINVAL_INDEX_SHIFT);
> +
> +    /*
> +     * When IQE set, IQH references the desriptor associated with the error.
> +     */
> +    if ( err )
> +        vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_IQE_SHIFT);
> +}
> +
> +static int vvtd_write_iqt(struct vvtd *vvtd, unsigned long val)

Not sure there's much point in making this function return int, when
all the return values are X86EMUL_OKAY.

Sionce val here is a register AFAICT, I would rather prefer that you
use an explicit type size (uint64_t or uint32_t as fits).

> +{
> +    uint64_t max_entry, iqa = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
> +
> +    if ( val & ~QINVAL_INDEX_MASK )
> +    {
> +        vvtd_info("Attempt to set reserved bits in Invalidation Queue Tail");
> +        return X86EMUL_OKAY;
> +    }
> +
> +    max_entry = 1 << (QINVAL_ENTRY_ORDER + DMA_IQA_QS(iqa));

1ull please.

> +    if ( MASK_EXTR(val, QINVAL_INDEX_MASK) >= max_entry )
> +    {
> +        vvtd_info("IQT: Value %lx exceeded supported max index.", val);
> +        return X86EMUL_OKAY;
> +    }
> +
> +    vvtd_set_reg_quad(vvtd, DMAR_IQT_REG, val);
> +    vvtd_process_iq(vvtd);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_iqa(struct vvtd *vvtd, unsigned long val)

Same here, it seems like this function should return void, because the
current return value is meaningless, and same comment about 'val'
being uintXX_t.

> +{
> +    uint32_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
> +    unsigned int guest_max_addr_width = cap_mgaw(cap);
> +
> +    if ( val & (~((1ULL << guest_max_addr_width) - 1) | DMA_IQA_RSVD) )
> +    {
> +        vvtd_info("Attempt to set reserved bits in Invalidation Queue Address");
> +        return X86EMUL_OKAY;
> +    }
> +
> +    vvtd_set_reg_quad(vvtd, DMAR_IQA_REG, val);
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vvtd_write_ics(struct vvtd *vvtd, uint32_t val)
> +{
> +    if ( val & DMA_ICS_IWC )
> +    {
> +        vvtd_clear_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_SHIFT);
> +        /*When IWC field is cleared, the IP field needs to be cleared */
             ^ missing space.

> +        vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
> +    }
> +    return X86EMUL_OKAY;

This function wants to be void.

> +}
> +
>  static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
>  {
>      /* Writing a 1 means clear fault */
> @@ -430,6 +602,30 @@ static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
>      return X86EMUL_OKAY;
>  }
>  
> +static int vvtd_write_iectl(struct vvtd *vvtd, uint32_t val)

void please.

> +{
> +    /*
> +     * Only DMA_IECTL_IM bit is writable. Generate pending event when unmask.
> +     */

Single line comments use /* ... */

> +    if ( !(val & DMA_IECTL_IM) )
> +    {
> +        /* Clear IM and clear IP */
> +        vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT);
> +        if ( vvtd_test_and_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT) )
> +        {
> +            uint32_t ie_data, ie_addr;
> +
> +            ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
> +            ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
> +            vvtd_generate_interrupt(vvtd, ie_addr, ie_data);

No need for ie_data and ie_addr.

> +        }
> +    }
> +    else
> +        vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT);
> +
> +    return X86EMUL_OKAY;
> +}
> +
>  static int vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
>  {
>      /*
> @@ -476,6 +672,10 @@ static int vvtd_write_fsts(struct vvtd *vvtd, uint32_t val)
>      if ( !((vvtd_get_reg(vvtd, DMAR_FSTS_REG) & DMA_FSTS_FAULTS)) )
>          vvtd_clear_bit(vvtd, DMAR_FECTL_REG, DMA_FECTL_IP_SHIFT);
>  
> +    /* Continue to deal invalidation when IQE is clear */
> +    if ( !vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_SHIFT) )
> +        vvtd_process_iq(vvtd);
> +
>      return X86EMUL_OKAY;
>  }
>  
> @@ -611,6 +811,32 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
>          case DMAR_FECTL_REG:
>              return vvtd_write_fectl(vvtd, val);
>  
> +        case DMAR_IECTL_REG:
> +            return vvtd_write_iectl(vvtd, val);
> +
> +        case DMAR_ICS_REG:
> +            return vvtd_write_ics(vvtd, val);
> +
> +        case DMAR_IQT_REG:
> +            return vvtd_write_iqt(vvtd, (uint32_t)val);
> +
> +        case DMAR_IQA_REG:
> +        {
> +            uint32_t iqa_hi;
> +
> +            iqa_hi = vvtd_get_reg(vvtd, DMAR_IQA_REG_HI);

Initialization at declaration time, but since it's used only once, I
would rather prefer that you don't use a local variable at all.

> +            return vvtd_write_iqa(vvtd,
> +                                 (uint32_t)val | ((uint64_t)iqa_hi << 32));
> +        }
> +
> +        case DMAR_IQA_REG_HI:
> +        {
> +            uint32_t iqa_lo;
> +
> +            iqa_lo = vvtd_get_reg(vvtd, DMAR_IQA_REG);
> +            return vvtd_write_iqa(vvtd, (val << 32) | iqa_lo);

Same comment as above regarding iqa_lo.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 29/29] x86/vvtd: save and restore emulated VT-d
  2017-09-22  3:02 ` [PATCH V3 29/29] x86/vvtd: save and restore emulated VT-d Lan Tianyu
@ 2017-10-20 11:25   ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-20 11:25 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: andrew.cooper3, kevin.tian, Chao Gao, jbeulich, xen-devel

On Thu, Sep 21, 2017 at 11:02:10PM -0400, Lan Tianyu wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Provide a save-restore pair to save/restore registers and non-register
> status.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
> v3:
>  - use one entry to save both vvtd registers and other intermediate
>  state
> ---
>  xen/drivers/passthrough/vtd/vvtd.c     | 66 ++++++++++++++++++++++++++--------
>  xen/include/public/arch-x86/hvm/save.h | 25 ++++++++++++-
>  2 files changed, 76 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
> index 668d0c9..2aecd93 100644
> --- a/xen/drivers/passthrough/vtd/vvtd.c
> +++ b/xen/drivers/passthrough/vtd/vvtd.c
> @@ -28,11 +28,13 @@
>  #include <asm/current.h>
>  #include <asm/event.h>
>  #include <asm/hvm/domain.h>
> +#include <asm/hvm/save.h>
>  #include <asm/hvm/support.h>
>  #include <asm/io_apic.h>
>  #include <asm/page.h>
>  #include <asm/p2m.h>
>  #include <asm/viommu.h>
> +#include <public/hvm/save.h>
>  
>  #include "iommu.h"
>  #include "vtd.h"
> @@ -40,20 +42,6 @@
>  /* Supported capabilities by vvtd */
>  unsigned int vvtd_caps = VIOMMU_CAP_IRQ_REMAPPING;
>  
> -struct hvm_hw_vvtd_status {
> -    uint32_t eim_enabled : 1,
> -             intremap_enabled : 1;
> -    uint32_t fault_index;
> -    uint32_t irt_max_entry;
> -    /* Interrupt remapping table base gfn */
> -    uint64_t irt;
> -};
> -
> -union hvm_hw_vvtd_regs {
> -    uint32_t data32[256];
> -    uint64_t data64[128];
> -};
> -
>  struct vvtd {
>      /* Address range of remapping hardware register-set */
>      uint64_t base_addr;
> @@ -1057,6 +1045,56 @@ static bool vvtd_is_remapping(struct domain *d,
>      return 0;
>  }
>  
> +static int vvtd_load(struct domain *d, hvm_domain_context_t *h)
> +{
> +    struct hvm_hw_vvtd *hw_vvtd;
> +
> +    if ( !domain_vvtd(d) )
> +        return -ENODEV;
> +
> +    hw_vvtd = xmalloc(struct hvm_hw_vvtd);
> +    if ( !hw_vvtd )
> +        return -ENOMEM;
> +
> +    if ( hvm_load_entry(VVTD, h, hw_vvtd) )
> +    {
> +        xfree(hw_vvtd);
> +        return -EINVAL;
> +    }

If you declare hvm_hw_vvtd_regs as a field inside of
hvm_hw_vvtd_status you won't need to do this alloc + memcpy, bnecause
you could directly load it to domain_vvtd.

In any case, I think the code here is going to change due to all the
other comments on the previous patches.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 00/29]
  2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
                   ` (28 preceding siblings ...)
  2017-09-22  3:02 ` [PATCH V3 29/29] x86/vvtd: save and restore emulated VT-d Lan Tianyu
@ 2017-10-20 11:36 ` Roger Pau Monné
  2017-10-23  1:23   ` Lan Tianyu
  29 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-20 11:36 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Thu, Sep 21, 2017 at 11:01:41PM -0400, Lan Tianyu wrote:
> Change since v2:
>        1) Remove vIOMMU hypercall of query capabilities and introduce when necessary.
>        2) Remove length field of vIOMMU create parameter of vIOMMU hypercall
>        3) Introduce irq remapping mode callback to vIOMMU framework and vIOMMU device models
> can check irq remapping mode by vendor specific ways.
>        4) Update vIOMMU docs.
>        5) Other changes please see patches' change logs.
> 
> Change since v1:
>        1) Fix coding style issues
>        2) Add definitions for vIOMMU type and capabilities
>        3) Change vIOMMU kconfig and select vIOMMU default on x86
>        4) Put vIOMMU creation in libxl__arch_domain_create()
>        5) Make vIOMMU structure of tool stack more general for both PV and HVM.
> 
> Change since RFC v2:
>        1) Move vvtd.c to drivers/passthrough/vtd directroy. 
>        2) Make vIOMMU always built in on x86
>        3) Add new boot cmd "viommu" to enable viommu function
>        4) Fix some code stype issues.
> 
> Change since RFC v1:
>        1) Add Xen virtual IOMMU doc docs/misc/viommu.txt
>        2) Move vIOMMU hypercall of create/destroy vIOMMU and query  
> capabilities from dmop to domctl suggested by Paul Durrant. Because
> these hypercalls can be done in tool stack and more VM mode(E,G PVH
> or other modes don't use Qemu) can be benefit.
>        3) Add check of input MMIO address and length.
>        4) Add iommu_type in vIOMMU hypercall parameter to specify
> vendor vIOMMU device model(E,G Intel VTD, AMD or ARM IOMMU. So far
> only support Intel VTD).
>        5) Add save and restore support for vvtd
> 
> 
> This patchset is to introduce vIOMMU framework and add virtual VTD's
> interrupt remapping support according "Xen virtual IOMMU high level
> design doc V3"(https://lists.xenproject.org/archives/html/xen-devel/
> 2016-11/msg01391.html).
> 
> - vIOMMU framework
> New framework provides viommu_ops and help functions to abstract
> vIOMMU operations(E,G create, destroy, handle irq remapping request
> and so on). Vendors(Intel, ARM, AMD and son) can implement their
> vIOMMU callbacks.
> 
> - Virtual VTD
> We enable irq remapping function and covers both
> MSI and IOAPIC interrupts. Don't support post interrupt mode emulation
> and post interrupt mode enabled on host with virtual VTD. will add
> later.

Hello,

Just a couple of generic comments on the whole series:

 - Please make sure that the result after each patch is buildable. It
   is of extreme importance the that Xen tree is bisectable at all
   points.

 - Regarding the organization of the series, I would rather prefer
   that you place the design document at the beginning (like it's done
   now), then the hypervisor changes (possibly the generic framework
   first, then the vvtd functionality and finally all the hooks into
   common code) and the toolstack side at the end. This might be just
   my personal taste, but I think it's clearer to review/understand
   rather than mixed as it is now.

 - Finally, please try to make sure that each patch introduces the
   helpers or structures that it needs. For example don't place all
   the "static inline" helpers together with a bunch of structures in
   an isolated patch, and then a bunch of patches that start making
   use of them. Instead introduce the structures or helpers in the
   context when they are used. An exception of this might be for very
   big or generic structures.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults
  2017-10-20 10:08       ` Roger Pau Monné
@ 2017-10-20 14:20         ` Jan Beulich
  0 siblings, 0 replies; 108+ messages in thread
From: Jan Beulich @ 2017-10-20 14:20 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, andrew.cooper3, kevin.tian, xen-devel, Chao Gao

>>> On 20.10.17 at 12:08, <roger.pau@citrix.com> wrote:
> On Fri, Oct 20, 2017 at 01:54:15PM +0800, Chao Gao wrote:
>> On Thu, Oct 19, 2017 at 05:31:37PM +0100, Roger Pau Monné wrote:
>> >On Thu, Sep 21, 2017 at 11:02:07PM -0400, Lan Tianyu wrote:
>> >> +static int vvtd_alloc_frcd(struct vvtd *vvtd)
>> >> +{
>> >> +    int prev;
>> >> +    uint64_t cap = vvtd_get_reg(vvtd, DMAR_CAP_REG);
>> >> +    unsigned int base = cap_fault_reg_offset(cap);
>> >> +
>> >> +    /* Set the F bit to indicate the FRCD is in use. */
>> >> +    if ( !vvtd_test_and_set_bit(vvtd,
>> >> +                                base + vvtd->status.fault_index * DMA_FRCD_LEN +
>> >> +                                DMA_FRCD3_OFFSET, DMA_FRCD_F_SHIFT) )
>> >> +    {
>> >> +        prev = vvtd->status.fault_index;
>> >> +        vvtd->status.fault_index = (prev + 1) % cap_num_fault_regs(cap);
>> >> +        return vvtd->status.fault_index;
>> >
>> >I would prefer that you return the index as an unsigned int parameter
>> >passed by reference rather than as the return value of the function,
>> >but that might not be the preference of others.
>> 
>> What are the pros and cons?
> 
> I personally don't like return values that have different meanings
> depending on it's sign. Here < 0 means error, while >= 0 is used to
> deliver some information.

We do this in various places, so as long as the (theoretical) value
range of valid positive values isn't too large, I'd be fine with this
approach to be used here as well.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 00/29]
  2017-10-20 11:36 ` [PATCH V3 00/29] Roger Pau Monné
@ 2017-10-23  1:23   ` Lan Tianyu
  0 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-23  1:23 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On 2017年10月20日 19:36, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:41PM -0400, Lan Tianyu wrote:
>> Change since v2:
>>        1) Remove vIOMMU hypercall of query capabilities and introduce when necessary.
>>        2) Remove length field of vIOMMU create parameter of vIOMMU hypercall
>>        3) Introduce irq remapping mode callback to vIOMMU framework and vIOMMU device models
>> can check irq remapping mode by vendor specific ways.
>>        4) Update vIOMMU docs.
>>        5) Other changes please see patches' change logs.
>>
>> Change since v1:
>>        1) Fix coding style issues
>>        2) Add definitions for vIOMMU type and capabilities
>>        3) Change vIOMMU kconfig and select vIOMMU default on x86
>>        4) Put vIOMMU creation in libxl__arch_domain_create()
>>        5) Make vIOMMU structure of tool stack more general for both PV and HVM.
>>
>> Change since RFC v2:
>>        1) Move vvtd.c to drivers/passthrough/vtd directroy. 
>>        2) Make vIOMMU always built in on x86
>>        3) Add new boot cmd "viommu" to enable viommu function
>>        4) Fix some code stype issues.
>>
>> Change since RFC v1:
>>        1) Add Xen virtual IOMMU doc docs/misc/viommu.txt
>>        2) Move vIOMMU hypercall of create/destroy vIOMMU and query  
>> capabilities from dmop to domctl suggested by Paul Durrant. Because
>> these hypercalls can be done in tool stack and more VM mode(E,G PVH
>> or other modes don't use Qemu) can be benefit.
>>        3) Add check of input MMIO address and length.
>>        4) Add iommu_type in vIOMMU hypercall parameter to specify
>> vendor vIOMMU device model(E,G Intel VTD, AMD or ARM IOMMU. So far
>> only support Intel VTD).
>>        5) Add save and restore support for vvtd
>>
>>
>> This patchset is to introduce vIOMMU framework and add virtual VTD's
>> interrupt remapping support according "Xen virtual IOMMU high level
>> design doc V3"(https://lists.xenproject.org/archives/html/xen-devel/
>> 2016-11/msg01391.html).
>>
>> - vIOMMU framework
>> New framework provides viommu_ops and help functions to abstract
>> vIOMMU operations(E,G create, destroy, handle irq remapping request
>> and so on). Vendors(Intel, ARM, AMD and son) can implement their
>> vIOMMU callbacks.
>>
>> - Virtual VTD
>> We enable irq remapping function and covers both
>> MSI and IOAPIC interrupts. Don't support post interrupt mode emulation
>> and post interrupt mode enabled on host with virtual VTD. will add
>> later.
> 
> Hello,
> 
> Just a couple of generic comments on the whole series:
> 
>  - Please make sure that the result after each patch is buildable. It
>    is of extreme importance the that Xen tree is bisectable at all
>    points.
> 
>  - Regarding the organization of the series, I would rather prefer
>    that you place the design document at the beginning (like it's done
>    now), then the hypervisor changes (possibly the generic framework
>    first, then the vvtd functionality and finally all the hooks into
>    common code) and the toolstack side at the end. This might be just
>    my personal taste, but I think it's clearer to review/understand
>    rather than mixed as it is now.
> 
>  - Finally, please try to make sure that each patch introduces the
>    helpers or structures that it needs. For example don't place all
>    the "static inline" helpers together with a bunch of structures in
>    an isolated patch, and then a bunch of patches that start making
>    use of them. Instead introduce the structures or helpers in the
>    context when they are used. An exception of this might be for very
>    big or generic structures.
> 

Sure. We will follow your guide. Thanks.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request
  2017-10-20 10:01       ` Roger Pau Monné
@ 2017-10-23  6:44         ` Chao Gao
  0 siblings, 0 replies; 108+ messages in thread
From: Chao Gao @ 2017-10-23  6:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich

On Fri, Oct 20, 2017 at 11:01:03AM +0100, Roger Pau Monné wrote:
>On Fri, Oct 20, 2017 at 01:16:37PM +0800, Chao Gao wrote:
>> On Thu, Oct 19, 2017 at 03:26:30PM +0100, Roger Pau Monné wrote:
>> >On Thu, Sep 21, 2017 at 11:01:56PM -0400, Lan Tianyu wrote:
>> >> +static void unmap_guest_page(void *virt)
>> >> +{
>> >> +    struct page_info *page;
>> >> +
>> >> +    ASSERT((unsigned long)virt & PAGE_MASK);
>> >
>> >I'm not sure I get the point of the check above.
>> 
>> I intended to check the address is 4K-page aligned. It should be
>> 
>> ASSERT(!((unsigned long)virt & (PAGE_SIZE - 1)))
>
>Please use the IS_ALIGNED macro.

Ok.

>
>> >
>> >> +    }
>> >> +    return;
>> >> +}
>> >> +
>> >> +static bool vvtd_irq_request_sanity_check(const struct vvtd *vvtd,
>> >> +                                          struct arch_irq_remapping_request *irq)
>> >> +{
>> >> +    if ( irq->type == VIOMMU_REQUEST_IRQ_APIC )
>> >> +    {
>> >> +        struct IO_APIC_route_remap_entry rte = { .val = irq->msg.rte };
>> >> +
>> >> +        ASSERT(rte.format);
>> >
>> >Is it fine to ASSERT here? Can't the guest set rte.format to whatever
>> >it wants?
>> 
>> Guest can use legacy format interrupt (i.e. rte.format = 0). However,
>> we only reach here when callback 'check_irq_remapping' return true and
>> for vvtd, 'check_irq_remapping' just returns the format bit of irq request.
>> If here ret.format isn't true, there must be a bug in our code.
>
>Are you sure the correct locks are hold here to prevent the guest
>from changing rte while all this processing is happening?

The rte here isn't the registers in IOAPIC. It is only (or part of) the
interrupt request (abstract of ioapic rte and msi message). Every time
an interrupt is to be delivered, the interrupt request is composed
according the IOAPIC RTE or MSI message on stack. Then we recognize
the format of the interrupt, means remapping format or not remapping
format. Only for remapping format, the function is called. For
non-remapping format, the interrupt is delivered by ioapic directly and
needn't come here and be translated by vIOMMU.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support
  2017-10-20 11:20   ` Roger Pau Monné
@ 2017-10-23  7:50     ` Chao Gao
  2017-10-23  8:57       ` Roger Pau Monné
  0 siblings, 1 reply; 108+ messages in thread
From: Chao Gao @ 2017-10-23  7:50 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, andrew.cooper3, kevin.tian, jbeulich, xen-devel

On Fri, Oct 20, 2017 at 12:20:06PM +0100, Roger Pau Monné wrote:
>On Thu, Sep 21, 2017 at 11:02:09PM -0400, Lan Tianyu wrote:
>> From: Chao Gao <chao.gao@intel.com>
>> 
>> Queued Invalidation Interface is an expanded invalidation interface with
>> extended capabilities. Hardware implementations report support for queued
>> invalidation interface through the Extended Capability Register. The queued
>> invalidation interface uses an Invalidation Queue (IQ), which is a circular
>> buffer in system memory. Software submits commands by writing Invalidation
>> Descriptors to the IQ.
>> 
>> In this patch, a new function viommu_process_iq() is used for emulating how
>> hardware handles invalidation requests through QI.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>> +static int process_iqe(struct vvtd *vvtd, int i)
>
>unsigned int.
>
>> +{
>> +    uint64_t iqa;
>> +    struct qinval_entry *qinval_page;
>> +    int ret = 0;
>> +
>> +    iqa = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
>> +    qinval_page = map_guest_page(vvtd->domain, DMA_IQA_ADDR(iqa)>>PAGE_SHIFT);
>
>PFN_DOWN instead of open coding the shift. Both can be initialized
>at declaration. Also AFAICT iqa is only used once, so the local
>variable is not needed.
>
>> +    if ( IS_ERR(qinval_page) )
>> +    {
>> +        gdprintk(XENLOG_ERR, "Can't map guest IRT (rc %ld)",
>> +                 PTR_ERR(qinval_page));
>> +        return PTR_ERR(qinval_page);
>> +    }
>> +
>> +    switch ( qinval_page[i].q.inv_wait_dsc.lo.type )
>> +    {
>> +    case TYPE_INVAL_WAIT:
>> +        if ( qinval_page[i].q.inv_wait_dsc.lo.sw )
>> +        {
>> +            uint32_t data = qinval_page[i].q.inv_wait_dsc.lo.sdata;
>> +            uint64_t addr = (qinval_page[i].q.inv_wait_dsc.hi.saddr << 2);
>
>Unneeded parentheses.
>
>> +
>> +            ret = hvm_copy_to_guest_phys(addr, &data, sizeof(data), current);
>> +            if ( ret )
>> +                vvtd_info("Failed to write status address");
>
>Don't you need to return or do something here? (like raise some kind
>of error?)

The 'addr' is programmed by guest. Here vvtd cannot finish this write
for some reason (i.e. the 'addr' may be not in the guest physical memory space).
According to VT-d spec 6.5.2.8 Invalidation Wait Descriptor, "Hardware
behavior is undefined if the Status Address specified is not an address
route-able to memory (such as peer address, interrupt address range of
0xFEEX_XXXX etc.) I think that Xen can just ignore it. I should use
vvtd_debug() for it is guest triggerable.

>> +                if ( !vvtd_test_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT) )
>> +                {
>> +                    ie_data = vvtd_get_reg(vvtd, DMAR_IEDATA_REG);
>> +                    ie_addr = vvtd_get_reg(vvtd, DMAR_IEADDR_REG);
>> +                    vvtd_generate_interrupt(vvtd, ie_addr, ie_data);
>
>...you don't seem two need the two local variables. They are used only
>once.
>
>> +                    vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
>> +                }
>> +            }
>> +        }
>> +        break;
>> +
>> +    case TYPE_INVAL_IEC:
>> +        /*
>> +         * Currently, no cache is preserved in hypervisor. Only need to update
>> +         * pIRTEs which are modified in binding process.
>> +         */
>> +        break;
>> +
>> +    default:
>> +        goto error;
>
>There's no reason to use a label that's only used for the default
>case. Simply place the code in the error label here.
>
>> +    }
>> +
>> +    unmap_guest_page((void*)qinval_page);
>> +    return ret;
>> +
>> + error:
>> +    unmap_guest_page((void*)qinval_page);
>> +    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
>> +    domain_crash(vvtd->domain);
>
>Do you really need to crash the domain in such case?

We reach here when guest requests some operations vvtd doesn't claim
supported or emulated. I am afraid it also can be triggered by guest.
How about ignoring the invalidation request?

I will change the error message for it isn't internal error.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support
  2017-10-23  8:57       ` Roger Pau Monné
@ 2017-10-23  8:52         ` Chao Gao
  2017-10-23 23:26           ` Tian, Kevin
  0 siblings, 1 reply; 108+ messages in thread
From: Chao Gao @ 2017-10-23  8:52 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, andrew.cooper3, kevin.tian, jbeulich, xen-devel

On Mon, Oct 23, 2017 at 09:57:16AM +0100, Roger Pau Monné wrote:
>On Mon, Oct 23, 2017 at 03:50:24PM +0800, Chao Gao wrote:
>> On Fri, Oct 20, 2017 at 12:20:06PM +0100, Roger Pau Monné wrote:
>> >On Thu, Sep 21, 2017 at 11:02:09PM -0400, Lan Tianyu wrote:
>> >> From: Chao Gao <chao.gao@intel.com>
>> >> +    }
>> >> +
>> >> +    unmap_guest_page((void*)qinval_page);
>> >> +    return ret;
>> >> +
>> >> + error:
>> >> +    unmap_guest_page((void*)qinval_page);
>> >> +    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
>> >> +    domain_crash(vvtd->domain);
>> >
>> >Do you really need to crash the domain in such case?
>> 
>> We reach here when guest requests some operations vvtd doesn't claim
>> supported or emulated. I am afraid it also can be triggered by guest.
>> How about ignoring the invalidation request?
>
>What would real hardware do in such case?

After reading the spec again, I think hardware may generate a fault
event, seeing VT-d spec 10.4.9 Fault Status Register: 
Hardware detected an error associated with the invalidation queue. This
could be due to either a hardware error while fetching a descriptor from
the invalidation queue, or hardware detecting an erroneous or invalid
descriptor in the invalidation queue. At this time, a fault event may be
generated based on the programming of the Fault Event Control register

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support
  2017-10-23  7:50     ` Chao Gao
@ 2017-10-23  8:57       ` Roger Pau Monné
  2017-10-23  8:52         ` Chao Gao
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-23  8:57 UTC (permalink / raw)
  To: Chao Gao, Lan Tianyu, xen-devel, kevin.tian, jbeulich, andrew.cooper3

On Mon, Oct 23, 2017 at 03:50:24PM +0800, Chao Gao wrote:
> On Fri, Oct 20, 2017 at 12:20:06PM +0100, Roger Pau Monné wrote:
> >On Thu, Sep 21, 2017 at 11:02:09PM -0400, Lan Tianyu wrote:
> >> From: Chao Gao <chao.gao@intel.com>
> >> +    }
> >> +
> >> +    unmap_guest_page((void*)qinval_page);
> >> +    return ret;
> >> +
> >> + error:
> >> +    unmap_guest_page((void*)qinval_page);
> >> +    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
> >> +    domain_crash(vvtd->domain);
> >
> >Do you really need to crash the domain in such case?
> 
> We reach here when guest requests some operations vvtd doesn't claim
> supported or emulated. I am afraid it also can be triggered by guest.
> How about ignoring the invalidation request?

What would real hardware do in such case?

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support
  2017-10-23  8:52         ` Chao Gao
@ 2017-10-23 23:26           ` Tian, Kevin
  0 siblings, 0 replies; 108+ messages in thread
From: Tian, Kevin @ 2017-10-23 23:26 UTC (permalink / raw)
  To: Gao, Chao, Roger Pau Monné
  Cc: Lan, Tianyu, andrew.cooper3, jbeulich, xen-devel

> From: Gao, Chao
> Sent: Monday, October 23, 2017 4:52 PM
> 
> On Mon, Oct 23, 2017 at 09:57:16AM +0100, Roger Pau Monné wrote:
> >On Mon, Oct 23, 2017 at 03:50:24PM +0800, Chao Gao wrote:
> >> On Fri, Oct 20, 2017 at 12:20:06PM +0100, Roger Pau Monné wrote:
> >> >On Thu, Sep 21, 2017 at 11:02:09PM -0400, Lan Tianyu wrote:
> >> >> From: Chao Gao <chao.gao@intel.com>
> >> >> +    }
> >> >> +
> >> >> +    unmap_guest_page((void*)qinval_page);
> >> >> +    return ret;
> >> >> +
> >> >> + error:
> >> >> +    unmap_guest_page((void*)qinval_page);
> >> >> +    gdprintk(XENLOG_ERR, "Internal error in Queue Invalidation.\n");
> >> >> +    domain_crash(vvtd->domain);
> >> >
> >> >Do you really need to crash the domain in such case?
> >>
> >> We reach here when guest requests some operations vvtd doesn't claim
> >> supported or emulated. I am afraid it also can be triggered by guest.
> >> How about ignoring the invalidation request?
> >
> >What would real hardware do in such case?
> 
> After reading the spec again, I think hardware may generate a fault
> event, seeing VT-d spec 10.4.9 Fault Status Register:
> Hardware detected an error associated with the invalidation queue. This
> could be due to either a hardware error while fetching a descriptor from
> the invalidation queue, or hardware detecting an erroneous or invalid
> descriptor in the invalidation queue. At this time, a fault event may be
> generated based on the programming of the Fault Event Control register
> 

Please do proper emulation according to hardware behavior.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
  2017-10-19 11:28         ` Jan Beulich
@ 2017-10-24  7:16           ` Lan Tianyu
  0 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-24  7:16 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, chao.gao

On 2017年10月19日 19:28, Jan Beulich wrote:
>>>> On 19.10.17 at 10:49, <roger.pau@citrix.com> wrote:
>> On Thu, Oct 19, 2017 at 10:26:36AM +0800, Lan Tianyu wrote:
>>> Hi Roger:
>>>      Thanks for review.
>>>
>>> On 2017年10月18日 21:26, Roger Pau Monné wrote:
>>>> On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
>>>>> +Xen hypervisor vIOMMU command
>>>>> +=============================
>>>>> +Introduce vIOMMU command "viommu=1" to enable vIOMMU function in 
>> hypervisor.
>>>>> +It's default disabled.
>>>>
>>>> Hm, I'm not sure we really need this. At the end viommu will be
>>>> disabled by default for guests, unless explicitly enabled in the
>>>> config file.
>>>
>>> This is according to Jan's early comments on RFC patch
>>> https://patchwork.kernel.org/patch/9733869/.
>>>
>>> "It's actually a question whether in our current scheme a Kconfig
>>> option is appropriate here in the first place. I'd rather see this be
>>> an always built feature which needs enabling on the command line
>>> for the time being."
>>
>> So if I read this correctly Jan wanted you to ditch the Kconfig option
>> and instead rely on the command line option to enable/disable it.
> 
> Yes.
> 
> Jan
> 

OK. I will remove the command in the next version. Thanks for clarification.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-10-19  8:47       ` Roger Pau Monné
@ 2017-10-25  1:43         ` Lan Tianyu
  2017-10-30  1:41           ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-10-25  1:43 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On 2017年10月19日 16:47, Roger Pau Monné wrote:
> For all platforms supporting HVM, for PV I don't think it makes sense.
> Since AFAIK ARM guest type is also HVM I would rather introduce this
> field in the hvm_domain structure rather than the generic domain
> structure.
> 

This sounds reasonable.

> You might want to wait for feedback from others regarding this issue.
> 

I discussed with Julien before. He hoped no to add viommu code for ARM
first.So struct hvm_domain seems to be better place since it's arch
specific definition and only add struct viommu for struct hvm_domain of x86.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table
  2017-10-19  8:40       ` Roger Pau Monné
@ 2017-10-25  6:06         ` Lan Tianyu
  0 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-25  6:06 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On 2017年10月19日 16:40, Roger Pau Monné wrote:
> On Thu, Oct 19, 2017 at 04:09:02PM +0800, Lan Tianyu wrote:
>> On 2017年10月18日 23:12, Roger Pau Monné wrote:
>>>> diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
>>>> index a2efd23..fdd6a78 100644
>>>> --- a/tools/libacpi/libacpi.h
>>>> +++ b/tools/libacpi/libacpi.h
>>>> @@ -20,6 +20,8 @@
>>>>  #ifndef __LIBACPI_H__
>>>>  #define __LIBACPI_H__
>>>>  
>>>> +#include <stdbool.h>
>>>
>>> I'm quite sure you shouldn't add this here, see how headers are added
>>> using LIBACPI_STDUTILS.
>>>
>>
>> We may replace bool with uint8_t xxx:1 to avoid introduce new head file.
> 
> Did you check whether including stdbool is actually required? AFAICT
> hvmloader util.h already includes it, and you would only have to
> introduce it in libxl if it's not there yet.
> 

Yes, you are right. stdbool.h has introduced in both libxl(libxl.h) and
hvmloader(util.h). We just need to adjust include order.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-10-19 15:42   ` Roger Pau Monné
@ 2017-10-25  7:30     ` Lan Tianyu
  2017-10-25  7:43       ` Roger Pau Monné
  0 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-10-25  7:30 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On 2017年10月19日 23:42, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:02:01PM -0400, Lan Tianyu wrote:
>> This patch is to add get_irq_info callback for platform implementation
>> to convert irq remapping request to irq info (E,G vector, dest, dest_mode
>> and so on).
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/viommu.c          | 16 ++++++++++++++++
>>  xen/include/asm-x86/viommu.h |  8 ++++++++
>>  xen/include/xen/viommu.h     | 14 ++++++++++++++
>>  3 files changed, 38 insertions(+)
>>
>> diff --git a/xen/common/viommu.c b/xen/common/viommu.c
>> index b517158..0708e43 100644
>> --- a/xen/common/viommu.c
>> +++ b/xen/common/viommu.c
>> @@ -178,6 +178,22 @@ int viommu_handle_irq_request(struct domain *d,
>>      return viommu->ops->handle_irq_request(d, request);
>>  }
>>  
>> +int viommu_get_irq_info(struct domain *d,
>> +                        struct arch_irq_remapping_request *request,
>> +                        struct arch_irq_remapping_info *irq_info)
>> +{
>> +    struct viommu *viommu = d->viommu;
>> +
>> +    if ( !viommu )
>> +        return -EINVAL;
> 
> OK, here there's a check for !viommu. Can we please have this written
> down in the header? (ie: which functions are safe/expected to be
> called without a viommu)

Sure. I will add some comments.

> 
>> +
>> +    ASSERT(viommu->ops);
>> +    if ( !viommu->ops->get_irq_info )
>> +        return -EINVAL;
>> +
>> +    return viommu->ops->get_irq_info(d, request, irq_info);
>> +}
>> +
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/include/asm-x86/viommu.h b/xen/include/asm-x86/viommu.h
>> index 366fbb6..586b6bd 100644
>> --- a/xen/include/asm-x86/viommu.h
>> +++ b/xen/include/asm-x86/viommu.h
>> @@ -24,6 +24,14 @@
>>  #define VIOMMU_REQUEST_IRQ_MSI          0
>>  #define VIOMMU_REQUEST_IRQ_APIC         1
>>  
>> +struct arch_irq_remapping_info
>> +{
>> +    uint8_t  vector;
>> +    uint32_t dest;
>> +    uint32_t dest_mode:1;
>> +    uint32_t delivery_mode:3;
> 
> Why uint32_t for this two last fields? Also please sort them so that
> the padding is limited at the end of the structure.

Yes, this makes sense.

> 
>> +};
>> +
>>  struct arch_irq_remapping_request
>>  {
>>      union {
>> diff --git a/xen/include/xen/viommu.h b/xen/include/xen/viommu.h
>> index 230f6b1..beb40cd 100644
>> --- a/xen/include/xen/viommu.h
>> +++ b/xen/include/xen/viommu.h
>> @@ -21,6 +21,7 @@
>>  #define __XEN_VIOMMU_H__
>>  
>>  struct viommu;
>> +struct arch_irq_remapping_info;
>>  struct arch_irq_remapping_request;
> 
> If you include asm/viommu.h in viommu.h you don't need to forward
> declarations.

Will update.

> 
>>  
>>  struct viommu_ops {
>> @@ -28,6 +29,9 @@ struct viommu_ops {
>>      int (*destroy)(struct viommu *viommu);
>>      int (*handle_irq_request)(struct domain *d,
>>                                struct arch_irq_remapping_request *request);
>> +    int (*get_irq_info)(struct domain *d,
>> +                        struct arch_irq_remapping_request *request,
> 
> AFAICT d and request should be constified.

Did you mean to keep d and request in the same line? This will exceed 80
chars.


-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-10-25  7:43       ` Roger Pau Monné
@ 2017-10-25  7:38         ` Lan Tianyu
  0 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-25  7:38 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On 2017年10月25日 15:43, Roger Pau Monné wrote:
> On Wed, Oct 25, 2017 at 03:30:39PM +0800, Lan Tianyu wrote:
>> On 2017年10月19日 23:42, Roger Pau Monné wrote:
>>> On Thu, Sep 21, 2017 at 11:02:01PM -0400, Lan Tianyu wrote:
>>>
>>>>  
>>>>  struct viommu_ops {
>>>> @@ -28,6 +29,9 @@ struct viommu_ops {
>>>>      int (*destroy)(struct viommu *viommu);
>>>>      int (*handle_irq_request)(struct domain *d,
>>>>                                struct arch_irq_remapping_request *request);
>>>> +    int (*get_irq_info)(struct domain *d,
>>>> +                        struct arch_irq_remapping_request *request,
>>>
>>> AFAICT d and request should be constified.
>>
>> Did you mean to keep d and request in the same line? This will exceed 80
>> chars.
> 
> No, I meant that the parameters of the function should be "const struct
> domain *d" and "const struct arch_irq_remapping_request *request".
> AFAICT you should never modify them inside of get_irq_info.
> 

OK. I got it. This makes sense and will update.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request
  2017-10-25  7:30     ` Lan Tianyu
@ 2017-10-25  7:43       ` Roger Pau Monné
  2017-10-25  7:38         ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-25  7:43 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Wed, Oct 25, 2017 at 03:30:39PM +0800, Lan Tianyu wrote:
> On 2017年10月19日 23:42, Roger Pau Monné wrote:
> > On Thu, Sep 21, 2017 at 11:02:01PM -0400, Lan Tianyu wrote:
> > 
> >>  
> >>  struct viommu_ops {
> >> @@ -28,6 +29,9 @@ struct viommu_ops {
> >>      int (*destroy)(struct viommu *viommu);
> >>      int (*handle_irq_request)(struct domain *d,
> >>                                struct arch_irq_remapping_request *request);
> >> +    int (*get_irq_info)(struct domain *d,
> >> +                        struct arch_irq_remapping_request *request,
> > 
> > AFAICT d and request should be constified.
> 
> Did you mean to keep d and request in the same line? This will exceed 80
> chars.

No, I meant that the parameters of the function should be "const struct
domain *d" and "const struct arch_irq_remapping_request *request".
AFAICT you should never modify them inside of get_irq_info.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction
  2017-10-19 10:13   ` Roger Pau Monné
@ 2017-10-26 12:05     ` Wei Liu
  2017-10-27  1:58       ` Lan Tianyu
  0 siblings, 1 reply; 108+ messages in thread
From: Wei Liu @ 2017-10-26 12:05 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lan Tianyu, tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	Chao Gao

On Thu, Oct 19, 2017 at 11:13:57AM +0100, Roger Pau Monné wrote:
> > +
> > +            if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
> > +                ret = xc_viommu_create(ctx->xch, domid, VIOMMU_TYPE_INTEL_VTD,
> > +                                       viommu->base_addr, viommu->cap, &id);
> 
> As said in another patch: this will break compilation because
> xc_viommu_create is introduced in patch 9.
> 
> Please organize the patches in a way that the code always compiles and
> works fine. Keep in mind that the Xen tree should be bisectable
> always.
> 

+10 to this.

We rely heavily on our test system's bisector to tell us what is wrong.
The bisector works on patch level. Please make sure every patch builds,
otherwise the test system will just give up.

If triaging can be done automatically by computers, maintainers can
spend less time doing tedious work and more time reviewing patches
(yours included).

Normally I use git-rebase to build every commit, but I figured that's a
bit too dangerous so I wrote a script.

Please check out:

  [PATCH v3 for-4.10] scripts: introduce a script for build test

It is still under review, but you can fish out some of the runes to do
build tests.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction
  2017-10-26 12:05     ` Wei Liu
@ 2017-10-27  1:58       ` Lan Tianyu
  0 siblings, 0 replies; 108+ messages in thread
From: Lan Tianyu @ 2017-10-27  1:58 UTC (permalink / raw)
  To: Wei Liu, Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, konrad.wilk, George.Dunlap,
	andrew.cooper3, ian.jackson, xen-devel, jbeulich, Chao Gao

On 2017年10月26日 20:05, Wei Liu wrote:
> On Thu, Oct 19, 2017 at 11:13:57AM +0100, Roger Pau Monné wrote:
>>> +
>>> +            if (viommu->type == LIBXL_VIOMMU_TYPE_INTEL_VTD) {
>>> +                ret = xc_viommu_create(ctx->xch, domid, VIOMMU_TYPE_INTEL_VTD,
>>> +                                       viommu->base_addr, viommu->cap, &id);
>>
>> As said in another patch: this will break compilation because
>> xc_viommu_create is introduced in patch 9.
>>
>> Please organize the patches in a way that the code always compiles and
>> works fine. Keep in mind that the Xen tree should be bisectable
>> always.
>>
> 
> +10 to this.
> 
> We rely heavily on our test system's bisector to tell us what is wrong.
> The bisector works on patch level. Please make sure every patch builds,
> otherwise the test system will just give up.
> 
> If triaging can be done automatically by computers, maintainers can
> spend less time doing tedious work and more time reviewing patches
> (yours included).

Sure. Will pay more attention on this.

> 
> Normally I use git-rebase to build every commit, but I figured that's a
> bit too dangerous so I wrote a script.
> 
> Please check out:
> 
>   [PATCH v3 for-4.10] scripts: introduce a script for build test
> 
> It is still under review, but you can fish out some of the runes to do
> build tests.

This is very helpful. Thanks.
-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-10-25  1:43         ` Lan Tianyu
@ 2017-10-30  1:41           ` Lan Tianyu
  2017-10-30  9:54             ` Roger Pau Monné
  0 siblings, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-10-30  1:41 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On 2017年10月25日 09:43, Lan Tianyu wrote:
>> For all platforms supporting HVM, for PV I don't think it makes sense.
>> > Since AFAIK ARM guest type is also HVM I would rather introduce this
>> > field in the hvm_domain structure rather than the generic domain
>> > structure.
>> > 
> This sounds reasonable.
> 
>> > You might want to wait for feedback from others regarding this issue.
>> > 
> I discussed with Julien before. He hoped no to add viommu code for ARM
> first.So struct hvm_domain seems to be better place since it's arch
> specific definition and only add struct viommu for struct hvm_domain of x86.

Hi Roger:
	If PV guest needs PV IOMMU support, struct iommu should be put  into
struct domain and it can be reused by full-virtualization and PV iommu.
Malcolm Crossley sent out RFC patch of pv iommu before. I found it also
needs to change struct domain.

https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg01441.html


-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-10-18 14:05   ` Roger Pau Monné
  2017-10-19  6:31     ` Lan Tianyu
@ 2017-10-30  1:51     ` Lan Tianyu
  2017-11-06  8:19       ` Jan Beulich
  1 sibling, 1 reply; 108+ messages in thread
From: Lan Tianyu @ 2017-10-30  1:51 UTC (permalink / raw)
  To: Roger Pau Monné, jbeulich
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, chao.gao

On 2017年10月18日 22:05, Roger Pau Monné wrote:
>> +int viommu_register_type(uint64_t type, struct viommu_ops *ops)
>> > +{
>> > +    struct viommu_type *viommu_type = NULL;
>> > +
>> > +    if ( !viommu_enabled() )
>> > +        return -ENODEV;
>> > +
>> > +    if ( viommu_get_type(type) )
>> > +        return -EEXIST;
>> > +
>> > +    viommu_type = xzalloc(struct viommu_type);
>> > +    if ( !viommu_type )
>> > +        return -ENOMEM;
>> > +
>> > +    viommu_type->type = type;
>> > +    viommu_type->ops = ops;
>> > +
>> > +    spin_lock(&type_list_lock);
>> > +    list_add_tail(&viommu_type->node, &type_list);
>> > +    spin_unlock(&type_list_lock);
>> > +
>> > +    return 0;
>> > +}
> As mentioned above, I think this viommu_register_type helper could be
> avoided. I would rather use a macro similar to REGISTER_SCHEDULER in
> order to populate an array at link time, and then just iterate over
> it.
> 

Hi Jan:
	Could you help to check whether REGISTER_SCHEDULER is right direction
for vIOMMU? It needs to change Xen lds layout. From my view, a list to
manage vIOMMU device model types will be more easy and this maybe a
common solution.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-10-30  1:41           ` Lan Tianyu
@ 2017-10-30  9:54             ` Roger Pau Monné
  0 siblings, 0 replies; 108+ messages in thread
From: Roger Pau Monné @ 2017-10-30  9:54 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel, jbeulich,
	chao.gao

On Mon, Oct 30, 2017 at 09:41:23AM +0800, Lan Tianyu wrote:
> On 2017年10月25日 09:43, Lan Tianyu wrote:
> >> For all platforms supporting HVM, for PV I don't think it makes sense.
> >> > Since AFAIK ARM guest type is also HVM I would rather introduce this
> >> > field in the hvm_domain structure rather than the generic domain
> >> > structure.
> >> > 
> > This sounds reasonable.
> > 
> >> > You might want to wait for feedback from others regarding this issue.
> >> > 
> > I discussed with Julien before. He hoped no to add viommu code for ARM
> > first.So struct hvm_domain seems to be better place since it's arch
> > specific definition and only add struct viommu for struct hvm_domain of x86.
> 
> Hi Roger:
> 	If PV guest needs PV IOMMU support, struct iommu should be put  into
> struct domain and it can be reused by full-virtualization and PV iommu.
> Malcolm Crossley sent out RFC patch of pv iommu before. I found it also
> needs to change struct domain.
> 
> https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg01441.html

This patch series is from February 2016: almost two years old and
there's been no further repost.

If this can indeed be shared with a future pv-iommu work, have you
checked whether the current structure data and hooks would be
suitable for a pv-iommu implementation?

I would rather prefer to move the viommu structure from hvm_domain to
the generic domain struct when it's actually needed (ie: when pv-iommu
is implemented) rather than doing it here.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance
  2017-10-30  1:51     ` Lan Tianyu
@ 2017-11-06  8:19       ` Jan Beulich
  0 siblings, 0 replies; 108+ messages in thread
From: Jan Beulich @ 2017-11-06  8:19 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: tim, kevin.tian, sstabellini, wei.liu2, konrad.wilk,
	George.Dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Roger Pau Monné,
	chao.gao

>>> On 30.10.17 at 02:51, <tianyu.lan@intel.com> wrote:
> On 2017年10月18日 22:05, Roger Pau Monné wrote:
>>> +int viommu_register_type(uint64_t type, struct viommu_ops *ops)
>>> > +{
>>> > +    struct viommu_type *viommu_type = NULL;
>>> > +
>>> > +    if ( !viommu_enabled() )
>>> > +        return -ENODEV;
>>> > +
>>> > +    if ( viommu_get_type(type) )
>>> > +        return -EEXIST;
>>> > +
>>> > +    viommu_type = xzalloc(struct viommu_type);
>>> > +    if ( !viommu_type )
>>> > +        return -ENOMEM;
>>> > +
>>> > +    viommu_type->type = type;
>>> > +    viommu_type->ops = ops;
>>> > +
>>> > +    spin_lock(&type_list_lock);
>>> > +    list_add_tail(&viommu_type->node, &type_list);
>>> > +    spin_unlock(&type_list_lock);
>>> > +
>>> > +    return 0;
>>> > +}
>> As mentioned above, I think this viommu_register_type helper could be
>> avoided. I would rather use a macro similar to REGISTER_SCHEDULER in
>> order to populate an array at link time, and then just iterate over
>> it.
>> 
> 
> Hi Jan:
> 	Could you help to check whether REGISTER_SCHEDULER is right direction
> for vIOMMU? It needs to change Xen lds layout. From my view, a list to
> manage vIOMMU device model types will be more easy and this maybe a
> common solution.

I think the suggested approach is generally the neater one; there
may be a few other things we could convert to a similar model, to
clean up code. Hence yes, unless there are strong reasons against
it, I agree with Roger.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 108+ messages in thread

end of thread, other threads:[~2017-11-06  8:19 UTC | newest]

Thread overview: 108+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-22  3:01 [PATCH V3 00/29] Lan Tianyu
2017-09-22  3:01 ` [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc Lan Tianyu
2017-10-18 13:26   ` Roger Pau Monné
2017-10-19  2:26     ` Lan Tianyu
2017-10-19  8:49       ` Roger Pau Monné
2017-10-19 11:28         ` Jan Beulich
2017-10-24  7:16           ` Lan Tianyu
2017-09-22  3:01 ` [PATCH V3 2/29] VIOMMU: Add vIOMMU helper functions to create, destroy vIOMMU instance Lan Tianyu
2017-10-18 14:05   ` Roger Pau Monné
2017-10-19  6:31     ` Lan Tianyu
2017-10-19  8:47       ` Roger Pau Monné
2017-10-25  1:43         ` Lan Tianyu
2017-10-30  1:41           ` Lan Tianyu
2017-10-30  9:54             ` Roger Pau Monné
2017-10-30  1:51     ` Lan Tianyu
2017-11-06  8:19       ` Jan Beulich
2017-09-22  3:01 ` [PATCH V3 3/29] DOMCTL: Introduce new DOMCTL commands for vIOMMU support Lan Tianyu
2017-10-18 14:18   ` Roger Pau Monné
2017-10-19  6:41     ` Lan Tianyu
2017-09-22  3:01 ` [PATCH V3 4/29] tools/libacpi: Add DMA remapping reporting (DMAR) ACPI table structures Lan Tianyu
2017-10-18 14:36   ` Roger Pau Monné
2017-10-19  6:46     ` Lan Tianyu
2017-09-22  3:01 ` [PATCH V3 5/29] tools/libacpi: Add new fields in acpi_config for DMAR table Lan Tianyu
2017-10-18 15:12   ` Roger Pau Monné
2017-10-19  8:09     ` Lan Tianyu
2017-10-19  8:40       ` Roger Pau Monné
2017-10-25  6:06         ` Lan Tianyu
2017-10-19 11:31       ` Jan Beulich
2017-09-22  3:01 ` [PATCH V3 6/29] tools/libxl: Add a user configurable parameter to control vIOMMU attributes Lan Tianyu
2017-10-19  9:49   ` Roger Pau Monné
2017-10-20  1:36     ` Chao Gao
2017-09-22  3:01 ` [PATCH V3 7/29] tools/libxl: build DMAR table for a guest with one virtual VTD Lan Tianyu
2017-10-19 10:00   ` Roger Pau Monné
2017-10-20  1:44     ` Chao Gao
2017-09-22  3:01 ` [PATCH V3 8/29] tools/libxl: create vIOMMU during domain construction Lan Tianyu
2017-10-19 10:13   ` Roger Pau Monné
2017-10-26 12:05     ` Wei Liu
2017-10-27  1:58       ` Lan Tianyu
2017-09-22  3:01 ` [PATCH V3 9/29] tools/libxc: Add viommu operations in libxc Lan Tianyu
2017-10-19 10:17   ` Roger Pau Monné
2017-09-22  3:01 ` [PATCH V3 10/29] vtd: add and align register definitions Lan Tianyu
2017-10-19 10:21   ` Roger Pau Monné
2017-10-20  1:47     ` Chao Gao
2017-09-22  3:01 ` [PATCH V3 11/29] x86/hvm: Introduce a emulated VTD for HVM Lan Tianyu
2017-10-19 11:20   ` Roger Pau Monné
2017-10-20  2:46     ` Chao Gao
2017-10-20  6:56       ` Jan Beulich
2017-10-20  6:12         ` Chao Gao
2017-10-20  8:37         ` Lan Tianyu
2017-09-22  3:01 ` [PATCH V3 12/29] x86/vvtd: Add MMIO handler for VVTD Lan Tianyu
2017-10-19 11:34   ` Roger Pau Monné
2017-10-20  2:58     ` Chao Gao
2017-10-20  9:51       ` Roger Pau Monné
2017-09-22  3:01 ` [PATCH V3 13/29] x86/vvtd: Set Interrupt Remapping Table Pointer through GCMD Lan Tianyu
2017-10-19 11:56   ` Roger Pau Monné
2017-10-20  4:08     ` Chao Gao
2017-10-20  6:57       ` Jan Beulich
2017-09-22  3:01 ` [PATCH V3 14/29] x86/vvtd: Enable Interrupt Remapping " Lan Tianyu
2017-10-19 13:42   ` Roger Pau Monné
2017-09-22  3:01 ` [PATCH V3 15/29] x86/vvtd: Process interrupt remapping request Lan Tianyu
2017-10-19 14:26   ` Roger Pau Monné
2017-10-20  5:16     ` Chao Gao
2017-10-20 10:01       ` Roger Pau Monné
2017-10-23  6:44         ` Chao Gao
2017-09-22  3:01 ` [PATCH V3 16/29] x86/vvtd: decode interrupt attribute from IRTE Lan Tianyu
2017-10-19 14:39   ` Roger Pau Monné
2017-10-20  5:22     ` Chao Gao
2017-09-22  3:01 ` [PATCH V3 17/29] x86/vvtd: add a helper function to decide the interrupt format Lan Tianyu
2017-10-19 14:43   ` Roger Pau Monné
2017-09-22  3:01 ` [PATCH V3 18/29] VIOMMU: Add irq request callback to deal with irq remapping Lan Tianyu
2017-10-19 15:00   ` Roger Pau Monné
2017-09-22  3:02 ` [PATCH V3 19/29] x86/vioapic: Hook interrupt delivery of vIOAPIC Lan Tianyu
2017-10-19 15:37   ` Roger Pau Monné
2017-09-22  3:02 ` [PATCH V3 20/29] VIOMMU: Add get irq info callback to convert irq remapping request Lan Tianyu
2017-10-19 15:42   ` Roger Pau Monné
2017-10-25  7:30     ` Lan Tianyu
2017-10-25  7:43       ` Roger Pau Monné
2017-10-25  7:38         ` Lan Tianyu
2017-09-22  3:02 ` [PATCH V3 21/29] VIOMMU: Introduce callback of checking irq remapping mode Lan Tianyu
2017-10-19 15:43   ` Roger Pau Monné
2017-09-22  3:02 ` [PATCH V3 22/29] x86/vioapic: extend vioapic_get_vector() to support remapping format RTE Lan Tianyu
2017-10-19 15:49   ` Roger Pau Monné
2017-10-19 15:56     ` Jan Beulich
2017-10-20  1:04       ` Chao Gao
2017-09-22  3:02 ` [PATCH V3 23/29] passthrough: move some fields of hvm_gmsi_info to a sub-structure Lan Tianyu
2017-09-22  3:02 ` [PATCH V3 24/29] tools/libxc: Add a new interface to bind remapping format msi with pirq Lan Tianyu
2017-10-19 16:03   ` Roger Pau Monné
2017-10-20  5:39     ` Chao Gao
2017-09-22  3:02 ` [PATCH V3 25/29] x86/vmsi: Hook delivering remapping format msi to guest Lan Tianyu
2017-10-19 16:07   ` Roger Pau Monné
2017-10-20  6:48     ` Jan Beulich
2017-09-22  3:02 ` [PATCH V3 26/29] x86/vvtd: Handle interrupt translation faults Lan Tianyu
2017-10-19 16:31   ` Roger Pau Monné
2017-10-20  5:54     ` Chao Gao
2017-10-20 10:08       ` Roger Pau Monné
2017-10-20 14:20         ` Jan Beulich
2017-09-22  3:02 ` [PATCH V3 27/29] x86/vvtd: Enable Queued Invalidation through GCMD Lan Tianyu
2017-10-20 10:30   ` Roger Pau Monné
2017-09-22  3:02 ` [PATCH V3 28/29] x86/vvtd: Add queued invalidation (QI) support Lan Tianyu
2017-10-20 11:20   ` Roger Pau Monné
2017-10-23  7:50     ` Chao Gao
2017-10-23  8:57       ` Roger Pau Monné
2017-10-23  8:52         ` Chao Gao
2017-10-23 23:26           ` Tian, Kevin
2017-09-22  3:02 ` [PATCH V3 29/29] x86/vvtd: save and restore emulated VT-d Lan Tianyu
2017-10-20 11:25   ` Roger Pau Monné
2017-10-20 11:36 ` [PATCH V3 00/29] Roger Pau Monné
2017-10-23  1:23   ` Lan Tianyu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.