All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
@ 2016-04-19  8:38 Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 01/16] acpi: enable INTR for DMAR report structure Peter Xu
                   ` (16 more replies)
  0 siblings, 17 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Jan, Michael,

Still haven't got your response from v3 review comments, but I
decided to move on to provide a workable version first (v3 is
missing the first patch, so it is not working). Also, misc issues
are addressed from Radim. Anyway, I am always ready for v5 and
further. :)

Github branch for your reference in case needed:

  https://github.com/xzpeter/qemu vtd-intr-v4

Thanks,

v4 changes (all patch number corresponds to v3):
- add one patch at the start of v3 series: I missed to send the
  first patch in v3. adding it in. [Jan]
- patch 9: add support for compatible mode (no reason not to support
  it, if not, we will get some warnings when using split irqchip)
- patch 11: further simplify ioapic_update_kvm_routes() using the
  helper function.
- patch 12: tweak on kvm_arch_fixup_msi_route() rather than
  ioapic_update_kvm_routes() only. [Radim]
- add patch 15: introduce IEC (Interrupt Entry Cache) invalidation
  notifier list. We can register to this list if we want to be
  notified when we got IR invalidation requests [Radim]
- add patch 16: let IOAPIC the first consumer for the above IEC
  notifier list. [Radim]
- several other trivial fixes (like moving some defines from .c to
  .h, moving several lines of changes from one patch to another to
  make it make more sense, etc.)

v3 changes (all patch numbers corresponds to v2):
- patch 1 (-> v3 patch 13)
  - move to the end of series [Alex]
- patch 10 (dropped)
  - drop this one, since re-worked on IOAPIC support, so we do not
    need this any more.
- patch 12 (-> v3 patch 10)
  - leverage MSI path for IOAPIC IR [Jan]
- patch 13 (v3 -> patch 9)
  - remove vtd_interrupt_remap_msi() declaration by reordering the
    functions [mst]
  - vtd_generate_msi_message(): init msg using {}, remove FIXME
    [mst]
- new patches
  - v3 patch 11: introduce ioapic_entry_parse() helper function
  - v3 patch 12: add support for kernel-irqchip=split. This needs
    more reviews, logically this should enable lots of things:
	splitted irqchip, irqfd, vhost, and irqfd support for
	passthrough devices (not tested). Please refer to the patch for
	more information.

v2 changes:
- patch 1
  - rename "int_remap" to "intr" in several places [Marcel]
  - remove "Intel" specific words in desc or commit message, prepare
    itself with further AMD support [Marcel]
  - avoid using object_property_get_bool() [Marcel]
- patch 5
  - use PCI bus number 0xff rather than 0xf0 for the IOAPIC scope
    definition. (please let me know if anyone knows how I can avoid
	user using PCI bus number 0xff... TIA)
- patch 11
  - fix comments [Marcel]
- all
  - remove intr_supported variable [Marcel]

This patchset provide very basic functionalities for interrupt
remapping (IR) support of the emulated Intel IOMMU device.

By default, IR is disabled to be better compatible with current
QEMU. To enable IR, we can using the following command to boot a
IR-supported VM with virtio-net device with vhost (still do not
support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
here):

$ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
     -enable-kvm -m 1024 \
	 -netdev tap,id=net0,vhost=on \
     -device virtio-net-pci,netdev=user.0 \
     -monitor telnet::3333,server,nowait \
	 /var/lib/libvirt/images/vm1.qcow2

When guest boots, we can verify whether IR enabled by grepping the
dmesg like:

[root@localhost ~]# journalctl -k | grep "DMAR-IR"
Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode

Currently supported devices:

- Emulated/Splitted irqchip
- Generic PCI Devices
- vhost devices
- pass through device support? Not tested, but suppose it should work.

TODO List:

- kvm-ioapic support (?)
- EIM support
- IR fault reporting
- source-id validation for IRTE
- IRTE cache and better queued invalidation
- migration support (for IOMMU as general?)
- more?

Peter Xu (16):
  acpi: enable INTR for DMAR report structure
  intel_iommu: allow queued invalidation for IR
  intel_iommu: set IR bit for ECAP register
  acpi: add DMAR scope definition for root IOAPIC
  intel_iommu: define interrupt remap table addr register
  intel_iommu: handle interrupt remap enable
  intel_iommu: define several structs for IOMMU IR
  intel_iommu: provide helper function vtd_get_iommu
  intel_iommu: add IR translation faults defines
  intel_iommu: Add support for PCI MSI remap
  q35: ioapic: add support for emulated IOAPIC IR
  ioapic: introduce ioapic_entry_parse() helper
  intel_iommu: add support for split irqchip
  q35: add "int-remap" flag to enable intr
  intel_iommu: introduce IEC notifiers
  ioapic: register VT-d IEC invalidate notifier

 hw/core/machine.c                 |  22 +++
 hw/i386/acpi-build.c              |  36 ++--
 hw/i386/intel_iommu.c             | 376 +++++++++++++++++++++++++++++++++++++-
 hw/i386/intel_iommu_internal.h    |  47 ++++-
 hw/i386/pc.c                      |   3 +
 hw/intc/ioapic.c                  | 125 ++++++++-----
 hw/pci-host/q35.c                 |   4 +
 include/hw/acpi/acpi-defs.h       |  15 ++
 include/hw/boards.h               |   1 +
 include/hw/i386/apic-msidef.h     |   1 +
 include/hw/i386/intel_iommu.h     | 145 +++++++++++++++
 include/hw/i386/ioapic_internal.h |   1 +
 include/hw/i386/pc.h              |   4 +
 include/hw/pci-host/q35.h         |   9 +
 target-i386/kvm.c                 |  24 +++
 15 files changed, 754 insertions(+), 59 deletions(-)

-- 
2.4.3

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 01/16] acpi: enable INTR for DMAR report structure
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 02/16] intel_iommu: allow queued invalidation for IR Peter Xu
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Introduce iommu_intr in MachineState to show whether IOMMU IR is
enabled. By default, IR is off.

In ACPI DMA remapping report structure, enable INTR flag when specified.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/core/machine.c             |  2 ++
 hw/i386/acpi-build.c          | 12 +++++++++---
 include/hw/boards.h           |  1 +
 include/hw/i386/intel_iommu.h |  2 ++
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 6dbbc85..276ad61 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -382,6 +382,8 @@ static void machine_initfn(Object *obj)
     ms->kvm_shadow_mem = -1;
     ms->dump_guest_core = true;
     ms->mem_merge = true;
+    /* Disable interrupt remapping by default. */
+    ms->iommu_intr = false;
 
     object_property_add_str(obj, "accel",
                             machine_get_accel, machine_set_accel, NULL);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6477003..80dd1bb 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2575,16 +2575,22 @@ build_mcfg_q35(GArray *table_data, GArray *linker, AcpiMcfgInfo *info)
 }
 
 static void
-build_dmar_q35(GArray *table_data, GArray *linker)
+build_dmar_q35(MachineState *ms, GArray *table_data, GArray *linker)
 {
     int dmar_start = table_data->len;
 
     AcpiTableDmar *dmar;
     AcpiDmarHardwareUnit *drhd;
+    uint8_t dmar_flags = 0;
+
+    if (ms->iommu_intr) {
+        /* enable INTR for the IOMMU device */
+        dmar_flags |= DMAR_REPORT_F_INTR;
+    }
 
     dmar = acpi_data_push(table_data, sizeof(*dmar));
     dmar->host_address_width = VTD_HOST_ADDRESS_WIDTH - 1;
-    dmar->flags = 0;    /* No intr_remap for now */
+    dmar->flags = dmar_flags;
 
     /* DMAR Remapping Hardware Unit Definition structure */
     drhd = acpi_data_push(table_data, sizeof(*drhd));
@@ -2745,7 +2751,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
     }
     if (acpi_has_iommu()) {
         acpi_add_table(table_offsets, tables_blob);
-        build_dmar_q35(tables_blob, tables->linker);
+        build_dmar_q35(MACHINE(pcms), tables_blob, tables->linker);
     }
     if (pcms->acpi_nvdimm_state.is_enabled) {
         nvdimm_build_acpi(table_offsets, tables_blob, tables->linker);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 8d4fe56..43f4976 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -152,6 +152,7 @@ struct MachineState {
     bool igd_gfx_passthru;
     char *firmware;
     bool iommu;
+    bool iommu_intr;
     bool suppress_vmdesc;
     bool enforce_config_section;
 
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index b024ffa..0d89796 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -44,6 +44,8 @@
 #define VTD_HOST_ADDRESS_WIDTH      39
 #define VTD_HAW_MASK                ((1ULL << VTD_HOST_ADDRESS_WIDTH) - 1)
 
+#define DMAR_REPORT_F_INTR          (1)
+
 typedef struct VTDContextEntry VTDContextEntry;
 typedef struct VTDContextCacheEntry VTDContextCacheEntry;
 typedef struct IntelIOMMUState IntelIOMMUState;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 02/16] intel_iommu: allow queued invalidation for IR
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 01/16] acpi: enable INTR for DMAR report structure Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 03/16] intel_iommu: set IR bit for ECAP register Peter Xu
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Queued invalidation is required for IR. This patch add basic support for
interrupt cache invalidate requests. Since we currently have no IR cache
implemented yet, we can just skip all interrupt cache invalidation
requests for now.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c          | 9 +++++++++
 hw/i386/intel_iommu_internal.h | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 347718f..4b0558e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1400,6 +1400,15 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s)
         }
         break;
 
+    case VTD_INV_DESC_IEC:
+        VTD_DPRINTF(INV, "Interrupt Entry Cache Invalidation "
+                    "not implemented yet");
+        /*
+         * Since currently we do not cache interrupt entries, we can
+         * just mark this descriptor as "good" and move on.
+         */
+        break;
+
     default:
         VTD_DPRINTF(GENERAL, "error: unkonw Invalidation Descriptor type "
                     "hi 0x%"PRIx64 " lo 0x%"PRIx64 " type %"PRIu8,
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index e5f514c..b648e69 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -286,6 +286,8 @@ typedef struct VTDInvDesc VTDInvDesc;
 #define VTD_INV_DESC_TYPE               0xf
 #define VTD_INV_DESC_CC                 0x1 /* Context-cache Invalidate Desc */
 #define VTD_INV_DESC_IOTLB              0x2
+#define VTD_INV_DESC_IEC                0x4 /* Interrupt Entry Cache
+                                               Invalidate Descriptor */
 #define VTD_INV_DESC_WAIT               0x5 /* Invalidation Wait Descriptor */
 #define VTD_INV_DESC_NONE               0   /* Not an Invalidate Descriptor */
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 03/16] intel_iommu: set IR bit for ECAP register
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 01/16] acpi: enable INTR for DMAR report structure Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 02/16] intel_iommu: allow queued invalidation for IR Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 04/16] acpi: add DMAR scope definition for root IOAPIC Peter Xu
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Enable IR in IOMMU Extended Capability register.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c          | 7 +++++++
 hw/i386/intel_iommu_internal.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4b0558e..17668d6 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -24,6 +24,7 @@
 #include "exec/address-spaces.h"
 #include "intel_iommu_internal.h"
 #include "hw/pci/pci.h"
+#include "hw/boards.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -1941,6 +1942,8 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
  */
 static void vtd_init(IntelIOMMUState *s)
 {
+    MachineState *ms = MACHINE(qdev_get_machine());
+
     memset(s->csr, 0, DMAR_REG_SIZE);
     memset(s->wmask, 0, DMAR_REG_SIZE);
     memset(s->w1cmask, 0, DMAR_REG_SIZE);
@@ -1961,6 +1964,10 @@ static void vtd_init(IntelIOMMUState *s)
              VTD_CAP_SAGAW | VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS;
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
+    if (ms->iommu_intr) {
+        s->ecap |= VTD_ECAP_IR;
+    }
+
     vtd_reset_context_cache(s);
     vtd_reset_iotlb(s);
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index b648e69..5b98a11 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -176,6 +176,8 @@
 /* (offset >> 4) << 8 */
 #define VTD_ECAP_IRO                (DMAR_IOTLB_REG_OFFSET << 4)
 #define VTD_ECAP_QI                 (1ULL << 1)
+/* Interrupt Remapping support */
+#define VTD_ECAP_IR                 (1ULL << 3)
 
 /* CAP_REG */
 /* (offset >> 4) << 24 */
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 04/16] acpi: add DMAR scope definition for root IOAPIC
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (2 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 03/16] intel_iommu: set IR bit for ECAP register Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 05/16] intel_iommu: define interrupt remap table addr register Peter Xu
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

To enable interrupt remapping for intel IOMMU device, each IOAPIC device
in the system reported via ACPI MADT must be explicitly enumerated under
one specific remapping hardware unit. This patch adds the root-complex
IOAPIC into the default DMAR device.

Please refer to VT-d spec 8.3.1.1 for more information.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/acpi-build.c        | 17 +++++++++++++++--
 include/hw/acpi/acpi-defs.h | 15 +++++++++++++++
 include/hw/pci-host/q35.h   |  9 +++++++++
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 80dd1bb..5d2d87b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -77,6 +77,9 @@
 #define ACPI_BUILD_DPRINTF(fmt, ...)
 #endif
 
+/* Default IOAPIC ID */
+#define ACPI_BUILD_IOAPIC_ID 0x0
+
 typedef struct AcpiMcfgInfo {
     uint64_t mcfg_base;
     uint32_t mcfg_size;
@@ -375,7 +378,6 @@ build_madt(GArray *table_data, GArray *linker, PCMachineState *pcms)
     io_apic = acpi_data_push(table_data, sizeof *io_apic);
     io_apic->type = ACPI_APIC_IO;
     io_apic->length = sizeof(*io_apic);
-#define ACPI_BUILD_IOAPIC_ID 0x0
     io_apic->io_apic_id = ACPI_BUILD_IOAPIC_ID;
     io_apic->address = cpu_to_le32(IO_APIC_DEFAULT_ADDRESS);
     io_apic->interrupt = cpu_to_le32(0);
@@ -2582,6 +2584,9 @@ build_dmar_q35(MachineState *ms, GArray *table_data, GArray *linker)
     AcpiTableDmar *dmar;
     AcpiDmarHardwareUnit *drhd;
     uint8_t dmar_flags = 0;
+    AcpiDmarDeviceScope *scope = NULL;
+    /* Root complex IOAPIC use one path[0] only */
+    uint16_t scope_size = sizeof(*scope) + sizeof(uint16_t);
 
     if (ms->iommu_intr) {
         /* enable INTR for the IOMMU device */
@@ -2595,11 +2600,19 @@ build_dmar_q35(MachineState *ms, GArray *table_data, GArray *linker)
     /* DMAR Remapping Hardware Unit Definition structure */
     drhd = acpi_data_push(table_data, sizeof(*drhd));
     drhd->type = cpu_to_le16(ACPI_DMAR_TYPE_HARDWARE_UNIT);
-    drhd->length = cpu_to_le16(sizeof(*drhd));   /* No device scope now */
+    drhd->length = cpu_to_le16(sizeof(*drhd) + scope_size);
     drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
     drhd->pci_segment = cpu_to_le16(0);
     drhd->address = cpu_to_le64(Q35_HOST_BRIDGE_IOMMU_ADDR);
 
+    /* Scope definition for the root-complex IOAPIC */
+    scope = acpi_data_push(table_data, scope_size);
+    scope->entry_type = cpu_to_le16(ACPI_DMAR_DEV_SCOPE_TYPE_IOAPIC);
+    scope->length = scope_size;
+    scope->enumeration_id = cpu_to_le16(ACPI_BUILD_IOAPIC_ID);
+    scope->bus = cpu_to_le16(Q35_PSEUDO_BUS_PLATFORM);
+    scope->path[0] = cpu_to_le16(Q35_PSEUDO_DEVFN_IOAPIC);
+
     build_header(linker, table_data, (void *)(table_data->data + dmar_start),
                  "DMAR", table_data->len - dmar_start, 1, NULL, NULL);
 }
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index c7a03d4..2430af6 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -556,6 +556,20 @@ enum {
 /*
  * Sub-structures for DMAR
  */
+
+#define ACPI_DMAR_DEV_SCOPE_TYPE_IOAPIC     (0x03)
+
+/* Device scope structure for DRHD. */
+struct AcpiDmarDeviceScope {
+    uint8_t entry_type;
+    uint8_t length;
+    uint16_t reserved;
+    uint8_t enumeration_id;
+    uint8_t bus;
+    uint16_t path[0];           /* list of dev:func pairs */
+} QEMU_PACKED;
+typedef struct AcpiDmarDeviceScope AcpiDmarDeviceScope;
+
 /* Type 0: Hardware Unit Definition */
 struct AcpiDmarHardwareUnit {
     uint16_t type;
@@ -564,6 +578,7 @@ struct AcpiDmarHardwareUnit {
     uint8_t reserved;
     uint16_t pci_segment;   /* The PCI Segment associated with this unit */
     uint64_t address;   /* Base address of remapping hardware register-set */
+    AcpiDmarDeviceScope scope[0];
 } QEMU_PACKED;
 typedef struct AcpiDmarHardwareUnit AcpiDmarHardwareUnit;
 
diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
index c5c073d..9afc221 100644
--- a/include/hw/pci-host/q35.h
+++ b/include/hw/pci-host/q35.h
@@ -175,4 +175,13 @@ typedef struct Q35PCIHost {
 
 uint64_t mch_mcfg_base(void);
 
+/*
+ * Arbitary but unique BNF number for IOAPIC device. This is only
+ * used when interrupt remapping is enabled.
+ *
+ * TODO: make sure there would have no conflict with real PCI bus
+ */
+#define Q35_PSEUDO_BUS_PLATFORM         (0xff)
+#define Q35_PSEUDO_DEVFN_IOAPIC         (0x00)
+
 #endif /* HW_Q35_H */
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 05/16] intel_iommu: define interrupt remap table addr register
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (3 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 04/16] acpi: add DMAR scope definition for root IOAPIC Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 06/16] intel_iommu: handle interrupt remap enable Peter Xu
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Defined Interrupt Remap Table Address register to store IR table
pointer. Also, do proper handling on global command register writes to
store table pointer and its size.

One more debug flag "DEBUG_IR" is added for interrupt remapping.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c          | 52 +++++++++++++++++++++++++++++++++++++++++-
 hw/i386/intel_iommu_internal.h |  4 ++++
 include/hw/i386/intel_iommu.h  |  5 ++++
 3 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 17668d6..00b873c 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -30,7 +30,7 @@
 #ifdef DEBUG_INTEL_IOMMU
 enum {
     DEBUG_GENERAL, DEBUG_CSR, DEBUG_INV, DEBUG_MMU, DEBUG_FLOG,
-    DEBUG_CACHE,
+    DEBUG_CACHE, DEBUG_IR,
 };
 #define VTD_DBGBIT(x)   (1 << DEBUG_##x)
 static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | VTD_DBGBIT(CSR);
@@ -900,6 +900,19 @@ static void vtd_root_table_setup(IntelIOMMUState *s)
                 (s->root_extended ? "(extended)" : ""));
 }
 
+static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
+{
+    uint64_t value = 0;
+    value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
+    s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
+    s->intr_root = value & VTD_IRTA_ADDR_MASK;
+
+    /* TODO: invalidate interrupt entry cache */
+
+    VTD_DPRINTF(CSR, "int remap table addr 0x%"PRIx64 " size %"PRIu32,
+                s->intr_root, s->intr_size);
+}
+
 static void vtd_context_global_invalidate(IntelIOMMUState *s)
 {
     s->context_cache_gen++;
@@ -1138,6 +1151,16 @@ static void vtd_handle_gcmd_srtp(IntelIOMMUState *s)
     vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_RTPS);
 }
 
+/* Set Interrupt Remap Table Pointer */
+static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
+{
+    VTD_DPRINTF(CSR, "set Interrupt Remap Table Pointer");
+
+    vtd_interrupt_remap_table_setup(s);
+    /* Ok - report back to driver */
+    vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
+}
+
 /* Handle Translation Enable/Disable */
 static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
 {
@@ -1177,6 +1200,10 @@ static void vtd_handle_gcmd_write(IntelIOMMUState *s)
         /* Queued Invalidation Enable */
         vtd_handle_gcmd_qie(s, val & VTD_GCMD_QIE);
     }
+    if (val & VTD_GCMD_SIRTP) {
+        /* Set/update the interrupt remapping root-table pointer */
+        vtd_handle_gcmd_sirtp(s);
+    }
 }
 
 /* Handle write to Context Command Register */
@@ -1838,6 +1865,23 @@ static void vtd_mem_write(void *opaque, hwaddr addr,
         vtd_update_fsts_ppf(s);
         break;
 
+    case DMAR_IRTA_REG:
+        VTD_DPRINTF(IR, "DMAR_IRTA_REG write addr 0x%"PRIx64
+                    ", size %d, val 0x%"PRIx64, addr, size, val);
+        if (size == 4) {
+            vtd_set_long(s, addr, val);
+        } else {
+            vtd_set_quad(s, addr, val);
+        }
+        break;
+
+    case DMAR_IRTA_REG_HI:
+        VTD_DPRINTF(IR, "DMAR_IRTA_REG_HI write addr 0x%"PRIx64
+                    ", size %d, val 0x%"PRIx64, addr, size, val);
+        assert(size == 4);
+        vtd_set_long(s, addr, val);
+        break;
+
     default:
         VTD_DPRINTF(GENERAL, "error: unhandled reg write addr 0x%"PRIx64
                     ", size %d, val 0x%"PRIx64, addr, size, val);
@@ -2017,6 +2061,12 @@ static void vtd_init(IntelIOMMUState *s)
     /* Fault Recording Registers, 128-bit */
     vtd_define_quad(s, DMAR_FRCD_REG_0_0, 0, 0, 0);
     vtd_define_quad(s, DMAR_FRCD_REG_0_2, 0, 0, 0x8000000000000000ULL);
+
+    /*
+     * Interrupt remapping registers, not support extended interrupt
+     * mode for now.
+     */
+    vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff00fULL, 0);
 }
 
 /* Should not reset address_spaces when reset because devices will still use
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 5b98a11..309833f 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -172,6 +172,10 @@
 #define VTD_RTADDR_RTT              (1ULL << 11)
 #define VTD_RTADDR_ADDR_MASK        (VTD_HAW_MASK ^ 0xfffULL)
 
+/* IRTA_REG */
+#define VTD_IRTA_ADDR_MASK          (VTD_HAW_MASK ^ 0xfffULL)
+#define VTD_IRTA_SIZE_MASK          (0xfULL)
+
 /* ECAP_REG */
 /* (offset >> 4) << 8 */
 #define VTD_ECAP_IRO                (DMAR_IOTLB_REG_OFFSET << 4)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 0d89796..cc49839 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -125,6 +125,11 @@ struct IntelIOMMUState {
     MemoryRegionIOMMUOps iommu_ops;
     GHashTable *vtd_as_by_busptr;   /* VTDBus objects indexed by PCIBus* reference */
     VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed by bus number */
+
+    /* interrupt remapping */
+    bool intr_enabled;              /* Whether guest enabled IR */
+    dma_addr_t intr_root;           /* Interrupt remapping table pointer */
+    uint32_t intr_size;             /* Number of IR table entries */
 };
 
 /* Find the VTD Address space associated with the given bus pointer,
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 06/16] intel_iommu: handle interrupt remap enable
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (4 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 05/16] intel_iommu: define interrupt remap table addr register Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 07/16] intel_iommu: define several structs for IOMMU IR Peter Xu
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Handle writting to IRE bit in global command register.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 00b873c..4d14124 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1180,6 +1180,22 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
     }
 }
 
+/* Handle Interrupt Remap Enable/Disable */
+static void vtd_handle_gcmd_ire(IntelIOMMUState *s, bool en)
+{
+    VTD_DPRINTF(CSR, "Interrupt Remap Enable %s", (en ? "on" : "off"));
+
+    if (en) {
+        s->intr_enabled = true;
+        /* Ok - report back to driver */
+        vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRES);
+    } else {
+        s->intr_enabled = false;
+        /* Ok - report back to driver */
+        vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_IRES, 0);
+    }
+}
+
 /* Handle write to Global Command Register */
 static void vtd_handle_gcmd_write(IntelIOMMUState *s)
 {
@@ -1204,6 +1220,10 @@ static void vtd_handle_gcmd_write(IntelIOMMUState *s)
         /* Set/update the interrupt remapping root-table pointer */
         vtd_handle_gcmd_sirtp(s);
     }
+    if (changed & VTD_GCMD_IRE) {
+        /* Interrupt remap enable/disable */
+        vtd_handle_gcmd_ire(s, val & VTD_GCMD_IRE);
+    }
 }
 
 /* Handle write to Context Command Register */
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 07/16] intel_iommu: define several structs for IOMMU IR
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (5 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 06/16] intel_iommu: handle interrupt remap enable Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 08/16] intel_iommu: provide helper function vtd_get_iommu Peter Xu
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Several data structs are defined to better support the rest of the
patches: IRTE to parse remapping table entries, and IOAPIC/MSI related
structure bits to parse interrupt entries to be filled in by guest
kernel.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/hw/i386/intel_iommu.h | 60 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index cc49839..4914fe6 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -52,6 +52,9 @@ typedef struct IntelIOMMUState IntelIOMMUState;
 typedef struct VTDAddressSpace VTDAddressSpace;
 typedef struct VTDIOTLBEntry VTDIOTLBEntry;
 typedef struct VTDBus VTDBus;
+typedef union VTD_IRTE VTD_IRTE;
+typedef union VTD_IR_IOAPICEntry VTD_IR_IOAPICEntry;
+typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
 
 /* Context-Entry */
 struct VTDContextEntry {
@@ -90,6 +93,63 @@ struct VTDIOTLBEntry {
     bool write_flags;
 };
 
+/* Interrupt Remapping Table Entry Definition */
+union VTD_IRTE {
+    struct {
+        uint8_t present:1;          /* Whether entry present/available */
+        uint8_t fault_disable:1;    /* Fault Processing Disable */
+        uint8_t dest_mode:1;        /* Destination Mode */
+        uint8_t redir_hint:1;       /* Redirection Hint */
+        uint8_t trigger_mode:1;     /* Trigger Mode */
+        uint8_t delivery_mode:3;    /* Delivery Mode */
+        uint8_t __avail:4;          /* Available spaces for software */
+        uint8_t __reserved_0:3;     /* Reserved 0 */
+        uint8_t irte_mode:1;        /* IRTE Mode */
+        uint8_t vector:8;           /* Interrupt Vector */
+        uint8_t __reserved_1:8;     /* Reserved 1 */
+        uint32_t dest_id:32;        /* Destination ID */
+        uint16_t source_id:16;      /* Source-ID */
+        uint8_t sid_q:2;            /* Source-ID Qualifier */
+        uint8_t sid_vtype:2;        /* Source-ID Validation Type */
+        uint64_t __reserved_2:44;   /* Reserved 2 */
+    } QEMU_PACKED;
+    uint64_t data[2];
+};
+
+/* Programming format for IOAPIC table entries */
+union VTD_IR_IOAPICEntry {
+    struct {
+        uint8_t vector:8;           /* Vector */
+        uint8_t __zeros:3;          /* Reserved (all zero) */
+        uint8_t index_h:1;          /* Interrupt Index bit 15 */
+        uint8_t status:1;           /* Deliver Status */
+        uint8_t polarity:1;         /* Interrupt Polarity */
+        uint8_t remote_irr:1;       /* Remote IRR */
+        uint8_t trigger_mode:1;     /* Trigger Mode */
+        uint8_t mask:1;             /* Mask */
+        uint32_t __reserved:31;     /* Reserved (should all zero) */
+        uint8_t int_mode:1;         /* Interrupt Format */
+        uint16_t index_l:15;        /* Interrupt Index bits 14-0 */
+    } QEMU_PACKED;
+    uint64_t data;
+};
+
+/* Programming format for MSI/MSI-X addresses */
+union VTD_IR_MSIAddress {
+    struct {
+        uint8_t __not_care:2;
+        uint8_t index_h:1;          /* Interrupt index bit 15 */
+        uint8_t sub_valid:1;        /* SHV: Sub-Handle Valid bit */
+        uint8_t int_mode:1;         /* Interrupt format */
+        uint16_t index_l:15;        /* Interrupt index bit 14-0 */
+        uint16_t __head:12;         /* Should always be: 0x0fee */
+    } QEMU_PACKED;
+    uint32_t data;
+};
+
+/* When IR is enabled, all MSI/MSI-X data bits should be zero */
+#define VTD_IR_MSI_DATA          (0)
+
 /* The iommu (DMAR) device state struct */
 struct IntelIOMMUState {
     SysBusDevice busdev;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 08/16] intel_iommu: provide helper function vtd_get_iommu
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (6 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 07/16] intel_iommu: define several structs for IOMMU IR Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 09/16] intel_iommu: add IR translation faults defines Peter Xu
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Moves acpi_get_iommu() under VT-d to make it a public function.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/acpi-build.c          |  7 +------
 hw/i386/intel_iommu.c         | 13 +++++++++++++
 include/hw/i386/intel_iommu.h |  2 ++
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 5d2d87b..b064bc2 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2677,12 +2677,7 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
 
 static bool acpi_has_iommu(void)
 {
-    bool ambiguous;
-    Object *intel_iommu;
-
-    intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
-                                           &ambiguous);
-    return intel_iommu && !ambiguous;
+    return !!vtd_iommu_get();
 }
 
 static
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4d14124..a44289f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2001,6 +2001,19 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
     return vtd_dev_as;
 }
 
+IntelIOMMUState *vtd_iommu_get(void)
+{
+    bool ambiguous = false;
+    Object *intel_iommu = NULL;
+
+    intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
+                                 &ambiguous);
+    if (ambiguous)
+        intel_iommu = NULL;
+
+    return (IntelIOMMUState *)intel_iommu;
+}
+
 /* Do the initialization. It will also be called when reset, so pay
  * attention when adding new initialization stuff.
  */
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 4914fe6..9ee84f7 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -196,5 +196,7 @@ struct IntelIOMMUState {
  * create a new one if none exists
  */
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn);
+/* Get default IOMMU object */
+IntelIOMMUState *vtd_iommu_get(void);
 
 #endif
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 09/16] intel_iommu: add IR translation faults defines
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (7 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 08/16] intel_iommu: provide helper function vtd_get_iommu Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 10/16] intel_iommu: Add support for PCI MSI remap Peter Xu
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Adding translation fault definitions for interrupt remapping. Please
refer to VT-d spec section 7.1.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu_internal.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 309833f..2a9987f 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -271,6 +271,19 @@ typedef enum VTDFaultReason {
      * context-entry.
      */
     VTD_FR_CONTEXT_ENTRY_TT,
+
+    /* Interrupt remapping transition faults */
+    VTD_FR_IR_REQ_RSVD = 0x20, /* One or more IR request reserved
+                                * fields set */
+    VTD_FR_IR_INDEX_OVER = 0x21, /* Index value greater than max */
+    VTD_FR_IR_ENTRY_P = 0x22,    /* Present (P) not set in IRTE */
+    VTD_FR_IR_ROOT_INVAL = 0x23, /* IR Root table invalid */
+    VTD_FR_IR_IRTE_RSVD = 0x24,  /* IRTE Rsvd field non-zero with
+                                  * Present flag set */
+    VTD_FR_IR_REQ_COMPAT = 0x25, /* Encountered compatible IR
+                                  * request while disabled */
+    VTD_FR_IR_SID_ERR = 0x26,   /* Invalid Source-ID */
+
     /* This is not a normal fault reason. We use this to indicate some faults
      * that are not referenced by the VT-d specification.
      * Fault event with such reason should not be recorded.
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 10/16] intel_iommu: Add support for PCI MSI remap
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (8 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 09/16] intel_iommu: add IR translation faults defines Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 11/16] q35: ioapic: add support for emulated IOAPIC IR Peter Xu
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

This patch enables interrupt remapping for PCI devices.

To play the trick, one memory region "iommu_ir" is added as child region
of the original iommu memory region, covering range 0xfeeXXXXX (which is
the address range for APIC). All the writes to this range will be taken
as MSI, and translation is carried out only when IR is enabled.

Idea suggested by Paolo Bonzini.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c          | 232 +++++++++++++++++++++++++++++++++++++++++
 hw/i386/intel_iommu_internal.h |   2 +
 include/hw/i386/intel_iommu.h  |  52 +++++++++
 3 files changed, 286 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a44289f..58f6064 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1969,6 +1969,233 @@ static Property vtd_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+/* Read IRTE entry with specific index */
+static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
+                        VTD_IRTE *entry)
+{
+    dma_addr_t addr = 0x00;
+
+    addr = iommu->intr_root + index * sizeof(*entry);
+    if (dma_memory_read(&address_space_memory, addr, entry,
+                        sizeof(*entry))) {
+        VTD_DPRINTF(GENERAL, "error: fail to access IR root at 0x%"PRIx64
+                    " + %"PRIu16, iommu->intr_root, index);
+        return -VTD_FR_IR_ROOT_INVAL;
+    }
+
+    if (!entry->present) {
+        VTD_DPRINTF(GENERAL, "error: present flag not set in IRTE"
+                    " entry index %u value 0x%"PRIx64 " 0x%"PRIx64,
+                    index, le64_to_cpu(entry->data[1]),
+                    le64_to_cpu(entry->data[0]));
+        return -VTD_FR_IR_ENTRY_P;
+    }
+
+    if (entry->__reserved_0 || entry->__reserved_1 || \
+        entry->__reserved_2) {
+        VTD_DPRINTF(GENERAL, "error: IRTE entry index %"PRIu16
+                    " reserved fields non-zero: 0x%"PRIx64 " 0x%"PRIx64,
+                    index, le64_to_cpu(entry->data[1]),
+                    le64_to_cpu(entry->data[0]));
+        return -VTD_FR_IR_IRTE_RSVD;
+    }
+
+    /*
+     * TODO: Check Source-ID corresponds to SVT (Source Validation
+     * Type) bits
+     */
+
+    return 0;
+}
+
+/* Fetch IRQ information of specific IR index */
+static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index, VTDIrq *irq)
+{
+    VTD_IRTE irte;
+    int ret = 0;
+
+    bzero(&irte, sizeof(irte));
+
+    ret = vtd_irte_get(iommu, index, &irte);
+    if (ret) {
+        return ret;
+    }
+
+    irq->trigger_mode = irte.trigger_mode;
+    irq->vector = irte.vector;
+    irq->delivery_mode = irte.delivery_mode;
+    /* Not support EIM yet: please refer to vt-d 9.10 DST bits */
+#define  VTD_IR_APIC_DEST_MASK         (0xff00ULL)
+#define  VTD_IR_APIC_DEST_SHIFT        (8)
+    irq->dest = (irte.dest_id & VTD_IR_APIC_DEST_MASK) >> \
+        VTD_IR_APIC_DEST_SHIFT;
+    irq->dest_mode = irte.dest_mode;
+    irq->redir_hint = irte.redir_hint;
+
+    VTD_DPRINTF(IR, "remapping interrupt index %d: trig:%u,vec:%u,"
+                "deliver:%u,dest:%u,dest_mode:%u", index,
+                irq->trigger_mode, irq->vector, irq->delivery_mode,
+                irq->dest, irq->dest_mode);
+
+    return 0;
+}
+
+/* Generate one MSI message from VTDIrq info */
+static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
+{
+    VTD_MSIMessage msg = {};
+
+    /* Generate address bits */
+    msg.dest_mode = irq->dest_mode;
+    msg.redir_hint = irq->redir_hint;
+    msg.dest = irq->dest;
+    msg.__addr_head = 0xfee;
+    /* Keep this from original MSI address bits */
+    msg.__not_used = irq->msi_addr_last_bits;
+
+    /* Generate data bits */
+    msg.vector = irq->vector;
+    msg.delivery_mode = irq->delivery_mode;
+    msg.level = 1;
+    msg.trigger_mode = irq->trigger_mode;
+
+    msg_out->address = msg.msi_addr;
+    msg_out->data = msg.msi_data;
+}
+
+/* Interrupt remapping for MSI/MSI-X entry */
+static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
+                                   MSIMessage *origin,
+                                   MSIMessage *translated)
+{
+    int ret = 0;
+    VTD_IR_MSIAddress addr;
+    uint16_t index = 0;
+    VTDIrq irq = {0};
+
+    assert(origin && translated);
+
+    if (!iommu || !iommu->intr_enabled) {
+        goto do_not_translate;
+    }
+
+    if (origin->address & VTD_MSI_ADDR_HI_MASK) {
+        VTD_DPRINTF(GENERAL, "error: MSI addr high 32 bits nonzero"
+                    " during interrupt remapping: 0x%"PRIx32,
+                    (uint32_t)((origin->address & VTD_MSI_ADDR_HI_MASK) >> \
+                    VTD_MSI_ADDR_HI_SHIFT));
+        return -VTD_FR_IR_REQ_RSVD;
+    }
+
+    addr.data = origin->address & VTD_MSI_ADDR_LO_MASK;
+    if (addr.__head != 0xfee) {
+        VTD_DPRINTF(GENERAL, "error: MSI addr low 32 bits invalid: "
+                    "0x%"PRIx32, addr.data);
+        return -VTD_FR_IR_REQ_RSVD;
+    }
+
+    /* This is compatible mode. */
+    if (addr.int_mode != VTD_IR_INT_FORMAT_REMAP) {
+        goto do_not_translate;
+    }
+
+    if (addr.sub_valid == 1) {
+        VTD_DPRINTF(IR, "received MSI interrupt");
+        if (origin->data) {
+            VTD_DPRINTF(GENERAL, "error: MSI data bits non-zero for "
+                        "interrupt remappable entry: 0x%"PRIx32,
+                        origin->data);
+            return -VTD_FR_IR_REQ_RSVD;
+        }
+    } else {
+        VTD_DPRINTF(IR, "received IOAPIC interrupt");
+        /* TODO: IOAPIC interrupt check. */
+    }
+
+    index = addr.index_h << 15 | addr.index_l;
+
+    ret = vtd_remap_irq_get(iommu, index, &irq);
+    if (ret) {
+        return ret;
+    }
+
+    /*
+     * We'd better keep the last two bits, assuming that guest OS
+     * might modify it. Keep it does not hurt after all.
+     */
+    irq.msi_addr_last_bits = addr.__not_care;
+
+    /* Translate VTDIrq to MSI message */
+    vtd_generate_msi_message(&irq, translated);
+
+    VTD_DPRINTF(IR, "mapping MSI 0x%"PRIx64":0x%"PRIx32 " -> "
+                "0x%"PRIx64":0x%"PRIx32, origin->address, origin->data,
+                translated->address, translated->data);
+    return 0;
+
+do_not_translate:
+    memcpy(translated, origin, sizeof(*origin));
+    return 0;
+}
+
+static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size)
+{
+    uint64_t data = 0;
+
+    addr += VTD_INTERRUPT_ADDR_FIRST;
+
+    VTD_DPRINTF(IR, "read mem_ir addr 0x%"PRIx64 " size %u",
+                addr, size);
+
+    if (dma_memory_read(&address_space_memory, addr, &data, size)) {
+        VTD_DPRINTF(GENERAL, "error: fail to access 0x%"PRIx64, addr);
+        return (uint32_t) -1;
+    }
+
+    return data;
+}
+
+static void vtd_mem_ir_write(void *opaque, hwaddr addr,
+                             uint64_t value, unsigned size)
+{
+    int ret = 0;
+    MSIMessage from = {0}, to = {0};
+
+    from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
+    from.data = (uint32_t) value;
+
+    ret = vtd_interrupt_remap_msi(opaque, &from, &to);
+    if (ret) {
+        /* TODO: report error */
+        VTD_DPRINTF(GENERAL, "int remap fail for addr 0x%"PRIx64
+                    " data 0x%"PRIx32, from.address, from.data);
+        /* Drop this interrupt */
+        return;
+    }
+
+    VTD_DPRINTF(IR, "delivering MSI 0x%"PRIx64":0x%"PRIx32,
+                to.address, to.data);
+
+    if (dma_memory_write(&address_space_memory, to.address,
+                         &to.data, size)) {
+        VTD_DPRINTF(GENERAL, "error: fail to write 0x%"PRIx64
+                    " value 0x%"PRIx32, to.address, to.data);
+    }
+}
+
+static const MemoryRegionOps vtd_mem_ir_ops = {
+    .read = vtd_mem_ir_read,
+    .write = vtd_mem_ir_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
 
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
 {
@@ -1995,6 +2222,11 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
         vtd_dev_as->context_cache_entry.context_cache_gen = 0;
         memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
                                  &s->iommu_ops, "intel_iommu", UINT64_MAX);
+        memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
+                              &vtd_mem_ir_ops, s, "intel_iommu_ir",
+                              VTD_INTERRUPT_ADDR_SIZE);
+        memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR_FIRST,
+                                    &vtd_dev_as->iommu_ir);
         address_space_init(&vtd_dev_as->as,
                            &vtd_dev_as->iommu, "intel_iommu");
     }
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 2a9987f..e1a08cb 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -110,6 +110,8 @@
 /* Interrupt Address Range */
 #define VTD_INTERRUPT_ADDR_FIRST    0xfee00000ULL
 #define VTD_INTERRUPT_ADDR_LAST     0xfeefffffULL
+#define VTD_INTERRUPT_ADDR_SIZE     (VTD_INTERRUPT_ADDR_LAST - \
+                                     VTD_INTERRUPT_ADDR_FIRST + 1)
 
 /* The shift of source_id in the key of IOTLB hash table */
 #define VTD_IOTLB_SID_SHIFT         36
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 9ee84f7..5945670 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -23,6 +23,8 @@
 #define INTEL_IOMMU_H
 #include "hw/qdev.h"
 #include "sysemu/dma.h"
+#include "hw/i386/ioapic.h"
+#include "hw/pci/msi.h"
 
 #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
 #define INTEL_IOMMU_DEVICE(obj) \
@@ -46,6 +48,10 @@
 
 #define DMAR_REPORT_F_INTR          (1)
 
+#define  VTD_MSI_ADDR_HI_MASK        (0xffffffff00000000ULL)
+#define  VTD_MSI_ADDR_HI_SHIFT       (32)
+#define  VTD_MSI_ADDR_LO_MASK        (0x00000000ffffffffULL)
+
 typedef struct VTDContextEntry VTDContextEntry;
 typedef struct VTDContextCacheEntry VTDContextCacheEntry;
 typedef struct IntelIOMMUState IntelIOMMUState;
@@ -55,6 +61,8 @@ typedef struct VTDBus VTDBus;
 typedef union VTD_IRTE VTD_IRTE;
 typedef union VTD_IR_IOAPICEntry VTD_IR_IOAPICEntry;
 typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
+typedef struct VTDIrq VTDIrq;
+typedef struct VTD_MSIMessage VTD_MSIMessage;
 
 /* Context-Entry */
 struct VTDContextEntry {
@@ -75,6 +83,7 @@ struct VTDAddressSpace {
     uint8_t devfn;
     AddressSpace as;
     MemoryRegion iommu;
+    MemoryRegion iommu_ir;      /* Interrupt region: 0xfeeXXXXX */
     IntelIOMMUState *iommu_state;
     VTDContextCacheEntry context_cache_entry;
 };
@@ -116,6 +125,9 @@ union VTD_IRTE {
     uint64_t data[2];
 };
 
+#define VTD_IR_INT_FORMAT_COMPAT     (0) /* Compatible Interrupt */
+#define VTD_IR_INT_FORMAT_REMAP      (1) /* Remappable Interrupt */
+
 /* Programming format for IOAPIC table entries */
 union VTD_IR_IOAPICEntry {
     struct {
@@ -147,6 +159,46 @@ union VTD_IR_MSIAddress {
     uint32_t data;
 };
 
+/* Generic IRQ entry information */
+struct VTDIrq {
+    /* Used by both IOAPIC/MSI interrupt remapping */
+    uint8_t trigger_mode;
+    uint8_t vector;
+    uint8_t delivery_mode;
+    uint32_t dest;
+    uint8_t dest_mode;
+
+    /* only used by MSI interrupt remapping */
+    uint8_t redir_hint;
+    uint8_t msi_addr_last_bits;
+};
+
+struct VTD_MSIMessage {
+    union {
+        struct {
+            uint16_t __not_used:2;
+            uint16_t dest_mode:1;
+            uint16_t redir_hint:1;
+            uint16_t __reserved:8;
+            uint16_t dest:8;
+            uint16_t __addr_head:12; /* 0xfee */
+            uint32_t __addr_hi:32;
+        } QEMU_PACKED;
+        uint64_t msi_addr;
+    };
+    union {
+        struct {
+            uint16_t vector:8;
+            uint16_t delivery_mode:3;
+            uint16_t __resved:3;
+            uint16_t level:1;
+            uint16_t trigger_mode:1;
+            uint16_t __resved1:16;
+        } QEMU_PACKED;
+        uint32_t msi_data;
+    };
+};
+
 /* When IR is enabled, all MSI/MSI-X data bits should be zero */
 #define VTD_IR_MSI_DATA          (0)
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 11/16] q35: ioapic: add support for emulated IOAPIC IR
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (9 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 10/16] intel_iommu: Add support for PCI MSI remap Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 12/16] ioapic: introduce ioapic_entry_parse() helper Peter Xu
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

This patch translates all IOAPIC interrupts into MSI ones. One pseudo
ioapic address space is added to transfer the MSI message. By default,
it will be system memory address space. When IR is enabled, it will be
IOMMU address space.

Currently, only emulated IOAPIC is supported.

Idea suggested by Jan Kiszka and Rita Sinha in the following patch:

https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg01933.html

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/pc.c                      |  3 +++
 hw/intc/ioapic.c                  | 28 ++++++++++++++++++++++++----
 hw/pci-host/q35.c                 |  4 ++++
 include/hw/i386/apic-msidef.h     |  1 +
 include/hw/i386/ioapic_internal.h |  1 +
 include/hw/i386/pc.h              |  4 ++++
 6 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 99437e0..365e82f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1395,6 +1395,9 @@ void pc_memory_init(PCMachineState *pcms,
         rom_add_option(option_rom[i].name, option_rom[i].bootindex);
     }
     pcms->fw_cfg = fw_cfg;
+
+    /* Init default IOAPIC address space */
+    pcms->ioapic_as = &address_space_memory;
 }
 
 qemu_irq pc_allocate_cpu_irq(void)
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 378e663..92334a6 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -28,6 +28,8 @@
 #include "hw/i386/ioapic_internal.h"
 #include "include/hw/pci/msi.h"
 #include "sysemu/kvm.h"
+#include "target-i386/cpu.h"
+#include "hw/i386/apic-msidef.h"
 
 //#define DEBUG_IOAPIC
 
@@ -49,13 +51,15 @@ extern int ioapic_no;
 
 static void ioapic_service(IOAPICCommonState *s)
 {
+    AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
+    uint32_t addr, data;
     uint8_t i;
     uint8_t trig_mode;
     uint8_t vector;
     uint8_t delivery_mode;
     uint32_t mask;
     uint64_t entry;
-    uint8_t dest;
+    uint16_t dest_idx;
     uint8_t dest_mode;
 
     for (i = 0; i < IOAPIC_NUM_PINS; i++) {
@@ -66,7 +70,14 @@ static void ioapic_service(IOAPICCommonState *s)
             entry = s->ioredtbl[i];
             if (!(entry & IOAPIC_LVT_MASKED)) {
                 trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
-                dest = entry >> IOAPIC_LVT_DEST_SHIFT;
+                /*
+                 * By default, this would be dest_id[8] +
+                 * reserved[8]. When IR is enabled, this would be
+                 * interrupt_index[15] + interrupt_format[1]. This
+                 * field never means anything, but only used to
+                 * generate corresponding MSI.
+                 */
+                dest_idx = entry >> IOAPIC_LVT_DEST_IDX_SHIFT;
                 dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
                 delivery_mode =
                     (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
@@ -96,8 +107,17 @@ static void ioapic_service(IOAPICCommonState *s)
 #else
                 (void)coalesce;
 #endif
-                apic_deliver_irq(dest, dest_mode, delivery_mode, vector,
-                                 trig_mode);
+                /* No matter whether IR is enabled, we translate
+                 * the IOAPIC message into a MSI one, and its
+                 * address space will decide whether we need a
+                 * translation. */
+                addr = APIC_DEFAULT_ADDRESS | \
+                    (dest_idx << MSI_ADDR_DEST_IDX_SHIFT) |
+                    (dest_mode << MSI_ADDR_DEST_MODE_SHIFT);
+                data = (vector << MSI_DATA_VECTOR_SHIFT) |
+                    (trig_mode << MSI_DATA_TRIGGER_SHIFT) |
+                    (delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
+                stl_le_phys(ioapic_as, addr, data);
             }
         }
     }
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 70f897e..d32c123 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -437,6 +437,7 @@ static AddressSpace *q35_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
 
 static void mch_init_dmar(MCHPCIState *mch)
 {
+    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
     PCIBus *pci_bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
 
     mch->iommu = INTEL_IOMMU_DEVICE(qdev_create(NULL, TYPE_INTEL_IOMMU_DEVICE));
@@ -446,6 +447,9 @@ static void mch_init_dmar(MCHPCIState *mch)
     sysbus_mmio_map(SYS_BUS_DEVICE(mch->iommu), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
 
     pci_setup_iommu(pci_bus, q35_host_dma_iommu, mch->iommu);
+    /* Pseudo address space under root PCI bus. */
+    pcms->ioapic_as = q35_host_dma_iommu(pci_bus, mch->iommu,
+                                         Q35_PSEUDO_DEVFN_IOAPIC);
 }
 
 static void mch_realize(PCIDevice *d, Error **errp)
diff --git a/include/hw/i386/apic-msidef.h b/include/hw/i386/apic-msidef.h
index 6e2eb71..8b4d4cc 100644
--- a/include/hw/i386/apic-msidef.h
+++ b/include/hw/i386/apic-msidef.h
@@ -25,6 +25,7 @@
 #define MSI_ADDR_REDIRECTION_SHIFT      3
 
 #define MSI_ADDR_DEST_ID_SHIFT          12
+#define MSI_ADDR_DEST_IDX_SHIFT         4
 #define  MSI_ADDR_DEST_ID_MASK          0x00ffff0
 
 #endif /* HW_APIC_MSIDEF_H */
diff --git a/include/hw/i386/ioapic_internal.h b/include/hw/i386/ioapic_internal.h
index 797ed47..d279f2d 100644
--- a/include/hw/i386/ioapic_internal.h
+++ b/include/hw/i386/ioapic_internal.h
@@ -31,6 +31,7 @@
 #define IOAPIC_VERSION                  0x11
 
 #define IOAPIC_LVT_DEST_SHIFT           56
+#define IOAPIC_LVT_DEST_IDX_SHIFT       48
 #define IOAPIC_LVT_MASKED_SHIFT         16
 #define IOAPIC_LVT_TRIGGER_MODE_SHIFT   15
 #define IOAPIC_LVT_REMOTE_IRR_SHIFT     14
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 96f0b66..cde6934 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -72,6 +72,10 @@ struct PCMachineState {
     uint64_t numa_nodes;
     uint64_t *node_mem;
     uint64_t *node_cpu;
+
+    /* Address space used by IOAPIC device. All IOAPIC interrupts
+     * will be translated to MSI messages in the address space. */
+    AddressSpace *ioapic_as;
 };
 
 #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 12/16] ioapic: introduce ioapic_entry_parse() helper
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (10 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 11/16] q35: ioapic: add support for emulated IOAPIC IR Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 13/16] intel_iommu: add support for split irqchip Peter Xu
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Abstract IOAPIC entry parsing logic into a helper function.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/intc/ioapic.c | 110 +++++++++++++++++++++++++++----------------------------
 1 file changed, 54 insertions(+), 56 deletions(-)

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 92334a6..d6e88d5 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -49,18 +49,56 @@ static IOAPICCommonState *ioapics[MAX_IOAPICS];
 /* global variable from ioapic_common.c */
 extern int ioapic_no;
 
+struct ioapic_entry_info {
+    /* fields parsed from IOAPIC entries */
+    uint8_t masked;
+    uint8_t trig_mode;
+    uint16_t dest_idx;
+    uint8_t dest_mode;
+    uint8_t delivery_mode;
+    uint8_t vector;
+
+    /* MSI message generated from above parsed fields */
+    uint32_t addr;
+    uint32_t data;
+};
+
+static void ioapic_entry_parse(uint64_t entry, struct ioapic_entry_info *info)
+{
+    bzero(info, sizeof(*info));
+    info->masked = (entry >> IOAPIC_LVT_MASKED_SHIFT) & 1;
+    info->trig_mode = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
+    /*
+     * By default, this would be dest_id[8] + reserved[8]. When IR
+     * is enabled, this would be interrupt_index[15] +
+     * interrupt_format[1]. This field never means anything, but
+     * only used to generate corresponding MSI.
+     */
+    info->dest_idx = (entry >> IOAPIC_LVT_DEST_IDX_SHIFT) & 0xffff;
+    info->dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
+    info->delivery_mode = (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) \
+        & IOAPIC_DM_MASK;
+    if (info->delivery_mode == IOAPIC_DM_EXTINT) {
+        info->vector = pic_read_irq(isa_pic);
+    } else {
+        info->vector = entry & IOAPIC_VECTOR_MASK;
+    }
+
+    info->addr = APIC_DEFAULT_ADDRESS | \
+        (info->dest_idx << MSI_ADDR_DEST_IDX_SHIFT) | \
+        (info->dest_mode << MSI_ADDR_DEST_MODE_SHIFT);
+    info->data = (info->vector << MSI_DATA_VECTOR_SHIFT) | \
+        (info->trig_mode << MSI_DATA_TRIGGER_SHIFT) | \
+        (info->delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
+}
+
 static void ioapic_service(IOAPICCommonState *s)
 {
     AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
-    uint32_t addr, data;
+    struct ioapic_entry_info info;
     uint8_t i;
-    uint8_t trig_mode;
-    uint8_t vector;
-    uint8_t delivery_mode;
     uint32_t mask;
     uint64_t entry;
-    uint16_t dest_idx;
-    uint8_t dest_mode;
 
     for (i = 0; i < IOAPIC_NUM_PINS; i++) {
         mask = 1 << i;
@@ -68,33 +106,18 @@ static void ioapic_service(IOAPICCommonState *s)
             int coalesce = 0;
 
             entry = s->ioredtbl[i];
-            if (!(entry & IOAPIC_LVT_MASKED)) {
-                trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
-                /*
-                 * By default, this would be dest_id[8] +
-                 * reserved[8]. When IR is enabled, this would be
-                 * interrupt_index[15] + interrupt_format[1]. This
-                 * field never means anything, but only used to
-                 * generate corresponding MSI.
-                 */
-                dest_idx = entry >> IOAPIC_LVT_DEST_IDX_SHIFT;
-                dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
-                delivery_mode =
-                    (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
-                if (trig_mode == IOAPIC_TRIGGER_EDGE) {
+            ioapic_entry_parse(entry, &info);
+            if (!info.masked) {
+                if (info.trig_mode == IOAPIC_TRIGGER_EDGE) {
                     s->irr &= ~mask;
                 } else {
                     coalesce = s->ioredtbl[i] & IOAPIC_LVT_REMOTE_IRR;
                     s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
                 }
-                if (delivery_mode == IOAPIC_DM_EXTINT) {
-                    vector = pic_read_irq(isa_pic);
-                } else {
-                    vector = entry & IOAPIC_VECTOR_MASK;
-                }
+
 #ifdef CONFIG_KVM
                 if (kvm_irqchip_is_split()) {
-                    if (trig_mode == IOAPIC_TRIGGER_EDGE) {
+                    if (info.trig_mode == IOAPIC_TRIGGER_EDGE) {
                         kvm_set_irq(kvm_state, i, 1);
                         kvm_set_irq(kvm_state, i, 0);
                     } else {
@@ -111,13 +134,7 @@ static void ioapic_service(IOAPICCommonState *s)
                  * the IOAPIC message into a MSI one, and its
                  * address space will decide whether we need a
                  * translation. */
-                addr = APIC_DEFAULT_ADDRESS | \
-                    (dest_idx << MSI_ADDR_DEST_IDX_SHIFT) |
-                    (dest_mode << MSI_ADDR_DEST_MODE_SHIFT);
-                data = (vector << MSI_DATA_VECTOR_SHIFT) |
-                    (trig_mode << MSI_DATA_TRIGGER_SHIFT) |
-                    (delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
-                stl_le_phys(ioapic_as, addr, data);
+                stl_le_phys(ioapic_as, info.addr, info.data);
             }
         }
     }
@@ -168,30 +185,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
 
     if (kvm_irqchip_is_split()) {
         for (i = 0; i < IOAPIC_NUM_PINS; i++) {
-            uint64_t entry = s->ioredtbl[i];
-            uint8_t trig_mode;
-            uint8_t delivery_mode;
-            uint8_t dest;
-            uint8_t dest_mode;
-            uint64_t pin_polarity;
             MSIMessage msg;
-
-            trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
-            dest = entry >> IOAPIC_LVT_DEST_SHIFT;
-            dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
-            pin_polarity = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
-            delivery_mode =
-                (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
-
-            msg.address = APIC_DEFAULT_ADDRESS;
-            msg.address |= dest_mode << 2;
-            msg.address |= dest << 12;
-
-            msg.data = entry & IOAPIC_VECTOR_MASK;
-            msg.data |= delivery_mode << APIC_DELIVERY_MODE_SHIFT;
-            msg.data |= pin_polarity << APIC_POLARITY_SHIFT;
-            msg.data |= trig_mode << APIC_TRIG_MODE_SHIFT;
-
+            struct ioapic_entry_info info;
+            ioapic_entry_parse(s->ioredtbl[i], &info);
+            msg.address = info.addr;
+            msg.data = info.data;
             kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
         }
         kvm_irqchip_commit_routes(kvm_state);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 13/16] intel_iommu: add support for split irqchip
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (11 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 12/16] ioapic: introduce ioapic_entry_parse() helper Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 14/16] q35: add "int-remap" flag to enable intr Peter Xu
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

In split irqchip mode, IOAPIC is working in user space, only update
kernel irq routes when entry changed. When IR is enabled, we directly
update the kernel with translated messages. It works just like a kernel
cache for the remapping entries.

Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
long as we can support split irqchip, we will support irqfd as
well. Also, since kernel gsi routes will cache translated interrupts,
irqfd delivery will not suffer from any performance impact due to IR.

And, since we supported irqfd, vhost devices will be able to work
seamlessly with IR now. Logically this should contain both vhost-net and
vhost-user case.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c         |  5 +++++
 include/hw/i386/intel_iommu.h |  2 ++
 target-i386/kvm.c             | 24 ++++++++++++++++++++++++
 3 files changed, 31 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 58f6064..ca7e069 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2138,6 +2138,11 @@ do_not_translate:
     return 0;
 }
 
+int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst)
+{
+    return vtd_interrupt_remap_msi(iommu, src, dst);
+}
+
 static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size)
 {
     uint64_t data = 0;
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 5945670..5910e6f 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -25,6 +25,7 @@
 #include "sysemu/dma.h"
 #include "hw/i386/ioapic.h"
 #include "hw/pci/msi.h"
+#include "hw/sysbus.h"
 
 #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
 #define INTEL_IOMMU_DEVICE(obj) \
@@ -250,5 +251,6 @@ struct IntelIOMMUState {
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn);
 /* Get default IOMMU object */
 IntelIOMMUState *vtd_iommu_get(void);
+int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst);
 
 #endif
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 799fdfa..7a996a2 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -36,6 +36,7 @@
 #include "hw/i386/apic.h"
 #include "hw/i386/apic_internal.h"
 #include "hw/i386/apic-msidef.h"
+#include "hw/i386/intel_iommu.h"
 
 #include "exec/ioport.h"
 #include "standard-headers/asm-x86/hyperv.h"
@@ -3327,6 +3328,29 @@ int kvm_device_msix_deassign(KVMState *s, uint32_t dev_id)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    IntelIOMMUState *iommu = vtd_iommu_get();
+
+    if (iommu) {
+        int ret;
+        MSIMessage src, dst;
+
+        src.address = route->u.msi.address_hi;
+        src.address <<= VTD_MSI_ADDR_HI_SHIFT;
+        src.address |= route->u.msi.address_lo;
+        src.data = route->u.msi.data;
+
+        ret = vtd_int_remap(iommu, &src, &dst);
+        if (ret) {
+            error_report("VT-d Failed to remap interrupt for gsi %d.",
+                         route->gsi);
+            return 1;
+        }
+
+        route->u.msi.address_hi = dst.address >> VTD_MSI_ADDR_HI_SHIFT;
+        route->u.msi.address_lo = dst.address & VTD_MSI_ADDR_LO_MASK;
+        route->u.msi.data = dst.data;
+    }
+
     return 0;
 }
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 14/16] q35: add "int-remap" flag to enable intr
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (12 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 13/16] intel_iommu: add support for split irqchip Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 15/16] intel_iommu: introduce IEC notifiers Peter Xu
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

One flag is added to specify whether to enable INTR for emulated
IOMMU. By default, interrupt remapping is not supportted. To enable it,
we should specify something like:

$ qemu-system-x86_64 -M q35,iommu=on,intr=on

To be more clear, the following command:

$ qemu-system-x86_64 -M q35,iommu=on

Will enable IOMMU only, without interrupt remapping support.

Currently, Intel IOMMU IR only support kernel-irqchip={off|split}. We
need to specify either of it in -M as well.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/core/machine.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 276ad61..b00f39f 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -300,6 +300,20 @@ static void machine_set_iommu(Object *obj, bool value, Error **errp)
     ms->iommu = value;
 }
 
+static bool machine_get_intr(Object *obj, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    return ms->iommu_intr;
+}
+
+static void machine_set_intr(Object *obj, bool value, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    ms->iommu_intr = value;
+}
+
 static void machine_set_suppress_vmdesc(Object *obj, bool value, Error **errp)
 {
     MachineState *ms = MACHINE(obj);
@@ -480,6 +494,12 @@ static void machine_initfn(Object *obj)
     object_property_set_description(obj, "iommu",
                                     "Set on/off to enable/disable Intel IOMMU (VT-d)",
                                     NULL);
+    object_property_add_bool(obj, "intr", machine_get_intr,
+                             machine_set_intr, NULL);
+    object_property_set_description(obj, "intr",
+                                    "Set on/off to enable/disable IOMMU"
+                                    " interrupt remapping",
+                                    NULL);
     object_property_add_bool(obj, "suppress-vmdesc",
                              machine_get_suppress_vmdesc,
                              machine_set_suppress_vmdesc, NULL);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 15/16] intel_iommu: introduce IEC notifiers
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (13 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 14/16] q35: add "int-remap" flag to enable intr Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 16/16] ioapic: register VT-d IEC invalidate notifier Peter Xu
  2016-04-25  5:16 ` [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Jan Kiszka
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

This patch introduces Intel VT-d IEC (Interrupt Entry Cache)
invalidation notifier list. When vIOMMU receives IEC invalidate request,
all the registered units will be notified with specific invalidation
requests.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c          | 52 ++++++++++++++++++++++++++++++++++++------
 hw/i386/intel_iommu_internal.h | 24 +++++++++++++++----
 include/hw/i386/intel_iommu.h  | 22 ++++++++++++++++++
 3 files changed, 87 insertions(+), 11 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ca7e069..1082ab5 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -902,12 +902,18 @@ static void vtd_root_table_setup(IntelIOMMUState *s)
 
 static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
 {
+    VTD_IEC_Notifier *notifier;
     uint64_t value = 0;
     value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
     s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
     s->intr_root = value & VTD_IRTA_ADDR_MASK;
 
-    /* TODO: invalidate interrupt entry cache */
+    QLIST_FOREACH(notifier, &s->iec_notifiers, list) {
+        if (notifier->iec_notify) {
+            /* Notify global invalidation */
+            notifier->iec_notify(notifier->private, true, 0, 0);
+        }
+    }
 
     VTD_DPRINTF(CSR, "int remap table addr 0x%"PRIx64 " size %"PRIu32,
                 s->intr_root, s->intr_size);
@@ -1409,6 +1415,28 @@ static bool vtd_process_iotlb_desc(IntelIOMMUState *s, VTDInvDesc *inv_desc)
     return true;
 }
 
+static bool vtd_process_inv_iec_desc(IntelIOMMUState *s,
+                                     VTDInvDesc *inv_desc)
+{
+    VTD_IEC_Notifier *notifier;
+
+    VTD_DPRINTF(INV, "inv ir glob %d index %d mask %d",
+                inv_desc->iec.granularity,
+                inv_desc->iec.index,
+                inv_desc->iec.index_mask);
+
+    QLIST_FOREACH(notifier, &s->iec_notifiers, list) {
+        if (notifier->iec_notify) {
+            notifier->iec_notify(notifier->private,
+                                 inv_desc->iec.granularity,
+                                 inv_desc->iec.index,
+                                 inv_desc->iec.index_mask);
+        }
+    }
+
+    return true;
+}
+
 static bool vtd_process_inv_desc(IntelIOMMUState *s)
 {
     VTDInvDesc inv_desc;
@@ -1449,12 +1477,12 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s)
         break;
 
     case VTD_INV_DESC_IEC:
-        VTD_DPRINTF(INV, "Interrupt Entry Cache Invalidation "
-                    "not implemented yet");
-        /*
-         * Since currently we do not cache interrupt entries, we can
-         * just mark this descriptor as "good" and move on.
-         */
+        VTD_DPRINTF(INV, "Invalidation Interrupt Entry Cache "
+                    "Descriptor hi 0x%"PRIx64 " lo 0x%"PRIx64,
+                    inv_desc.hi, inv_desc.lo);
+        if (!vtd_process_inv_iec_desc(s, &inv_desc)) {
+            return false;
+        }
         break;
 
     default:
@@ -2202,6 +2230,15 @@ static const MemoryRegionOps vtd_mem_ir_ops = {
     },
 };
 
+void vtd_iec_register_notifier(IntelIOMMUState *s, vtd_iec_notify_fn fn,
+                               void *data)
+{
+    VTD_IEC_Notifier *notifier = g_new0(VTD_IEC_Notifier, 1);
+    notifier->iec_notify = fn;
+    notifier->private = data;
+    QLIST_INSERT_HEAD(&s->iec_notifiers, notifier, list);
+}
+
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
 {
     uintptr_t key = (uintptr_t)bus;
@@ -2364,6 +2401,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
                                      g_free, g_free);
     s->vtd_as_by_busptr = g_hash_table_new_full(vtd_uint64_hash, vtd_uint64_equal,
                                               g_free, g_free);
+    QLIST_INIT(&s->iec_notifiers);
     vtd_init(s);
 }
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index e1a08cb..10c20fe 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -296,12 +296,28 @@ typedef enum VTDFaultReason {
 
 #define VTD_CONTEXT_CACHE_GEN_MAX       0xffffffffUL
 
+/* Interrupt Entry Cache Invalidation Descriptor: VT-d 6.5.2.7. */
+struct VTDInvDescIEC {
+    uint32_t type:4;            /* Should always be 0x4 */
+    uint32_t granularity:1;     /* If set, it's global IR invalidation */
+    uint32_t resved_1:22;
+    uint32_t index_mask:5;      /* 2^N for continuous int invalidation */
+    uint32_t index:16;          /* Start index to invalidate */
+    uint32_t reserved_2:16;
+};
+typedef struct VTDInvDescIEC VTDInvDescIEC;
+
 /* Queued Invalidation Descriptor */
-struct VTDInvDesc {
-    uint64_t lo;
-    uint64_t hi;
+union VTDInvDesc {
+    struct {
+        uint64_t lo;
+        uint64_t hi;
+    };
+    union {
+        VTDInvDescIEC iec;
+    };
 };
-typedef struct VTDInvDesc VTDInvDesc;
+typedef union VTDInvDesc VTDInvDesc;
 
 /* Masks for struct VTDInvDesc */
 #define VTD_INV_DESC_TYPE               0xf
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 5910e6f..d7613af 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -203,6 +203,24 @@ struct VTD_MSIMessage {
 /* When IR is enabled, all MSI/MSI-X data bits should be zero */
 #define VTD_IR_MSI_DATA          (0)
 
+/**
+ * iec_notifier - IEC (Interrupt Entry Cache) notifier, triggered
+ *                when IR hhinvalidation happens.
+ * @private: private data
+ * @global: whether this is a global IEC invalidation
+ * @index: IRTE index to invalidate (start from)
+ * @mask: invalidation mask
+ */
+typedef void (*vtd_iec_notify_fn)(void *private, bool global,
+                                  uint32_t index, uint32_t mask);
+
+struct VTD_IEC_Notifier {
+    vtd_iec_notify_fn iec_notify;
+    void *private;
+    QLIST_ENTRY(VTD_IEC_Notifier) list;
+};
+typedef struct VTD_IEC_Notifier VTD_IEC_Notifier;
+
 /* The iommu (DMAR) device state struct */
 struct IntelIOMMUState {
     SysBusDevice busdev;
@@ -243,6 +261,7 @@ struct IntelIOMMUState {
     bool intr_enabled;              /* Whether guest enabled IR */
     dma_addr_t intr_root;           /* Interrupt remapping table pointer */
     uint32_t intr_size;             /* Number of IR table entries */
+    QLIST_HEAD(, VTD_IEC_Notifier) iec_notifiers; /* IEC notify list */
 };
 
 /* Find the VTD Address space associated with the given bus pointer,
@@ -252,5 +271,8 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn);
 /* Get default IOMMU object */
 IntelIOMMUState *vtd_iommu_get(void);
 int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst);
+/* Register IEC invalidate notifier */
+void vtd_iec_register_notifier(IntelIOMMUState *s, vtd_iec_notify_fn fn,
+                               void *data);
 
 #endif
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH v4 16/16] ioapic: register VT-d IEC invalidate notifier
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (14 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 15/16] intel_iommu: introduce IEC notifiers Peter Xu
@ 2016-04-19  8:38 ` Peter Xu
  2016-04-25  5:16 ` [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Jan Kiszka
  16 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-19  8:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	jan.kiszka, rkrcmar, alex.williamson, wexu, peterx

Let IOAPIC the first consumer of VT-d IEC invalidation notifiers. This
is only used for split irqchip case, when VT-d receives IR invalidation
requests, IOAPIC will be notified to update kernel irq routes. For
simplicity, we just update all IOAPIC routes, even if the invalidated
entries are not IOAPIC ones.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/intc/ioapic.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index d6e88d5..b41ab89 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -30,6 +30,7 @@
 #include "sysemu/kvm.h"
 #include "target-i386/cpu.h"
 #include "hw/i386/apic-msidef.h"
+#include "hw/i386/intel_iommu.h"
 
 //#define DEBUG_IOAPIC
 
@@ -197,6 +198,14 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
 #endif
 }
 
+static void ioapic_iec_notifier(void *private, bool global,
+                                uint32_t index, uint32_t mask)
+{
+    IOAPICCommonState *s = (IOAPICCommonState *)private;
+    /* For simplicity, we just update all the routes */
+    ioapic_update_kvm_routes(s);
+}
+
 void ioapic_eoi_broadcast(int vector)
 {
     IOAPICCommonState *s;
@@ -330,6 +339,18 @@ static void ioapic_realize(DeviceState *dev, Error **errp)
     qdev_init_gpio_in(dev, ioapic_set_irq, IOAPIC_NUM_PINS);
 
     ioapics[ioapic_no] = s;
+
+#ifdef CONFIG_KVM
+    if (kvm_irqchip_is_split()) {
+        IntelIOMMUState *iommu = vtd_iommu_get();
+        if (iommu) {
+            /* Register this IOAPIC with IOMMU IEC notifier, so that
+             * when there are IR invalidates, we can be notified to
+             * update kernel IR cache. */
+            vtd_iec_register_notifier(iommu, ioapic_iec_notifier, s);
+        }
+    }
+#endif
 }
 
 static void ioapic_class_init(ObjectClass *klass, void *data)
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
                   ` (15 preceding siblings ...)
  2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 16/16] ioapic: register VT-d IEC invalidate notifier Peter Xu
@ 2016-04-25  5:16 ` Jan Kiszka
  2016-04-25  7:18   ` Peter Xu
  16 siblings, 1 reply; 43+ messages in thread
From: Jan Kiszka @ 2016-04-25  5:16 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
	rkrcmar, alex.williamson, wexu

On 2016-04-19 10:38, Peter Xu wrote:
> Jan, Michael,
> 
> Still haven't got your response from v3 review comments, but I
> decided to move on to provide a workable version first (v3 is
> missing the first patch, so it is not working). Also, misc issues
> are addressed from Radim. Anyway, I am always ready for v5 and
> further. :)
> 
> Github branch for your reference in case needed:
> 
>   https://github.com/xzpeter/qemu vtd-intr-v4
> 
> Thanks,
> 
> v4 changes (all patch number corresponds to v3):
> - add one patch at the start of v3 series: I missed to send the
>   first patch in v3. adding it in. [Jan]
> - patch 9: add support for compatible mode (no reason not to support
>   it, if not, we will get some warnings when using split irqchip)
> - patch 11: further simplify ioapic_update_kvm_routes() using the
>   helper function.
> - patch 12: tweak on kvm_arch_fixup_msi_route() rather than
>   ioapic_update_kvm_routes() only. [Radim]
> - add patch 15: introduce IEC (Interrupt Entry Cache) invalidation
>   notifier list. We can register to this list if we want to be
>   notified when we got IR invalidation requests [Radim]
> - add patch 16: let IOAPIC the first consumer for the above IEC
>   notifier list. [Radim]
> - several other trivial fixes (like moving some defines from .c to
>   .h, moving several lines of changes from one patch to another to
>   make it make more sense, etc.)
> 
> v3 changes (all patch numbers corresponds to v2):
> - patch 1 (-> v3 patch 13)
>   - move to the end of series [Alex]
> - patch 10 (dropped)
>   - drop this one, since re-worked on IOAPIC support, so we do not
>     need this any more.
> - patch 12 (-> v3 patch 10)
>   - leverage MSI path for IOAPIC IR [Jan]
> - patch 13 (v3 -> patch 9)
>   - remove vtd_interrupt_remap_msi() declaration by reordering the
>     functions [mst]
>   - vtd_generate_msi_message(): init msg using {}, remove FIXME
>     [mst]
> - new patches
>   - v3 patch 11: introduce ioapic_entry_parse() helper function
>   - v3 patch 12: add support for kernel-irqchip=split. This needs
>     more reviews, logically this should enable lots of things:
> 	splitted irqchip, irqfd, vhost, and irqfd support for
> 	passthrough devices (not tested). Please refer to the patch for
> 	more information.
> 
> v2 changes:
> - patch 1
>   - rename "int_remap" to "intr" in several places [Marcel]
>   - remove "Intel" specific words in desc or commit message, prepare
>     itself with further AMD support [Marcel]
>   - avoid using object_property_get_bool() [Marcel]
> - patch 5
>   - use PCI bus number 0xff rather than 0xf0 for the IOAPIC scope
>     definition. (please let me know if anyone knows how I can avoid
> 	user using PCI bus number 0xff... TIA)
> - patch 11
>   - fix comments [Marcel]
> - all
>   - remove intr_supported variable [Marcel]
> 
> This patchset provide very basic functionalities for interrupt
> remapping (IR) support of the emulated Intel IOMMU device.
> 
> By default, IR is disabled to be better compatible with current
> QEMU. To enable IR, we can using the following command to boot a
> IR-supported VM with virtio-net device with vhost (still do not
> support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
> here):
> 
> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \

"intr" sounds a bit too much like "interrupt", not "interrupt
remapping". Why not use the kernel's form, "intremap"?

>      -enable-kvm -m 1024 \
> 	 -netdev tap,id=net0,vhost=on \
>      -device virtio-net-pci,netdev=user.0 \
>      -monitor telnet::3333,server,nowait \
> 	 /var/lib/libvirt/images/vm1.qcow2
> 
> When guest boots, we can verify whether IR enabled by grepping the
> dmesg like:
> 
> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
> 
> Currently supported devices:
> 
> - Emulated/Splitted irqchip
> - Generic PCI Devices
> - vhost devices
> - pass through device support? Not tested, but suppose it should work.

I've tested this series against my Jailhouse setup, and it works pretty
well! Actually considering to move my test setup over this branch.

However, split irqchip still has some issues: When I boot a q35 machine
with Linux, the e1000 network adapter only gets a single IRQ delivered.
Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
didn't debug this in details yet.

> 
> TODO List:
> 
> - kvm-ioapic support (?)
> - EIM support

That should be fairly easy, I already played with it (hack in EIM cap,
change vtd_remap_irq_get, assuming EIME would be set). However, it
depends on split irqchip to work properly (there is no x2apic in
userspace APIC), and that is not yet the case.

> - IR fault reporting

Would be welcome! I found a "test case" yesterday: misconfigured IOAPIC
ID blocked its IRQs under Jailhouse, and I first had to enable tracing
to realize it ;).

Jan

> - source-id validation for IRTE
> - IRTE cache and better queued invalidation
> - migration support (for IOMMU as general?)
> - more?
> 
> Peter Xu (16):
>   acpi: enable INTR for DMAR report structure
>   intel_iommu: allow queued invalidation for IR
>   intel_iommu: set IR bit for ECAP register
>   acpi: add DMAR scope definition for root IOAPIC
>   intel_iommu: define interrupt remap table addr register
>   intel_iommu: handle interrupt remap enable
>   intel_iommu: define several structs for IOMMU IR
>   intel_iommu: provide helper function vtd_get_iommu
>   intel_iommu: add IR translation faults defines
>   intel_iommu: Add support for PCI MSI remap
>   q35: ioapic: add support for emulated IOAPIC IR
>   ioapic: introduce ioapic_entry_parse() helper
>   intel_iommu: add support for split irqchip
>   q35: add "int-remap" flag to enable intr
>   intel_iommu: introduce IEC notifiers
>   ioapic: register VT-d IEC invalidate notifier
> 
>  hw/core/machine.c                 |  22 +++
>  hw/i386/acpi-build.c              |  36 ++--
>  hw/i386/intel_iommu.c             | 376 +++++++++++++++++++++++++++++++++++++-
>  hw/i386/intel_iommu_internal.h    |  47 ++++-
>  hw/i386/pc.c                      |   3 +
>  hw/intc/ioapic.c                  | 125 ++++++++-----
>  hw/pci-host/q35.c                 |   4 +
>  include/hw/acpi/acpi-defs.h       |  15 ++
>  include/hw/boards.h               |   1 +
>  include/hw/i386/apic-msidef.h     |   1 +
>  include/hw/i386/intel_iommu.h     | 145 +++++++++++++++
>  include/hw/i386/ioapic_internal.h |   1 +
>  include/hw/i386/pc.h              |   4 +
>  include/hw/pci-host/q35.h         |   9 +
>  target-i386/kvm.c                 |  24 +++
>  15 files changed, 754 insertions(+), 59 deletions(-)
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-25  5:16 ` [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Jan Kiszka
@ 2016-04-25  7:18   ` Peter Xu
  2016-04-25  7:24     ` Jan Kiszka
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Xu @ 2016-04-25  7:18 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
> On 2016-04-19 10:38, Peter Xu wrote:

[...]

> > By default, IR is disabled to be better compatible with current
> > QEMU. To enable IR, we can using the following command to boot a
> > IR-supported VM with virtio-net device with vhost (still do not
> > support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
> > here):
> > 
> > $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
> 
> "intr" sounds a bit too much like "interrupt", not "interrupt
> remapping". Why not use the kernel's form, "intremap"?

Sure. It sounds nice to be aligned with the kernel one. Let me take
it in v5.

> 
> >      -enable-kvm -m 1024 \
> > 	 -netdev tap,id=net0,vhost=on \
> >      -device virtio-net-pci,netdev=user.0 \
> >      -monitor telnet::3333,server,nowait \
> > 	 /var/lib/libvirt/images/vm1.qcow2
> > 
> > When guest boots, we can verify whether IR enabled by grepping the
> > dmesg like:
> > 
> > [root@localhost ~]# journalctl -k | grep "DMAR-IR"
> > Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
> > Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
> > 
> > Currently supported devices:
> > 
> > - Emulated/Splitted irqchip
> > - Generic PCI Devices
> > - vhost devices
> > - pass through device support? Not tested, but suppose it should work.
> 
> I've tested this series against my Jailhouse setup, and it works pretty
> well! Actually considering to move my test setup over this branch.

This is really encouraging feedback! Btw, thanks for all kinds of
help on this patchset. :-)

> 
> However, split irqchip still has some issues: When I boot a q35 machine
> with Linux, the e1000 network adapter only gets a single IRQ delivered.
> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
> didn't debug this in details yet.

I reproduced this problem. It seems that it fails even with
kernel-irqchip=off. Will try to dig it out.

> 
> > 
> > TODO List:
> > 
> > - kvm-ioapic support (?)
> > - EIM support
> 
> That should be fairly easy, I already played with it (hack in EIM cap,
> change vtd_remap_irq_get, assuming EIME would be set). However, it
> depends on split irqchip to work properly (there is no x2apic in
> userspace APIC), and that is not yet the case.

That's cool. Never tried it though. Anyway, will leave x2apic
related work for Radim. :)

> 
> > - IR fault reporting
> 
> Would be welcome! I found a "test case" yesterday: misconfigured IOAPIC
> ID blocked its IRQs under Jailhouse, and I first had to enable tracing
> to realize it ;).

Yes, it sounds nice to have guest side feedback on IR faults. Will
do more reading, and see whether I can add one more patch in v5 to
do this.

Thanks!

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-25  7:18   ` Peter Xu
@ 2016-04-25  7:24     ` Jan Kiszka
  2016-04-25 16:38       ` Radim Krčmář
  2016-04-26  7:34       ` Peter Xu
  0 siblings, 2 replies; 43+ messages in thread
From: Jan Kiszka @ 2016-04-25  7:24 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On 2016-04-25 09:18, Peter Xu wrote:
> On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
>> On 2016-04-19 10:38, Peter Xu wrote:
> 
> [...]
> 
>>> By default, IR is disabled to be better compatible with current
>>> QEMU. To enable IR, we can using the following command to boot a
>>> IR-supported VM with virtio-net device with vhost (still do not
>>> support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
>>> here):
>>>
>>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
>>
>> "intr" sounds a bit too much like "interrupt", not "interrupt
>> remapping". Why not use the kernel's form, "intremap"?
> 
> Sure. It sounds nice to be aligned with the kernel one. Let me take
> it in v5.
> 
>>
>>>      -enable-kvm -m 1024 \
>>> 	 -netdev tap,id=net0,vhost=on \
>>>      -device virtio-net-pci,netdev=user.0 \
>>>      -monitor telnet::3333,server,nowait \
>>> 	 /var/lib/libvirt/images/vm1.qcow2
>>>
>>> When guest boots, we can verify whether IR enabled by grepping the
>>> dmesg like:
>>>
>>> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
>>>
>>> Currently supported devices:
>>>
>>> - Emulated/Splitted irqchip
>>> - Generic PCI Devices
>>> - vhost devices
>>> - pass through device support? Not tested, but suppose it should work.
>>
>> I've tested this series against my Jailhouse setup, and it works pretty
>> well! Actually considering to move my test setup over this branch.
> 
> This is really encouraging feedback! Btw, thanks for all kinds of
> help on this patchset. :-)
> 
>>
>> However, split irqchip still has some issues: When I boot a q35 machine
>> with Linux, the e1000 network adapter only gets a single IRQ delivered.
>> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
>> didn't debug this in details yet.
> 
> I reproduced this problem. It seems that it fails even with
> kernel-irqchip=off. Will try to dig it out.

Very good. Hope it can be easily fixed.

> 
>>
>>>
>>> TODO List:
>>>
>>> - kvm-ioapic support (?)
>>> - EIM support
>>
>> That should be fairly easy, I already played with it (hack in EIM cap,
>> change vtd_remap_irq_get, assuming EIME would be set). However, it
>> depends on split irqchip to work properly (there is no x2apic in
>> userspace APIC), and that is not yet the case.
> 
> That's cool. Never tried it though. Anyway, will leave x2apic
> related work for Radim. :)

Whoever is faster ;) - I would really like to have this feature in order
to test x2apic support of Jailhouse also in the virtual test bed. Will
surely hack on it as soon as that other bug is fixed.

> 
>>
>>> - IR fault reporting
>>
>> Would be welcome! I found a "test case" yesterday: misconfigured IOAPIC
>> ID blocked its IRQs under Jailhouse, and I first had to enable tracing
>> to realize it ;).
> 
> Yes, it sounds nice to have guest side feedback on IR faults. Will
> do more reading, and see whether I can add one more patch in v5 to
> do this.

It's not a must-have for getting things merged. In fact, any additional
feature that could now delay the merge of what you have should rather
wait. Stabilizing, addressing style and structure comments is more
important IMO.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-25  7:24     ` Jan Kiszka
@ 2016-04-25 16:38       ` Radim Krčmář
  2016-04-26  7:34       ` Peter Xu
  1 sibling, 0 replies; 43+ messages in thread
From: Radim Krčmář @ 2016-04-25 16:38 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Peter Xu, qemu-devel, imammedo, rth, ehabkost, jasowang, marcel,
	mst, pbonzini, alex.williamson, wexu

2016-04-25 09:24+0200, Jan Kiszka:
> On 2016-04-25 09:18, Peter Xu wrote:
>> On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
>>> On 2016-04-19 10:38, Peter Xu wrote:
>>>> - EIM support
>>>
>>> That should be fairly easy, I already played with it (hack in EIM cap,
>>> change vtd_remap_irq_get, assuming EIME would be set). However, it
>>> depends on split irqchip to work properly (there is no x2apic in
>>> userspace APIC), and that is not yet the case.
>> 
>> That's cool. Never tried it though. Anyway, will leave x2apic
>> related work for Radim. :)
> 
> Whoever is faster ;) - I would really like to have this feature in order
> to test x2apic support of Jailhouse also in the virtual test bed. Will
> surely hack on it as soon as that other bug is fixed.

I accept the challenge, and resign. :)  Thanks!

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-25  7:24     ` Jan Kiszka
  2016-04-25 16:38       ` Radim Krčmář
@ 2016-04-26  7:34       ` Peter Xu
  2016-04-26  7:57         ` Jan Kiszka
                           ` (2 more replies)
  1 sibling, 3 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-26  7:34 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On Mon, Apr 25, 2016 at 09:24:12AM +0200, Jan Kiszka wrote:
> On 2016-04-25 09:18, Peter Xu wrote:
> > On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
> >> On 2016-04-19 10:38, Peter Xu wrote:
> > 
> > [...]
> > 
> >>> By default, IR is disabled to be better compatible with current
> >>> QEMU. To enable IR, we can using the following command to boot a
> >>> IR-supported VM with virtio-net device with vhost (still do not
> >>> support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
> >>> here):
> >>>
> >>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
> >>
> >> "intr" sounds a bit too much like "interrupt", not "interrupt
> >> remapping". Why not use the kernel's form, "intremap"?
> > 
> > Sure. It sounds nice to be aligned with the kernel one. Let me take
> > it in v5.
> > 
> >>
> >>>      -enable-kvm -m 1024 \
> >>> 	 -netdev tap,id=net0,vhost=on \
> >>>      -device virtio-net-pci,netdev=user.0 \
> >>>      -monitor telnet::3333,server,nowait \
> >>> 	 /var/lib/libvirt/images/vm1.qcow2
> >>>
> >>> When guest boots, we can verify whether IR enabled by grepping the
> >>> dmesg like:
> >>>
> >>> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
> >>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
> >>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
> >>>
> >>> Currently supported devices:
> >>>
> >>> - Emulated/Splitted irqchip
> >>> - Generic PCI Devices
> >>> - vhost devices
> >>> - pass through device support? Not tested, but suppose it should work.
> >>
> >> I've tested this series against my Jailhouse setup, and it works pretty
> >> well! Actually considering to move my test setup over this branch.
> > 
> > This is really encouraging feedback! Btw, thanks for all kinds of
> > help on this patchset. :-)
> > 
> >>
> >> However, split irqchip still has some issues: When I boot a q35 machine
> >> with Linux, the e1000 network adapter only gets a single IRQ delivered.
> >> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
> >> didn't debug this in details yet.
> > 
> > I reproduced this problem. It seems that it fails even with
> > kernel-irqchip=off. Will try to dig it out.
> 
> Very good. Hope it can be easily fixed.

Hi, Jan,

The above issue should be caused by EOI missing of level-triggered
interrupts. Before that, I was always using edge-triggered
interrupts for test, so didn't encounter this one. Would you please
help try below patch? It can be applied directly onto the series,
and should solve the issue (it works on my test vm, and I'll take it
in v5 as well if it also works for you):

-------------------------

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index b41ab89..de6a8cf 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
     return val;
 }

+/*
+ * This is to satisfy the hack in Linux kernel. One hack of it is to
+ * simulate clearing the Remote IRR bit of IOAPIC entry using the
+ * following:
+ *
+ * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
+ * Otherwise, we simulate the EOI message manually by changing the trigger
+ * mode to edge and then back to level, with RTE being masked during
+ * this."
+ *
+ * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
+ *
+ * This is based on the assumption that, Remote IRR bit will be
+ * cleared by IOAPIC hardware for edge-triggered interrupts (I
+ * believe that's what the IOAPIC version 0x1X hardware does). So
+ * if we are emulating it, we'd better do it the same here, so that
+ * the guest kernel hack will work as well on QEMU.
+ *
+ * Without this, level-triggered interrupts in IR mode might fail to
+ * work correctly.
+ */
+static inline void
+ioapic_fix_edge_remote_irr(uint64_t *entry)
+{
+    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
+        /* Level triggered interrupts, make sure remote IRR is zero */
+        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
+    }
+}
+
 static void
 ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
                  unsigned int size)
@@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
                     s->ioredtbl[index] &= ~0xffffffffULL;
                     s->ioredtbl[index] |= val;
                 }
+                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
                 ioapic_service(s);
             }
         }

------------------------

I am still looking into guest part codes. Although the above patch
should solve the issue, there are still issues in guest codes when
IR is enabled:

- mismatched "vector" in IOAPIC entry and IRTE entry (this is
  required in vt-d spec 5.1.5.1, and required to correctly deliver
  EOI broadcast I guess). See intel_irq_remapping_prepare_irte():

        ...
        /*
         * IO-APIC RTE will be configured with virtual vector.
         * irq handler will do the explicit EOI to the io-apic.
         */
        entry->vector   = info->ioapic_pin;
        ...

- I encountered that level-triggered entries in IOAPIC is marked as
  edge-triggered interrupt in APIC (which is strange)... This will
  also affect correct delivery of EOI broadcast. I still need time
  to figure out why.

If EOI broadcast can work, e1000 issue would be solved as
well even without above patch.

[...]

> > 
> >>
> >>> - IR fault reporting
> >>
> >> Would be welcome! I found a "test case" yesterday: misconfigured IOAPIC
> >> ID blocked its IRQs under Jailhouse, and I first had to enable tracing
> >> to realize it ;).
> > 
> > Yes, it sounds nice to have guest side feedback on IR faults. Will
> > do more reading, and see whether I can add one more patch in v5 to
> > do this.
> 
> It's not a must-have for getting things merged. In fact, any additional
> feature that could now delay the merge of what you have should rather
> wait. Stabilizing, addressing style and structure comments is more
> important IMO.

Okay, then let me add this into my todo list, and will pick this up
when got time.

Thanks,

-- peterx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26  7:34       ` Peter Xu
@ 2016-04-26  7:57         ` Jan Kiszka
  2016-04-26  8:15           ` Jan Kiszka
  2016-04-26 14:19         ` Radim Krčmář
  2016-05-09 11:58         ` Paolo Bonzini
  2 siblings, 1 reply; 43+ messages in thread
From: Jan Kiszka @ 2016-04-26  7:57 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On 2016-04-26 09:34, Peter Xu wrote:
> On Mon, Apr 25, 2016 at 09:24:12AM +0200, Jan Kiszka wrote:
>> On 2016-04-25 09:18, Peter Xu wrote:
>>> On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
>>>> On 2016-04-19 10:38, Peter Xu wrote:
>>>
>>> [...]
>>>
>>>>> By default, IR is disabled to be better compatible with current
>>>>> QEMU. To enable IR, we can using the following command to boot a
>>>>> IR-supported VM with virtio-net device with vhost (still do not
>>>>> support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
>>>>> here):
>>>>>
>>>>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
>>>>
>>>> "intr" sounds a bit too much like "interrupt", not "interrupt
>>>> remapping". Why not use the kernel's form, "intremap"?
>>>
>>> Sure. It sounds nice to be aligned with the kernel one. Let me take
>>> it in v5.
>>>
>>>>
>>>>>      -enable-kvm -m 1024 \
>>>>> 	 -netdev tap,id=net0,vhost=on \
>>>>>      -device virtio-net-pci,netdev=user.0 \
>>>>>      -monitor telnet::3333,server,nowait \
>>>>> 	 /var/lib/libvirt/images/vm1.qcow2
>>>>>
>>>>> When guest boots, we can verify whether IR enabled by grepping the
>>>>> dmesg like:
>>>>>
>>>>> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
>>>>>
>>>>> Currently supported devices:
>>>>>
>>>>> - Emulated/Splitted irqchip
>>>>> - Generic PCI Devices
>>>>> - vhost devices
>>>>> - pass through device support? Not tested, but suppose it should work.
>>>>
>>>> I've tested this series against my Jailhouse setup, and it works pretty
>>>> well! Actually considering to move my test setup over this branch.
>>>
>>> This is really encouraging feedback! Btw, thanks for all kinds of
>>> help on this patchset. :-)
>>>
>>>>
>>>> However, split irqchip still has some issues: When I boot a q35 machine
>>>> with Linux, the e1000 network adapter only gets a single IRQ delivered.
>>>> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
>>>> didn't debug this in details yet.
>>>
>>> I reproduced this problem. It seems that it fails even with
>>> kernel-irqchip=off. Will try to dig it out.
>>
>> Very good. Hope it can be easily fixed.
> 
> Hi, Jan,
> 
> The above issue should be caused by EOI missing of level-triggered
> interrupts. Before that, I was always using edge-triggered
> interrupts for test, so didn't encounter this one. Would you please
> help try below patch? It can be applied directly onto the series,
> and should solve the issue (it works on my test vm, and I'll take it
> in v5 as well if it also works for you):
> 

Works here as well. I even made EIM working with some hack, though
Jailhouse spits out strange warnings, despite it works fine (x2apic
mode, split irqchip).

> -------------------------
> 
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> index b41ab89..de6a8cf 100644
> --- a/hw/intc/ioapic.c
> +++ b/hw/intc/ioapic.c
> @@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
>      return val;
>  }
> 
> +/*
> + * This is to satisfy the hack in Linux kernel. One hack of it is to
> + * simulate clearing the Remote IRR bit of IOAPIC entry using the
> + * following:
> + *
> + * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
> + * Otherwise, we simulate the EOI message manually by changing the trigger
> + * mode to edge and then back to level, with RTE being masked during
> + * this."
> + *
> + * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
> + *
> + * This is based on the assumption that, Remote IRR bit will be
> + * cleared by IOAPIC hardware for edge-triggered interrupts (I
> + * believe that's what the IOAPIC version 0x1X hardware does). So
> + * if we are emulating it, we'd better do it the same here, so that
> + * the guest kernel hack will work as well on QEMU.
> + *
> + * Without this, level-triggered interrupts in IR mode might fail to
> + * work correctly.
> + */
> +static inline void
> +ioapic_fix_edge_remote_irr(uint64_t *entry)
> +{
> +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> +        /* Level triggered interrupts, make sure remote IRR is zero */
> +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> +    }
> +}
> +
>  static void
>  ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                   unsigned int size)
> @@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                      s->ioredtbl[index] &= ~0xffffffffULL;
>                      s->ioredtbl[index] |= val;
>                  }
> +                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
>                  ioapic_service(s);
>              }
>          }
> 
> ------------------------
> 
> I am still looking into guest part codes. Although the above patch
> should solve the issue, there are still issues in guest codes when
> IR is enabled:
> 
> - mismatched "vector" in IOAPIC entry and IRTE entry (this is
>   required in vt-d spec 5.1.5.1, and required to correctly deliver
>   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():
> 
>         ...
>         /*
>          * IO-APIC RTE will be configured with virtual vector.
>          * irq handler will do the explicit EOI to the io-apic.
>          */
>         entry->vector   = info->ioapic_pin;
>         ...
> 
> - I encountered that level-triggered entries in IOAPIC is marked as
>   edge-triggered interrupt in APIC (which is strange)... This will
>   also affect correct delivery of EOI broadcast. I still need time
>   to figure out why.
> 
> If EOI broadcast can work, e1000 issue would be solved as
> well even without above patch.
> 
> [...]

I don't remember details in this area, but maybe it's worth to look how
my hacks dealt with these cause (or made Linux to not create such weird
configurations).

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26  7:57         ` Jan Kiszka
@ 2016-04-26  8:15           ` Jan Kiszka
  2016-04-26 10:38             ` Peter Xu
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kiszka @ 2016-04-26  8:15 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On 2016-04-26 09:57, Jan Kiszka wrote:
> On 2016-04-26 09:34, Peter Xu wrote:
>> On Mon, Apr 25, 2016 at 09:24:12AM +0200, Jan Kiszka wrote:
>>> On 2016-04-25 09:18, Peter Xu wrote:
>>>> On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
>>>>> On 2016-04-19 10:38, Peter Xu wrote:
>>>>
>>>> [...]
>>>>
>>>>>> By default, IR is disabled to be better compatible with current
>>>>>> QEMU. To enable IR, we can using the following command to boot a
>>>>>> IR-supported VM with virtio-net device with vhost (still do not
>>>>>> support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
>>>>>> here):
>>>>>>
>>>>>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
>>>>>
>>>>> "intr" sounds a bit too much like "interrupt", not "interrupt
>>>>> remapping". Why not use the kernel's form, "intremap"?
>>>>
>>>> Sure. It sounds nice to be aligned with the kernel one. Let me take
>>>> it in v5.
>>>>
>>>>>
>>>>>>      -enable-kvm -m 1024 \
>>>>>> 	 -netdev tap,id=net0,vhost=on \
>>>>>>      -device virtio-net-pci,netdev=user.0 \
>>>>>>      -monitor telnet::3333,server,nowait \
>>>>>> 	 /var/lib/libvirt/images/vm1.qcow2
>>>>>>
>>>>>> When guest boots, we can verify whether IR enabled by grepping the
>>>>>> dmesg like:
>>>>>>
>>>>>> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
>>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
>>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
>>>>>>
>>>>>> Currently supported devices:
>>>>>>
>>>>>> - Emulated/Splitted irqchip
>>>>>> - Generic PCI Devices
>>>>>> - vhost devices
>>>>>> - pass through device support? Not tested, but suppose it should work.
>>>>>
>>>>> I've tested this series against my Jailhouse setup, and it works pretty
>>>>> well! Actually considering to move my test setup over this branch.
>>>>
>>>> This is really encouraging feedback! Btw, thanks for all kinds of
>>>> help on this patchset. :-)
>>>>
>>>>>
>>>>> However, split irqchip still has some issues: When I boot a q35 machine
>>>>> with Linux, the e1000 network adapter only gets a single IRQ delivered.
>>>>> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
>>>>> didn't debug this in details yet.
>>>>
>>>> I reproduced this problem. It seems that it fails even with
>>>> kernel-irqchip=off. Will try to dig it out.
>>>
>>> Very good. Hope it can be easily fixed.
>>
>> Hi, Jan,
>>
>> The above issue should be caused by EOI missing of level-triggered
>> interrupts. Before that, I was always using edge-triggered
>> interrupts for test, so didn't encounter this one. Would you please
>> help try below patch? It can be applied directly onto the series,
>> and should solve the issue (it works on my test vm, and I'll take it
>> in v5 as well if it also works for you):
>>
> 
> Works here as well. I even made EIM working with some hack, though
> Jailhouse spits out strange warnings, despite it works fine (x2apic
> mode, split irqchip).

Corrections: the warnings are issued by qemu, not Jailhouse, e.g.

qemu-system-x86_64: VT-d Failed to remap interrupt for gsi 22.

I suspect that comes from the hand-over phase of Jailhouse, when it
mutes all interrupts in the system while reconfiguring IR and IOAPIC.

Please convert this error (in kvm_arch_fixup_msi_route) into a trace
point. It shall not annoy the host. Also check if you have more of such
guest-triggerable error messages.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26  8:15           ` Jan Kiszka
@ 2016-04-26 10:38             ` Peter Xu
  2016-04-26 10:51               ` Jan Kiszka
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Xu @ 2016-04-26 10:38 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On Tue, Apr 26, 2016 at 10:15:46AM +0200, Jan Kiszka wrote:
> On 2016-04-26 09:57, Jan Kiszka wrote:
> > On 2016-04-26 09:34, Peter Xu wrote:
> >> On Mon, Apr 25, 2016 at 09:24:12AM +0200, Jan Kiszka wrote:
> >>> On 2016-04-25 09:18, Peter Xu wrote:
> >>>> On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
> >>>>> On 2016-04-19 10:38, Peter Xu wrote:
> >>>>
> >>>> [...]
> >>>>
> >>>>>> By default, IR is disabled to be better compatible with current
> >>>>>> QEMU. To enable IR, we can using the following command to boot a
> >>>>>> IR-supported VM with virtio-net device with vhost (still do not
> >>>>>> support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
> >>>>>> here):
> >>>>>>
> >>>>>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
> >>>>>
> >>>>> "intr" sounds a bit too much like "interrupt", not "interrupt
> >>>>> remapping". Why not use the kernel's form, "intremap"?
> >>>>
> >>>> Sure. It sounds nice to be aligned with the kernel one. Let me take
> >>>> it in v5.
> >>>>
> >>>>>
> >>>>>>      -enable-kvm -m 1024 \
> >>>>>> 	 -netdev tap,id=net0,vhost=on \
> >>>>>>      -device virtio-net-pci,netdev=user.0 \
> >>>>>>      -monitor telnet::3333,server,nowait \
> >>>>>> 	 /var/lib/libvirt/images/vm1.qcow2
> >>>>>>
> >>>>>> When guest boots, we can verify whether IR enabled by grepping the
> >>>>>> dmesg like:
> >>>>>>
> >>>>>> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
> >>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
> >>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
> >>>>>>
> >>>>>> Currently supported devices:
> >>>>>>
> >>>>>> - Emulated/Splitted irqchip
> >>>>>> - Generic PCI Devices
> >>>>>> - vhost devices
> >>>>>> - pass through device support? Not tested, but suppose it should work.
> >>>>>
> >>>>> I've tested this series against my Jailhouse setup, and it works pretty
> >>>>> well! Actually considering to move my test setup over this branch.
> >>>>
> >>>> This is really encouraging feedback! Btw, thanks for all kinds of
> >>>> help on this patchset. :-)
> >>>>
> >>>>>
> >>>>> However, split irqchip still has some issues: When I boot a q35 machine
> >>>>> with Linux, the e1000 network adapter only gets a single IRQ delivered.
> >>>>> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
> >>>>> didn't debug this in details yet.
> >>>>
> >>>> I reproduced this problem. It seems that it fails even with
> >>>> kernel-irqchip=off. Will try to dig it out.
> >>>
> >>> Very good. Hope it can be easily fixed.
> >>
> >> Hi, Jan,
> >>
> >> The above issue should be caused by EOI missing of level-triggered
> >> interrupts. Before that, I was always using edge-triggered
> >> interrupts for test, so didn't encounter this one. Would you please
> >> help try below patch? It can be applied directly onto the series,
> >> and should solve the issue (it works on my test vm, and I'll take it
> >> in v5 as well if it also works for you):
> >>
> > 
> > Works here as well. I even made EIM working with some hack, though
> > Jailhouse spits out strange warnings, despite it works fine (x2apic
> > mode, split irqchip).
> 
> Corrections: the warnings are issued by qemu, not Jailhouse, e.g.
> 
> qemu-system-x86_64: VT-d Failed to remap interrupt for gsi 22.
> 
> I suspect that comes from the hand-over phase of Jailhouse, when it
> mutes all interrupts in the system while reconfiguring IR and IOAPIC.
> 
> Please convert this error (in kvm_arch_fixup_msi_route) into a trace
> point. It shall not annoy the host. Also check if you have more of such
> guest-triggerable error messages.

Okay. This should be the only one. I can use trace instead.

Meanwhile, I still suppose we should not seen it even with
error_report().. Would this happen when boot e.g. generic kernels?

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 10:38             ` Peter Xu
@ 2016-04-26 10:51               ` Jan Kiszka
  2016-04-26 11:40                 ` Peter Xu
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kiszka @ 2016-04-26 10:51 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On 2016-04-26 12:38, Peter Xu wrote:
>>>> Hi, Jan,
>>>>
>>>> The above issue should be caused by EOI missing of level-triggered
>>>> interrupts. Before that, I was always using edge-triggered
>>>> interrupts for test, so didn't encounter this one. Would you please
>>>> help try below patch? It can be applied directly onto the series,
>>>> and should solve the issue (it works on my test vm, and I'll take it
>>>> in v5 as well if it also works for you):
>>>>
>>>
>>> Works here as well. I even made EIM working with some hack, though
>>> Jailhouse spits out strange warnings, despite it works fine (x2apic
>>> mode, split irqchip).
>>
>> Corrections: the warnings are issued by qemu, not Jailhouse, e.g.
>>
>> qemu-system-x86_64: VT-d Failed to remap interrupt for gsi 22.
>>
>> I suspect that comes from the hand-over phase of Jailhouse, when it
>> mutes all interrupts in the system while reconfiguring IR and IOAPIC.
>>
>> Please convert this error (in kvm_arch_fixup_msi_route) into a trace
>> point. It shall not annoy the host. Also check if you have more of such
>> guest-triggerable error messages.
> 
> Okay. This should be the only one. I can use trace instead.
> 
> Meanwhile, I still suppose we should not seen it even with
> error_report().. Would this happen when boot e.g. generic kernels?

No, this is - most probably, still need to validate in details - an
effect due to the special case with Jailhouse that interrupt and VT-d
management is handed over from a running OS to a hypervisor.

Jan

PS: Current quick hack for EIM support (lacks compat mode blocking at
least):

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 1082ab5..709a92c 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -907,6 +907,7 @@ static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
     value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
     s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
     s->intr_root = value & VTD_IRTA_ADDR_MASK;
+    s->intr_eime = value & VTD_IRTA_EIME;
 
     QLIST_FOREACH(notifier, &s->iec_notifiers, list) {
         if (notifier->iec_notify) {
@@ -2052,11 +2053,13 @@ static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index, VTDIrq *irq
     irq->trigger_mode = irte.trigger_mode;
     irq->vector = irte.vector;
     irq->delivery_mode = irte.delivery_mode;
-    /* Not support EIM yet: please refer to vt-d 9.10 DST bits */
+    irq->dest = irte.dest_id;
+    if (!iommu->intr_eime) {
 #define  VTD_IR_APIC_DEST_MASK         (0xff00ULL)
 #define  VTD_IR_APIC_DEST_SHIFT        (8)
-    irq->dest = (irte.dest_id & VTD_IR_APIC_DEST_MASK) >> \
-        VTD_IR_APIC_DEST_SHIFT;
+        irq->dest = (irq->dest & VTD_IR_APIC_DEST_MASK) >>
+            VTD_IR_APIC_DEST_SHIFT;
+    }
     irq->dest_mode = irte.dest_mode;
     irq->redir_hint = irte.redir_hint;
 
@@ -2316,7 +2319,7 @@ static void vtd_init(IntelIOMMUState *s)
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
     if (ms->iommu_intr) {
-        s->ecap |= VTD_ECAP_IR;
+        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM;
     }
 
     vtd_reset_context_cache(s);
@@ -2370,10 +2373,9 @@ static void vtd_init(IntelIOMMUState *s)
     vtd_define_quad(s, DMAR_FRCD_REG_0_2, 0, 0, 0x8000000000000000ULL);
 
     /*
-     * Interrupt remapping registers, not support extended interrupt
-     * mode for now.
+     * Interrupt remapping registers.
      */
-    vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff00fULL, 0);
+    vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff80fULL, 0);
 }
 
 /* Should not reset address_spaces when reset because devices will still use
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 10c20fe..72b0114 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -176,6 +176,7 @@
 
 /* IRTA_REG */
 #define VTD_IRTA_ADDR_MASK          (VTD_HAW_MASK ^ 0xfffULL)
+#define VTD_IRTA_EIME               (1ULL << 11)
 #define VTD_IRTA_SIZE_MASK          (0xfULL)
 
 /* ECAP_REG */
@@ -184,6 +185,7 @@
 #define VTD_ECAP_QI                 (1ULL << 1)
 /* Interrupt Remapping support */
 #define VTD_ECAP_IR                 (1ULL << 3)
+#define VTD_ECAP_EIM                (1ULL << 4)
 
 /* CAP_REG */
 /* (offset >> 4) << 24 */
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index d7613af..c6c53c6 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -261,6 +261,7 @@ struct IntelIOMMUState {
     bool intr_enabled;              /* Whether guest enabled IR */
     dma_addr_t intr_root;           /* Interrupt remapping table pointer */
     uint32_t intr_size;             /* Number of IR table entries */
+    bool intr_eime;                 /* Extended interrupt mode enabled */
     QLIST_HEAD(, VTD_IEC_Notifier) iec_notifiers; /* IEC notify list */
 };
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 10:51               ` Jan Kiszka
@ 2016-04-26 11:40                 ` Peter Xu
  2016-04-26 14:24                   ` Jan Kiszka
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Xu @ 2016-04-26 11:40 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, rkrcmar, alex.williamson, wexu

On Tue, Apr 26, 2016 at 12:51:44PM +0200, Jan Kiszka wrote:
> On 2016-04-26 12:38, Peter Xu wrote:
> >>>> Hi, Jan,
> >>>>
> >>>> The above issue should be caused by EOI missing of level-triggered
> >>>> interrupts. Before that, I was always using edge-triggered
> >>>> interrupts for test, so didn't encounter this one. Would you please
> >>>> help try below patch? It can be applied directly onto the series,
> >>>> and should solve the issue (it works on my test vm, and I'll take it
> >>>> in v5 as well if it also works for you):
> >>>>
> >>>
> >>> Works here as well. I even made EIM working with some hack, though
> >>> Jailhouse spits out strange warnings, despite it works fine (x2apic
> >>> mode, split irqchip).
> >>
> >> Corrections: the warnings are issued by qemu, not Jailhouse, e.g.
> >>
> >> qemu-system-x86_64: VT-d Failed to remap interrupt for gsi 22.
> >>
> >> I suspect that comes from the hand-over phase of Jailhouse, when it
> >> mutes all interrupts in the system while reconfiguring IR and IOAPIC.
> >>
> >> Please convert this error (in kvm_arch_fixup_msi_route) into a trace
> >> point. It shall not annoy the host. Also check if you have more of such
> >> guest-triggerable error messages.
> > 
> > Okay. This should be the only one. I can use trace instead.
> > 
> > Meanwhile, I still suppose we should not seen it even with
> > error_report().. Would this happen when boot e.g. generic kernels?
> 
> No, this is - most probably, still need to validate in details - an
> effect due to the special case with Jailhouse that interrupt and VT-d
> management is handed over from a running OS to a hypervisor.
> 
> Jan
> 
> PS: Current quick hack for EIM support (lacks compat mode blocking at
> least):
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 1082ab5..709a92c 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -907,6 +907,7 @@ static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
>      value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
>      s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
>      s->intr_root = value & VTD_IRTA_ADDR_MASK;
> +    s->intr_eime = value & VTD_IRTA_EIME;
>  
>      QLIST_FOREACH(notifier, &s->iec_notifiers, list) {
>          if (notifier->iec_notify) {
> @@ -2052,11 +2053,13 @@ static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index, VTDIrq *irq
>      irq->trigger_mode = irte.trigger_mode;
>      irq->vector = irte.vector;
>      irq->delivery_mode = irte.delivery_mode;
> -    /* Not support EIM yet: please refer to vt-d 9.10 DST bits */
> +    irq->dest = irte.dest_id;
> +    if (!iommu->intr_eime) {
>  #define  VTD_IR_APIC_DEST_MASK         (0xff00ULL)
>  #define  VTD_IR_APIC_DEST_SHIFT        (8)
> -    irq->dest = (irte.dest_id & VTD_IR_APIC_DEST_MASK) >> \
> -        VTD_IR_APIC_DEST_SHIFT;
> +        irq->dest = (irq->dest & VTD_IR_APIC_DEST_MASK) >>
> +            VTD_IR_APIC_DEST_SHIFT;
> +    }
>      irq->dest_mode = irte.dest_mode;
>      irq->redir_hint = irte.redir_hint;
>  
> @@ -2316,7 +2319,7 @@ static void vtd_init(IntelIOMMUState *s)
>      s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
>  
>      if (ms->iommu_intr) {
> -        s->ecap |= VTD_ECAP_IR;
> +        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM;
>      }
>  
>      vtd_reset_context_cache(s);
> @@ -2370,10 +2373,9 @@ static void vtd_init(IntelIOMMUState *s)
>      vtd_define_quad(s, DMAR_FRCD_REG_0_2, 0, 0, 0x8000000000000000ULL);
>  
>      /*
> -     * Interrupt remapping registers, not support extended interrupt
> -     * mode for now.
> +     * Interrupt remapping registers.
>       */
> -    vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff00fULL, 0);
> +    vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff80fULL, 0);
>  }
>  
>  /* Should not reset address_spaces when reset because devices will still use
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index 10c20fe..72b0114 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -176,6 +176,7 @@
>  
>  /* IRTA_REG */
>  #define VTD_IRTA_ADDR_MASK          (VTD_HAW_MASK ^ 0xfffULL)
> +#define VTD_IRTA_EIME               (1ULL << 11)
>  #define VTD_IRTA_SIZE_MASK          (0xfULL)
>  
>  /* ECAP_REG */
> @@ -184,6 +185,7 @@
>  #define VTD_ECAP_QI                 (1ULL << 1)
>  /* Interrupt Remapping support */
>  #define VTD_ECAP_IR                 (1ULL << 3)
> +#define VTD_ECAP_EIM                (1ULL << 4)
>  
>  /* CAP_REG */
>  /* (offset >> 4) << 24 */
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index d7613af..c6c53c6 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -261,6 +261,7 @@ struct IntelIOMMUState {
>      bool intr_enabled;              /* Whether guest enabled IR */
>      dma_addr_t intr_root;           /* Interrupt remapping table pointer */
>      uint32_t intr_size;             /* Number of IR table entries */
> +    bool intr_eime;                 /* Extended interrupt mode enabled */
>      QLIST_HEAD(, VTD_IEC_Notifier) iec_notifiers; /* IEC notify list */
>  };
>  

Currently, all the interrupts will be translated into one MSI in
vtd_generate_msi_message(), in which only 8 bits of dest_id is used
(msg.dest = irq->dest). We may possibly need to use the high 32 bits
of MSI address to store the rest dest[31:8]? Don't know whether this
would be enough though.

Thanks.

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26  7:34       ` Peter Xu
  2016-04-26  7:57         ` Jan Kiszka
@ 2016-04-26 14:19         ` Radim Krčmář
  2016-04-27  7:29           ` Peter Xu
  2016-05-09 11:58         ` Paolo Bonzini
  2 siblings, 1 reply; 43+ messages in thread
From: Radim Krčmář @ 2016-04-26 14:19 UTC (permalink / raw)
  To: Peter Xu
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, pbonzini, alex.williamson, wexu

2016-04-26 15:34+0800, Peter Xu:
> Hi, Jan,
> 
> The above issue should be caused by EOI missing of level-triggered
> interrupts. Before that, I was always using edge-triggered
> interrupts for test, so didn't encounter this one. Would you please
> help try below patch? It can be applied directly onto the series,
> and should solve the issue (it works on my test vm, and I'll take it
> in v5 as well if it also works for you):
> 
> -------------------------
> 
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> @@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
> +/*
> + * This is to satisfy the hack in Linux kernel. One hack of it is to
> + * simulate clearing the Remote IRR bit of IOAPIC entry using the
> + * following:
> + *
> + * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
> + * Otherwise, we simulate the EOI message manually by changing the trigger
> + * mode to edge and then back to level, with RTE being masked during
> + * this."
> + *
> + * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
> + *
> + * This is based on the assumption that, Remote IRR bit will be
> + * cleared by IOAPIC hardware for edge-triggered interrupts (I
> + * believe that's what the IOAPIC version 0x1X hardware does).

I thought that Linux doesn't use explicit "EOI" to IO-APIC, but relies
on EOI broadcast from LAPIC -- does that change with IR?

> + *                                                             So
> + * if we are emulating it, we'd better do it the same here, so that
> + * the guest kernel hack will work as well on QEMU.

Totally.

> + * Without this, level-triggered interrupts in IR mode might fail to
> + * work correctly.

(I don't really understand why it worked before.)

> + */
> +static inline void
> +ioapic_fix_edge_remote_irr(uint64_t *entry)
> +{
> +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> +        /* Level triggered interrupts, make sure remote IRR is zero */
> +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);

(You can just unconditionally zero it, edge doesn't care.)

> +    }
> +}
> +
> @@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                      s->ioredtbl[index] &= ~0xffffffffULL;
>                      s->ioredtbl[index] |= val;
>                  }
> +                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);

I think this can be done only in the else branch of (s->ioregsel & 1).

(If the guest kernel does level->edge->level, then remote_irr probably
 should be cleared only on edge->level transition and not on
 level->level, but I haven't seen that in the spec ...)

>                  ioapic_service(s);
> ------------------------
> 
> I am still looking into guest part codes. Although the above patch
> should solve the issue, there are still issues in guest codes when
> IR is enabled:
> 
> - mismatched "vector" in IOAPIC entry and IRTE entry (this is
>   required in vt-d spec 5.1.5.1, and required to correctly deliver
>   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():

"required" is a way of saying that the opposite is undefined.
No need to think about it in IOMMU.

> - I encountered that level-triggered entries in IOAPIC is marked as
>   edge-triggered interrupt in APIC (which is strange)...

What/where do you mean?
(The only difference I know of is that level triggered vectors in LAPIC
 have their respective TMR bit set while edge do not.)

Thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 11:40                 ` Peter Xu
@ 2016-04-26 14:24                   ` Jan Kiszka
  2016-04-26 14:59                     ` Radim Krčmář
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kiszka @ 2016-04-26 14:24 UTC (permalink / raw)
  To: Peter Xu, rkrcmar
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	pbonzini, alex.williamson, wexu

On 2016-04-26 13:40, Peter Xu wrote:
> On Tue, Apr 26, 2016 at 12:51:44PM +0200, Jan Kiszka wrote:
>> On 2016-04-26 12:38, Peter Xu wrote:
>>>>>> Hi, Jan,
>>>>>>
>>>>>> The above issue should be caused by EOI missing of level-triggered
>>>>>> interrupts. Before that, I was always using edge-triggered
>>>>>> interrupts for test, so didn't encounter this one. Would you please
>>>>>> help try below patch? It can be applied directly onto the series,
>>>>>> and should solve the issue (it works on my test vm, and I'll take it
>>>>>> in v5 as well if it also works for you):
>>>>>>
>>>>>
>>>>> Works here as well. I even made EIM working with some hack, though
>>>>> Jailhouse spits out strange warnings, despite it works fine (x2apic
>>>>> mode, split irqchip).
>>>>
>>>> Corrections: the warnings are issued by qemu, not Jailhouse, e.g.
>>>>
>>>> qemu-system-x86_64: VT-d Failed to remap interrupt for gsi 22.
>>>>
>>>> I suspect that comes from the hand-over phase of Jailhouse, when it
>>>> mutes all interrupts in the system while reconfiguring IR and IOAPIC.
>>>>
>>>> Please convert this error (in kvm_arch_fixup_msi_route) into a trace
>>>> point. It shall not annoy the host. Also check if you have more of such
>>>> guest-triggerable error messages.
>>>
>>> Okay. This should be the only one. I can use trace instead.
>>>
>>> Meanwhile, I still suppose we should not seen it even with
>>> error_report().. Would this happen when boot e.g. generic kernels?
>>
>> No, this is - most probably, still need to validate in details - an
>> effect due to the special case with Jailhouse that interrupt and VT-d
>> management is handed over from a running OS to a hypervisor.
>>
>> Jan
>>
>> PS: Current quick hack for EIM support (lacks compat mode blocking at
>> least):
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 1082ab5..709a92c 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -907,6 +907,7 @@ static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
>>      value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
>>      s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
>>      s->intr_root = value & VTD_IRTA_ADDR_MASK;
>> +    s->intr_eime = value & VTD_IRTA_EIME;
>>  
>>      QLIST_FOREACH(notifier, &s->iec_notifiers, list) {
>>          if (notifier->iec_notify) {
>> @@ -2052,11 +2053,13 @@ static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index, VTDIrq *irq
>>      irq->trigger_mode = irte.trigger_mode;
>>      irq->vector = irte.vector;
>>      irq->delivery_mode = irte.delivery_mode;
>> -    /* Not support EIM yet: please refer to vt-d 9.10 DST bits */
>> +    irq->dest = irte.dest_id;
>> +    if (!iommu->intr_eime) {
>>  #define  VTD_IR_APIC_DEST_MASK         (0xff00ULL)
>>  #define  VTD_IR_APIC_DEST_SHIFT        (8)
>> -    irq->dest = (irte.dest_id & VTD_IR_APIC_DEST_MASK) >> \
>> -        VTD_IR_APIC_DEST_SHIFT;
>> +        irq->dest = (irq->dest & VTD_IR_APIC_DEST_MASK) >>
>> +            VTD_IR_APIC_DEST_SHIFT;
>> +    }
>>      irq->dest_mode = irte.dest_mode;
>>      irq->redir_hint = irte.redir_hint;
>>  
>> @@ -2316,7 +2319,7 @@ static void vtd_init(IntelIOMMUState *s)
>>      s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
>>  
>>      if (ms->iommu_intr) {
>> -        s->ecap |= VTD_ECAP_IR;
>> +        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM;
>>      }
>>  
>>      vtd_reset_context_cache(s);
>> @@ -2370,10 +2373,9 @@ static void vtd_init(IntelIOMMUState *s)
>>      vtd_define_quad(s, DMAR_FRCD_REG_0_2, 0, 0, 0x8000000000000000ULL);
>>  
>>      /*
>> -     * Interrupt remapping registers, not support extended interrupt
>> -     * mode for now.
>> +     * Interrupt remapping registers.
>>       */
>> -    vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff00fULL, 0);
>> +    vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff80fULL, 0);
>>  }
>>  
>>  /* Should not reset address_spaces when reset because devices will still use
>> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
>> index 10c20fe..72b0114 100644
>> --- a/hw/i386/intel_iommu_internal.h
>> +++ b/hw/i386/intel_iommu_internal.h
>> @@ -176,6 +176,7 @@
>>  
>>  /* IRTA_REG */
>>  #define VTD_IRTA_ADDR_MASK          (VTD_HAW_MASK ^ 0xfffULL)
>> +#define VTD_IRTA_EIME               (1ULL << 11)
>>  #define VTD_IRTA_SIZE_MASK          (0xfULL)
>>  
>>  /* ECAP_REG */
>> @@ -184,6 +185,7 @@
>>  #define VTD_ECAP_QI                 (1ULL << 1)
>>  /* Interrupt Remapping support */
>>  #define VTD_ECAP_IR                 (1ULL << 3)
>> +#define VTD_ECAP_EIM                (1ULL << 4)
>>  
>>  /* CAP_REG */
>>  /* (offset >> 4) << 24 */
>> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
>> index d7613af..c6c53c6 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -261,6 +261,7 @@ struct IntelIOMMUState {
>>      bool intr_enabled;              /* Whether guest enabled IR */
>>      dma_addr_t intr_root;           /* Interrupt remapping table pointer */
>>      uint32_t intr_size;             /* Number of IR table entries */
>> +    bool intr_eime;                 /* Extended interrupt mode enabled */
>>      QLIST_HEAD(, VTD_IEC_Notifier) iec_notifiers; /* IEC notify list */
>>  };
>>  
> 
> Currently, all the interrupts will be translated into one MSI in
> vtd_generate_msi_message(), in which only 8 bits of dest_id is used
> (msg.dest = irq->dest). We may possibly need to use the high 32 bits
> of MSI address to store the rest dest[31:8]? Don't know whether this
> would be enough though.

Yes, I ran into this topic as well as I hacked up those line. Currently,
KVM does not support more than 254 vCPUs, so 8 bits of those 32 are
actually fine, and piggy-backing them in an MSI message works.

Once KVM supports more CPUs, it has to come up with a new userspace
interface to inject APIC events for more than 255 CPUs. Maybe the
existing direct MSI inject with its unused flags could be "bended",
maybe there are already better ideas (Radim?).

In any case, the KVM layer in userspace will then have to pick up all 32
destination bits from the IRTE and deliver them via that new interface
to the in-kernel APICs.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 14:24                   ` Jan Kiszka
@ 2016-04-26 14:59                     ` Radim Krčmář
  2016-04-26 15:28                       ` Jan Kiszka
  0 siblings, 1 reply; 43+ messages in thread
From: Radim Krčmář @ 2016-04-26 14:59 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Peter Xu, qemu-devel, imammedo, rth, ehabkost, jasowang, marcel,
	mst, pbonzini, alex.williamson, wexu

2016-04-26 16:24+0200, Jan Kiszka:
> On 2016-04-26 13:40, Peter Xu wrote:
>> Currently, all the interrupts will be translated into one MSI in
>> vtd_generate_msi_message(), in which only 8 bits of dest_id is used
>> (msg.dest = irq->dest). We may possibly need to use the high 32 bits
>> of MSI address to store the rest dest[31:8]? Don't know whether this
>> would be enough though.
> 
> Yes, I ran into this topic as well as I hacked up those line. Currently,
> KVM does not support more than 254 vCPUs, so 8 bits of those 32 are
> actually fine, and piggy-backing them in an MSI message works.
> 
> Once KVM supports more CPUs, it has to come up with a new userspace
> interface to inject APIC events for more than 255 CPUs. Maybe the
> existing direct MSI inject with its unused flags could be "bended",
> maybe there are already better ideas (Radim?).

Adding a flag to msi_msg and taking 3-4 bytes from padding to express
x2APIC addresses is reasonable.  (It is what my prototype did. :])

The conceptually different idea is forcing all userspace interrupts
through irqfd routes, which would obsolete the ad-host inject.

> In any case, the KVM layer in userspace will then have to pick up all 32
> destination bits from the IRTE and deliver them via that new interface
> to the in-kernel APICs.

I'll keep that in mind, thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 14:59                     ` Radim Krčmář
@ 2016-04-26 15:28                       ` Jan Kiszka
  2016-04-26 16:07                         ` Radim Krčmář
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kiszka @ 2016-04-26 15:28 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Peter Xu, qemu-devel, imammedo, rth, ehabkost, jasowang, marcel,
	mst, pbonzini, alex.williamson, wexu

On 2016-04-26 16:59, Radim Krčmář wrote:
> 2016-04-26 16:24+0200, Jan Kiszka:
>> On 2016-04-26 13:40, Peter Xu wrote:
>>> Currently, all the interrupts will be translated into one MSI in
>>> vtd_generate_msi_message(), in which only 8 bits of dest_id is used
>>> (msg.dest = irq->dest). We may possibly need to use the high 32 bits
>>> of MSI address to store the rest dest[31:8]? Don't know whether this
>>> would be enough though.
>>
>> Yes, I ran into this topic as well as I hacked up those line. Currently,
>> KVM does not support more than 254 vCPUs, so 8 bits of those 32 are
>> actually fine, and piggy-backing them in an MSI message works.
>>
>> Once KVM supports more CPUs, it has to come up with a new userspace
>> interface to inject APIC events for more than 255 CPUs. Maybe the
>> existing direct MSI inject with its unused flags could be "bended",
>> maybe there are already better ideas (Radim?).
> 
> Adding a flag to msi_msg and taking 3-4 bytes from padding to express
> x2APIC addresses is reasonable.  (It is what my prototype did. :])
> 
> The conceptually different idea is forcing all userspace interrupts
> through irqfd routes, which would obsolete the ad-host inject.

irqfd for userspace sources is a bit clumsy from the API POV. On the
other hand, we need to tweak the routing API anyway to achieve the same
address extension there, too.

Jan

> 
>> In any case, the KVM layer in userspace will then have to pick up all 32
>> destination bits from the IRTE and deliver them via that new interface
>> to the in-kernel APICs.
> 
> I'll keep that in mind, thanks.
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 15:28                       ` Jan Kiszka
@ 2016-04-26 16:07                         ` Radim Krčmář
  2016-04-26 17:47                           ` Jan Kiszka
  0 siblings, 1 reply; 43+ messages in thread
From: Radim Krčmář @ 2016-04-26 16:07 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Peter Xu, qemu-devel, imammedo, rth, ehabkost, jasowang, marcel,
	mst, pbonzini, alex.williamson, wexu

2016-04-26 17:28+0200, Jan Kiszka:
> On 2016-04-26 16:59, Radim Krčmář wrote:
>> 2016-04-26 16:24+0200, Jan Kiszka:
>>> On 2016-04-26 13:40, Peter Xu wrote:
>>>> Currently, all the interrupts will be translated into one MSI in
>>>> vtd_generate_msi_message(), in which only 8 bits of dest_id is used
>>>> (msg.dest = irq->dest). We may possibly need to use the high 32 bits
>>>> of MSI address to store the rest dest[31:8]? Don't know whether this
>>>> would be enough though.
>>>
>>> Yes, I ran into this topic as well as I hacked up those line. Currently,
>>> KVM does not support more than 254 vCPUs, so 8 bits of those 32 are
>>> actually fine, and piggy-backing them in an MSI message works.
>>>
>>> Once KVM supports more CPUs, it has to come up with a new userspace
>>> interface to inject APIC events for more than 255 CPUs. Maybe the
>>> existing direct MSI inject with its unused flags could be "bended",
>>> maybe there are already better ideas (Radim?).
>> 
>> Adding a flag to msi_msg and taking 3-4 bytes from padding to express
>> x2APIC addresses is reasonable.  (It is what my prototype did. :])
>> 
>> The conceptually different idea is forcing all userspace interrupts
>> through irqfd routes, which would obsolete the ad-host inject.
> 
> irqfd for userspace sources is a bit clumsy from the API POV. On the
> other hand, we need to tweak the routing API anyway to achieve the same
> address extension there, too.

I agree, both of them should change.  KVM_SIGNAL_MSI was added just
because the route-based injection in kvm_irqchip_send_msi() sucked;
maybe we could find a good solution now, but the direct interface is
already here ...

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 16:07                         ` Radim Krčmář
@ 2016-04-26 17:47                           ` Jan Kiszka
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Kiszka @ 2016-04-26 17:47 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Peter Xu, qemu-devel, imammedo, rth, ehabkost, jasowang, marcel,
	mst, pbonzini, alex.williamson, wexu

On 2016-04-26 18:07, Radim Krčmář wrote:
> 2016-04-26 17:28+0200, Jan Kiszka:
>> On 2016-04-26 16:59, Radim Krčmář wrote:
>>> 2016-04-26 16:24+0200, Jan Kiszka:
>>>> On 2016-04-26 13:40, Peter Xu wrote:
>>>>> Currently, all the interrupts will be translated into one MSI in
>>>>> vtd_generate_msi_message(), in which only 8 bits of dest_id is used
>>>>> (msg.dest = irq->dest). We may possibly need to use the high 32 bits
>>>>> of MSI address to store the rest dest[31:8]? Don't know whether this
>>>>> would be enough though.
>>>>
>>>> Yes, I ran into this topic as well as I hacked up those line. Currently,
>>>> KVM does not support more than 254 vCPUs, so 8 bits of those 32 are
>>>> actually fine, and piggy-backing them in an MSI message works.
>>>>
>>>> Once KVM supports more CPUs, it has to come up with a new userspace
>>>> interface to inject APIC events for more than 255 CPUs. Maybe the
>>>> existing direct MSI inject with its unused flags could be "bended",
>>>> maybe there are already better ideas (Radim?).
>>>
>>> Adding a flag to msi_msg and taking 3-4 bytes from padding to express
>>> x2APIC addresses is reasonable.  (It is what my prototype did. :])
>>>
>>> The conceptually different idea is forcing all userspace interrupts
>>> through irqfd routes, which would obsolete the ad-host inject.
>>
>> irqfd for userspace sources is a bit clumsy from the API POV. On the
>> other hand, we need to tweak the routing API anyway to achieve the same
>> address extension there, too.
> 
> I agree, both of them should change.  KVM_SIGNAL_MSI was added just
> because the route-based injection in kvm_irqchip_send_msi() sucked;
> maybe we could find a good solution now, but the direct interface is
> already here ...

We won't get away without irq routes because we need them for sources
that only issue binary signals for interrupt events (in-kernel, eventfd
from userspace). In fact, those sources should not know more than that
about their irqs. But we also do not want them to route everything
through userspace for on-the-fly translation and reinjection to kvm.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26 14:19         ` Radim Krčmář
@ 2016-04-27  7:29           ` Peter Xu
  2016-04-27 14:31             ` Radim Krčmář
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Xu @ 2016-04-27  7:29 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, pbonzini, alex.williamson, wexu

On Tue, Apr 26, 2016 at 04:19:00PM +0200, Radim Krčmář wrote:
> 2016-04-26 15:34+0800, Peter Xu:
> > Hi, Jan,
> > 
> > The above issue should be caused by EOI missing of level-triggered
> > interrupts. Before that, I was always using edge-triggered
> > interrupts for test, so didn't encounter this one. Would you please
> > help try below patch? It can be applied directly onto the series,
> > and should solve the issue (it works on my test vm, and I'll take it
> > in v5 as well if it also works for you):
> > 
> > -------------------------
> > 
> > diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> > @@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
> > +/*
> > + * This is to satisfy the hack in Linux kernel. One hack of it is to
> > + * simulate clearing the Remote IRR bit of IOAPIC entry using the
> > + * following:
> > + *
> > + * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
> > + * Otherwise, we simulate the EOI message manually by changing the trigger
> > + * mode to edge and then back to level, with RTE being masked during
> > + * this."
> > + *
> > + * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
> > + *
> > + * This is based on the assumption that, Remote IRR bit will be
> > + * cleared by IOAPIC hardware for edge-triggered interrupts (I
> > + * believe that's what the IOAPIC version 0x1X hardware does).
> 
> I thought that Linux doesn't use explicit "EOI" to IO-APIC, but relies
> on EOI broadcast from LAPIC -- does that change with IR?

IIUC, ioapic_ack_level() should be the one to handle EOI when IR is
disabled. And, the EOI broadcast should be happening at:

	ack_APIC_irq();

While, after that, we can see some more lines:

	/*
	 * Tail end of clearing remote IRR bit (either by delivering the EOI
	 * message via io-apic EOI register write or simulating it using
	 * mask+edge followed by unnask+level logic) manually when the
	 * level triggered interrupt is seen as the edge triggered interrupt
	 * at the cpu.
	 */
	if (!(v & (1 << (i & 0x1f)))) {
		atomic_inc(&irq_mis_count);
		eoi_ioapic_pin(cfg->vector, irq_data->chip_data);
	}

What I understand the above is that: first of all, we will do EOI
broadcast. However, if we found that one level-triggered interrupt
is treated as edge-triggered interrupt (that is exactly what I have
encountered below), we will do one more explicit EOI in
eoi_ioapic_pin(), in which we played the edge-mask/level-unmask
trick for IOAPIC with version 0x1X.

For IR enabled case, we just do both without checking (see
ioapic_ir_ack_level()).

So that's why I think this should not happen if either way
works... Or say, if without this patch, both "EOI broadcast" and
"explicit EOI (hacky version)" are not working for IR case. And I am
still looking for the reason for previous one (this patch fix the
latter one).

> 
> > + *                                                             So
> > + * if we are emulating it, we'd better do it the same here, so that
> > + * the guest kernel hack will work as well on QEMU.
> 
> Totally.
> 
> > + * Without this, level-triggered interrupts in IR mode might fail to
> > + * work correctly.
> 
> (I don't really understand why it worked before.)

Yes, actually what I want to try is to have one IOMMU hardware
machine, plug e1000 (I mean real hardware) into it, and see whether
current Linux kernel IOMMU driver can cope well with level-triggered
devices (I suppose this scenario is rarely used, since
level-triggered interrupts are most legacy IIUC).

> 
> > + */
> > +static inline void
> > +ioapic_fix_edge_remote_irr(uint64_t *entry)
> > +{
> > +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> > +        /* Level triggered interrupts, make sure remote IRR is zero */
> > +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> 
> (You can just unconditionally zero it, edge doesn't care.)

Ah! I made a mistake. I suppose what I really want is:

+    if (!(*entry & IOAPIC_LVT_TRIGGER_MODE)) {
+        /* Edge-triggered interrupts, make sure remote IRR is zero */
+        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
+    }

Though both should help do the trick, I should be using this new
one in v5.

> 
> > +    }
> > +}
> > +
> > @@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
> >                      s->ioredtbl[index] &= ~0xffffffffULL;
> >                      s->ioredtbl[index] |= val;
> >                  }
> > +                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
> 
> I think this can be done only in the else branch of (s->ioregsel & 1).

Yes. I can move it there, but there will be hidden assumption (or
say, truth...) that these magic bits are inside entry bits 31-0, and
people might be confused if we do not know that.  IMHO, for better
readability of code, I would still prefer to put it here (it means
"we need to make sure the entry satisfy some kind of rule, but we do
not need to know further about what the rule is"). If you still
insist, I'd like to take your advice though. :)

> 
> (If the guest kernel does level->edge->level, then remote_irr probably
>  should be cleared only on edge->level transition and not on
>  level->level, but I haven't seen that in the spec ...)

Agree. That's what my above diff is trying to fix. Thanks to point out.

> 
> >                  ioapic_service(s);
> > ------------------------
> > 
> > I am still looking into guest part codes. Although the above patch
> > should solve the issue, there are still issues in guest codes when
> > IR is enabled:
> > 
> > - mismatched "vector" in IOAPIC entry and IRTE entry (this is
> >   required in vt-d spec 5.1.5.1, and required to correctly deliver
> >   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():
> 
> "required" is a way of saying that the opposite is undefined.
> No need to think about it in IOMMU.

Why? Without correct vector information, IOAPIC will not be able to
know which entry to clear the Remote IRR bit (please check
ioapic_eoi_broadcast())?

> 
> > - I encountered that level-triggered entries in IOAPIC is marked as
> >   edge-triggered interrupt in APIC (which is strange)...
> 
> What/where do you mean?
> (The only difference I know of is that level triggered vectors in LAPIC
>  have their respective TMR bit set while edge do not.)

Exactly. Here is what I mean:

static void apic_eoi(APICCommonState *s)
{
    int isrv;
    isrv = get_highest_priority_int(s->isr);
    if (isrv < 0)
        return;
    apic_reset_bit(s->isr, isrv);
    if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && apic_get_bit(s->tmr, isrv)) {
        ioapic_eoi_broadcast(isrv);
    }
    apic_sync_vapic(s, SYNC_FROM_VAPIC | SYNC_TO_VAPIC);
    apic_update_irq(s);
}

APIC will notify IOAPIC only if the corresponding vector in TMR bit
is set (in "apic_get_bit(s->tmr, isrv)", or say, it's a
level-triggered interrupt in APIC registers). What I have traced is
that, the EOI broadcast is missing because this bit is cleared in
APIC TMR while it should be set. I need some more tests to double
confirm this though, in case I made any mistake.

(P.S. Actually I saw some similiar comments in kernel codes around,
please check the long comments in ioapic_ack_level().  Not sure
whether these are related.)

Thanks!

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-27  7:29           ` Peter Xu
@ 2016-04-27 14:31             ` Radim Krčmář
  2016-04-28  5:27               ` Peter Xu
  2016-04-28  6:06               ` Peter Xu
  0 siblings, 2 replies; 43+ messages in thread
From: Radim Krčmář @ 2016-04-27 14:31 UTC (permalink / raw)
  To: Peter Xu
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, pbonzini, alex.williamson, wexu

2016-04-27 15:29+0800, Peter Xu:
> On Tue, Apr 26, 2016 at 04:19:00PM +0200, Radim Krčmář wrote:
>> 2016-04-26 15:34+0800, Peter Xu:
>> > diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
>> > @@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
>> > +/*
>> > + * This is to satisfy the hack in Linux kernel. One hack of it is to
>> > + * simulate clearing the Remote IRR bit of IOAPIC entry using the
>> > + * following:
>> > + *
>> > + * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
>> > + * Otherwise, we simulate the EOI message manually by changing the trigger
>> > + * mode to edge and then back to level, with RTE being masked during
>> > + * this."
>> > + *
>> > + * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
>> > + *
>> > + * This is based on the assumption that, Remote IRR bit will be
>> > + * cleared by IOAPIC hardware for edge-triggered interrupts (I
>> > + * believe that's what the IOAPIC version 0x1X hardware does).
>> 
>> I thought that Linux doesn't use explicit "EOI" to IO-APIC, but relies
>> on EOI broadcast from LAPIC -- does that change with IR?
> 
> IIUC, ioapic_ack_level() should be the one to handle EOI when IR is
> disabled. And, the EOI broadcast should be happening at:
> 
> 	ack_APIC_irq();
> 
> While, after that, we can see some more lines:
> 
> 	/*
> 	 * Tail end of clearing remote IRR bit (either by delivering the EOI
> 	 * message via io-apic EOI register write or simulating it using
> 	 * mask+edge followed by unnask+level logic) manually when the
> 	 * level triggered interrupt is seen as the edge triggered interrupt
> 	 * at the cpu.
> 	 */
> 	if (!(v & (1 << (i & 0x1f)))) {
> 		atomic_inc(&irq_mis_count);
> 		eoi_ioapic_pin(cfg->vector, irq_data->chip_data);
> 	}
> 
> What I understand the above is that: first of all, we will do EOI
> broadcast. However, if we found that one level-triggered interrupt
> is treated as edge-triggered interrupt (that is exactly what I have
> encountered below), we will do one more explicit EOI in
> eoi_ioapic_pin(), in which we played the edge-mask/level-unmask
> trick for IOAPIC with version 0x1X.

Indeed, thanks for the explanation.

> For IR enabled case, we just do both without checking (see
> ioapic_ir_ack_level()).

(IR with IO-APIC below version 0x20 probably does not exist in the wild.
 I don't find any reason why the interaction would bug, though.)

>> > + */
>> > +static inline void
>> > +ioapic_fix_edge_remote_irr(uint64_t *entry)
>> > +{
>> > +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
>> > +        /* Level triggered interrupts, make sure remote IRR is zero */
>> > +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
>> 
>> (You can just unconditionally zero it, edge doesn't care.)
> 
> Ah! I made a mistake. I suppose what I really want is:
> 
> +    if (!(*entry & IOAPIC_LVT_TRIGGER_MODE)) {
> +        /* Edge-triggered interrupts, make sure remote IRR is zero */
> +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> +    }
> 
> Though both should help do the trick, I should be using this new
> one in v5.

(You'd need to look at the old value for this to work.)

>> > @@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>> >                      s->ioredtbl[index] &= ~0xffffffffULL;
>> >                      s->ioredtbl[index] |= val;
>> >                  }
>> > +                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
>> 
>> I think this can be done only in the else branch of (s->ioregsel & 1).
> 
> Yes. I can move it there, but there will be hidden assumption (or
> say, truth...) that these magic bits are inside entry bits 31-0, and
> people might be confused if we do not know that.  IMHO, for better
> readability of code, I would still prefer to put it here (it means
> "we need to make sure the entry satisfy some kind of rule, but we do
> not need to know further about what the rule is"). If you still
> insist, I'd like to take your advice though. :)

I don't.  If you clear it only on edge->level transition, then those two
also behave the same.

>> > I am still looking into guest part codes. Although the above patch
>> > should solve the issue, there are still issues in guest codes when
>> > IR is enabled:
>> > 
>> > - mismatched "vector" in IOAPIC entry and IRTE entry (this is
>> >   required in vt-d spec 5.1.5.1, and required to correctly deliver
>> >   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():
>> 
>> "required" is a way of saying that the opposite is undefined.
>> No need to think about it in IOMMU.
> 
> Why? Without correct vector information, IOAPIC will not be able to
> know which entry to clear the Remote IRR bit (please check
> ioapic_eoi_broadcast())?

IOAPIC won't get correct EOI and Intel made it into an OS bug, because
there was no good action that the hardware could take.  (We have a lot
more freedom, but I think that partially fixing something that doesn't
work on real hardware is a wasted effort.)

Or did you mean that mismatched vector is a possible source of the fixed
bug?  (I originally dismissed it, because real hardware works.)

>> > - I encountered that level-triggered entries in IOAPIC is marked as
>> >   edge-triggered interrupt in APIC (which is strange)...
>> 
>> What/where do you mean?
>> (The only difference I know of is that level triggered vectors in LAPIC
>>  have their respective TMR bit set while edge do not.)
> 
> Exactly. Here is what I mean:
> 
> static void apic_eoi(APICCommonState *s)
> {
>     int isrv;
>     isrv = get_highest_priority_int(s->isr);
>     if (isrv < 0)
>         return;
>     apic_reset_bit(s->isr, isrv);
>     if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && apic_get_bit(s->tmr, isrv)) {
>         ioapic_eoi_broadcast(isrv);
>     }
>     apic_sync_vapic(s, SYNC_FROM_VAPIC | SYNC_TO_VAPIC);
>     apic_update_irq(s);
> }
> 
> APIC will notify IOAPIC only if the corresponding vector in TMR bit
> is set (in "apic_get_bit(s->tmr, isrv)", or say, it's a
> level-triggered interrupt in APIC registers). What I have traced is
> that, the EOI broadcast is missing because this bit is cleared in
> APIC TMR while it should be set. I need some more tests to double
> confirm this though, in case I made any mistake.

(There are two "legal" situations where TMR can be 0 and IOAPIC sets
 remote IRR -- if edge and level interrupts are assigned to the same
 vector and if IOAPIC is level while IR and OS edge, both would bug on
 real hardware too ...)

Does QEMU bug with TCG?

> (P.S. Actually I saw some similiar comments in kernel codes around,
> please check the long comments in ioapic_ack_level().  Not sure
> whether these are related.)

I hope we didn't emulate the hardware bug. :)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-27 14:31             ` Radim Krčmář
@ 2016-04-28  5:27               ` Peter Xu
  2016-04-28 16:24                 ` Radim Krčmář
  2016-04-28  6:06               ` Peter Xu
  1 sibling, 1 reply; 43+ messages in thread
From: Peter Xu @ 2016-04-28  5:27 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, pbonzini, alex.williamson, wexu

On Wed, Apr 27, 2016 at 04:31:13PM +0200, Radim Krčmář wrote:

[...]

> >> > I am still looking into guest part codes. Although the above patch
> >> > should solve the issue, there are still issues in guest codes when
> >> > IR is enabled:
> >> > 
> >> > - mismatched "vector" in IOAPIC entry and IRTE entry (this is
> >> >   required in vt-d spec 5.1.5.1, and required to correctly deliver
> >> >   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():
> >> 
> >> "required" is a way of saying that the opposite is undefined.
> >> No need to think about it in IOMMU.
> > 
> > Why? Without correct vector information, IOAPIC will not be able to
> > know which entry to clear the Remote IRR bit (please check
> > ioapic_eoi_broadcast())?
> 
> IOAPIC won't get correct EOI and Intel made it into an OS bug, because
> there was no good action that the hardware could take.  (We have a lot
> more freedom, but I think that partially fixing something that doesn't
> work on real hardware is a wasted effort.)

To make sure I understand this correctly... Do you mean that real
IOAPIC hardware will not handle this EOI broadcast correctly even if
we fill in matched vector in the IOAPIC entry with IRTE one (when IR
is enabled)?

I'd appreciate if there is any link or anything that can provide me
more background on this matter.. TIA.

> 
> Or did you mean that mismatched vector is a possible source of the fixed
> bug?  (I originally dismissed it, because real hardware works.)

Nop. The above patch fixes the hack for "explicit IOAPIC EOI", and I
suppose mismatched vector issue will cause "EOI broadcast" problem.
But IIUC from your above comment, we can temporarily skip this
"issue" for now, if it won't work even on real hardwares and even
vectors are matched.

Anyway, as long as the explicit EOI works, we can survive. And this
gives me the reason to send v5 first.

> 
> >> > - I encountered that level-triggered entries in IOAPIC is marked as
> >> >   edge-triggered interrupt in APIC (which is strange)...
> >> 
> >> What/where do you mean?
> >> (The only difference I know of is that level triggered vectors in LAPIC
> >>  have their respective TMR bit set while edge do not.)
> > 
> > Exactly. Here is what I mean:
> > 
> > static void apic_eoi(APICCommonState *s)
> > {
> >     int isrv;
> >     isrv = get_highest_priority_int(s->isr);
> >     if (isrv < 0)
> >         return;
> >     apic_reset_bit(s->isr, isrv);
> >     if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && apic_get_bit(s->tmr, isrv)) {
> >         ioapic_eoi_broadcast(isrv);
> >     }
> >     apic_sync_vapic(s, SYNC_FROM_VAPIC | SYNC_TO_VAPIC);
> >     apic_update_irq(s);
> > }
> > 
> > APIC will notify IOAPIC only if the corresponding vector in TMR bit
> > is set (in "apic_get_bit(s->tmr, isrv)", or say, it's a
> > level-triggered interrupt in APIC registers). What I have traced is
> > that, the EOI broadcast is missing because this bit is cleared in
> > APIC TMR while it should be set. I need some more tests to double
> > confirm this though, in case I made any mistake.
> 
> (There are two "legal" situations where TMR can be 0 and IOAPIC sets
>  remote IRR -- if edge and level interrupts are assigned to the same
>  vector and if IOAPIC is level while IR and OS edge, both would bug on
>  real hardware too ...)
> 
> Does QEMU bug with TCG?

Gave it a shot today. It happens as well.

Thanks,

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-27 14:31             ` Radim Krčmář
  2016-04-28  5:27               ` Peter Xu
@ 2016-04-28  6:06               ` Peter Xu
  2016-04-28  6:44                 ` Peter Xu
  1 sibling, 1 reply; 43+ messages in thread
From: Peter Xu @ 2016-04-28  6:06 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, pbonzini, alex.williamson, wexu

On Wed, Apr 27, 2016 at 04:31:13PM +0200, Radim Krčmář wrote:
> >> > + */
> >> > +static inline void
> >> > +ioapic_fix_edge_remote_irr(uint64_t *entry)
> >> > +{
> >> > +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> >> > +        /* Level triggered interrupts, make sure remote IRR is zero */
> >> > +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> >> 
> >> (You can just unconditionally zero it, edge doesn't care.)
> > 
> > Ah! I made a mistake. I suppose what I really want is:
> > 
> > +    if (!(*entry & IOAPIC_LVT_TRIGGER_MODE)) {
> > +        /* Edge-triggered interrupts, make sure remote IRR is zero */
> > +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> > +    }
> > 
> > Though both should help do the trick, I should be using this new
> > one in v5.
> 
> (You'd need to look at the old value for this to work.)

Yes, you are right. The problem is that, we actually has RW
permission for remote IRR bit in emulated IOAPIC. If so, I'd rather
take the original version, and unconditionally zero it, as you have
adviced (also, will fix up the comments to get them aligned).

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-28  6:06               ` Peter Xu
@ 2016-04-28  6:44                 ` Peter Xu
  0 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-04-28  6:44 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, pbonzini, alex.williamson, wexu

On Thu, Apr 28, 2016 at 02:06:17PM +0800, Peter Xu wrote:
> On Wed, Apr 27, 2016 at 04:31:13PM +0200, Radim Krčmář wrote:
> > >> > + */
> > >> > +static inline void
> > >> > +ioapic_fix_edge_remote_irr(uint64_t *entry)
> > >> > +{
> > >> > +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> > >> > +        /* Level triggered interrupts, make sure remote IRR is zero */
> > >> > +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> > >> 
> > >> (You can just unconditionally zero it, edge doesn't care.)
> > > 
> > > Ah! I made a mistake. I suppose what I really want is:
> > > 
> > > +    if (!(*entry & IOAPIC_LVT_TRIGGER_MODE)) {
> > > +        /* Edge-triggered interrupts, make sure remote IRR is zero */
> > > +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> > > +    }
> > > 
> > > Though both should help do the trick, I should be using this new
> > > one in v5.
> > 
> > (You'd need to look at the old value for this to work.)
> 
> Yes, you are right. The problem is that, we actually has RW
> permission for remote IRR bit in emulated IOAPIC. If so, I'd rather
> take the original version, and unconditionally zero it, as you have
> adviced (also, will fix up the comments to get them aligned).

After a second thought, a better idea (though may need several more
lines of codes) is to make sure the RO bits in IOAPIC entry are
read-only (I mean, "real" read-only) before the above hack. I
suppose this further matches with real hardware behavior.

Let me send v5 directly to see the codes.

Thanks,

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-28  5:27               ` Peter Xu
@ 2016-04-28 16:24                 ` Radim Krčmář
  0 siblings, 0 replies; 43+ messages in thread
From: Radim Krčmář @ 2016-04-28 16:24 UTC (permalink / raw)
  To: Peter Xu
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, pbonzini, alex.williamson, wexu

2016-04-28 13:27+0800, Peter Xu:
> On Wed, Apr 27, 2016 at 04:31:13PM +0200, Radim Krčmář wrote:
> 
> [...]
> 
>> >> > I am still looking into guest part codes. Although the above patch
>> >> > should solve the issue, there are still issues in guest codes when
>> >> > IR is enabled:
>> >> > 
>> >> > - mismatched "vector" in IOAPIC entry and IRTE entry (this is
>> >> >   required in vt-d spec 5.1.5.1, and required to correctly deliver
>> >> >   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():
>> >> 
>> >> "required" is a way of saying that the opposite is undefined.
>> >> No need to think about it in IOMMU.
>> > 
>> > Why? Without correct vector information, IOAPIC will not be able to
>> > know which entry to clear the Remote IRR bit (please check
>> > ioapic_eoi_broadcast())?
>> 
>> IOAPIC won't get correct EOI and Intel made it into an OS bug, because
>> there was no good action that the hardware could take.  (We have a lot
>> more freedom, but I think that partially fixing something that doesn't
>> work on real hardware is a wasted effort.)
> 
> To make sure I understand this correctly... Do you mean that real
> IOAPIC hardware will not handle this EOI broadcast correctly even if
> we fill in matched vector in the IOAPIC entry with IRTE one (when IR
> is enabled)?

No, if the OS configures same vector in IR and IOAPIC, then EOI
broadcast will work just fine.

My point was that the OS *must* do it that way.  If the OS doesn't, then
hardware's behavior is undefined = everything that happens is correct.
QEMU/KVM just shouldn't bug.  I think that QEMU even behaves pretty much
like real hardware here, so doing nothing now is the best choice.

> I'd appreciate if there is any link or anything that can provide me
> more background on this matter.. TIA.

Hm, I only read the specs ...

LAPIC EOI broadcast doesn't distinguish whether IOAPIC or IR injected
the interrupt and notifies IOAPICs with the vector in ISR.  The vector
doesn't provide enough information for a unique mapping between IOAPIC
and IR entries, so IOAPIC just clears Remote IRR bits of the vector.
There is no nice solution if you allow different vectors, so the
hardware doesn't.

>> Or did you mean that mismatched vector is a possible source of the fixed
>> bug?  (I originally dismissed it, because real hardware works.)
> 
> Nop. The above patch fixes the hack for "explicit IOAPIC EOI", and I
> suppose mismatched vector issue will cause "EOI broadcast" problem.
> But IIUC from your above comment, we can temporarily skip this
> "issue" for now, if it won't work even on real hardwares and even
> vectors are matched.
> 
> Anyway, as long as the explicit EOI works, we can survive. And this
> gives me the reason to send v5 first.

Yep.  EOI broadcast has to work in some cases, though, I'm sorry if I
said the opposite.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-04-26  7:34       ` Peter Xu
  2016-04-26  7:57         ` Jan Kiszka
  2016-04-26 14:19         ` Radim Krčmář
@ 2016-05-09 11:58         ` Paolo Bonzini
  2016-05-10  6:09           ` Peter Xu
  2 siblings, 1 reply; 43+ messages in thread
From: Paolo Bonzini @ 2016-05-09 11:58 UTC (permalink / raw)
  To: Peter Xu, Jan Kiszka
  Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
	rkrcmar, alex.williamson, wexu



On 26/04/2016 09:34, Peter Xu wrote:
> +/*
> + * This is to satisfy the hack in Linux kernel. One hack of it is to
> + * simulate clearing the Remote IRR bit of IOAPIC entry using the
> + * following:
> + *
> + * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
> + * Otherwise, we simulate the EOI message manually by changing the trigger
> + * mode to edge and then back to level, with RTE being masked during
> + * this."
> + *
> + * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
> + *
> + * This is based on the assumption that, Remote IRR bit will be
> + * cleared by IOAPIC hardware for edge-triggered interrupts (I
> + * believe that's what the IOAPIC version 0x1X hardware does). So
> + * if we are emulating it, we'd better do it the same here, so that
> + * the guest kernel hack will work as well on QEMU.
> + *
> + * Without this, level-triggered interrupts in IR mode might fail to
> + * work correctly.
> + */
> +static inline void
> +ioapic_fix_edge_remote_irr(uint64_t *entry)
> +{
> +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> +        /* Level triggered interrupts, make sure remote IRR is zero */
> +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> +    }
> +}
> +
>  static void
>  ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                   unsigned int size)
> @@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                      s->ioredtbl[index] &= ~0xffffffffULL;
>                      s->ioredtbl[index] |= val;
>                  }
> +                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
>                  ioapic_service(s);
>              }
>          }

Is this enough too?

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 378e663..2443a35 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -72,6 +72,7 @@ static void ioapic_service(IOAPICCommonState *s)
                     (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
                 if (trig_mode == IOAPIC_TRIGGER_EDGE) {
                     s->irr &= ~mask;
+                    s->ioredtbl[i] &= ~IOAPIC_LVT_REMOTE_IRR;
                 } else {
                     coalesce = s->ioredtbl[i] & IOAPIC_LVT_REMOTE_IRR;
                     s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;

Thanks,

Paolo

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-05-09 11:58         ` Paolo Bonzini
@ 2016-05-10  6:09           ` Peter Xu
  2016-05-10  8:58             ` Paolo Bonzini
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Xu @ 2016-05-10  6:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, rkrcmar, alex.williamson, wexu

On Mon, May 09, 2016 at 01:58:48PM +0200, Paolo Bonzini wrote:
> Is this enough too?
> 
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> index 378e663..2443a35 100644
> --- a/hw/intc/ioapic.c
> +++ b/hw/intc/ioapic.c
> @@ -72,6 +72,7 @@ static void ioapic_service(IOAPICCommonState *s)
>                      (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
>                  if (trig_mode == IOAPIC_TRIGGER_EDGE) {
>                      s->irr &= ~mask;
> +                    s->ioredtbl[i] &= ~IOAPIC_LVT_REMOTE_IRR;
>                  } else {
>                      coalesce = s->ioredtbl[i] & IOAPIC_LVT_REMOTE_IRR;
>                      s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;

I gave it a quick shot on this but still got the warning. :(

I _guess_ the problem is: the above change is in the "if" block of
(s->irr & mask), when the kernel plays the trick of EOI, the irq
should be pulled down already by the device (or say, irr bit is
cleared). So it does not go into this "if" block.

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-05-10  6:09           ` Peter Xu
@ 2016-05-10  8:58             ` Paolo Bonzini
  2016-05-10 10:10               ` Peter Xu
  0 siblings, 1 reply; 43+ messages in thread
From: Paolo Bonzini @ 2016-05-10  8:58 UTC (permalink / raw)
  To: Peter Xu
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, rkrcmar, alex.williamson, wexu



On 10/05/2016 08:09, Peter Xu wrote:
> On Mon, May 09, 2016 at 01:58:48PM +0200, Paolo Bonzini wrote:
>> Is this enough too?
>>
>> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
>> index 378e663..2443a35 100644
>> --- a/hw/intc/ioapic.c
>> +++ b/hw/intc/ioapic.c
>> @@ -72,6 +72,7 @@ static void ioapic_service(IOAPICCommonState *s)
>>                      (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
>>                  if (trig_mode == IOAPIC_TRIGGER_EDGE) {
>>                      s->irr &= ~mask;
>> +                    s->ioredtbl[i] &= ~IOAPIC_LVT_REMOTE_IRR;
>>                  } else {
>>                      coalesce = s->ioredtbl[i] & IOAPIC_LVT_REMOTE_IRR;
>>                      s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
> 
> I gave it a quick shot on this but still got the warning. :(
> 
> I _guess_ the problem is: the above change is in the "if" block of
> (s->irr & mask), when the kernel plays the trick of EOI, the irq
> should be pulled down already by the device (or say, irr bit is
> cleared). So it does not go into this "if" block.

No problem; feel free to send the other patch separately and I'll take
care of merging it.  Otherwise mst can merge it too.

Paolo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
  2016-05-10  8:58             ` Paolo Bonzini
@ 2016-05-10 10:10               ` Peter Xu
  0 siblings, 0 replies; 43+ messages in thread
From: Peter Xu @ 2016-05-10 10:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang,
	marcel, mst, rkrcmar, alex.williamson, wexu

On Tue, May 10, 2016 at 10:58:02AM +0200, Paolo Bonzini wrote:
> 
> 
> On 10/05/2016 08:09, Peter Xu wrote:
> > On Mon, May 09, 2016 at 01:58:48PM +0200, Paolo Bonzini wrote:
> >> Is this enough too?
> >>
> >> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> >> index 378e663..2443a35 100644
> >> --- a/hw/intc/ioapic.c
> >> +++ b/hw/intc/ioapic.c
> >> @@ -72,6 +72,7 @@ static void ioapic_service(IOAPICCommonState *s)
> >>                      (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
> >>                  if (trig_mode == IOAPIC_TRIGGER_EDGE) {
> >>                      s->irr &= ~mask;
> >> +                    s->ioredtbl[i] &= ~IOAPIC_LVT_REMOTE_IRR;
> >>                  } else {
> >>                      coalesce = s->ioredtbl[i] & IOAPIC_LVT_REMOTE_IRR;
> >>                      s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
> > 
> > I gave it a quick shot on this but still got the warning. :(
> > 
> > I _guess_ the problem is: the above change is in the "if" block of
> > (s->irr & mask), when the kernel plays the trick of EOI, the irq
> > should be pulled down already by the device (or say, irr bit is
> > cleared). So it does not go into this "if" block.
> 
> No problem; feel free to send the other patch separately and I'll take
> care of merging it.  Otherwise mst can merge it too.

Indeed these two patches are totally independent from the IOMMU
content. I'll send them seperately then. :)

Thanks,

-- peterx

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2016-05-10 10:11 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-19  8:38 [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 01/16] acpi: enable INTR for DMAR report structure Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 02/16] intel_iommu: allow queued invalidation for IR Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 03/16] intel_iommu: set IR bit for ECAP register Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 04/16] acpi: add DMAR scope definition for root IOAPIC Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 05/16] intel_iommu: define interrupt remap table addr register Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 06/16] intel_iommu: handle interrupt remap enable Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 07/16] intel_iommu: define several structs for IOMMU IR Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 08/16] intel_iommu: provide helper function vtd_get_iommu Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 09/16] intel_iommu: add IR translation faults defines Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 10/16] intel_iommu: Add support for PCI MSI remap Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 11/16] q35: ioapic: add support for emulated IOAPIC IR Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 12/16] ioapic: introduce ioapic_entry_parse() helper Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 13/16] intel_iommu: add support for split irqchip Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 14/16] q35: add "int-remap" flag to enable intr Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 15/16] intel_iommu: introduce IEC notifiers Peter Xu
2016-04-19  8:38 ` [Qemu-devel] [PATCH v4 16/16] ioapic: register VT-d IEC invalidate notifier Peter Xu
2016-04-25  5:16 ` [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU Jan Kiszka
2016-04-25  7:18   ` Peter Xu
2016-04-25  7:24     ` Jan Kiszka
2016-04-25 16:38       ` Radim Krčmář
2016-04-26  7:34       ` Peter Xu
2016-04-26  7:57         ` Jan Kiszka
2016-04-26  8:15           ` Jan Kiszka
2016-04-26 10:38             ` Peter Xu
2016-04-26 10:51               ` Jan Kiszka
2016-04-26 11:40                 ` Peter Xu
2016-04-26 14:24                   ` Jan Kiszka
2016-04-26 14:59                     ` Radim Krčmář
2016-04-26 15:28                       ` Jan Kiszka
2016-04-26 16:07                         ` Radim Krčmář
2016-04-26 17:47                           ` Jan Kiszka
2016-04-26 14:19         ` Radim Krčmář
2016-04-27  7:29           ` Peter Xu
2016-04-27 14:31             ` Radim Krčmář
2016-04-28  5:27               ` Peter Xu
2016-04-28 16:24                 ` Radim Krčmář
2016-04-28  6:06               ` Peter Xu
2016-04-28  6:44                 ` Peter Xu
2016-05-09 11:58         ` Paolo Bonzini
2016-05-10  6:09           ` Peter Xu
2016-05-10  8:58             ` Paolo Bonzini
2016-05-10 10:10               ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.