All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM
@ 2016-10-10 15:28 Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 1/7] apic: add global apic_get_class() Radim Krčmář
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

v4: http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg00698.html
v5 is one patch shorter as it merged two patches from v4 into [6/7].

The x2APIC mode works on >=2.8 machine types with this series and <2.7
remain compatible (aka broken).


Radim Krčmář (7):
  apic: add global apic_get_class()
  apic: add send_msi() to APICCommonClass
  intel_iommu: pass whole remapped addresses to apic
  intel_iommu: redo configuraton check in realize
  intel_iommu: add OnOffAuto intr_eim as "eim" property
  intel_iommu: reject broken EIM
  target-i386/kvm: cache the return value of kvm_enable_x2apic()

 hw/i386/intel_iommu.c           | 81 ++++++++++++++++++++++++++++++-----------
 hw/i386/kvm/apic.c              | 19 +++++++---
 hw/i386/xen/xen_apic.c          |  6 +++
 hw/intc/apic.c                  |  8 +++-
 hw/intc/apic_common.c           |  1 +
 include/hw/compat.h             |  4 ++
 include/hw/i386/apic_internal.h |  6 +++
 include/hw/i386/intel_iommu.h   |  2 +
 target-i386/cpu.c               | 13 +++++--
 target-i386/kvm-stub.c          |  5 +++
 target-i386/kvm.c               | 26 +++++++++++++
 target-i386/kvm_i386.h          |  1 +
 12 files changed, 140 insertions(+), 32 deletions(-)

-- 
2.10.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v5 1/7] apic: add global apic_get_class()
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
@ 2016-10-10 15:28 ` Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 2/7] apic: add send_msi() to APICCommonClass Radim Krčmář
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

Every configuration has only up to one APIC class and we'll be extending
the class with a function that can be called without an instanced
object, so a direct access to the class is convenient.

This patch will break compilation if some code uses apic_get_class()
with CONFIG_USER_ONLY.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v5: r-b Eduardo and Peter
v4: do not use private class attribute [Eduardo]
v3: completely rewrite the mechanism [Eduardo]
v2: assert() instead of error_report() and exit() [Peter]
---
 hw/intc/apic_common.c           |  1 +
 include/hw/i386/apic_internal.h |  2 ++
 target-i386/cpu.c               | 13 ++++++++++---
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 14ac43c18666..8d01c9c8750e 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -18,6 +18,7 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>
  */
 #include "qemu/osdep.h"
+#include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qemu-common.h"
 #include "cpu.h"
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 06c4e9f6f95b..286684857e9f 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -222,4 +222,6 @@ static inline int apic_get_bit(uint32_t *tab, int index)
     return !!(tab[i] & mask);
 }
 
+APICCommonClass *apic_get_class(void);
+
 #endif /* QEMU_APIC_INTERNAL_H */
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 1c57fce81bcd..13505ab156e0 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2848,9 +2848,8 @@ static void mce_init(X86CPU *cpu)
 }
 
 #ifndef CONFIG_USER_ONLY
-static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
+APICCommonClass *apic_get_class(void)
 {
-    APICCommonState *apic;
     const char *apic_type = "apic";
 
     if (kvm_apic_in_kernel()) {
@@ -2859,7 +2858,15 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
         apic_type = "xen-apic";
     }
 
-    cpu->apic_state = DEVICE(object_new(apic_type));
+    return APIC_COMMON_CLASS(object_class_by_name(apic_type));
+}
+
+static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
+{
+    APICCommonState *apic;
+    ObjectClass *apic_class = OBJECT_CLASS(apic_get_class());
+
+    cpu->apic_state = DEVICE(object_new(object_class_get_name(apic_class)));
 
     object_property_add_child(OBJECT(cpu), "lapic",
                               OBJECT(cpu->apic_state), &error_abort);
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v5 2/7] apic: add send_msi() to APICCommonClass
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 1/7] apic: add global apic_get_class() Radim Krčmář
@ 2016-10-10 15:28 ` Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 3/7] intel_iommu: pass whole remapped addresses to apic Radim Krčmář
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

The MMIO based interface to APIC doesn't work well with MSIs that have
upper address bits set (remapped x2APIC MSIs).  A specialized interface
is a quick and dirty way to avoid the shortcoming.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v5: r-b Peter
v4: r-b Igor
v2: change apic_send_msi() to accept MSIMessage [Igor]
---
 hw/i386/kvm/apic.c              | 19 +++++++++++++------
 hw/i386/xen/xen_apic.c          |  6 ++++++
 hw/intc/apic.c                  |  8 ++++++--
 include/hw/i386/apic_internal.h |  4 ++++
 4 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index c016e63fc2ba..be55102c00ca 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -169,6 +169,17 @@ static void kvm_apic_external_nmi(APICCommonState *s)
     run_on_cpu(CPU(s->cpu), do_inject_external_nmi, s);
 }
 
+static void kvm_send_msi(MSIMessage *msg)
+{
+    int ret;
+
+    ret = kvm_irqchip_send_msi(kvm_state, *msg);
+    if (ret < 0) {
+        fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
+                strerror(-ret));
+    }
+}
+
 static uint64_t kvm_apic_mem_read(void *opaque, hwaddr addr,
                                   unsigned size)
 {
@@ -179,13 +190,8 @@ static void kvm_apic_mem_write(void *opaque, hwaddr addr,
                                uint64_t data, unsigned size)
 {
     MSIMessage msg = { .address = addr, .data = data };
-    int ret;
 
-    ret = kvm_irqchip_send_msi(kvm_state, msg);
-    if (ret < 0) {
-        fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
-                strerror(-ret));
-    }
+    kvm_send_msi(&msg);
 }
 
 static const MemoryRegionOps kvm_apic_io_ops = {
@@ -232,6 +238,7 @@ static void kvm_apic_class_init(ObjectClass *klass, void *data)
     k->enable_tpr_reporting = kvm_apic_enable_tpr_reporting;
     k->vapic_base_update = kvm_apic_vapic_base_update;
     k->external_nmi = kvm_apic_external_nmi;
+    k->send_msi = kvm_send_msi;
 }
 
 static const TypeInfo kvm_apic_info = {
diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
index 21d68ee04b0a..55769eba7ede 100644
--- a/hw/i386/xen/xen_apic.c
+++ b/hw/i386/xen/xen_apic.c
@@ -68,6 +68,11 @@ static void xen_apic_external_nmi(APICCommonState *s)
 {
 }
 
+static void xen_send_msi(MSIMessage *msi)
+{
+    xen_hvm_inject_msi(msi->address, msi->data);
+}
+
 static void xen_apic_class_init(ObjectClass *klass, void *data)
 {
     APICCommonClass *k = APIC_COMMON_CLASS(klass);
@@ -78,6 +83,7 @@ static void xen_apic_class_init(ObjectClass *klass, void *data)
     k->get_tpr = xen_apic_get_tpr;
     k->vapic_base_update = xen_apic_vapic_base_update;
     k->external_nmi = xen_apic_external_nmi;
+    k->send_msi = xen_send_msi;
 }
 
 static const TypeInfo xen_apic_info = {
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 7bd1d279c463..fe15fb602473 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -740,8 +740,10 @@ static uint32_t apic_mem_readl(void *opaque, hwaddr addr)
     return val;
 }
 
-static void apic_send_msi(hwaddr addr, uint32_t data)
+static void apic_send_msi(MSIMessage *msi)
 {
+    uint64_t addr = msi->address;
+    uint32_t data = msi->data;
     uint8_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
     uint8_t vector = (data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
     uint8_t dest_mode = (addr >> MSI_ADDR_DEST_MODE_SHIFT) & 0x1;
@@ -762,7 +764,8 @@ static void apic_mem_writel(void *opaque, hwaddr addr, uint32_t val)
          * APIC is connected directly to the CPU.
          * Mapping them on the global bus happens to work because
          * MSI registers are reserved in APIC MMIO and vice versa. */
-        apic_send_msi(addr, val);
+        MSIMessage msi = { .address = addr, .data = val };
+        apic_send_msi(&msi);
         return;
     }
 
@@ -913,6 +916,7 @@ static void apic_class_init(ObjectClass *klass, void *data)
     k->external_nmi = apic_external_nmi;
     k->pre_save = apic_pre_save;
     k->post_load = apic_post_load;
+    k->send_msi = apic_send_msi;
 }
 
 static const TypeInfo apic_info = {
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 286684857e9f..cdd11fb0938f 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -146,6 +146,10 @@ typedef struct APICCommonClass
     void (*pre_save)(APICCommonState *s);
     void (*post_load)(APICCommonState *s);
     void (*reset)(APICCommonState *s);
+    /* send_msi emulates an APIC bus and its proper place would be in a new
+     * device, but it's convenient to have it here for now.
+     */
+    void (*send_msi)(MSIMessage *msi);
 } APICCommonClass;
 
 struct APICCommonState {
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v5 3/7] intel_iommu: pass whole remapped addresses to apic
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 1/7] apic: add global apic_get_class() Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 2/7] apic: add send_msi() to APICCommonClass Radim Krčmář
@ 2016-10-10 15:28 ` Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 4/7] intel_iommu: redo configuraton check in realize Radim Krčmář
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

The MMIO interface to APIC only allowed 8 bit addresses, which is not
enough for 32 bit addresses from EIM remapping.
Intel stored upper 24 bits in the high MSI address, so use the same
technique. The technique is also used in KVM MSI interface.
Other APICs are unlikely to handle those upper bits.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v5: added r-b Peter, removed r-b Igor (added by mistake)
v4: r-b Igor
v2: fix build with enabled DEBUG_INTEL_IOMMU [Peter]
---
 hw/i386/intel_iommu.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9f4e64af1ad5..c39b62b898d8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -31,6 +31,7 @@
 #include "hw/i386/x86-iommu.h"
 #include "hw/pci-host/q35.h"
 #include "sysemu/kvm.h"
+#include "hw/i386/apic_internal.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -279,18 +280,17 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
 static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_addr_reg,
                                    hwaddr mesg_data_reg)
 {
-    hwaddr addr;
-    uint32_t data;
+    MSIMessage msi;
 
     assert(mesg_data_reg < DMAR_REG_SIZE);
     assert(mesg_addr_reg < DMAR_REG_SIZE);
 
-    addr = vtd_get_long_raw(s, mesg_addr_reg);
-    data = vtd_get_long_raw(s, mesg_data_reg);
+    msi.address = vtd_get_long_raw(s, mesg_addr_reg);
+    msi.data = vtd_get_long_raw(s, mesg_data_reg);
 
-    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr, data);
-    address_space_stl_le(&address_space_memory, addr, data,
-                         MEMTXATTRS_UNSPECIFIED, NULL);
+    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32,
+                msi.address, msi.data);
+    apic_get_class()->send_msi(&msi);
 }
 
 /* Generate a fault event to software via MSI if conditions are met.
@@ -2133,6 +2133,7 @@ static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
     msg.dest_mode = irq->dest_mode;
     msg.redir_hint = irq->redir_hint;
     msg.dest = irq->dest;
+    msg.__addr_hi = irq->dest & 0xffffff00;
     msg.__addr_head = cpu_to_le32(0xfee);
     /* Keep this from original MSI address bits */
     msg.__not_used = irq->msi_addr_last_bits;
@@ -2281,11 +2282,7 @@ static MemTxResult vtd_mem_ir_write(void *opaque, hwaddr addr,
                 " for device sid 0x%04x",
                 to.address, to.data, sid);
 
-    if (dma_memory_write(&address_space_memory, to.address,
-                         &to.data, size)) {
-        VTD_DPRINTF(GENERAL, "error: fail to write 0x%"PRIx64
-                    " value 0x%"PRIx32, to.address, to.data);
-    }
+    apic_get_class()->send_msi(&to);
 
     return MEMTX_OK;
 }
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v5 4/7] intel_iommu: redo configuraton check in realize
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
                   ` (2 preceding siblings ...)
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 3/7] intel_iommu: pass whole remapped addresses to apic Radim Krčmář
@ 2016-10-10 15:28 ` Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 5/7] intel_iommu: add OnOffAuto intr_eim as "eim" property Radim Krčmář
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

* there no point in configuring the device if realization is going to
  fail, so move the check to the beginning,
* create a separate function for the check,
* use error_setg() instead error_report().

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v5: r-b Peter
v4: r-b Igor
v3:
* use error_setg [Paolo]
* create a new function [Peter]
---
 hw/i386/intel_iommu.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index c39b62b898d8..5b06b4091f36 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -21,6 +21,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/error-report.h"
+#include "qapi/error.h"
 #include "hw/sysbus.h"
 #include "exec/address-spaces.h"
 #include "intel_iommu_internal.h"
@@ -2448,6 +2449,18 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     return &vtd_as->as;
 }
 
+static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
+{
+    /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
+    if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
+        !kvm_irqchip_is_split()) {
+        error_setg(errp, "Intel Interrupt Remapping cannot work with "
+                         "kernel-irqchip=on, please use 'split|off'.");
+        return false;
+    }
+    return true;
+}
+
 static void vtd_realize(DeviceState *dev, Error **errp)
 {
     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
@@ -2457,6 +2470,11 @@ static void vtd_realize(DeviceState *dev, Error **errp)
 
     VTD_DPRINTF(GENERAL, "");
     x86_iommu->type = TYPE_INTEL;
+
+    if (!vtd_check_config(x86_iommu, errp)) {
+        return;
+    }
+
     memset(s->vtd_as_by_bus_num, 0, sizeof(s->vtd_as_by_bus_num));
     memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s,
                           "intel_iommu", DMAR_REG_SIZE);
@@ -2471,14 +2489,6 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
     /* Pseudo address space under root PCI bus. */
     pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
-
-    /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
-    if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
-        !kvm_irqchip_is_split()) {
-        error_report("Intel Interrupt Remapping cannot work with "
-                     "kernel-irqchip=on, please use 'split|off'.");
-        exit(1);
-    }
 }
 
 static void vtd_class_init(ObjectClass *klass, void *data)
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v5 5/7] intel_iommu: add OnOffAuto intr_eim as "eim" property
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
                   ` (3 preceding siblings ...)
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 4/7] intel_iommu: redo configuraton check in realize Radim Krčmář
@ 2016-10-10 15:28 ` Radim Krčmář
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM Radim Krčmář
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

The default (auto) emulates the current behavior.
A user can now control EIM like
  -device intel-iommu,intremap=on,eim=off

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v5: r-b Peter
v4:
 * r-b Igor
 * added an example to the commit message
v3:
 * use error_setg [Paolo]
 * shorten the code [Peter]
---
 hw/i386/intel_iommu.c         | 24 +++++++++++++++++++++---
 include/hw/i386/intel_iommu.h |  1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 5b06b4091f36..17892b8c336b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2012,6 +2012,8 @@ static const MemoryRegionOps vtd_mem_ops = {
 
 static Property vtd_properties[] = {
     DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
+    DEFINE_PROP_ON_OFF_AUTO("eim", IntelIOMMUState, intr_eim,
+                            ON_OFF_AUTO_AUTO),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2368,7 +2370,11 @@ static void vtd_init(IntelIOMMUState *s)
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
     if (x86_iommu->intr_supported) {
-        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV;
+        s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
+        if (s->intr_eim == ON_OFF_AUTO_ON) {
+            s->ecap |= VTD_ECAP_EIM;
+        }
+        assert(s->intr_eim != ON_OFF_AUTO_AUTO);
     }
 
     vtd_reset_context_cache(s);
@@ -2449,8 +2455,10 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     return &vtd_as->as;
 }
 
-static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
+static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
 {
+    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+
     /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
     if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
         !kvm_irqchip_is_split()) {
@@ -2458,6 +2466,16 @@ static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
                          "kernel-irqchip=on, please use 'split|off'.");
         return false;
     }
+    if (s->intr_eim == ON_OFF_AUTO_ON && !x86_iommu->intr_supported) {
+        error_setg(errp, "eim=on cannot be selected without intremap=on");
+        return false;
+    }
+
+    if (s->intr_eim == ON_OFF_AUTO_AUTO) {
+        s->intr_eim = x86_iommu->intr_supported ?
+                                              ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
+    }
+
     return true;
 }
 
@@ -2471,7 +2489,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     VTD_DPRINTF(GENERAL, "");
     x86_iommu->type = TYPE_INTEL;
 
-    if (!vtd_check_config(x86_iommu, errp)) {
+    if (!vtd_decide_config(s, errp)) {
         return;
     }
 
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index a42dbd745a70..b5ac60927b1f 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -289,6 +289,7 @@ struct IntelIOMMUState {
     dma_addr_t intr_root;           /* Interrupt remapping table pointer */
     uint32_t intr_size;             /* Number of IR table entries */
     bool intr_eime;                 /* Extended interrupt mode enabled */
+    OnOffAuto intr_eim;             /* Toggle for EIM cabability */
 };
 
 /* Find the VTD Address space associated with the given bus pointer,
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
                   ` (4 preceding siblings ...)
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 5/7] intel_iommu: add OnOffAuto intr_eim as "eim" property Radim Krčmář
@ 2016-10-10 15:28 ` Radim Krčmář
  2016-10-10 17:49   ` Eduardo Habkost
  2016-10-11  8:52   ` Peter Xu
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 7/7] target-i386/kvm: cache the return value of kvm_enable_x2apic() Radim Krčmář
  2016-10-14 14:58 ` [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Eduardo Habkost
  7 siblings, 2 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

Cluster x2APIC cannot work without KVM's x2apic API when the maximal
APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
forbid other APICs and also the old KVM case with less than 9, to
simplify the code.

There is no point in enabling EIM in forbidden APICs, so we keep it
enabled only for the KVM APIC;  unconditionally, because making the
option depend on KVM version would be a maintanance burden.

Old QEMUs would enable eim whenever intremap was on, which would trick
guests into thinking that they can enable cluster x2APIC even if any
interrupt destination would get clamped to 8 bits.
Depending on your configuration, QEMU could notice that the destination
LAPIC is not present and report it with a very non-obvious:

  KVM: injection failed, MSI lost (Operation not permitted)

Or the guest could say something about unexpected interrupts, because
clamping leads to aliasing so interrupts were being delivered to
incorrect VCPUs.

KVM_X2APIC_API is the feature that allows us to enable EIM for KVM.

QEMU 2.7 allowed EIM whenever interrupt remapping was enabled.  In order
to keep backward compatibility, we again allow guests to misbehave in
non-obvious ways, and make it the default for old machine types.

A user can enable the buggy mode it with "x-buggy-eim=on".

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v5:
 * squash with the patch that added compat property:
   v5: rename property to x-buggy-eim [Eduardo and Michael]
   v4:
    * use a device property [Igor]
    * clarify the last sentence of the commit message
   v3: shorten the code [Peter]
 * no one gave r-b to both patches, so all were dropped
 * beautify a nested condition in vtd_decide_config() [Peter]
v4: be more specific in the comment [Igor]
v3:
 * use error_setg [Paolo]
 * shorten the code [Peter]
v2:
 * adapt to new intr_eim parameter
 * provide first linux version that has x2apic api
 * disable QEMU's LAPIC
---
 hw/i386/intel_iommu.c         | 16 +++++++++++++++-
 include/hw/compat.h           |  4 ++++
 include/hw/i386/intel_iommu.h |  1 +
 target-i386/kvm-stub.c        |  5 +++++
 target-i386/kvm.c             | 13 +++++++++++++
 target-i386/kvm_i386.h        |  1 +
 6 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 17892b8c336b..c8dcdf965025 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -33,6 +33,7 @@
 #include "hw/pci-host/q35.h"
 #include "sysemu/kvm.h"
 #include "hw/i386/apic_internal.h"
+#include "kvm_i386.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -2014,6 +2015,7 @@ static Property vtd_properties[] = {
     DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
     DEFINE_PROP_ON_OFF_AUTO("eim", IntelIOMMUState, intr_eim,
                             ON_OFF_AUTO_AUTO),
+    DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2472,9 +2474,21 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
     }
 
     if (s->intr_eim == ON_OFF_AUTO_AUTO) {
-        s->intr_eim = x86_iommu->intr_supported ?
+        s->intr_eim = (kvm_irqchip_in_kernel() || s->buggy_eim)
+                      && x86_iommu->intr_supported ?
                                               ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
     }
+    if (s->intr_eim == ON_OFF_AUTO_ON && !s->buggy_eim) {
+        if (!kvm_irqchip_in_kernel()) {
+            error_setg(errp, "eim=on requires accel=kvm,kernel-irqchip=split");
+            return false;
+        }
+        if (!kvm_enable_x2apic()) {
+            error_setg(errp, "eim=on requires support on the KVM side"
+                             "(X2APIC_API, first shipped in v4.7)");
+            return false;
+        }
+    }
 
     return true;
 }
diff --git a/include/hw/compat.h b/include/hw/compat.h
index 46412b229a70..f9aa85d7ac42 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -10,6 +10,10 @@
         .driver   = "ioapic",\
         .property = "version",\
         .value    = "0x11",\
+    },{\
+        .driver   = "intel-iommu",\
+        .property = "x-buggy-eim",\
+        .value    = "true",\
     },
 
 #define HW_COMPAT_2_6 \
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index b5ac60927b1f..1989c1eec10a 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -290,6 +290,7 @@ struct IntelIOMMUState {
     uint32_t intr_size;             /* Number of IR table entries */
     bool intr_eime;                 /* Extended interrupt mode enabled */
     OnOffAuto intr_eim;             /* Toggle for EIM cabability */
+    bool buggy_eim;                 /* Force buggy EIM unless eim=off */
 };
 
 /* Find the VTD Address space associated with the given bus pointer,
diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c
index cdf15061091d..bda4dc2f0c57 100644
--- a/target-i386/kvm-stub.c
+++ b/target-i386/kvm-stub.c
@@ -25,6 +25,11 @@ bool kvm_has_smm(void)
     return 1;
 }
 
+bool kvm_enable_x2apic(void)
+{
+    return false;
+}
+
 /* This function is only called inside conditionals which we
  * rely on the compiler to optimize out when CONFIG_KVM is not
  * defined.
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index ee1f53e56953..0fd664648665 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -122,6 +122,19 @@ bool kvm_allows_irq0_override(void)
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
 }
 
+static bool kvm_x2apic_api_set_flags(uint64_t flags)
+{
+    KVMState *s = KVM_STATE(current_machine->accelerator);
+
+    return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
+}
+
+bool kvm_enable_x2apic(void)
+{
+    return kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
+                                    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK);
+}
+
 static int kvm_get_tsc(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
diff --git a/target-i386/kvm_i386.h b/target-i386/kvm_i386.h
index 36407e0a5dc7..5c369b1c5b40 100644
--- a/target-i386/kvm_i386.h
+++ b/target-i386/kvm_i386.h
@@ -43,4 +43,5 @@ int kvm_device_msix_deassign(KVMState *s, uint32_t dev_id);
 
 void kvm_put_apicbase(X86CPU *cpu, uint64_t value);
 
+bool kvm_enable_x2apic(void);
 #endif
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v5 7/7] target-i386/kvm: cache the return value of kvm_enable_x2apic()
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
                   ` (5 preceding siblings ...)
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM Radim Krčmář
@ 2016-10-10 15:28 ` Radim Krčmář
  2016-10-14 14:58 ` [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Eduardo Habkost
  7 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2016-10-10 15:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

Assume that KVM would have returned the same on subsequent runs.
Abstract the memoizaiton pattern into macros and call it memorize as
adding the r makes it less obscure.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v5: r-b Igor
v4:
 * changed the name to memorize [Igor]
 * remove useless underscores in macro arguments
 * merge the two macros, as it seems that the deleted one wouldn't get
   other users anytime soon
---
 target-i386/kvm.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 0fd664648665..0472f45fd092 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -129,10 +129,23 @@ static bool kvm_x2apic_api_set_flags(uint64_t flags)
     return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
 }
 
+#define MEMORIZE(fn) \
+    ({ \
+        static typeof(fn) _result; \
+        static bool _memorized; \
+        \
+        if (_memorized) { \
+            return _result; \
+        } \
+        _memorized = true; \
+        _result = fn; \
+    })
+
 bool kvm_enable_x2apic(void)
 {
-    return kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
-                                    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK);
+    return MEMORIZE(
+             kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
+                                      KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK));
 }
 
 static int kvm_get_tsc(CPUState *cs)
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM Radim Krčmář
@ 2016-10-10 17:49   ` Eduardo Habkost
  2016-10-11  8:52   ` Peter Xu
  1 sibling, 0 replies; 11+ messages in thread
From: Eduardo Habkost @ 2016-10-10 17:49 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Igor Mammedov, Paolo Bonzini,
	Richard Henderson, Michael S. Tsirkin

On Mon, Oct 10, 2016 at 05:28:47PM +0200, Radim Krčmář wrote:
> Cluster x2APIC cannot work without KVM's x2apic API when the maximal
> APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
> forbid other APICs and also the old KVM case with less than 9, to
> simplify the code.
> 
> There is no point in enabling EIM in forbidden APICs, so we keep it
> enabled only for the KVM APIC;  unconditionally, because making the
> option depend on KVM version would be a maintanance burden.
> 
> Old QEMUs would enable eim whenever intremap was on, which would trick
> guests into thinking that they can enable cluster x2APIC even if any
> interrupt destination would get clamped to 8 bits.
> Depending on your configuration, QEMU could notice that the destination
> LAPIC is not present and report it with a very non-obvious:
> 
>   KVM: injection failed, MSI lost (Operation not permitted)
> 
> Or the guest could say something about unexpected interrupts, because
> clamping leads to aliasing so interrupts were being delivered to
> incorrect VCPUs.
> 
> KVM_X2APIC_API is the feature that allows us to enable EIM for KVM.
> 
> QEMU 2.7 allowed EIM whenever interrupt remapping was enabled.  In order
> to keep backward compatibility, we again allow guests to misbehave in
> non-obvious ways, and make it the default for old machine types.
> 
> A user can enable the buggy mode it with "x-buggy-eim=on".
> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

-- 
Eduardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM Radim Krčmář
  2016-10-10 17:49   ` Eduardo Habkost
@ 2016-10-11  8:52   ` Peter Xu
  1 sibling, 0 replies; 11+ messages in thread
From: Peter Xu @ 2016-10-11  8:52 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

On Mon, Oct 10, 2016 at 05:28:47PM +0200, Radim Krčmář wrote:
> Cluster x2APIC cannot work without KVM's x2apic API when the maximal
> APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
> forbid other APICs and also the old KVM case with less than 9, to
> simplify the code.
> 
> There is no point in enabling EIM in forbidden APICs, so we keep it
> enabled only for the KVM APIC;  unconditionally, because making the
> option depend on KVM version would be a maintanance burden.
> 
> Old QEMUs would enable eim whenever intremap was on, which would trick
> guests into thinking that they can enable cluster x2APIC even if any
> interrupt destination would get clamped to 8 bits.
> Depending on your configuration, QEMU could notice that the destination
> LAPIC is not present and report it with a very non-obvious:
> 
>   KVM: injection failed, MSI lost (Operation not permitted)
> 
> Or the guest could say something about unexpected interrupts, because
> clamping leads to aliasing so interrupts were being delivered to
> incorrect VCPUs.
> 
> KVM_X2APIC_API is the feature that allows us to enable EIM for KVM.
> 
> QEMU 2.7 allowed EIM whenever interrupt remapping was enabled.  In order
> to keep backward compatibility, we again allow guests to misbehave in
> non-obvious ways, and make it the default for old machine types.
> 
> A user can enable the buggy mode it with "x-buggy-eim=on".
> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM
  2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
                   ` (6 preceding siblings ...)
  2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 7/7] target-i386/kvm: cache the return value of kvm_enable_x2apic() Radim Krčmář
@ 2016-10-14 14:58 ` Eduardo Habkost
  7 siblings, 0 replies; 11+ messages in thread
From: Eduardo Habkost @ 2016-10-14 14:58 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Michael S. Tsirkin, Peter Xu, Paolo Bonzini,
	Igor Mammedov, Richard Henderson

On Mon, Oct 10, 2016 at 05:28:41PM +0200, Radim Krčmář wrote:
> v4: http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg00698.html
> v5 is one patch shorter as it merged two patches from v4 into [6/7].
> 
> The x2APIC mode works on >=2.8 machine types with this series and <2.7
> remain compatible (aka broken).
> 
> 
> Radim Krčmář (7):
>   apic: add global apic_get_class()
>   apic: add send_msi() to APICCommonClass
>   intel_iommu: pass whole remapped addresses to apic
>   intel_iommu: redo configuraton check in realize
>   intel_iommu: add OnOffAuto intr_eim as "eim" property
>   intel_iommu: reject broken EIM
>   target-i386/kvm: cache the return value of kvm_enable_x2apic()

Applied to x86-next. Thanks!

-- 
Eduardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-10-14 14:58 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-10 15:28 [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Radim Krčmář
2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 1/7] apic: add global apic_get_class() Radim Krčmář
2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 2/7] apic: add send_msi() to APICCommonClass Radim Krčmář
2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 3/7] intel_iommu: pass whole remapped addresses to apic Radim Krčmář
2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 4/7] intel_iommu: redo configuraton check in realize Radim Krčmář
2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 5/7] intel_iommu: add OnOffAuto intr_eim as "eim" property Radim Krčmář
2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 6/7] intel_iommu: reject broken EIM Radim Krčmář
2016-10-10 17:49   ` Eduardo Habkost
2016-10-11  8:52   ` Peter Xu
2016-10-10 15:28 ` [Qemu-devel] [PATCH v5 7/7] target-i386/kvm: cache the return value of kvm_enable_x2apic() Radim Krčmář
2016-10-14 14:58 ` [Qemu-devel] [PATCH v5 0/7] intel_iommu: fix EIM Eduardo Habkost

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.